Writing append-only gzipped log files in Python

2024/10/18 12:40:35

I am building a service that writes plain-text logs from several sources (one file per source). I do not intend to rotate these logs, as they must be kept forever.

To keep these permanent files small, I would like to gzip them on the fly. Since they are log data, they compress very well.

What is a good approach in Python for writing append-only gzipped text files, so that writing can be resumed later when the service stops and restarts? I am not too worried about losing a few lines, but it is unacceptable for the gzip container itself to break and leave the file unreadable.

If this is not feasible, or not worth the hassle, I can simply write the logs as plain text without gzipping.

Answer

Note: On Unix systems you should seriously consider using an external program written for this exact task:

  • logrotate (rotates, compresses, and mails system logs)

You can set the number of rotations so high that the first file would not be deleted for 100 years or so.


In Python 2, logging.FileHandler takes a keyword argument encoding that can be set to bz2 or zlib.

This works because logging uses the codecs module, which in turn treats bz2 (or zlib) as an encoding:

>>> import codecs
>>> with codecs.open("on-the-fly-compressed.txt.bz2", "w", "bz2") as fh:
...     fh.write("Hello World\n")

$ bzcat on-the-fly-compressed.txt.bz2
Hello World

Python 3 version (although the docs mention bz2 as an alias, you actually have to use bz2_codec, at least with 3.2.3):

>>> import codecs
>>> with codecs.open("on-the-fly-compressed.txt.bz2", "w", "bz2_codec") as fh:
...     fh.write(b"Hello World\n")

$ bzcat on-the-fly-compressed.txt.bz2
Hello World
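
Since the question asks about gzip specifically, here is a minimal sketch using the standard gzip module instead of the codecs approach above. It relies on the fact that a gzip file may consist of several concatenated members, and that opening a file with gzip.open in append mode starts a new member after the existing data. The function names append_log_line and read_log_lines and the file name service.log.gz are made up for illustration; treat this as one possible approach, not a tested logging setup.

```python
import gzip
import os
import tempfile


def append_log_line(path, line):
    """Append one line to a gzip file (hypothetical helper)."""
    # Mode "at" is text-append: every open/close cycle writes one complete
    # gzip member (header, compressed data, CRC/length trailer) after
    # whatever the file already contains.
    with gzip.open(path, "at", encoding="utf-8") as fh:
        fh.write(line.rstrip("\n") + "\n")


def read_log_lines(path):
    """Read every line back; the gzip module decodes concatenated members."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [line.rstrip("\n") for line in fh]


if __name__ == "__main__":
    # Simulate two separate service runs appending to the same file.
    path = os.path.join(tempfile.mkdtemp(), "service.log.gz")
    append_log_line(path, "first run")
    append_log_line(path, "second run, after a restart")
    print(read_log_lines(path))
```

Opening and closing the file for every single line keeps each member complete on disk, but it compresses poorly because every member carries its own header and trailer and shares no history with the previous one. In practice you would probably keep the file open and write many lines per member, closing (or reopening) it periodically. If the process dies mid-write, the last member may be truncated; members that were already closed are unaffected, although a reader will likely report an error once it reaches the truncated tail.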