Efficient way to change the header of a file in Python

2024/10/16 1:23:03

I am trying to write a Python script to update the header (only the first line) of some huge files. Since the new header is not necessarily the same size (in bytes) as the original one, is there any way I could change the header without touching the rest of the huge file? Or do I have to read through the whole file and write it back?

Answer

No, not if the new header has a different length. The only operations you can perform on a file without rewriting the rest of it are truncation, overwriting bytes with a replacement of the same size, and appending.

You can, however, shift the data in relatively small buffered chunks, writing each chunk to its new position only after you have read the data currently residing there, so that the whole file never has to be held in memory. If speed is an issue, consider using mmap.
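
A minimal sketch of the chunked approach described above, not a definitive implementation. The function name `replace_header` and the `chunk_size` parameter are hypothetical, chosen for illustration. It assumes the header is the first line of the file and that `new_header` is passed as bytes including its line terminator. Note that the rewrite is not atomic: if the process is interrupted partway through, the file is left in an inconsistent state.

```python
import os


def replace_header(path, new_header, chunk_size=1 << 20):
    """Replace the first line of `path` in place (hypothetical helper).

    `new_header` must be bytes and should include its own line terminator.
    The body is shifted in chunks of at most `chunk_size` bytes, so only
    one chunk is held in memory at a time.
    """
    with open(path, "r+b") as f:
        old_len = len(f.readline())      # length of the current header line
        size = f.seek(0, os.SEEK_END)    # total file size in bytes
        delta = len(new_header) - old_len

        if delta > 0:
            # Header grows: the body moves toward the end of the file.
            # Copy from the back so unread data is never overwritten.
            pos = size
            while pos > old_len:
                start = max(old_len, pos - chunk_size)
                f.seek(start)
                chunk = f.read(pos - start)
                f.seek(start + delta)
                f.write(chunk)
                pos = start
        elif delta < 0:
            # Header shrinks: the body moves toward the start of the file.
            # Copy from the front, then cut off the leftover tail.
            pos = old_len
            while pos < size:
                f.seek(pos)
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                f.seek(pos + delta)
                f.write(chunk)
                pos += len(chunk)
            f.truncate(size + delta)

        # With delta == 0 nothing needs to move; in every case the new
        # header is finally written over the start of the file.
        f.seek(0)
        f.write(new_header)
```

When the old and new headers happen to have the same length, this touches only the header bytes; otherwise every byte of the body is still read and written once, which is unavoidable for the reason given in the answer.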

https://en.xdnf.cn/q/69216.html
