I have a very large big-endian binary file, and I know how many numbers it contains. I found a solution for reading a big-endian file using struct, and it works perfectly if the file is small:
import struct

data = []
file = open('some_file.dat', 'rb')
for i in range(0, numcount):
    # read one big-endian float (4 bytes) at a time
    data.append(struct.unpack('>f', file.read(4))[0])
But this code is very slow when the file size is more than ~100 MB. My current file is 1.5 GB and contains 399,513,600 float numbers. With this file, the above code takes about 8 minutes.
I found another solution that works faster:
datafile = open('some_file.dat', 'rb').read()
f_len = '>' + 'f' * numcount  # numcount = 399513600
numbers = struct.unpack(f_len, datafile)
This code runs in about 1.5 minutes, but that is still too slow for me. Earlier I wrote the equivalent code in Fortran, and it ran in about 10 seconds.
In Fortran I open the file with the "big-endian" flag and can simply read the file into a REAL array without any conversion, but in Python I have to read the file as a string and convert every 4 bytes into a float using struct. Is it possible to make the program run faster?
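For comparison, here is a rough sketch of the kind of bulk read I am hoping for in Python, using the standard-library array module instead of per-value struct calls (this assumes the platform's 'f' typecode is a 4-byte IEEE float, which I have not verified on every system):

from array import array
import sys

numcount = 399513600  # number of floats in the file, as above

values = array('f')  # assumes 'f' is a 4-byte float on this platform
with open('some_file.dat', 'rb') as f:
    values.fromfile(f, numcount)  # single bulk read, no per-value unpacking
if sys.byteorder == 'little':
    values.byteswap()  # the file is big-endian; swap on little-endian machines

Would something along these lines (or another approach entirely) be the right way to avoid converting every value through struct?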