How to read complex data from a TB-size binary file quickly while keeping the most accuracy?

2024/11/15 18:25:50

Using Python 3.9.2, I read the beginning of a TB-size binary file (a small piece of it) as below:

with open(filename, 'rb') as f:
    head = f.read(8)  # first 8 bytes of the file
print(head)
b'\x14\x00\x80?\xb5\x0c\xf81'

I tried reading the file with np.fromfile, once with np.float32 and once with np.complex64 as the dtype.

float_data = np.fromfile(filename, np.float32)
complex_data = np.fromfile(filename, np.complex64)

Since the binary file is always bigger than 500 GB, sometimes even TB size, how can I read complex data from it quickly while keeping the most accuracy?

Answer

This is related to your ham post.

samples = np.fromfile(filename, np.complex128)

and

Those codes equal to -1.9726906072368233e-31,+3.6405886029665884e-23.

No, they don't equal that. That's just your interpretation of bytes as float64. That interpretation is incorrect!

You assume these are 64-bit floating point numbers. They are not; you really need to stop assuming that; it's wrong, and we can't help you if you still act as if it were 64-bit floats forming a 128 bit complex value.

Besides documents, I compare the byte content in the answer, that is more than reading docs.

As I already pointed out, that is wrong. Your computer can read anything as any type, just as you tell it to, even if that's not the type the data was originally stored in. You stored complex64, but read complex128. That's why your values are so implausible.

It's 32-bit floats, forming a 64 bit complex value. The official block documentation for the file sink also points that out, and even explains the numpy dtype you need to use!
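To make the point concrete, here is a minimal sketch that decodes the eight bytes printed in the question under different dtypes. It assumes little-endian byte order (typical for x86 machines); the variable names are only illustrative.

```python
import numpy as np

# The 8 bytes printed in the question.
raw = b'\x14\x00\x80?\xb5\x0c\xf81'

# Intended reading: two float32 values forming one complex64 sample.
# "<c8" / "<f4" assume little-endian storage (an assumption, not stated in the question).
as_complex64 = np.frombuffer(raw, dtype="<c8")
as_float32 = np.frombuffer(raw, dtype="<f4")

# Mistaken reading: the very same 8 bytes taken as a single 64-bit float.
as_float64 = np.frombuffer(raw, dtype="<f8")

print("complex64:", as_complex64)
print("float32:  ", as_float32)
print("float64:  ", as_float64)
```

Read as complex64, the bytes give a real part very close to 1 and a tiny imaginary part, which is a plausible sample. Read as float64, the same bytes give a vanishingly small number that has nothing to do with the stored data.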

Anyway, you can use numpy's memmap functionality to map the file contents without reading them all into RAM. That works. Again, you need to use the right dtype, which is, to repeat this for the 10th time, not complex128.

It's really easy:

data = numpy.memmap(filename, dtype=numpy.complex64)

done.
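A memory-mapped file that large is usually processed in slices so that only one chunk is resident at a time. The following is a rough sketch of that pattern, not a definitive recipe: `mean_power`, `chunk_samples`, and the small demo file are hypothetical names made up for illustration. The samples stay complex64 as stored; only the running sum is kept in float64 so that rounding error does not build up over billions of samples.

```python
import os
import tempfile

import numpy as np


def mean_power(path, chunk_samples=1_000_000):
    """Average |x|^2 over a complex64 file, reading one chunk at a time."""
    # mode="r" maps the file read-only; nothing is loaded until it is sliced.
    data = np.memmap(path, dtype=np.complex64, mode="r")
    n = data.shape[0]
    total = 0.0
    for start in range(0, n, chunk_samples):
        chunk = data[start:start + chunk_samples]
        # Accumulate in float64 to limit rounding error in the long sum.
        total += float(np.sum(np.abs(chunk).astype(np.float64) ** 2))
    del data  # release the mapping
    return total / n


# Small self-contained demo with a temporary file standing in for the TB-size one.
rng = np.random.default_rng(0)
demo = (rng.standard_normal(10_000)
        + 1j * rng.standard_normal(10_000)).astype(np.complex64)

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.c64")
    demo.tofile(demo_path)
    result = mean_power(demo_path, chunk_samples=4096)

expected = float(np.mean(np.abs(demo).astype(np.float64) ** 2))
print(result, expected)
```

The chunk size is a tuning knob: larger chunks mean fewer Python-level iterations, smaller chunks mean less memory in use at once.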

https://en.xdnf.cn/q/119628.html
