Loading my data with numpy genfromtxt gives errors

2024/9/20 20:40:59

My data file contains 7500 lines like these:

Y1C 1.53    -0.06   0.58    0.52    0.42    0.16    0.79        -0.6    -0.3    
-0.78   -0.14   0.38    0.34    0.23    0.26    -1.8    -0.1    -0.17   0.3 
0.6 0.9 0.71    0.5 0.49    1.06    0.25    0.96    -0.39   0.24    0.69    
0.41    0.7 -0.16   -0.39   0.6 1.04    0.4 -0.04   0.36    0.23    -0.14   
-0.09   0.15    -0.46   -0.05   0.32    -0.54   -0.28   -0.15   1.34    0.29    
0.59    -0.43   -0.55   -0.18   -0.01   0.68        -0.06   -0.11   -0.67                   
-0.25   -0.34   -0.38   0.02    -0.21   0.12    0.01    0.07    0.15    0.14                
0.15    -0.11   0.07    -0.41   -0.2    0.24    0.06    0.12    0.12    0.11    
0.1 0.24    -0.71   0.22    -0.02   0.15    0.84    1.39    0.13    0.48    
0.19    -0.23   -0.12   0.33    0.37    0.18    0.06    0.32    0.09    
-0.09   0.02    -0.01   -0.06   -0.23   0.52    0.14    0.24    -0.05   0.37    
0.1 0.45    0.38    1.34    0.74    0.5 0.92    0.91    1.34    1.78    2.26    
0.05    0.29    0.53    0.17    0.41    0.47    0.47    1.21    0.87    0.68    
1.08    0.89    0.13    0.5 0.57    -0.5    -0.78   -0.34   -0.3    0.54    
0.31    0.64    1.23    0.335   0.36    -0.65   0.39    0.39    0.31    0.73    
0.54    0.3 0.26    0.47    0.13    0.24    -0.6    0.02    0.11    0.27    
0.21    -0.3    -1  -0.44   -0.15   -0.51   0.3 0.14    -0.15   -0.27   -0.27
Y2W -0.01   -0.3    0.23    0.01    -0.15   0.45    -0.04   0.14    -1.16   
-0.14   -0.56   -0.13   0.77    0.77    -0.57   0.48    0.22    -0.08   
-1.81   -0.46   -0.17   0.2 -0.18   -0.45   -0.4    1.35    0.81    1.21    
0.52    0.02    -0.06   0.37    0   -0.38   -0.02   0.48    0   0.58    0.81    
0.54    0.18    -0.11   0.03    0.1 -0.38   0.17    0.37    -0.05   0.13    
-0.01   -0.17   0.36    0.22    0   -1.4    -0.67   -0.45   -0.62   -0.58   
-0.47   -0.86   -1.12   -0.43   0.1 0.06    -0.45   -0.14   0.68    -0.16      
0.14    0.14    0.18    0.14    0.17    0.13    0.07    0.05    0.04    0.07    
-0.01   0.03    0.05    0.02    0.12    0.34    -0.04   -0.75   1.68    0.23    
0.49    0.38    -0.57   0.17    -0.04   0.19    0.22    0.29    -0.04   -0.3    
0.18    0.04    0.3 -0.06   -0.07   -0.21   -0.01   0.51    -0.04   -0.04   
-0.23   0.06    0.9 -0.14   0.19    2.5 2.84    3.27    2.13    2.5 2.66    
4.16    3.52    -0.12   0.13    0.44    0.32    0.44    0.46    0.7 0.68    
0.99    0.83    0.74    0.51    0.33    0.22    0.01    0.33    -0.19   0.4 
0.41    0.07    0.18    -0.01   0.45    -0.37   -0.49   1.02    -0.59   
-1.44   -1.53   -0.74   -1.48   0.12    0.05    0.02    -0.1    0.57    
-0.36   0.1 -0.16   -0.23   -0.34   -0.61   -0.37   -0.14   -0.22   -0.27   
-0.08   -0.08   -0.17   0.18    -0.74
Y3W 0.15    -0.07   -0.25   -0.3    -1.12   -0.67   -0.15   -0.43   0.63    
0.92    0.25    0.33    0.81    -0.12   -0.12   0.67    0.86    0.86        
1.54    -0.3    0   -0.29   -0.74   0.15    0.59    0.15    0.34    0.23    
0.5 0.52    0.25    0.86    0.53    0.51    0.25        -1.29   -1.79           
-0.45   -0.64   0.01    -0.58   -0.51   -0.74   -1.32               -0.47       
-0.81   0.55    -0.09   0.46    -0.3    -0.2    -0.81   -1.56   -2.74   1.03    
1   1.01    0.29    -0.64   -1.03   0.07    0.46    0.33    0.04    -0.6    
-0.64   -0.51   -0.36   -0.1    0.13    -1.4    -1.17   -0.64   -0.16   -0.5    
-0.47   0.75    0.62    0.7 1.06    0.93    0.56    -2.25   -0.94   -0.09   
0.08    -0.15   -1.6    -1.43   -0.84   -0.25   -1.22   -0.92   -1.22   
-0.97   -0.84   -0.89   0.24    0   -0.04   -0.64   -0.94   -1.56   -2.32   
0.63    -0.17   -3.06   -2.4    -2  -1.4    -0.81   -1.6    -3.06   -1.79   
0.17    0.28    -0.67   -2.82   -1.47   -1.82   -1.69   -1.38   -1.96   
-1.88   -2.34   -3.06   -0.18   0.5 -0.03   -0.49   -0.61   -0.54   -0.37   
0.1 -0.92   -1.79   -0.03   -0.54   0.94    -1  0.15    0.95    0.55    
-0.36   0.4 -0.73   0.85    -0.26   0.55    0.14    -0.36   0.38    0.87    
0.62    0.66    0.79    -0.67   0.48    0.62    0.48    0.72    0.73    0.29    
-0.3    -0.81
Y4W 0.24    0.76    0.2 0.34    0.11    0.07    0.01    0.36    0.4 -0.25   
-0.45   0.09    -0.97   0.19    0.28    -1.81   -0.64   -0.49   -1.27   
-0.95   -0.1    0.12    -0.1    0   -0.08   0.77    1.02    0.92    0.56    
0.1 0.7 0.57    0.16    1.29    0.82    0.55    0.5 1.83    1.79    0.01    
0.24    -0.67   -0.85   -0.42   -0.37   0.2 0.07    -0.01   -0.17   -0.2    
-0.43   -0.34   0.12    -0.21   -0.23   -0.22   -0.1    -0.07   -0.61   
-0.14   -0.43   -0.97   0.27    0.7 0.54    0.11    -0.5    -0.39   0.01    
0.61    0.88    1   0.35    0.67    0.6 0.78    0.46    0.09    -0.06   
-0.35   0.08    -0.14   -0.32   -0.11   0   0.01    0.02    0.77    0.18    
0.36    -1.15   -0.42   -0.19   0.06    -0.25   -0.81   -0.63   -1.43   
-0.85   -0.88   -0.68   -0.59   -1.01   -0.68   -0.71   0.15    0.08    0.08    
-0.03   -0.2    0.03    -0.18   -0.01   -0.08   -1.61   -0.67   -0.74   
-0.54   -0.8    -1.02   -0.84   -1.91   -0.22   -0.02   0.05    -0.84   
-0.65   -0.82   -0.4    -0.83   -0.9    -1.04   -1.23   -0.91   0.28    0.68    
0.57    -0.02   0.4 -1.52   0.17    0.44    -1.18   0.04    0.17    0.16    
0.04    -0.26   0.04    0.1 -0.11   -0.64   -0.09   -0.16   0.16    -0.05   
0.39    0.39    -0.06   0.46    0.2 -0.45   -0.35   -1.3    -0.26   -0.29   
0.02    0.16    0.18    -0.35   -0.45   -1.04   -0.69
Y5C 2.85    3.34                            -1  -0.47   -0.66   -0.03   1.41    
0.8 0   0.41    -0.14   -0.86   -0.79   -1.69       0           0   1.52    
1.29    0.84    0.58    1.02    1.35    0.45    1.02    1.47    0.82    0.46    
0.25    0.77    0.93            -0.58   -0.67   -0.18   -0.56   -0.01   0.25    
-0.71   -0.49           -0.43   0   -1.06   0.44    -0.29   0.26    -0.04   
-0.14   -0.1    -0.12   -1.6    0.33    0.62    0.52    0.7 -0.22   0.44    
-0.6    0.86    1.19    1.58    0.93    1   0.85    1.24    1.06    0.49    
0.26    0.18    0.3 -0.09   -0.42   0.05    0.54    0.24    0.37    0.86    
0.9 0.49    -1.47   -0.2    -0.43   0.2 0.1 -0.81   -0.74   -1.36   -0.97   
-0.94   -0.86   -1.56   -1.89   -1.89   -1.06   0.12    0.06    0.04    
-0.01   -0.12   0.01    -0.15   0.76    0.89    0.71    -1.12   0.03    
-0.86   0.26    0   -0.25   -0.06   0.19    0.41    0.58    -0.46   0.01    
-0.15   0.04    -1.01   -0.57   -0.71   -0.3    -1.01       1.83    0.59        
1.04    -1.43   0.38    0.65    -6.64   -0.42   0.24    0.46    0.96    0.24    
0.7 1.21    0.6 0.12    0.77    -0.03   0.53    0.31    0.46    0.51    
-0.45   0.23    0.32    -0.34   -0.1    0.1 -0.45   0.74    -0.06   0.21    
0.29    0.45    0.68    0.29    0.45
Y7C -0.22   -0.12   -0.29   -0.51   -0.81   -0.47   0.28    -0.1    0.15    
0.38    0.18    -0.27   0.12    -0.15   0.43    0.25    0.19    0.33    0.67    
0.86    -0.56   -0.29   -0.36   -0.42   0.08    0.04    -0.04   0.15    0.38    
-0.07   -0.1    -0.2    -0.03   -0.29   0.06    0.65    0.58    0.86    2.05    
0.3 0.33    -0.29   -0.23   -0.15   -0.32   0.08    0.34    0.15    0   
-0.01   0.28    0.36    0.25    0.46    0.4     0.7 0.49    0.97    1.04    
0.36    -0.47   -0.29   0.77    0.57    0.45    0.77    0.24    -0.23   0.12    
0.49    0.62    0.49    0.84    0.89    1.08    0.87    -0.18   -0.43   
-0.39   -0.18   -0.02   0.01    0.2 -0.2    -0.03   0.01    0.25    0.1 
-0.07   -1.43   -0.2    -0.4    0.32    0.72    -0.42   -0.3    -0.38   
-0.22   -0.81   -1.15   -1.6    -1.89   -2.06   -2.4    0.08    0.34    0.1 
-0.15   -0.06   -0.17   -0.47   -0.4    0.15    -1.22   -1.43   -1.03   
-1.03   -1.64   -1.84   -2.64   -2  0.05    0.4 0.88    -1.54   -1.21   
-1.46   -1.92   -1.52   -1.92   -1.7    -1.94   -1.86   -0.1    -0.02   
-0.22   -0.34   -0.48   0.28    0   0.14    0.4 -0.29   -0.27   -0.3    
-0.67   -0.09   0.23    0.33    0.23    0.1 0.38    -0.51   0.23    -0.73   
0.22    -0.47   0.24    0.68    0.53    0.23    -0.1    0.11    -0.18   0.16    
0.68    0.55    0.28    -0.03   0.03    0.08    0.12

Some values are missing. I wanted to load the file as a matrix, so I used:

data = np.genfromtxt("This_data.txt", delimiter='\t', missing_values=np.nan)

When I run it I get:

Traceback (most recent call last):
  File "matrix.py", line 8, in <module>
    data = np.genfromtxt("This_data.txt", delimiter='\t', missing_values=np.nan ,usecols=np.arange(0,174))
  File "/home/anaconda2/lib/python2.7/numpy/lib/npyio.py", line 1769, in genfromtxt
    raise ValueError(errmsg)
ValueError: Some errors were detected !
    Line #25 (got 172 columns instead of 174)

I also tried:

data = np.genfromtxt("This_data.txt", delimiter='\t', missing_values=np.nan ,usecols=np.arange(0,174))

But I get the same error. Any suggestions?

Answer

A short sample bytestring substitute for a file:

In [168]: txt = b"""Y7C\t-0.22\t-0.12\t-0.29\t-0.51\t-0.81
     ...: Y7C\t-0.22\t-0.12\t-0.29\t-0.51\t-0.81
     ...: Y7C\t-0.22\t-0.12\t-0.29\t-0.51\t-0.81
     ...: """

A minimal load with the correct delimiter. Note that the first column is nan, because genfromtxt can't convert those strings to float.

In [169]: np.genfromtxt(txt.splitlines(),delimiter='\t')
Out[169]: 
array([[  nan, -0.22, -0.12, -0.29, -0.51, -0.81],
       [  nan, -0.22, -0.12, -0.29, -0.51, -0.81],
       [  nan, -0.22, -0.12, -0.29, -0.51, -0.81]])

With dtype=None it sets the dtype of each column automatically, creating a structured array:

In [170]: np.genfromtxt(txt.splitlines(),delimiter='\t',dtype=None)
Out[170]: 
array([(b'Y7C', -0.22, -0.12, -0.29, -0.51, -0.81),
       (b'Y7C', -0.22, -0.12, -0.29, -0.51, -0.81),
       (b'Y7C', -0.22, -0.12, -0.29, -0.51, -0.81)],
      dtype=[('f0', 'S3'), ('f1', '<f8'), ('f2', '<f8'), ('f3', '<f8'), ('f4', '<f8'), ('f5', '<f8')])

Spell out the columns to use, skipping the first:

In [172]: np.genfromtxt(txt.splitlines(),delimiter='\t',usecols=np.arange(1,6))
Out[172]: 
array([[-0.22, -0.12, -0.29, -0.51, -0.81],
       [-0.22, -0.12, -0.29, -0.51, -0.81],
       [-0.22, -0.12, -0.29, -0.51, -0.81]])

But if I ask for more columns than it finds, I get an error like yours:

In [173]: np.genfromtxt(txt.splitlines(),delimiter='\t',usecols=np.arange(1,7))
---------------------------------------------------------------------------
.... 
ValueError: Some errors were detected !
    Line #1 (got 6 columns instead of 6)
    Line #2 (got 6 columns instead of 6)
    Line #3 (got 6 columns instead of 6)
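
Before changing the genfromtxt call, it can help to see which lines are short. The following is only a rough sketch: count_fields is a hypothetical helper (not part of numpy), the sample text stands in for the real file, and it assumes the most common field count is the intended one.

```python
import io
from collections import Counter


def count_fields(lines, delimiter="\t"):
    """Return (line_number, n_fields) pairs, with 1-based line numbers."""
    return [
        (i, len(line.rstrip("\r\n").split(delimiter)))
        for i, line in enumerate(lines, start=1)
    ]


# Stand-in for open("This_data.txt"); the second line is one field short.
sample = io.StringIO(
    "Y1C\t1.53\t-0.06\t0.58\n"
    "Y2W\t-0.01\t0.23\n"
    "Y3W\t0.15\t-0.07\t-0.25\n"
)

counts = count_fields(sample)

# Assumption: the most frequent field count is the "right" one.
expected = Counter(n for _, n in counts).most_common(1)[0][0]
bad_lines = [(i, n) for i, n in counts if n != expected]
print(bad_lines)
```

Each entry in bad_lines is a line number and the number of fields found there, which should correspond to the lines genfromtxt complains about.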

Your missing_values parameter doesn't help; that's not what it is for.

This is the correct use of missing_values - to detect the string value and replace it with a valid float value:

In [177]: np.genfromtxt(txt.splitlines(),delimiter='\t',missing_values='Y7C',filling_values=0)
Out[177]: 
array([[ 0.  , -0.22, -0.12, -0.29, -0.51, -0.81],
       [ 0.  , -0.22, -0.12, -0.29, -0.51, -0.81],
       [ 0.  , -0.22, -0.12, -0.29, -0.51, -0.81]])

If the file has enough delimiters, genfromtxt can treat the empty fields between them as missing values:

In [178]: txt = b"""Y7C\t-0.22\t-0.12\t-0.29\t-0.51\t-0.81\t\t
     ...: Y7C\t-0.22\t-0.12\t-0.29\t-0.51\t-0.81\t\t
     ...: Y7C\t-0.22\t-0.12\t-0.29\t-0.51\t-0.81\t\t
     ...: """
In [179]: np.genfromtxt(txt.splitlines(),delimiter='\t')
Out[179]: 
array([[  nan, -0.22, -0.12, -0.29, -0.51, -0.81,   nan,   nan],
       [  nan, -0.22, -0.12, -0.29, -0.51, -0.81,   nan,   nan],
       [  nan, -0.22, -0.12, -0.29, -0.51, -0.81,   nan,   nan]])
In [180]: np.genfromtxt(txt.splitlines(),delimiter='\t',filling_values=0)
Out[180]: 
array([[ 0.  , -0.22, -0.12, -0.29, -0.51, -0.81,  0.  ,  0.  ],
       [ 0.  , -0.22, -0.12, -0.29, -0.51, -0.81,  0.  ,  0.  ],
       [ 0.  , -0.22, -0.12, -0.29, -0.51, -0.81,  0.  ,  0.  ]])
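
If some lines simply stop early, one possible workaround is to pad them with extra delimiters before handing them to genfromtxt. This is a sketch under a strong assumption: the missing values are the trailing ones on each short line. If values are missing in the middle without their tab, padding will shift the columns and this will not recover them. pad_lines is a hypothetical helper, and the three sample lines stand in for the real file.

```python
import numpy as np


def pad_lines(lines, n_cols, delimiter="\t"):
    """Yield each line padded with trailing delimiters up to n_cols fields."""
    for line in lines:
        line = line.rstrip("\r\n")
        missing = n_cols - (line.count(delimiter) + 1)
        yield line + delimiter * max(missing, 0)


# Stand-in for the file contents: line 2 is one field short,
# line 3 has an empty field in the middle.
raw = [
    "Y1C\t1.53\t-0.06\t0.58",
    "Y2W\t-0.01\t0.23",
    "Y3W\t0.15\t\t-0.25",
]

padded = list(pad_lines(raw, n_cols=4))

# Skip the label column (0) and load the rest as floats;
# empty fields become nan.
data = np.genfromtxt(padded, delimiter="\t", usecols=np.arange(1, 4))
print(data)
```

With a real file, the lines would come from open("This_data.txt") and n_cols would be the expected column count (174 in the question).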

I believe the pandas csv reader can handle 'ragged' columns and missing values better.
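
As a rough illustration of the pandas route, the sketch below reads a small tab-separated sample with short rows. The sample text and the column count of 4 are assumptions for the demo. Passing explicit names tells read_csv how many columns to expect, and rows with fewer fields are filled with NaN.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the file: row 2 has an empty field, row 3 is one field short.
txt = (
    "Y1C\t1.53\t-0.06\t0.58\n"
    "Y2W\t-0.01\t\t0.23\n"
    "Y3W\t0.15\t-0.07\n"
)

n_cols = 4  # assumed maximum number of fields per row, label included

df = pd.read_csv(
    io.StringIO(txt),
    sep="\t",
    header=None,
    names=list(range(n_cols)),  # fixes the expected width
    index_col=0,                # use the label column as the row index
)

data = df.to_numpy()  # plain float array, with NaN where values are missing
print(df)
```

For the real file, io.StringIO(txt) would be replaced by the file name, and n_cols by the widest row (174 in the question).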

https://en.xdnf.cn/q/119534.html
