Reading a CSV header ignoring white space and case

2024/10/1 21:36:47

Is there a way to read the header of a CSV file so that white space and case are ignored? For now I use csv.DictReader like this:

import csv

csvDict = csv.DictReader(open('csv-file.csv', newline=''))

# determine column_A name
if 'column_A' in csvDict.fieldnames:
    column_A = 'column_A'
elif ' column_A' in csvDict.fieldnames:  # extra space
    column_A = ' column_A'
elif 'Column_A' in csvDict.fieldnames:  # capital A
    column_A = 'Column_A'

# get column_A data
for lineDict in csvDict:
    print(lineDict[column_A])

As you can see from the code, the headers in my CSV files sometimes differ by extra white space or capital letters, for example:

  • "column_A"
  • " column_A"
  • "Column_A"
  • " Column_A"
  • ...

I want to use something like this:

    column_A = ' Column_A'.strip().lower()
    print(lineDict[column_A])

Any ideas?

Answer

You can redefine reader.fieldnames:

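import csv
import io

content = '''column_A " column_B"
1 2'''

# csv needs a text stream in Python 3, so use io.StringIO rather than io.BytesIO
reader = csv.DictReader(io.StringIO(content), delimiter=' ')
# normalize the header names: strip white space and lower-case them
reader.fieldnames = [field.strip().lower() for field in reader.fieldnames]
for line in reader:
    print(line)

yields

{'column_a': '1', 'column_b': '2'}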
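
If several files need the same treatment, the idea above can be wrapped in a small helper. This is only a sketch: the name normalized_dict_reader is made up for illustration and is not part of the csv module, and it assumes that no two header names become identical once stripped and lower-cased.

```python
import csv
import io


def normalized_dict_reader(f, **kwargs):
    """Return a csv.DictReader whose header names are stripped and lower-cased.

    Hypothetical helper, not part of the standard library.
    """
    reader = csv.DictReader(f, **kwargs)
    # Accessing fieldnames reads the header row; it is None for an empty file.
    if reader.fieldnames is not None:
        reader.fieldnames = [name.strip().lower() for name in reader.fieldnames]
    return reader


# Headers with stray spaces and mixed case, as in the question.
sample = ' Column_A,column_B \n1,2\n3,4\n'
for row in normalized_dict_reader(io.StringIO(sample)):
    print(row['column_a'])
```

With this, the lookup from the question becomes lineDict[' Column_A'.strip().lower()], whatever spelling the file happens to use.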
