Returning a row from a CSV, if specified value within the row matches condition

2024/9/8 10:56:22

Ahoy, I'm writing a Python script to filter some large CSV files.

I only want to keep rows which meet my criteria.

My input is a CSV file in the following format:

Locus         Total_Depth  Average_Depth_sample   Depth_for_17
chr1:6484996  1030         1030                   1030
chr1:6484997  14           14                     14
chr1:6484998  0            0                      0

I want to return lines where the Total_Depth is 0.

I've been following this answer to read the data, but I am stuck trying to iterate over the rows and pull out the lines that meet my condition.

Here is the code I have so far:

import csv
f = open("file path", 'rb')
reader = csv.reader(f) #reader object which iterates over a csv file(f)
headers = reader.next() #assign the first row to the headers variable
column = {} #list of columns
for h in headers: #for each header
    column[h] = []
for row in reader: #for each row in the reader object
    for h, v in zip(headers, row): #combine header names with row values (v) in a series of tuples
        column[h].append(v) #append each value to the relevant column

I understand that my data is now stored in a dictionary that maps each header to a list of column values. I want to filter it on the "Total_Depth" key, ideally with an 'if' statement that selects the relevant rows, but I am unsure how to do this with the dictionary structure.

Any advice would be greatly appreciated. SB :)

Answer

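Use a list comprehension over a csv.DictReader. DictReader yields each row as a dictionary keyed by the header names, so you can test row['Total_Depth'] directly.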

import csv

with open("filepath", 'rb') as f:
    reader = csv.DictReader(f)
    # keep only the rows whose Total_Depth is 0 (csv values are read as strings)
    rows = [row for row in reader if row['Total_Depth'] == '0']

for row in rows:
    print row
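The answer's code is written for Python 2 (`print row`, `'rb'` mode). As a rough Python 3 sketch of the same idea, the version below keeps the rows where Total_Depth is zero, as the question asks. The function name `rows_with_zero_depth` is hypothetical, the file is assumed to be comma-delimited, and an in-memory string stands in for the real file. It yields rows one at a time, which may help with large files.

```python
import csv
import io


def rows_with_zero_depth(lines, column="Total_Depth"):
    """Yield each row (as a dict) whose `column` value equals zero."""
    reader = csv.DictReader(lines)
    for row in reader:
        # Compare numerically so that "0", "0.0" and " 0" all count as zero.
        if float(row[column]) == 0:
            yield row


# Stand-in for the real file; with a file on disk you would pass
# open(path, newline="") instead of io.StringIO.
sample = io.StringIO(
    "Locus,Total_Depth,Average_Depth_sample,Depth_for_17\n"
    "chr1:6484996,1030,1030,1030\n"
    "chr1:6484997,14,14,14\n"
    "chr1:6484998,0,0,0\n"
)

zero_rows = list(rows_with_zero_depth(sample))
for row in zero_rows:
    print(row["Locus"], row["Total_Depth"])
```

If the file is actually tab-separated, passing `delimiter="\t"` to `csv.DictReader` should be enough; that is an assumption about the data, not something the question confirms.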

See the csv.DictReader documentation.
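If you would rather keep the column dictionary built in the question, one possible approach is to find the positions where Total_Depth is zero and then pull the value at each of those positions out of every column. This is only a sketch in Python 3: the `column` dictionary below is typed in by hand to mirror the sample data, and `keep` and `matching_rows` are illustrative names.

```python
# Column-oriented data in the same shape the question's loop builds:
# each header maps to a list of that column's values (all strings).
column = {
    "Locus": ["chr1:6484996", "chr1:6484997", "chr1:6484998"],
    "Total_Depth": ["1030", "14", "0"],
    "Average_Depth_sample": ["1030", "14", "0"],
    "Depth_for_17": ["1030", "14", "0"],
}
headers = list(column)

# Positions of the rows whose Total_Depth is zero.
keep = [i for i, depth in enumerate(column["Total_Depth"]) if depth == "0"]

# Rebuild each matching row by taking the i-th value from every column.
matching_rows = [[column[h][i] for h in headers] for i in keep]
print(matching_rows)
```

This works, but it holds every column in memory first, so for large files filtering row by row while reading is likely the simpler choice.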

