Parsing a CSV file using pandas

2024/9/20 12:38:56

I have been using matplotlib for quite some time now and it is great; however, I want to switch to pandas, and my first attempt at it didn't go so well.

My data set looks like this:

sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
zack,145,784,2.6,547
mike,110,984,2.6,548
etc.....

First, I want to search the CSV for every row with the name mike and put those rows in their own list. With this list I want to be able to do some math, for example add sam[3] + winter[4] or compute sam[1]/10. The last part would be to plot its columns against each other.

Going through this page

http://pandas.pydata.org/pandas-docs/stable/io.html#io-read-csv-table

The only approach I see there assumes the file has column headers; however, mine has none. I only know the position within a row of the values I want.

So my questions are:

  1. How do I create a list for each name (sam, winter, summer)?
  2. Is this method efficient if my CSV has millions of data points?
  3. Can I use matplotlib to plot a pandas DataFrame?

For example:

import matplotlib.pyplot as plt
fig1 = plt.figure(figsize=(10, 10))
ax = fig1.add_subplot(211)
ax.plot(mike[1], winter[3], label='Mike vs Winter speed', color='red')

Answer

You can read a CSV without headers:

import pandas as pd
data = pd.read_csv(filepath, header=None)
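
As a minimal sketch of what header=None produces, the snippet below reads three of the sample rows from the question. The in-memory string and the name csv_text are assumptions for illustration; io.StringIO only stands in for a real file path.

```python
import io

import pandas as pd

# Three sample rows copied from the question; csv_text is a hypothetical
# stand-in for the contents of the real file.
csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
"""

# header=None tells pandas that the first line is data, not column names.
data = pd.read_csv(io.StringIO(csv_text), header=None)

print(data.columns.tolist())  # integer column labels starting at 0
print(data.shape)             # (rows, columns)
```

Because the labels are plain integers, data[0] is the name column and data[1] through data[4] are the numeric columns.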

Columns will be numbered starting from 0. Selecting and filtering:

all_summers = data[data[0] == 'summer']
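
The same filter can stand in for the per-name "lists" the question asks about, and the column arithmetic then works on the filtered frames. This is only a sketch on the sample rows: csv_text, sam, winter, scaled and combined are illustrative names, and resetting the index is an assumption made here so that sam[3] + winter[4] pairs rows by position rather than by their original row labels.

```python
import io

import pandas as pd

# Sample rows copied from the question (hypothetical stand-in for the file).
csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
"""
data = pd.read_csv(io.StringIO(csv_text), header=None)

# One filtered frame per name; reset_index gives each frame labels 0, 1, ...
sam = data[data[0] == "sam"].reset_index(drop=True)
winter = data[data[0] == "winter"].reset_index(drop=True)

# Element-wise math on columns, mirroring sam[1]/10 and sam[3] + winter[4].
scaled = sam[1] / 10
combined = sam[3] + winter[4]

print(scaled.tolist())
print(combined.tolist())
```

Without reset_index, pandas would align the two Series on their original row labels, which do not overlap here, and the sum would come out as all NaN.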

To aggregate by the first column, group on it:

data.groupby(0).sum()
data.groupby(0).count()
...

Selecting a row after grouping:

sums = data.groupby(0).sum()
sums.loc['sam']
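
A short sketch of the grouping steps on the eight sample rows from the question; csv_text and counts are names introduced here for illustration only.

```python
import io

import pandas as pd

# The eight sample rows from the question (hypothetical stand-in for the file).
csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
zack,145,784,2.6,547
mike,110,984,2.6,548
"""
data = pd.read_csv(io.StringIO(csv_text), header=None)

# After grouping, the names from column 0 become the index of the result.
sums = data.groupby(0).sum()
counts = data.groupby(0).count()

print(sums.loc["sam"])     # all summed columns for sam
print(sums.loc["sam", 1])  # one cell: column 1 summed over the sam rows
print(counts.loc["sam", 1])
```

Passing both a row label and a column label to .loc picks out a single value, which is convenient for the kind of one-off arithmetic described in the question.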

Plotting example:

import matplotlib.pyplot as plt
sums.plot()
plt.show()

For more details about plotting, see: http://pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html
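
On the question of mixing matplotlib with pandas, here is a rough sketch that reuses the figure and subplot layout from the question. It assumes the sample rows below, plots the summer rows instead of mike and winter (which have different row counts in the sample), and selects the Agg backend only so that it runs without a display; csv_text, summer and ax2 are illustrative names.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: headless run; drop this line to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question (hypothetical stand-in for the file).
csv_text = """sam,123,184,2.6,543
winter,124,284,2.6,541
summer,178,384,2.6,542
summer,165,484,2.6,544
winter,178,584,2.6,545
sam,112,684,2.6,546
"""
data = pd.read_csv(io.StringIO(csv_text), header=None)
summer = data[data[0] == "summer"]

fig1 = plt.figure(figsize=(10, 10))

# Plain matplotlib: pandas columns can be passed straight to ax.plot.
ax = fig1.add_subplot(211)
ax.plot(summer[1], summer[2], label="summer: column 1 vs column 2", color="red")
ax.legend()

# pandas plotting: ax= draws onto an existing matplotlib Axes.
ax2 = fig1.add_subplot(212)
data.groupby(0).sum().plot(ax=ax2)

# Rendering to an in-memory buffer stands in for plt.show() here.
fig1.savefig(io.BytesIO(), format="png")
```

In an interactive session, removing the matplotlib.use line and ending with plt.show() should display the same figure.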

