Pandas: Use iterrows on Dataframe subset

2024/10/15 15:23:18

What is the best way to do iterrows with a subset of a DataFrame?

Let's take the following simple example:

import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Product': list('AAAABBAA'),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 3],
    'Start': [DT.datetime(2013, 1, 1, 9, 0),
              DT.datetime(2013, 1, 1, 8, 5),
              DT.datetime(2013, 2, 5, 14, 0),
              DT.datetime(2013, 2, 5, 16, 0),
              DT.datetime(2013, 2, 8, 20, 0),
              DT.datetime(2013, 2, 8, 16, 50),
              DT.datetime(2013, 2, 8, 7, 0),
              DT.datetime(2013, 7, 4, 8, 0)]})
df = df.set_index(['Start'])

Now I would like to modify a subset of this DataFrame using the iterrows function, e.g.:

for i, row_i in df[df.Product == 'A'].iterrows():
    row_i['Product'] = 'A1'  # actually a more complex calculation

However, the changes do not persist.
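A likely reason, sketched on a smaller made-up frame (the three-row data here is an assumption for illustration, not the data above): boolean indexing such as df[df.Product == 'A'] builds a new DataFrame, and iterrows then hands out a separate Series for each row, so the assignment never reaches the original.

```python
import pandas as pd

# Small hypothetical frame, used only to show the behaviour.
df = pd.DataFrame({'Product': list('AAB'), 'Quantity': [5, 2, 1]})

subset = df[df.Product == 'A']      # boolean indexing builds a new DataFrame
for i, row_i in subset.iterrows():
    row_i['Product'] = 'A1'         # changes only the temporary row Series

# The original frame still holds the old labels.
unchanged = df['Product'].tolist()
print(unchanged)
```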

Is there any way (other than a manual lookup using the index 'i') to make persistent changes to the original DataFrame?

Answer

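Why do you need iterrows() for this? I think it's always preferable to use vectorized operations in pandas (or numpy):

df.loc[df['Product'] == 'A', 'Product'] = 'A1'

(Older pandas versions also accepted .ix here, but .ix was deprecated in pandas 0.20 and removed in 1.0, so .loc is the one to use.)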
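As a rough sketch of how this could extend to the "more complex calculation" mentioned in the question: compute the new values on the selected rows and write them back through the same boolean mask with .loc. The relabel function below is a hypothetical stand-in for the real calculation, and the write-back relies on the 'Start' index being unique, as it is in the example data.

```python
import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Product': list('AAAABBAA'),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 3],
    'Start': [DT.datetime(2013, 1, 1, 9, 0), DT.datetime(2013, 1, 1, 8, 5),
              DT.datetime(2013, 2, 5, 14, 0), DT.datetime(2013, 2, 5, 16, 0),
              DT.datetime(2013, 2, 8, 20, 0), DT.datetime(2013, 2, 8, 16, 50),
              DT.datetime(2013, 2, 8, 7, 0), DT.datetime(2013, 7, 4, 8, 0)],
}).set_index('Start')

# Simple case: a constant value for every matching row.
mask_a = df['Product'] == 'A'
df.loc[mask_a, 'Product'] = 'A1'

# Hypothetical per-row calculation standing in for the real one.
def relabel(row):
    return f"{row['Product']}-{row['Quantity']}"

# Compute on the subset, then assign back through the mask;
# the resulting Series is aligned to df on the (unique) index.
mask_b = df['Product'] == 'B'
df.loc[mask_b, 'Product'] = df.loc[mask_b].apply(relabel, axis=1)

print(df)
```

Because the assignment goes through df.loc on the original frame, the changes persist, unlike edits made to the rows yielded by iterrows.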

https://en.xdnf.cn/q/69266.html
