How to remove rows of a DataFrame based off of data from another DataFrame?

2024/9/24 17:14:33

I'm new to pandas and I'm trying to figure this scenario out: I have a sample DataFrame with two products. df =

  Product_Num     Date   Description  Price
           10   1-1-18  Fruit Snacks   2.99
           10   1-2-18  Fruit Snacks   2.99
           10   1-5-18  Fruit Snacks   1.99
           10   1-8-18  Fruit Snacks   1.99
           10  1-10-18  Fruit Snacks   2.99
           45   1-1-18        Apples   2.99
           45   1-3-18        Apples   2.99
           45   1-5-18        Apples   2.99
           45   1-9-18        Apples   1.49
           45  1-10-18        Apples   1.49
           45  1-13-18        Apples   1.49
           45  1-15-18        Apples   2.99

I also have another small DataFrame that looks like this (which shows promotional prices of the same products): df2=

  Product_Num  Price
           10   1.99
           45   1.49

Notice that df2 contains neither the 'Date' nor the 'Description' column. What I want to do is remove every promo-priced row from df (for all dates that are on promo), using the data from df2. What is the best way to do this?

So, I want to see this:

  Product_Num     Date   Description  Price
           10   1-1-18  Fruit Snacks   2.99
           10   1-2-18  Fruit Snacks   2.99
           10  1-10-18  Fruit Snacks   2.99
           45   1-1-18        Apples   2.99
           45   1-3-18        Apples   2.99
           45   1-5-18        Apples   2.99
           45  1-15-18        Apples   2.99

I was thinking of doing a merge on columns Price and Product_Num, then seeing what I can do from there. But I was getting confused because of the multiple dates.

Answer

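Use isin on each column and combine the two masks with &, then negate the result to keep only the non-promo rows:

df.loc[~((df.Product_Num.isin(df2['Product_Num']))&(df.Price.isin(df2['Price']))),:]
Out[246]:
    Product_Num     Date  Description  Price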
0            10   1-1-18  FruitSnacks   2.99
1            10   1-2-18  FruitSnacks   2.99
4            10  1-10-18  FruitSnacks   2.99
5            45   1-1-18       Apples   2.99
6            45   1-3-18       Apples   2.99
7            45   1-5-18       Apples   2.99
11           45  1-15-18       Apples   2.99
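
One caveat with the two-mask version: each isin is checked independently, so a row with Product_Num 10 priced at 1.49 (product 45's promo price) would also be dropped. That never happens in the sample data, but if it can happen in yours, a pairwise check is safer. Here is a minimal sketch, assuming the frames look like the ones in the question; the names keys and is_promo are made up for illustration, and prices are compared for exact float equality:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Product_Num": [10] * 5 + [45] * 7,
    "Date": ["1-1-18", "1-2-18", "1-5-18", "1-8-18", "1-10-18",
             "1-1-18", "1-3-18", "1-5-18", "1-9-18", "1-10-18",
             "1-13-18", "1-15-18"],
    "Description": ["Fruit Snacks"] * 5 + ["Apples"] * 7,
    "Price": [2.99, 2.99, 1.99, 1.99, 2.99,
              2.99, 2.99, 2.99, 1.49, 1.49, 1.49, 2.99],
})
df2 = pd.DataFrame({"Product_Num": [10, 45], "Price": [1.99, 1.49]})

# Treat (Product_Num, Price) as one combined key so the two
# columns are matched together rather than one at a time.
keys = ["Product_Num", "Price"]
is_promo = pd.MultiIndex.from_frame(df[keys]).isin(
    pd.MultiIndex.from_frame(df2[keys])
)

# Keep only the rows whose (product, price) pair is not a promo pair
result = df.loc[~is_promo]
print(result.index.tolist())  # [0, 1, 4, 5, 6, 7, 11]
```

The original index of df is preserved, so the surviving rows line up with the output shown above.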

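Update

A left merge on the shared columns matches Product_Num and Price as pairs. The helper column a is NaN for rows of df with no match in df2, so dropna() leaves only the promo rows, and those are then excluded by index. This relies on df having the default 0..n-1 index and no other missing values:

df.loc[~df.index.isin(df.merge(df2.assign(a='key'),how='left').dropna().index)]
Out[260]:
    Product_Num     Date  Description  Price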
0            10   1-1-18  FruitSnacks   2.99
1            10   1-2-18  FruitSnacks   2.99
4            10  1-10-18  FruitSnacks   2.99
5            45   1-1-18       Apples   2.99
6            45   1-3-18       Apples   2.99
7            45   1-5-18       Apples   2.99
11           45  1-15-18       Apples   2.99
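
The same anti-join can probably be written a bit more directly with merge's indicator flag, which avoids the helper column and the dropna step. This is only a sketch under the same assumptions as above (default RangeIndex on df, unique pairs in df2, exact float matches on Price); the name merged is just for illustration:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Product_Num": [10] * 5 + [45] * 7,
    "Date": ["1-1-18", "1-2-18", "1-5-18", "1-8-18", "1-10-18",
             "1-1-18", "1-3-18", "1-5-18", "1-9-18", "1-10-18",
             "1-13-18", "1-15-18"],
    "Description": ["Fruit Snacks"] * 5 + ["Apples"] * 7,
    "Price": [2.99, 2.99, 1.99, 1.99, 2.99,
              2.99, 2.99, 2.99, 1.49, 1.49, 1.49, 2.99],
})
df2 = pd.DataFrame({"Product_Num": [10, 45], "Price": [1.99, 1.49]})

# indicator=True adds a "_merge" column that says whether each row
# was found only in df ("left_only") or in both frames ("both").
merged = df.merge(df2, on=["Product_Num", "Price"], how="left", indicator=True)

# Rows that never matched df2 are the non-promo rows we want to keep
result = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
```

Note that merge returns a fresh 0..n-1 index, so the row labels only match the original ones when df started with the default index, as it does here.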