Pandas - Groupby and create new DataFrame?

2024/9/30 23:28:10

This is my situation -

In[1]: data
Out[1]: Item                    Type
0  Orange           Edible, Fruit
1  Banana           Edible, Fruit
2  Tomato       Edible, Vegetable
3  Laptop  Non Edible, ElectronicIn[2]: type(data)
Out[2]: pandas.core.frame.DataFrame

What I want to do is create a data frame of only Fruits, so I need to groupby such a way that Fruit exists in Type.

I've tried doing this:

grouped = data.groupby(lambda x: "Fruit" in x, axis=1)

I don't know if that's the way of doing it, I'm having a little tough time understanding groupby. How do I get a new DataFrame of only Fruits?

Answer

You could use

data[data['Type'].str.contains('Fruit')]

import pandas as pddata = pd.DataFrame({'Item':['Orange', 'Banana', 'Tomato', 'Laptop'],'Type':['Edible, Fruit', 'Edible, Fruit', 'Edible, Vegetable', 'Non Edible, Electronic']})
print(data[data['Type'].str.contains('Fruit')])

yields

     Item           Type
0  Orange  Edible, Fruit
1  Banana  Edible, Fruit
https://en.xdnf.cn/q/71026.html

Related Q&A

Can pytest fixtures and confest.py modules be shared across packages?

Suppose I have packageA which provides a class usefulClass, pytest fixtures in a test_stuff.py module, and test configurations in a conftest.py module. Moreover, suppose that I have packageBand package…

Sort a list by presence of items in another list

Suppose I have two lists:a = [30, 10, 90, 1111, 17] b = [60, 1201, 30, 17, 900]How would you sort this most efficiently, such that:list b is sorted with respect to a. Unique elements in b should be pla…

Sample from a multivariate t distribution python

I am wondering if there is a function for sampling from a multivariate student t-distribution in Python. I have the mean vector with 14 elements, the 14x14 covariance matrix and the degrees of freedom …

Why is this Jinja nl2br filter escaping brs but not ps?

I am attempting to implement this Jinja nl2br filter. It is working correctly except that the <br>s it adds are being escaped. This is weird to me because the <p>s are not being escaped and…

select certain monitor for going fullscreen with gtk

I intend to change the monitor where I show a fullscreen window. This is especially interesting when having a projector hooked up.Ive tried to use fullscreen_on_monitor but that doesnt produce any visi…

Load Excel add-in using win32com from Python

Ive seen from various questions on here that if an instance of Excel is opened from Python using:xl = win32com.client.gencache.EnsureDispatch(Excel.Application) xl.Visible = True wb = xl.Workbooks.Open…

iterating through a list removing items, some items are not removed

Im trying to transfer the contents of one list to another, but its not working and I dont know why not. My code looks like this:list1 = [1, 2, 3, 4, 5, 6] list2 = []for item in list1:list2.append(item)…

Apply function to create string with multiple columns as argument

I have a dataframe like this:name . size . type . av_size_type 0 John . 23 . Qapra . 22 1 Dan . 21 . nukneH . 12 2 Monica . 12 . kahless . 15I wa…

Popping items from a list using a loop in Python [duplicate]

This question already has answers here:Strange result when removing item from a list while iterating over it in Python(12 answers)Closed 3 months ago.Im trying to write a for loop in python to pop out …

Django Admin Media prefix URL issue

i ve the following folder structuresrc\BAT\templates\admin\base.html src\BAT\media\base.css src\BAT\media\admin-media\base.csssettings.pyMEDIA_ROOT = os.path.join( APP_DIR, media ) MEDIA_URL = /media/ …