Pandas: select the first couple of rows in each group

2024/9/27 12:13:25

I can't solve this seemingly simple problem, so I'm asking for help here. I have a DataFrame as follows, and I want to select the first two rows in each group of 'a':

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': pd.Series(['NewYork', 'NewYork', 'NewYork', 'Washington', 'Washington', 'Texas', 'Texas', 'Texas', 'Texas']), 'b': np.arange(9)})
df
Out[152]:
            a  b
0     NewYork  0
1     NewYork  1
2     NewYork  2
3  Washington  3
4  Washington  4
5       Texas  5
6       Texas  6
7       Texas  7
8       Texas  8

That is, I want output as follows:

            a  b
0     NewYork  0
1     NewYork  1
2  Washington  3
3  Washington  4
4       Texas  5
5       Texas  6

Thanks a lot for the help.

Answer

In pandas 0.13rc you can do this directly with head, with no need for reset_index (the example below uses a different DataFrame, with columns 'id' and 'value'):

In [11]: df.groupby('id', as_index=False).head(2)
Out[11]:
    id   value
0    1   first
1    1  second
3    2   first
4    2  second
5    3   first
6    3   third
9    4  second
10   4   fifth
11   5   first
12   6   first
13   6  second
15   7  fourth
16   7   fifth

[13 rows x 2 columns]

Note the correct indices: the original row labels are preserved. This is also significantly faster than before (with or without reset_index), even on this small example:

# 0.13rc
In [21]: %timeit df.groupby('id', as_index=False).head(2)
1000 loops, best of 3: 279 µs per loop

# 0.12
In [21]: %timeit df.groupby('id', as_index=False).head(2)  # this didn't work correctly
1000 loops, best of 3: 1.76 ms per loop

In [22]: %timeit df.groupby('id').head(2).reset_index(drop=True)
1000 loops, best of 3: 1.82 ms per loop
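
For reference, here is a minimal sketch of the same idea applied to the DataFrame from the question (columns 'a' and 'b'). The variable name first_two is only illustrative, and the reset_index(drop=True) call is included on the assumption that you want the renumbered 0..5 index shown in the desired output; leave it off to keep the original row labels.

```python
import numpy as np
import pandas as pd

# DataFrame from the question
df = pd.DataFrame({
    'a': ['NewYork', 'NewYork', 'NewYork',
          'Washington', 'Washington',
          'Texas', 'Texas', 'Texas', 'Texas'],
    'b': np.arange(9),
})

# head(2) keeps the first two rows of each group, in original row order.
# reset_index(drop=True) renumbers the rows 0..5 to match the desired output.
first_two = df.groupby('a').head(2).reset_index(drop=True)
print(first_two)
```

Without the reset_index call, the result should keep the original labels 0, 1, 3, 4, 5, 6, as in the answer's output above.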