Question 1

I have a dataframe df as:

  Election Year     Votes   Vote %      Party              Region   
0   2000            42289   29.40   Janata Dal (United)     A
1   2000            27618   19.20   Rashtriya Janata Dal    A
2   2000            20886   14.50   Bahujan Samaj Party     B 
3   2000            17747   12.40   Congress                B
4   2000            14047   19.80   Independent             C
5   2000            17047   10.80   JLS                     C
6   2005            8358    15.80   Janvadi Party           A
7   2005            4428    13.10   Independent             A
8   2005            1647    1.20    Independent             B
9   2005            1610    11.10   Independent             B
10  2005            1334    15.06   Nationalist             C
11  2005            1834    18.06   NJM                     C
12  2010            21114   20.80   Independent             A
13  2010            1042    10.5    Bharatiya Janta Dal     A
14  2010            835     0.60    Independent             B
15  2010            14305   15.50   Independent             B
16  2010            22211   17.70   Congress                C
17  2010            20011   14.70   INC                     C

How can I get the list of regions that have two or more parties with a Vote % greater than 10 in every election year?

Desired output:

Election Year    Region    Vote %
2000             A         29.40
2000             A         19.20
2000             C         19.80
2000             C         10.80
2005             A         15.80
2005             A         13.10
2005             C         15.06
2005             C         18.06
2010             A         20.80
2010             A         10.5
2010             C         17.70
2010             C         14.70

The output contains only the regions that have more than 10% of the vote every year, with Election Year and Region sorted in ascending order. So here only regions "A" and "C" appear in the output.

I used the following code to sort "Vote %" in descending order after grouping by "Election Year" and "Region", so that I could then compare the top 2 Vote % values for every year, but it raises an error.

df1 = df.groupby(['Election Year','Region'])sort_values('Vote %', ascending = False).reset_index()

Answer

Try with groupby filter:

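cols = ['Election Year', 'Region', 'Vote %']
df1 = (
    df.groupby('Region')
      # keep a region only if every one of its rows has Vote % >= 10
      .filter(lambda g: g['Vote %'].ge(10).all())
      # years and regions ascending, Vote % descending within each pair
      .sort_values(cols, ascending=[True, True, False])[cols]
      .reset_index(drop=True)
)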

df1:

    Election Year Region  Vote %
0            2000      A   29.40
1            2000      A   19.20
2            2000      C   19.80
3            2000      C   10.80
4            2005      A   15.80
5            2005      A   13.10
6            2005      C   18.06
7            2005      C   15.06
8            2010      A   20.80
9            2010      A   10.50
10           2010      C   17.70
11           2010      C   14.70
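
Note that the filter above keeps a region only when all of its rows have Vote % >= 10. The wording of the question is slightly different (two or more parties above 10 in every election year), and the two readings would diverge if a region had a third, weaker party. Below is a minimal sketch of that literal reading, assuming a strict "greater than 10" threshold. The names above, n_above, ok and good_regions are illustrative and not from the original post, and the sketch does not check that a region actually appears in every election year.

```python
import pandas as pd

# Same data as in the question, reduced to the columns this check needs.
df = pd.DataFrame({
    "Election Year": [2000] * 6 + [2005] * 6 + [2010] * 6,
    "Region": ["A", "A", "B", "B", "C", "C"] * 3,
    "Vote %": [29.4, 19.2, 14.5, 12.4, 19.8, 10.8,
               15.8, 13.1, 1.2, 11.1, 15.06, 18.06,
               20.8, 10.5, 0.6, 15.5, 17.7, 14.7],
})

# Assumption: "more than 10" is read as strictly greater than 10.
above = df["Vote %"].gt(10)

# Count qualifying parties per (Region, Election Year);
# summing booleans gives 0 for groups with no qualifying party.
n_above = above.groupby([df["Region"], df["Election Year"]]).sum()

# A region passes if each of its election years has at least two such parties.
ok = n_above.ge(2).groupby(level="Region").all()
good_regions = ok[ok].index.tolist()

cols = ["Election Year", "Region", "Vote %"]
result = (
    df[above & df["Region"].isin(good_regions)]
    .sort_values(cols, ascending=[True, True, False])[cols]
    .reset_index(drop=True)
)

print(good_regions)
print(result)
```

On the sample data this selects the same regions as the groupby filter, since region B has only one party above 10 in 2005 and in 2010.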

df used:

import pandas as pd

df = pd.DataFrame({
    'Election Year': [2000, 2000, 2000, 2000, 2000, 2000,
                      2005, 2005, 2005, 2005, 2005, 2005,
                      2010, 2010, 2010, 2010, 2010, 2010],
    'Votes': [42289, 27618, 20886, 17747, 14047, 17047,
              8358, 4428, 1647, 1610, 1334, 1834,
              21114, 1042, 835, 14305, 22211, 20011],
    'Vote %': [29.4, 19.2, 14.5, 12.4, 19.8, 10.8,
               15.8, 13.1, 1.2, 11.1, 15.06, 18.06,
               20.8, 10.5, 0.6, 15.5, 17.7, 14.7],
    'Party': ['Janata Dal (United)', 'Rashtriya Janata Dal', 'Bahujan Samaj Party',
              'Congress', 'Independent', 'JLS',
              'Janvadi Party', 'Independent', 'Independent',
              'Independent', 'Nationalist', 'NJM',
              'Independent', 'Bharatiya Janta Dal', 'Independent',
              'Independent', 'Congress', 'INC'],
    'Region': ['A', 'A', 'B', 'B', 'C', 'C',
               'A', 'A', 'B', 'B', 'C', 'C',
               'A', 'A', 'B', 'B', 'C', 'C'],
})
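
The line in the question tries to sort inside the groupby call chain. A common alternative is to sort the whole DataFrame first and group afterwards, because groupby(...).head(n) returns the first n rows of each group in the frame's current order. The sketch below illustrates that pattern on the question's data; the names ranked and top2 are illustrative and not from the original post.

```python
import pandas as pd

# Question's data, reduced to the columns used here.
df = pd.DataFrame({
    "Election Year": [2000] * 6 + [2005] * 6 + [2010] * 6,
    "Region": ["A", "A", "B", "B", "C", "C"] * 3,
    "Vote %": [29.4, 19.2, 14.5, 12.4, 19.8, 10.8,
               15.8, 13.1, 1.2, 11.1, 15.06, 18.06,
               20.8, 10.5, 0.6, 15.5, 17.7, 14.7],
})

keys = ["Election Year", "Region"]

# Sort first: keys ascending, Vote % descending within each (year, region) pair.
ranked = df.sort_values(keys + ["Vote %"], ascending=[True, True, False])

# head(2) then keeps the two highest Vote % rows of every group.
top2 = ranked.groupby(keys).head(2).reset_index(drop=True)

print(top2)
```

In this sample every (Election Year, Region) group has exactly two rows, so head(2) keeps all of them; with more parties per group it would trim each group to its two largest Vote % values.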

Getting sub-dataframe after sorting and groupby
