Question 1

I have a Pandas DataFrame with customer refund reasons. It contains these example data rows:

    **case_type**       **claim_type**
1   service             service
2   service             service
3   chargeback          service
4   chargeback          local_charges
5   service             supplier_service
6   chargeback          service
7   chargeback          service
8   chargeback          service
9   chargeback          service
10  chargeback          service
11  service             service_not_used
12  service             service_not_used

I would like to compare the customer's reason with some sort of labeled reason. This is no problem, but I would also like to see the total number of records in a specific group (customer reason).

case_claim_type = df[["case_type", "claim_type"]]
# Pass the grouping keys as a list; recent pandas versions treat a tuple as a single key
case_claim_type.groupby(by=["case_type", "claim_type"])["case_type"].count()

Which gives me this output, for example:

**case_type**     **claim_type**                 
service           service                         2
                  supplier_service                1
                  service_not_used                2
chargeback        service                         6
                  local_charges                   1

I would also like to have the sum of the output per case_type. Something like:

**case_type**     **claim_type**                 
service           service                         2
                  supplier_service                1
                  service_not_used                2
                  total:                          5
chargeback        service                         6
                  local_charges                   1
                  total:                          7

It doesn't necessarily have to be in this last output format; a column with the (aggregated) totals per case_type is also fine.

Question 2

Where:

import pandas as pd

df = pd.DataFrame({'case_type': ['Service']*20 + ['chargeback']*9,
                   'claim_type': ['service']*5 + ['local_charges']*5 + ['service_not_used']*5 + ['supplier_service']*5 + ['service']*8 + ['local_charges']})
# Count rows per (case_type, claim_type) pair; the keys are passed as a list
df_out = df.groupby(by=["case_type", "claim_type"])["case_type"].count()

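Let's use pd.concat, a sum over the first index level, and assign:

# Older pandas allowed df_out.sum(level=0); the level argument was removed in pandas 2.0,
# so the per-case_type totals are computed with groupby(level=0).sum() instead.
(pd.concat([df_out.to_frame(),
            df_out.groupby(level=0).sum().to_frame()
                  .assign(claim_type="total")
                  .set_index('claim_type', append=True)])
   .sort_index())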

Output:

                             case_type
case_type  claim_type                 
Service    local_charges             5
           service                   5
           service_not_used          5
           supplier_service          5
           total                    20
chargeback local_charges             1
           service                   8
           total                     9
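
The question also mentions that a column with the totals per case_type would be fine. A minimal sketch of that variant, which is not part of the answer above, is to count the pairs with size() and then broadcast each group's sum back onto its rows with transform. The data below mirrors the twelve example rows from the question; the column names count and case_total are illustrative choices.

```python
import pandas as pd

# Same twelve rows as the example data in the question
df = pd.DataFrame({
    "case_type": ["service", "service", "chargeback", "chargeback", "service",
                  "chargeback", "chargeback", "chargeback", "chargeback",
                  "chargeback", "service", "service"],
    "claim_type": ["service", "service", "service", "local_charges",
                   "supplier_service", "service", "service", "service",
                   "service", "service", "service_not_used", "service_not_used"],
})

# One row per (case_type, claim_type) pair with its record count
counts = (
    df.groupby(["case_type", "claim_type"])
      .size()
      .reset_index(name="count")  # "count" is a hypothetical column name
)

# Repeat each case_type's total on every row belonging to that case_type
counts["case_total"] = counts.groupby("case_type")["count"].transform("sum")

print(counts)
```

Because the totals live in their own column, no artificial "total" row gets mixed in with the real claim_type values, which can make later filtering or percentage calculations simpler.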

Pandas groupby and sum total of group
