Question 1

I have a pandas dataframe that looks like this:

genres.head()

   Drama   Comedy  Action  Crime   Romance Thriller    Adventure   Horror  Mystery Fantasy ... History Music   War Documentary Sport   Musical Western Film-Noir   News    number_of_genres
tconst                                                                                  
tt0111161   1   0   0   0   0   0   0   0   0   0   ... 0   0   0   0   0   0   0   0   0   1
tt0468569   1   0   1   1   0   0   0   0   0   0   ... 0   0   0   0   0   0   0   0   0   3
tt1375666   0   0   1   0   0   0   1   0   0   0   ... 0   0   0   0   0   0   0   0   0   3
tt0137523   1   0   0   0   0   0   0   0   0   0   ... 0   0   0   0   0   0   0   0   0   1
tt0110912   1   0   0   1   0   0   0   0   0   0   ... 0   0   0   0   0   0   0   0   0   2

I want to be able to get a table where the rows are the genres, the columns are the number of labels for a given movie and the values are the counts. In other words, I want this:

number_of_genres    1   2   3   totals
Drama   451 1481    3574    5506
Comedy  333 1108    2248    3689
Action  9   230 1971    2210
Crime   1   284 1687    1972
Romance 1   646 1156    1803
Thriller    22  449 1153    1624
Adventure   1   98  1454    1553
Horror  137 324 765 1226
Mystery 0   108 792 900
Fantasy 1   74  642 717
Sci-Fi  0   129 551 680
Biography   0   95  532 627
Family  0   60  452 512
Animation   0   6   431 437
History 0   32  314 346
Music   1   87  223 311
War 0   90  162 252
Documentary 70  82  78  230
Sport   0   78  142 220
Musical 0   13  131 144
Western 19  44  57  120
Film-Noir   0   11  50  61
News    0   1   2   3
Total   1046    5530    18567   25143

What is the best way of getting that table pythonistically? I solved the problem through the following code but was wondering if there's a better way:

genres['number_of_genres'] = genres.sum(axis=1)
pivots = []
for column in genres.columns[0:-1]:column = pd.DataFrame(genres[column])columns = column.join(genres.number_of_genres)pivot = pd.pivot_table(columns, values=columns.columns[0], columns='number_of_genres', aggfunc=np.sum)pivots.append(pivot)pivots_df = pd.concat(pivots)
pivots_df['totals'] = pivots_df.sum(axis=1)
pivots_df.loc['Total'] = pivots_df.sum()

[EDIT]: Added jupyter output that should be compatible with pd.read_clipboard(). If I can format the output better, please let me know how I can do so.

Question 2

Maybe I'm missing something but doesn't this work for you?

agg = df.groupby('number_of_genres').agg('sum').T
agg['totals'] = agg.sum(axis=1)

Edit: Solution via pivot_table

agg = df.pivot_table(columns='number_of_genres', aggfunc='sum')
agg['total'] = agg.sum(axis=1)

Pivoting a One-Hot-Encode Dataframe

Related Q&A

How to declare multiple similar variables in python? [duplicate]

what does means this error broken pipe? [duplicate]

Apply a function to each element of a pandas series

ValueError: could not convert string to float: in Python 3.10

How do I access Class fields in Python Graph-Tool property maps?

How to iterate through each line of a text file and get the sentiment of those lines using python?

RECURSIVE function that will sum digits of input

Make sure matrix row took from text file are same length(python3) [duplicate]

how to randomize order of questions in a quiz in python? [closed]

How transform days to hours, minutes and seconds in Python