Question 1

I'm trying to take my dataframe from a long format in which I have a column with a categorical variable, into a wide format in which each category has it's own price column. Currently, my data looks like this:

date-time            date       vendor    payment_type   price
03-10-15 10:00:00    03-10-15     A1            1          50
03-10-15 10:00:00    03-10-15     A1            2          60
03-10-15 10:00:00    03-11-15     A1            1          45
03-10-15 10:00:00    03-11-15     A1            2          70
03-10-15 10:00:00    03-12-15     B1            1          40
03-10-15 10:00:00    03-12-15     B1            2          45
03-10-15 10:00:00    03-10-15     C1            1          60
03-10-15 10:00:00    03-10-15     C1            1          65

My goal is to have a column for every vendor's price and for each payment type and one row per day. When there are multiple values per day, I want to use the maximum value. The end result should look something like this.

Date       A1_Pay1   A2_Pay2 ... C1_Pay1   C1_Pay2
03-10-15     50        60    ...   65        NaN
03-11-15     45        70    ...   NaN       NaN
03-12-15     NaN       NaN   ...   NaN       NaN

I tried using unstack and pivot, but I either wasn't getting what I was going for, or was getting an error about Date not being a unique index.

Any ideas?

Question 2

You can use pivot_table:

#convert column payment_type to string
df['payment_type'] = df['payment_type'].astype(str)df = pd.pivot_table(df, index='date', columns=['vendor', 'payment_type'], aggfunc=max)#remove top level of multiindex
df.columns = df.columns.droplevel(0)#reset multicolumns
df.columns = ['_Pay'.join(col).strip() for col in df.columns.values]print dfA1_Pay1  A1_Pay2  B1_Pay1  B1_Pay2  C1_Pay1
date                                                   
2015-03-10       50       60      NaN      NaN       65
2015-03-11       45       70      NaN      NaN      NaN
2015-03-12      NaN      NaN       40       45      NaN

EDIT:

If you need other statistics, you can add them as list to aggfunc:

#convert column payment_type to string
df['payment_type'] = df['payment_type'].astype(str)
df = pd.pivot_table(df, index='date', columns=['vendor', 'payment_type'], aggfunc=[np.mean, np.max, np.median])
print dfmean                    amax                 median              \price                   price                  price               
vendor          A1      B1        C1    A1      B1      C1     A1      B1       
payment_type     1   2   1   2     1     1   2   1   2   1      1   2   1   2   
date                                                                            
2015-03-10      50  60 NaN NaN  62.5    50  60 NaN NaN  65     50  60 NaN NaN   
2015-03-11      45  70 NaN NaN   NaN    45  70 NaN NaN NaN     45  70 NaN NaN   
2015-03-12     NaN NaN  40  45   NaN   NaN NaN  40  45 NaN    NaN NaN  40  45   vendor          C1  
payment_type     1  
date                
2015-03-10    62.5  
2015-03-11     NaN  
2015-03-12     NaN 
#remove top level of multiindex
df.columns = df.columns.droplevel(1)
#reset multicolumns
df.columns = ['_Pay'.join(col).strip() for col in df.columns.values]

print dfmean_PayA1_Pay1  mean_PayA1_Pay2  mean_PayB1_Pay1  \
date                                                            
2015-03-10               50               60              NaN   
2015-03-11               45               70              NaN   
2015-03-12              NaN              NaN               40   mean_PayB1_Pay2  mean_PayC1_Pay1  amax_PayA1_Pay1  \
date                                                            
2015-03-10              NaN             62.5               50   
2015-03-11              NaN              NaN               45   
2015-03-12               45              NaN              NaN   amax_PayA1_Pay2  amax_PayB1_Pay1  amax_PayB1_Pay2  \
date                                                            
2015-03-10               60              NaN              NaN   
2015-03-11               70              NaN              NaN   
2015-03-12              NaN               40               45   amax_PayC1_Pay1  median_PayA1_Pay1  median_PayA1_Pay2  \
date                                                                
2015-03-10               65                 50                 60   
2015-03-11              NaN                 45                 70   
2015-03-12              NaN                NaN                NaN   median_PayB1_Pay1  median_PayB1_Pay2  median_PayC1_Pay1  
date                                                                 
2015-03-10                NaN                NaN               62.5  
2015-03-11                NaN                NaN                NaN  
2015-03-12                 40                 45                NaN

Long to wide data. Pandas

Related Q&A

How to wrap text in Django admin(set column width)

Problems compiling mod_wsgi in virtualenv

Python - Multiprocessing Error cannot start a process twice

Printing unicode number of chars in a string (Python)

Pandas report top-n in group and pivot

virtualenv --no-site-packages is not working for me

pandas: Group by splitting string value in all rows (a column) and aggregation function

Seaborn Title Position

Expand/collapse ttk Treeview branch

Uploading images to s3 with meta = image/jpeg - python/boto3