Question 1

Here is a simple DataFrame:

> df = pd.DataFrame({'a': ['a1', 'a2', 'a3'],'b': ['optional1', None, 'optional3'],'c': ['c1', 'c2', 'c3'],'d': [1, 2, 3]})
> dfa          b   c  d
0  a1  optional1  c1  1
1  a2       None  c2  2
2  a3  optional3  c3  3

Pivot method 1

The data can be pivoted to this:

> df.pivot_table(index=['a','b'], columns='c')d     
c              c1   c3
a  b                  
a1 optional1  1.0  NaN
a3 optional3  NaN  3.0

Downside: data in the 2nd row is lost because df['b'][1] == None.

Pivot method 2

> df.pivot_table(index=['a'], columns='c')d          
c    c1   c2   c3
a                
a1  1.0  NaN  NaN
a2  NaN  2.0  NaN
a3  NaN  NaN  3.0

Downside: column b is lost.

How can the two methods be combined so that columns b and the 2nd row are kept like so:

                d     
c              c1   c2   c3
a  b                  
a1 optional1  1.0  NaN  NaN
a2      None  NaN  2.0  NaN
a3 optional3  NaN  NaN  3.0

More generally: How can information from a row be retained during pivoting if a key has NaN value?

Question 2

Use set_index and unstack to perform the pivot:

df = df.set_index(['a', 'b', 'c']).unstack('c')

This is essentially what pandas does under the hood for pivot. The stack and unstack methods are closely related to pivot, and can generally be used to perform pivot-like operations that don't quite conform with the built-in pivot functions.

The resulting output:

                d          
c              c1   c2   c3
a  b                       
a1 optional1  1.0  NaN  NaN
a2 NaN        NaN  2.0  NaN
a3 optional3  NaN  NaN  3.0

NaN values in pivot_table index causes loss of data

Pivot method 1

Pivot method 2

Related Q&A

ModuleNotFoundError in Docker

Can I use md5 authentication with psycopg2?

Python checking init parameter

Minidom getElementById not working

Optimization on piecewise linear regression

Python: Check if list of named tuples contains particular attribute value

javascript error: arguments[0].scrollIntoView is not a function using selenium on python

Uploading a static project to google app engines

Python cannot import DataFrame

Sum of product of combinations in a list