I want to replace a column of values in a DataFrame with a more accurate/complete set of values generated by a look-up table in the form of a Series that I have prepared.
I thought I could do it this way but the result is not as expected.
Here is the DataFrame I want to fix:
In [6]: df_normalised.head(10)
Out[6]: code name
0 8 Human development
1 11
2 1 Economic management
3 6 Social protection and risk management
4 5 Trade and integration
5 2 Public sector governance
6 11 Environment and natural resources management
7 6 Social protection and risk management
8 7 Social dev/gender/inclusion
9 7 Social dev/gender/inclusion
(Note the missing name in row 2).
Here is the look-up table I created to do the fixing:
In [20]: names
Out[20]:
1 Economic management
10 Rural development
11 Environment and natural resources management
2 Public sector governance
3 Rule of law
4 Financial and private sector development
5 Trade and integration
6 Social protection and risk management
7 Social dev/gender/inclusion
8 Human development
9 Urban development
dtype: object
Here is the way I thought could do it:
In [21]: names[df_normalised.head(10).code]
Out[21]:
code
8 Human development
11 Environment and natural resources management
1 Economic management
6 Social protection and risk management
5 Trade and integration
2 Public sector governance
11 Environment and natural resources management
6 Social protection and risk management
7 Social dev/gender/inclusion
7 Social dev/gender/inclusion
dtype: object
However, I expected the resulting series above to have the same index as the index of df_normalised (i.e. 0, 1, 2, 3) not an index based on the code values.
So I'm not sure how to replace the original values in the 'name' column in df_normalised with these series values because the indexes are not the same.
Incidentally, how is it possible to have an index with duplicate values as above?