Question 1

I want to replace a column of values in a DataFrame with a more accurate/complete set of values generated by a look-up table in the form of a Series that I have prepared.

I thought I could do it this way but the result is not as expected.

Here is the DataFrame I want to fix:

In [6]: df_normalised.head(10)
Out[6]: code                                          name
0    8                             Human development
1   11                                              
2    1                           Economic management
3    6         Social protection and risk management
4    5                         Trade and integration
5    2                      Public sector governance
6   11  Environment and natural resources management
7    6         Social protection and risk management
8    7                   Social dev/gender/inclusion
9    7                   Social dev/gender/inclusion

(Note the missing name in row 2).

Here is the look-up table I created to do the fixing:

In [20]: names
Out[20]: 
1                              Economic management
10                               Rural development
11    Environment and natural resources management
2                         Public sector governance
3                                      Rule of law
4         Financial and private sector development
5                            Trade and integration
6            Social protection and risk management
7                      Social dev/gender/inclusion
8                                Human development
9                                Urban development
dtype: object

Here is the way I thought could do it:

In [21]: names[df_normalised.head(10).code]
Out[21]: 
code
8                                Human development
11    Environment and natural resources management
1                              Economic management
6            Social protection and risk management
5                            Trade and integration
2                         Public sector governance
11    Environment and natural resources management
6            Social protection and risk management
7                      Social dev/gender/inclusion
7                      Social dev/gender/inclusion
dtype: object

However, I expected the resulting series above to have the same index as the index of df_normalised (i.e. 0, 1, 2, 3) not an index based on the code values.

So I'm not sure how to replace the original values in the 'name' column in df_normalised with these series values because the indexes are not the same.

Incidentally, how is it possible to have an index with duplicate values as above?

Question 2

you can use map() function for that:

In [38]: df_normalised['name'] = df_normalised['code'].map(name)In [39]: df_normalised
Out[39]:code                                          name
0     8                             Human development
1    11  Environment and natural resources management
2     1                           Economic management
3     6         Social protection and risk management
4     5                         Trade and integration
5     2                      Public sector governance
6    11  Environment and natural resources management
7     6         Social protection and risk management
8     7                   Social dev/gender/inclusion
9     7                   Social dev/gender/inclusion

Replace values in column of Pandas DataFrame using a Series lookup table

Related Q&A

Behavior of round function in Python

Pygame application runs slower on Mac than on PC

How to extract feature vector from single image in Pytorch?

Which language should I use for Artificial intelligence on web projects

Scrapy with selenium, webdriver failing to instantiate

How do I enable TLS on an already connected Python asyncio stream?

Validate with three xml schemas as one combined schema in lxml?

An unusual Python syntax element frequently used in Matplotlib

Control the power of a usb port in Python

Threads and local proxy in Werkzeug. Usage