Replace values in column of Pandas DataFrame using a Series lookup table

2024/9/27 7:21:10

I want to replace a column of values in a DataFrame with a more accurate/complete set of values generated by a look-up table in the form of a Series that I have prepared.

I thought I could do it this way but the result is not as expected.

Here is the DataFrame I want to fix:

In [6]: df_normalised.head(10)
Out[6]: 
  code                                          name
0    8                             Human development
1   11                                              
2    1                           Economic management
3    6         Social protection and risk management
4    5                         Trade and integration
5    2                      Public sector governance
6   11  Environment and natural resources management
7    6         Social protection and risk management
8    7                   Social dev/gender/inclusion
9    7                   Social dev/gender/inclusion

(Note the missing name in the second row, index 1.)

Here is the look-up table I created to do the fixing:

In [20]: names
Out[20]: 
1                              Economic management
10                               Rural development
11    Environment and natural resources management
2                         Public sector governance
3                                      Rule of law
4         Financial and private sector development
5                            Trade and integration
6            Social protection and risk management
7                      Social dev/gender/inclusion
8                                Human development
9                                Urban development
dtype: object

Here is how I thought I could do it:

In [21]: names[df_normalised.head(10).code]
Out[21]: 
code
8                                Human development
11    Environment and natural resources management
1                              Economic management
6            Social protection and risk management
5                            Trade and integration
2                         Public sector governance
11    Environment and natural resources management
6            Social protection and risk management
7                      Social dev/gender/inclusion
7                      Social dev/gender/inclusion
dtype: object

However, I expected the resulting Series above to have the same index as df_normalised (i.e. 0, 1, 2, 3, ...), not an index based on the code values.

So I'm not sure how to replace the original values in the 'name' column in df_normalised with these series values because the indexes are not the same.

Incidentally, how is it possible to have an index with duplicate values as above?

Answer

You can use the Series.map() method for that, passing the look-up Series names:

In [38]: df_normalised['name'] = df_normalised['code'].map(names)

In [39]: df_normalised
Out[39]: 
   code                                          name
0     8                             Human development
1    11  Environment and natural resources management
2     1                           Economic management
3     6         Social protection and risk management
4     5                         Trade and integration
5     2                      Public sector governance
6    11  Environment and natural resources management
7     6         Social protection and risk management
8     7                   Social dev/gender/inclusion
9     7                   Social dev/gender/inclusion
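
For anyone who wants to try this without the original data, here is a minimal sketch. The five-row frame and four-entry look-up table are hypothetical stand-ins for df_normalised and names, and the codes are assumed to be strings (the sort order 1, 10, 11, 2, ... in the question suggests that, but it is not confirmed). It also touches on the side question: selecting from a Series by label keeps the selected labels as the index, and pandas does not require index labels to be unique, so a repeated code simply shows up as a repeated label.

```python
import pandas as pd

# Hypothetical stand-ins for the question's objects (codes assumed to be strings).
df_normalised = pd.DataFrame({
    "code": ["8", "11", "1", "6", "11"],
    "name": ["Human development", "", "Economic management",
             "Social protection and risk management",
             "Environment and natural resources management"],
})
names = pd.Series({
    "1": "Economic management",
    "6": "Social protection and risk management",
    "8": "Human development",
    "11": "Environment and natural resources management",
})

# Label-based selection, written with .loc to make the lookup explicit.
# The result is indexed by the looked-up codes, not by 0, 1, 2, ...
looked_up = names.loc[df_normalised["code"]]
has_duplicate_labels = not looked_up.index.is_unique  # "11" appears twice

# map() looks each code up in `names` but keeps the index of the
# Series it is called on, so the result can be assigned straight back.
df_normalised["name"] = df_normalised["code"].map(names)
```

A code that is missing from the look-up table would come back as NaN from map(), so it may be worth checking df_normalised['name'].isna() afterwards.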