Python/pandas: Find matching values from two dataframes and return third value

2024/5/20 10:08:38

I have two dataframes (df1, df2) with different shapes: df1 is (64, 6) and df2 is (564, 9). df1 has a column (df1.objectdesc) whose string values also appear in a column of df2 (df2.objdescription). Because the two dataframes have different shapes, I use .isin() to find the matching values. For exactly those matching rows, I would then like to take a third value from another column of df2 (df2.idname) and add it to df1. This is where I struggle.

Example datasets:

df1:

      Content    objectdesc    TS_id
0     sdrgs      1_OG.Raum45   55
1     sdfg       2_OG.Raum23   34
2     psdfg      GG.Raum12     78
3     sdfg       1_OG.Raum98   67

df2:

      Numb_val    object_count     objdescription    min   idname
0     463         9876             1_OG_Raum76       1     wq19
1     251         8324             2_OG.Raum34       9     zt45
2     456         1257             1_OG.Raum45       4     bh34
3     356         1357             2_OG.Raum23       3     if32
4     246         3452             GG.Raum12         5     lu76
5     345         8553             1_OG.Raum98       8     pr61

Expected output:

      Content    objectdesc    TS_id    idname
0     sdrgs      1_OG.Raum45   55       bh34
1     sdfg       2_OG.Raum23   34       if32
2     psdfg      GG.Raum12     78       lu76
3     sdfg       1_OG.Raum98   67       pr61

This is my code so far:

def get_id(x, y):
    for values in x, y:
        if x['objectdesc'].isin(y['objdescription']).any() == True:
            return y['idname']

df1['idname'] = get_id(df1, df2)

Unfortunately, this only gives me the values of df2['idname'] starting from index 0, rather than the values from the rows that actually match.

Any help is appreciated. Thank you!

Answer

Maybe try this:

df1.merge(df2, left_on='objectdesc', right_on='objdescription')[['Content', 'objectdesc', 'TS_id', 'idname']]
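To make the one-liner easier to try, here is a minimal sketch that rebuilds the sample data from the question and runs the same merge. Two things are my assumptions and not part of the original answer: how="left", so that df1 rows without a match are kept (with NaN in idname) instead of being dropped, and the helper name lookup for the trimmed copy of df2.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Content": ["sdrgs", "sdfg", "psdfg", "sdfg"],
    "objectdesc": ["1_OG.Raum45", "2_OG.Raum23", "GG.Raum12", "1_OG.Raum98"],
    "TS_id": [55, 34, 78, 67],
})
df2 = pd.DataFrame({
    "Numb_val": [463, 251, 456, 356, 246, 345],
    "object_count": [9876, 8324, 1257, 1357, 3452, 8553],
    "objdescription": ["1_OG_Raum76", "2_OG.Raum34", "1_OG.Raum45",
                       "2_OG.Raum23", "GG.Raum12", "1_OG.Raum98"],
    "min": [1, 9, 4, 3, 5, 8],
    "idname": ["wq19", "zt45", "bh34", "if32", "lu76", "pr61"],
})

# Keep only the key and the value we want, so no other df2 columns
# end up in the result ("lookup" is a hypothetical helper name).
lookup = df2[["objdescription", "idname"]]

# how="left" is an assumption: it keeps every df1 row, matched or not.
result = (
    df1.merge(lookup, how="left", left_on="objectdesc", right_on="objdescription")
       .drop(columns="objdescription")  # the duplicate key column is no longer needed
)
print(result)
```

Note that if a description occurs more than once in df2, the merge returns one row per match, so the result can have more rows than df1.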

reference:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.merge.html
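As for why the original attempt returned values starting from index 0: as far as I can tell, the function returns the whole df2['idname'] column as soon as any value matches, and assigning a Series to a dataframe column aligns on index labels, not on the matched values. The sketch below shows that effect on the sample data, followed by a lookup by value using Series.map as an alternative to merge. It assumes the values in objdescription are unique; the names by_position, id_by_desc, and by_value are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({
    "objectdesc": ["1_OG.Raum45", "2_OG.Raum23", "GG.Raum12", "1_OG.Raum98"],
})
df2 = pd.DataFrame({
    "objdescription": ["1_OG_Raum76", "2_OG.Raum34", "1_OG.Raum45",
                       "2_OG.Raum23", "GG.Raum12", "1_OG.Raum98"],
    "idname": ["wq19", "zt45", "bh34", "if32", "lu76", "pr61"],
})

# Assigning a Series aligns on index labels, so df2 rows 0..3 are
# copied over regardless of which descriptions actually match.
by_position = df1.assign(idname=df2["idname"])
print(by_position["idname"].tolist())  # ['wq19', 'zt45', 'bh34', 'if32']

# Lookup by value: index the ids by description, then map each
# df1 description to its id (assumes unique descriptions in df2).
id_by_desc = df2.set_index("objdescription")["idname"]
by_value = df1.assign(idname=df1["objectdesc"].map(id_by_desc))
print(by_value["idname"].tolist())  # ['bh34', 'if32', 'lu76', 'pr61']
```

Descriptions in df1 that have no counterpart in df2 come out as NaN with this approach.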
