Question 1

I want to implement the equivalent of R's match() function in Python, using pandas data frames, without using a for-loop.

In R, match() returns a vector of the positions of the (first) matches of its first argument in its second.

Let's say that I have two data frames, A and B, both of which include the column C, where

A$C = c('a','b')
B$C = c('c','c','b','b','c','b','a','a')

In R we would get

match(A$C,B$C) = c(7,3)

What is an equivalent method in Python for columns in pandas data frames that doesn't require looping through the values?

Question 2

Here is a one-liner:

B.reset_index().groupby('C')['index'].first()[A.C].values

This solution returns the results in the same order as the input A, as match does in R.

Full example:

import pandas as pd
A = pd.DataFrame({'C':['a','b']})
B = pd.DataFrame({'C':['c','c','b','b','c','b','a','a']})
B.reset_index().groupby('C')['index'].first()[A.C].values

Output: array([6, 2])

These positions are 0-based, so they correspond to R's 1-based result c(7,3) after adding 1.
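If some values in A might not occur in B at all, a variant along the following lines may be useful. This is only a sketch: match_first is a hypothetical helper name, not part of pandas, and it assumes that returning NaN for a missing value (similar to R's NA) is acceptable.

```python
import pandas as pd

A = pd.DataFrame({"C": ["a", "b"]})
B = pd.DataFrame({"C": ["c", "c", "b", "b", "c", "b", "a", "a"]})


def match_first(needles: pd.Series, haystack: pd.Series) -> pd.Series:
    """Hypothetical helper: 0-based position of the first occurrence of each
    needle in haystack, NaN where a needle does not occur."""
    # Renumber rows 0..n-1 so the index holds positions, then keep only
    # the first row for each distinct value.
    firsts = haystack.reset_index(drop=True).drop_duplicates(keep="first")
    # Invert the mapping: value -> position of its first occurrence.
    lookup = pd.Series(firsts.index.to_numpy(), index=firsts.to_numpy())
    # reindex keeps the order of needles and yields NaN for absent values.
    return lookup.reindex(needles.to_numpy())


positions = match_first(A["C"], B["C"])    # 0-based positions
r_style = positions + 1                    # shift to R's 1-based convention
with_missing = match_first(pd.Series(["a", "z"]), B["C"])  # 'z' is not in B
```

Note that as soon as one value is missing, the result is stored as floats, because NaN cannot be held in an integer column.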

Edit (2023-04-12): In newer versions of pandas, .loc returns all rows whose label matches. Thus, the previous solution (B.reset_index().set_index('C').loc[A.C, 'index'].values) would return all the matches instead of only the first ones.
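A small sketch of the difference, using the same example data (the variable names here are only illustrative):

```python
import pandas as pd

A = pd.DataFrame({"C": ["a", "b"]})
B = pd.DataFrame({"C": ["c", "c", "b", "b", "c", "b", "a", "a"]})

# Label-based lookup on a non-unique index: every row labelled 'a' or 'b'
# is returned, not just the first one of each.
all_matches = B.reset_index().set_index("C").loc[A["C"], "index"].to_numpy()

# Grouping first collapses each value to its first position only.
first_matches = B.reset_index().groupby("C")["index"].first()[A["C"]].to_numpy()
```

With this data, all_matches holds five positions (two for 'a' and three for 'b'), while first_matches holds exactly one position per value in A.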

Python equivalent of R's match() for indexing
