Update value for every row based on either of two previous columns

2024/10/10 8:18:38

I am researching ATP Tour men's tennis data. I have a Pandas dataframe containing ~60,000 matches, sorted by date. Every row contains information and statistics about one match, split between the winner and the loser. I am trying to calculate the Elo rating of both the winner and the loser for every match (that is, for every row). To do this, I need the Elo rating each player had after his previous match. The difficulty is that the winner of the current match may have been the loser of his previous match, so the 'winner_player_id' value of the current row may appear in the 'loser_player_id' column of that earlier row.

I am not sure how to efficiently select the previous Elo ratings for both players in each row, since this requires searching across two columns.

Every row includes the following columns:

array(['match_id', 'tourney_dates', 'round_order', 'tourney_name', 'tourney_year_id', 'tourney_round_name', 'winner_player_id', 'winner_slug', 'loser_player_id', 'loser_slug', 'elo_player_1', 'elo_player_2'])

Your time is appreciated!

Answer

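One approach is to sort the winner and loser within each row by player name or ID, so that a given pair of players always appears in the same column order regardless of who won. The example below uses 'winner_name' and 'loser_name' columns; the same idea applies to 'winner_player_id' and 'loser_player_id':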

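import numpy as np
import pandas as pd

# Sort the two names within each row so their order does not depend on who won.
df = df.join(pd.DataFrame(np.sort(df[['winner_name', 'loser_name']].values, axis=1), columns=['name1', 'name2'], index=df.index))
df.head(10)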

Output:

      winner_name         loser_name              name1          name2
0   Nicklas Kulti      Michael Stich      Michael Stich  Nicklas Kulti
1   Michael Stich        Jim Courier        Jim Courier  Michael Stich
2   Nicklas Kulti     Magnus Larsson     Magnus Larsson  Nicklas Kulti
3     Jim Courier      Martin Sinner        Jim Courier  Martin Sinner
4   Michael Stich        Jimmy Arias        Jimmy Arias  Michael Stich
5   Nicklas Kulti    Fabrice Santoro    Fabrice Santoro  Nicklas Kulti
6  Magnus Larsson      Patrik Kuhnen     Magnus Larsson  Patrik Kuhnen
7     Jim Courier      Paul Haarhuis        Jim Courier  Paul Haarhuis
8   Nicklas Kulti  Magnus Gustafsson  Magnus Gustafsson  Nicklas Kulti
9   Michael Stich        Gilad Bloom        Gilad Bloom  Michael Stich
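
Sorting the names makes the pairing stable, but it does not by itself produce the ratings. Because each Elo update depends on the result of earlier matches, one possible way to fill them in is a single pass over the date-sorted rows, keeping every player's latest rating in a dictionary keyed by player ID. The sketch below is not part of the original answer and rests on several assumptions: a starting rating of 1500, a K-factor of 32, the standard Elo expected-score formula, and that 'elo_player_1' and 'elo_player_2' are meant to hold the winner's and loser's ratings before the match. The function name add_elo and the sample IDs are hypothetical.

```python
import pandas as pd

START_ELO = 1500  # assumed initial rating for a player with no earlier match
K_FACTOR = 32     # assumed K-factor; tune to taste


def add_elo(df, k=K_FACTOR, start=START_ELO):
    """Return a copy of df with pre-match Elo ratings for winner and loser.

    Assumes df is already sorted chronologically.
    """
    ratings = {}  # player_id -> rating after that player's latest match
    winner_elo, loser_elo = [], []

    for w, l in zip(df["winner_player_id"], df["loser_player_id"]):
        # The lookup is by player, so it does not matter whether the player
        # won or lost his previous match.
        rw = ratings.get(w, start)
        rl = ratings.get(l, start)
        winner_elo.append(rw)
        loser_elo.append(rl)

        # Standard Elo update: expected score of the winner, then a
        # zero-sum adjustment of both ratings.
        expected_w = 1.0 / (1.0 + 10 ** ((rl - rw) / 400))
        delta = k * (1.0 - expected_w)
        ratings[w] = rw + delta
        ratings[l] = rl - delta

    out = df.copy()
    out["elo_player_1"] = winner_elo  # assumed: winner's rating before the match
    out["elo_player_2"] = loser_elo   # assumed: loser's rating before the match
    return out


# Hypothetical sample data, already in chronological order.
demo = pd.DataFrame({
    "winner_player_id": ["kulti", "stich", "kulti"],
    "loser_player_id": ["stich", "courier", "larsson"],
})
rated = add_elo(demo)
print(rated)
```

The dictionary lookup replaces the search across both ID columns: a player's latest rating is found in one step regardless of which column he appeared in last time.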
