Using Pandas to applymap with access to index/column?

2024/9/20 0:47:39

What's the most effective way to solve the following pandas problem?

Here's a simplified example with some data in a data frame:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=['a','b','c','d'], index=np.random.randint(0,10,size=10))

This data looks like this:

   a  b  c  d
1  0  0  9  9
0  2  2  1  7
3  9  3  4  0
2  5  0  9  4
1  7  7  7  2
6  4  4  6  4
1  1  6  0  0
7  8  0  9  3
5  0  0  8  3
4  5  0  2  4

Now I want to apply some function f to each value in the data frame (the function below, for example) and get a data frame back as the result. The tricky part is that the function I'm applying depends on the index value of the row I am currently at.

def f(cell_val, row_val):
    """some function which needs to know row_val to use it"""
    try:
        return cell_val/row_val
    except ZeroDivisionError:
        return -1

Normally, if I wanted to apply a function to each individual cell in the data frame, I would just pass it to .applymap(). Even if I had to pass in a second argument ('row_val', in this case), if the argument was a fixed number I could just write a lambda expression such as lambda x: f(x, i), where i is the fixed number I wanted. However, my second argument varies depending on the row in the data frame I am currently calling the function from, which means that I can't just use .applymap().

How would I go about solving a problem like this efficiently? I can think of a few ways to do this, but none of them feel "right". I could:

  • loop through each individual value and replace them one by one, but that seems really awkward and slow.
  • create a completely separate data frame containing (cell value, row value) tuples and use the builtin pandas applymap() on my tuple data frame. But that seems pretty hacky and I'm also creating a completely separate data frame as an extra step.

There must be a better solution to this (a fast solution would be appreciated, because my data frame could get very large).

Answer

If I understand correctly, you can use div with axis=0. You also need to convert the Index object to a Series object using to_series:

In [121]:
df.div(df.index.to_series(), axis=0).replace(np.inf, -1)

Out[121]:
          a         b         c         d
1  0.000000  0.000000  9.000000  9.000000
0 -1.000000 -1.000000 -1.000000 -1.000000
3  3.000000  1.000000  1.333333  0.000000
2  2.500000  0.000000  4.500000  2.000000
1  7.000000  7.000000  7.000000  2.000000
6  0.666667  0.666667  1.000000  0.666667
1  1.000000  6.000000  0.000000  0.000000
7  1.142857  0.000000  1.285714  0.428571
5  0.000000  0.000000  1.600000  0.600000
4  1.250000  0.000000  0.500000  1.000000

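Additionally, because dividing a non-zero value by zero results in inf, you need to call replace to turn those values into -1. Note that 0/0 gives NaN and a negative value divided by zero gives -inf, and replace(np.inf, -1) catches neither of those.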
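
If you want the result to match f for every cell in a zero-index row (including 0/0 and negative values), one option is to mask on the index before dividing instead of cleaning up afterwards. This is only a sketch: divide_by_index, apply_with_index and the fill parameter are names I made up for illustration, not pandas API. The second helper is a slower, general fallback for functions that cannot be vectorized; it relies on row.name being the row's index label inside apply(..., axis=1), and it casts to Python floats so that f's ZeroDivisionError branch actually fires.

```python
import numpy as np
import pandas as pd


def f(cell_val, row_val):
    """The function from the question."""
    try:
        return cell_val / row_val
    except ZeroDivisionError:
        return -1


def divide_by_index(df, fill=-1.0):
    """Divide each cell by its row's index label; use `fill` where the label is 0.

    Hypothetical helper, assumes a numeric index and numeric columns.
    """
    values = df.to_numpy(dtype=float)
    # Column vector of index labels so it broadcasts across the columns.
    labels = df.index.to_numpy(dtype=float)[:, None]
    out = np.full(values.shape, fill, dtype=float)
    # Divide only where the label is non-zero; the other cells keep `fill`.
    np.divide(values, labels, out=out, where=labels != 0)
    return pd.DataFrame(out, index=df.index, columns=df.columns)


def apply_with_index(df, func):
    """Slow, general fallback: call func(cell, index_label) for every cell."""
    return df.apply(
        lambda row: row.map(lambda v: func(float(v), float(row.name))),
        axis=1,
    )


df = pd.DataFrame(
    [[0, 0, 9, 9], [2, 2, 1, 7], [9, 3, 4, 0], [0, 5, -3, 2]],
    columns=["a", "b", "c", "d"],
    index=[1, 0, 3, 0],  # duplicate and zero labels, as in the question
)
fast = divide_by_index(df)
slow = apply_with_index(df, f)
print(fast)
```

The masked version stays in NumPy, so it should scale to large frames much better than the row-wise fallback, which calls a Python function once per cell.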

https://en.xdnf.cn/q/72743.html
