I've got the following dataframe containing function names, their arguments, the default values of the arguments and argument types:
FULL_NAME ARGUMENT DEF_VAL TYPE
'function1' 'f1_arg1' NAN 'NoneType'
'function1' 'f1_arg2' NAN NAN
'function1' 'f1_arg3' NAN NAN
'function2' 'f2_arg1' 0 'int'
'function3' 'f3_arg1' True 'bool'
'function3' 'f3_arg2' 'something' 'str'
This dataframe can be reproduced as follows:
import pandas as pd

D = {
    'FULL_NAME': ['function1', 'function1', 'function1', 'function2', 'function3', 'function3'],
    'ARGUMENT': ['f1_arg1', 'f1_arg2', 'f1_arg3', 'f2_arg1', 'f3_arg1', 'f3_arg2'],
    'DEF_VAL': [float('nan'), float('nan'), float('nan'), 0, True, 'something'],
    'TYPE': ['NoneType', float('nan'), float('nan'), 'int', 'bool', 'str'],
}
dataframe = pd.DataFrame(D)
The result I'm trying to obtain should look like this:
args function
[a1=NONE, a2=, a3=] function1(f1_arg1=a1, f1_arg2=a2, f1_arg3=a3)
[a1=0] function2(f2_arg1=a1)
[a1=True, a2=something] function3(f3_arg1=a1, f3_arg2=a2)
All the values in the columns 'FULL_NAME' and 'ARGUMENT' are strings.
As for a{i}: each a{i} should be set to the argument's default value, unless both the default value and the type are NAN, in which case a{i} is followed only by the '=' sign with nothing after it. If the default value of the argument is NAN but the type is NoneType, then a{i} must be None.
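To make the rule concrete, here is a minimal sketch of it as a plain Python function. The name format_default is hypothetical (it is not part of pandas or of the suggested solution), and the sketch assumes, as the solution below does, that the TYPE value alone decides which case applies. It writes None rather than NONE for the NoneType case.

```python
import pandas as pd


def format_default(index, def_val, type_name):
    """Build the 'a{i}=...' text for one argument (hypothetical helper)."""
    prefix = f"a{index}="
    if type_name == "NoneType":
        # Default is NAN but the type is NoneType: the default is None.
        return prefix + "None"
    if pd.isnull(type_name):
        # No type information at all: leave a bare '=' sign.
        return prefix
    # Otherwise use the recorded default value as text.
    return prefix + str(def_val)


print(format_default(1, float("nan"), "NoneType"))
print(format_default(2, float("nan"), float("nan")))
print(format_default(1, 0, "int"))
```

The three calls correspond to the three cases described above: a NoneType default, a missing default, and an ordinary default value.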
This can be achieved in the following way (the solution was suggested here):
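df = dataframe.copy()  # work on a copy of the dataframe built above

# Number the arguments within each function: a1, a2, ...
df['args'] = 'a' + (df.groupby('FULL_NAME').cumcount() + 1).astype(str)
df['ARGUMENT'] = df['ARGUMENT'] + '=' + df['args']
df['args'] += '='

# Append the default value according to the TYPE column
df['args'] = df.apply(
    lambda x: x['args'] + 'NONE' if x['TYPE'] == 'NoneType'
    else x['args'] if pd.isnull(x['TYPE'])
    else x['args'] + str(x['DEF_VAL']),
    axis=1,
)

# Collect the arguments of each function into one row
ndf = pd.concat(
    [pd.DataFrame(df.groupby('FULL_NAME')['ARGUMENT'].apply(tuple)),
     pd.DataFrame(df.groupby('FULL_NAME')['args'].apply(list))],
    axis=1,
)
ndf['function'] = (ndf.reset_index()['FULL_NAME'] + ndf.reset_index()['ARGUMENT'].apply(str)).tolist()
ndf = ndf.reset_index(drop=True).drop('ARGUMENT', axis=1)

# Strip the quotes and the trailing comma left over from the tuple repr
ndf['function'] = ndf['function'].replace(["'", r",\)"], ["", ")"], regex=True)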
However, I would like to impose one important condition. Namely, some of those functions are actually class methods and the initial dataframe may look like this:
FULL_NAME ARGUMENT DEF_VAL TYPE
'function1' 'self' NAN NAN
'function1' 'f1_arg2' 0 'int'
'function1' 'f1_arg3' NAN 'NoneType'
'function2' 'f2_arg1' 0 'int'
'function3' 'f3_arg1' True 'bool'
'function3' 'f3_arg2' 'something' 'str'
In this case I would like 'self' to be ignored and the resulting frame to look like this:
args function
[a1=0, a2=None] function1(f1_arg2=a1, f1_arg3=a2)
[a1=0] function2(f2_arg1=a1)
[a1=True, a2=something] function3(f3_arg1=a1, f3_arg2=a2)
The self argument is ignored. How do I achieve it by using pandas?
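One possible direction, given only as a minimal sketch and not a confirmed solution: drop the 'self' rows before the arguments are numbered, so that cumcount only counts the remaining arguments. The column names n and call and the helper default_text are hypothetical names introduced for illustration, and the sketch assumes that 'self' always appears literally in the ARGUMENT column.

```python
import pandas as pd

data = {
    'FULL_NAME': ['function1', 'function1', 'function1', 'function2', 'function3', 'function3'],
    'ARGUMENT': ['self', 'f1_arg2', 'f1_arg3', 'f2_arg1', 'f3_arg1', 'f3_arg2'],
    'DEF_VAL': [float('nan'), 0, float('nan'), 0, True, 'something'],
    'TYPE': [float('nan'), 'int', 'NoneType', 'int', 'bool', 'str'],
}
df = pd.DataFrame(data)

# Remove 'self' first so the numbering below starts at the first real argument.
df = df[df['ARGUMENT'] != 'self'].copy()

# a1, a2, ... numbered within each function.
df['n'] = 'a' + (df.groupby('FULL_NAME').cumcount() + 1).astype(str)


def default_text(row):
    """Return 'a{i}=<default>' for one row (hypothetical helper)."""
    if row['TYPE'] == 'NoneType':
        return row['n'] + '=None'
    if pd.isnull(row['TYPE']):
        return row['n'] + '='
    return row['n'] + '=' + str(row['DEF_VAL'])


df['args'] = df.apply(default_text, axis=1)
df['call'] = df['ARGUMENT'] + '=' + df['n']

# Collapse each function into one row: a list of defaults and a call string.
grouped = df.groupby('FULL_NAME', sort=False)
result = pd.DataFrame({
    'args': grouped['args'].apply(list),
    'function': grouped['call'].apply(', '.join),
}).reset_index()
result['function'] = result['FULL_NAME'] + '(' + result['function'] + ')'
result = result[['args', 'function']]

print(result)
```

One caveat with this sketch: a method whose only argument is 'self' would disappear from the result entirely, because none of its rows survive the filter.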