I have a dataframe, df, where I would like to remove the part of each col1 value that comes before the first underscore '_' and the part that comes after the second underscore (when there is one), essentially keeping the middle.
I would also like to keep the digits at the end and concatenate them with the extracted middle part.
Data
col1 col2
a_bu1 dd
a_lap_aa1 d
a_lap_aa2 d
h_bb_led1 dd
Desired
col1 col2
bu1 dd
lap1 d
lap2 d
bb1 dd
Doing
re.sub(r'^.*?I', 'I', stri)
However, this does not keep the rest of the dataset intact. I am still researching. Any advice is appreciated.
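
One possible approach, sketched under assumptions inferred from the sample rows: the middle part is the run of non-digit, non-underscore characters right after the first underscore, and the digits to keep are the ones at the very end of the string. Note that the pattern in the attempt above searches for a literal 'I', which does not occur in the sample data, so it would leave these values unchanged. The names PATTERN and keep_middle_and_digits are hypothetical and used only for illustration.

```python
import re

# Assumed rule (inferred from the sample data, not confirmed by the author):
#   1. skip everything up to and including the first underscore,
#   2. capture the non-digit text up to the next underscore or digit,
#   3. skip anything in between,
#   4. capture the digits at the end of the string.
PATTERN = re.compile(r'^[^_]*_([^_\d]+).*?(\d+)$')

def keep_middle_and_digits(value: str) -> str:
    """Return middle segment + trailing digits; values that do not match are returned unchanged."""
    return PATTERN.sub(r'\1\2', value)

samples = ['a_bu1', 'a_lap_aa1', 'a_lap_aa2', 'h_bb_led1']
result = [keep_middle_and_digits(s) for s in samples]
print(result)
```

To apply this to the dataframe while leaving col2 and the row order untouched, something like `df['col1'] = df['col1'].map(keep_middle_and_digits)` should work, assuming every col1 value is a string. The same pattern can also be passed to `df['col1'].str.replace(..., regex=True)` with `r'\1\2'` as the replacement.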