Find all rows in a dataframe that match a specific pattern and extract the strings after a split

2024/11/18 11:19:51

I have a dataframe that looks like

LineEntry: [0x0000000002758261-0x0000000002758268): /a/b/c:7921:14
LineEntry: [0x0000000002756945-0x0000000002756960): /f/b/c:6545:10
LineEntry: [0x00000000027562c9-0x00000000027562d0): /k/b/c
LineEntry: [0x00000000027562c9-0x00000000027562d0): /c/d/f
....
....

I am interested only in strings that look like the first two entries (i.e. splitting on the colon delimiter yields 5 strings), and I want to extract the two values on either side of the last colon (e.g. 7921 and 14).

After filtering, the dataframe should look like
LineEntry: [0x0000000002758261-0x0000000002758268): /a/b/c:7921:14
LineEntry: [0x0000000002756945-0x0000000002756960): /f/b/c:6545:10

I have tried res = re.split(":", df) and then used res[3] and res[4] to extract the last two strings on either side of the final colon, but I get errors for lines like the 3rd and 4th above.

Is there an effective way to keep only the dataframe entries that look exactly like the first two lines?

Answer

I'm assuming that the example data is a column in a dataframe and that you want to keep only the rows that end with a digit. You can use str.isdigit() for this, as below:

arr = df['your_col_name'].to_list()
[x for x in arr if x[-1].isdigit()]  # returns just the matching column values as a list

Output:

['LineEntry: [0x0000000002758261-0x0000000002758268): /a/b/c:7921:14','LineEntry: [0x0000000002756945-0x0000000002756960): /f/b/c:6545:10']

If instead, you want to filter the data frame:

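df = df.reset_index(drop=True)  # optional: renumber rows 0..n-1 without adding an "index" column
arr = df['your_col_name'].to_list()
keep = [i for i, x in enumerate(arr) if x[-1].isdigit()]  # positions of rows ending in a digit
df.iloc[keep, :]  # iloc selects by position, so these match the enumerate() positions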

Output:

                                       your_col_name
0  LineEntry: [0x0000000002758261-0x0000000002758...
1  LineEntry: [0x0000000002756945-0x0000000002756...
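
The same filter can probably be written as a boolean mask, without going through a Python list. This is only a minimal sketch: it assumes the column is called `your_col_name` (the placeholder used above), that every value is a non-empty string, and it rebuilds the sample data from the question so it can run on its own.

```python
import pandas as pd

# Sample data copied from the question; the column name is a placeholder.
df = pd.DataFrame({
    "your_col_name": [
        "LineEntry: [0x0000000002758261-0x0000000002758268): /a/b/c:7921:14",
        "LineEntry: [0x0000000002756945-0x0000000002756960): /f/b/c:6545:10",
        "LineEntry: [0x00000000027562c9-0x00000000027562d0): /k/b/c",
        "LineEntry: [0x00000000027562c9-0x00000000027562d0): /c/d/f",
    ]
})

# Take the last character of each string and test whether it is a digit.
mask = df["your_col_name"].str[-1].str.isdigit()

# Boolean indexing keeps only the rows where the mask is True.
filtered = df[mask]
print(filtered)
```

If the column can contain missing values, the mask would need to be filled first (for example with `mask.fillna(False)`) before it is used for indexing.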

Another option is to use re.search(r'\d+$', string), but this regex does something similar: it checks whether the string ends with one or more digits.
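
Neither check above actually pulls out the two numbers the question asks for. One possible way to filter and extract in a single step is `Series.str.extract` with a regex anchored at the end of the string. The sketch below is an assumption-laden illustration, not the original answer's method: the column name `your_col_name` is the placeholder from above, and the group names `line` and `col` are hypothetical labels chosen here for the two trailing numbers.

```python
import pandas as pd

# Sample data copied from the question; the column name is a placeholder.
df = pd.DataFrame({
    "your_col_name": [
        "LineEntry: [0x0000000002758261-0x0000000002758268): /a/b/c:7921:14",
        "LineEntry: [0x0000000002756945-0x0000000002756960): /f/b/c:6545:10",
        "LineEntry: [0x00000000027562c9-0x00000000027562d0): /k/b/c",
        "LineEntry: [0x00000000027562c9-0x00000000027562d0): /c/d/f",
    ]
})

# Require ":<digits>:<digits>" at the very end of the string.
# The named groups become the column names of the extracted frame.
pattern = r":(?P<line>\d+):(?P<col>\d+)$"

# Rows that do not match get NaN in both extracted columns.
parts = df["your_col_name"].str.extract(pattern)

# Attach the extracted columns, drop non-matching rows, convert to integers.
result = (
    df.join(parts)
      .dropna(subset=["line", "col"])
      .astype({"line": int, "col": int})
)
print(result[["line", "col"]])
```

This pattern is stricter than the last-character check: a line ending in a single number (for example `/a/b/c:7921`) would be dropped here but kept by `x[-1].isdigit()`.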

https://en.xdnf.cn/q/118677.html
