Can you search backwards from an offset using a Python regular expression?

2024/10/15 22:30:53

Given a string, and a character offset within that string, can I search backwards using a Python regular expression?

The actual problem I'm trying to solve is to get a matching phrase at a particular offset within a string, but I have to match the first instance before that offset.

In a situation where I have a regex that's one symbol long (e.g., a word boundary), I'm using a solution where I reverse the string.

import re

my_string = "Thanks for looking at my question, StackOverflow."
offset = 30
boundary = re.compile(r'\b')
# Search forwards from the offset for the next word boundary
end = boundary.search(my_string, offset)
end_boundary = end.start()

Output: 33

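# Search the reversed string from the mirrored offset, then map the hit back
end = boundary.search(my_string[::-1], len(my_string) - offset - 1)
start_boundary = len(my_string) - end.start()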

Output: 25

my_string[start_boundary:end_boundary]

Output: 'question'

However, this "reverse" technique won't work if I have a more complicated regular expression that may involve multiple characters. For example, if I wanted to match the first instance of "ing" that appears before a specified offset:

my_new_string = "Looking feeding dancing prancing"
offset = 16 # on the word dancing
m = re.match(r'(.*?ing)', my_new_string) # Except looking backwards

Ideal output: feeding

I can likely use other approaches (split the file up into lines, and iterate through the lines backwards), but using a regular expression backwards seems like a conceptually simpler solution.


Using a positive lookbehind to make sure there are at least 30 characters before the end of the matched word:

# With offset = 30 the pattern is r'.*?(\w+)(?<=.{30})'
# (reuses my_string and offset from the question)
m = re.match(r'.*?(\w+)(?<=.{%d})' % offset, my_string)
if m:
    print(m.group(1))
else:
    print("no match")
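
As a rough sketch of how this could be packaged, the helper below builds the same pattern and also returns the span of the captured word, which is what the question's start/end boundary code was computing. The name `word_at` is hypothetical (not part of `re`), and passing `re.DOTALL` is an assumption added here so that `.` also counts newlines; the snippet above does not use it.

```python
import re

def word_at(text, offset):
    """Return (word, start, end) for the first word that ends at or after offset."""
    # Same pattern as above; re.DOTALL is an added assumption so '.' spans newlines
    pattern = re.compile(r'.*?(\w+)(?<=.{%d})' % offset, re.DOTALL)
    m = pattern.match(text)
    if m is None:
        return None
    return m.group(1), m.start(1), m.end(1)

my_string = "Thanks for looking at my question, StackOverflow."
result = word_at(my_string, 30)
print(result)  # ('question', 25, 33)
```

The returned span matches the 25 and 33 found by the string-reversal approach in the question.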

For the other example, a negative lookbehind may help:

import re

my_new_string = "Looking feeding dancing prancing"
offset = 16
m = re.match(r'.*(\b\w+ing)(?<!.{%d})' % offset, my_new_string)
if m:
    print(m.group(1))  # feeding

This first matches greedily as many characters as possible with .*, then backtracks until the negative lookbehind (?<!.{16}) succeeds, that is, until the captured word ends fewer than 16 characters into the string.
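
A possible alternative, sketched here rather than taken from the answer above, is to scan forwards with `finditer` and keep the last match. Compiled patterns accept `pos` and `endpos` arguments, and `endpos` makes the pattern behave as if the string ended at the offset. The helper name `last_match_before` is made up for illustration.

```python
import re

def last_match_before(pattern, text, offset):
    """Return the last match that ends at or before offset, or None."""
    last = None
    # endpos=offset: the pattern sees text as if it stopped at offset
    for m in re.compile(pattern).finditer(text, 0, offset):
        last = m
    return last

my_new_string = "Looking feeding dancing prancing"
m = last_match_before(r'\b\w+ing', my_new_string, 16)
print(m.group(0) if m else "no match")  # feeding
```

One caveat: `finditer` yields leftmost, non-overlapping matches, so for patterns whose matches can overlap the result may differ from what a true right-to-left search would find.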

