Python regular expression to replace everything but specific words

2024/10/10 8:24:06

I am trying to do the following with a regular expression:

import re
x = re.compile('[^(going)|^(you)]')    # words to replace
s = 'I am going home now, thank you.' # string to modify
print(re.sub(x, '_', s))

The result I get is:

'_____going__o___no______n__you_'

The result I want is:

'_____going_________________you_'

Since ^ only negates inside brackets [], where it applies to single characters rather than whole words, this result makes sense, but I'm not sure how else to go about it.

I even tried '([^g][^o][^i][^n][^g])|([^y][^o][^u])' but it yields '_g_h___y_'.

Answer

Not quite as easy as it first appears, since ^ inside [ ] negates only a single character, not a whole word (as you found). Here is my solution:

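import re

def subit(m):
    # m.groups() returns (text before the word, the word itself)
    stuff, word = m.groups()
    # replace every character before the word with an underscore, keep the word
    return ("_" * len(stuff)) + word

s = 'I am going home now, thank you.' # string to modify
print(re.sub(r'(.+?)(going|you|$)', subit, s))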

Gives:

_____going_________________you_

To explain: the RE itself (I always use raw strings) matches one or more of any character (.+), but non-greedily (?), so it stops at the first position where the rest of the pattern can match. This is captured in the first group (the parentheses). It is followed by a second group that matches either "going" or "you" or the end of the string ($).
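To see what the two groups capture on each match, here is a small sketch using re.finditer with the same pattern and string as above (the names pattern and pairs are just for illustration):

```python
import re

s = 'I am going home now, thank you.'
pattern = re.compile(r'(.+?)(going|you|$)')

# Collect the (stuff, word) pair captured by each successive match.
pairs = [m.groups() for m in pattern.finditer(s)]

for stuff, word in pairs:
    print(repr(stuff), repr(word))
# 'I am ' 'going'
# ' home now, thank ' 'you'
# '.' ''
```

The last match ends on $, so its second group is an empty string and only the underscores are emitted for it.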

subit is a function (you can call it anything within reason) which is called for each substitution. A match object is passed, from which we can retrieve the captured groups. We only need the length of the first group, since each of its characters is replaced with an underscore; the second group (the word, or an empty string at the end) is appended unchanged. The returned string replaces the text matched by the pattern.
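One thing to be aware of: because (.+?) needs at least one character, a keyword at the very start of the string (or directly after another keyword) ends up inside the first group and gets replaced. As a possible variation, not the approach above, here is a sketch that tries the keywords first at every position and otherwise replaces a single character. The helper name mask_except and its parameters are hypothetical, chosen for this example only.

```python
import re

def mask_except(text, words, fill='_'):
    """Replace every character of text with fill, except occurrences of words."""
    # Ignore empty strings; with no usable words, everything is masked.
    words = [w for w in words if w]
    if not words:
        return fill * len(text)
    # Longest words first, so a longer keyword wins over a shorter prefix of it.
    alternation = '|'.join(re.escape(w) for w in sorted(words, key=len, reverse=True))
    # Either a keyword (group 1) or any single character, newlines included.
    pattern = re.compile('(' + alternation + ')|.', re.DOTALL)
    # group(1) is None when the single-character branch matched.
    return pattern.sub(lambda m: m.group(1) or fill, text)

s = 'I am going home now, thank you.'
print(mask_except(s, ['going', 'you']))
# _____going_________________you_
```

Note that this matches keywords as plain substrings, so 'you' inside 'young' would also be kept; adding \b around the alternation would restrict it to whole words.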
