Regex to match special list items

2024/11/16 18:51:05

I have a weird list of items and nested lists, with | as the delimiter and [[ ]] as parentheses. It looks like this:

| item1 | item2 | item3 | Ulist1[[ | item4 | item5 | Ulist2[[ | item6 | item7 ]] | item8 ]] | item9 | list3[[ | item10 | item11 | item12 ]] | item13 | item14

I want to match items in lists called Ulist* (items 4-8) using RegEx and replace them with Uitem*. The result should look like this:

| item1 | item2 | item3 | Ulist1[[ | Uitem4 | Uitem5 | Ulist2[[ | Uitem6 | Uitem7 ]] | Uitem8 ]] | item9 | list3[[ | item10 | item11 | item12 ]] | item13 | item14

I tried almost everything I know about RegEx, but I haven't found any RegEx that matches each item inside the Ulists. My current RegEx:

/Ulist(\d+)\[\[(\s*(\|\s*[^\s\|]*)*\s*)*\]\]/i

What is wrong? I am a beginner with RegEx.

It is in Python 2.7; specifically, my code is:

    def fixDirtyLists(self, text):
        text = textlib.replaceExcept(text, r'Ulist(\d+)\[\[(\s*(\|\s*[^\s\|]*)*\s*)*\]\]', r'Ulist\1[[ U\3 ]]', '', site=self.site)
        return text

text holds that weird list, and textlib replaces whatever the RegEx matches with the replacement pattern. Not complicated at all.

Answer

If you install the PyPI regex module (with Python 2.7.9+, running pip install regex from the \Python27\Scripts\ folder is enough), you will be able to match nested square brackets. You can then match the substrings you need and replace item with Uitem inside only those substrings.
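As background on why the single substitution in the question cannot rewrite every item: a repeated capturing group keeps only its last iteration, so a backreference such as \3 sees at most one item. A minimal sketch with the standard re module, using a simplified pattern that is an assumption for illustration and not the exact one from the question:

```python
import re

# Simplified stand-in for the question's idea: one group repeated once per item.
m = re.search(r'\[\[(?:\s*\|\s*(\w+))+\s*\]\]', 'Ulist2[[ | item6 | item7 ]]')

# The group matched twice (item6, then item7), but only the final
# iteration is retained, so a backreference could only ever insert item7.
last_item = m.group(1)
print(last_item)  # item7
```

This is why the approach below matches the whole Ulist block first and then rewrites the items in a callback.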

The pattern (note that recursion in the PyPI regex module resembles that of PCRE):

(Ulist\d+)(\[\[(?>[^][]|](?!])|\[(?!\[)|(?2))*]])
^-Group1-^^-----------Group2--------------------^

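A short explanation: (Ulist\d+) is Group 1, which matches the literal word Ulist followed by one or more digits. It is followed by Group 2, (\[\[(?>[^][]|](?!])|\[(?!\[)|(?2))*]]), which matches a substring starting with [[ and running up to the corresponding ]]. Inside Group 2, the atomic group (?>...) matches any character other than [ and ], a single ] not followed by another ], a single [ not followed by another [, or, through (?2), a recursive match of the whole Group 2 pattern, which is what handles nested [[ ... ]] pairs.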

And the Python code:

>>> import regex
>>> s = "| item1 | item2 | item3 | Ulist1[[ | item4 | item5 | Ulist2[[ | item6 | item7 ]] | item8 ]] | item9 | list3[[ | item10 | item11 | item12 ]] | item13 | item14"
>>> pat = r'(Ulist\d+)(\[\[(?>[^][]|](?!])|\[(?!\[)|(?2))*]])'
>>> res = regex.sub(pat, lambda m: m.group(1) + m.group(2).replace("item", "Uitem"), s)
>>> print(res)
| item1 | item2 | item3 | Ulist1[[ | Uitem4 | Uitem5 | Ulist2[[ | Uitem6 | Uitem7 ]] | Uitem8 ]] | item9 | list3[[ | item10 | item11 | item12 ]] | item13 | item14

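To avoid modifying plain lists (list*) nested inside a Ulist, use

def repl(m):
    # The capturing group makes split() keep the plain-list chunks in the result,
    # so they can be passed through unchanged instead of being dropped.
    parts = regex.split(r'(\blist\d*\[{2}[^\]]*(?:](?!])[^\]]*)*]])', m.group(0))
    return "".join([x.replace("item", "Uitem") if not x.startswith("list") else x for x in parts])

and replace the regex.sub call with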

res = regex.sub(pat, repl, s)
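
If installing a third-party module is not an option, the same result can be approximated with the standard re module by tokenizing on the list delimiters and tracking nesting by hand. This is a different technique from the recursive pattern above, offered only as a sketch; the function name prefix_items_in_ulists is hypothetical, and it assumes that the innermost enclosing list decides whether an item gets the U prefix.

```python
import re

def prefix_items_in_ulists(text):
    """Prefix 'item' with 'U' where the innermost enclosing list is a Ulist."""
    # Splitting on a capturing group keeps the delimiters ("name[[" and "]]").
    tokens = re.split(r'(\w*\[\[|\]\])', text)
    stack = []  # names of the currently open lists, innermost last
    out = []
    for tok in tokens:
        if tok.endswith('[['):
            stack.append(tok[:-2])      # opening delimiter: remember the list name
            out.append(tok)
        elif tok == ']]':
            if stack:
                stack.pop()             # closing delimiter: leave the innermost list
            out.append(tok)
        elif stack and stack[-1].startswith('Ulist'):
            # \b keeps already prefixed "Uitem" from being prefixed again.
            out.append(re.sub(r'\bitem', 'Uitem', tok))
        else:
            out.append(tok)
    return ''.join(out)

s = "| item1 | item2 | item3 | Ulist1[[ | item4 | item5 | Ulist2[[ | item6 | item7 ]] | item8 ]] | item9 | list3[[ | item10 | item11 | item12 ]] | item13 | item14"
print(prefix_items_in_ulists(s))
```

Because the stack records list names, a plain list nested inside a Ulist is left alone without a second pattern.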