Detect or Generate Regular Expression from String

2024/10/15 15:28:43

I was wondering whether there are any Python packages that detect (or generate) a regular expression from a string. Conceptually this seems easy enough to do, but I wanted to see whether anyone else has already solved this problem.

I've looked around on my own: I read the re module docs and didn't find anything, and the best hits I could find on Stack Overflow didn't cover it either. Googling mostly turns up how to use regex to parse strings. I also searched PyPI, but the only hit I could find was 'regexgen 1.0', and that didn't seem to lead anywhere...

To be clear, what I am looking for is something to the effect of:

    def detect_regex(some_string):
        [does stuff..]
        return regular_expression

Thoughts would be greatly appreciated. If there aren't any, I can write this myself. I just didn't want to waste time re-creating what's already been done. Thanks!

Edit: I may not have been very clear in my question; the regex 'foo' does match the string 'foo' but my goal is the following -- given the three strings with values foo123abc, abc078963bar, and xyz8940baz, the return would be ^[a-z]+[0-9]+[a-z]+$... hypothetically. So the regex would be "general" to some extent.
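
To make the target concrete, here is a quick check of the difference between the literal regex and the "general" one. This is only a sketch using the standard re module; the variable names are made up for illustration.

```python
import re

# The three sample strings from the edit above.
samples = ["foo123abc", "abc078963bar", "xyz8940baz"]

# The literal regex 'foo' only hits strings that contain "foo".
literal_hits = [bool(re.search(r"foo", s)) for s in samples]

# The "general" pattern describes the shape shared by all three strings.
general = re.compile(r"^[a-z]+[0-9]+[a-z]+$")
general_hits = [bool(general.match(s)) for s in samples]

print(literal_hits)   # [True, False, False]
print(general_hits)   # [True, True, True]
```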

Answer

Very limited, but a bit of fun. Only deals with alpha and numerics as per your examples. Was also just curious and hopefully this helps to give you a suitable starting point:

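    def auto_re(text):
        # True if the string starts with a letter, False if it starts with a digit
        cur = first = str.isalpha(text[0])
        count = 0  # number of switches between letters and digits
        for letter in text:
            if str.isalpha(letter) != cur:
                cur = not cur
                count += 1
        # Indexed by bool: False -> digits, True -> letters
        modes = ["[0-9]+", "[a-z]+"]
        # Integer division (//) so this also works on Python 3
        return ("^" + modes[first]
                + (count // 2) * (modes[not first] + modes[first])
                + (count % 2) * modes[not first]
                + "$")

    tests = ["foo123abc", "abc078963bar", "xyz8940baz", "1a", "a1", "1a2b3c", "a12b3c"]
    for test in tests:
        print("%-20s %s" % (test, auto_re(test)))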

Gives the following output:

foo123abc            ^[a-z]+[0-9]+[a-z]+$
abc078963bar         ^[a-z]+[0-9]+[a-z]+$
xyz8940baz           ^[a-z]+[0-9]+[a-z]+$
1a                   ^[0-9]+[a-z]+$
a1                   ^[a-z]+[0-9]+$
1a2b3c               ^[0-9]+[a-z]+[0-9]+[a-z]+[0-9]+[a-z]+$
a12b3c               ^[a-z]+[0-9]+[a-z]+[0-9]+[a-z]+$
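
The question asks for one regex covering several strings, so here is a rough sketch of how the same run-based idea might be extended. It is not a known package: detect_regex and char_class are hypothetical names, the choice of character classes is an assumption, and it simply returns None when the inputs don't share a shape.

```python
import re
import string
from itertools import groupby


def char_class(ch):
    """Map one character to a regex class (hypothetical helper)."""
    if ch in string.digits:
        return "[0-9]"
    if ch in string.ascii_lowercase:
        return "[a-z]"
    if ch in string.ascii_uppercase:
        return "[A-Z]"
    # Anything else is kept as an escaped literal (an assumption).
    return re.escape(ch)


def detect_regex(strings):
    """Return a pattern shared by all strings, or None if shapes differ."""
    patterns = set()
    for s in strings:
        # Collapse each run of same-class characters into "class+".
        parts = [cls + "+" for cls, _ in groupby(s, key=char_class)]
        patterns.add("^" + "".join(parts) + "$")
    if len(patterns) != 1:
        return None
    return patterns.pop()


samples = ["foo123abc", "abc078963bar", "xyz8940baz"]
pattern = detect_regex(samples)
print(pattern)                        # ^[a-z]+[0-9]+[a-z]+$
print(detect_regex(["1a", "a1"]))     # None
```

Requiring identical shapes is the simplest possible rule. Merging different shapes into one pattern (optional groups, alternation) is a harder problem that this sketch doesn't attempt.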
