I am trying to match dates in a string where the date is formatted as (month dd, yyyy). I am confused by what I see when I use my regex pattern below. It only matches strings that begin with a date. What am I missing?
>>> p = re.compile('[A-z]{3}\s{1,}\d{1,2}[,]\s{1,}\d{4}')
>>> s = "xyz Dec 31, 2013 - Jan 4, 2014"
>>> print p.match(s).start()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'start'
>>> s = "Dec 31, 2013 - Jan 4, 2014"
>>> print p.match(s).start()
0  # Correct
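Use re.findall rather than re.match. re.match only tries to match at the very beginning of the string, which is why your pattern fails when the string starts with "xyz". re.findall scans the whole string and returns a list of all matches: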
>>> import re
>>> # [A-Za-z] rather than [A-z]: the range A-z also covers [ \ ] ^ _ and the backtick
>>> s = "Dec 31, 2013 - Jan 4, 2014"
>>> r = re.findall(r'[A-Za-z]{3}\s{1,}\d{1,2}[,]\s{1,}\d{4}', s)
>>> r
['Dec 31, 2013', 'Jan 4, 2014']
>>>
>>> s = 'xyz Dec 31, 2013 - Jan 4, 2014'
>>> r = re.findall(r'[A-Za-z]{3}\s{1,}\d{1,2}[,]\s{1,}\d{4}', s)
>>> r
['Dec 31, 2013', 'Jan 4, 2014']
From Python docs:
re.match(pattern, string, flags=0)
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance.
On the other hand, findall() matches all occurrences of a pattern, not just the first one as search() does.
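Since the question calls .start() on the match, here is a minimal sketch comparing match, search, findall and finditer on the same string. The names DATE, anchored, first, all_dates and spans are only illustrative. I have assumed [A-Za-z] is what was meant by [A-z], and shortened \s{1,} to \s+ and [,] to a plain comma, which should be equivalent.

```python
import re

# Same date shape as in the question; [A-Za-z] is an assumed fix for [A-z].
DATE = re.compile(r'[A-Za-z]{3}\s+\d{1,2},\s+\d{4}')

s = "xyz Dec 31, 2013 - Jan 4, 2014"

# match() only tries at index 0, so it returns None for this string.
anchored = DATE.match(s)

# search() returns a match object for the first occurrence anywhere.
first = DATE.search(s)

# findall() returns the matched substrings as a list of strings.
all_dates = DATE.findall(s)

# finditer() yields match objects, so the positions are available too.
spans = [(m.group(), m.start()) for m in DATE.finditer(s)]

print(anchored)
print(first.start())
print(all_dates)
print(spans)
```

If you only need the first date and its position, search() is enough. If you need every date along with where it starts, finditer() is probably the better fit, because findall() gives you strings without offsets.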