I'm new to Python. Could anybody show me how to create a regular expression, given a string like this:
test_string = "pero pero CC tan tan RG antigua antiguo AQ0FS0 que que CS según según SPS00 mi mi DP1CSS madre madre NCFS000"
How can I return a tuple like this:
> ([madre, NCFS00],[antigua, AQ0FS0])
I would like to return each word with its associated tag, given test_string. This is what I've done:
# -*- coding: utf-8 -*-
import re

str = "pero pero CC " \
"tan tan RG " \
"antigua antiguo AQ0FS0" \
"que que CS " \
"según según SPS00 " \
"mi mi DP1CSS " \
"madre madre NCFS000"

tupla1 = re.findall(r'(\w+)\s\w+\s(AQ0FS0)', str)
print tupla1

tupla2 = re.findall(r'(\w+)\s\w+\s(NCFS00)',str)
print tupla2
The output is the following:
[('antigua', 'AQ0FS0')]
[('madre', 'NCFS00')]
The problem with this output is that, when I pass test_string, I need to preserve the "order" or "occurrence" of the tags (i.e. I can only print a tuple if and only if the tags appear in this order: AQ0FS0 and then NCFS000, in other words a feminine adjective followed by a feminine noun).
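To make the ordering requirement concrete, here is a minimal sketch of one way it might be expressed as a single pattern. It assumes Python 3, assumes every word/lemma/tag triple is separated by whitespace, and reads "in order" as "the first tag occurs somewhere before the second", not necessarily adjacent. The function name `find_tags_in_order` and the variable `sample` are hypothetical, chosen only for illustration.

```python
# -*- coding: utf-8 -*-
import re


def find_tags_in_order(text, first_tag, second_tag):
    """Return [(word, first_tag), (word, second_tag)] if a token tagged
    first_tag occurs before a token tagged second_tag, otherwise []."""
    # One "word lemma TAG" triple; \b stops the tag from matching as a
    # prefix of a longer token (assumes triples are whitespace-separated).
    token = r'(\w+)\s+\w+\s+({tag})\b'
    pattern = (
        token.format(tag=re.escape(first_tag))
        + r'.*?'  # lazily skip whatever lies between the two triples
        + token.format(tag=re.escape(second_tag))
    )
    match = re.search(pattern, text, re.UNICODE | re.DOTALL)
    if match is None:
        return []
    return [(match.group(1), match.group(2)),
            (match.group(3), match.group(4))]


# Hypothetical sample input with a space between every triple.
sample = ("pero pero CC tan tan RG antigua antiguo AQ0FS0 que que CS "
          "según según SPS00 mi mi DP1CSS madre madre NCFS000")

print(find_tags_in_order(sample, "AQ0FS0", "NCFS000"))
# [('antigua', 'AQ0FS0'), ('madre', 'NCFS000')]

print(find_tags_in_order(sample, "NCFS000", "AQ0FS0"))
# []
```

Because both triples are part of one pattern, nothing is returned unless the adjective tag really does come first. Note that the sketch matches the full tag NCFS000 rather than the shortened NCFS00, and it only reports the first such pair it finds.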