I want to make a function which checks a string for occurrences of other strings within it.
However, the sub-strings being checked may be interrupted within the main string by other letters.
For instance:
a = 'abcde'
b = 'ace'
c = 'acb'
The function in question should report b as being in a, but not c.
I've tried set(a).intersection(set(b)) already, and my problem with that is that it returns c as being in a.
You can turn your expected sequence into a regex:
import re

def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    # Escape each character so regex metacharacters in s1 (".", "*", ...) match literally.
    pat = ".*".join(map(re.escape, s1))
    if re.search(pat, s2):
        return True
    return False

# or, more compactly:
def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    return bool(re.search(".*".join(map(re.escape, s1)), s2))

a = 'abcde'
b = 'ace'
c = 'acb'

assert sequence_in(b, a)
assert not sequence_in(c, a)
"ace" gets turned into the regex "a.*c.*e", which finds those three characters in sequence, with possible intervening characters.