Which one of these is faster? Is one "better"? Basically, I'll have two sets, and I eventually want to get a single match between the two. So really, I suppose the for loop is more like:
    for object in set:
        if object in other_set:
            return object
Like I said, I only need one match, but I'm not sure how intersection() is handled, so I don't know if it's any better. Also, if it helps, other_set is a list of nearly 100,000 items, and set is maybe a few hundred, a few thousand at most.
    from timeit import timeit

    setup = """
    from random import sample, shuffle
    # list() so that reverse() also works on Python 3, where range() is not a list
    a = list(range(100000))
    b = sample(a, 1000)
    a.reverse()
    """

    forin = setup + """
    def forin():
        # a = set(a)
        for obj in b:
            if obj in a:
                return obj
    """

    setin = setup + """
    def setin():
        # original method:
        # return tuple(set(a) & set(b))[0]
        # suggested in comment, doesn't change conclusion:
        return next(iter(set(a) & set(b)))
    """

    # the second positional argument to timeit() is the setup code
    print(timeit("forin()", forin, number=100))
    print(timeit("setin()", setin, number=100))
Times (in seconds; for each run, forin first, then setin):
>>>
0.0929054012768
0.637904308732
>>>
0.160845057616
1.08630760484
>>>
0.322059185123
1.10931801261
>>>
0.0758695262169
1.08920981403
>>>
0.247866360526
1.07724461708
>>>
0.301856152688
1.07903130641
Making them into sets in the setup and running 10000 runs instead of 100 yields:
>>>
0.000413064976328
0.152831597075
>>>
0.00402408388788
1.49093627898
>>>
0.00394538156695
1.51841512101
>>>
0.00397715579584
1.52581949403
>>>
0.00421472926155
1.53156769646
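The modified script for this second batch isn't shown above, so here is only a rough sketch of what "making them into sets in the setup" might look like. It is written for Python 3, and the exact setup (building `a` and `b` as sets once, outside the timed calls) is my assumption, not necessarily what produced the numbers above.

```python
from timeit import timeit

# Assumed setup: both collections are converted to sets once,
# so the timed calls measure only the lookup / intersection itself.
setup = """
from random import sample
a = set(range(100000))
b = set(sample(range(100000), 1000))

def forin():
    # stops at the first element of b that is also in a
    for obj in b:
        if obj in a:
            return obj

def setin():
    # builds the full intersection, then takes one element from it
    return next(iter(a & b))
"""

forin_time = timeit("forin()", setup, number=10000)
setin_time = timeit("setin()", setup, number=10000)
print(forin_time)
print(setin_time)
```

The absolute numbers will differ by machine and Python version, so treat the output as a relative comparison only.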
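So your version is much faster, whether or not it makes sense to convert them to sets: the loop stops at the first match, while the intersection always builds the complete result before you take one element from it.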
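If you want the early-exit loop as a reusable one-liner, something like the following should work. This is just a sketch: `first_common` is a name I made up, and it assumes the large collection has already been converted to a set so that each `in` check is a hash lookup rather than a scan of a 100,000-item list.

```python
def first_common(small, large_set):
    """Return the first item of `small` that is also in `large_set`, or None."""
    # The generator is lazy, so next() stops as soon as one match is found.
    return next((obj for obj in small if obj in large_set), None)


other_set = set(range(100000))    # the big collection, converted once
candidates = [-1, -2, 500, 42]    # hypothetical small collection

print(first_common(candidates, other_set))  # prints 500
```

Passing `None` as the default to `next()` avoids a `StopIteration` when there is no match at all.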