Speed differences between intersection() and object for object in set if object in other_set

2024/10/11 10:20:19

Which of these is faster? Is one "better"? Basically, I'll have two sets and I eventually want to get one match between the two. So really, I suppose the for loop is more like:

for object in set:
    if object in other_set:
        return object

Like I said, I only need one match, but I'm not sure how intersection() is handled, so I don't know if it's any better. Also, if it helps, other_set is a list of nearly 100,000 elements and set is maybe a few hundred, a few thousand at most.

Answer
from timeit import timeit

setup = """
from random import sample, shuffle
a = range(100000)
b = sample(a, 1000)
a.reverse()
"""

forin = setup + """
def forin():
    # a = set(a)
    for obj in b:
        if obj in a:
            return obj
"""

setin = setup + """
def setin():
    # original method:
    # return tuple(set(a) & set(b))[0]
    # suggested in comment, doesn't change conclusion:
    return next(iter(set(a) & set(b)))
"""

print timeit("forin()", forin, number = 100)
print timeit("setin()", setin, number = 100)
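
The benchmark in this answer uses Python 2 syntax (print statements, and range() returning a list). For readers on Python 3, a port might look roughly like the sketch below. The run_benchmark wrapper and its number parameter are names introduced here for illustration and are not part of the original answer, and timings from it will not match the Python 2 figures reported in this answer.

```python
from timeit import timeit

# Shared setup: list(range(...)) is used so that `obj in a` is a linear
# list scan, as it was with Python 2's range(); a bare Python 3 range
# would answer membership tests in constant time and skew the comparison.
setup = """
from random import sample
a = list(range(100000))
b = sample(a, 1000)
a.reverse()
"""

forin = setup + """
def forin():
    # Return as soon as one element of b is found in a.
    for obj in b:
        if obj in a:
            return obj
"""

setin = setup + """
def setin():
    # Build the full intersection, then take one element from it.
    return next(iter(set(a) & set(b)))
"""


def run_benchmark(number=100):
    """Hypothetical wrapper: return (forin_seconds, setin_seconds)."""
    return (
        timeit("forin()", forin, number=number),
        timeit("setin()", setin, number=number),
    )


if __name__ == "__main__":
    loop_time, intersection_time = run_benchmark()
    print(loop_time)
    print(intersection_time)
```
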

Times reported for the Python 2 listing, in seconds (each pair is forin, then setin):

>>>
0.0929054012768
0.637904308732
>>>
0.160845057616
1.08630760484
>>>
0.322059185123
1.10931801261
>>>
0.0758695262169
1.08920981403
>>>
0.247866360526
1.07724461708
>>>
0.301856152688
1.07903130641

Converting them to sets in the setup and running 10000 iterations instead of 100 yields:

>>>
0.000413064976328
0.152831597075
>>>
0.00402408388788
1.49093627898
>>>
0.00394538156695
1.51841512101
>>>
0.00397715579584
1.52581949403
>>>
0.00421472926155
1.53156769646

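So your version (the for loop) is much faster, whether or not it makes sense to convert them to sets. The loop returns at the first match, while the intersection has to build the entire result set before a single element is taken from it.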
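
As a minimal sketch of the early-exit approach in Python 3, the loop can be written with a generator expression and next(). The name first_common and its parameters are hypothetical, chosen here for illustration. It assumes the elements are hashable and that None is not itself a valid element, since None is used as the "no match" result.

```python
def first_common(candidates, pool):
    """Return the first element of `candidates` that is also in `pool`, or None."""
    # Build the lookup set once, unless the caller already passed a set.
    lookup = pool if isinstance(pool, (set, frozenset)) else set(pool)
    # The generator is consumed lazily, so the scan stops at the first hit.
    return next((obj for obj in candidates if obj in lookup), None)


small = [7, 250000, 42]
large = range(100000)
print(first_common(small, large))  # 7
```

If the large collection is reused across many lookups, converting it to a set once up front and passing that set in avoids paying the conversion cost on every call.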

