Fast way to get N Min or Max elements from a list in Python

2024/11/15 0:40:11

I currently have a long list which is being sorted using a lambda function f. I then choose a random element from the first five elements. Something like:

f = lambda x: some_function_of(x, local_variable)
my_list.sort(key=f)
foo = choice(my_list[:4])

This is a bottleneck in my program, according to the profiler. How can I speed things up? Is there a fast, inbuilt way to retrieve the elements I want (in theory shouldn't need to sort the whole list). Thanks.

Answer

Use heapq.nlargest or heapq.nsmallest.

For example:

import heapqelements = heapq.nsmallest(4, my_list, key=f)
foo = choice(elements)

This will take O(N+KlogN) time (where K is the number of elements returned, and N is the list size), which is faster than O(NlogN) for normal sort when K is small relative to N.

https://en.xdnf.cn/q/72472.html

Related Q&A

Continue if else in inline for Python

I have not been able to find the trick to do a continue/pass on an if in a for, any ideas?. Please dont provide explicit loops as solutions, it should be everything in a one liner.I tested the code wi…

HTTPError: HTTP Error 403: Forbidden on Google Colab

I am trying to download MNIST data in PyTorch using the following code:train_loader = torch.utils.data.DataLoader(datasets.MNIST(data,train=True,download=True,transform=transforms.Compose([transforms.T…

Pandas partial melt or group melt

I have a DataFrame like this>>> df = pd.DataFrame([[1,1,2,3,4,5,6],[2,7,8,9,10,11,12]], columns=[id, ax,ay,az,bx,by,bz]) >>> dfid ax ay az bx by bz 0 1 1 2 3 4 5 6…

How do I detect when my window is minimized with wxPython?

I am writing a small wxPython utility.I would like to use some event to detect when a user minimizes the application/window.I have looked around but did not find an event like wx.EVT_MINIMIZE that I co…

Pandas: How to select a column in rolling window

I have a dataframe (with columns a, b, c) on which I am doing a rolling-window.I want to be able to filter the rolling window using one of the columns (say a) in the apply function like belowdf.rolling…

What is the fastest way in Cython to create a new array from an existing array and a variable

Suppose I have an arrayfrom array import array myarr = array(l, [1, 2, 3])and a variable: myvar = 4 what is the fastest way to create a new array:newarray = array(l, [1, 2, 3, 4])You can assume all ele…

Subclassing and built-in methods in Python

For convenience, I wanted to subclass socket to create an ICMP socket:class ICMPSocket(socket.socket):def __init__(self):socket.socket.__init__(self, socket.AF_INET,socket.SOCK_RAW,socket.getprotobynam…

How to load Rs .rdata files into Python?

I am trying to convert one part of R code in to Python. In this process I am facing some problems.I have a R code as shown below. Here I am saving my R output in .rdata format.nms <- names(mtcars) s…

how to set cookie in python mechanize

After sending request to the serverbr.open(http://xxxx)br.select_form(nr=0) br.form[MESSAGE] = 1 2 3 4 5br.submit()I get the response title, which has set-cookieSet-Cookie: PON=xxx.xxx.xxx.111; expir…

How can I deal with a massive delete from Django Admin?

Im working with Django 2.2.10.I have a model called Site, and a model called Record. Each record is associated with a single site (Foreign Key).After my app runs for a few days/weeks/months, each site …