looking for an inverted heap in python

2024/10/13 9:16:47

I'd like to comb off the n largest extremes from a timeseries. heapq works perfectly for the nlargest

def nlargest(series, n):count = 0heap = []for e in series:if count < n:count+=1hp.heappush(heap, e)else:# keeps heap size fixed hp.heappushpop(heap,e)  ''' note: heap[0] is smallest '''return heap

but what about the n smallest? Note that i want a subset of the original series, so heapify and reversing the order won't work. What I'd like is essentially to overload the comparison operator from gt to lt. Not so familiar with overloading in python.

A less attractive option would (assuming numerical values) would be to negate the item before inserting and then negate the entire return heap (return a list or re-heap the negated list) but this seems kludgy and it no longer works for non-numericals that do have gt and lt. Any elegant solutions?

Answer

You can easily 'create' an inverted heap by multiplying the priorities of your items by -1.

So your nsmallest simply needs to be told how to 'invert' the priorities, decorating each value as needed:

def nsmallest(series, n, invert=lambda x: -1 * x):count = 0heap = []for e in series:if count < n:count += 1hp.heappush(heap, (invert(e), e))else:# keeps heap size fixedhp.heappushpop(heap, (invert(e), e))  # note: heap[0][1] is largest, remove inverted prioritiesreturn [h[1] for h in heap]

Note that we use a (invertedpriority, value) tuple to keep the heap inverted.

For non-numericals, you'd have to simply provide a inversion function that turns the priorities upside down, it only needs to be a simple key, not something that is readable or anything:

alphanumeric_invert = lambda x: [(ord(c) * -1) for c in x] 

However, rather than write your own, you want to use the heapq.nsmallest() function, which uses an optimised max-heap implementation (there is an internal _heappop_max() function it uses), which also adds a tie-breaker counting value to keep the sort stable. And there is a matching heapq.nlargest() function.

https://en.xdnf.cn/q/118097.html

Related Q&A

Concatenating Multiple DataFrames with Non-Standard Columns

Is there a good way to concatenate a list of DataFrames where the columns are not regular between DataFrames? The desired outcome is to match up all columns that are a match but to keep the ones that …

Python Conditionally Add Class to td Tags in HTML Table

I have some data in the form of a csv file that Im reading into Python and converting to an HTML table using Pandas.Heres some example data:name threshold col1 col2 col3 A 10 12 9 13…

Sort a dictionary of dictionaries python

I have a dictionary of dictionaries like the followingd = {hain: {facet: 1, wrapp: 1, chinoiserie: 1}, library: {sconc: 1, floor: 1, wall: 2, lamp: 6, desk: 1, table: 1, maine: 1} }So, I want to rever…

How can I get python generated excel document to correctly calculate array formulas

I am generating some excel files with python using python 3.6 and openpyxl.At one point I have to calculate standard deviations of a subsection of data. In excel this is done with an array formula. Wri…

Unable to locate element in Python Selenium

Im trying to locate an element using python selenium, and have the following code:zframe = driver.find_element_by_xpath("/html/frameset/frameset/frame[5]") driver.switch_to.frame(zframe) find…

How to import a variable from a different class

I have an instance of a class that i set the value to self.world inside a class named zeus inside a module named Greek_gods. and i have another class names World inside a module name World.How can i te…

Scrapy: AttributeError: YourCrawler object has no attribute parse_following_urls

I am writing a scrapy spider. I have been reading this question: Scrapy: scraping a list of links, and I can make it recognise the urls in a listpage, but I cant make it go inside the urls and save the…

initializer is not a constant, error C2099, on compiling a module written in c for python

i tried to compile a python module called distance, whith c "python setup.py install --with-c" using msvc 2017 on windows 10, i got this error ,Cdistance / distance.c (647): error C2099: init…

How can make pandas columns compare check cell?

I have a two file. a.txt has the below data.Zone,Aliase1,Aliase2 VNX7600SPB3_8B3_H1,VNX7600SPB3,8B3_H1 VNX7600SPBA_8B4_H1,VNX7600SPA3,8B4_H1 CX480SPA1_11B3_H1,CX480SPA1,11B3_H1 CX480SPB1_11B4_H1,CX480S…

Flask argument of type _RequestGlobals is not iterable

When I tried to use Flask-WTForms, I followed these steps:from flask_wtf import Form from wtforms import StringField, PasswordField from wtforms.validators import DataRequired, Emailclass EmailPassword…