Sort a dictionary with a custom sorting function

2024/10/9 12:56:27

I have some JSON data I read from a file using json.load(data_file):

{
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300}
}

I want to sort the entries in a specific order. I have found lots of examples to sort by key or one single value. But I would like to sort the values by these rules:

  1. group by logins in descending order, but put users with 0 logins first
  2. sort users with 0 logins by date_added
  3. sort users with at least 1 login by date_used

Ideally I would write my own compare function like this:

def compare(elem1, elem2):
    """Return >0 if elem2 is greater than elem1
    <0 if elem2 is lesser than elem1
    0 if they are equal"""
    # rule 1 group by logins
    if elem1['logins'] != elem2['logins']:
        if elem1['logins'] == 0:
            return -1
        if elem2['logins'] == 0:
            return 1
        return elem2['logins'] - elem1['logins']
    # rule 2 sort on date_added
    if elem1['logins'] == 0 and elem2['logins'] == 0:
        return elem2['date_added'] - elem1['date_added']
    # rule 3 sort on date_used
    if elem1['logins'] == elem2['logins'] and elem1['loigns'] > 0:
        return elem2['date_used'] - elem1['date_used']
    return 0  # default

I don't know where or how to plug in my sorting function.

Answer

I'm going to assume you know that dictionaries are unordered and that you want to sort either the values, or the key-value pairs. The following examples sort the values.

Your comparison function already works, provided you fix the loigns typo in the last if:

>>> sorted(sample.itervalues(), cmp=compare)
[{'logins': 0, 'date_added': 150}, {'logins': 0, 'date_added': 100}, {'logins': 500, 'date_added': 400, 'date_used': 500}, {'logins': 500, 'date_added': 300, 'date_used': 400}, {'logins': 20, 'date_added': 200, 'date_used': 300}]
>>> pprint(_)
[{'date_added': 150, 'logins': 0},
 {'date_added': 100, 'logins': 0},
 {'date_added': 400, 'date_used': 500, 'logins': 500},
 {'date_added': 300, 'date_used': 400, 'logins': 500},
 {'date_added': 200, 'date_used': 300, 'logins': 20}]
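
Note that cmp= and itervalues() are Python 2 only. On Python 3, functools.cmp_to_key can wrap the same kind of comparison function. The following is a minimal sketch, assuming Python 3; compare here is a condensed version of the question's function with the typo fixed, and sample is the question's data:

```python
from functools import cmp_to_key

sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}


def compare(elem1, elem2):
    """Negative if elem1 sorts first, positive if elem2 sorts first, 0 if tied."""
    # Rule 1: group by logins; zero-login users first, then descending logins.
    if elem1["logins"] != elem2["logins"]:
        if elem1["logins"] == 0:
            return -1
        if elem2["logins"] == 0:
            return 1
        return elem2["logins"] - elem1["logins"]
    # Rule 2: both have zero logins, so the larger date_added sorts first.
    if elem1["logins"] == 0:
        return elem2["date_added"] - elem1["date_added"]
    # Rule 3: same non-zero login count, so the larger date_used sorts first.
    return elem2["date_used"] - elem1["date_used"]


# cmp_to_key turns the old-style comparison function into a key function.
by_cmp = sorted(sample.values(), key=cmp_to_key(compare))
for entry in by_cmp:
    print(entry)
```

The resulting order is the same as in the Python 2 session above.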

However, you can use the following sort key too:

(not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])

This creates a tuple of (no_logins, num_logins, date), where no_logins is True only for users who have never logged in, and the date picked is date_used for users who have logged in and date_added otherwise.
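
To see why this works, it can help to look at the tuples the key produces for each entry. A small sketch, assuming Python 3 (3.7+ for insertion-ordered dicts); sort_key is just a named version of the expression above:

```python
sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}


def sort_key(d):
    # First field is True only when the user has zero logins.
    return (
        not d["logins"],
        d["logins"],
        d["date_used"] if d["logins"] else d["date_added"],
    )


keys = {name: sort_key(d) for name, d in sample.items()}
for name, key in keys.items():
    print(name, key)
# unused_account (True, 0, 150)
# unused_account2 (True, 0, 100)
# power_user_2 (False, 500, 500)
# power_user (False, 500, 400)
# regular_user (False, 20, 300)
```

Tuples compare field by field and True sorts after False, so in a descending sort the zero-login users come first, followed by the rest ordered by login count and then by date.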

Use it as the key argument to the sorted() function, and reverse the sort, like this:

>>> key = lambda d: (not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])
>>> pprint(sorted(sample.itervalues(), key=key, reverse=True))
[{'date_added': 150, 'logins': 0},
 {'date_added': 100, 'logins': 0},
 {'date_added': 400, 'date_used': 500, 'logins': 500},
 {'date_added': 300, 'date_used': 400, 'logins': 500},
 {'date_added': 200, 'date_used': 300, 'logins': 20}]
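
On Python 3 the same approach should work with sample.values() in place of itervalues(). The sketch below assumes Python 3 and also shows a hypothetical alternative, ascending_key, that negates the numeric fields so no reverse=True is needed; that variant is not part of the original answer and only works because all the values are numbers:

```python
sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}


def sort_key(d):
    # Same tuple as the lambda above: (no_logins, num_logins, date).
    return (
        not d["logins"],
        d["logins"],
        d["date_used"] if d["logins"] else d["date_added"],
    )


# reverse=True flips every field at once: zero-login users first,
# then descending logins, then descending date.
by_key = sorted(sample.values(), key=sort_key, reverse=True)


def ascending_key(d):
    # Hypothetical variant: encode the direction in the key itself.
    date = d["date_used"] if d["logins"] else d["date_added"]
    return (d["logins"] != 0, -d["logins"], -date)


by_ascending_key = sorted(sample.values(), key=ascending_key)
print(by_key == by_ascending_key)
```

Encoding the direction per field is mainly useful when the fields should not all sort in the same direction.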

If you needed the keys as well, use dict.iteritems() and update the key function to accept a (k, d) tuple:

>>> key = lambda (k, d): (not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])
>>> pprint(sorted(sample.iteritems(), key=key, reverse=True))
[('unused_account', {'date_added': 150, 'logins': 0}),
 ('unused_account2', {'date_added': 100, 'logins': 0}),
 ('power_user_2', {'date_added': 400, 'date_used': 500, 'logins': 500}),
 ('power_user', {'date_added': 300, 'date_used': 400, 'logins': 500}),
 ('regular_user', {'date_added': 200, 'date_used': 300, 'logins': 20})]
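
The lambda (k, d): form relies on tuple parameter unpacking, which was removed in Python 3, and iteritems() is gone there as well. A minimal Python 3 sketch of the same idea, using items() and unpacking the pair inside the function (the names sort_key, ordered and names are only illustrative):

```python
sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}


def sort_key(item):
    # Each item is a (name, record) pair; only the record drives the order.
    _name, d = item
    return (
        not d["logins"],
        d["logins"],
        d["date_used"] if d["logins"] else d["date_added"],
    )


ordered = sorted(sample.items(), key=sort_key, reverse=True)
names = [name for name, _record in ordered]
print(names)
# ['unused_account', 'unused_account2', 'power_user_2', 'power_user', 'regular_user']
```

If a mapping in that order is needed afterwards, dict(ordered) keeps the sorted order on Python 3.7 and later.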