I have some JSON data I read from a file using json.load(data_file)
{
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300}
}
I want to sort the entries in a specific order. I have found lots of examples that sort by key or by one single value, but I would like to sort the values by these rules:
- group by logins descending, but users with 0 logins first
- sort users with 0 logins by date_added
- sort users with at least 1 login by date_used
Ideally I would write my own compare function like this:
def compare(elem1, elem2):
    """Return >0 if elem2 is greater than elem1
    <0 if elem2 is lesser than elem1
    0 if they are equal"""
    # rule 1 group by logins
    if elem1['logins'] != elem2['logins']:
        if elem1['logins'] == 0:
            return -1
        if elem2['logins'] == 0:
            return 1
        return elem2['logins'] - elem1['logins']
    # rule 2 sort on date_added
    if elem1['logins'] == 0 and elem2['logins'] == 0:
        return elem2['date_added'] - elem1['date_added']
    # rule 3 sort on date_used
    if elem1['logins'] == elem2['logins'] and elem1['loigns'] > 0:
        return elem2['date_used'] - elem1['date_used']
    return 0  # default
I don't know where and how to plug in my sorting function.
I'm going to assume you know that dictionaries are unordered and that you want to sort either the values, or the key-value pairs. The following examples sort the values.
Your comparison function already works, provided you fix the loigns typo in the last if:
>>> sorted(sample.itervalues(), cmp=compare)
[{'logins': 0, 'date_added': 150}, {'logins': 0, 'date_added': 100}, {'logins': 500, 'date_added': 400, 'date_used': 500}, {'logins': 500, 'date_added': 300, 'date_used': 400}, {'logins': 20, 'date_added': 200, 'date_used': 300}]
>>> pprint(_)
[{'date_added': 150, 'logins': 0},
 {'date_added': 100, 'logins': 0},
 {'date_added': 400, 'date_used': 500, 'logins': 500},
 {'date_added': 300, 'date_used': 400, 'logins': 500},
 {'date_added': 200, 'date_used': 300, 'logins': 20}]
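The session above is Python 2; Python 3 removed the cmp argument from sorted(). As a sketch, assuming Python 3, the same comparison function (typo fixed, branches slightly condensed) can be wrapped with functools.cmp_to_key and passed as the key instead. The by_cmp name is only for illustration:

```python
from functools import cmp_to_key

sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}


def compare(elem1, elem2):
    """Negative if elem1 sorts first, positive if elem2 sorts first, 0 if tied."""
    # Rule 1: group by logins, zero-login users first, then logins descending.
    if elem1["logins"] != elem2["logins"]:
        if elem1["logins"] == 0:
            return -1
        if elem2["logins"] == 0:
            return 1
        return elem2["logins"] - elem1["logins"]
    # Rule 2: both have 0 logins, newest date_added first.
    if elem1["logins"] == 0:
        return elem2["date_added"] - elem1["date_added"]
    # Rule 3: same non-zero login count, newest date_used first.
    return elem2["date_used"] - elem1["date_used"]


# cmp_to_key adapts an old-style comparison function to the key= protocol.
by_cmp = sorted(sample.values(), key=cmp_to_key(compare))
for entry in by_cmp:
    print(entry)
```

This should give the same ordering as the Python 2 output above.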
However, you can use the following sort key too:
(not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])
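This creates a tuple of (has_no_logins, num_logins, date), where the date picked is based on whether or not the user has logged in. Because the first element is True only for users with 0 logins, a reversed sort puts those users first.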
Use it as the key argument to the sorted() function, and reverse the sort, like this:
>>> key = lambda d: (not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])
>>> pprint(sorted(sample.itervalues(), key=key, reverse=True))
[{'date_added': 150, 'logins': 0},
 {'date_added': 100, 'logins': 0},
 {'date_added': 400, 'date_used': 500, 'logins': 500},
 {'date_added': 300, 'date_used': 400, 'logins': 500},
 {'date_added': 200, 'date_used': 300, 'logins': 20}]
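To see why the reversed sort satisfies all three rules, it can help to look at the key tuples themselves. The following is a rough sketch assuming Python 3 (so values() instead of itervalues()); sort_key and by_key are names made up for the example:

```python
sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}


def sort_key(d):
    # Pick the date that applies: date_used for active users,
    # date_added for users who never logged in.
    date = d["date_used"] if d["logins"] else d["date_added"]
    # True sorts after False, so reverse=True puts zero-login users first.
    return (not d["logins"], d["logins"], date)


# Show each key tuple so the ordering is easy to follow.
for name, d in sample.items():
    print(name, sort_key(d))

by_key = sorted(sample.values(), key=sort_key, reverse=True)
```

Tuples compare element by element, so the reversed sort orders by the zero-login flag first, then by login count descending, then by the chosen date descending.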
If you needed the keys as well, use dict.iteritems() and update the key function to accept a (k, d) tuple:
>>> key = lambda (k, d): (not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])
>>> pprint(sorted(sample.iteritems(), key=key, reverse=True))
[('unused_account', {'date_added': 150, 'logins': 0}),
 ('unused_account2', {'date_added': 100, 'logins': 0}),
 ('power_user_2', {'date_added': 400, 'date_used': 500, 'logins': 500}),
 ('power_user', {'date_added': 300, 'date_used': 400, 'logins': 500}),
 ('regular_user', {'date_added': 200, 'date_used': 300, 'logins': 20})]
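Note that the lambda (k, d): form only works on Python 2; Python 3 dropped tuple parameters, and iteritems() became items(). A possible Python 3 version, as a sketch, indexes into the pair instead. The names pair_key, ordered_pairs and ordered are illustrative, and rebuilding a dict from the sorted pairs relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on:

```python
sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}


def pair_key(pair):
    # pair is a (name, record) tuple from sample.items(); sort on the record only.
    d = pair[1]
    return (not d["logins"], d["logins"],
            d["date_used"] if d["logins"] else d["date_added"])


ordered_pairs = sorted(sample.items(), key=pair_key, reverse=True)

# Assumes Python 3.7+: a plain dict remembers the order the pairs were inserted in.
ordered = dict(ordered_pairs)
for name, record in ordered.items():
    print(name, record)
```

On older Python 3 versions, collections.OrderedDict(ordered_pairs) would serve the same purpose.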