I had a huge document that I parsed using regex to give a txt file (json.dump) similar to the following:
{"stuff": [{"name": ["frfer", "niddsi", ], "number": 11300, "identifier": "Tsdsad"}, {"name": ["Fast", "Guard", "Named", ], "number": 117900, "identifier": "Pdfms"}, {name: ["Fast", ], "number": 660, "identifier": "Unnamed"}, ]
}
Now I would like to sort this document in ascending order based on the number (i.e. "Pdfms" first, "Tsdsad" second, "Unnamed" third). I am unsure how to start this off in Python. Could anyone point me in the right direction? Thanks in advance.
First problem: That's not legitimate JSON. You have extra commas in the source (JSON doesn't like [a,b,c,]; it insists on [a,b,c]), and you have some identifiers (the third instance of name, e.g.) that are not quoted. Ideally, you will improve your initial text file parsing and JSONification to fix those issues. Or you can handle those fixups on the fly, like this:
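json_source = """... your text data from above ...
"""

import re

# Drop trailing commas before a closing bracket: [a, b, ] -> [a, b]
BADCOMMA = re.compile(r',\s*\]')
json_source = BADCOMMA.sub(']', json_source)

# Quote the bare name key, whether or not whitespace precedes it: {name: -> {"name":
BADIDENTIFIER = re.compile(r'([{,]\s*)name:\s*')
json_source = BADIDENTIFIER.sub(r'\1"name": ', json_source)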
Beware: assuming you can fix every possible problem on the fly is a fragile pattern, and so is editing structured data files via regular expressions. Better to generate good JSON from the get-go.
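As a rough sketch of what "good JSON from the get-go" could look like: collect the parsed values into ordinary Python lists and dicts and let the json module do the serializing, rather than assembling the text by hand. The records list here is a hypothetical stand-in for whatever your regex parsing actually extracts.

```python
import json

# Hypothetical records, standing in for the values your parsing step produces.
records = [
    {"name": ["frfer", "niddsi"], "number": 11300, "identifier": "Tsdsad"},
    {"name": ["Fast", "Guard", "Named"], "number": 117900, "identifier": "Pdfms"},
    {"name": ["Fast"], "number": 660, "identifier": "Unnamed"},
]

# json.dumps takes care of quoting keys and placing commas,
# so trailing commas and bare identifiers cannot creep in.
json_text = json.dumps({"stuff": records})

# Round-trip check: the generated text parses without any regex fixups.
round_tripped = json.loads(json_text)
```

To write straight to a file instead, json.dump(obj, file_handle) takes the same object plus an open file.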
Now, how to sort:
import json

data = json.loads(json_source)
data['stuff'].sort(key=lambda item: item['number'], reverse=True)
That does an in-place sort of the "stuff" array, by the "number" value, and reverses it (because your example of how you want the output suggests a descending rather than the typical ascending sort).
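If you would rather leave the original list untouched, or if it turns out you really do want ascending order, sorted() returns a new list and operator.itemgetter can stand in for the lambda. A minimal sketch, using a trimmed-down copy of the sample data (the name lists are omitted for brevity):

```python
from operator import itemgetter

# Trimmed-down copy of the sample data, for illustration only.
data = {
    "stuff": [
        {"number": 11300, "identifier": "Tsdsad"},
        {"number": 117900, "identifier": "Pdfms"},
        {"number": 660, "identifier": "Unnamed"},
    ]
}

# sorted() builds a new list; data["stuff"] keeps its original order.
descending = sorted(data["stuff"], key=itemgetter("number"), reverse=True)
ascending = sorted(data["stuff"], key=itemgetter("number"))

descending_ids = [item["identifier"] for item in descending]
ascending_ids = [item["identifier"] for item in ascending]
```

Here descending_ids matches the order given in the question, and ascending_ids is the same order reversed.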
To demonstrate that the sort has done what you want, the pprint module can be handy:
from pprint import pprint
pprint(data)
Yields:
{u'stuff': [{u'identifier': u'Pdfms',
             u'name': [u'Fast', u'Guard', u'Named'],
             u'number': 117900},
            {u'identifier': u'Tsdsad',
             u'name': [u'frfer', u'niddsi'],
             u'number': 11300},
            {u'identifier': u'Unnamed', u'name': [u'Fast'], u'number': 660}]}
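Since the goal was to sort the document itself, you will probably want to serialize the sorted structure back to text. One possible sketch, assuming the data has already been parsed and sorted (it is written out literally here so the snippet runs on its own):

```python
import json

# Assumed to be the already-sorted structure; written out literally for illustration.
data = {
    "stuff": [
        {"name": ["Fast", "Guard", "Named"], "number": 117900, "identifier": "Pdfms"},
        {"name": ["frfer", "niddsi"], "number": 11300, "identifier": "Tsdsad"},
        {"name": ["Fast"], "number": 660, "identifier": "Unnamed"},
    ]
}

# indent=2 gives human-readable output; the list order is preserved as-is.
sorted_text = json.dumps(data, indent=2)

# The result is valid JSON and parses back to the same structure.
reparsed = json.loads(sorted_text)
```

The sorted_text string can then be written to a file, or json.dump can be used with an open file handle to skip the intermediate string.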