Sort a list with longest items first

2024/10/18 12:59:34

I am using a lambda to modify the behaviour of sort.

sorted(list, key=lambda item:(item.lower(),len(item)))

Sorting a list containing the elements A1,A2,A3,A,B1,B2,B3,B, the result is A,A1,A2,A3,B,B1,B2,B3.

My expected sorted list would be A1,A2,A3,A,B1,B2,B3,B.

I've already tried including len(item) in the sort key, which didn't work. How can I modify the lambda so that the sort result is A1,A2,A3,A,B1,B2,B3,B instead?

Answer

Here is one way to do it:

>>> import functools
>>> def cmp(s, t):
    'Alter lexicographic sort order to make longer keys go *before* any of their prefixes'
    ls, lt = len(s), len(t)
    if ls < lt:   s += t[ls:] + 'x'
    elif lt < ls: t += s[lt:] + 'x'
    if s < t: return -1
    if s > t: return 1
    return 0

>>> sorted(l, key=functools.cmp_to_key(cmp))
['A1', 'A2', 'A3', 'A', 'B1', 'B2', 'B3', 'B']

Traditionally, lexicographic sort order places longer strings after their otherwise identical prefixes (e.g. 'abc' goes before 'abcd').
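As a small illustration of that rule, the sketch below shows why adding len(item) to the tuple key from the question changes nothing: the lowercased strings are compared first, and the length is only consulted when those strings are equal. The variable names are made up for this example.

```python
items = ['A1', 'A2', 'A3', 'A', 'B1', 'B2', 'B3', 'B']

# A string sorts before any longer string that it is a prefix of.
prefix_first = 'abc' < 'abcd'

# Tuple keys compare element by element, so len(item) is only a tie-breaker
# for items whose lowercased strings are identical.
by_tuple_key = sorted(items, key=lambda item: (item.lower(), len(item)))
by_lower_only = sorted(items, key=str.lower)

print(prefix_first)                   # True
print(by_tuple_key)                   # ['A', 'A1', 'A2', 'A3', 'B', 'B1', 'B2', 'B3']
print(by_tuple_key == by_lower_only)  # True
```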

To meet your sort expectation, we first "fix-up" the shorter string by adding the remaining part of the longer string plus another character to make it the longer of the two:

compare abc to defg     -->  compare abcgx to defg
compare a   to a2       -->  compare a2x to a2
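To see the fix-up step in isolation, here is a minimal sketch that performs only the padding and returns the two strings that would then be compared. The helper name fix_up is hypothetical and is not part of the answer's code.

```python
def fix_up(s, t):
    """Pad the shorter string with the tail of the longer one plus 'x'.

    fix_up is a hypothetical helper used only for this illustration.
    """
    ls, lt = len(s), len(t)
    if ls < lt:
        s += t[ls:] + 'x'
    elif lt < ls:
        t += s[lt:] + 'x'
    return s, t

print(fix_up('abc', 'defg'))  # ('abcgx', 'defg')
print(fix_up('a', 'a2'))      # ('a2x', 'a2')
print(fix_up('a2', 'a'))      # ('a2', 'a2x')
```

When the shorter string is a prefix of the longer one, the padded version becomes the longer string plus one extra character, so the ordinary comparison now places it second.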

The functools.cmp_to_key() tool then converts the comparison function to a key function.
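The interactive session above assumes a list named l already exists. A self-contained sketch, with an input order chosen arbitrarily for this example, might look like this:

```python
import functools

def cmp(s, t):
    'Alter lexicographic sort order to make longer keys go *before* any of their prefixes'
    ls, lt = len(s), len(t)
    if ls < lt:
        s += t[ls:] + 'x'
    elif lt < ls:
        t += s[lt:] + 'x'
    if s < t:
        return -1
    if s > t:
        return 1
    return 0

# The input order is an assumption made for this example.
items = ['B', 'A', 'B2', 'A3', 'B1', 'A1', 'B3', 'A2']

# cmp_to_key wraps each item in an object whose comparisons call cmp().
key_func = functools.cmp_to_key(cmp)
result = sorted(items, key=key_func)
print(result)  # ['A1', 'A2', 'A3', 'A', 'B1', 'B2', 'B3', 'B']
```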

This may seem like a lot of work, but the sort expectations are very much at odds with the built-in lexicographic sorting rules.

FWIW, here's another way of writing it, that might or might not be considered clearer:

def cmp(s, t):
    'Alter lexicographic sort order to make longer keys go *before* any of their prefixes'
    for p, q in zip(s, t):
        if p < q: return -1
        if q < p: return 1
    if len(s) > len(t): return -1
    elif len(t) > len(s): return 1
    return 0

The logic is:

  • Compare character by character until a differing pair is found.
  • That differing pair determines the sort order in the traditional way.
  • If there is no differing pair, the longer input goes first.
  • If there is no differing pair and the lengths are equal, the strings are equal.
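The same four rules can also be expressed as a plain key function, which is closer to what the question asked for. This is an alternative sketch, not part of the original answer: it assumes that appending a sentinel larger than any character code (float('inf')) to each key is acceptable, and the name longest_first_key is made up here. The check at the end compares it against the comparison function on a small set of generated strings.

```python
import functools
import itertools

def cmp(s, t):
    'Alter lexicographic sort order to make longer keys go *before* any of their prefixes'
    for p, q in zip(s, t):
        if p < q:
            return -1
        if q < p:
            return 1
    if len(s) > len(t):
        return -1
    elif len(t) > len(s):
        return 1
    return 0

def longest_first_key(s):
    # Hypothetical key function: the trailing infinity is greater than any
    # character code, so a string sorts after every longer string it prefixes.
    return [ord(c) for c in s] + [float('inf')]

items = ['A', 'A1', 'A2', 'A3', 'B', 'B1', 'B2', 'B3']
print(sorted(items, key=longest_first_key))
# ['A1', 'A2', 'A3', 'A', 'B1', 'B2', 'B3', 'B']

# Cross-check both approaches on every string of length 0 to 3 over 'AB1'.
words = [''.join(p) for n in range(4) for p in itertools.product('AB1', repeat=n)]
by_cmp = sorted(words, key=functools.cmp_to_key(cmp))
by_key = sorted(words, key=longest_first_key)
print(by_cmp == by_key)  # True
```

A key function is computed once per item, whereas the comparison function runs for every pair the sort examines, so the key form may be preferable for long lists.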