Automatically simplifying/refactoring Python code (e.g. for loops -> list comprehensions)?

In Python, I really enjoy how concise an implementation can be when using list comprehensions. I love to write concise list comprehensions like this:

myList = [1, 5, 11, 20, 30, 35] #input data
bigNumbers = [x for x in myList if x > 10]

However, I often encounter more verbose implementations like this:

myList = [1, 5, 11, 20, 30, 35] #input data
bigNumbers = []
for i in xrange(0, len(myList)):
    if myList[i] > 10:
        bigNumbers.append(myList[i])

When a for loop only iterates over a single data structure (e.g. myList), there is usually a straightforward list comprehension that is equivalent to the loop.
With this in mind, is there a refactoring tool that converts verbose Python loops into concise list comprehensions?

Previous StackOverflow questions have asked for advice on manually transforming loops into list comprehensions. However, I have yet to find a question about automatically converting loops into list comprehension expressions.

Motivation: There are numerous ways to answer the question "what does it mean for code to be clean?" Personally, I find that making code concise and getting rid of some of the fluff tends to make code cleaner and more readable. Naturally there's a line in the sand between "concise code" and "incomprehensible one-liners." Still, I often find it satisfying to write and work with concise code.


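2to3 is a refactoring tool that can perform arbitrary refactorings, as long as you can specify them with a syntactical pattern. The pattern you might want to look for is an assignment of an empty list to a variable, immediately followed by a for loop whose body is a single if statement that appends one expression to that variable.

This can be refactored safely to a single list comprehension that keeps the loop's iteration clause and the if condition unchanged.

In your specific example, this would give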

bigNumbers = [myList[i] for i in xrange(0, len(myList)) if myList[i] > 10]
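
As a rough illustration of this first refactoring, here is a minimal sketch that uses the standard ast module instead of a 2to3 fixer (lib2to3 has been removed from recent Python releases). It assumes Python 3.9+ for ast.unparse, only looks at top-level statements, and discards comments and formatting; the function names loops_to_comprehensions and _match are made up for this example.

```python
import ast


def _match(init, loop):
    """Return `init` rewritten as a comprehension if (init, loop) fit the pattern."""
    # init must be `NAME = []`
    if not (isinstance(init, ast.Assign) and len(init.targets) == 1
            and isinstance(init.targets[0], ast.Name)
            and isinstance(init.value, ast.List) and not init.value.elts):
        return None
    name = init.targets[0].id
    # loop must be `for V in E:` containing exactly one `if`, with no else branches
    if not (isinstance(loop, ast.For) and not loop.orelse and len(loop.body) == 1):
        return None
    test = loop.body[0]
    if not (isinstance(test, ast.If) and not test.orelse and len(test.body) == 1):
        return None
    # the body of the `if` must be exactly `NAME.append(Y)`
    stmt = test.body[0]
    if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
        return None
    call = stmt.value
    if not (isinstance(call.func, ast.Attribute) and call.func.attr == "append"
            and isinstance(call.func.value, ast.Name)
            and call.func.value.id == name
            and len(call.args) == 1 and not call.keywords):
        return None
    # reuse the existing assignment node and swap `[]` for the comprehension
    init.value = ast.ListComp(
        elt=call.args[0],
        generators=[ast.comprehension(target=loop.target, iter=loop.iter,
                                      ifs=[test.test], is_async=0)],
    )
    return init


def loops_to_comprehensions(source):
    """Rewrite matching top-level loops in `source`; other code is kept."""
    tree = ast.parse(source)
    old_body, new_body = tree.body, []
    i = 0
    while i < len(old_body):
        following = old_body[i + 1] if i + 1 < len(old_body) else None
        rewritten = _match(old_body[i], following)
        if rewritten is not None:
            new_body.append(rewritten)
            i += 2  # the for loop has been absorbed into the assignment
        else:
            new_body.append(old_body[i])
            i += 1
    tree.body = new_body
    return ast.unparse(ast.fix_missing_locations(tree))
```

The source is only parsed, never executed, so Python 2 style calls such as xrange(...) are accepted as long as the code is syntactically valid Python 3. The sketch does not check whether the loop variable is used after the loop or whether the appended expression refers to the list being built, both of which a real tool would need to rule out.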

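Then, you can have another refactoring that replaces xrange(0, N) with xrange(N), and another one that replaces a comprehension yielding VARIABLE1[VARIABLE2] for each index VARIABLE2 in xrange(len(VARIABLE1)), filtered by EXPRESSION1, with a comprehension that iterates over VARIABLE1 directly, binding each element to a new name VARIABLE3 and filtering by EXPRESSION1PRIME.
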
There are several problems with this refactoring:

  • EXPRESSION1PRIME must be EXPRESSION1 with all occurrences of VARIABLE1[VARIABLE2] replaced by VARIABLE3. This is possible with 2to3, but requires explicit code to do the traversal and replacement.
  • EXPRESSION1PRIME then must not contain any further occurrences of the index VARIABLE2, since that name no longer exists after the rewrite. This can also be checked with explicit code.
  • One needs to come up with a name for VARIABLE3. You have chosen x; there is no reasonable way to have this done automatically. You could choose to recycle VARIABLE2 (i.e. i) for that, but that may be confusing as it suggests that i is still an index. It might work to pick a synthetic name, such as VARIABLE1_VARIABLE2 (i.e. myList_i), and check that it is not already used elsewhere.
  • One needs to be sure that VARIABLE1[VARIABLE2] yields the same elements, in the same order, as iterating over iter(VARIABLE1). It's not possible to verify this automatically.
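
The first three points can be sketched in code; the fourth cannot, so the sketch below simply assumes that indexing and iteration agree. It again uses the ast module (Python 3.9+) rather than 2to3, works on a single comprehension expression given as a string, and returns the input unchanged whenever a check fails. The names iterate_directly, _indexed_sequence and _ReplaceIndexing are hypothetical.

```python
import ast


class _ReplaceIndexing(ast.NodeTransformer):
    """Replace every `seq[idx]` expression with the plain name `new`."""

    def __init__(self, seq, idx, new):
        self.seq, self.idx, self.new = seq, idx, new

    def visit_Subscript(self, node):
        self.generic_visit(node)
        if (isinstance(node.value, ast.Name) and node.value.id == self.seq
                and isinstance(node.slice, ast.Name)
                and node.slice.id == self.idx):
            return ast.copy_location(ast.Name(id=self.new, ctx=ast.Load()), node)
        return node


def _indexed_sequence(call):
    """Return NAME if `call` is range/xrange([0,] len(NAME)), else None."""
    if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            and call.func.id in ("range", "xrange") and not call.keywords):
        return None
    args = call.args
    # treat xrange(0, N) like xrange(N)
    if (len(args) == 2 and isinstance(args[0], ast.Constant)
            and type(args[0].value) is int and args[0].value == 0):
        args = args[1:]
    if len(args) != 1:
        return None
    bound = args[0]
    if (isinstance(bound, ast.Call) and isinstance(bound.func, ast.Name)
            and bound.func.id == "len" and len(bound.args) == 1
            and not bound.keywords and isinstance(bound.args[0], ast.Name)):
        return bound.args[0].id
    return None


def iterate_directly(expr_source):
    """Turn `[s[i] ... for i in xrange(len(s)) ...]` into direct iteration.

    ASSUMPTION (not checked): s[i] over the index range yields the same
    elements, in the same order, as iterating over s.
    """
    comp = ast.parse(expr_source, mode="eval").body
    if not (isinstance(comp, ast.ListComp) and len(comp.generators) == 1):
        return expr_source
    gen = comp.generators[0]
    seq = _indexed_sequence(gen.iter)
    if seq is None or not isinstance(gen.target, ast.Name):
        return expr_source
    idx = gen.target.id
    new = f"{seq}_{idx}"  # synthetic name, e.g. myList_i
    used = {n.id for n in ast.walk(comp) if isinstance(n, ast.Name)}
    if new in used:
        return expr_source  # the synthetic name would clash
    rewriter = _ReplaceIndexing(seq, idx, new)
    comp.elt = rewriter.visit(comp.elt)
    gen.ifs = [rewriter.visit(cond) for cond in gen.ifs]
    # the index must not survive anywhere except inside seq[idx]
    for part in [comp.elt, *gen.ifs]:
        if any(isinstance(n, ast.Name) and n.id == idx for n in ast.walk(part)):
            return expr_source
    gen.target = ast.Name(id=new, ctx=ast.Store())
    gen.iter = ast.Name(id=seq, ctx=ast.Load())
    return ast.unparse(ast.fix_missing_locations(comp))
```

Returning the original string on any failed check keeps the sketch conservative: an expression such as myList[i] + i is left alone because the index is still needed after the replacement.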

If you want to learn how to write 2to3 fixers, take a look at Lennart Regebro's book.

