Python: the mechanism behind list comprehension

2024/9/20 8:16:07

When using a list comprehension or the in keyword in a for loop context, i.e.:

for o in X:
    do_something_with(o)

l = [o for o in X]

  • How does the mechanism behind in work?
  • Which functions/methods within X does it call?
  • If X can comply with more than one method, what's the precedence?
  • How to write an efficient X, so that list comprehension will be quick?

The, afaik, complete and correct answer.

for, both in for loops and list comprehensions, calls iter() on X. iter() will return an iterator if X has either an __iter__ method or a __getitem__ method. If it implements both, __iter__ is used. If it has neither, you get TypeError: 'Nothing' object is not iterable (with the name of your class in place of Nothing).
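A minimal sketch of that precedence rule, written for Python 3. The class names Both and Nothing are made up for illustration and are not part of any library:

```python
class Both:
    """Hypothetical class with both protocols; iter() prefers __iter__."""

    def __iter__(self):
        return iter(["from __iter__"])

    def __getitem__(self, index):
        # Never reached during iteration, because __iter__ takes precedence.
        if index > 0:
            raise IndexError(index)
        return "from __getitem__"


class Nothing:
    """Hypothetical class with neither protocol."""


print(list(Both()))  # ['from __iter__']

try:
    iter(Nothing())
except TypeError as exc:
    print(exc)  # 'Nothing' object is not iterable
```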

This implements a __getitem__:

class GetItem(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, x):
        # Called with 0, 1, 2, ... until IndexError is raised
        return self.data[x]


>>> data = range(10)
>>> print [x*x for x in GetItem(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
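To make the __getitem__ fallback more visible, here is a small Python 3 sketch. It assumes a hypothetical class Squares that records every index it is asked for; iteration requests 0, 1, 2, ... and stops at the first IndexError:

```python
class Squares:
    """Hypothetical sequence-like class that only defines __getitem__."""

    def __init__(self, n):
        self.n = n
        self.calls = []  # every index requested during iteration

    def __getitem__(self, index):
        self.calls.append(index)
        if index >= self.n:
            raise IndexError(index)  # signals the end of the sequence
        return index * index


s = Squares(3)
print([x for x in s])  # [0, 1, 4]
print(s.calls)         # [0, 1, 2, 3]
```

The last recorded index (3) is the call that raised IndexError and ended the loop.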

This is an example of implementing __iter__:

class TheIterator(object):
    def __init__(self, data):
        self.data = data
        self.index = -1

    # Note: In Python 3 this is called __next__
    def next(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration

    def __iter__(self):
        return self

class Iter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Pass the instance's own data, not a global variable
        return TheIterator(self.data)


>>> data = range(10)
>>> print [x*x for x in Iter(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

As you can see, you need to implement both an iterator and an __iter__ method that returns that iterator.
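Roughly speaking, a for loop does the following with those two pieces. This is a simplified Python 3 sketch, not the interpreter's actual implementation, and manual_for is a made-up helper name:

```python
def manual_for(iterable, body):
    """Approximate what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)      # uses __iter__, or falls back to __getitem__
    while True:
        try:
            item = next(iterator)  # calls __next__ (next in Python 2)
        except StopIteration:
            break                  # the iterator is exhausted
        body(item)


collected = []
manual_for(range(3), collected.append)
print(collected)  # [0, 1, 2]
```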

You can combine them:

class CombinedIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1
        return self

    def next(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration


>>> well, you get it, it's all the same...

But then you can only have one iterator going at once, because the object is its own iterator and every call to __iter__ resets the shared index. OK, in this case you could just do this:

class CheatIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return iter(self.data)

But that's cheating because you are just reusing the __iter__ method of list. An easier way is to use yield, and make __iter__ into a generator:

class Generator(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x

This last way is the one I would recommend. Easy and efficient.
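A sketch of why the generator form is also safer than the combined form, ported to Python 3 (so next becomes __next__). The helper pairs is made up for illustration; it iterates over the same object in two nested loops:

```python
class CombinedIter:
    """Python 3 port of the combined form: the object is its own iterator."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1  # shared state, reset by every new loop
        return self

    def __next__(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration


class Generator:
    """Generator form: each call to __iter__ creates an independent iterator."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x


def pairs(container):
    # Two nested loops over the same object.
    return [(a, b) for a in container for b in container]


print(pairs(Generator([1, 2])))     # [(1, 1), (1, 2), (2, 1), (2, 2)]
print(pairs(CombinedIter([1, 2])))  # [(1, 1), (1, 2)]
```

With CombinedIter the inner loop resets and then exhausts the shared index, so the outer loop ends after its first item.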

