When using a list comprehension or the in keyword in a for loop context, e.g.:
for o in X:
    do_something_with(o)
or
l = [o for o in X]
- How does the mechanism behind the in keyword work?
- Which functions or methods of X does it call?
- If X supports more than one of those methods, which one takes precedence?
- How do I write an efficient X, so that list comprehensions will be quick?
The complete and correct answer, afaik:
for, both in for loops and list comprehensions, calls iter() on X. iter() will return an iterator if X has either an __iter__ method or a __getitem__ method. If it implements both, __iter__ is used. With only __getitem__, the iterator calls it with 0, 1, 2, ... until it raises IndexError. If X has neither, you get TypeError: 'Nothing' object is not iterable.
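To make the precedence concrete, here is a minimal sketch written for Python 3. The class names Both, OnlyGetItem and Neither are made up for illustration; only iter() and list() are real built-ins.

```python
class Both:
    """Has both methods; iter() should prefer __iter__."""

    def __iter__(self):
        return iter(["from __iter__"])

    def __getitem__(self, index):
        if index == 0:
            return "from __getitem__"
        raise IndexError(index)


class OnlyGetItem:
    """Has only __getitem__; iter() falls back to integer indexing."""

    def __getitem__(self, index):
        if index == 0:
            return "from __getitem__"
        raise IndexError(index)


class Neither:
    """Has neither method, so it is not iterable."""


both_result = list(Both())             # goes through __iter__
fallback_result = list(OnlyGetItem())  # goes through __getitem__

try:
    iter(Neither())
except TypeError as exc:
    error_message = str(exc)           # "... object is not iterable"
```

Both yields the value coming from __iter__, OnlyGetItem works through the indexing fallback, and Neither raises TypeError as soon as iter() is called on it.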
This implements __getitem__:
class GetItem(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, x):
        return self.data[x]
Usage:
>>> data = range(10)
>>> print [x*x for x in GetItem(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
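To see what the loop actually asks of a __getitem__-only object, the following Python 3 sketch records every index that gets requested. TracingGetItem and its requested attribute are hypothetical names used only for this illustration.

```python
class TracingGetItem:
    def __init__(self, data):
        self.data = data
        self.requested = []  # every index the loop asked for

    def __getitem__(self, index):
        self.requested.append(index)
        return self.data[index]  # raises IndexError past the end


tracer = TracingGetItem(["a", "b", "c"])
items = [x for x in tracer]
# tracer.requested now holds 0, 1, 2 and the final 3 that raised IndexError
```

The extra request for index 3 is how the loop learns the sequence is over: the IndexError is swallowed and turned into the end of iteration.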
This is an example of implementing __iter__:
class TheIterator(object):
    def __init__(self, data):
        self.data = data
        self.index = -1

    # Note: In Python 3 this is called __next__
    def next(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration

    def __iter__(self):
        return self


class Iter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Pass the instance's own data, not a global that happens to be named data
        return TheIterator(self.data)
Usage:
>>> data = range(10)
>>> print [x*x for x in Iter(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
As you can see, you need to implement both an iterator and an __iter__ method that returns that iterator.
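Why both pieces are needed becomes clearer if the for loop is written out by hand. This is only a rough Python 3 equivalent, not what the interpreter literally executes; manual_for and action are names invented for the sketch.

```python
def manual_for(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    iterator = iter(iterable)      # uses __iter__ (or the __getitem__ fallback)
    while True:
        try:
            item = next(iterator)  # calls the iterator's __next__ method
        except StopIteration:
            break                  # the iterator is exhausted
        action(item)


squares = []
manual_for(range(5), lambda x: squares.append(x * x))
```

The container only has to answer iter(); the object it returns is the one that has to answer next() and eventually raise StopIteration.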
You can combine them:
class CombinedIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1
        return self

    def next(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration
Usage:
>>> well, you get it, it's all the same...
But then you can only have one iterator going at once.
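The limitation shows up as soon as the same object is iterated in a nested loop. The sketch below is a Python 3 port of the combined class (so next becomes __next__); the variable names shared and pairs are just for illustration.

```python
class CombinedIter:
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1  # every call resets the one shared position
        return self

    def __next__(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration


shared = CombinedIter([1, 2, 3])
# The inner loop resets and then exhausts the same index the outer loop uses.
pairs = [(a, b) for a in shared for b in shared]
```

A plain list would give nine pairs here. With the shared index only the three pairs for a == 1 are produced, because the inner loop leaves the outer loop's position past the end.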
OK, in this case you could just do this:
class CheatIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return iter(self.data)
But that's cheating because you are just reusing the __iter__ method of list.
An easier way is to use yield, and make __iter__ into a generator:
class Generator(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x
This last approach is the one I would recommend. Easy and efficient.
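One more point in favour of the generator version: every call to __iter__ creates a fresh generator, so several iterations can run at once. A small Python 3 check, with container, first and second as illustrative names:

```python
class Generator:
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Calling this method creates a new generator object each time.
        for x in self.data:
            yield x


container = Generator([1, 2])
pairs = [(a, b) for a in container for b in container]

first = iter(container)
second = iter(container)
next(first)                  # advance only the first iterator
second_value = next(second)  # the second one still starts at the beginning
```

The nested comprehension produces all four pairs, and the two iterators keep separate positions.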