Python Iterators
I was in college when I first met Python, and most of my code up to that point had been written in C. When making the transition, I was convinced that this:
for (int i = 0; i < 10; i++)
translated to Python would be this:
for i in range(0, 10):
While that is not wrong, because the final behavior of the code is the same, what is going on under the surface is very different. While C stores the i variable in memory, incrementing and testing its value at every iteration, Python instantiates a new Iterator and asks it for each value in turn.
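One visible consequence of this difference, shown here as a small sketch (the seen list is just something I added for illustration): reassigning i inside the loop body does not change what the loop produces next, because the next value comes from the Iterator and not from i. In the C version, the same change to i would end the loop early.

```python
seen = []

for i in range(0, 5):
    seen.append(i)
    # Rebinding i only affects this pass; the iterator decides the next value.
    i += 10

print(seen)  # [0, 1, 2, 3, 4]
```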
An Iterator in Python is an instance of a class that implements the methods __iter__ and __next__.
The __iter__ method is responsible for returning an Iterator object, which usually is the same instance that holds the method. This means that, generally speaking, most __iter__ implementations will just return self, but more complex implementations may have additional logic.
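As a sketch of what such additional logic can look like (Countdown and CountdownIterator are names I made up for illustration), a container class can return a brand new Iterator from __iter__ instead of self. The Iterator itself still returns self.

```python
class Countdown:
    """Iterable: hands out a fresh iterator on every __iter__ call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: keeps the position and returns itself from __iter__."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first_pass = list(countdown)
second_pass = list(countdown)  # works again: a new iterator was created
print(first_pass, second_pass)  # [3, 2, 1] [3, 2, 1]
```

Because every loop gets its own CountdownIterator, the Countdown object can be looped over as many times as needed.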
The __next__ method is where all the magic happens. When this method is called, it is expected to return the value that will be used in the iteration, or to raise a StopIteration exception if the values have all been used already. To do that, the __next__ method must contain the logic that goes in the C for.
Let's explain this better with an example, creating our own implementation of range(x, y).
class OurRange:
    def __init__(self, lower_boundary, upper_boundary):
        self.i = lower_boundary  # next value to hand out
        self.limit = upper_boundary  # exclusive upper bound, as in range

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.limit:  # >= also stops when lower > upper
            raise StopIteration
        value = self.i
        self.i += 1
        return value
In this implementation, we store the boundary values as attributes, so they persist between method calls. The __next__ method first checks whether the upper_boundary value has been reached, raising the StopIteration exception if it has. If not, the current value is saved in a local variable, the counter self.i is incremented by 1, and the saved value is returned. Let's see what happens when we use OurRange in a for loop.
>>> for i in OurRange(1, 10):
...     print(i, end=' ')
...
1 2 3 4 5 6 7 8 9
As we can see, it behaves exactly as range(1, 10).
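To check that claim a bit more thoroughly, here is a small sketch that restates OurRange (using a >= comparison, so an empty case like OurRange(5, 1) stops right away) and compares it with range for a few boundary pairs I picked arbitrarily. It also shows one difference: an OurRange instance is used up after a single pass.

```python
class OurRange:
    def __init__(self, lower_boundary, upper_boundary):
        self.i = lower_boundary
        self.limit = upper_boundary

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.limit:
            raise StopIteration
        value = self.i
        self.i += 1
        return value


# Same values as the built-in range for each pair of boundaries.
for lower, upper in [(1, 10), (0, 3), (5, 5), (5, 1)]:
    assert list(OurRange(lower, upper)) == list(range(lower, upper))

# The instance keeps its position, so a second pass yields nothing.
ours = OurRange(1, 4)
first_pass = list(ours)
second_pass = list(ours)
print(first_pass, second_pass)  # [1, 2, 3] []
```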
Since OurRange is an object, we can also assign it to a variable, and call __next__ manually. Let's try this.
>>> our_range = OurRange(0, 3)
>>> our_range.__next__()
0
>>> our_range.__next__()
1
>>> our_range.__next__()
2
>>> our_range.__next__()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 11, in __next__
StopIteration
Now let's try the same for range(0, 3). It must be noted that a range object is an Iterable but not itself an Iterator: it has no __next__ method, but its Iterator can be obtained by calling the __iter__ method.
>>> numbers = range(0, 3)
>>> numbers_iter = numbers.__iter__()
>>> numbers_iter.__next__()
0
>>> numbers_iter.__next__()
1
>>> numbers_iter.__next__()
2
>>> numbers_iter.__next__()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
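Calling the double-underscore methods directly works, but in everyday code the built-in functions iter() and next() are the usual way to do the same thing. A minimal sketch (variable names are mine); next() also accepts a default value that is returned instead of raising StopIteration:

```python
numbers_iter = iter(range(0, 3))  # same as range(0, 3).__iter__()

collected = [next(numbers_iter), next(numbers_iter), next(numbers_iter)]
print(collected)  # [0, 1, 2]

# With a default, an exhausted iterator returns it instead of raising.
after_end = next(numbers_iter, "done")
print(after_end)  # done
```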
As we can see, both implementations behave the same way: while the Iterator has not reached the limit, the values are returned; when the limit is reached, a StopIteration exception is raised, and our call breaks. It is also important to notice that when we are using a for loop, this exception is caught by the for itself, without us realizing it ever happened.
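Roughly speaking, a for loop can be rewritten by hand as a while loop that catches the exception. The function below is only an approximation I wrote to illustrate the idea (manual_for is a made-up name), not what the interpreter literally runs:

```python
def manual_for(iterable):
    """Collect every item of iterable the way a for loop would visit them."""
    collected = []
    iterator = iter(iterable)  # what the for statement does first
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # the for statement swallows this exception silently
        collected.append(item)  # stands in for the loop body
    return collected


print(manual_for(range(0, 3)))  # [0, 1, 2]
```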
Real World Example
So, where do Iterators apply in real world code?
An example would be to simplify the way you get values from sequential API calls. You can encapsulate the calling logic in the __next__ method, raising StopIteration when the API has nothing more to return. Let's see some example code.
import requests


class JsonPlaceholderCaller:
    def __init__(self):
        self.post = 1  # id of the next post to request

    def __iter__(self):
        return self

    def __next__(self):
        # One HTTP request per iteration; a 404 means there are no more posts.
        resp = requests.get(f'https://jsonplaceholder.typicode.com/posts/{self.post}', timeout=10)
        if resp.status_code == 404:
            raise StopIteration
        self.post += 1
        return resp


for response in JsonPlaceholderCaller():
    print(response)
At every iteration, this for loop will call the JSON Placeholder API to get a new post, until a new post is not found.
Of course, in real-world code this class and the code that calls it would be separated in different layers.
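One possible way to do that separation, sketched under my own assumptions: the Iterator receives a function that fetches a single post, so it knows nothing about HTTP. Every name here (PostIterator, fetch_post, FAKE_POSTS, fake_fetch) is hypothetical, and the fake function stands in for the real API call so the example runs without a network connection.

```python
class PostIterator:
    """Iterates over posts using whatever fetch function it is given."""

    def __init__(self, fetch_post, first_id=1):
        self.fetch_post = fetch_post  # callable: post id -> post or None
        self.post_id = first_id

    def __iter__(self):
        return self

    def __next__(self):
        post = self.fetch_post(self.post_id)
        if post is None:  # the fetch layer signals "not found" with None
            raise StopIteration
        self.post_id += 1
        return post


# Stand-in for the HTTP layer; a real version would call the API instead.
FAKE_POSTS = {
    1: {"id": 1, "title": "first"},
    2: {"id": 2, "title": "second"},
    3: {"id": 3, "title": "third"},
}


def fake_fetch(post_id):
    return FAKE_POSTS.get(post_id)


titles = [post["title"] for post in PostIterator(fake_fetch)]
print(titles)  # ['first', 'second', 'third']
```

With this split, the looping code can be tested with a fake like the one above, and the function that talks to the API can be swapped in later.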
I hope you enjoyed learning more about Python Iterators. My next post is about Python Generators, which are also cool.