Bruno Behnken

Sep 01, 2023

Python Iterators

I was in college when I first met Python, and most of my code up to that point had been written in C. When making the transition, I became convinced that this:

for (int i = 0; i < 10; i++)

translated to Python would be this:

for i in range(0, 10):

That is not wrong, since the final behavior of the code is the same, but what is going on under the surface is very different. C stores the i variable in memory, incrementing and testing its value at every iteration. Python instead instantiates a new Iterator and asks it for each value.

An Iterator in Python is an instance of a class that implements the methods __iter__ and __next__.

The __iter__ method is responsible for returning an Iterator object, which usually is the same instance that holds the method. This means that, generally speaking, most __iter__ implementations will just return self, but more complex implementations may have additional logic.

The __next__ method is where all the magic happens. When this method is called, it is expected to return the next value of the iteration, or to raise a StopIteration exception if all the values have already been used. To do that, the __next__ method must contain the logic that goes in the C for: the test and the increment.

Let's explain this better with an example, creating our own implementation of range(x, y).

class OurRange:
    def __init__(self, lower_boundary, upper_boundary):
        self.i = lower_boundary
        self.limit = upper_boundary

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.limit:
            raise StopIteration
        value = self.i
        self.i += 1
        return value

In this implementation, we are storing the boundary values as attributes, so they persist between method calls. The __next__ method first checks if the upper_boundary value has been reached, raising the StopIteration exception if it has. If not, the current value is saved in a local variable, self.i is incremented by 1, and the saved value is returned. Let's see what happens when we use OurRange in a for loop.

>>> for i in OurRange(1, 10):
...     print(i, end=' ')
... 
1 2 3 4 5 6 7 8 9 

As we can see, it produces the same values as range(1, 10).
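A quick way to check this is to collect both into lists and compare them. The sketch below repeats the class so it runs on its own (using >= in the limit test, which is an assumption on my part to also cover a lower boundary above the upper one). It also shows one difference worth knowing: because __iter__ returns self, an OurRange instance is used up after one pass, while a range object hands out a fresh Iterator every time.

```python
class OurRange:
    def __init__(self, lower_boundary, upper_boundary):
        self.i = lower_boundary
        self.limit = upper_boundary

    def __iter__(self):
        return self

    def __next__(self):
        # >= (an assumption here) also stops when lower_boundary > upper_boundary
        if self.i >= self.limit:
            raise StopIteration
        value = self.i
        self.i += 1
        return value


our_range = OurRange(1, 10)
first_pass = list(our_range)
second_pass = list(our_range)         # same instance, already exhausted

builtin_range = range(1, 10)
builtin_first = list(builtin_range)
builtin_second = list(builtin_range)  # range creates a new Iterator per pass

print(first_pass == builtin_first)    # True
print(second_pass)                    # []
print(builtin_second)                 # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

If you need to loop again, create a new OurRange instance.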

Since OurRange is an object, we can also assign it to a variable, and call __next__ manually. Let's try this.

>>> our_range = OurRange(0, 3)
>>> our_range.__next__()
0
>>> our_range.__next__()
1
>>> our_range.__next__()
2
>>> our_range.__next__()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 11, in __next__
StopIteration

Now let's try the same for range(0, 3). It must be noted that a range object is an Iterable but not itself an Iterator; its Iterator can be obtained by calling the __iter__ method.

>>> numbers = range(0, 3)
>>> iterator = numbers.__iter__()
>>> iterator.__next__()
0
>>> iterator.__next__()
1
>>> iterator.__next__()
2
>>> iterator.__next__()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration

As we can see, both implementations behave the same way: while the Iterator has not reached the limit, the values are returned; when the limit is reached, a StopIteration exception is raised, and our manual call ends in a traceback. It is also important to notice that when we are using a for loop, this exception is caught by the for itself, without us realizing it ever happened.
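To make that hidden step visible, here is a rough sketch of what a for loop does for us, written as a plain while loop. It uses the built-in functions iter() and next(), which call __iter__ and __next__ on the object. The function name manual_for is made up for this illustration, and the real for statement is implemented inside the interpreter, not like this.

```python
def manual_for(iterable):
    """Collect the values of an Iterable in the order a for loop would visit them."""
    collected = []
    iterator = iter(iterable)        # calls iterable.__iter__()
    while True:
        try:
            value = next(iterator)   # calls iterator.__next__()
        except StopIteration:
            break                    # the for loop ends silently at this point
        collected.append(value)      # this is where the loop body would run
    return collected


print(manual_for(range(0, 3)))       # [0, 1, 2]
```

Any object that follows the protocol works here, including the OurRange class from this post.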

Real World Example

So, where do Iterators apply in real world code?

One example is simplifying the way you get values from sequential API calls. You can encapsulate the calling logic in the __next__ method, raising StopIteration when the API has nothing more to return (in the code below, when it responds with a 404). Let's see some example code.

import requests


class JsonPlaceholderCaller:
    def __init__(self):
        self.post = 1

    def __iter__(self):
        return self

    def __next__(self):
        resp = requests.get(f'https://jsonplaceholder.typicode.com/posts/{self.post}')
        if resp.status_code == 404:
            raise StopIteration
        self.post += 1
        return resp

for response in JsonPlaceholderCaller():
    print(response)

At every iteration, this for will call the JSON Placeholder API to get a new post, until the API answers with a 404 because the requested post does not exist. Of course, in real world code this class and the code that calls it would be separated into different layers.
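One possible way to do that separation, as a sketch only: the Iterator receives a fetch function instead of calling the API itself, so the HTTP details live elsewhere and the Iterator can be tried without a network. The names PostIterator, fetch_post, fake_fetch and FAKE_POSTS are all hypothetical, and the convention that the fetch function returns None when there is no post is an assumption of this example.

```python
class PostIterator:
    """Iterates over posts obtained from an injected fetch function."""

    def __init__(self, fetch_post, first_id=1):
        self._fetch_post = fetch_post   # any callable: post id -> post or None
        self._next_id = first_id

    def __iter__(self):
        return self

    def __next__(self):
        post = self._fetch_post(self._next_id)
        if post is None:                # assumed signal for "no more posts"
            raise StopIteration
        self._next_id += 1
        return post


# Stand-in data source, for illustration only (no real API is called)
FAKE_POSTS = {1: "first post", 2: "second post", 3: "third post"}


def fake_fetch(post_id):
    return FAKE_POSTS.get(post_id)


posts = list(PostIterator(fake_fetch))
print(posts)                            # ['first post', 'second post', 'third post']
```

In real code, fake_fetch would be replaced by a function that performs the requests.get call and returns None on a 404.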

I hope you enjoyed learning more about Python Iterators. My next post is about Python Generators, which are also cool.