Image from CS50, licensed under CC BY-NC-SA 4.0.

Lazy evaluation with Iterators

In this section of the CS50 course, we take a look at Python's syntax, its data types, and some more elaborate structures like lists and dictionaries.
Coding in Python after C lets you better appreciate a higher level of abstraction.
Programs that required a tedious implementation in C are done in a few lines of Python.
Python introduces exceptions and object-oriented programming, resolves the issue of integer overflow (but not the problem of floating-point imprecision), and offers a broad range of libraries that allow us to stand on the shoulders of giants.
But I would like to talk about a feature that was not mentioned in CS50: iterators.
An iterator is an object that allows you to loop through a sequence of data without having to store the whole sequence in memory: each value is produced only when the loop asks for it.
This behavior is called lazy evaluation, and it is very useful when we want to loop over a very large or even infinite data set (a common situation in data science).
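
As a minimal sketch of what "lazy" means here, the snippet below steps through an iterator by hand with the built-in iter() and next(), then takes the first few squares from an endless sequence using itertools.count and itertools.islice. The variable names are only illustrative and are not part of the CS50 material.

```python
from itertools import count, islice

# A list lives fully in memory; iter() returns an iterator over it.
numbers = [10, 20, 30]
iterator = iter(numbers)

# next() asks the iterator for one value at a time.
first = next(iterator)
second = next(iterator)

# count() produces 0, 1, 2, ... without end, so it could never be stored as a list.
# islice() pulls only the first five values, each computed on demand.
first_squares = [n * n for n in islice(count(), 5)]

print(first, second, first_squares)  # prints: 10 20 [0, 1, 4, 9, 16]
```

A for loop does the same thing behind the scenes: it calls next() repeatedly until the iterator signals that it is exhausted.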

Take a look at the implementation below, which does NOT use iterators:


      def createData(bigNumber):
          sequence = []
          for i in range(bigNumber):
              sequence.append(i * i)
          return sequence 
      
      def loopWithoutIterator(bigNumber):
          for value in createData(bigNumber):
              if value > 100:
                  break
              print(value)
      
      loopWithoutIterator(1000000)
        

# Here we make a million calculations, store all the results in a list, and then loop over that list only to print out its first eleven values.
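
To get a rough feel for the storage cost, the sketch below builds the same million squares and asks sys.getsizeof for the size of the resulting list. It assumes CPython, where getsizeof reports only the list object itself (its array of references), not the integer objects it points to, so the real footprint is larger. The name create_data is a hypothetical stand-in for the function above.

```python
import sys

def create_data(big_number):
    # Same idea as the function above: compute and store every square up front.
    return [i * i for i in range(big_number)]

data = create_data(1_000_000)

# Size of the list object alone, in bytes (the integers are counted separately).
list_bytes = sys.getsizeof(data)

print(len(data), "values stored,", list_bytes, "bytes for the list itself")
```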

Now let's take a look at the strength of iterators, written here as a generator function (a function that uses yield and returns an iterator):


      def createIterator(bigNumber):
          for i in range(bigNumber):
              yield i * i
      
      def loopWithIterator(bigNumber):
          for i in createIterator(bigNumber):
              if i > 100:
                  break
              print(i)
      
      loopWithIterator(1000000)
        

# Here the iterator yields one value on each iteration of the outer loop, and the value is printed as long as it is not greater than 100. So instead of making 1,000,000 calculations + 11 printouts, it makes only 12 calculations (the twelfth value, 121, triggers the break) + 11 printouts.
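
One way to check how much work the lazy version really does is to count the calculations. The sketch below is a variation on the code above with a hypothetical counter (calls) and helper (square) added purely for observation. It collects the values in a list instead of printing them so the result is easy to inspect.

```python
calls = 0  # hypothetical counter, added only to observe how many squares are computed

def square(i):
    global calls
    calls += 1
    return i * i

def create_iterator(big_number):
    # Each square is computed only when the consumer asks for the next value.
    for i in range(big_number):
        yield square(i)

printed = []
for value in create_iterator(1_000_000):
    if value > 100:
        break
    printed.append(value)

print(len(printed), calls)  # prints: 11 12
```

The counter stops at 12 because the generator has to compute 121 before the loop can see that it exceeds 100 and break. The remaining 999,988 squares are never calculated.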