Iterators

As their name hints, Iterators are objects that can be used to iterate over the elements in collections such as sequences, sets, and maps (see the Scala Iterator overview and also Section 13 of Scala by Example).

The core of an iterator consists of two methods:

  1. hasNext that tells whether there is a next element still available, and

  2. next that returns the next element.

Iterators are mutable: the call to next changes the internal state of the iterator so that the following call to the next operator returns the elements after the one returned first.

A basic use of iterators is as follows:

scala>  val l = List("a","b","c")
l: List[String] = List(a, b, c)

scala>  val it = l.iterator
it: Iterator[String] = non-empty iterator

scala>  while(it.hasNext()) println(it.next())
a
b
c

Using the two core methods, the Iterator trait implements many methods that mimic the collection traversal and transformation methods we have seen earlier: foreach, map, and so forth. Thus we could have written:

scala>  val l = List("a","b","c")
l: List[String] = List(a, b, c)

scala>  l.iterator.foreach(println _)
a
b
c

Or alternatively:

scala>  val l = List("a","b","c")
l: List[String] = List(a, b, c)

scala>  l.iterator.zipWithIndex.foreach(p => println("At index "+p._2+" we have: "+p._1))
At index 0 we have: a
At index 1 we have: b
At index 2 we have: c

But what is the point of iterators? For observe that we could have done the same as above without iterators in a more elegant way:

scala>  val l = List("a","b","c")
l: List[String] = List(a, b, c)

scala>  l.foreach(println _)
a
b
c

scala>  l.zipWithIndex.foreach(p => println("At index "+p._2+" we have: "+p._1))
At index 0 we have: a
At index 1 we have: b
At index 2 we have: c

The main benefit of iterators is that they can be more memory-efficient. For example, when we write

val l = List("a","b","c")
l.zipWithIndex.foreach(p => println("At index "+p._2+" we have: "+p._1))

then in the second line the List((a,0), (b,1), (c,2)) is first formed and only after that the function p => println("At index "+p._2+" we have: "+p._1) is applied to each element of that list. However, if we write

val l = List("a","b","c")
l.iterator.zipWithIndex.foreach(p => println("At index "+p._2+" we have: "+p._1))

instead, then the expression l.iterator.zipWithIndex does not form the intermediate list but rather a non-empty iterator of type Iterator[(String, Int)] that is then able to generate the elements in the list, one by one, when needed. The foreach method on this iterator then queries the list elements, one by one, and applies the function p => println("At index "+p._2+" we have: "+p._1) to each. Therefore, for long sequences and large sets or maps, iterators can be used to reduce the amount of memory needed (thus leading to improved time-efficiency because of increased memory locality) if the computation is such that it requires large intermediate collections to be formed.

Iterators are also used internally in Scala when implementing collection traversal and transformation methods.

Of course, the memory efficiency of iterators does not come without a price. The price we pay is that the way we can use iterators is restricted. This is described in the documentation as follows:

It is of particular importance to note that, unless stated otherwise, one should never use an iterator after calling a method on it. The two most important exceptions are also the sole abstract methods: next and hasNext.

Both these methods can be called any number of times without having to discard the iterator. Note that even hasNext may cause mutation – such as when iterating from an input stream, where it will block until the stream is closed or some input becomes available.

Note

In many other languages, iterators play an even more important role in particular because the languages do not support anonymous functions.

In these languages we would create a loop over the iterator and apply the operation each pass. For example, in Java iterating through and printing each element in object List<String> wordList (a list of strings) is done by using an Iterator object:

Iterator<String> wordIter = wordList.iterator();
while (wordIter.hasNext()) {
  System.out.println(wordIter.next());
}

The same can be done in Scala much more concisely and elegantly:

wordList.foreach(Console.println _)

Although iterators are in fact automatically used in the underlying implementation of the foreach method in Scala, we as programmers need not pay attention to this unless we want to.

(Modern versions of Java does provide a `foreach` method through the Iterable interface, which takes a consumer function as argument.)

Independent use: generators

Iterators can be also used independently of collections. For instance, the Iterator companion object has methods for constructing iterators that generate arbitrarily long sequences.

For example, the Iterator.from(start: Int) returns an iterator that generates the infinite sequence start, start+1, start2, ... of integers (of course, after \(2^{31}-1\) the sequence will overflow to \(-2^{31}\)). We could use this, for example, to return the indices (starting from 1) of all the elements that are divisible by 3 in a list of integers:

scala>  import Iterator._
import Iterator._

scala>  val l = List(2312,234,2,64,573,42,34,12,4,35,34,6,6)
l: List[Int] = List(2312, 234, 2, 64, 573, 42, 34, 12, 4, 35, 34, 6, 6)

scala>  l.iterator.zip(from(1)).filter({case (num, index) => num % 3 == 0}).map({case (num,index) => index}).toVector
res1: Vector[Int] = Vector(2, 5, 6, 8, 12, 13)

Above, the expression {case (num, index) => num % 3 == 0} is a concise way to write a function f(x,y) of type (Int,Int)=>Boolean that returns true if the first argument x is divisible by 3 and false` otherwise.

As a second example, we could use Iterator.continually(elem: => A), which applies the argument code elem indefinitely, for example, to read a file one line at a time and print out the character counts for each line in the:

val reader = new java.io.BufferedReader(new java.io.FileReader(new java.io.File("index.pre")))
Iterator.continually(reader.readLine()).takeWhile(_ != null).zipWithIndex.foreach {
 case (line, lineNum) => println("Line "+(lineNum+1)+": "+line.length+" characters")
}

Alternatively, we could have used var declarations and while loops:

val reader = new java.io.BufferedReader(new java.io.FileReader(new java.io.File("index.pre")))
var lineNum = 1
var line = ""
while( {line = reader.readLine(); line!= null} ) {
  println("Line "+lineNum+": "+line.length+" characters")
  lineNum += 1
}

Writing new iterators

One can also easily write new, special-purpose iterators. To see how, here is roughly how Iterator.from, an iterator that returns an infinite sequence of integers start, start + 1*step, start + 2*step, …, is defined in the companion object of Iterator:

def from(start: Int, step: Int): Iterator[Int] = new Iterator[Int] {
  private var i = start
  def hasNext: Boolean = true
  def next(): Int = { val result = i; i += step; result }
}

That is, from returns an object that extends the abstract class Iterator[Int] with

  1. a private var i that keeps track of the next value to be given when next is called, and

  2. by implementing the required methods hasNext and next.

We may now call:

scala> from(0, 4).take(10).toList
res1: List[Int] = List(0, 4, 8, 12, 16, 20, 24, 28, 32, 36)

As a second example, here is a pseudo-random “lottery” iterator that, given a player’s row of 7 integers in the range [1, 39], for each next call returns a new winning row (7 regular numbers and 2 extra) together with the information how many of the player’s numbers occurred in the winning row:

def lottery(myRow: Seq[Int]): Iterator[(Seq[Int], Seq[Int], String)] = new Iterator[(Seq[Int], Seq[Int], String)] {
  require(myRow.length == 7 && myRow.distinct.length == 7 && myRow.forall(n => (1 <= n) && (n <= 39)), "The row should consist of 7 distinct integers in [1,39]")
  private val rand = new scala.util.Random(System.nanoTime)
  def hasNext: Boolean = true
  def next(): (Seq[Int], Seq[Int], String) = {
    val (regular, extra) = rand.shuffle((1 to 39).toVector).take(9).splitAt(7)
    (regular.sorted, extra.sorted, s":math:`{regular.count(myRow contains _)}+`{extra.count(myRow contains _)}")
  }
}

We can now witness how lucky we are during a year’s worth of weekly lotteries:

scala> val machine = lottery(List(3,5,16,21,22,33,39))
machine: Iterator[(Seq[Int], Seq[Int], String)] = <iterator>

scala> machine.take(52).foreach(println _)
(Vector(3, 10, 14, 19, 20, 25, 37),Vector(1, 21),1+1)
(Vector(9, 15, 20, 21, 27, 31, 32),Vector(17, 33),1+1)
(Vector(13, 20, 21, 25, 26, 35, 39),Vector(6, 24),2+0)
...

To amuse yourself you may want to run the machine until you hit the jackpot (7 regular numbers right) and witness how many weeks of play this takes.