Iterators
As their name hints, Iterators are objects that can be used to iterate over the elements in collections such as sequences, sets, and maps (see the Scala Iterator overview and also Section 13 of Scala by Example).
The core of an iterator consists of two methods:
hasNext
that tells whether there is a next element still available, andnext
that returns the next element.
Iterators are mutable: the call to next
changes the internal state of
the iterator so that the following call to the next
operator returns
the elements after the one returned first.
A basic use of iterators is as follows:
scala> val l = List("a","b","c")
l: List[String] = List(a, b, c)
scala> val it = l.iterator
it: Iterator[String] = non-empty iterator
scala> while(it.hasNext()) println(it.next())
a
b
c
Using the two core methods, the Iterator
trait implements many
methods that mimic the collection traversal and transformation methods
we have seen earlier: foreach
, map
, and so forth.
Thus we could have written:
scala> val l = List("a","b","c")
l: List[String] = List(a, b, c)
scala> l.iterator.foreach(println _)
a
b
c
Or alternatively:
scala> val l = List("a","b","c")
l: List[String] = List(a, b, c)
scala> l.iterator.zipWithIndex.foreach(p => println("At index "+p._2+" we have: "+p._1))
At index 0 we have: a
At index 1 we have: b
At index 2 we have: c
But what is the point of iterators? For observe that we could have done the same as above without iterators in a more elegant way:
scala> val l = List("a","b","c")
l: List[String] = List(a, b, c)
scala> l.foreach(println _)
a
b
c
scala> l.zipWithIndex.foreach(p => println("At index "+p._2+" we have: "+p._1))
At index 0 we have: a
At index 1 we have: b
At index 2 we have: c
The main benefit of iterators is that they can be more memory-efficient. For example, when we write
val l = List("a","b","c")
l.zipWithIndex.foreach(p => println("At index "+p._2+" we have: "+p._1))
then in the second line the List((a,0), (b,1), (c,2))
is first formed
and only after that the function
p => println("At index "+p._2+" we have: "+p._1)
is applied to each element of that list.
However, if we write
val l = List("a","b","c")
l.iterator.zipWithIndex.foreach(p => println("At index "+p._2+" we have: "+p._1))
instead, then the expression l.iterator.zipWithIndex
does not
form the intermediate list but rather a non-empty iterator of type
Iterator[(String, Int)]
that is then able to generate the
elements in the list, one by one, when needed.
The foreach
method on this iterator then queries the list elements,
one by one, and applies
the function p => println("At index "+p._2+" we have: "+p._1)
to each.
Therefore, for long sequences and large sets or maps,
iterators can be used to reduce the amount of memory needed
(thus leading to improved time-efficiency because of increased
memory locality) if the computation is such that it requires
large intermediate collections to be formed.
Iterators are also used internally in Scala when implementing collection traversal and transformation methods.
Of course, the memory efficiency of iterators does not come without a price. The price we pay is that the way we can use iterators is restricted. This is described in the documentation as follows:
It is of particular importance to note that, unless stated otherwise, one should never use an iterator after calling a method on it. The two most important exceptions are also the sole abstract methods:
next
andhasNext
.Both these methods can be called any number of times without having to discard the iterator. Note that even
hasNext
may cause mutation – such as when iterating from an input stream, where it will block until the stream is closed or some input becomes available.
Note
In many other languages, iterators play an even more important role in particular because the languages do not support anonymous functions.
In these languages we would create a loop over the iterator and apply the
operation each pass. For example, in Java
iterating through and printing each element in
object List<String> wordList
(a list of strings)
is done by using an
Iterator object:
Iterator<String> wordIter = wordList.iterator();
while (wordIter.hasNext()) {
System.out.println(wordIter.next());
}
The same can be done in Scala much more concisely and elegantly:
wordList.foreach(Console.println _)
Although iterators are in fact automatically used in the underlying
implementation of the foreach
method in Scala, we as programmers
need not pay attention to this unless we want to.
(Modern versions of Java does provide a `foreach`
method through the Iterable interface, which takes a consumer function as argument.)
Independent use: generators
Iterators can be also used independently of collections. For instance, the Iterator companion object has methods for constructing iterators that generate arbitrarily long sequences.
For example,
the Iterator.from(start: Int)
returns an iterator that generates
the infinite sequence start, start+1, start2, ...
of integers
(of course, after \(2^{31}-1\) the sequence will overflow to \(-2^{31}\)).
We could use this, for example, to return the indices
(starting from 1) of all the elements that are
divisible by 3 in a list of integers:
scala> import Iterator._
import Iterator._
scala> val l = List(2312,234,2,64,573,42,34,12,4,35,34,6,6)
l: List[Int] = List(2312, 234, 2, 64, 573, 42, 34, 12, 4, 35, 34, 6, 6)
scala> l.iterator.zip(from(1)).filter({case (num, index) => num % 3 == 0}).map({case (num,index) => index}).toVector
res1: Vector[Int] = Vector(2, 5, 6, 8, 12, 13)
Above, the expression {case (num, index) => num % 3 == 0}
is a
concise way to write a function f(x,y)
of type (Int,Int)=>Boolean
that returns true
if the first argument x
is divisible by 3
and false` otherwise.
As a second example,
we could use Iterator.continually(elem: => A)
,
which applies the argument code elem
indefinitely,
for example, to read a file one line at a time and print
out the character counts for each line in the:
val reader = new java.io.BufferedReader(new java.io.FileReader(new java.io.File("index.pre")))
Iterator.continually(reader.readLine()).takeWhile(_ != null).zipWithIndex.foreach {
case (line, lineNum) => println("Line "+(lineNum+1)+": "+line.length+" characters")
}
Alternatively, we could have used var
declarations and while
loops:
val reader = new java.io.BufferedReader(new java.io.FileReader(new java.io.File("index.pre")))
var lineNum = 1
var line = ""
while( {line = reader.readLine(); line!= null} ) {
println("Line "+lineNum+": "+line.length+" characters")
lineNum += 1
}
Writing new iterators
One can also easily write new, special-purpose iterators.
To see how, here is roughly how Iterator.from
,
an iterator that returns an infinite sequence of
integers start
, start + 1*step
, start + 2*step
, …,
is defined in the companion object of Iterator:
def from(start: Int, step: Int): Iterator[Int] = new Iterator[Int] {
private var i = start
def hasNext: Boolean = true
def next(): Int = { val result = i; i += step; result }
}
That is, from
returns an object that extends the
abstract class Iterator[Int]
with
a private var
i
that keeps track of the next value to be given whennext
is called, andby implementing the required methods
hasNext
andnext
.
We may now call:
scala> from(0, 4).take(10).toList
res1: List[Int] = List(0, 4, 8, 12, 16, 20, 24, 28, 32, 36)
As a second example, here is a pseudo-random “lottery” iterator that,
given a player’s row of 7 integers in the range [1, 39],
for each next
call returns a new winning row
(7 regular numbers and 2 extra) together with
the information how many of the player’s numbers
occurred in the winning row:
def lottery(myRow: Seq[Int]): Iterator[(Seq[Int], Seq[Int], String)] = new Iterator[(Seq[Int], Seq[Int], String)] {
require(myRow.length == 7 && myRow.distinct.length == 7 && myRow.forall(n => (1 <= n) && (n <= 39)), "The row should consist of 7 distinct integers in [1,39]")
private val rand = new scala.util.Random(System.nanoTime)
def hasNext: Boolean = true
def next(): (Seq[Int], Seq[Int], String) = {
val (regular, extra) = rand.shuffle((1 to 39).toVector).take(9).splitAt(7)
(regular.sorted, extra.sorted, s":math:`{regular.count(myRow contains _)}+`{extra.count(myRow contains _)}")
}
}
We can now witness how lucky we are during a year’s worth of weekly lotteries:
scala> val machine = lottery(List(3,5,16,21,22,33,39))
machine: Iterator[(Seq[Int], Seq[Int], String)] = <iterator>
scala> machine.take(52).foreach(println _)
(Vector(3, 10, 14, 19, 20, 25, 37),Vector(1, 21),1+1)
(Vector(9, 15, 20, 21, 27, 31, 32),Vector(17, 33),1+1)
(Vector(13, 20, 21, 25, 26, 35, 39),Vector(6, 24),2+0)
...
To amuse yourself you may want to run the machine until you hit the jackpot (7 regular numbers right) and witness how many weeks of play this takes.