Using collections in functional style

Let us now review some of the most common “utility methods” in the collection classes.

You can also revisit Chapter 6.2 of the CS-A1110 Programming 1 course.

For the programming assignments, the following methods may also prove useful:

  • splitAt

  • keys

  • filter

  • toSet

  • toMap

  • toVector

  • indexWhere

  • take

  • filterKeys

  • min and max

To study these and the collection classes further, please see

Let us proceed to review some of the utility methods with examples.

forall

Checks if all the elements in a collection satisfy a predicate (recall that a predicate is a function whose range is Boolean). For instance, we can test whether none of the elements in a set is divisible by 4:

scala> val s = Set(2,5,7,11,12)
val s: Set[Int] = HashSet(5, 2, 12, 7, 11)

scala> s.forall(_ % 4 != 0)
res: Boolean = false

Note that the order on which the predicate is applied to the elements is not necessarily “the intuitive one” and that the predicate is not necessarily applied to all the elements:

scala> val s = Set(12,2,5,7,11)
val s: Set[Int] = HashSet(5, 2, 12, 7, 11)

scala> s.forall(e => {println(e); e % 4 != 0})
5
2
12
res: Boolean = false

This matters if the predicate has side effects: the result can be very difficult to understand in case something unexpected happens!

foreach

This method applies the argument function to all the elements in a collection. The return values of the function evaluations are ignored. As an example

scala> val points = Map("40355T"->30, "346823"->70, "826822"->65)
points: Map[String,Int] = Map(40355T -> 30, 346823 -> 70, 826822 -> 65)

scala> points.foreach((id,score) => println(s"Student ${id} " + (if score > 50 then "passed" else "did not pass")))

Student '40355T' did not pass
Student '346823' passed
Student '826822' passed

Observe that the parameter x of the anonymous function is of type Tuple2[String,Int], that is, a (key,value) -pair. If you want to consider only keys or values of a Map, take a look at the keys and values methods. For example:

scala> val avgPoints = points.values.sum / points.size
avgPoints: Int = 55

map

This is one of the most commonly used utility methods that transforms a collection into another by applying the function given as parameter to each element in the collection. There is no requirement for the domain and range types of the function to be the same; for example,

scala> val names = Seq("Turing", "Godel", "Shannon", "Shamir")
names: Seq[String] = List(Turing, Godel, Shannon, Shamir)

scala> val lengths = names.map(name => name.length)
lengths: Seq[Int] = List(6, 5, 7, 6)

flatMap

Like map but the applied function must return a traversable collection, such as List or Vector, and the resulting collection will then contain the concatenation of the function applications. For example:

scala> val l = List(1,2,3)
l: List[Int] = List(1, 2, 3)

scala> l.map(v => List(v*2, v*2 + 1))
res: List[List[Int]] = List(List(2, 3), List(4, 5), List(6, 7))

scala> l.flatMap(v => List(v*2, v*2 + 1))
res: List[Int] = List(2, 3, 4, 5, 6, 7)

groupBy

The groupBy(f) method applies the function f to each element in the collection and returns an association array that maps each value of the function encountered to the list of elements that evaluated to that value. This is a very convenient method when one wants to classify elements in a collection according to some criteria. For instance, we can make an association array that groups all the names in a list that start with the same letter together as follows:

scala> val names = List("abba", "Turing", "Alabama", "Celsius")
names: List[String] = List(abba, Turing, Alabama, Celsius)

scala> names.groupBy(name => name(0).toLower)
res0: scala.collection.immutable.Map[Char,List[String]] = Map(t -> List(Turing), a -> List(abba, Alabama), c -> List(Celsius))

As a second example, we can use groupBy to compute an association map that tells how many names of certain length occurs in a list:

scala> val names = Seq("Turing", "Godel", "Shannon", "Shamir")
names: Seq[String] = List(Turing, Godel, Shannon, Shamir)

scala> val namesOfLength = names.groupBy(name => name.length)
namesOfLength: scala.collection.immutable.Map[Int,Seq[String]] = Map(5 -> List(Godel), 7 -> List(Shannon), 6 -> List(Turing, Shamir))

scala> val distributionMap = namesOfLength.map((k,v) => (k,v.length))
distributionMap: scala.collection.immutable.Map[Int,Int] = Map(5 -> 1, 7 -> 1, 6 -> 2)

Also observe the last statement: the Map trait includes the mapValues method that works like map but is only applied to the values of the association array.

foldLeft

An invocation of foldLeft(start)(f), where f is a function with two arguments, computes the value f(...f(f(start, v1), v2)...,vn) over the elements v1,v2,...,vn in the collection. For instance, we could write the mathematical vector length method with foldLeft:

def len(vec: Vector[Double]) = math.sqrt(vec.foldLeft(0.0)((s, v) => s+v*v))

This computes the same result as

def len(vec: Vector[Double]) = math.sqrt(vec.map(v => v*v).sum)

zip

Combines, or “zips”, two collections into a collection of pairs in an element-by-element manner. For example,

scala> val l = List("a", "b", "c", "d")
l: List[String] = List(a, b, c, d)

scala> val n = Vector(1, 2, 3, 4, 5)
n: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5)

scala> l zip n
res: List[(String, Int)] = List((a,1), (b,2), (c,3), (d,4))

As we can see, if either collection has more elements than the other, then its “extra” elements are ignored. And l zip n is just another way of writing l.zip(n).

By the way, a highly convenient method for building maps from other collections of pairs is to use the toMap method:

scala> (l zip n).toMap
res: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2, c -> 3, d -> 4)

indices and zipWithIndex

The method indices returns the indices of a sequence while zipWithIndex returns a new sequence of pairs of form (e, i), where e is an element in the original sequence and i is its index there:

scala> val s = "Turing"
s: String = Turing

scala> s.indices
res: scala.collection.immutable.Range = Range(0, 1, 2, 3, 4, 5)

scala> s.zipWithIndex
res: scala.collection.immutable.IndexedSeq[(Char, Int)] = Vector((T,0), (u,1), (r,2), (i,3), (n,4), (g,5))

For example, there are now many alternatives to print out the characters in a string one-by-one:

val s = "Turing"

// Imperative style with vars and while loops
var i = 0
while i < s.length do
  println("s("+i+") = "+s(i))
  i += 1
end while

// With for-loops: this actually is internally implemented roughly as the next one
for i <- s.indices do println("s("+i+") = "+s(i))

// Functional style alternatives, pick your favorite
s.indices.foreach(i => println("s("+i+") = "+s(i)))

s.zipWithIndex.foreach(p => println("s("+p._2+") = "+p._1))

s.zipWithIndex.foreach((char, index) => println("s("+index+") = "+char))