Sequences

A sequence of elements is one of the fundamental abstractions in high-level programming. In particular, a sequence is a collection of objects that have been grouped together in a designated order.

In Scala, sequence objects implement the trait Seq. As we will learn more about in a later round on collections, there are many different concrete classes implementing this trait.

Let us do a small workout with sequences at the Scala console. Open up a console (or use a tool such as Scastie) so that you can play with the examples as we go!

During the course we will use several types of sequences, including Array and List, and we will learn about Iterators which is often a better way to traverse over a collection of elements in a sequence. In this section we will mostly see the ArrayBuffer, which is an example of a mutable Scala sequence that you have probably used before.

Before we use the ArrayBuffer we need to import it:

scala> import scala.collection.mutable.ArrayBuffer

Now, let us create a small array of integers at the console

scala> val a = ArrayBuffer(752, 884, 995, 462, 126, 632, 445)
a: ArrayBuffer[Int] = ArrayBuffer(752, 884, 995, 462, 126, 632, 445)

Note

A key property of sequences is that every object in a sequence has an index (or position) by which it can be addressed. The first object in the sequence has index 0, the second object has index 1, and so forth.

We can readily verify this at the console

scala> a(0)
res: Int = 752

scala> a(1)
res: Int = 884

scala> a(6)
res: Int = 445

Each sequence object has a number of methods that enable us to get information and manipulate the sequence to our convenience. For example, the length of a sequence is the number of objects in it

scala> a.length
res: Int = 7

Fundamental to a sequence is that we can use and manipulate the order in which the elements are positioned in the sequence. For example, the reverse method of a sequence returns a new sequence with the same objects but in reverse order

scala> a
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 995, 462, 126, 632, 445)

scala> a.reverse
res: ArrayBuffer[Int] = ArrayBuffer(445, 632, 126, 462, 995, 884, 752)

Similarly, we can obtain new sequences from an existing sequence by taking and dropping elements from the existing sequence, or by slicing a contiguous segment starting and ending at a specified position

scala> a
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 995, 462, 126, 632, 445)

scala> a.take(3)
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 995)

scala> a.drop(2)
res: ArrayBuffer[Int] = ArrayBuffer(995, 462, 126, 632, 445)

scala> a.takeRight(2)
res: ArrayBuffer[Int] = ArrayBuffer(632, 445)

scala> a.dropRight(3)
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 995, 462)

scala> a.slice(1,3)
res: ArrayBuffer[Int] = ArrayBuffer(884, 995)

You are invited to study the documentation of trait Seq to pick up further useful methods.

Certain sequences are useful enough that they have their own shorthand functions by which they can be created. A particularly useful case are ranges of consecutive integers

scala> (0 until 10)
val res: scala.collection.immutable.Range.Exclusive = Range 0 until 10

scala> (10 to 20)
val res: scala.collection.immutable.Range.Inclusive = Range 10 to 20

Representing the integer sequences \(0,1,2,...9\) and \(10, 11, ...20\) respectively. Note that the until is exclusive (meaning the upper bound of the range is excluded), while to is inclusive (meaning the upper bound of the range is included).

A sequence (such as a Range) can be converted to an array by the method .toBuffer if needed

scala> (10 to 20).toBuffer
val res: ArrayBuffer[Int] = ArrayBuffer(10, 11, 12, 13, 14, 15, 16, 17, 18, 19,20)

Transforming sequences

A great many tasks in programming require us to transform a sequence of objects into another sequence in some controlled way.

Mapping

Perhaps the most common such transformation is that we apply a function to each element in the sequence. This transformation is often called mapping the function across the sequence.

To show an example of mapping, let us define a simple function that squares a given integer

scala> def square(x: Int) = x*x
def square(x: Int): Int

scala> square(5)
res: Int = 25

We can now map the function across a sequence to obtain a sequence consisting of the square of each element in the original sequence

scala> a
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 995, 462, 126, 632, 445)

scala> a.map(square)
res: ArrayBuffer[Int] = ArrayBuffer(565504, 781456, 990025, 213444, 15876, 399424, 198025)

Mapping with anonymous functions. Mapping is particularly convenient when used together with anonymous functions, which we can give directly as an argument to map without introducing a separate named function

scala> a.map((x: Int) => x*x)
res: ArrayBuffer[Int] = ArrayBuffer(565504, 781456, 990025, 213444, 15876, 399424, 198025)

Furthermore, we can in many cases omit the type specifications because Scala can automatically infer the types for us

scala> a.map(x => x*x)
res: ArrayBuffer[Int] = ArrayBuffer(565504, 781456, 990025, 213444, 15876, 399424, 198025)

Filtering using a predicate

Another very common transformation is that we must select all elements that satisfy some specific property from a sequence. This transformation is often called filtering the sequence for the elements of interest.

We define the elements that we want using a function that takes as argument an element and returns true if the element is interesting, and false otherwise. Such a function is often called a predicate.

For example, suppose we want to filter a sequence for all of its even elements. Let us define a predicate that indicates whether an integer is even

scala> def isEven(x: Int) = x%2 == 0
def isEven(x: Int): Boolean

scala> isEven(0)
res: Boolean = true

scala> isEven(1)
res: Boolean = false

scala> isEven(2)
res: Boolean = true

scala> isEven(3)
res: Boolean = false

We can now use the predicate to filter a sequence for its even elements

scala> a
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 995, 462, 126, 632, 445)

scala> a.filter(isEven)
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 462, 126, 632)

Again we can use anonymous functions if we so want

scala> a.filter(x => x % 2 == 0)
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 462, 126, 632)

scala> a.filter(_ % 2 == 0)
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 462, 126, 632)

Reducing

Often it is necessary to summarise the elements of a sequence by applying a two-argument associative function across the elements of the sequence. This operation is called reducing the sequence.

A common example is that we want to take the sum of the elements of the sequence

scala> def myPlus(left: Int, right: Int) = left+right
def myPlus(left: Int, right: Int): Int

scala> a.reduce(myPlus)
res: Int = 4296

Again we may use anonymous functions

scala> a.reduce((left: Int, right: Int) => left+right)
res: Int = 4296

scala> a.reduce((left, right) => left+right)
res: Int = 4296

scala> a.reduce(_ + _)
res: Int = 4296

In the the call a.reduce(_ + _) above, we use some Scala syntactic sugar to write the anonymous function on a more compact form: the first _ takes on the value of left and the second one the value of right in the other two calls.

Sums are in fact important enough that they are available also as a dedicated method in trait Seq

scala> a.sum
res: Int = 4296

Folding

Folding is a generalisation of reducing where we can specify the initial value from which the reducing starts

scala> a.fold(0)(_ + _)
res: Int = 4296

scala> a.fold(100)(_ + _)
res: Int = 4396

Scanning

Scanning generalises folding by recording all the intermediate values during a fold when we traverse the sequence from left to right (or right to left):

scala> a
res: ArrayBuffer[Int] = ArrayBuffer(752, 884, 995, 462, 126, 632, 445)

scala> a.scanLeft(0)(_ + _) // running sum left-to-right
res: ArrayBuffer[Int] = ArrayBuffer(0, 752, 1636, 2631, 3093, 3219, 3851, 4296)

scala> a.scanRight(0)(_ + _) // running sum right-to-left
res: ArrayBuffer[Int] = ArrayBuffer(4296, 3544, 2660, 1665, 1203, 1077, 445, 0)

scala> a.scanLeft(100)(_ + _) // start sum from 100
res: ArrayBuffer[Int] = ArrayBuffer(100, 852, 1736, 2731, 3193, 3319, 3951, 4396)

From the ‘left’ or from the ‘right’?

As we saw whens scanning, we get different results when beginning operations from the left or from the right in the sequence, because we can see the intermediate values.

There are also explicit versions of other sequence operations such as the previously mentioned reduce and fold (called reduceLeft and reduceRight, foldLeft and foldRight respectively) that determines if the reduction starts from the left or right side of the sequence. This can be important if the operator we use to combine the elements is not commutative <https://en.wikipedia.org/wiki/Commutative_property>__. For example, if we subtract the values instead of adding them, reducing from left or right matters:

scala> a.reduce(_ - _)
val res: Int = -2792

scala> a.reduceLeft(_ - _)
val res: Int = -2792

scala> a.reduceRight(_ - _)
val res: Int = 340

Folding can of course give different values from reduce, as we can set the start value:

scala> a.foldLeft(0)((acc,right) => acc - right)
val res: Int = -4296

scala> a.foldRight(0)((left,acc) => left - acc)
val res: Int = 340

In the above we are explicitly naming the parameters as earlier in this section, but instead of calling them both left and right respectively, we have now chosen to call one of them acc (short for accumulator). This is just to highlight that this is where the running difference is accumulated. Either from the left or from the right side.

In the case of foldLeft we in effect calculate \((((((((0 - 752) - 884) - 995) - 462) - 126) - 632) - 445)\), so we start calculations with the accumulator and the leftmost element. In the case of foldRight, on the other hand, we calculate \((752 - (884 - (995 - (462 - (126 - (632 - (445 - 0)))))))\), which accumulates from the right side of the sequence.

In fact, the methods foldLeft and foldRight are more general than the basic fold in that the left and right sides of the operator does not need to be of the same type. We can for example make foldRight behave like filter by letting the zero element (the accumulator) be a sequence and then choosing if the left element (taken from the input sequence) should be appended (using the operator +:) to the resulting sequence or not:

scala> a.foldRight(Seq[Int]())((left,acc) => if left % 2 == 0 then left +: acc else acc)
val res: Seq[Int] = List(752, 884, 462, 126, 632)

Did you note that we got a List out from that last call, not an ArrayBuffer which we operated on? This was because acc is a plain Seq[Int], which defaults to a list. ArrayBuffer and List both have the trait Seq, meaning that we could have accumulated to an ArrayBuffer[Int] as well. You can try this out, if you like. (As we will learn later in the course, while you can often use different kind of sequences interchangeably, they are not necessarily equally efficient.)

Zipping and unzipping

In the exercises we get yet further practice with sequence transformations and transformations on pairs of sequences such as zipping two sequences into a sequence of pairs

scala> ArrayBuffer(1,2,3) zip ArrayBuffer(4,5,6)
res: ArrayBuffer[(Int, Int)] = ArrayBuffer((1,4), (2,5), (3,6))

The dual operation of unzipping breaks a sequence of pairs into two sequences

scala> ArrayBuffer((10,1),(20,2),(30,3)).unzip
res: (ArrayBuffer[Int], ArrayBuffer[Int]) = (ArrayBuffer(10, 20, 30),ArrayBuffer(1, 2, 3))

For working with pairs, recall that you can index the components of a tuple object (such as a pair of integers) with the underscore notation. To illustrate this, let us define a pair of integers

scala> val x = (123,456)
x: (Int, Int) = (123,456)

We can now index the pair as follows

scala> x._1
res: Int = 123

scala> x._2
res: Int = 456

Cascading

Often it is useful to execute one transformation after another to form a cascade of method calls on successive sequences obtained from consecutive transformations.

For example, suppose we want to compute the sum of the squares of our sequence of integers.

This is easy – first map to obtain the squares and then reduce to obtain the sum of squares:

scala> a.map(x => x*x).reduce(_ + _)
res: Int = 3163754

Or suppose we want to compute the sum of squares of all even integers.

No problem – first filter to get the even integers, then map to get the squares, and finally reduce to get the sum:

scala> a.filter(_ % 2 == 0).map(x => x*x).reduce(_ + _)
res: Int = 1975704

The term cascading comes from the fact that for better readability such transformations are best written one after one another on separate lines (with comments!), so the transformations form a “cascade”:

val result = a.filter(_ % 2 == 0)  // get even elements
              .map(x => x*x)       // square
              .reduce(_ + _)       // sum up

Transforming elements by `for` - `yield`

In Scala it is also possible to use a version of the for expression to transform a sequence. If we use the keyword yield instead of do before the block of code inside the loop the result of the code block (the last statement) will be collected into a sequence.

For example, if we want to calculate a sequence of the squares we could do:

scala> for x <- a yield x * x
val res: ArrayBuffer[Int] = ArrayBuffer(565504, 781456, 990025, 213444, 15876, 399424, 198025)

This can be understood as “for every value x in a give a new value which is the square of x and collect those values in a new collection”.