Linked lists constitute one of the most basic and best-known data structures.

In this section we will study a typed variant of singly linked lists, which are, for instance,

Note

What is the point of studying such an elementary data structure that is already implemented and readily usable in the Scala collections API?

First, we want to get a first intuition how data structures are implemented in practice in the underlying API, and how the chosen data structure affects the efficiency of the basic operations that we can perform on the data structure, such as adding or deleting elements.

Second, we require practice with recursion, and making recursive implementations of list operations is an excellent primer to learning to think in terms of recursion (which can be difficult, so we really want to start with a very basic data structure!) and tail recursion.

We can define singly linked lists with elements of some type T, or T-lists for short, recursively as follows:

1. The empty list, denoted by Nil, is a T-list.

2. If l is a T-list and e is an element of type T, then Cons(e, l) is a T-list that contains the element e as its first element, followed by the elements in the list l.

3. There are no other T-lists (this last requirement is usually implicit and not explicitly stated).

For example, let us assume that T is Int. Then

• Nil is an Int-list representing an empty sequence [] of integers,

• Cons(1, Nil) is an Int-list representing the sequence ,

• Cons(1, Cons(3, Cons(5, Nil))) is an Int-list for [1,3,5], and

• Cons(1, Cons(Nil, Cons(5, Nil))) is not an Int-list.

Given a T-list Cons(e, l), we say that e is the head of the list and l is its tail. The empty list Nil does not have a head or a tail. As an example,

• the head of the Int-list Cons(5, Nil) is 5 while the tail is Nil, and

• for the String-list Cons("first", Cons("second" Cons("last", Nil))), the head is the string "first" and the tail is the list Cons("second" Cons("last", Nil)).

We now present different ways to implement lists and some basic operations on them. The recommended way is the A second implementation with case classes but we will start without case classes to illustrate some concepts and to appreciate case classes more when we observe how they make life easier. All the implementations here, such as the class scala.collection.immutable.List, provide an immutable data structure.

In the exercises, your task will be to implement more operations on lists.

## A first implementation

Let us make our first implementation of lists. This first version is for introductory purposes, you will find the source code of the version in the corresponding home assignment. We first define an abstract base class for lists over some other type A, and declare some basic methods:

package linkedLists
import scala.annotation.tailrec

/**
* Returns true if the list is empty.
*/
def isEmpty: Boolean
/**
* Returns the first element in the list
* (throws an exception if the list is empty).
*/
/**
* Returns the list obtained by removing the first element from this list
* (throws an exception if the list is empty).
*/


In this implementation, we want the head and tail methods to be included in the base class so that we can access them in methods that we implement at the base class level (in the other approaches below we can implement the same functionality without doing this).

The concrete classes for an empty list (Nil) and a nonempty list (Cons) extend the base class and implement the remaining methods:

/**
*/
def isEmpty = true
def tail = throw new java.util.NoSuchElementException("tail of empty list")
end Nil

/**
* @param _head the first element in the list
* @param _tail the rest of the list
*/
def isEmpty = false
end Cons


Finally, the companion objects provide the usual helper methods for constructing new list objects:

/** Companion object for an empty list */
object Nil:
def apply[A]() = new Nil[A]()
end Nil
/** Companion object for a non-empty list */
object Cons:
end Cons


We can now create, for example, a small linked list over String objects and fetch its first element:

scala> val l = Cons("First", Cons("Second", Nil()))
l: linkedLists.Cons[String] = Cons(First, Cons(Second, Nil()))

res: String = First


(Actually, the second line will not look like that until we have implemented the toString method below.)

Let us study what happens when we prepend existing lists with new elements. For example, let us execute the following code:

val l = Cons(5, Cons(4, Cons(-7, Nil())))
val m = Cons(3, l)


Now m is a list of four Int values corresponding to the sequence [3, 5, 4, -7]. Observe that when we create the list Cons(3, l), the list l was not copied or even accessed at all. So what happens in the memory?

The statement

val l = Cons(5, Cons(4, Cons(-7, Nil())))


first creates four objects: Next, the statement

val m = Cons(3, l)


just adds one more Cons object, whose tail references to the first object in the list l: That is, the new list shares all the objects in the first list.

Observe that it is not possible to create cyclic lists because the recursive definition is well-founded (must start from Nil). Furthermore, the data structure is immutable so we cannot modify the tail references of existing lists.

### The contains method

Let us now implement our first non-trivial method on lists, namely one that checks whether an element e is contained in the list. The recursive reasoning behind the implementation is as follows:

1. an empty list contains no elements;

2. if the head of the list is the element e, then the list contains the element;

3. otherwise, the list contains the element if the tail part of the list contains the element.

This leads to the following tail-recursive implementation which we include in the base class LinkedList:

  /**
* Returns true if the list contains the element e.
*/
@tailrec final def contains(e: A): Boolean =
if isEmpty then false
else if head == e then true
else tail.contains(e)
end contains


Let us test the method on a small list:

scala> val l = Cons("First", Cons("Second", Nil()))
l: linkedLists.Cons[String] = Cons(First, Cons(Second, Nil()))

scala> l.contains("Second")
res: Boolean = true

scala> l.contains("Third")
res: Boolean = false


Recall that for the @tailrec annotation above, forcing the Scala compiler to check that the function is really tail recursive, we must import its definition at the beginning with

import scala.annotation.tailrec


As the method is free of side effects, we can also illustrate its action by “expanding” it:

Cons(1, Cons(2, Cons(3, Nil()))).contains(3) =
Cons(2, Cons(3, Nil())).contains(3) =
Cons(3, Nil()).contains(3) =
true


Or with a call-stack:

Cons(1, Cons(2, Cons(3, Nil()))).contains(3):
return Cons(2, Cons(3, Nil())).contains(3)
Cons(2, Cons(3, Nil())).contains(3):
return Cons(3, Nil()).contains(3)
Cons(3, Nil()).contains(3):
return true


### The length method

So far, so good. Let us now implement the length method returning the number of elements in the list. As lists were defined recursively, it is easy to define the result of the length method recursively, as well:

1. the length of the empty list Nil is 0, and

2. the length of a non-empty list Cons(e, l) is the length of the tail list l, plus $$1$$.

#### A non-tail recursive version

The above definition immediately leads to our first implementation of length. We include the method signature definition

abstract class LinkedList[A]:
// Rest as in LinkedList above ...

// Define a length method
def length: Int



in the abstract base class LinkedList and then implement it with

class Nil[A]() extends LinkedList[A]:
// Rest as in Nil above ...

// Length of Nil is 0
def length = 0

end Nil


in Nil and with

class Cons[A](val head: A, val tail: LinkedList[A]) extends LinkedList[A]:
// Rest as in Cons above ...

// Length of Cons is 1 + length of its tail
def length = 1 + tail.length

end Cons


in Cons. This is a correct implementation, easy to understand, and works well for small lists. However, as we easily observe, the implementation is not tail-recursive.

Indeed, in Cons one has to first compute the value tail.length and then add 1 to it.

Therefore, if we experiment with long lists (say, 20000 in one environment)

val x = (1 to 20000).foldRight(Nil[Int]())((v: Int, m: LinkedList[Int]) => Cons(v, m))
println("The length of x is " +x.length)


the result is a java.lang.StackOverflowError.

Again we can illustrate the behaviour of this method by “expanding” it:

Cons(1, Cons(2, Cons(3, Nil()))).length =
1 + Cons(2, Cons(3, Nil())).length =
1 + (1 + Cons(3, Nil()).length) =
1 + (1 + (1 + Nil().length)) =
1 + (1 + (1 + 0)) =
1 + (1 + 1) =
1 + 2 =
3


Or with a call-stack:

Cons(1, Cons(2, Cons(3, Nil()))).length:
val tmp1 = Cons(2, Cons(3, Nil())).length
Cons(2, Cons(3, Nil())).length:
val tmp2 = Cons(3, Nil()).length
Cons(3, Nil()).length:
val tmp3 = Nil().length
Nil().length:
return 0
// tmp3 == 0
return 1 + tmp3
// tmp2 == 1
return 1 + tmp2
// tmp1 == 2
return 1 + tmp1
3


#### An imperative version

The length method is simple and easily expressible in an imperative way by just going from the beginning of the list to the end with a while loop, counting the number of elements encountered in the way. That is, we implement the whole method in the abstract base class with the following code:

def length: Int =
var result = 0
var remaining = this
while !remaining.isEmpty do
result += 1
remaining = remaining.tail
end while
result
end length


Observe that the implementation does not change the state of the LinkedList object at any point during its execution; therefore, the class LinkedList remains immutable.

In fact, many of the methods in the classes in the scala.collection.immutable package are implemented in this way: one uses variables to compute intermediate results but the values stored in the object itself are not changed at any point (as required by immutability).

Note

In our programming exercise on lists, you are not allowed to use imperative code. In particular, var declarations are forbidden by a custom Scala compiler we use in the evaluation phase.

#### A tail-recursive version

Since we are practising recursion, let us implement the length method in a tail-recursive way. One way to do this is just to “implement” the while loop above with an inner function and then call the inner function with the appropriate initial values:

  def length: Int =
@tailrec def inner(remaining: LinkedList[A], result: Int): Int =
if remaining.isEmpty then result
else inner(remaining.tail, result+1)
end inner
inner(this, 0)
end length


Observe that the variables of the imperative version are now maintained and passed as parameters to the inner function. Such use of inner functions to implement an iteration is a very common and useful programming pattern. Let us illustrate the operation of the method with “expansion”

Cons(1, Cons(2, Cons(3, Nil()))).length =
inner(Cons(1, Cons(2, Cons(3, Nil()))), 0) =
inner(Cons(2, Cons(3, Nil())), 1) =
inner(Cons(3, Nil()), 2) =
inner(Nil(), 3) =
3


And as a call-stack

Cons(1, Cons(2, Cons(3, Nil()))).length:
return inner(Cons(1, Cons(2, Cons(3, Nil()))), 0)
inner(Cons(1, Cons(2, Cons(3, Nil()))), 0):
return inner(Cons(2, Cons(3, Nil())), 1)
inner(Cons(2, Cons(3, Nil())), 1):
return inner(Cons(3, Nil()), 2)
inner(Cons(3, Nil()), 2):
return inner(Nil(), 3)
inner(Nil(), 3):
return 3


### The reverse and toString methods

Let us practise tail-recursive functions with two more methods:

• reverse that gives another list obtained by listing the elements in the current list in the reverse order, and

• toString that gives a string representation of the list.

Again, we implement the methods in the base class LinkedList and use inner functions to perform the iteration over the elements in the list:

  /**
* Returns the reversed version of this list
*/
if remaining.isEmpty then result
end inner

inner(this, Nil())
end reverse

/**
* A string representation of the list.
*/
override def toString: String =
@tailrec def inner(remaining: LinkedList[A], prefix: String, postfix: String): String =
if remaining.isEmpty then prefix + "Nil()" + postfix
else inner(remaining.tail, prefix + "Cons(" + remaining.head.toString + ", ", postfix + ")")
end inner

inner(this, "", "")
end toString


It is probably not immediately clear how the reverse method works, so let us illustrate it with “expansion”:

Cons(1, Cons(2, Cons(3, Nil()))).reverse =
inner(Cons(1, Cons(2, Cons(3, Nil()))), Nil()) =
inner(Cons(2, Cons(3, Nil())), Cons(1, Nil())) =
inner(Cons(3, Nil()), Cons(2, Cons(1, Nil()))) =
inner(Nil(), Cons(3, Cons(2, Cons(1, Nil())))) =
Cons(3, Cons(2, Cons(1, Nil())))


## A second implementation with case classes

Our second implementation takes advantage of a feature in Scala: case classes. Such classes are special in the sense that

• a companion object is automatically generated, and

• more importantly, they export their constructor for pattern matching.

Let us see how case classes work by re-implementing our list example with case classes. We again have the abstract base class LinkedList[A] (in which we implement more methods in a while) but the main difference is that the classes Nil and Cons are declared as case classes:

abstract class LinkedList[A]:
def isEmpty: Boolean

def isEmpty = true
def tail = throw new java.util.NoSuchElementException("tail of empty list")
end Nil

def isEmpty = false
end Cons


Companion objects are automatically generated for case classes, enabling us to write, as earlier:

scala> val l = Cons("First", Cons("Second", Nil()))
l: linkedLists.Cons[String] = Cons(First, Cons(Second, Nil()))

res: String = First


### The contains method

We could still use our previous implementation of the contains method but we can now also write it with pattern matching on case classes:

  @tailrec final def contains(e: A): Boolean =
this match
case Nil() => false
case Cons(h, t) => if (h == e) then true else t.contains(e)
end match
end contains


Here the statement “this match  ...” takes the object this (i.e., the list on which the method is called) and matches it with the cases listed in the curly brackets:

1. The first case, case Nil(), checks whether this is an instance of the Nil class (i.e., whether the list is empty). If this is the case, then the “this match ...” statement evaluates to the value false.

2. The second case, case Cons(h, t), checks whether this is an instance of the Cons class and that

• its first constructor parameter value head equals to h, and

• its second constructor parameter value tail equals to t.

As h and t are not defined in the scope, new values are reserved for them and they can match to any value. These matched values are then available in the value evaluation part “if (h == e) true else t.contains(e)” of the case, allowing us to check (i) if the head of the list equals to e, or (ii) if it does not, perform the tail-recursive call on the tail of the list.

For example, if this is the list val l = Cons("First", Cons("Second", Nil())), then the case case Cons(h, t) is matched with the value of h being the string "First" and the value of t being the object Cons("Second", Nil()).

The cases are evaluated in the order they are given in program text, and the first case that matches is evaluated (if none of the cases match, then scala.MatchError error is thrown automatically).

### The length, reverse and toString methods

In a similar way, we can re-implement these methods more concisely with case classes:

  def length: Int =
@tailrec def inner(remaining: LinkedList[A], result: Int): Int =
remaining match
case Nil() => result
case Cons(_, t) => inner(t, result + 1)
end match
end inner
inner(this, 0)
end length

remaining match
case Nil() => result
case Cons(h, t) => inner(t, Cons(h, result))
end match
inner(this, Nil())
end reverse

override def toString: String =
@tailrec def inner(remaining: LinkedList[A], prefix: String, postfix: String): String =
remaining match
case Nil() => prefix + "Nil()" + postfix
case Cons(h, t) => inner(t, prefix + "Cons(" + h.toString + ", ", postfix + ")")
end match
inner(this, "", "")
end toString


Observe the underline _ in the case case Cons(_, t) => inner(t, result + 1) in the inner function of the length method. This tells the Scala compiler that “we are not interested in the _head value of the list in this case, do not generate the code needed for matching it”. It also increases readability as we need not introduce variables that we do not actually use.