More on functions

Let us deepen our knowledge of functions a bit more, in particular we want to increase our understanding precisely what happens (a) when we define a function, and (b) how the arguments to a function are treated when we call (or invoke or apply) the function.

Our study is divided into two parts:

  • Closure, the technique of “closing” a function when it is defined, that is, mapping the free variables in a function to variables in its defining environment. This allows the function to refer to the variables in the context it was created, even if it is applied in a totally different context (e.g., inside a method of a collection class object).

  • Call-by-value and call-by-name evaluation modes of function arguments. Call-by-value (“eager” or “strict” are also used) is the one we have seen and used so far: before a function can be evaluated, its arguments must be evaluated. Call-by-name (or “lazy” or “non-strict”) evaluation on the other hand evaluates a function argument only when it is needed (if it is needed) inside the function.

Closure

Let us recall our earlier simple primality test function

def isPrimeSlow(n: Int): Boolean = {
  require(n > 0)
  var divisor = 2
  while(divisor < n) {
    if(n % divisor == 0) return false
    divisor += 1
  }
  return true
}

We can write this function in functional programming style as

def isPrimeSlow(n: Int): Boolean = {
  require(n > 0)
  (2 until n).forall(divisor => n % divisor != 0)
}

Let us now take a closer look at the anonymous function “divisor => n % divisor != 0” and recall that it is roughly a short-hand for

def isPrimeSlow(n: Int): Boolean = {
  require(n > 0)
  val anonf = new Function1[Int,Boolean] {
    def apply(divisor: Int): Boolean = (n % divisor != 0)
  }
  (2 until n).forall(anonf(_))
}

The anonymous function does not have any parameter n, only divisor. Therefore, we say that

n is a free variable in the function.

Naturally, one cannot evaluate function’s value if it has free variables. Rather each free variable must be bound to some variable, that is, we must build a closure consisting of the function and the referencing environment, that is, information to which variables the free variables reference to. In Scala, each free variable refers to the variable with the same name in the closest lexical scope. In our example, the free variable n refers to the local variable n of the enclosing function isPrimeSlow.

Use of closure is not restricted to functional programming in any way. For instance, consider the following console session:

scala>  var prefix = "A> "
prefix: String = "A> "

scala>  var nofLinesOutput = 0
nofLinesOutput: Int = 0

scala>  def p: (String => Unit) = s => {
    |    Console.println(prefix + s)
    |    nofLinesOutput += 1
    |  }
p: String => Unit

scala>  val l = List("one", "two", "three")
l: List[String] = List(one, two, three)

scala>  l.foreach(p)
A> one
A> two
A> three

scala>  prefix = "B> "
prefix: String = B>

scala>  l.foreach(p)
B> one
B> two
B> three

scala>  nofLinesOutput
res2: Int = 6

The function p in the code has two free variables, prefix and nofLinesOutput, which are bound to the variables described in the previous lines. Now observe that after the first execution of l.foreach(p _), we modify the values of these variables. Then in the second execution of l.foreach(p _), the function p sees these new modified values. This illustrates that

the free variables are bound to variables, not to the values of those variables at the time when the function instance is created.

Also observe that when p is executed inside the foreach method of the list object l as f in the code

@inline override final
 def foreach[B](f: A => B) {
   var these = this
   while (!these.isEmpty) {
     f(these.head)
     these = these.tail
   }
 }

there are no variables prefix or nofLinesOutput available here. That is,

the binding of the free variables in a function is determined by the lexical scope in which the function is defined, not where the function is applied. (In the latter case we say that the binding is determined by dynamic scope.)

Closure and lexical scoping is implemented automatically by the Scala compiler. The compiler finds out to which variables the free variables should be bound to and adds this information to the function object it creates.

Call-by-value and call-by-name

In Scala, the arguments to functions and methods are by default passed according to what is called call-by-value semantics. That is, the values of the arguments are first evaluated, one by one, until all arguments have values, after which the the function or method is invoked with its parameters bound to these values. For example, consider the following code:

scala> def f(n: Int) = n*n*n
f: (n: Int)Int

scala> def countPrimes(last: Int, doLog: Boolean = false) = {
     |     if(doLog) Console.println("g: Start with last = "+last)
     |     val result = (1 to last).count(isPrimeSlow _)
     |     if(doLog) Console.println("g: End")
     |     result
     |   }
countPrimes: (last: Int, doLog: Boolean)Int

scala> val i = 7
i: Int = 7

scala> val debugMode = false
debugMode: Boolean = false

scala> val testMode = true
testMode: Boolean = true

scala> countPrimes(f(i+1), debugMode || testMode)
g: Start with last = 512
g: End
res0: Int = 98

Consider the call countPrimes(f(i+1), debugMode || testMode). Before the function countPrimes is called, the arguments to the function will be evaluated first.

  • We start with the first argument, f(i+1). Since it is also a function call, its argument i+1 will be evaluated first. As i+1 evaluates to 8, we next call f(8) and get the value of the first argument value to countPrimes as a return value, which is 512.

  • We then evaluate the second argument, debugMode || testMode. As evaluating debugMode gives false, we must also evaluate testMode. As a result, we get that the second argument value to countPrimes is true.

After this we invoke countPrimes(512, true).

Call-by-name

An alternative way to pass arguments to functions is what is called call-by-name semantics. In this case arguments are not evaluated before the function or method is called. Rather each argument is evaluated only when it is actually needed to compute results inside the function or method.

As the first example, let us consider the require method that have seen many times earlier. This method is defined in the scala.Predef object as:

final def require(requirement: Boolean, message: => Any) : Unit = {
  if (!requirement)
    throw new IllegalArgumentException("requirement failed: "+message)
}

Observe the type “=> Any” of the message parameter to the function. This means that the second argument is passed as “call-by-name”, that is, it is actually a function with no parameters. Suppose we now write, for example,

require(l.forall(_ >= 0), "The list must contain only positive values but has " + l.find(_ < 0).get)

In this case the Scala compiler interprets the expression "The list must contain only positive values but has " + l.find(_ < 0).get as a nullary anonymous function of type => String. This means that the second argument is not evaluated at all when the the list l contains only positive values. On the other hand, when the requirement l.forall(_ >= 0) fails, the expression "requirement failed: "+message is evaluated and thus also the expression "The list must contain only positive values but has " + l.find(_ < 0).get gets evaluated.

Had we defined the second argument of require as call-by-value, that is,

final def badRequire(requirement: Boolean, message: Any) : Unit = {
  if (!requirement)
    throw new IllegalArgumentException("requirement failed: "+message)
}

Then the line

badRequire(l.forall(_ >= 0), "The list must contain only positive values but has " + l.find(_ < 0).get)

will throw java.util.NoSuchElementException: None.get on all lists l that fulfill the requirement l.forall(_ >= 0): the second argument gets always evaluated but l.find(_ < 0) returns None, which does not have a method called get.

As a second example, let us extend our earlier indentedWriter example so that it supports dynamically evaluated prefixes:

class indentedWriter(s: java.io.PrintStream, prefix: => String = "") {
  private var level = 0
  def println(obj: Any) = s.println(prefix + (" "*level) + obj)
  def push : Unit = {level += 2 }
  def pop : Unit =  {level = 0 max level-2 }
}

Now the value prefix in an object of this class is not a string but rather a function that provides the string when it is needed. For example, we can now make a log writer that prefixes each line with the current time:

scala> val log = new indentedWriter(Console.err, (new java.util.Date()).toString()+": ")
log: indentedWriter = indentedWriter@bf37fa

scala>  def f(n: Int) = {
     |    log.push
     |    log.println("starting computation in function f")
     |    // Do some interesting work here
     |    val result = (1 to n).foldLeft(0L)((s, v) => s + v*v)
     |    log.println("done with the computation")
     |    log.pop
     |    result
     |  }
f: (n: Int)Long

scala>  def g = {
     |    log.println("Entering function g")
     |    val result = f(100000000)
     |    log.println("Leaving function g")
     |    result
     |  }
g: Long

scala>  val s = g
Mon Mar 10 12:50:33 EET 2014: Entering function g
Mon Mar 10 12:50:33 EET 2014:   starting computation in function f
Mon Mar 10 12:50:35 EET 2014:   done with the computation
Mon Mar 10 12:50:35 EET 2014: Leaving function g
s: Long = 20049330185600