Our first multithreaded program in Scala
Let us now get our hands dirty with multithreading in Scala. A common pattern in concurrent programming is that a master thread starts a number of worker threads that proceed independently with the work that the master has delegated to them.
Our first multithreaded program will use the main thread to start three worker threads that will report their progress to the console.
In Scala, a new thread of execution is prepared by extending the trait Runnable and providing an implementation for the abstract method run(). This implementation should contain the code we want to run in a thread. When the method returns, the thread stops.
Here is the code that we want to run in the worker threads:
class SumWorker(name: String, target: Int, report: Int) extends Runnable:
  // This method contains the code we want to run in a thread
  def run(): Unit =
    // ... say, we want to take the sum of the integers i = 1,2,...,target,
    // and issue a progress report after every "report" integers
    var s = 0L
    var i = 1
    while i <= target do
      s = s + i
      if i % report == 0 then
        println("Worker %s reporting at i = %d".format(name, i))
      i = i + 1
    end while
    println("Worker %s is done and reports sum s = %d".format(name, s))
    // The thread stops when run() returns, i.e. here
  end run
end SumWorker
Observe that you can copy and paste the code above to the Scala console and nothing happens yet. Indeed, nothing should happen since we are simply declaring a new class.
Threads in Scala are objects instantiated from the class Thread. To get a new thread started, we first prepare (construct) the thread by handing a Runnable object (that contains the code we want to run) to the constructor. Let us prepare three SumWorker runnables and three threads to run the workers:
val wa = new SumWorker("SumA", 100000000, 10000000) // prepare worker A
val wb = new SumWorker("SumB", 100000000, 10000000) // prepare worker B
val wc = new SumWorker("SumC", 100000000, 10000000) // prepare worker C
val ta = new Thread(wa) // prepare a thread for worker A
val tb = new Thread(wb) // prepare a thread for worker B
val tc = new Thread(wc) // prepare a thread for worker C
Again observe that nothing happens yet if you copy and paste what is above to the console. The threads are now ready to run, but not yet running.
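One way to see the difference between a prepared thread and a running one is to ask the thread for its state. The sketch below is only an illustration: the class Hello is a hypothetical stand-in for a worker, while getState() and isAlive() are standard methods of java.lang.Thread.

```scala
// Hypothetical stand-in for a worker; any Runnable would do here
class Hello extends Runnable:
  def run(): Unit = println("Hello from a worker thread")

val th = new Thread(new Hello)   // prepared (constructed), but not started
println(th.getState())           // prints NEW
println(th.isAlive())            // prints false
```

Nothing is printed by Hello itself, because its run() method has not been executed by any thread yet.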
We can start a thread by calling its start() method. Copy and paste the following to the console:
// Start threads of workers a, b, and c.
// One line with ; in between commands so we do not have to press
// return three times in the console
ta.start() ; tb.start() ; tc.start()
The threads are now running, and we observe reports from the running threads at the console:
Worker SumB reporting at i = 10000000
Worker SumA reporting at i = 10000000
Worker SumB reporting at i = 20000000
Worker SumC reporting at i = 10000000
Worker SumA reporting at i = 20000000
Worker SumC reporting at i = 20000000
Worker SumB reporting at i = 30000000
Worker SumA reporting at i = 30000000
Worker SumC reporting at i = 30000000
Worker SumB reporting at i = 40000000
Worker SumA reporting at i = 40000000
Worker SumC reporting at i = 40000000
Worker SumB reporting at i = 50000000
Worker SumA reporting at i = 50000000
Worker SumC reporting at i = 50000000
Worker SumB reporting at i = 60000000
Worker SumA reporting at i = 60000000
Worker SumC reporting at i = 60000000
Worker SumB reporting at i = 70000000
Worker SumC reporting at i = 70000000
Worker SumA reporting at i = 70000000
Worker SumB reporting at i = 80000000
Worker SumC reporting at i = 80000000
Worker SumB reporting at i = 90000000
Worker SumA reporting at i = 80000000
Worker SumC reporting at i = 90000000
Worker SumB reporting at i = 100000000
Worker SumB is done and reports sum s = 5000000050000000
Worker SumA reporting at i = 90000000
Worker SumC reporting at i = 100000000
Worker SumC is done and reports sum s = 5000000050000000
Worker SumA reporting at i = 100000000
Worker SumA is done and reports sum s = 5000000050000000
Observe how the threads are running in parallel, and independently of each other. In particular, the workers finish in the order B,C,A, which appears arbitrary and is not in the order the threads were started.
In fact, if you run the code above at the console yourself (which you should do!), you are likely to witness a different ordering from what is above.
What is going on?
Asynchronous execution
By default, threads execute asynchronously. That is to say, as programmers we cannot assume anything about how the executions of different threads are related to each other in time, unless we explicitly introduce dependency between the threads through synchronization.
Before proceeding with synchronization, let us look at a second example that gives a more vivid illustration of asynchronous execution.
Let us start 10 threads that send greetings to the console. First, a class for greetings:
class Greeter(id: Int, num: Int) extends Runnable:
  def run(): Unit =
    for i <- 0 until num do
      print("%d".format(id)) // greet by printing a single digit
  end run
end Greeter
Then we make sure to start a number of them:
val n = 10 // ten greeters -- one for each digit 0,1,...,9
val num_greetings = 200 // each greeter sends 200 greetings
val greeters = (0 until n).map(id => new Greeter(id, num_greetings))
val threads = greeters.map(new Thread(_))
threads.foreach(_.start())
Copy and paste the code above to the console. What you should observe is a sequence of 2000 digits 0,1,…,9. That is, 200 greetings from each of the 10 threads. Repeat the copy-and-paste a few times, and observe how you get a different sequence every time.
Here is an example of one repeat:
04011111111111111111111111111111100000444444444444444444444444444444444444444444444444487778088889999866661222133333355553112626668889999999900007444477090000088866621212222113232222533332222221611118666808888897774497777888066661112333333535555552222166666610111118788894449999878888717777011116222266533333356662251555555555500007778789999984444999700005551222222263333363633333333222221215550079444448444448999977011111152222336222233351111007171111977777788889499999888887111070555500032222226666666222233305000077775755555188888181888888199994491999999858557070000733332222262666663733333050555550888989000091119499108585858376262737385858588505050519194915050583837327226273785050505141949494915058737232363232323737387858501914149095959878383838383838383826268237393939393935040144045353535353535353535397282826262628793535444010140404535959597982826282729535340101434343459727272782868686267979797979545431010303454494767628267649599390301030395456565752778725252649390101934646265757875756524393133909093931425252626278787878787872626262545414141434949494049493915252678787878686252151593949094345454541268676767676767682114153939090909035551541412822676726814153030390501048686828282828787272646666464015955555555539393951510101014146278787878726641405050509993939054141627878727676141505593939501040467627272787828282828286404010593935010146862622777864646464610105050303030309093535353164648782827224661359595909005031313136424242478742424636101159595951010606030234343737878834202026262621595951515252604343434848484747483030303030303625251595555551212111162222300000000333333333848787434343434343020061515951556020203434478743404202020651919191515151515151616024343787343420260601519595106626242387832324260606060611595161012424338788888424040106565960000142887872721010169616102787201010101010106960606061662727277777787876767606060606969666070777887866667979797976868789879767698967678988767689868686868786969696987898689897978986898987969796767976767699977979797979797979799797999999
The ten threads are writing their greetings independently of each other, in an essentially arbitrary order. Thus, as programmers we really cannot assume anything about how the threads run in relation to each other, unless we take care to synchronize their execution.
Synchronization
Synchronization refers to actions taken in a thread to enforce dependencies between the executions of different threads.
Two natural reference points in the execution of a thread are when the thread starts and when the thread exits.
The action of starting a thread by calling its start() method is perhaps the simplest form of synchronization between threads. Namely, the thread that performs the action (the thread that calls start()) and the thread that starts are not independent in their executions since one must start before the other. This synchronization is however rather weak, since after one thread starts the other, the two threads execute asynchronously unless further synchronization actions are taken.
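The ordering imposed by start() can be illustrated with a small sketch. The classes Box and BoxReader are hypothetical names introduced only for this example. The sketch relies on the guarantee of the Java memory model that everything the calling thread did before start() is visible to the started thread.

```scala
// Hypothetical helper classes, for illustration only
class Box:
  var value = 0                  // plain mutable field, no synchronization of its own

class BoxReader(box: Box) extends Runnable:
  var seen = -1                  // what the started thread observed
  def run(): Unit =
    seen = box.value
    println("Reader thread sees value = %d".format(seen))

val box = new Box
val boxReader = new BoxReader(box)
val tr0 = new Thread(boxReader)
box.value = 42                   // written by the calling thread before start()
tr0.start()                      // the started thread sees value = 42
```

Had the assignment to box.value been placed after the call to start(), the reader could have seen either 0 or 42, since from that point on the two threads run asynchronously.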
The action of joining with a thread, that is, waiting until the thread exits, is another form of synchronization between threads. A thread joins with another thread by calling its join() method, which returns after the thread exits. Joining is a very strong form of synchronization because the execution of the calling thread is suspended (blocked) until the other thread exits. That is, we are enforcing that the execution of the calling thread can resume only after the execution of the other thread has taken place.
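As a minimal sketch of starting and joining, the following variant of the sum worker stores its result in a field instead of printing it. The class SumTask and the small targets are assumptions made for this illustration. The calling thread reads the results only after join() has returned for every worker.

```scala
// Hypothetical variant of the sum worker that keeps its result in a field
class SumTask(target: Int) extends Runnable:
  var result = 0L                // written by the worker, read by the caller after join()
  def run(): Unit =
    var i = 1
    while i <= target do
      result = result + i
      i = i + 1

val tasks = Seq(new SumTask(1000), new SumTask(2000), new SumTask(3000))
val sumThreads = tasks.map(new Thread(_))
sumThreads.foreach(_.start())    // the workers now run asynchronously
sumThreads.foreach(_.join())     // block until every worker has exited
val total = tasks.map(_.result).sum
println("All workers done, total = %d".format(total))  // total = 7003000
```

Without the calls to join(), the calling thread could read a result field while the corresponding worker is still updating it, so the printed total would not be reliable.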
Many other forms of synchronization exist, but for now we will be content with starting and joining threads. The more challenging exercises will explore some further possibilities for synchronization.
Why synchronization?
While it would be great if our threads could execute completely independently of each other, in most cases our threads need to use common resources, such as memory or operating system services.
For example, one thread may want to set (write) the value of a variable and another thread may want to use (read) the value. Without synchronization between the two threads we have no control over whether the thread that reads will obtain the value of the variable (a) as it was before the write or (b) as it is after the write.
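The situation can be sketched with one writer thread and one reader thread that share a variable. The classes SharedCell, CellWriter, and CellReader are hypothetical names used only for this illustration.

```scala
// Hypothetical helper classes, for illustration only
class SharedCell:
  var value = 0                  // the shared variable, initially 0

class CellWriter(cell: SharedCell) extends Runnable:
  def run(): Unit =
    cell.value = 42              // the write

class CellReader(cell: SharedCell) extends Runnable:
  var seen = -1
  def run(): Unit =
    seen = cell.value            // the read: before or after the write?
    println("Reader saw value = %d".format(seen))

val cell = new SharedCell
val cellReader = new CellReader(cell)
val tw = new Thread(new CellWriter(cell))
val tr = new Thread(cellReader)
tw.start() ; tr.start()          // nothing orders the write relative to the read
```

The reader may report either 0 or 42, and which of the two it reports can differ between runs. Starting the reader only after joining with the writer would force the outcome 42.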
With synchronization we can obtain control over how the threads execute in relation to each other. Our intent in this round, however, is not to enter into the intricacies of thread synchronization. (Concurrent programming in its full generality is a subtle and difficult topic that easily deserves a course or two on its own. Interested students may want to consider CS-E4580 Programming Parallel Computers and CS-E4110 Concurrent Programming D.)