7. Midterm Project
In this project we will be studying thread pools, a very useful concurrent programming pattern that allows one to use a fixed number of threads to execute an unbounded number of concurrent tasks. In a way, implementing a thread pool is similar to implementing a multiprocessor mechanism that allocates a particular CPU to a given thread. In reality, we typically have far fewer CPUs than threads running concurrently in a program, so some of the threads are forced to share processors. Furthermore, many runtimes (Java/Scala included) do not allow an arbitrarily large number of threads to be spawned at the same time (remember that in our examples the numbers of threads were in the tens, maybe hundreds). Yet, it would be undesirable for a programmer to worry about the allotted number of threads that she is allowed to use at any moment. Thread pools provide a convenient mechanism to lift this limitation by using a fixed number of threads to execute an arbitrary number of “concurrent” computational tasks.
Those of you who have completed the midterm project for the latest edition of YSC2229 might think of thread pools as “resource allocators” for threads. The only difference here is that instead of providing transparent access to some (de facto limited) memory space, thread pools provide transparent access to a limited number of concurrent threads that can execute in parallel.
This project is structured as a sequence of problems, each building on the previous one. Not all problems involve concurrency, but all of them should be completed in order to get the full project grade. Some of the posed problems are open-ended and require you to experiment with your design. Your solutions for those will be evaluated based on their performance, clarity, and elegance.
Your deliverables for this project must include:
- A link to a tagged GitHub release with the implementation based on the provided project template.
- A PDF file with the report describing your discoveries, experiments, design decisions, and related thoughts. The text below, as well as some of the files in the provided project template, will contain questions in comments. You are encouraged to elaborate on those in your narrative.
The link to the project template will be provided on Canvas. The template contains many hints and specifications in the comments. Make sure to read and understand them before starting with your implementations.
7.1. Problem 1 (Warm-Up)
As you know, Quicksort is one of the best sorting algorithms used in practice. Let us start by implementing a simple sequential version of Quicksort and making sure that it works correctly. To do so, please, check and, when requested, modify the following classes/objects/traits in the project:
- Sorting - a generic interface for in-place array sorting algorithms. The type of its sort(...) method is Unit, indicating that no new array is constructed.
- SimpleQuicksort - an implementation of a classical in-place Quicksort algorithm for an array (a small standalone sketch of such an algorithm is given after this list).
- ArrayUtil - contains a number of utility methods useful for working with arrays and testing sorting algorithms. Make sure to implement those that have partially missing implementations. When working with it, do not remove the import Ordering.Implicits._ line, as otherwise the ordering operators, such as <, >, etc., will not be available for values of type T: Ordering.
- SortingTests - provides generic testing machinery for different array-sorting algorithms. Don’t forget to add some adequate checks for the sorted result to ensure that your algorithm does indeed produce a sorted version of its input array.
- SimpleQuicksortTests - a specific test suite for SimpleQuicksort, which should run without assertion failures once you implement the components listed above. Feel free to fiddle with the size of the array being sorted and examine the trends in performance.
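To give an idea of the expected shape of the code, here is a minimal, self-contained sketch of a sequential in-place Quicksort. The trait and object names (MySorting, MySimpleQuicksort) and the exact signatures are hypothetical and only illustrate the idea; follow the actual interfaces in the template.

    import Ordering.Implicits._

    // A hypothetical trait mirroring the in-place sorting interface:
    // sort(...) returns Unit, so the input array is modified in place.
    trait MySorting {
      def sort[T: Ordering](a: Array[T]): Unit
    }

    object MySimpleQuicksort extends MySorting {
      def sort[T: Ordering](a: Array[T]): Unit = sortSegment(a, 0, a.length - 1)

      // Classical recursive in-place Quicksort on the segment a(lo..hi).
      private def sortSegment[T: Ordering](a: Array[T], lo: Int, hi: Int): Unit =
        if (lo < hi) {
          val p = partition(a, lo, hi)
          sortSegment(a, lo, p - 1)
          sortSegment(a, p + 1, hi)
        }

      // Lomuto partition: moves the pivot a(hi) into its final position and
      // returns that position; smaller elements end up to its left.
      private def partition[T: Ordering](a: Array[T], lo: Int, hi: Int): Int = {
        val pivot = a(hi)
        var i = lo
        for (j <- lo until hi)
          if (a(j) <= pivot) { swap(a, i, j); i += 1 }
        swap(a, i, hi)
        i
      }

      private def swap[T](a: Array[T], i: Int, j: Int): Unit = {
        val t = a(i); a(i) = a(j); a(j) = t
      }
    }

For instance, MySimpleQuicksort.sort(Array(3, 1, 2)) would rearrange the array into Array(1, 2, 3) in place.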
7.2. Problem 2: Parallel Quicksort
A really nice property of Quicksort is that, due to its “divide-and-conquer” nature, it is naturally parallelisable. That is, its recursive sub-calls, performed after the partitioning step, work on disjoint sub-arrays. Therefore, we can run those calls concurrently in different threads, spawned for each such sub-call. Hopefully, this will allow us to make good use of our fancy parallel processors. Indeed, we should be able to sort an array much faster, as the work will now be split between parallel threads changing its disjoint parts. By the way, do we need any form of synchronisation between those threads, and if we do, how should we do it (answer this in your report)?
In this problem, please, inspect and, when requested, change the following files in the project template:
- ParallelQuicksort - implement sort() so that it creates new threads for parallel sorting. Use start() and join() to control the lifetime of the child threads and to synchronise them with the parent thread that spawned them (one possible shape of this is sketched after the list).
- ParallelQuicksortTests - this test should succeed once you implement ParallelQuicksort.
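To illustrate the idea (not the template’s code), here is a hedged sketch of how the recursive sub-calls could be delegated to freshly spawned threads; it assumes Scala 2.12 or later (for the function-to-Runnable conversion) and reuses the same Lomuto partition as in the sequential sketch above.

    import Ordering.Implicits._

    object MyParallelQuicksort {
      def sort[T: Ordering](a: Array[T]): Unit = sortSegment(a, 0, a.length - 1)

      private def sortSegment[T: Ordering](a: Array[T], lo: Int, hi: Int): Unit =
        if (lo < hi) {
          val p = partition(a, lo, hi)
          // The sub-calls work on disjoint parts of `a`, so they do not race
          // with each other; the parent only looks at `a` again after join().
          val left  = new Thread(() => sortSegment(a, lo, p - 1))
          val right = new Thread(() => sortSegment(a, p + 1, hi))
          left.start(); right.start()
          left.join(); right.join()
        }

      private def partition[T: Ordering](a: Array[T], lo: Int, hi: Int): Int = {
        val pivot = a(hi); var i = lo
        for (j <- lo until hi) if (a(j) <= pivot) { swap(a, i, j); i += 1 }
        swap(a, i, hi); i
      }

      private def swap[T](a: Array[T], i: Int, j: Int): Unit = {
        val t = a(i); a(i) = a(j); a(j) = t
      }
    }

Note that this naive scheme spawns a new pair of threads for every large enough recursive call, which is exactly the kind of unbounded thread creation the later problems will address with a thread pool.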
By running the tests, we can compare the performance of the two algorithms. What are your thoughts about this? Can you explain the observed phenomena with regard to the execution times?
7.3. Problem 3: Thread Pool
Let us try a different strategy for adding concurrency into inherently parallelisable tasks, such as Quicksort, by means of introducing a thread pool. This is the largest sub-problem in the project, which will require you to study and modify a number of provided files. The project template contains a number of ready-to-use examples that will allow you to experiment with your thread pool implementation.
The following classes and files are of interest:
ThreadPool is the main class implementing the thread pool. Some parts of the implementation are already provided and are explained in the comments. A class instance takes as an argument the size of the pool, i.e., the number of “worker” threads that it will maintain to execute the client-supplied tasks.
The most interesting component of the ThreadPool implementation is the field taskQueue. Notice that this is a mutable (linked), non-thread-safe queue from the standard Scala library. It will contain the user-provided tasks, which will be executed by the fixed number of “worker” threads maintained by the pool. Notice that we take advantage of Scala’s functional nature, representing tasks as functions of type Unit => Unit (you can think of them as thunks: “bottled” computations that will fire when applied to the unit value ()). The task queue can grow arbitrarily large, depending on the number of tasks scheduled for execution by the threads in the pool.
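As a tiny illustration of this representation (e.g., in the Scala REPL), nothing happens when a thunk is created, only when it is applied to ():

    // A task is just a function from Unit to Unit.
    val task: Unit => Unit = _ => println("I am a bottled computation")
    task(())   // "fires" the thunk, printing the message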
The second important component is the array workers, containing a fixed number of threads that will execute all those tasks. The main trick of a thread pool is that threads are not created whenever a new concurrent computation needs to be executed. Instead, we have a fixed array (the pool) of constantly running threads that “pick up” tasks from the queue and execute them. Whenever there are no tasks, the threads are “waiting” for something to be added to the task queue. When a worker thread starts executing a task, it removes it from the queue.
An auxiliary boolean array workInProgress is used to record which of the threads are currently executing tasks. It will come in useful for determining whether all pending computations have been completed.
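To make the above concrete, here is a hedged sketch of how this state might be declared; the names and declarations below are illustrative assumptions, and the template’s actual code may differ:

    import scala.collection.mutable.Queue

    class MyThreadPool(size: Int) {
      // User-submitted tasks; a plain mutable queue, so all accesses to it
      // must be guarded by a common monitor.
      private val taskQueue = new Queue[Unit => Unit]()

      // The fixed set of constantly running worker threads.
      private val workers = new Array[Worker](size)

      // workInProgress(i) is true while workers(i) is executing a task;
      // together with an empty taskQueue this indicates that the pool is idle.
      private val workInProgress = new Array[Boolean](size)

      // The worker thread class; a possible run loop is sketched further below.
      private class Worker(id: Int) extends Thread
    }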
The exception ThreadPoolException is used to indicate invalid usage of the thread pool. Feel free to throw it in the appropriate situations.
The inner class Worker implements the functionality of a worker thread. It is structured in three parts (a possible shape of its run loop is sketched after the list):

- Prelude: waiting for new tasks to be provided. This part requires synchronisation with the other worker threads and the client.
- Body: running some task. Does this part require synchronisation? Please, elaborate.
- Epilogue: letting others know that you have completed this task, doing some bookkeeping, and moving on to the next task. Do we need some synchronisation here?
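For illustration only, here is a hedged sketch of one possible run loop for such a worker, continuing the hypothetical MyThreadPool sketch above and using the task queue itself as the shared monitor. The template may prescribe a different synchronisation scheme; the comments describe this particular design choice, not the required solution.

    // Replaces the Worker stub inside MyThreadPool from the earlier sketch.
    private class Worker(id: Int) extends Thread {
      override def run(): Unit =
        try {
          while (true) {
            // Prelude: wait until a task appears, then remove it from the queue.
            val task: Unit => Unit = taskQueue.synchronized {
              while (taskQueue.isEmpty) taskQueue.wait()
              workInProgress(id) = true
              taskQueue.dequeue()
            }
            // Body: run the task outside the critical section, so that other
            // workers can pick up tasks in the meantime.
            task(())
            // Epilogue: record that this worker is idle again and wake up
            // anyone waiting for the pool to become quiet.
            taskQueue.synchronized {
              workInProgress(id) = false
              taskQueue.notifyAll()
            }
          }
        } catch {
          // Graceful shutdown: an interrupt() delivered while waiting lands here.
          case _: InterruptedException => ()
        }
    }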
The ThreadPool class provides three methods available to its clients:

- shutdown() terminates all worker threads in the pool. Typically, it is used by the client when there is no more need for the pool and all its threads can be put to rest. I suggest implementing this method using the interrupt() method of the thread class. Calling this method on a thread t that is blocked on the await() method of some monitor makes t throw an InterruptedException and terminate its waiting and its execution. This exception can be caught and handled appropriately - a pattern known as Graceful Shutdown of a thread. The provided object InterruptThreadExample shows an example of using this functionality on a single thread (a small standalone illustration of the same pattern is also given after this list).

- The method async(task: Unit => Unit) takes a task from the user and “schedules” it for execution by some worker thread. Since there might be more tasks in the queue than workers, it is not guaranteed that the task will be executed immediately. Check the comments in the code and work out the way threads are made aware of the new tasks. Once you have this method implemented, try running the object AsyncExample in IntelliJ. As a result, you should see output similar to the following:

      Task 3
      Task 1
      Task 2
      Task 5
      Task 4
      Task 7
      Task 6
      Task 8
      Task 9
      Task 10
      About to shut down the pool.

      Process finished with exit code 0

  There will also be a small delay right after the line Task 10 is printed.

- The method startAndWait(task: Unit => Unit): Unit is similar to async() in that it also schedules a provided task for execution by some worker thread. However, unlike async(), it should block the caller thread until all activity in the thread pool ceases. That is, this method’s intended use is to give rise to a bunch of concurrent tasks, enabled by the thread pool, and then wait for all those tasks (and also the tasks they might create) to complete. This way, the caller will be synchronised with all concurrent tasks executed by the thread pool. This is what we used to achieve via Thread.join() in the case of using native Java threads. Once implemented, you can experiment with this method (in conjunction with async() and shutdown()) by running the StartAndWaitExample object.
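For reference, here is a small, self-contained illustration of the graceful-shutdown pattern with hypothetical names; it mimics what InterruptThreadExample demonstrates, but it is not the template’s code:

    object GracefulShutdownSketch {
      def main(args: Array[String]): Unit = {
        val lock = new Object
        val t = new Thread(() =>
          try {
            // Block "forever" on the monitor.
            lock.synchronized { while (true) lock.wait() }
          } catch {
            case _: InterruptedException =>
              println("Interrupted while waiting; shutting down gracefully.")
          }
        )
        t.start()
        Thread.sleep(100) // let t reach the wait()
        t.interrupt()     // makes the wait() in t throw InterruptedException
        t.join()          // observe that t has indeed terminated
      }
    }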
7.4. Problem 4: Pooled Quicksort
It is time to get back to our Quicksort implementation and put the thread pool to good use. Inspect and modify the following files:
- PooledQuickSort - the object which should implement Quicksort via the thread pool. Just follow the comments in the file.
- PooledQuickSortTests - a test suite for PooledQuickSort.
Now let us run the three versions of Quicksort we have implemented. Are we happy with the result delivered by PooledQuickSort? What if we increase the array size? Can you explain the performance phenomena when comparing the execution of PooledQuickSort to those of SimpleQuickSort and ParallelQuickSort?
Hint
If you find your Quicksort implementations to be very slow on large arrays, check how you generate the arrays used for benchmarking. Recall the cases in which Quicksort doesn’t perform well in terms of complexity. How about arrays with little diversity in their elements (e.g., what if you’re trying to quick-sort an array of 10,000,000 numbers ranging between 1 and 10)?
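For instance, here is a hedged sketch of two generators one might use when benchmarking (hypothetical helper names, not part of the template):

    import scala.util.Random

    object BenchmarkArrays {
      // Many distinct values: rarely triggers Quicksort's degenerate cases.
      def randomIntArray(size: Int): Array[Int] =
        Array.fill(size)(Random.nextInt(Int.MaxValue))

      // Only ten distinct values: partitions can become very unbalanced,
      // pushing a naive Quicksort towards quadratic running time.
      def lowDiversityArray(size: Int): Array[Int] =
        Array.fill(size)(1 + Random.nextInt(10))
    }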
7.5. Problem 5: Hybrid Quicksort
Finally, it’s time to unleash your creativity and experiment with different flavours of concurrent sorting to get the best of both worlds, single-threaded and parallel:

- HybridQuickSort - implement your own Quicksort-based sorting strategy in this object, with the aim of fixing the shortcomings of the previous three algorithms. Feel free to experiment with different heuristics and parameters (one possible starting point is sketched after the list).
- HybridQuickSortTests - use this file to test your hybrid sorting algorithm.
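As one possible (hedged) starting point, here is a sketch of a simple size-cutoff heuristic: go parallel only for large segments and fall back to plain sequential recursion below the cutoff. The object name, the cutoff value, and the use of raw threads (rather than your thread pool) are all illustrative assumptions, to be tuned and revisited in your own design.

    import Ordering.Implicits._

    object MyHybridQuicksort {
      // Arbitrary cutoff: segments smaller than this are sorted sequentially.
      private val Cutoff = 100000

      def sort[T: Ordering](a: Array[T]): Unit = sortSegment(a, 0, a.length - 1)

      private def sortSegment[T: Ordering](a: Array[T], lo: Int, hi: Int): Unit =
        if (lo < hi) {
          val p = partition(a, lo, hi)
          if (hi - lo < Cutoff) {
            // Small segment: plain recursion, no thread-creation overhead.
            sortSegment(a, lo, p - 1)
            sortSegment(a, p + 1, hi)
          } else {
            // Large segment: sort the two halves in parallel (raw threads here;
            // a pool-based variant would submit tasks instead).
            val left  = new Thread(() => sortSegment(a, lo, p - 1))
            val right = new Thread(() => sortSegment(a, p + 1, hi))
            left.start(); right.start()
            left.join(); right.join()
          }
        }

      private def partition[T: Ordering](a: Array[T], lo: Int, hi: Int): Int = {
        val pivot = a(hi); var i = lo
        for (j <- lo until hi) if (a(j) <= pivot) { swap(a, i, j); i += 1 }
        swap(a, i, hi); i
      }

      private def swap[T](a: Array[T], i: Int, j: Int): Unit = {
        val t = a(i); a(i) = a(j); a(j) = t
      }
    }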
For the grand finale, let us check the absolute performance of the four sorting algorithms. Use the file SortingBenchmarks to compare the implementations on arrays of different sizes. The benchmark suite also includes ScalaSort, a fine-tuned default Scala library implementation of array sorting. Can you beat it in terms of performance? Use the benchmarks to drive your experiments in search of a better sorting algorithm that uses the full potential of the parallel multiprocessors.
Make sure to document all your gotchas in your report.
7.6. Report
For grading the reports, I will be taking the following aspects into account:
- Clarity of the descriptions: how well the report explains the design of the main components (thread pool, hybrid sort, etc.).
- Observations: how well the report explains the different observed concurrency phenomena regarding performance (and possibly correctness).
- Story: the report is not too boring and does not simply paraphrase the code or show its parts verbatim.
Good luck!