3.3. Generating Arrays¶
Searching and sorting are the two main procedures one perfroms on arrays, and this is why the performance of the corresponding algorithms plays crucial role in practical applications. In this chapter we will explore more efficient ways to do so.
In order to not duplicate the development let us include some of the files from the previous weeks, as they already contain a number of useful functions. This can be done in OCaml by first compiling the corresponding file and then loading it in a current development. For instance, if the previous week’s file has been called week_02.ml, it can be included into the current one by first running the following command from the terminal:
ocamlc week_02.ml
and then, assuming that the current file is in the same folder, adding the following lines to the current development:
#load "week_02.cmo";;
open Week_02
3.3.1. Simple random generators¶
To test not only the correctness, but also the performance of our search and sorting algorithms, let us invest some time into creating the procedures for random arrays. In this chapter and further we will consider slightly more interesting arrays than just arrays of integers of type int array
. Specifically, each element of an array will hold a pair: a key (typically an integer), identifying an element, and a value, which carries some interesting payload. In a general case, some elements might have duplicating keys, and different keys can correspond to the same element. Do not confuse the array indices (which are used for efficient access to specific array entries) with element keys (which are domain-specific and can be anything identifying the corresponding payload).
Let us start from implementing a random generator for lists of rundom numbers in a range from 0
to a given bound
, of a specified length len
:
let generate_keys bound len =
let acc = ref [] in
for i = 0 to len - 1 do
acc := (Random.int bound) :: ! acc
done;
!acc
Notice, that for the sake of efficiency (and diversity) the program is implemented via a loop rather than as a recursion.
Our next procedure is more interesting and will generate strings of a fixed length
containing lowercase characters of the standard latin alphabet:
let generate_words length num =
let random_ascii_char _ =
let rnd = (Random.int 26) + 97 in
Char.chr rnd
in
let random_string _ =
let buf = Buffer.create length in
for i = 0 to length - 1 do
Buffer.add_char buf (random_ascii_char ())
done;
Buffer.contents buf
in
let acc = ref [] in
for i = 0 to num - 1 do
acc := (random_string ()) :: ! acc
done;
!acc
The first function, random_ascii_char
generates a random lowercase ASCII character (of which there are 26), and 97 corresponds to 'a'
. random_string
will create a string up to the fixed length
. Finally, the main body function will add this string to the result list.
To generate arrays, let us first implement a familiar function iota
that creates a list of increasing natural numbers:
let iota n =
let rec walk acc m =
if m < 0
then acc
else walk (m :: acc) (m - 1)
in
walk [] n
Finally, we define an auxiliary function list_zip
, which is similar in its functionality to the standard function List.combine
, but, unlike the latter does not exhaust call stack, as it is implemented in Continuation-Passing Style and is, hence, in a tail-call form:
let list_zip ls1 ls2 =
let rec walk xs1 xs2 k = match xs1, xs2 with
| h1 :: t1, h2 :: t2 ->
walk t1 t2 (fun acc -> k ((h1, h2) :: acc))
| _ -> k []
in
walk ls1 ls2 (fun x -> x)
We can finally implement an generator for key-value arrays:
let generate_key_value_array len =
let kvs = list_zip (generate_keys len len) (generate_words 5 len) in
let almost_array = list_zip (iota (len - 1)) kvs in
let arr = Array.make len (0, "") in
List.iter (fun (i, kv) -> arr.(i) <- kv) almost_array;
arr
It can be used as follows:
# generate_key_value_array 10;;
- : (int * string) array =
[|(1, "emwbq"); (3, "yyrby"); (7, "qpzdd"); (7, "eoplb"); (6, "wrpgn");
(7, "jbkbq"); (7, "nncgq"); (1, "rruxr"); (8, "ootiw"); (7, "halys")|]
3.3.2. Measuring execution time¶
For our future experiments with algorithms and data structures, it is useful to be able to measure execution time, hence we implement the following helper function:
let time f x =
let t = Sys.time () in
let fx = f x in
Printf.printf "execution elapsed time: %f sec\n" (Sys.time () -. t);
fx
It can be used with any arbitrary computation that takes at least one argument.
3.3.3. Randomised array generation and testing¶
Let us re-implement insert-sort, so it would be useful for our new setting of arrays with key-value pairs and test its performance:
let new_insert_sort arr =
let len = Array.length arr in
for i = 0 to len - 1 do
let j = ref i in
while !j > 0 && (fst arr.(!j - 1)) > (fst arr.(!j)) do
swap arr !j (!j - 1);
j := !j - 1
done
done
# let a = generate_key_value_array 5000;;
val a : (int * string) array =
[|(894, "goavt"); (2768, "hvjjb"); (3535, "pbkoy"); (1615, "ybzua");
(2820, "ssriq"); (2060, "sfxsu"); (2328, "kjgff"); (112, "xuoht");
(1188, "xxfcs"); (2384, "xbwgb");
(1134, "oi"... (* string length 5; truncated *)); (3102, ...); ...|]
# time new_insert_sort a;;
execution elapsed time: 0.395832 sec
- : unit = ()
3.3.4. Exercise 4¶
Implement a function that generates takes (a) a sorting procedure sort
for a key-value array, (b) a number n
and a number length
, and generates n
random arrays of the length length
, testing that sort
is indeed correct on all those arrays.