# Lists

A list is an ordered sequence of elements. All elements of a list in OCaml must be the same type. Lists are built into the language and have a special syntax. Here is a list of three integers:

`# [1; 2; 3];;`

- : int list = [1; 2; 3]

Note semicolons separate the elements, not commas. The empty list is written
`[]`

. The type of this list of integers is `int list`

.

A list, if it is not empty, has a *head* (the first element) and a *tail* (the
list consisting of the rest of the elements). In our example, the head is the
integer `1`

while the tail is the list `[2; 3]`

. An empty list has neither a
head nor a tail. Here are some more lists:

`# [];;`

- : 'a list = []
# [1; 2; 3];;

- : int list = [1; 2; 3]
# [false; true; false];;

- : bool list = [false; true; false]
# [[1; 2]; [3; 4]; [5; 6]];;

- : int list list = [[1; 2]; [3; 4]; [5; 6]]

Notice the type of the empty list is `'a list`

(its element type is not known).
Notice also the type of the last list - `int list list`

or a list of lists of
integers.

There are two built-in operators on lists. The `::`

or cons operator, adds one
element to the front of a list. The `@`

or append operator combines two lists:

`# 1 :: [2; 3];;`

- : int list = [1; 2; 3]
# [1] @ [2; 3];;

- : int list = [1; 2; 3]

## Functions on lists

We can write functions which operate over lists by pattern matching:

```
# let rec total l =
match l with
| [] -> 0
| h :: t -> h + total t;;
```

val total : int list -> int = <fun>
# total [1; 3; 5; 3; 1];;

- : int = 13

Consider a function to find the length of a list:

```
# let rec length l =
match l with
| [] -> 0
| _ :: t -> 1 + length t;;
```

val length : 'a list -> int = <fun>

This function operates not just on lists of integers, but on any kind of list.

`# length [1; 2; 3];;`

- : int = 3
# length ["cow"; "sheep"; "cat"];;

- : int = 3
# length [[]];;

- : int = 1

Why is this? Because in the pattern `_ :: t`

the head of the list is not
inspected, so its type cannot be relevant. Such a function is called
polymorphic. Here is another polymorphic function, our own version of the `@`

operator for appending:

```
# let rec append a b =
match a with
| [] -> b
| h :: t -> h :: append t b;;
```

val append : 'a list -> 'a list -> 'a list = <fun>

Notice that the memory for the second list is shared, but the first list is effectively copied.

## Higher order functions on lists

We might wish to apply a function to each element in a list, yielding a new
one. We shall write a function `map`

which is given another function as its
argument - such a function is called "higher-order":

```
# let rec map f l =
match l with
| [] -> []
| h :: t -> f h :: map f t;;
```

val map : ('a -> 'b) -> 'a list -> 'b list = <fun>

Notice the type of the function `f`

in parentheses as part of the whole type.
This `map`

function, given a function of type `'a -> 'b`

and a list of `'a`

s,
will build a list of `'b'`

s. Sometimes `'a`

and `'b`

might be the same type, of
course. Here are two examples showing the `map`

function in use:

`# map (fun x -> x * 2) [1; 2; 3];;`

- : int list = [2; 4; 6]
# map total [[1; 2]; [3; 4]; [5; 6]];;

- : int list = [3; 7; 11]

## The standard library List module

The standard library List module contains a wide range of useful utility functions, including pre-written versions of many of the functions we have written in this tutorial. A version of the module with labeled functions is available as part of StdLabels.

In the List module documentation, functions which can raise an exception are marked. Such exceptions are usually the result of lists which are empty (and therefore have neither a head nor a tail) or lists of mismatched length.

### Maps and iterators

We have already written a `map`

function from scratch, and it is no surprise
that one is included in the List module.
There is also a variant for two lists:

`# List.map2 ( + ) [1; 2; 3] [4; 5; 6];;`

- : int list = [5; 7; 9]

In addition, we have an imperative analog to
`map`

, called
`iter`

. It takes an imperative
function of type `'a -> unit`

and an `'a list`

and applies the function to each
element in turn. A suitable function might be `print_endline`

:

`# List.iter print_endline ["frank"; "james"; "mary"];;`

frank
james
mary
- : unit = ()

There is a variant `iter2`

for two
lists too:

```
# List.iter2
(fun a b -> print_endline (a ^ " " ^ b))
["frank"; "james"; "mary"]
["carter"; "lee"; "jones"];;
```

frank carter
james lee
mary jones
- : unit = ()

Notice that `map2`

and
`iter2`

will fail if the lists are
of unequal length:

`# List.map2 ( + ) [1; 2; 3] [4; 5];;`

Exception: Invalid_argument "List.map2".

### List scanning

The useful function `mem`

checks
whether a given element is a member of a list by scanning its contents:

`# List.mem "frank" ["james"; "frank"; "mary"];;`

- : bool = true
# List.mem [] [[1; 2]; [3]; []; [5]];;

- : bool = true

There are more elaborate scanning functions: imagine we wish to check to see if
all elements of a list are even, or if any element is even. We could either
write functions to go over each element of the list, keeping a boolean check,
or use `mem`

and other functions already known to us:

```
# let all =
not (List.mem false (List.map (fun x -> x mod 2 = 0) [2; 4; 6; 8]));;
```

val all : bool = true
# let any =
List.mem true (List.map (fun x -> x mod 2 = 0) [1; 2; 3]);;

val any : bool = true

This is rather clumsy, though. The standard library provides two useful
functions `for_all`

and
`exists`

for this common problem:

`# List.for_all (fun x -> x mod 2 = 0) [2; 4; 6; 8];;`

- : bool = true
# List.exists (fun x -> x mod 2 = 0) [1; 2; 3];;

- : bool = true

So you can see how the standard library has evolved into its present state: pieces of frequently-used code are turned into useful general functions.

### List searching

The function `find`

returns the
first element of a list matching a given predicate (a predicate is a testing
function which returns either true or false when given an element). It raises
an exception if such an element is not found:

`# List.find (fun x -> x mod 2 = 0) [1; 2; 3; 4; 5];;`

- : int = 2
# List.find (fun x -> x mod 2 = 0) [1; 3; 5];;

Exception: Not_found.

The `filter`

function again takes
a predicate and tests it against each element in the list, but this time
returns the list of all elements which test true:

`# List.filter (fun x -> x mod 2 = 0) [1; 2; 3; 4; 5];;`

- : int list = [2; 4]

If we wish to know also which elements did not test true, we can use
`partition`

which returns a
pair of lists: the first being the list of elements for which the predicate is
true, the second those for which it is false.

`# List.partition (fun x -> x mod 2 = 0) [1; 2; 3; 4; 5];;`

- : int list * int list = ([2; 4], [1; 3; 5])

Note that the documentation for
`filter`

and
`partition`

tells us that the
order of the input is preserved in the output. Where this is not stated it the
documentation, it cannot be assumed.

### Association lists

Association lists are a simple (and simplistic) way of implementing the dictionary data structure: that is to say, a group of keys each with an associated value. For large dictionaries, for efficiency, we would use the standard library's Map or Hashtbl modules. But these functions from the List module are useful for lists which are generally small, and have other advantages: since they are just lists of pairs, they can be built and modified easily. They are also easily printed in the toplevel.

`# List.assoc 4 [(3, "three"); (1, "one"); (4, "four")];;`

- : string = "four"
# List.mem_assoc 4 [(3, "three"); (1, "one"); (4, "four")];;

- : bool = true

When using association lists, and for other purposes, it is sometimes useful to
be able to make a list of pairs from a pair of lists and vice versa. The
`List`

module provides the functions
`split`

and
`combine`

for this purpose:

`# List.split [(3, "three"); (1, "one"); (4, "four")];;`

- : int list * string list = ([3; 1; 4], ["three"; "one"; "four"])
# List.combine [3; 1; 4] ["three"; "one"; "four"];;

- : (int * string) list = [(3, "three"); (1, "one"); (4, "four")]

### Sorting lists

The function `List.sort`

, given a
comparison function of type `'a -> 'a -> int`

(zero if equal, negative if first
smaller, positive if second smaller) and an input list of type `'a list`

,
returns the list sorted according to the comparison function. Typically, we use
the built-in comparison function
`compare`

which can compare any
two values of like type (with the exception of functions which are incomparable).

`# List.sort compare [1; 4; 6; 4; 1];;`

- : int list = [1; 1; 4; 4; 6]
# List.sort compare ["Reynolds"; "Smith"; "Barnes"];;

- : string list = ["Barnes"; "Reynolds"; "Smith"]
# List.sort (Fun.flip compare) [1; 4; 6; 4; 1];;

- : int list = [6; 4; 4; 1; 1]
# List.sort compare [(1, 3); (1, 2); (2, 3); (2, 2)];;

- : (int * int) list = [(1, 2); (1, 3); (2, 2); (2, 3)]
# List.sort
(fun a b -> compare (fst a) (fst b))
[(1, 3); (1, 2); (2, 3); (2, 2)];;

- : (int * int) list = [(1, 3); (1, 2); (2, 3); (2, 2)]

(The function `Fun.flip`

reverses the
argument order of a binary function.)

### Folds

There are two interestingly-named functions in the List module,
`fold_left`

and
`fold_right`

. Their job is
to combine the elements of a list together, using a given function,
accumulating an answer which is then returned. The answer returned depends upon
the function given, the elements of the list, and the initial value of the
accumulator supplied. So you can imagine these are very general functions.
Let's explore `fold_left`

first.

In this example, we supply the addition function and an initial accumulator value of 0:

`# List.fold_left ( + ) 0 [1; 2; 3];;`

- : int = 6

The result is the sum of the elements in the list. Now let's use OCaml's
built-in `max`

function which returns the larger of two given integers in place
of our addition function. We use `min_int`

, the smallest possible integer, as
our initial accumulator

`# List.fold_left max min_int [2; 4; 6; 0; 1];;`

- : int = 6

The largest number in the list is found. Let's look at the type of the
`fold_left`

function:

`# List.fold_left;;`

- : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>

The function is of type `'a -> 'b -> 'a`

where `'a`

is the accumulator and `'b`

is the type of each element of the list. The next argument is the initial
accumulator, which must be of type `'a`

, and then finally the input list of
type `'b list`

. The result is the final value of the accumulator, so it must
have type `'a`

. Of course, in both of our examples, `'a`

and `'b`

are the same
as one another. But this is not always so.

Consider the following definition of `append`

which uses
`fold_right`

(`fold_left`

considers the
elements from the left,
`fold_right`

from the right):

```
# let append x y =
List.fold_right (fun e a -> e :: a) x y;;
```

val append : 'a list -> 'a list -> 'a list = <fun>

In this example, the initial accumulator is the second list, and each element of the first is consed to it in turn. You can see the order of arguments to fold right is a little different:

`# List.fold_right;;`

- : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b = <fun>

The function comes first, then the list of elements, then the initial
accumulator value. We can use
`fold_right`

to define our
usual `map`

function:

```
# let map f l =
List.fold_right (fun e a -> f e :: a) l [];;
```

val map : ('a -> 'b) -> 'a list -> 'b list = <fun>

But care is needed. If we try that with
`List.concat`

, which turns a
list of lists into a list by concatenating the lists together, we produce this:

`# let concat l = List.fold_left ( @ ) [] l;;`

val concat : 'a list list -> 'a list = <fun>

Unfortunately, the order of evaluation here is such that larger and larger
items are passed to the `@`

operator as its first argument, and so the function
has a worse running time than
`List.concat`

. You can read more
about the time and space efficiency of lists and other common OCaml data
structures in the comparison of standard
containers.

Here are some more redefinitions of familiar functions in terms of
`fold_left`

or
`fold_right`

. Can you work
out how they operate?

```
# let length l =
List.fold_left (fun a _ -> a + 1) 0 l;;
```

val length : 'a list -> int = <fun>
# let rev l =
List.fold_left (fun a e -> e :: a) [] l;;

val rev : 'a list -> 'a list = <fun>
# let split l =
List.fold_right
(fun (x, y) (xs, ys) -> (x :: xs, y :: ys))
l
([], []);;

val split : ('a * 'b) list -> 'a list * 'b list = <fun>

## Lists and tail recursion

Our `length`

function builds up an intermediate expression of a size
proportional to its input list:

```
length [1; 2; 3]
=> 1 + length [2; 3]
=> 1 + (1 + length [3])
=> 1 + (1 + (1 + length []))
=> 1 + (1 + (1 + 0))
=> 1 + (1 + 1)
=> 1 + 2
=> 3
```

For long lists, this may overflow the stack (be too large for the computer to handle). The solution is to write our function with an accumulating argument, like this:

```
# let rec length acc l =
match l with
| [] -> acc
| _ :: t -> length (acc + 1) t;;
```

val length : int -> 'a list -> int = <fun>
# let l = length 0 [1; 2; 3];;

val l : int = 3

This function now uses a constant amount of space on the stack:

```
length 0 [1; 2; 3]
=> length 1 [2; 3]
=> length 2 [3]
=> length 3 []
=> 3
```

We call such a function *tail-recursive*. We may write a wrapper function so
that the initial accumulator value is supplied automatically:

```
# let rec length_inner acc l =
match l with
| [] -> acc
| _ :: t -> length_inner (acc + 1) t;;
```

val length_inner : int -> 'a list -> int = <fun>
# let length l = length_inner 0 l;;

val length : 'a list -> int = <fun>

Or, we can do it all in one function:

```
# let length l =
let rec length_inner acc l =
match l with
| [] -> acc
| _ :: t -> length_inner (acc + 1) t
in
length_inner 0 l;;
```

val length : 'a list -> int = <fun>

In the standard library documentation, functions which are not tail-recursive are marked.