489 lines
18 KiB
Plaintext
489 lines
18 KiB
Plaintext
\chapter{Labels and variants} \label{c:labl-examples}
|
|
%HEVEA\cutname{lablexamples.html}
|
|
{\it (Chapter written by Jacques Garrigue)}
|
|
|
|
\bigskip
|
|
|
|
\noindent This chapter gives an overview of the new features in
|
|
OCaml 3: labels, and polymorphic variants.
|
|
|
|
\section{s:labels}{Labels}
|
|
|
|
If you have a look at modules ending in "Labels" in the standard
|
|
library, you will see that function types have annotations you did not
|
|
have in the functions you defined yourself.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
ListLabels.map;;
|
|
StringLabels.sub;;
|
|
\end{caml_example}
|
|
|
|
Such annotations of the form "name:" are called {\em labels}. They are
|
|
meant to document the code, allow more checking, and give more
|
|
flexibility to function application.
|
|
You can give such names to arguments in your programs, by prefixing them
|
|
with a tilde "~".
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let f ~x ~y = x - y;;
|
|
let x = 3 and y = 2 in f ~x ~y;;
|
|
\end{caml_example}
|
|
|
|
When you want to use distinct names for the variable and the label
|
|
appearing in the type, you can use a naming label of the form
|
|
"~name:". This also applies when the argument is not a variable.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let f ~x:x1 ~y:y1 = x1 - y1;;
|
|
f ~x:3 ~y:2;;
|
|
\end{caml_example}
|
|
|
|
Labels obey the same rules as other identifiers in OCaml, that is you
|
|
cannot use a reserved keyword (like "in" or "to") as label.
|
|
|
|
Formal parameters and arguments are matched according to their
|
|
respective labels\footnote{This corresponds to the commuting label mode
|
|
of Objective Caml 3.00 through 3.02, with some additional flexibility
|
|
on total applications. The so-called classic mode ("-nolabels"
|
|
options) is now deprecated for normal use.}, the absence of label
|
|
being interpreted as the empty label.
|
|
%
|
|
This allows commuting arguments in applications. One can also
|
|
partially apply a function on any argument, creating a new function of
|
|
the remaining parameters.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let f ~x ~y = x - y;;
|
|
f ~y:2 ~x:3;;
|
|
ListLabels.fold_left;;
|
|
ListLabels.fold_left [1;2;3] ~init:0 ~f:( + );;
|
|
ListLabels.fold_left ~init:0;;
|
|
\end{caml_example}
|
|
|
|
If several arguments of a function bear the same label (or no label),
|
|
they will not commute among themselves, and order matters. But they
|
|
can still commute with other arguments.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let hline ~x:x1 ~x:x2 ~y = (x1, x2, y);;
|
|
hline ~x:3 ~y:2 ~x:5;;
|
|
\end{caml_example}
|
|
|
|
As an exception to the above parameter matching rules, if an
|
|
application is total (omitting all optional arguments), labels may be
|
|
omitted.
|
|
In practice, many applications are total, so that labels can often be
|
|
omitted.
|
|
\begin{caml_example}{toplevel}
|
|
f 3 2;;
|
|
ListLabels.map succ [1;2;3];;
|
|
\end{caml_example}
|
|
But beware that functions like "ListLabels.fold_left" whose result
|
|
type is a type variable will never be considered as totally applied.
|
|
\begin{caml_example}{toplevel}[error]
|
|
ListLabels.fold_left ( + ) 0 [1;2;3];;
|
|
\end{caml_example}
|
|
|
|
When a function is passed as an argument to a higher-order function,
|
|
labels must match in both types. Neither adding nor removing labels
|
|
are allowed.
|
|
\begin{caml_example}{toplevel}
|
|
let h g = g ~x:3 ~y:2;;
|
|
h f;;
|
|
h ( + ) [@@expect error];;
|
|
\end{caml_example}
|
|
Note that when you don't need an argument, you can still use a wildcard
|
|
pattern, but you must prefix it with the label.
|
|
\begin{caml_example}{toplevel}
|
|
h (fun ~x:_ ~y -> y+1);;
|
|
\end{caml_example}
|
|
|
|
\subsection{ss:optional-arguments}{Optional arguments}
|
|
|
|
An interesting feature of labeled arguments is that they can be made
|
|
optional. For optional parameters, the question mark "?" replaces the
|
|
tilde "~" of non-optional ones, and the label is also prefixed by "?"
|
|
in the function type.
|
|
Default values may be given for such optional parameters.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let bump ?(step = 1) x = x + step;;
|
|
bump 2;;
|
|
bump ~step:3 2;;
|
|
\end{caml_example}
|
|
|
|
A function taking some optional arguments must also take at least one
|
|
non-optional argument. The criterion for deciding whether an optional
|
|
argument has been omitted is the non-labeled application of an
|
|
argument appearing after this optional argument in the function type.
|
|
Note that if that argument is labeled, you will only be able to
|
|
eliminate optional arguments by totally applying the function,
|
|
omitting all optional arguments and omitting all labels for all
|
|
remaining arguments.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let test ?(x = 0) ?(y = 0) () ?(z = 0) () = (x, y, z);;
|
|
test ();;
|
|
test ~x:2 () ~z:3 ();;
|
|
\end{caml_example}
|
|
|
|
Optional parameters may also commute with non-optional or unlabeled
|
|
ones, as long as they are applied simultaneously. By nature, optional
|
|
arguments do not commute with unlabeled arguments applied
|
|
independently.
|
|
\begin{caml_example}{toplevel}
|
|
test ~y:2 ~x:3 () ();;
|
|
test () () ~z:1 ~y:2 ~x:3;;
|
|
(test () ()) ~z:1 [@@expect error];;
|
|
\end{caml_example}
|
|
Here "(test () ())" is already "(0,0,0)" and cannot be further
|
|
applied.
|
|
|
|
Optional arguments are actually implemented as option types. If
|
|
you do not give a default value, you have access to their internal
|
|
representation, "type 'a option = None | Some of 'a". You can then
|
|
provide different behaviors when an argument is present or not.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let bump ?step x =
|
|
match step with
|
|
| None -> x * 2
|
|
| Some y -> x + y
|
|
;;
|
|
\end{caml_example}
|
|
|
|
It may also be useful to relay an optional argument from a function
|
|
call to another. This can be done by prefixing the applied argument
|
|
with "?". This question mark disables the wrapping of optional
|
|
argument in an option type.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let test2 ?x ?y () = test ?x ?y () ();;
|
|
test2 ?x:None;;
|
|
\end{caml_example}
|
|
|
|
\subsection{ss:label-inference}{Labels and type inference}
|
|
|
|
While they provide an increased comfort for writing function
|
|
applications, labels and optional arguments have the pitfall that they
|
|
cannot be inferred as completely as the rest of the language.
|
|
|
|
You can see it in the following two examples.
|
|
\begin{caml_example}{toplevel}
|
|
let h' g = g ~y:2 ~x:3;;
|
|
h' f [@@expect error];;
|
|
let bump_it bump x =
|
|
bump ~step:2 x;;
|
|
bump_it bump 1 [@@expect error];;
|
|
\end{caml_example}
|
|
The first case is simple: "g" is passed "~y" and then "~x", but "f"
|
|
expects "~x" and then "~y". This is correctly handled if we know the
|
|
type of "g" to be "x:int -> y:int -> int" in advance, but otherwise
|
|
this causes the above type clash. The simplest workaround is to apply
|
|
formal parameters in a standard order.
|
|
|
|
The second example is more subtle: while we intended the argument
|
|
"bump" to be of type "?step:int -> int -> int", it is inferred as
|
|
"step:int -> int -> 'a".
|
|
%
|
|
These two types being incompatible (internally normal and optional
|
|
arguments are different), a type error occurs when applying "bump_it"
|
|
to the real "bump".
|
|
|
|
We will not try here to explain in detail how type inference works.
|
|
One must just understand that there is not enough information in the
|
|
above program to deduce the correct type of "g" or "bump". That is,
|
|
there is no way to know whether an argument is optional or not, or
|
|
which is the correct order, by looking only at how a function is
|
|
applied. The strategy used by the compiler is to assume that there are
|
|
no optional arguments, and that applications are done in the right
|
|
order.
|
|
|
|
The right way to solve this problem for optional parameters is to add
|
|
a type annotation to the argument "bump".
|
|
\begin{caml_example}{toplevel}
|
|
let bump_it (bump : ?step:int -> int -> int) x =
|
|
bump ~step:2 x;;
|
|
bump_it bump 1;;
|
|
\end{caml_example}
|
|
In practice, such problems appear mostly when using objects whose
|
|
methods have optional arguments, so that writing the type of object
|
|
arguments is often a good idea.
|
|
|
|
Normally the compiler generates a type error if you attempt to pass to
|
|
a function a parameter whose type is different from the expected one.
|
|
However, in the specific case where the expected type is a non-labeled
|
|
function type, and the argument is a function expecting optional
|
|
parameters, the compiler will attempt to transform the argument to
|
|
have it match the expected type, by passing "None" for all optional
|
|
parameters.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let twice f (x : int) = f(f x);;
|
|
twice bump 2;;
|
|
\end{caml_example}
|
|
|
|
This transformation is coherent with the intended semantics,
|
|
including side-effects. That is, if the application of optional
|
|
parameters shall produce side-effects, these are delayed until the
|
|
received function is really applied to an argument.
|
|
|
|
\subsection{ss:label-suggestions}{Suggestions for labeling}
|
|
|
|
Like for names, choosing labels for functions is not an easy task. A
|
|
good labeling is a labeling which
|
|
|
|
\begin{itemize}
|
|
\item makes programs more readable,
|
|
\item is easy to remember,
|
|
\item when possible, allows useful partial applications.
|
|
\end{itemize}
|
|
|
|
We explain here the rules we applied when labeling OCaml
|
|
libraries.
|
|
|
|
To speak in an ``object-oriented'' way, one can consider that each
|
|
function has a main argument, its {\em object}, and other arguments
|
|
related with its action, the {\em parameters}. To permit the
|
|
combination of functions through functionals in commuting label mode, the
|
|
object will not be labeled. Its role is clear from the function
|
|
itself. The parameters are labeled with names reminding of
|
|
their nature or their role. The best labels combine nature and
|
|
role. When this is not possible the role is to be preferred, since the
|
|
nature will
|
|
often be given by the type itself. Obscure abbreviations should be
|
|
avoided.
|
|
\begin{alltt}
|
|
"ListLabels.map : f:('a -> 'b) -> 'a list -> 'b list"
|
|
UnixLabels.write : file_descr -> buf:bytes -> pos:int -> len:int -> unit
|
|
\end{alltt}
|
|
|
|
When there are several objects of same nature and role, they are all
|
|
left unlabeled.
|
|
\begin{alltt}
|
|
"ListLabels.iter2 : f:('a -> 'b -> unit) -> 'a list -> 'b list -> unit"
|
|
\end{alltt}
|
|
|
|
When there is no preferable object, all arguments are labeled.
|
|
\begin{alltt}
|
|
BytesLabels.blit :
|
|
src:bytes -> src_pos:int -> dst:bytes -> dst_pos:int -> len:int -> unit
|
|
\end{alltt}
|
|
|
|
However, when there is only one argument, it is often left unlabeled.
|
|
\begin{alltt}
|
|
BytesLabels.create : int -> bytes
|
|
\end{alltt}
|
|
This principle also applies to functions of several arguments whose
|
|
return type is a type variable, as long as the role of each argument
|
|
is not ambiguous. Labeling such functions may lead to awkward error
|
|
messages when one attempts to omit labels in an application, as we
|
|
have seen with "ListLabels.fold_left".
|
|
|
|
Here are some of the label names you will find throughout the
|
|
libraries.
|
|
|
|
\begin{tableau}{|l|l|}{Label}{Meaning}
|
|
\entree{"f:"}{a function to be applied}
|
|
\entree{"pos:"}{a position in a string, array or byte sequence}
|
|
\entree{"len:"}{a length}
|
|
\entree{"buf:"}{a byte sequence or string used as buffer}
|
|
\entree{"src:"}{the source of an operation}
|
|
\entree{"dst:"}{the destination of an operation}
|
|
\entree{"init:"}{the initial value for an iterator}
|
|
\entree{"cmp:"}{a comparison function, {\it e.g.} "Stdlib.compare"}
|
|
\entree{"mode:"}{an operation mode or a flag list}
|
|
\end{tableau}
|
|
|
|
All these are only suggestions, but keep in mind that the
|
|
choice of labels is essential for readability. Bizarre choices will
|
|
make the program harder to maintain.
|
|
|
|
In the ideal, the right function name with right labels should be
|
|
enough to understand the function's meaning. Since one can get this
|
|
information with OCamlBrowser or the "ocaml" toplevel, the documentation
|
|
is only used when a more detailed specification is needed.
|
|
|
|
\begin{caml_eval}
|
|
#label false;;
|
|
\end{caml_eval}
|
|
|
|
|
|
\section{s:polymorphic-variants}{Polymorphic variants}
|
|
|
|
Variants as presented in section~\ref{s:tut-recvariants} are a
|
|
powerful tool to build data structures and algorithms. However they
|
|
sometimes lack flexibility when used in modular programming. This is
|
|
due to the fact that every constructor is assigned to a unique type
|
|
when defined and used. Even if the same name appears in the definition
|
|
of multiple types, the constructor itself belongs to only one type.
|
|
Therefore, one cannot decide that a given constructor belongs to
|
|
multiple types, or consider a value of some type to belong to some
|
|
other type with more constructors.
|
|
|
|
With polymorphic variants, this original assumption is removed. That
|
|
is, a variant tag does not belong to any type in particular, the type
|
|
system will just check that it is an admissible value according to its
|
|
use. You need not define a type before using a variant tag. A variant
|
|
type will be inferred independently for each of its uses.
|
|
|
|
\subsection*{ss:polyvariant:basic-use}{Basic use}
|
|
|
|
In programs, polymorphic variants work like usual ones. You just have
|
|
to prefix their names with a backquote character "`".
|
|
\begin{caml_example}{toplevel}
|
|
[`On; `Off];;
|
|
`Number 1;;
|
|
let f = function `On -> 1 | `Off -> 0 | `Number n -> n;;
|
|
List.map f [`On; `Off];;
|
|
\end{caml_example}
|
|
"[>`Off|`On] list" means that to match this list, you should at
|
|
least be able to match "`Off" and "`On", without argument.
|
|
"[<`On|`Off|`Number of int]" means that "f" may be applied to "`Off",
|
|
"`On" (both without argument), or "`Number" $n$ where
|
|
$n$ is an integer.
|
|
The ">" and "<" inside the variant types show that they may still be
|
|
refined, either by defining more tags or by allowing less. As such, they
|
|
contain an implicit type variable. Because each of the variant types
|
|
appears only once in the whole type, their implicit type variables are
|
|
not shown.
|
|
|
|
The above variant types were polymorphic, allowing further refinement.
|
|
When writing type annotations, one will most often describe fixed
|
|
variant types, that is types that cannot be refined. This is
|
|
also the case for type abbreviations. Such types do not contain "<" or
|
|
">", but just an enumeration of the tags and their associated types,
|
|
just like in a normal datatype definition.
|
|
\begin{caml_example}{toplevel}
|
|
type 'a vlist = [`Nil | `Cons of 'a * 'a vlist];;
|
|
let rec map f : 'a vlist -> 'b vlist = function
|
|
| `Nil -> `Nil
|
|
| `Cons(a, l) -> `Cons(f a, map f l)
|
|
;;
|
|
\end{caml_example}
|
|
|
|
\subsection*{ss:polyvariant-advanced}{Advanced use}
|
|
|
|
Type-checking polymorphic variants is a subtle thing, and some
|
|
expressions may result in more complex type information.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let f = function `A -> `C | `B -> `D | x -> x;;
|
|
f `E;;
|
|
\end{caml_example}
|
|
Here we are seeing two phenomena. First, since this matching is open
|
|
(the last case catches any tag), we obtain the type "[> `A | `B]"
|
|
rather than "[< `A | `B]" in a closed matching. Then, since "x" is
|
|
returned as is, input and return types are identical. The notation "as
|
|
'a" denotes such type sharing. If we apply "f" to yet another tag
|
|
"`E", it gets added to the list.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let f1 = function `A x -> x = 1 | `B -> true | `C -> false
|
|
let f2 = function `A x -> x = "a" | `B -> true ;;
|
|
let f x = f1 x && f2 x;;
|
|
\end{caml_example}
|
|
Here "f1" and "f2" both accept the variant tags "`A" and "`B", but the
|
|
argument of "`A" is "int" for "f1" and "string" for "f2". In "f"'s
|
|
type "`C", only accepted by "f1", disappears, but both argument types
|
|
appear for "`A" as "int & string". This means that if we
|
|
pass the variant tag "`A" to "f", its argument should be {\em both}
|
|
"int" and "string". Since there is no such value, "f" cannot be
|
|
applied to "`A", and "`B" is the only accepted input.
|
|
|
|
Even if a value has a fixed variant type, one can still give it a
|
|
larger type through coercions. Coercions are normally written with
|
|
both the source type and the destination type, but in simple cases the
|
|
source type may be omitted.
|
|
\begin{caml_example}{toplevel}
|
|
type 'a wlist = [`Nil | `Cons of 'a * 'a wlist | `Snoc of 'a wlist * 'a];;
|
|
let wlist_of_vlist l = (l : 'a vlist :> 'a wlist);;
|
|
let open_vlist l = (l : 'a vlist :> [> 'a vlist]);;
|
|
fun x -> (x :> [`A|`B|`C]);;
|
|
\end{caml_example}
|
|
|
|
You may also selectively coerce values through pattern matching.
|
|
\begin{caml_example}{toplevel}
|
|
let split_cases = function
|
|
| `Nil | `Cons _ as x -> `A x
|
|
| `Snoc _ as x -> `B x
|
|
;;
|
|
\end{caml_example}
|
|
When an or-pattern composed of variant tags is wrapped inside an
|
|
alias-pattern, the alias is given a type containing only the tags
|
|
enumerated in the or-pattern. This allows for many useful idioms, like
|
|
incremental definition of functions.
|
|
|
|
\begin{caml_example}{toplevel}
|
|
let num x = `Num x
|
|
let eval1 eval (`Num x) = x
|
|
let rec eval x = eval1 eval x ;;
|
|
let plus x y = `Plus(x,y)
|
|
let eval2 eval = function
|
|
| `Plus(x,y) -> eval x + eval y
|
|
| `Num _ as x -> eval1 eval x
|
|
let rec eval x = eval2 eval x ;;
|
|
\end{caml_example}
|
|
|
|
To make this even more comfortable, you may use type definitions as
|
|
abbreviations for or-patterns. That is, if you have defined "type
|
|
myvariant = [`Tag1 of int | `Tag2 of bool]", then the pattern "#myvariant" is
|
|
equivalent to writing "(`Tag1(_ : int) | `Tag2(_ : bool))".
|
|
\begin{caml_eval}
|
|
type myvariant = [`Tag1 of int | `Tag2 of bool];;
|
|
\end{caml_eval}
|
|
|
|
Such abbreviations may be used alone,
|
|
\begin{caml_example}{toplevel}
|
|
let f = function
|
|
| #myvariant -> "myvariant"
|
|
| `Tag3 -> "Tag3";;
|
|
\end{caml_example}
|
|
or combined with with aliases.
|
|
\begin{caml_example}{toplevel}
|
|
let g1 = function `Tag1 _ -> "Tag1" | `Tag2 _ -> "Tag2";;
|
|
let g = function
|
|
| #myvariant as x -> g1 x
|
|
| `Tag3 -> "Tag3";;
|
|
\end{caml_example}
|
|
|
|
\subsection{ss:polyvariant-weaknesses}{Weaknesses of polymorphic variants}
|
|
|
|
After seeing the power of polymorphic variants, one may wonder why
|
|
they were added to core language variants, rather than replacing them.
|
|
|
|
The answer is twofold. One first aspect is that while being pretty
|
|
efficient, the lack of static type information allows for less
|
|
optimizations, and makes polymorphic variants slightly heavier than
|
|
core language ones. However noticeable differences would only
|
|
appear on huge data structures.
|
|
|
|
More important is the fact that polymorphic variants, while being
|
|
type-safe, result in a weaker type discipline. That is, core language
|
|
variants do actually much more than ensuring type-safety, they also
|
|
check that you use only declared constructors, that all constructors
|
|
present in a data-structure are compatible, and they enforce typing
|
|
constraints to their parameters.
|
|
|
|
For this reason, you must be more careful about making types explicit
|
|
when you use polymorphic variants. When you write a library, this is
|
|
easy since you can describe exact types in interfaces, but for simple
|
|
programs you are probably better off with core language variants.
|
|
|
|
Beware also that some idioms make trivial errors very hard to find.
|
|
For instance, the following code is probably wrong but the compiler
|
|
has no way to see it.
|
|
\begin{caml_example}{toplevel}
|
|
type abc = [`A | `B | `C] ;;
|
|
let f = function
|
|
| `As -> "A"
|
|
| #abc -> "other" ;;
|
|
let f : abc -> string = f ;;
|
|
\end{caml_example}
|
|
You can avoid such risks by annotating the definition itself.
|
|
\begin{caml_example}{toplevel}[error]
|
|
let f : abc -> string = function
|
|
| `As -> "A"
|
|
| #abc -> "other" ;;
|
|
\end{caml_example}
|