2013-03-06 04:57:56 -08:00
This file describes the changes on the extension_points branch.
=== Attributes
Attributes are "decorations" of the syntax tree which are ignored by
2013-03-07 07:29:30 -08:00
the type-checker. An attribute is made of an identifier (an "LIDENT"
or "UIDENT", written id below) and an optional expression (written
expr below).
2013-03-06 04:57:56 -08:00
2013-03-07 05:00:01 -08:00
Attributes on expressions, type expressions, module expressions, module type expressions,
2013-03-06 04:57:56 -08:00
patterns (TODO: class expressions, class type expressions):
2013-03-07 05:00:01 -08:00
... [@id expr]
2013-03-06 04:57:56 -08:00
2013-03-07 05:00:01 -08:00
The same syntax [@id expr] is also available to add attributes on
2013-03-06 05:51:18 -08:00
constructors and labels in type declarations:
2013-03-06 04:57:56 -08:00
type t =
| A [@id1]
| B [@id2] of int [@id3]
Here, id1 (resp. id2) is attached to the constructor A (resp. B)
2013-03-06 05:51:18 -08:00
and id3 is attached to the int type expression. Example on records:
type t =
{
x [@id1]: int;
mutable y [@id2] [@id3]: string [@id4];
}
2013-03-06 04:57:56 -08:00
2013-03-07 05:00:01 -08:00
Attributes on items:
2013-03-06 04:57:56 -08:00
2013-03-07 05:00:01 -08:00
... [@@id expr]
2013-03-06 04:57:56 -08:00
2013-03-06 05:51:39 -08:00
Items designate signature and structure items, and also individual
2013-03-06 04:57:56 -08:00
components of multiple declaration (type declarations, recursive modules,
class declarations, class type declarations). (TODO: class fields?)
For instance, consider:
2013-03-07 05:00:01 -08:00
type t1 = ... [@@id1] [@@id2] and t2 = ... [@@id3] [@@id4]
2013-03-06 04:57:56 -08:00
2013-03-07 05:00:01 -08:00
Here, the attributes on t1 are id1, id23; the attributes on
t2 are id3 and id4.
2013-03-06 04:57:56 -08:00
Note: item attributes are currently not supported on Pstr_eval
and Pstr_value structure items.
2013-03-07 05:46:44 -08:00
The [@@id expr] form, when used at the beginning of a signature or
structure, or after a double semi-colon (;;), defines an attribute
which stands as a stand-alone signature or structure item (not
attached to another item).
Example:
module type S = sig
[@@id1]
type t
[@@id2]
;; [@@id3] [@@id4]
;; [@@id5]
type s
[@@id6]
end
Here, id1, id3, id4, id5 are stand-alone attributes, while
id2 is attached to the type t and id6 is attached to the type s.
2013-03-06 04:57:56 -08:00
2013-03-08 01:17:30 -08:00
=== Alternative syntax for attributes and extensions on specific kinds of nodes
2013-03-06 06:53:39 -08:00
2013-03-08 00:42:53 -08:00
All expression constructions starting with a keyword, a combination of
2013-03-08 01:17:30 -08:00
keywords or a delimiter supports an alternative syntax for attributes and/or extensions:
2013-03-06 06:53:39 -08:00
2013-03-08 00:42:53 -08:00
KW[@id expr]...[@id expr] REST
2013-03-06 06:53:39 -08:00
---->
2013-03-08 00:42:53 -08:00
(KW REST)[@id expr]...[@id expr]
2013-03-08 01:17:30 -08:00
KW%id REST
---->
[%id (KW REST)]
KW%id[@id expr]...[@id expr] REST
---->
([%id (KW REST)])[@id expr]...[@id expr]
2013-03-08 00:42:53 -08:00
where KW can stand for:
(
(module
[
[|
assert
begin
for
fun
function
if
lazy
let
let module
let open
match
new
object
try
while
{
{<
2013-03-06 06:53:39 -08:00
2013-03-08 00:42:53 -08:00
For instance:
2013-03-06 06:53:39 -08:00
2013-03-08 00:42:53 -08:00
let[@foo] x = 2 in x + 1 ==== (let x = 2 in x + 1)[@foo]
([@foo] 3 + 4) ==== (3 + 4)[@foo]
begin[@foo] ... end ==== (begin ... end)[@foo]
2013-03-08 01:17:30 -08:00
match%foo e with ... ==== [%foo match e with ...]
2013-03-06 06:53:39 -08:00
2013-03-06 06:29:04 -08:00
=== Representation of attributes in the Parsetree
For attributes on expressions and similar categories, attributes are
just another possible constructor (one single attribute per node):
and expression_desc =
....
| Pexp_attribute of (expression * attribute)
Note: we currently don't distinguish between prefix and postfix attributes.
Similarly, attributes as standalone signature/structure items are represented
by a new constructor:
| Psig_attribute of attribute
| Pstr_attribute of attribute
For "declarations"-like items (type declarations, module declarations,
..., but also constructors and record labels), all attributes are stored
in an extra field in their record (again, without making the distinction
between prefix and postfix attributes):
and type_declaration = {
...
ptype_attributes: attribute list;
...
For other kinds of items (currently: open/include stataments,
exception rebind), the attributes are stored directly in the
constructor of the item:
| Pstr_open of Longident.t loc * attribute list
2013-03-06 04:57:56 -08:00
=== Extension nodes
Extension nodes replace valid components in the syntax tree. They are
normally interpreted and expanded by AST mapper. The type-checker
fails when it encounters such an extension node. An extension node is
made of an identifier (an "LIDENT", written id below) and an optional
expression (written expr below).
Two syntaxes exist for extension node:
As expressions, type expressions, module expressions, module type expressions,
patterns (TODO: class expressions, class type expressions):
[%id expr]
As structure or signature item (TODO: class fields?):
[%%id expr]
As other structure or signature items, attributes can be attached to a
[%%id expr] extension node.
=== Other changes to the parser and Parsetree
2013-03-07 06:15:23 -08:00
--- Relaxing the syntax for signatures and structures
It is now possible to start a signature or a structure with a ";;" token and to have two successive ";;" tokens.
Rationale:
It makes it possible to always prefix a "standalone" attribute by ";;" independently
from its context (this will work at the beginning of the signature/structure and after
another item finished with ";;").
2013-03-06 04:57:56 -08:00
--- Relaxing the syntax for recursive modules.
Before:
module X1 : MT1 = M1 and ... and Xn : MTn = Mn
Now:
module X1 = M1 and ... and Xn = Mn
(with the usual sugar that Xi = (Mi : MTi) can be written as Xi : MTi = Mi
which gives the old syntax)
The type-checker fails when a module expression is not of
the form (M : MT)
Rationale:
1. More uniform representation in the Parsetree.
2. The type-checker can be made more clever in the future to support
other forms of module expressions (e.g. functions with an explicit
constraint on its result; or a structure with only type-level
components).
--- Turning some tuple or n-ary constructors into records
Before:
| Pstr_module of string loc * module_expr
After:
| Pstr_module of module_binding
...
and module_binding =
{
pmb_name: string loc;
pmb_expr: module_expr;
pmb_attributes: attribute list;
}
Rationale:
More self-documented, more robust to future additions (such as
attributes), simplifies some code.
--- Keeping names inside value_description and type_declaration
Before:
| Psig_type of (string loc * type_declaration) list
After:
| Psig_type of type_declaration list
....
and type_declaration =
{ ptype_name: string loc;
...
}
Rationale:
More self-documented, simplifies some code.
2013-03-06 05:51:18 -08:00
=== More TODOs
- Adapt pprintast.
- Adapt Camlp4 (both its parser(s) and its internal representation of OCaml ASTs).
- Propagate attributes to the Typedtree (so that they can be retrieved in .cmt/.cmti).
- Consider adding hooks to the type-checker so that custom extension expanders can be registered (a la OCaml Templates).
2013-03-06 06:29:04 -08:00
- Quotations (i.e. string literals with custom delimiters and without any interpretation of special characters in them), and a syntax which combines extension nodes and quotations.
- More cleanups to the Parsetree + documentation.
2013-03-07 07:29:11 -08:00
=== Use cases
From https://github.com/gasche/ocaml-syntax-extension-discussion/wiki/Use-Cases
-- Bisect
let f x =
match List.map foo [x; a x; b x] with
| [y1; y2; y3] -> tata
| _ -> assert false [@bisect VISIT]
;;[@@bisect IGNORE-BEGIN]
let unused = ()
;;[@@bisect IGNORE-END]
-- OCamldoc
val stats : ('a, 'b) t -> statistics
[@@doc
"[Hashtbl.stats tbl] returns statistics about the table [tbl]:
number of buckets, size of the biggest bucket, distribution of
buckets by size."
]
[@@since "4.00.0"]
2013-03-07 07:33:13 -08:00
;;[@@doc section 6 "Functorial interface"]
2013-03-07 07:29:11 -08:00
module type HashedType =
sig
type t
[@@doc "The type of the hashtable keys."]
val equal : t -> t -> bool
[@@doc "The equality predicate used to compare keys."]
end
-- type-conv, deriving
type t = {
x : int [@default 42];
y : int [@default 3] @[sexp_drop_default];
z : int [@default 3] @[sexp_drop_if z_test];
} [@@sexp]
type r1 = {
r1_l1 : int;
r1_l2 : int;
} [@@deriving (Dump, Eq, Show, Typeable, Pickle, Functor)]
-- camlp4 map/fold generators
type variable = string
and term =
| Var of variable
| Lam of variable * term
| App of term * term
class map = [%generate_map term]
or:
[%%generate_map map term]
-- ocaml-rpc
type t = { foo [@rpc "type"]: int; bar [@rpc "let"]: int }
[@@ rpc]
or:
type t = { foo: int; bar: int }
[@@ rpc ("foo" > "type"), ("bar" > "let")]
-- pa_monad
2013-03-08 01:36:55 -08:00
begin%monad
2013-03-07 07:29:11 -08:00
a <-- [1; 2; 3];
b <-- [3; 4; 5];
return (a + b)
end
-- pa_lwt
2013-03-08 01:36:55 -08:00
let%lwt x = start_thread foo
2013-03-07 07:29:11 -08:00
and y = start_other_thread foo in
2013-03-08 01:36:55 -08:00
try%lwt
let%for_lwt (x, y) = waiting_threads in
compute blah
2013-03-07 07:29:11 -08:00
with Killed -> bar
-- Bolt
let funct n =
2013-03-08 01:34:02 -08:00
(%log "funct(%d)" n LEVEL DEBUG);
2013-03-07 07:29:11 -08:00
for i = 1 to n do
print_endline "..."
done
-- pre-polyrecord
2013-03-08 01:34:02 -08:00
let r = {%polyrec x = 1; y = ref None }
let () = (%polyrec r.y <- Some 2)
2013-03-07 07:29:11 -08:00
-- orakuda
2013-03-08 01:34:02 -08:00
function%regexp
2013-03-07 07:29:11 -08:00
| "$/^[0-9]+$/" as v -> `Int (int_of_string v#_0)
| "$/^[a-z][A-Za-z0-9_]*$" as v -> `Variable v#_0
| _ -> failwith "parse error"
2013-03-08 01:34:02 -08:00
2013-03-07 07:29:11 -08:00
-- bitstring
let bits = Bitstring.bitstring_of_file "/bin/ls" in
2013-03-08 01:34:02 -08:00
match%bitstring bits with
2013-03-07 07:29:11 -08:00
| [ 0x7f, 8; "ELF", 24, string; (* ELF magic number *)
e_ident, Mul(12,8), bitstring; (* ELF identifier *)
e_type, 16, littleendian; (* object file type *)
e_machine, 16, littleendian (* architecture *)
] ->
printf "This is an ELF binary, type %d, arch %d\n"
e_type e_machine
-- sedlex
let rec token buf =
2013-03-08 01:34:02 -08:00
let%regexp ('a'..'z'|'A'..'Z') = letter in
match%sedlex buf with
| number -> Printf.printf "Number %s\n" (Sedlexing.Latin1.lexeme buf); token buf
| letter, Star ('A'..'Z' | 'a'..'z' | digit) -> Printf.printf "Ident %s\n" (Sedlexing.Latin1.lexeme buf); token buf
| Plus xml_blank -> token buf
| Plus (Chars "+*-/") -> Printf.printf "Op %s\n" (Sedlexing.Latin1.lexeme buf); token buf
| Range(128,255) -> print_endline "Non ASCII"
| eof -> print_endline "EOF"
| _ -> failwith "Unexpected character"
2013-03-07 07:29:11 -08:00
-- cppo
[%%ifdef DEBUG]
[%%define debug(s) = Printf.eprintf "[%S %i] %s\n%!" __FILE__ __LINE__ s]
[%%else]
[%%define debug(s) = ()]
[%%endif]
debug("test")
-- PG'OCaml
let fetch_users dbh =
2013-03-08 01:34:02 -08:00
(%pgsql dbh "select id, name from users")
2013-03-07 07:29:11 -08:00
-- Macaque
2013-03-08 01:34:02 -08:00
let names view = (%view {name = t.name}, t <- !view)