Document the Unicode mode for Windows (#1360)
Also: more detailed instructions related to FlexDLL.
parent 5741d2fda3
commit b46b5cea71
@@ -73,9 +73,10 @@ https://github.com/alainfrisch/flexdll. A binary distribution is available;
 instructions on how to build FlexDLL from sources, including how to bootstrap
 FlexDLL and OCaml are given <<seflexdll,later in this document>>. Unless you
 bootstrap FlexDLL, you will need to ensure that the directory to which you
-install FlexDLL is included in your `PATH` environment variable. Note: if you
-use Visual Studio 2015 or Visual Studio 2017, the binary distribution of
-FlexDLL will not work and you must build it from sources.
+install FlexDLL is included in your `PATH` environment variable. Note: binary distributions
+of FlexDLL are compatible only with certain versions of Visual Studio; for instance,
+version 0.36 of FlexDLL requires Visual Studio 2015 or above, while earlier versions
+require older versions of Visual Studio.
 
 The base bytecode system (ocamlc, ocaml, ocamllex, ocamlyacc, ...) of all three
 ports runs without any additional tools.
@@ -339,6 +340,55 @@ compiling `world`, you must compile `flexdll`, i.e.:
 installed FlexDLL, you must erase the contents of `flexdll/` before
 compiling.
 
+== Unicode support
+
+Prior to version 4.06, all filenames on the OCaml side were assumed
+to be encoded using the current 8-bit code page of the system. Some
+Unicode filenames could thus not be represented. Since version 4.06,
+OCaml adds to this legacy mode a new "Unicode" mode, where filenames
+are UTF-8 encoded strings. In addition to filenames,
+this applies to environment variables and command-line arguments.
+
+The mode must be decided before building the system, by setting
+the `WINDOWS_UNICODE` variable in `config/Makefile`. A value of 1
+enables the new "Unicode" mode, while a value of 0 maintains
+the legacy mode.
+
+Technically, both modes use the Windows "wide" API, where filenames
+and other strings are made of 16-bit entities, usually interpreted as
+UTF-16 encoded strings.
+
+Some more details about the two modes:
+
+* Unicode mode: OCaml strings are interpreted as being UTF-8 encoded
+and translated to UTF-16 when calling Windows; strings returned by
+Windows are interpreted as UTF-16 and translated to UTF-8 on their
+way back to OCaml. Additionally, an OCaml string which is not
+valid UTF-8 will be interpreted as being in the current 8-bit code
+page. This fallback works well in practice, since the chances of
+a non-ASCII string encoded in an 8-bit code page being a valid
+UTF-8 string are tiny. This means that filenames
+obtained from e.g. an 8-bit UI or database layer would continue to
+work fine. Applications written for the legacy mode or older
+versions of OCaml might still break if strings returned by
+Windows (e.g. for `Sys.readdir`) are sent to components expecting
+strings encoded in the current code page.
+
+* Legacy mode: this mode closely emulates the behavior of OCaml <
+4.06 and is thus the safest choice in terms of backward
+compatibility. In this mode, OCaml programs can only work with
+filenames that can be encoded in the current code page, and the
+same applies to the OCaml tools themselves (ocamlc, ocamlopt, etc).
+
+The legacy mode will be deprecated and then removed in future versions
+of OCaml. Users are thus strongly encouraged to use the Unicode mode
+and adapt their existing code bases accordingly.
+
+Note: in order for OCaml tools to support Unicode pathnames, it is
+necessary to use a version of FlexDLL which has itself been compiled
+with OCaml >= 4.06 in Unicode mode. This is the case for binary distributions
+of FlexDLL starting from version 0.36.
+
 == Trademarks
 
 Microsoft, Visual C++, Visual Studio and Windows are registered trademarks of
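
Review note on the "Unicode mode" bullet added in this hunk: the fallback rule it describes (treat a string as UTF-8 when it is valid UTF-8, otherwise as text in the current 8-bit code page) can be illustrated with a minimal OCaml sketch. This is not the runtime's actual implementation, which the diff does not show; the names `is_valid_utf8`, `interpretation` and `classify` are hypothetical and exist only for illustration.

```ocaml
(* Hypothetical sketch of the decision described for Unicode mode.
   It only decides how a byte string would be interpreted; it performs
   no conversion to UTF-16 and calls no Windows API. *)

(* Return true if [s] is well-formed UTF-8: no overlong encodings,
   no surrogate code points, nothing above U+10FFFF. *)
let is_valid_utf8 (s : string) : bool =
  let n = String.length s in
  (* Byte [i] exists and is a continuation byte (10xxxxxx). *)
  let cont i = i < n && Char.code s.[i] land 0xC0 = 0x80 in
  (* Byte [i] exists and lies in the inclusive range [lo, hi]. *)
  let in_range i lo hi =
    i < n && (let c = Char.code s.[i] in lo <= c && c <= hi) in
  let rec go i =
    if i >= n then true
    else
      let c = Char.code s.[i] in
      if c < 0x80 then go (i + 1)
      else if c >= 0xC2 && c <= 0xDF then cont (i + 1) && go (i + 2)
      else if c = 0xE0 then
        in_range (i + 1) 0xA0 0xBF && cont (i + 2) && go (i + 3)
      else if c = 0xED then
        in_range (i + 1) 0x80 0x9F && cont (i + 2) && go (i + 3)
      else if c >= 0xE1 && c <= 0xEF then
        cont (i + 1) && cont (i + 2) && go (i + 3)
      else if c = 0xF0 then
        in_range (i + 1) 0x90 0xBF && cont (i + 2) && cont (i + 3)
        && go (i + 4)
      else if c >= 0xF1 && c <= 0xF3 then
        cont (i + 1) && cont (i + 2) && cont (i + 3) && go (i + 4)
      else if c = 0xF4 then
        in_range (i + 1) 0x80 0x8F && cont (i + 2) && cont (i + 3)
        && go (i + 4)
      else false
  in
  go 0

(* How a string passed to the system would be interpreted (assumed names). *)
type interpretation = Utf8 | Code_page

(* Valid UTF-8 wins; anything else falls back to the current code page. *)
let classify (s : string) : interpretation =
  if is_valid_utf8 s then Utf8 else Code_page
```

In this sketch, `"caf\xe9.txt"` stands for a name produced under a Latin-1-style code page: the lone byte 0xE9 is not valid UTF-8, so `classify` selects the code-page branch, whereas the UTF-8 encoding `"caf\xc3\xa9.txt"` is accepted as UTF-8. A pure ASCII name is valid in both readings, which is why the documentation only discusses non-ASCII strings.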