Implement (in utils/binutils.ml) a simple parser for ELF, Mach-O and PE shared object files. Use it to get rid of libbfd in ocamlobjinfo and to improve the checking of external primitives during linking in ocamlc.
Taking advantage of the new closure representation, this PR simplifies the compaction algorithm (3 passes instead of 4) and removes the use of the page table in no-naked-pointers mode.
Instead of the erasure scheme that was used up to now, where we
considered that the type was always principal.
Note: the erasure still happens when polymorphic variants appear in the
patterns, and the type of the scrutinee contains a Reither.
Add a hint when a module is used instead of a module type, when a
module type is used instead of a module, or when a class type is
used instead of a class.
Previously, code areas from native-code DLLs were registered in the page
table and could be queried with the Is_in_code_area macros.
Following commit e4bf109d1 (PR#9682), the runtime system no longer
queries the page table for code areas (it uses the table of code fragments
instead).
A grep through OPAM package sources shows no uses of the Is_in_code_area
macro or of the In_code_area flag for pages.
This commit simply removes the Is_in_code_area macro and the registration
of code areas in the page table.
* camlinternalMod: use closure metadata for copying a closure over another
This change takes care never to write a value into what was
previously a raw field, or conversely: fields that change
category are cleared first.
We also clear the end of the block, to make it easier to reason about
the lifetime of values that could have been referenced there. (We don't
expect this to make a difference in practice.)
* Obj: new submodule Closure giving basic access to closure metadata
* Ensure that Obj.new_block returns a sensible uninitialized closure
* Changes
This commit revises the generic hash functions to take advantage of
the new closure representation: code pointers are directly mixed
into the hash rather than having to be detected using Is_in_value_area.
Currently the new code for closures is activated only in no-naked-pointers
mode, even though it is sound in naked-pointers mode too.
Closes: #2168
We are planning to support two configurations (with or without
naked pointers) in the runtime for at least a couple of years, so I think
it is useful to explain their value models and corresponding runtime
functions/macros. This should help runtime developers reason about the
code written to support both modes.
Introduce type Obj.raw_data and functions Obj.raw_field, Obj.set_raw_field to manipulate out-of-heap pointers in no-naked-pointers mode, and more generally all other data that is not a well-formed OCaml value
* Introducing codefrag: a new runtime module to work with code fragments
This module collects all the operations on code fragments performed in
various places of the runtime system. It applies to both bytecode and
native code.
The implementation is based on skiplists, so that "lookup fragment by
PC" and "lookup fragment by number" are efficient (logarithmic in the
number of code fragments). "Lookup fragment by digest" remains
linear-time.
The new module also improves the handling of digests: now it is
possible to mark a code fragment as "no digest" i.e. not marshal-able.
* Use the new "codefrag" runtime module for marshaling and for the
debugger interface
Replace the previous handling of code fragments with calls to the
functions provided by the "codefrag" runtime module.
It is invalid to use a Lambda.t term twice, because bound variables
would then no longer be unique. If we want to perform a code transformation
that may duplicate subterms in some cases, we have to refresh all bound
variables of the copied subterm.
The present PR implements a function
Lambda.duplicate : lambda -> lambda
that does exactly this. It is implemented by making Lambda.subst
parametrized over a transformation on bound variables.
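To illustrate the idea of a substitution parametrized over a renaming of bound variables, here is a minimal sketch on a toy lambda-calculus. The type `term`, the `fresh` name generator and this `subst` are hypothetical stand-ins; they are not the compiler's Lambda.lambda type or its actual Lambda.subst.

```ocaml
(* Toy terms, NOT the compiler's Lambda.lambda type. *)
type term =
  | Var of string
  | Lam of string * term
  | App of term * term

(* Hypothetical fresh-name generator for the sketch. *)
let counter = ref 0
let fresh x = incr counter; Printf.sprintf "%s/%d" x !counter

(* Substitution parametrized by [rename], a transformation applied to
   every bound variable that the traversal crosses. *)
let rec subst rename env = function
  | Var x ->
      (match List.assoc_opt x env with Some t -> t | None -> Var x)
  | Lam (x, body) ->
      let x' = rename x in
      Lam (x', subst rename ((x, Var x') :: env) body)
  | App (f, a) -> App (subst rename env f, subst rename env a)

(* Duplicating a term is substituting nothing while refreshing
   all bound variables. *)
let duplicate t = subst fresh [] t
```

With this sketch, `duplicate (Lam ("x", Var "x"))` yields a copy whose binder and occurrence share a new name, while free variables are left untouched.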
We want to start allowing more information in the payload of
[@tailcall] attributes (currently no payload is supported), for
example we could consider using [@tailcall false] to ask the code
generator to disable a tail call.
A first required step in this direction is to use a custom datatype to
represent the tail-call attribute, instead of a boolean. This is
consistent with the other application-site
attributes (inline_attribute, specialise_attribute, local_attribute),
so it makes the code more regular -- but the change itself is
boilerplate-y.
This commit introduces a quantity Lambda.max_arity that is the maximal
number of parameters that a Lambda function can have.
Uncurrying is throttled so that, for example, assuming the limit is 10,
a 15-argument curried function fun x1 ... x15 -> e
becomes a 10-argument function (x1...x10) that returns a 5-argument
function (x11...x15).
Concerning untupling, a function that takes a N-tuple of arguments,
where N is above the limit, remains a function that takes a single
argument that is a tuple.
Currently, max_arity is set to 126 in native-code, to match the new
representation of closures implemented by #9619. A signed 8-bit field
is used to store the arity. 126 instead of 127 to account for the
extra "environment" argument.
In bytecode the limit is infinity (max_int) because there is no need
yet for a limit on the number of parameters.
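The throttling of uncurrying described above can be sketched as a function that splits a parameter list into groups of at most `max_arity` parameters, each group becoming one function in a chain. The name `split_params` and the list-of-strings representation are assumptions made for illustration, not the compiler's actual code.

```ocaml
(* Split [params] into consecutive groups of at most [max_arity]
   parameters. Hypothetical helper, for illustration only. *)
let split_params ~max_arity params =
  if max_arity <= 0 then invalid_arg "split_params";
  (* Take up to [n] elements from the front of the list. *)
  let rec take n acc = function
    | x :: rest when n > 0 -> take (n - 1) (x :: acc) rest
    | rest -> (List.rev acc, rest)
  in
  let rec go = function
    | [] -> []
    | ps ->
        let group, rest = take max_arity [] ps in
        group :: go rest
  in
  go params
```

With a limit of 10, the 15 parameters x1 ... x15 are split into a group of 10 followed by a group of 5, matching the example in the text.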
This PR fixes an old bug in the interaction between [merge_constraint]
and [Typedecl.transl_with_constraint], where
variance (and now separability) are recomputed in an invalid type
environment. See #9624 and the new tests.
This patch makes Unix.time and Unix.gettimeofday unboxed and @noalloc, which makes them about 20% faster (as measured by a crude benchmark that calls them many times in a loop).
This removes the fallback and error-handling paths from gettimeofday. Neither is needed according to the Single Unix Specification and POSIX.
Fixes: #7446
* Undefine the CAML_DEBUG_SOCKET variable early
So that if the debugged program creates or executes another program
that happens to be an OCaml bytecode executable, said program does
not try to connect to the debugger at beginning of execution.
Fixes: #8678
* Check availability of setenv() and unsetenv()
And guard the use of unsetenv() in runtime/debugger.c.
In -no-flat-float-array mode, instead of always returning
`best_msig` (the most permissive signature), we first compute the
flat-float-array separability signature -- falling back to `best_msig`
if it fails.
This discipline is conservative: it never rejects -no-flat-float-array
programs. At the same time it guarantees that, for any program that is
also accepted in -flat-float-array mode, the same separability will be
inferred in the two modes. In particular, the same .cmi files and
digests will be produced.
Before we introduced this hack, the production of different .cmi files
would break the build system of the compiler itself, when trying to
build a -no-flat-float-array system from a bootstrap compiler itself
using -flat-float-array. See #9291.
Instead of using the stdlib logf function for computing logarithms, we
use a faster polynomial-based approximation.
We use the xoshiro PRNG instead of the Mersenne Twister. xoshiro is
simpler and faster.
We generate samples by batches so that compilers can vectorize the
generation loops using SIMD instructions when possible.
This module provides a purely sequential implementation of the
concurrent atomic references provided by the Multicore OCaml
standard library:
https://github.com/ocaml-multicore/ocaml-multicore/blob/parallel_minor_gc/stdlib/atomic.mli
This sequential implementation is provided in the interest of
compatibility: when people start writing code to run on
Multicore, it would be nice if their use of Atomic was
backward-compatible with older versions of OCaml without having to
import additional compatibility layers.
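As a rough illustration of what a purely sequential implementation of atomic references can look like, here is a sketch built on a mutable record. The module name `Seq_atomic` is hypothetical, and the set of operations shown is an assumption modelled on the usual atomic-reference interface; the actual stdlib module may be implemented differently.

```ocaml
(* Sequential sketch of atomic references; without parallelism every
   operation is trivially atomic. Hypothetical module name. *)
module Seq_atomic : sig
  type 'a t
  val make : 'a -> 'a t
  val get : 'a t -> 'a
  val set : 'a t -> 'a -> unit
  val exchange : 'a t -> 'a -> 'a
  val compare_and_set : 'a t -> 'a -> 'a -> bool
  val fetch_and_add : int t -> int -> int
end = struct
  type 'a t = { mutable v : 'a }
  let make v = { v }
  let get r = r.v
  let set r v = r.v <- v
  (* Store [v] and return the previous contents. *)
  let exchange r v = let old = r.v in r.v <- v; old
  (* Physical equality is used for the comparison in this sketch. *)
  let compare_and_set r seen v =
    if r.v == seen then (r.v <- v; true) else false
  (* Add [n] and return the previous contents. *)
  let fetch_and_add r n = let old = r.v in r.v <- old + n; old
end
```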
Otherwise, arguments get split at spaces.
This is the same quoting that the Win32 implementation of
Unix.create_process does.
A test was added.
Fixes: #9320
Some file systems maintain time stamps with sub-second resolution, up
to nanosecond resolution. When converting from a "(seconds, nanoseconds)"
timestamp to a floating-point timestamp, rounding to nearest can produce
"seconds + 1" as a result, i.e. the integer part of the FP timestamp
is not equal to "seconds". As described in #9490, this is a problem
in some cases.
This commit implements a more careful conversion of "(seconds,
nanoseconds)" pairs to FP timestamps so that the integer part of the
FP result is always "seconds".
Both the otherlibs/unix and the otherlibs/win32unix implementations are affected
and corrected.
Closes: #9490
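The careful conversion can be sketched in OCaml as follows. This is an illustration of the idea only, assuming 0 <= nsec < 10^9; the function name `timestamp` is hypothetical, and the actual fix lives in the C stubs of otherlibs/unix and otherlibs/win32unix.

```ocaml
(* Convert a (seconds, nanoseconds) pair to a float whose integer part
   is always [sec]. Sketch only; assumes 0 <= nsec < 1_000_000_000. *)
let timestamp ~sec ~nsec =
  let s = Float.of_int sec in
  (* The fraction may be rounded, but stays below 1.0. *)
  let n = Float.of_int nsec /. 1e9 in
  (* The sum may round up to exactly s + 1.0. *)
  let t = s +. n in
  (* In that case, step back to the largest float below s + 1.0. *)
  if t = s +. 1.0 then Float.pred t else t
```

For instance, with sec = 1_000_000_000 and nsec = 999_999_999, the naive sum rounds to 1000000001.0, whereas `timestamp` returns a value whose integer part is still 1000000000.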
Before, each head construction had a `make_<foo>_matching` construct that
was responsible for three things:
- consuming the argument from the argument list
(the "argument" is a piece of lambda code to access
the value of the current scrutinee)
- building arguments for the subpatterns of the scrutinee
and pushing them to the argument list
- building a `cell` structure out of this, representing a head group
during compilation
Only the second point is really specific to each construction.
This refactoring turns this second point into a construct-specific
`get_expr_args_<foo>` function (similarly to `get_pat_args_<foo>`),
and moves the first and third point to a generic `make_matching`
function.
Note: this commit contains a minor improvement to the location used to
force (lazy ..) arguments.
OCaml's signal handlers need to know the state of the processor when the signal was received. The approach standardized by the Single Unix specification is to use 3-argument signal handling functions and the SA_SIGINFO flag to sigaction. However, as the Linux man page for sigaction says,
Undocumented:
Before the introduction of SA_SIGINFO, it was also possible to get some
additional information, namely by using a sa_handler with a second
argument of type struct sigcontext. See the relevant Linux kernel
sources for details. This use is obsolete now.
For historical reasons, the i386, PowerPC and s390x Linux ports of OCaml still use the undocumented, obsolete approach, while the other Linux ports use the modern SA_SIGINFO approach.
For consistency and future-proofing, this PR updates the i386, PowerPC64, and s390x Linux ports to use the modern approach to signal handling. PowerPC32 was left as before because of technical subtleties and lack of hardware to test changes.
The new code for PowerPC 64 bits includes the trick proposed in PR#1795 to support Musl in addition to Glibc on ppc64le.
On FreeBSD, libbfd is - similarly to OpenBSD - not installed by default, but
can be added via the devel/libbfd port. This will install libbfd into the
/usr/local prefix, thus configure needs to look there for include and library
caml_alloc returns initialised blocks for tag < No_scan_tag. Otherwise,
initialise the blocks as necessary.
For Abstract_tag, Double_tag and Double_array_tag, the initial contents
are irrelevant.
Uninitialised Custom_tag objects are difficult to use correctly. Hence,
reject custom block allocations through Obj.new_block.
For String_tag, the last byte encodes the string length. Hence, reject
zero-length string objects. Initialise the last byte which encodes the
length to ensure non-negative lengths for uninitialised strings.
- Add the key argument in the description of 'merge'
- Note that the keys of 'union f m1 m2' are a subset of the
input keys, not all the keys, since
bindings (union (fun _ _ _ -> None) m m) = []
- Fix grammar in the descriptions of 'filter', 'union', 'merge'
- Fix mismatched variable name in the description of 'partition'
- Note that 'find' and 'find_opt' return values, not bindings
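The documented behaviours can be checked with a small example. The module name `M` and the sample bindings are chosen here for illustration.

```ocaml
module M = Map.Make (Int)

let m = M.add 1 "one" (M.add 2 "two" M.empty)

(* When the combining function returns None, the common key is dropped,
   so the result's keys are only a subset of the input keys. *)
let dropped = M.union (fun _key _v1 _v2 -> None) m m
let () = assert (M.bindings dropped = [])

(* Keys present in only one of the two maps are kept as they are. *)
let kept = M.union (fun _key _v1 _v2 -> None) m (M.singleton 3 "three")
let () = assert (M.bindings kept = [ (1, "one"); (2, "two"); (3, "three") ])

(* 'find' and 'find_opt' return the value, not the (key, value) binding. *)
let () = assert (M.find 1 m = "one")
let () = assert (M.find_opt 3 m = None)
```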
The instrumentation code in the instrumented runtime was replaced
with new APIs to gather runtime statistics and output them in a new format
(Common Trace Format).
This commit also exposes new functions in the Gc module to pause or resume
instrumentation during a program execution (Gc.eventlog_pause and
Gc.eventlog_resume).
Use a variable-length encoding (suggested by @stedolan) for dimensions that supports dimensions up to 2^64-1 each.
Change the marshalling identifier for bigarrays: _bigarray ~> _bigarr02
The identifier change reflects a change in the bigarray marshalling format.
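The message does not spell out the encoding. As one possible illustration of a variable-length scheme, the sketch below assumes that small dimensions are written as a 16-bit integer and that the value 0xffff acts as an escape followed by a full 64-bit integer. The function names are hypothetical and the actual marshalling format may differ. Since OCaml's `int` is narrower than 64 bits, this sketch cannot represent the full range up to 2^64-1.

```ocaml
(* Assumed escape value announcing a 64-bit dimension. *)
let escape = 0xffff

(* Append one dimension to [buf]: 2 bytes if small, 2 + 8 bytes otherwise. *)
let write_dim buf dim =
  if dim < 0 then invalid_arg "write_dim";
  if dim < escape then Buffer.add_uint16_be buf dim
  else begin
    Buffer.add_uint16_be buf escape;
    Buffer.add_int64_be buf (Int64.of_int dim)
  end

(* Read one dimension at [pos]; return it with the next read position. *)
let read_dim bytes pos =
  let short = Bytes.get_uint16_be bytes pos in
  if short < escape then (short, pos + 2)
  else (Int64.to_int (Bytes.get_int64_be bytes (pos + 2)), pos + 10)
```

The design keeps the common case (small dimensions) compact while still allowing very large dimensions at the cost of a few extra bytes.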
- Deprecate Thread.kill, Thread.wait_read, Thread.wait_write
for reasons explained in the documentation comments.
- Update documentation comments for Thread functions.
- Deprecate ThreadUnix module, documented as deprecated since 2002
(commit 0a47a75d56).
- In the manual, remove leftover mentions of the VM threads
implementation; focus on the system threads implementation.
- In the manual, remove the documentation of ThreadUnix.
Closes: #9206
It's not just on Windows that the length of the command passed to Sys.command
can exceed system limits:
- On Linux there is a per-argument limit of 2^17 bytes
(the whole command is a single argument to /bin/sh)
- On macOS with default parameters, the limit is between 70000 and 80000
- On BSDs with default parameters, the limit is around 2^18.
In parallel, response files (@file) are supported by all the C compilers
we've encountered: GCC, Clang, Intel's ICC, and even CompCert. They all
seem to follow quoting rules similar to those of the native shell
for the contents of the response file.
This commit forces the use of a response file when the total size of
the arguments to the linker is greater than 2^16.
Arguments passed via a response file are quoted using Filename.quote
as if they were passed on the command line.
Closes: #9482
Closes: #8549
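The decision rule for response files can be sketched as follows. The type `linker_args`, the function `prepare_args` and the choice of a newline as separator inside the response file are assumptions made for illustration; only the 2^16 threshold and the use of Filename.quote come from the description above.

```ocaml
(* How the arguments reach the linker. Hypothetical type. *)
type linker_args =
  | Inline of string          (* arguments placed on the command line *)
  | Response_file of string   (* contents to write to an @file *)

let threshold = 1 lsl 16

let prepare_args args =
  (* Arguments are quoted the same way in both cases. *)
  let quoted = List.map Filename.quote args in
  let total = List.fold_left (fun n a -> n + String.length a) 0 args in
  if total > threshold
  then Response_file (String.concat "\n" quoted)
  else Inline (String.concat " " quoted)
```

A caller would write the `Response_file` contents to a temporary file and pass `@` followed by that file's name to the C compiler.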
* Don't include stdio.h in caml/misc.h
There is no need to include stdio.h in caml/misc.h.
This seems to have happened by accident in commit cddec18fde.
On OpenBSD, stderr and stdout are macros defined in stdio.h.
ppx_expect uses stderr and stdout as identifiers in
collector/expect_test_collector_stubs.c, where caml/misc.h is included.
This confuses the C compiler, because the macros get expanded where an identifier is expected.
* Remove fallback NULL definition in caml/misc.h
ISO C guarantees that NULL is defined in <stddef.h>.
* Include missing stdio.h in tests/compatibility/stub.c
(This commit is more tricky than the previous ones in the patchset
and requires a careful review.)
This refactoring clarifies and simplifies the specialize_matrix logic
by getting rid of the OrPat exception used in a higher-order
way (or sometimes not used in certain matchers, when it is possible to
"push" the or-pattern down in the pattern). Instead it uses an
arity-based criterion to implement the or-pattern-related logic once
in the specializer, instead of having to implement it in each
matcher. As a result, the compiler improves a bit as it will push
or-patterns down during specialization in valid situations that were
not implemented before -- probably because they are not terribly
important in practice: all constant and arity-1 constructs benefit
from optimized or-pattern handling, in particular the following are
new:
- lazy patterns
- non-constant polymorphic variants
- size-one records and arrays
We don't want to record the state of the file system at the start
of the compilation in the compiled files.
Consequently, we only add persistent modules to the env summary
if they have an observable action on the initial environment.
This is only the case if they shadow a non-persistent module of the
initially opened library (which can only be Stdlib currently).
Alpine Linux and perhaps other musl-based Linux distributions produce
position-independent executables (PIEs) by default. If non-PIC object
files are given to the linker, it silently produces a wrong executable
that crashes when run. This is the case for ocamlopt-generated code,
which by default is not PIC except on amd64 (x86_64) and s390x (Z systems).
Closes: #7562