This is done by using a local entry array for each thread, containing
tracked blocks whose allocation callback has not yet been called.
This allows some simplification in the code running callbacks for
young allocations. Indeed, since the entry array is local to one
thread, we know for sure that it cannot be modified during a callback,
and therefore we no longer need to remember the indices of the
corresponding new entries.
The instrumentation code in the instrumented runtime was replaced
with new APIs to gather runtime statistics and output them in a new format
(Common Trace Format).
This commit also exposes new functions in the Gc module to pause or resume
instrumentation during a program execution (Gc.eventlog_pause and
Gc.eventlog_resume).
They are somewhat difficult to handle for native allocations, and it is not clear how useful they are. Moreover, they are easy to add back since [Gc.Memprof.allocation] is a private record.
We need to make sure they do not share a page with code, otherwise the
GC and the polymorphic hash and comparison functions will follow code
pointers and potentially trigger a segfault.
The user can register several callbacks, which are called for various
events during the block's lifetime. We need to maintain a data
structure for tracked blocks in the runtime. When using threads,
callbacks can be called concurrently in a reentrant way, so the
functions manipulating this data structure need to be reentrant.
In debug builds, the minor GC now asserts that young_ptr points to
a valid minor heap header before starting GC. Since very few bit
patterns are valid minor heap headers, this is unlikely to be true
by coincidence.
This patch also ensures that minor allocations have color 0. This
was inconsistent between backends before.
This code is adapted from jhjourdan's 2c93ca1e711. Comballoc is
extended to keep track of allocation sizes and debug info for each
allocation, and the frame table format is modified to store them.
The native code GC-entry logic is changed to match bytecode, by
calling the garbage collector at most once per allocation.
amd64 only, for now.
point following every minor collection or major slice.
Also run signal handlers first.
Indeed, in some cases, caml_something_to_do is not reliable (spotted
by @jhjourdan):
* We could get into caml_process_pending_actions when
caml_something_to_do is seen as set but not caml_pending_signals,
making us miss the signal.
* If there are two different callbacks (say, a signal and a finaliser)
arriving at the same time, then we set caml_something_to_do to 0
when starting the first one while the second one is still waiting.
We may want to run the second one if the first one is taking long.
In the latter case, the additional fix is to favour signals, which
have a lower latency requirement, whereas the latency of finalisers
keeps the same order of magnitude, and memprof callbacks are served on
a best-effort basis.
Introduce caml_process_pending_actions and
caml_process_pending_actions_exn: a variant of the former which does
not raise but returns a value that has to be checked against
Is_exception_value.
I keep the current conventions from caml_callback{,_exn}: For a
resource-safe interface, we mostly care about the _exn variants, but
every time there is a public _exn function I provide a function that
raises directly for convenience.
They are introduced and documented in caml/signals.h.
Private functions are converted to their _exn variant on the way as
needed: for internal functions of the runtime, it is desirable to go
towards a complete elimination of functions that raise implicitly.
Get rid of the distant logic of caml_raise_in_async_callback. Instead,
caml_process_pending_events takes care itself of its something_to_do
"resource". This avoids calling the former function in places
unrelated to asynchronous callbacks.
caml_minor_collection is not a new function, it is already used in the
wild and used to be exposed in caml/minor_gc.h in the past. The change
simply amounts to replacing a call to caml_request_minor_gc() with an
assignment requested_minor_gc = 1, bypassing
caml_set_something_to_do().
In 8691, caml_check_urgent_gc was merged with the function that runs
asynchronous callbacks. The rationale was that caml_check_urgent_gc
already runs finalisers, and so could have run any asynchronous
callbacks.
We agreed on a different strategy: we know that users could not rely
on asynchronous callbacks being called at this point, so take the
opportunity to make it callback-safe, like was done for allocation
functions.
The new check_urgent_gc no longer calls finalisers (nor any
callbacks), and instead two internal functions are introduced:
* caml_do_urgent_gc_and_callbacks : function to perform actions
unconditionally.
* caml_check_urgent_gc_and_callbacks : function that checks for
something to do, and then executes all actions (GC and callbacks).
* Fix free identifiers in spacetime
* Fix free identifiers in tools/gdb-macros
* [minor] Fix Caml_state fields in comments, and other comment updates
* Changes
This make us able to get rid of to xxx_to_do variables in `final.c`
and `memprof.c`. The variable is reset to 0 when entering
`caml_check_urgent_gc`, which is now the main entry point for
asynchronous callbacks. In case a callback raises an exception, we
need to set it back to 1 to make sure no callback is missed.
The finalizers and all the other asynchronous callbacks (including
signal handlers, memprof callbacks and finalizers) are now called in a
common function, [caml_async_callbacks]. It is called in
[caml_check_urgent_gc] and wherever [caml_process_pending_signals] was
called.
This makes it possible to simplify the [caml_gc_dispatch] logic by
removing the loop it contains, since it no longer calls finalizers.
Allocations ignored by this version
- Marshalling
- In the minor heap by natively-compiled OCaml code
Allocations potentially sampled
- In the major heap
- In the minor heap by C code and OCaml code in bytecode mode