143 Commits (master)

Author SHA1 Message Date
Nicolás Ojeda Bär 43883ae4bc Remove labels after calls, checkbound, and GC points 2020-10-08 20:28:15 +02:00
Nicolás Ojeda Bär 540996d21e Remove Spacetime 2020-10-08 20:28:12 +02:00
Xavier Leroy 8e246c41c2 Revised detection of arithmetic instructions with immediate operands, continued
- Rewrite the `is_immediate` methods in $ARCH/selection.ml in the style of
  other selection methods: operations that need platform-dependent handling
  are explicitly listed, all others fall through `super#is_immediate`.

- The `is_immediate` method from selectgen.ml knows how to handle shifts
  (and no other operation).  Remove the `select_shift_op` method,
  now unnecessary.

- ARM: remove special cases for multiply and multiply-high, no longer
  necessary.

- RISC-V: in emit.mlp, remove implementation of checkbound immediate,
  which is no longer generated.
2020-09-21 14:49:16 +02:00
Xavier Leroy 65544ffd1f Revised detection of arithmetic instructions with immediate operands
Replace the single `is_immediate n` method that is supposed to apply
to all arithmetic instructions by two methods:

`is_immediate op n` : tests whether `n` is in the range of supported
   immediate arguments for integer operation `op`
`is_immediate_test cmp n` : tests whether `n` is in the range of supported
   immediate arguments for integer comparison `cmp`

This makes it easier to handle operations without immediate operands
(e.g. multiply or multiply-high on many platforms) and operations with
specific ranges of immediate operands (e.g. N-bit unsigned versus
N-bit signed).  Before, these operations had to be treated as special
cases in the platform-specific `select_operation` method.
2020-09-16 11:52:19 +02:00
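The split into `is_immediate op n` and `is_immediate_test cmp n`, together with the fall-through to `super#is_immediate` described in the follow-up commit, can be sketched roughly as below. This is a minimal illustration, not the compiler's code: the types `int_op` and `int_cmp`, the classes `generic_selector` and `platform_selector`, and the 12-bit immediate ranges are all assumptions made for the example.

```ocaml
(* Hypothetical, simplified stand-ins for the compiler's integer
   operation and comparison types. *)
type int_op = Iadd | Isub | Imul | Imulh | Iand | Ior | Ilsl | Ilsr | Iasr
type int_cmp = Ceq | Cne | Clt | Cge

(* Generic selector: it only knows how to handle shifts, whose amount
   must fit in the word size (a 64-bit target is assumed here). *)
class generic_selector = object
  method is_immediate (op : int_op) (n : int) =
    match op with
    | Ilsl | Ilsr | Iasr -> n >= 0 && n < 64
    | _ -> false
  method is_immediate_test (_ : int_cmp) (_ : int) = false
end

(* Hypothetical platform: 12-bit signed immediates for add, sub and
   comparisons, 12-bit unsigned immediates for logical operations,
   and no immediate form for multiply or multiply-high. *)
class platform_selector = object
  inherit generic_selector as super
  method! is_immediate op n =
    match op with
    | Iadd | Isub -> n >= -2048 && n <= 2047
    | Iand | Ior -> n >= 0 && n <= 4095
    | _ -> super#is_immediate op n   (* shifts and everything else *)
  method! is_immediate_test _cmp n = n >= -2048 && n <= 2047
end
```

Operations that need platform-specific ranges are listed explicitly; multiply and multiply-high need no special case because they fall through to the generic method, which rejects them.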
Greta Yorsh 2bb2bde74c
Prologue size should not depend on stack_offset (power, arm64) (#9083)
* Prologue size does not depend on stack_offset (power, arm64)

Define `initial_stack_offset` of a function, independently
of stack_offset, and use it to compute both frame_size and
prologue_size.
2020-09-03 13:26:00 +02:00
Xavier Leroy 9fcb295b98 Revised passing of arguments to external C functions
Introduce the type Cmm.exttype to precisely describe arguments to
external C functions, especially unboxed numerical arguments.

Annotate Cmm.Cextcall with the types of the arguments (Cmm.exttype list).
An empty list means "all arguments have default type XInt".

Annotate Mach.Iextcall with the type of the result (Cmm.machtype)
and the types of the arguments (Cmm.exttype list).

Change (slightly) the API for describing calling conventions in Proc:
- loc_external_arguments now takes a Cmm.exttype list,
  in order to know more precisely the types of the arguments.
- loc_arguments, loc_parameters, loc_results, loc_external_results
  now take a Cmm.machtype instead of an array of pseudoregisters.
  (Only the types of the pseudoregisters mattered anyway.)

Update the implementations of module Proc accordingly, in every port.

Introduce a new overridable method in Selectgen, insert_move_extcall_arg,
to produce the code that moves an argument of an external C function
to the locations returned by Proc.loc_external_arguments.

Revise the selection of external calls accordingly
(method emit_extcall_args in Selectgen).
2020-07-24 17:39:22 +02:00
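The convention that an empty annotation list means "all arguments have default type XInt" can be illustrated with a small sketch. The `exttype` definition below is a simplified stand-in whose constructor set is an assumption, and `arg_types` is a hypothetical helper, not a function from the compiler.

```ocaml
(* Simplified stand-in for Cmm.exttype; the exact constructor set is
   assumed for this example. *)
type exttype = XInt | XInt32 | XInt64 | XFloat

(* Hypothetical helper: expand the annotation carried by an external
   call into one type per argument.  An empty list means that every
   argument has the default type XInt. *)
let arg_types (annot : exttype list) (nargs : int) : exttype list =
  match annot with
  | [] -> List.init nargs (fun _ -> XInt)
  | tys ->
    if List.length tys <> nargs then invalid_arg "arg_types: arity mismatch";
    tys
```

With this helper, `arg_types [] 3` describes three default arguments, while `arg_types [XFloat; XInt32] 2` keeps the explicit annotation unchanged.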
David Allsopp b6c8b35e2d
Make -flarge-toc the default for PowerPC (#9557)
Introduce -fsmall-toc in order to access the previous behaviour and
document both options in the manual and ocamlopt manpage.
2020-05-13 18:23:37 +02:00
Stephen Dolan 7fe360401b Per-architecture support for allocation size info in frame tables.
amd64: remove caml_call_gc{1,2,3} and simplify caml_alloc{1,2,3,N}
       by tail-calling caml_call_gc.

i386:  simplify caml_alloc{1,2,3,N} by tail-calling caml_call_gc.
       these functions do not need to preserve ebx.

arm:   simplify caml_alloc{1,2,3,N} by tail-calling caml_call_gc.
       partial revert of #8619.

arm64: simplify caml_alloc{1,2,3,N} by tail-calling caml_call_gc.
       partial revert of #8619.

power: partial revert of #8619.
       avoid restarting allocation sequence after failure.

s390:  partial revert of #8619.
       avoid restarting allocation sequence after failure.
2019-10-23 09:24:13 +01:00
Stephen Dolan 768dcce48f Use allocation-size info on more than just amd64.
Moves the alloc_dbginfo type to Debuginfo, to avoid a circular
dependency on architectures that use Branch_relaxation.

This commit generates frame tables with allocation sizes on all
architectures, but does not yet update the allocation code for
non-amd64 backends.
2019-10-22 11:47:31 +01:00
Greta Yorsh cae89d4e1b Pass num_stack_slots as argument 2019-09-11 18:48:20 +01:00
Greta Yorsh aeebb62e9b Move contains_calls and num_stack_slots from Proc to Mach.fundecl 2019-09-09 11:33:03 +01:00
Greta Yorsh 0b6b544fcb Split Linearize into two modules
Separate the description of the IR from the transformations
performed on it by moving type declarations from linearize.ml
into their own file, called linear.ml.
2019-09-04 11:55:11 +01:00
KC Sivaramakrishnan 4dab86ad54 Fix domain state field offset for power (32-bit)
The domain state fields are always aligned at 8-byte offsets. This
ensures that even on a 32-bit architecture, where pointers are 32 bits
and doubles are 64 bits, the offset calculation remains the same as on
64-bit architectures.
2019-08-23 09:50:05 +05:30
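The offset rule described above amounts to the following computation. This is a sketch under the stated assumption that every field occupies one 8-byte slot; the names `domain_field_size` and `domain_field_offset` are illustrative and not taken from the compiler.

```ocaml
(* Every domain state field occupies an 8-byte slot, whatever the
   pointer size of the target. *)
let domain_field_size = 8

(* Byte offset of the field at 0-based position [idx] (hypothetical
   indexing scheme).  The result does not depend on Sys.word_size, so
   32-bit and 64-bit targets compute the same offsets. *)
let domain_field_offset idx = idx * domain_field_size
```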
KC Sivaramakrishnan f7920a127f Domain state works on Power64 2019-08-23 09:50:05 +05:30
Greta Yorsh d8b6a1713b Add pseudo-instruction `Ladjust_trap_depth` (#2322)
Ladjust_trap_depth replaces dummy Lpushtrap generated in linearize of
Iexit to notify assembler generation about updates to the
stack. Ladjust_trap_depth is used to keep the virtual stack pointer in
sync and emit dwarf information, without emitting any assembly
instructions. It therefore avoids generating dead code.

This patch is extracted from PR1482 @lthls
2019-06-24 14:18:37 +01:00
Stephen Dolan c24e5b5c8a Ensure that Gc.minor_words remains accurate after a GC (#8619)
If an allocation fails, the decrement of young_ptr should be undone
before the GC is entered. This happened correctly on bytecode but not
on native code.

This commit (squash of pull request #8619) fixes it for all the
platforms supported by ocamlopt.

amd64: add alternate entry points caml_call_gc{1,2,3} for code size
optimisation.

powerpc: introduce one GC call point per allocation size per function.
Each call point corrects the allocation pointer r31 before calling
caml_call_gc.

i386, arm, arm64, s390x: update the allocation pointer after the
conditional branch to the GC, not before.

arm64: simplify the code generator: Ialloc can assume that less than
0x1_0000 bytes are allocated, since the max allocation size for the
minor heap is less than that.

This is a partial cherry-pick of commit 8ceec on multicore.
2019-05-04 10:01:23 +02:00
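The fix described in #8619 (undo the decrement of young_ptr before entering the GC) can be modelled with a toy allocator. Everything below is an illustrative assumption: the `heap` record, its field names, and the `counted_bytes` accumulator stand in for the runtime's real state and for the accounting behind Gc.minor_words.

```ocaml
(* Toy minor heap that grows downwards from [young_end] to
   [young_start].  All names are hypothetical. *)
type heap = {
  young_start : int;
  young_end : int;
  mutable young_ptr : int;
  mutable counted_bytes : int;  (* bytes accounted for at each minor GC *)
}

let create size =
  { young_start = 0; young_end = size; young_ptr = size; counted_bytes = 0 }

(* A minor GC records how much was allocated, then empties the heap. *)
let minor_gc h =
  h.counted_bytes <- h.counted_bytes + (h.young_end - h.young_ptr);
  h.young_ptr <- h.young_end

(* Allocate [size] bytes and return the new allocation pointer. *)
let alloc h size =
  h.young_ptr <- h.young_ptr - size;
  if h.young_ptr < h.young_start then begin
    (* The allocation failed: undo the decrement before entering the
       GC, so that the GC does not count bytes that were never
       allocated. *)
    h.young_ptr <- h.young_ptr + size;
    minor_gc h;
    h.young_ptr <- h.young_ptr - size
  end;
  h.young_ptr
```

In a 100-byte heap, two 60-byte allocations trigger one collection; because the failed decrement is undone first, the collection accounts for exactly 60 bytes rather than 120.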
Xavier Leroy 10994d8c80
Ensure frame table is 8-aligned on ARM64 and PPC64 (#8557)
This is a follow-up to commit 7077b60 that fixed a lack of 8-alignment for the frame table on AMD64, as reported in #7591.
A similar issue was reported in #7887 for ARM64 and is fixed here.
For good measure, explicit alignment was added to PPC64 as well, although there was probably no issue there.

Closes: #7887.
2019-03-29 15:31:21 +01:00
Mark Shinwell 4334b2de87
Position [Lprologue] correctly (#2292) 2019-03-29 11:47:53 +00:00
Mark Shinwell 0933593596 Fix ppc64 TOC load for exception handler addresses (#8506)
The address was loaded from the TOC into register r0. This generated bad code in the "big TOC" case, as r0 was used as index register. The fix is to use another temporary register instead of r0.
Add "arch_power" builtin to ocamltest.
Add test case.
2019-03-18 13:31:57 +01:00
Mark Shinwell 2cc1ea26b9 Remove gprof support (#2314)
This commit removes support for gprof-based profiling (the -p option to ocamlopt).  It follows a discussion on the core developers' list, which indicated that removing gprof support was a reasonable thing to do. The rationale is that there are better easy-to-use profilers out there now, such as perf for Linux and Instruments on macOS; and the gprof support has always been patchy across targets. We save a whole build of the runtime and simplify some other parts of the codebase by removing it.
2019-03-16 19:56:53 +01:00
Mark Shinwell 618e5dbfbd More debugging information in Cmm terms (#2308)
Following on from GPR#851 and GPR#873, this pull request further enhances debugging information in Cmm terms. This was driven both by manually examining the debugger's behaviour and also by a report received from a user regarding substandard DWARF location information.
2019-03-13 15:40:04 +00:00
Vincent Laviron 1dba5329a2 Linearize: for Trywith, remove the jump/call to the handler (#2237) 2019-03-07 10:37:22 +00:00
Mark Shinwell 770e662e96
Add [Proc.destroyed_at_reloadretaddr] (#2065) 2018-10-15 12:53:27 +01:00
Mark Shinwell dacb2240a4
DWARF register numberings (#2080) 2018-10-04 11:30:52 +01:00
Mark Shinwell dae65dacda
Rename Mach.Ialloc record field from _words_ to _bytes_ and fix logic in a couple of places (#2074) 2018-10-02 16:00:03 +01:00
Mark Shinwell 2a072d8036
Add Lprologue (#2055) 2018-09-24 10:03:26 +01:00
Xavier Clerc 7e29162582 Pass the elements from `BUILD_PATH_PREFIX_MAP` to the assembler (#1930) 2018-07-27 12:25:23 +02:00
Sébastien Hinderer d3e73595e5 Merge the asmrun and byterun directories into the runtime directory 2018-06-28 17:50:33 +02:00
Leo White 1671e5a3af Treat negated float comparisons more directly (#1487)
* Add float comparison test

* Treat negated float comparisons more directly

* Add Changes entry
2018-02-28 14:19:46 +01:00
Mark Shinwell 39f4f4e931 Add a padding word before "data_end" symbols (MPR#6329) (#1437) 2017-10-24 16:57:20 +01:00
Mark Shinwell b65096678b Register availability analysis (#856) 2017-09-15 11:08:14 +01:00
Xavier Leroy 24980d3fd9 Generate frametable in data section to improve code relocatability
The frametable contains absolute pointers into the code, which require relocation in shared libraries and also in position-independent executables (PIE).

Before this commit, the frametable was put in the readonly data section (rodata), which is part of the text segment.  In shared libraries and PIEs, relocations in the text segment are undesirable (they make the text segment writable, at least temporarily) and are flagged as warnings or errors by various tools (Debian's lintian package checker; the --warn-shared-textrel option of GNU ld; etc).

This commit puts the frametable in the (read-write) data section (.data), like in the AMD64 port for example.  In PowerPC 64-bit mode, this is enough to produce .so files and PIE executables that contain no relocations in the text segment.

In PowerPC 32-bit mode there remain relocations in the text segment, but that was expected because the code we generate is not position-independent (PIC).
2017-09-07 12:17:03 +02:00
KC Sivaramakrishnan abc5360dc1 Remove duplicate live_offset entries from frametables (#453) 2016-12-09 15:41:22 +00:00
Mark Shinwell eb05f63fbf Fix all architectures 2016-10-17 14:52:41 +01:00
Mark Shinwell cd0bd8aa73 Spacetime: a new memory profiler (#585) 2016-07-29 15:07:10 +01:00
François Bobot bb4a1b4f5d Specialize raise_kind after cmmgen
since the semantics changed. There is no need to check Clflags.debug
anymore: Raise_withtrace means that traces must be computed (if the
runtime boolean is true).
2016-07-28 15:29:50 +02:00
François Bobot 7be0a81e9c Fix backtrace for regular raise on arm64, arm
For constant exceptions, a reraise was done instead of a raise.
2016-07-28 13:46:23 +02:00
alainfrisch 8eedf40f8f Avoid creating a dependency to Cmm (not detected by 'make depend' running under another arch). 2016-07-11 11:27:50 +02:00
Gabriel Scherer 99db3207e7 fallout from #645: remove emit_data_label (unused, breaks the build) 2016-07-10 10:14:40 -04:00
Alain Frisch e3ee2805b7 Merge pull request #645 from mshinwell/delete_cmm_label_stuff
Remove Cdefine_label and Clabel_address
2016-07-10 14:52:07 +02:00
Mark Shinwell 5f00ce793e Improve location handling in the middle end (version for merging) (#666) 2016-07-06 15:42:29 +01:00
Mark Shinwell c843ca0691 Labels after calls, call GC points and checkbound points (again) (#660) 2016-07-06 11:44:00 +01:00
Mark Shinwell 8e16cdd85d Remove Cdefine_label and Clabel_address 2016-06-29 10:01:03 +01:00
Fabrice Le Fessant 80c4576f03 Add line directives to preprocessed files 2016-06-29 10:43:00 +02:00
Xavier Leroy 5d02ca6f28 In frame tables, distinguish data pointers from code pointers
Since GPR#247 (stack backtraces aware of inlining) was merged, frame tables contain two kinds of addresses of labels: code labels (as before) and data labels (new, pointing to sub-frames).

On ARM in Thumb mode, the two kinds of pointers must be distinguished, because pointers to Thumb code have the low bit set, and the assembler needs to know whether a label denotes code or data to set the low bit or not.

This commit fixes this problem by splitting the "efa_label" action of record Emitaux.emit_frame_actions into two actions, "efa_code_label" and "efa_data_label".  On all ports except ARM, the two actions are identical.  On ARM, the actions add the appropriate ".type" declaration.

Tested on ARM-32 and x86-64 only.  CI will test the other platforms.
2016-06-27 09:14:54 +02:00
Mark Shinwell b2e7162546 Second attempt at fixing GPR#167 fallout 2016-04-15 13:41:54 +01:00
alainfrisch 502e4f9336 More warnings when compiling the compiler. 2016-03-15 22:46:35 +01:00
Damien Doligez 5401ce8473 Update headers for the new license.
Remains to be done: remove all headers in testsuite/tests.
2016-02-18 16:59:16 +01:00
Damien Doligez ee8f71101b clean up whitespace and cut long lines 2016-02-17 13:36:27 +01:00
Mark Shinwell 05f1746cb5 Rename to max_arguments_for_tailcalls; revise numbers assuming no unboxed floats using the OCaml calling conventions 2016-02-08 15:02:40 +01:00