185 Commits (540996d21ee3793a1cecce252c81fb76a6b9fd61)

Author SHA1 Message Date
Nicolás Ojeda Bär 540996d21e Remove Spacetime 2020-10-08 20:28:12 +02:00
Nicolás Ojeda Bär ec6690fb53 x86 asm: handle unit names with special characters (#9465) 2020-04-19 11:17:00 +02:00
Stephen Dolan 4d4a056bc7 Micro-optimise allocations on amd64 to save a register (#9280)
There's no need for allocation on amd64 to clobber the %rax register. It's only used in one case (-compact out-of-line allocation of >3 words), and only used there to do a single subtraction. That subtraction can be done by the caller at no code size penalty, freeing up %rax.

Inside amd64.S functions, %r11 can be used instead of %rax as a temporary. %r11 is destroyed by PLT stub code, so on ELF platforms it costs nothing to use.
2020-03-09 19:52:36 +01:00
Greta Yorsh 6daaf62904 Do not emit references to dead labels (spacetime) (#9097) 2019-11-26 12:06:19 +00:00
Stephen Dolan 7fe360401b Per-architecture support for allocation size info in frame tables.
amd64: remove caml_call_gc{1,2,3} and simplify caml_alloc{1,2,3,N}
       by tail-calling caml_call_gc.

i386:  simplify caml_alloc{1,2,3,N} by tail-calling caml_call_gc.
       these functions do not need to preserve ebx.

arm:   simplify caml_alloc{1,2,3,N} by tail-calling caml_call_gc.
       partial revert of #8619.

arm64: simplify caml_alloc{1,2,3,N} by tail-calling caml_call_gc.
       partial revert of #8619.

power: partial revert of #8619.
       avoid restarting allocation sequence after failure.

s390:  partial revert of #8619.
       avoid restarting allocation sequence after failure.
2019-10-23 09:24:13 +01:00
Stephen Dolan 768dcce48f Use allocation-size info on more than just amd64.
Moves the alloc_dbginfo type to Debuginfo, to avoid a circular
dependency on architectures that use Branch_relaxation.

This commit generates frame tables with allocation sizes on all
architectures, but does not yet update the allocation code for
non-amd64 backends.
2019-10-22 11:47:31 +01:00
Stephen Dolan 787e2d05a7 Apply suggestions from code review, and make depend.
Co-Authored-By: Damien Doligez <damien.doligez@gmail.com>
2019-10-22 11:47:31 +01:00
Stephen Dolan 34f97941ec Retain debug information about allocation sizes, for statmemprof.
This code is adapted from jhjourdan's 2c93ca1e711. Comballoc is
extended to keep track of allocation sizes and debug info for each
allocation, and the frame table format is modified to store them.

The native code GC-entry logic is changed to match bytecode, by
calling the garbage collector at most once per allocation.

amd64 only, for now.
2019-10-22 11:47:31 +01:00
Stephen Dolan b0ad600b88 Use a more compact representation of debug information.
Locations of inlined frames are now represented as contiguous
sequences rather than linked lists.

The frame tables now refer to debug info by 32-bit offset rather
than word-sized pointer.
2019-10-22 11:46:35 +01:00
Stephen Dolan 71f3ec4091 Clear destination registers before sqrt instruction on amd64 (#9041)
This avoids a partial register stall.
2019-10-15 19:04:20 +02:00
Stephen Dolan 0852266a07 Improve code-generation for 32-to-64-bit zero-extension on amd64. 2019-10-14 10:45:15 +01:00
Tom Kelly 62d6917fd5 amd64: Emit 32bit registers for Iconst_int when we can (This is a reuse of (better) code proposed in PR1490 credit to xclerc/mshinwell) 2019-10-03 16:52:50 +02:00
Greta Yorsh aeebb62e9b Move contains_calls and num_stack_slots from Proc to Mach.fundecl 2019-09-09 11:33:03 +01:00
Greta Yorsh 0b6b544fcb Split Linearize into two modules
Separate the description of the IR from the transformations
performed on it by moving type declarations from linearize.ml
into their own file, called linear.ml.
2019-09-04 11:55:11 +01:00
KC Sivaramakrishnan a395c4cf71 Fix CFI offsets in amd64 2019-08-23 09:50:05 +05:30
KC Sivaramakrishnan de5ef602fd Rename exn_handler to exception_pointer 2019-08-23 09:50:05 +05:30
KC Sivaramakrishnan c06038a0ee Move backtrace support global variables to domain state.
Since we cannot access backtrace position in cmmgen.ml anymore,
Cmm.raise_kind is removed. Instead, we use Lambda.raise_kind. When
assembly code is generated, we reset the backtrace position to 0 in the
case of regular raise. Importantly, the semantics remains the same.
2019-08-23 09:50:05 +05:30
KC Sivaramakrishnan 45b1e18f59 young_ptr and young_limit are now in domain state 2019-08-23 09:50:05 +05:30
KC Sivaramakrishnan fc6f028492 Introduce domain state and steal exception pointer 2019-08-23 09:50:05 +05:30
Greta Yorsh c79387bb64 Add .caml to function section names
Emit .text.caml.function_name instead of .text.function_name,
and update runtime assembly function names accordingly.
2019-07-15 10:25:26 +01:00
Greta Yorsh 351edb49bb Add compile-time option -function-sections 2019-07-15 10:25:26 +01:00
Greta Yorsh 27a92a9445 Emit each function in a separate section (amd64,i386,arm,arm64)
Add --enable-function-sections option to configure. With this option,
the compiler will emit each function in a separate named text section,
on supported targets. This enables function reordering using a linker
script. With this option, the compiler also emits caml_hot__code_begin
and caml_hot__code_end sections. This allows a linker script to
move function sections outside of the segments they belong to,
without breaking caml_code_segments.
2019-07-15 10:25:26 +01:00
Greta Yorsh d8b6a1713b Add pseudo-instruction `Ladjust_trap_depth` (#2322)
Ladjust_trap_depth replaces dummy Lpushtrap generated in linearize of
Iexit to notify assembler generation about updates to the
stack. Ladjust_trap_depth is used to keep the virtual stack pointer in
sync and emit dwarf information, without emitting any assembly
instructions. It therefore avoids generating dead code.

This patch is extracted from PR1482 (@lthls).
2019-06-24 14:18:37 +01:00
Nicolás Ojeda Bär 23f6a7364b amd64: align data segment to word boundary 2019-06-24 09:35:07 +02:00
Greta Yorsh e7aef3aa6f Fix amd64 and i386 emitters of Lcondbranch3 (#8677)
Use unsigned comparisons (jb/ja) in amd64 and i386 emitters of Lcondbranch3,
instead of the previous mixture of unsigned and signed comparisons (jb/jg).
2019-06-03 16:30:34 +02:00
Stephen Dolan c24e5b5c8a Ensure that Gc.minor_words remains accurate after a GC (#8619)
If an allocation fails, the decrement of young_ptr should be undone
before the GC is entered. This happened correctly on bytecode but not
on native code.

This commit (squash of pull request #8619) fixes it for all the
platforms supported by ocamlopt.

amd64: add alternate entry points caml_call_gc{1,2,3} for code size
optimisation.

powerpc: introduce one GC call point per allocation size per function.
Each call point corrects the allocation pointer r31 before calling
caml_call_gc.

i386, arm, arm64, s390x: update the allocation pointer after the
conditional branch to the GC, not before.

arm64: simplify the code generator: Ialloc can assume that less than
0x1_0000 bytes are allocated, since the max allocation size for the
minor heap is less than that.

This is a partial cherry-pick of commit 8ceec on multicore.
2019-05-04 10:01:23 +02:00
Mark Shinwell 72ea849d2a Move some middle-end files around (#2281)
* Various file moves in the middle end: this is the first stage of improving separation between the middle end and backend.
* Creation of file_formats/ directory (with associated file moves) to hold the definitions of compilation artifact formats.
* Creation of lambda/ directory (with associated file moves) to hold Lambda language definition files, transformation passes and construction passes from Typedtree.
* Disable (hopefully temporarily) dynlink, debugger and ocamldoc for the dune build.
2019-04-01 17:18:47 +01:00
Mark Shinwell 4334b2de87 Position [Lprologue] correctly (#2292) 2019-03-29 11:47:53 +00:00
Mark Shinwell 2cc1ea26b9 Remove gprof support (#2314)
This commit removes support for gprof-based profiling (the -p option to ocamlopt). It follows a discussion on the core developers' list, which indicated that removing gprof support was a reasonable thing to do. The rationale is that there are better easy-to-use profilers out there now, such as perf for Linux and Instruments on macOS; and the gprof support has always been patchy across targets. We save a whole build of the runtime and simplify some other parts of the codebase by removing it.
2019-03-16 19:56:53 +01:00
Vincent Laviron 1dba5329a2 Linearize: for Trywith, remove the jump/call to the handler (#2237) 2019-03-07 10:37:22 +00:00
Daniel Bünzli a7afd89003 s/string_of_int/Int.to_string/g 2018-11-07 13:52:02 +01:00
Mark Shinwell dae65dacda Rename Mach.Ialloc record field from _words_ to _bytes_ and fix logic in a couple of places (#2074) 2018-10-02 16:00:03 +01:00
Mark Shinwell 2a072d8036 Add Lprologue (#2055) 2018-09-24 10:03:26 +01:00
Gabriel Radanne 1be47bf7ab Just some tbl things. (#1699) 2018-07-23 13:19:41 +01:00
LemonBoy 86d1f0d714 Optimize 32->64 sign-extension for AMD64 (#1631) 2018-05-28 15:38:57 +02:00
Leo White 1671e5a3af Treat negated float comparisons more directly (#1487)
* Add float comparison test

* Treat negated float comparisons more directly

* Add Changes entry
2018-02-28 14:19:46 +01:00
Mark Shinwell 39f4f4e931 Add a padding word before "data_end" symbols (MPR#6329) (#1437) 2017-10-24 16:57:20 +01:00
Mark Shinwell b65096678b Register availability analysis (#856) 2017-09-15 11:08:14 +01:00
Mark Shinwell 70c585d5eb Take PLT-clobbered registers into account at Ialloc (#1304)
Mark PLT-clobbered registers as destroyed across the Ialloc instruction.

Currently only x86-64 is affected, in PIC mode only, and only with the glibc dynamic loader.
2017-08-28 19:09:57 +02:00
Xavier Leroy 7077b6057f MPR#7591: frametable not 8-aligned on x86-64 port
Misalignment was due to the "D.long (const 0)" emitted just before the frametable, which sets the data pointer to 4 mod 8. Looks like someone cut-and-pasted from i386 to amd64 without thinking.

This commit fixes the bug twice (because belt and suspenders and all that) in two obvious ways:
- the data terminator D.long becomes D.qword
- explicit 8-alignment is requested before emitting the frame table.

(Mental note: why is the frame table in the data segment and not in a readonly data segment?)
2017-07-22 16:32:23 -04:00
Mark Shinwell f20634493f Remove Istore_symbol (plus some Win64 fixes) (#955) 2016-12-27 12:30:12 +00:00
KC Sivaramakrishnan abc5360dc1 Remove duplicate live_offset entries from frametables (#453) 2016-12-09 15:41:22 +00:00
Mark Shinwell 77471455c7 Fix Spacetime assembler comments (MPR#7326) 2016-08-15 08:58:32 +01:00
Mark Shinwell cd0bd8aa73 Spacetime: a new memory profiler (#585) 2016-07-29 15:07:10 +01:00
François Bobot bb4a1b4f5d Specialize raise_kind after cmmgen
since the semantics changed. There is no need to check Clflags.debug
anymore: Raise_withtrace means that traces must be computed (if the
runtime boolean is true).
2016-07-28 15:29:50 +02:00
François Bobot 7be0a81e9c Fix backtrace for regular raise on arm64, arm
For constant exceptions, a reraise was done instead of a raise.
2016-07-28 13:46:23 +02:00
Alain Frisch e3ee2805b7 Merge pull request #645 from mshinwell/delete_cmm_label_stuff
Remove Cdefine_label and Clabel_address
2016-07-10 14:52:07 +02:00
Mark Shinwell 5f00ce793e Improve location handling in the middle end (version for merging) (#666) 2016-07-06 15:42:29 +01:00
Mark Shinwell c843ca0691 Labels after calls, call GC points and checkbound points (again) (#660) 2016-07-06 11:44:00 +01:00
Alain Frisch c3c523109e Revert "Labels after calls, call GC points and checkbound points" 2016-07-01 18:42:51 +02:00