Ensure that HDMA always takes precedence over interrupts that would
otherwise occur on the same instruction boundary, after any additional
cycles due to unhalt, as suggested by the included tests.
The LYC flag was previously found to remain in a possibly high state
after LCD disable. An extension to this is that it might be possible to
trigger LYC IRQs via STAT write in such a case, and some testing (tests
included) suggests that this is possible. Adjust accordingly.
Test LY timing at an apparent odd cycle offset in double speed mode
after multiple speed changes. Adjust accordingly.
Findings of note:
- In the event that LY is read at the boundary at which it gets
incremented, in an apparent window of one (single speed) cycle, the
resulting value appears to be LY & LY+1, where '&' denotes the bitwise
AND operation, and 'LY' is the value of LY prior to increment.
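The finding above can be sketched as a small helper. This is only an
illustration of the observed read value, not the emulator's actual code;
the name lyReadAtIncrement is hypothetical, and what happens when LY
wraps back to 0 at the end of a frame is not covered by the finding and
is not modelled here.

```cpp
#include <cstdint>

// Hypothetical helper (not from the implementation): the value seen when LY
// is read inside the one-cycle window in which it gets incremented.
// lyBefore is the value of LY prior to the increment.
constexpr std::uint8_t lyReadAtIncrement(std::uint8_t lyBefore) {
    // Bitwise AND of the old and the new value, as described in the finding.
    return static_cast<std::uint8_t>(lyBefore & (lyBefore + 1));
}
```

For example, a read at the 143 to 144 boundary would yield 0x8F & 0x90 =
0x80, while a read at the 2 to 3 boundary would yield 2.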
Preliminary speculative timing adjustments prior to testing in odd LCD
cycle offset states.
Reuse some constants to ensure consistency between some trigger windows
and respective trigger events.
Findings of note:
- The LCD (PPU) can seemingly end up at a cycle offset relative to the
CPU that is a non-multiple of 4 when switching away from double speed
mode (as compared with normal). The behaviour appears equivalent to a
CPU tick being skipped relative to the LCD/PPU, which can result in 2
different offsets depending on whether the speed change is done at an
odd M-cycle (i.e. an odd multiple of 4 cycles in double speed mode).
This also seemingly carries over to double speed mode upon another
speed change, and repeated speed changes can seemingly produce all 4
offsets.
Test LCD/PPU timing relative to LCD/PPU display enable and improve
implementation accordingly.
Also fix and test an apparent inconsistency between mode=0 IRQ trigger
checks and event timing in the implementation.
Temporarily disable a speed change test that fails after these changes.
Some inspection suggested that the apparent latency of LYC comparisons
after a write to the LYC register, previously seen in relation to the
prevention of "mode 0" IRQs (and to the triggering of LYC IRQs), should
also affect when a "mode 2" IRQ may be prevented as a result of a
previous LYC IRQ (i.e. when an LYC write that might influence this is
done shortly before the "mode 2" IRQ occurs) on the CGB revision tested,
and that the implementation lacked such an effect. Some testing confirms
that this indeed appears to be the case.
Adjust accordingly.
A "mode 2" IRQ at the beginning of line 144, when transitioning from
"mode 0" to "mode 1", has been discovered (this was found as a result of
adding tests that verified the timing of each "mode 2" IRQ for the first
two frames of display after LCD display enable).
The time periods during which "mode 0" or "mode 2" IRQs may be triggered
via STAT writes were also found to differ slightly on the CGB revision
tested for the "mode 0" to "mode 1" transition, compared with the "mode
0" to "mode 2" transition (slightly longer and slightly shorter,
respectively -- up to one instruction as seen from the CPU [in "double
speed" mode and independent of "double speed", respectively]). No
difference was seen on the DMG revision tested.
Adding more test coverage, after some inspection, for the case where an
LYC status flag IRQ period is at its end, and a "mode 2" status flag IRQ
is triggered via STAT write, also uncovered an implementation deficiency
that could allow an IRQ trigger that should have been prevented on the
CGB in "double speed" mode (in a time window of up to one instruction)
-- hence the movement of the relevant check in the implementation.
Furthermore, this also uncovered that, when the LYC status flag IRQ is
disabled shortly before a "mode 2" IRQ at LY=0 on the CGB, there is a
similar latency as when the "mode 1" IRQ is disabled w.r.t. whether this
avoids preventing the "mode 2" IRQ from occurring (in non-"double speed"
mode this can be observed as a difference of up to one instruction as
seen from the CPU; in "double speed" mode it appears to be irrelevant)
-- hence the amendment in the implementation that adds the LYC enable
bit to the "mode 1" enable bit select in the code that keeps a late
enable from being of consequence for the "mode 2" IRQ assertion event.
Adjust accordingly (the "mode 1"-related changes result from the fact
that a preceding "mode 2" IRQ may prevent the triggering of a subsequent
"mode 1" IRQ).
More detailed timing tests uncovered some differences w.r.t. the timing
of the very first line after LCD display enable (an additional 2 cycles
of delay before "mode 3" begins).
("Double speed" testing with some related simplifications is in progress
-- which should, finally, get rid of some of the odd "ds" offsets
[emphasis has been on implementing observed behaviour].)
Findings of note:
- The "skip" glitch, when halt is cancelled early, is not limited to the
DMG; it is also present on the CGB revision tested and on a GBA SP.
- The skip glitch also applies when IME is high, in which case the
pushed return address is decremented (so that it points to the halt
instruction rather than to the subsequent instruction; the halt is
repeated on a subsequent ret).
- The timing for when pending interrupts can cancel the halt state is
different on the DMG (they can be detected earlier than on the CGB
revision tested), excepting the first/second M-cycle.
(To facilitate this, information about when IRQs occur is propagated in
the implementation.)
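The effect of the skip glitch on the pushed return address when IME is
high can be sketched as follows. This is a minimal illustration under
the assumption that halt is a one-byte instruction; the function and
parameter names are hypothetical and not taken from the implementation.

```cpp
#include <cstdint>

// Hypothetical helper: return address pushed when an interrupt is serviced
// out of halt. pcAfterHalt is the address of the instruction following halt.
inline std::uint16_t pushedReturnAddress(std::uint16_t pcAfterHalt,
                                         bool skipGlitch) {
    // With the glitch, the pushed address points back at the one-byte halt
    // instruction, so the halt is executed again after ret.
    return skipGlitch ? static_cast<std::uint16_t>(pcAfterHalt - 1)
                      : pcAfterHalt;
}
```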
Besides the improved coverage, this uncovered a few off-by-ones w.r.t.
IF writes, and IRQ ACK timing.
The IF write adjustment is a one-liner whose consequence is that mode 0
IRQs for lines with particular timing (e.g. due to scx offsets and/or
object positions) can be overwritten up to one instruction earlier than
previously.
The ACK-related adjustments have consequence for cases when the same IRQ
is reasserted shortly after it has effected an interrupt in the sense
that, for the affected IRQs, the time window for when this will be
ignored is up to one instruction longer.
More cases of the SRAM flag of the cartridge header are now considered, allowing 64KiB and 128KiB saves to be made (with 8 and 16 banks, respectively).
The extra space is taken advantage of by software such as LSDj.
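A minimal sketch of such a mapping from the header's RAM size code to a
count of 8KiB banks. The code values (0x02 to 0x05, read from header
byte 0x0149) follow the commonly documented cartridge header layout and
are an assumption here, not taken from this change; the function name
is hypothetical, and the 2KiB code 0x01 is simply treated as unhandled.

```cpp
#include <cstdint>

// Hypothetical helper: number of 8KiB SRAM banks for a header RAM size code.
// Returns 0 for codes this sketch does not handle.
inline unsigned sramBanks(std::uint8_t ramSizeCode) {
    switch (ramSizeCode) {
    case 0x02: return 1;   // 8KiB
    case 0x03: return 4;   // 32KiB
    case 0x04: return 16;  // 128KiB
    case 0x05: return 8;   // 64KiB
    default:   return 0;   // no SRAM, or a code not handled here
    }
}
```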
This code was broken in two places. First, the --input parsing always
skipped hats due to missing break statements in the direction switch.
Second, the handler for SDL_JoyHatEvents was iterating through the axis
map instead of the hat map.
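The first problem can be illustrated with a direction switch in which
every case ends in a break, so a recognized direction no longer falls
through to the following cases. This is only a sketch: the function
name and the single-character direction syntax are hypothetical, and
the bit values mirror SDL's SDL_HAT_UP, SDL_HAT_RIGHT, SDL_HAT_DOWN and
SDL_HAT_LEFT constants so that the example compiles without SDL.

```cpp
// Hat direction bits, mirroring the values of SDL's SDL_HAT_* constants.
enum HatDir : unsigned {
    hatUp = 0x01,
    hatRight = 0x02,
    hatDown = 0x04,
    hatLeft = 0x08
};

// Hypothetical parser: maps a direction character to a hat bit,
// or returns 0 if the character is not a known direction.
inline unsigned parseHatDir(char c) {
    unsigned dir = 0;
    switch (c) {
    case 'u': dir = hatUp;    break;  // without the break, control would
    case 'r': dir = hatRight; break;  // fall through and overwrite dir
    case 'd': dir = hatDown;  break;
    case 'l': dir = hatLeft;  break;
    default:  break;                  // unknown direction
    }
    return dir;
}
```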