Test STAT mode changes and related IRQs at LCD/PPU cycle offsets that
are apparently not multiples of 4 (relative to normal) in single speed
mode after multiple speed changes (timing across one frame of display is
tested). This also covers LYC IRQ timing (and, at least in part, the LYC
flag).
Findings of note:
- HDMA appears inactive during halt/speed change (presumably stop).
- If enabled, HDMA appears to trigger on unhalt if halt occurs during
STAT mode!=0 and unhalt occurs during STAT mode=0, and similarly for
speed change (presumably stop).
- If the mode=0 transition that would otherwise trigger an HDMA occurs
during the first M-cycle of halt, HDMA appears to occur at unhalt time,
and the next PC increment appears to be skipped (i.e. the next byte
after the halt opcode will be fetched twice). The (semi-)prefetched
instruction at halt time is executed after unhalt and HDMA.
- If the mode=0 transition that would otherwise trigger an HDMA occurs
during the first M-cycle of speed change (stop), HDMA appears to occur
during the stopped state if in single speed mode prior to speed change,
or upon wake-up from stop if in double speed mode prior to speed change
(as in the halt case). In both cases the second byte of the stop
instruction, prefetched at stop time, appears to be used as the first
byte of the next instruction after speed change. If, as a result of the
above, HDMA occurs during the stopped state, HDMA appears to get
disabled after the first 16 bytes have been transferred, and, rather
than the length count being decremented, the most significant bit (80h)
is seemingly simply set in the HDMA5 register.
- Additional HDMA triggers (mode=0 transitions) appear to be ignored
during HDMA execution, except during the last 4 cycles of HDMA (i.e. no
additional HDMA will occur until the first mode=0 transition that falls
within or after the last 4 cycles of the current HDMA).
- In the event that HDMA is triggered unconditionally on unhalt or upon
wake-up from speed change (presumably stop), as a result of halt/stop
occurring at the mode=0 transition, the next (semi-prefetched)
instruction appears to execute 4 cycles earlier than for other DMA
events. This instruction appears to execute before any pending
interrupts or additional HDMAs.
Test timing of STAT modes and related IRQs at an apparent odd cycle
offset in double speed mode (timing across one frame of display is
tested). LYC IRQ timing is also covered (and, at least in part, the LYC
flag). Adjust timing accordingly.
OAM DMA appears to be inactive during halt as well as CGB speed change
(presumably stop in general). This adds some simple tests for that and
adjusts the implementation accordingly.
HDMA appears to be similarly affected (not covered by this change).
Ensure that HDMA always takes precedence over interrupts that would
otherwise occur on the same instruction boundary, after any additional
cycles due to unhalt, as suggested by the included tests.
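
The precedence rule above can be sketched as a simple selection at an
instruction boundary. The enum and function names are hypothetical and
only illustrate the ordering; they are not taken from the
implementation, and unhalt cycle accounting is left out.

```cpp
// Hypothetical event kinds that may be due at an instruction boundary.
enum class BoundaryEvent { None, Hdma, Interrupt };

// Hypothetical selector: when both an HDMA and an interrupt are due at
// the same boundary, HDMA is serviced first; the interrupt remains
// pending and is considered again at the next boundary.
constexpr BoundaryEvent nextBoundaryEvent(bool hdmaPending, bool irqPending) {
    return hdmaPending ? BoundaryEvent::Hdma
         : irqPending  ? BoundaryEvent::Interrupt
                       : BoundaryEvent::None;
}
```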
The LYC flag was previously found to remain high after LCD disable if
it was high at that point. As an extension, it might be possible to
trigger LYC IRQs via STAT write in such a case, and some testing (tests
included) suggests that it is. Adjust accordingly.
Test LY timing at an apparent odd cycle offset in double speed mode
after multiple speed changes. Adjust accordingly.
Findings of note:
- If LY is read at the boundary at which it gets incremented, in an
apparent window of one (single speed) cycle, the resulting value appears
to be LY & (LY + 1), where '&' denotes the bitwise AND operation and
'LY' is the value of LY prior to the increment.
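
The LY read finding above amounts to the following, shown as a minimal
sketch. The function name is hypothetical, and the wrap of LY back to 0
at the end of the frame is not covered by the finding and is not
modelled here.

```cpp
#include <cstdint>

// Hypothetical helper: value seen when LY is read in the one-cycle
// window at which it is incremented. 'ly' is the value before the
// increment; the observed result is the bitwise AND of old and new.
constexpr std::uint8_t lyReadAtIncrement(std::uint8_t ly) {
    return static_cast<std::uint8_t>(ly & (ly + 1));
}
```

For example, a read at the 0Fh to 10h boundary would yield 00h, while a
read at the 10h to 11h boundary would yield 10h.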
Findings of note:
- The LCD (PPU) can seemingly end up at a cycle offset relative to the
CPU that is not a multiple of 4 (as compared with normal) when switching
away from double speed mode. The behaviour appears equivalent to a CPU
tick being skipped relative to the LCD/PPU, which can result in 2
different offsets depending on whether the speed change is done at an
odd M-cycle (i.e. an odd multiple of 4 cycles in double speed mode).
This also seemingly carries over to double speed mode upon another
speed change, and repeated speed changes can seemingly produce all 4
offsets.
Test LCD/PPU timing relative to LCD/PPU display enable and improve
implementation accordingly.
Also fix and test an apparent inconsistency between mode=0 IRQ trigger
checks and event timing in the implementation.
Temporarily disable a speed change test that fails after these changes.
Some inspection suggested that the apparent latency of LYC comparisons
after an LYC register write, previously seen in relation to the
prevention of "mode 0" IRQs (and to the triggering of LYC IRQs), should
also affect when a "mode 2" IRQ may be prevented by a previous LYC IRQ
(i.e. when an LYC write that might influence this is done shortly before
the "mode 2" IRQ occurs) on the CGB revision tested, and that the
implementation lacked such an effect. Some testing confirms that this
indeed appears to be the case.
Adjust accordingly.
A "mode 2" IRQ at the beginning of line 144, when transitioning from
"mode 0" to "mode 1", has been discovered (this was found as a result of
adding tests that verified the timing of each "mode 2" IRQ for the first
two frames of display after LCD display enable).
The time periods during which "mode 0" or "mode 2" IRQs may be triggered
via STAT writes were also found to differ slightly on the CGB revision
tested for the "mode 0" to "mode 1" transition, as compared with the
"mode 0" to "mode 2" transition: slightly longer for "mode 0" IRQs (up
to one instruction as seen from the CPU, in "double speed" mode) and
slightly shorter for "mode 2" IRQs (up to one instruction, independent
of "double speed"). No difference was seen on the DMG revision tested.
Adding more test coverage, after some inspection, for the case where an
LYC status flag IRQ period is at its end, and a "mode 2" status flag IRQ
is triggered via STAT write, also uncovered an implementation deficiency
that could allow an IRQ trigger that should have been prevented on the
CGB in "double speed" mode (in a time window of up to one instruction)
-- hence the movement of the relevant check in the implementation.
Furthermore, this uncovered that when the LYC status flag IRQ is
disabled shortly before a "mode 2" IRQ at LY=0 on the CGB, there is a
latency, similar to that seen when the "mode 1" IRQ is disabled, in
whether the disable stops the "mode 2" IRQ from being prevented (in
non-"double speed" mode this can be observed as a difference of up to
one instruction as seen from the CPU; in "double speed" mode it appears
to be irrelevant). Hence the implementation now selects the LYC enable
bit along with the "mode 1" enable bit in the code that keeps a late
enable from affecting the "mode 2" IRQ assertion event.
Adjust accordingly (the "mode 1"-related changes result from the fact
that a preceding "mode 2" IRQ may prevent the triggering of a subsequent
"mode 1" IRQ).
More detailed timing tests uncovered some differences w.r.t. the timing
of the very first line after LCD display enable (an additional 2 cycles
of delay before "mode 3" begins).
("Double speed" testing with some related simplifications is in progress
-- which should, finally, get rid of some of the odd "ds" offsets
[emphasis has been on implementing observed behaviour].)
Findings of note:
- The "skip" glitch, when halt is cancelled early, is not limited to the
DMG; it is also present on the CGB revision tested and on a GBA SP.
- The skip glitch also applies when IME is high, in which case the
pushed return address is decremented (so that it points to the halt
instruction rather than to the subsequent instruction; the halt is
repeated on a later ret, if any).
- The timing for when pending interrupts can cancel the halt state
differs on the DMG (they can be detected earlier than on the CGB
revision tested), except for the first/second M-cycle.
(To facilitate this, information about when IRQs occur is propagated in
the implementation.)
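
The IME-high case of the skip glitch above can be sketched as follows.
The function and parameter names are hypothetical; the sketch relies
only on halt being a single-byte opcode, so that decrementing the
address following it yields the address of the halt itself.

```cpp
#include <cstdint>

// Hypothetical helper: return address pushed when an interrupt is
// dispatched out of halt. 'pcAfterHalt' is the address of the byte
// following the halt opcode. If the halt was cancelled early (the
// "skip" glitch), the pushed address is decremented so that a later
// ret lands on the halt instruction again.
constexpr std::uint16_t pushedReturnAddress(std::uint16_t pcAfterHalt,
                                            bool haltCancelledEarly) {
    return haltCancelledEarly
        ? static_cast<std::uint16_t>(pcAfterHalt - 1)
        : pcAfterHalt;
}
```

For a halt at 0150h, the normal case would push 0151h, while the glitch
case would push 0150h.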
Besides the improved coverage, this uncovered a few off-by-ones w.r.t.
IF writes and IRQ ACK timing.
The IF write adjustment is a one-liner, with the consequence that mode
0 IRQs for lines with particular timing (e.g. due to scx offsets and/or
object positions) can be overwritten up to one instruction earlier than
previously.
The ACK-related adjustments affect cases where the same IRQ is
reasserted shortly after it has effected an interrupt: for the affected
IRQs, the time window in which the reassertion is ignored is up to one
instruction longer.