Ensures that the wakeup flag is read after the tail pointer has been
written. It's important to use memory load acquire semantics for the
flags read, otherwise the application and the kernel might not agree on
the consistency of the wakeup flag, leading to I/O starvation.
Refs: 6768ddcc56
Refs: https://github.com/axboe/liburing/issues/219