Re: [RFC PATCH V1 01/12] hvm/ioreq: Make x86's IOREQ feature common
On Wed, 5 Aug 2020, Jan Beulich wrote:
> On 04.08.2020 21:11, Stefano Stabellini wrote:
> >> The point of the check isn't to determine whether to wait, but
> >> what to do after having waited. Reads need a retry round through
> >> the emulator (to store the result in the designated place),
> >> while writes don't have such a requirement (and hence guest
> >> execution can continue immediately in the general case).
> >
> > The x86 code looks like this:
> >
> >     rc = hvm_send_ioreq(s, &p, 0);
> >     if ( rc != X86EMUL_RETRY || currd->is_shutting_down )
> >         vio->io_req.state = STATE_IOREQ_NONE;
> >     else if ( !hvm_ioreq_needs_completion(&vio->io_req) )
> >         rc = X86EMUL_OKAY;
> >
> > Basically hvm_send_ioreq is expected to return RETRY.
> > Then, if it is a PIO write operation only, it is turned into OKAY right
> > away. Otherwise, rc stays as RETRY.
> >
> > So, normally, hvmemul_do_io is expected to return RETRY, because the
> > emulator is not done yet. Am I understanding the code correctly?
>
> "The emulator" unfortunately is ambiguous here: Do you mean qemu
> (or whichever else ioreq server) or the x86 emulator inside Xen?

I meant QEMU. I'll use "QEMU" instead of "emulator" in this thread going
forward for clarity.

> There are various conditions leading to RETRY. As far as
> hvm_send_ioreq() goes, it is expected to return RETRY whenever
> some sort of response is to be expected (the most notable
> exception being the hvm_send_buffered_ioreq() path), or when
> submitting the request isn't possible in the first place.
>
> > If so, who is handling RETRY on x86? I tried to follow the call chain
> > but ended up in the x86 emulator and got lost :-)
>
> Not sure I understand the question correctly, but I'll try an
> answer nevertheless: hvm_send_ioreq() arranges for the vCPU to be
> put to sleep (prepare_wait_on_xen_event_channel()). Once the event
> channel got signaled (and vCPU unblocked), hvm_do_resume() ->
> handle_hvm_io_completion() -> hvm_wait_for_io() then check whether
> the wait reason has been satisfied (wait_on_xen_event_channel()),
> and ...
>
> > At some point later, after the emulator (QEMU) has completed the
> > request, handle_hvm_io_completion gets called which ends up calling
> > handle_mmio() finishing the job on the Xen side too.
>
> ..., as you say, handle_hvm_io_completion() invokes the retry of
> the original operation (handle_mmio() or handle_pio() in
> particular) if need be.

OK, thanks for the details. My interpretation seems to be correct.

In which case, it looks like xen/arch/arm/io.c:try_fwd_ioserv should
return IO_RETRY. Then, xen/arch/arm/traps.c:do_trap_stage2_abort_guest
also needs to handle try_handle_mmio returning IO_RETRY the first time
around, and IO_HANDLED after QEMU has done its job.

What should do_trap_stage2_abort_guest do on IO_RETRY? Simply return
early and let the scheduler do its job? Something like:

    enum io_state state = try_handle_mmio(regs, hsr, gpa);

    switch ( state )
    {
    case IO_ABORT:
        goto inject_abt;
    case IO_HANDLED:
        advance_pc(regs, hsr);
        return;
    case IO_RETRY:
        /* finish later */
        return;
    case IO_UNHANDLED:
        /* IO unhandled, try another way to handle it. */
        break;
    default:
        ASSERT_UNREACHABLE();
    }

Then, xen/arch/arm/ioreq.c:handle_mmio() gets called by
handle_hvm_io_completion() after QEMU completes the emulation. Today,
handle_mmio just sets the user register with the read value.

But it would be better if it called the original function,
do_trap_stage2_abort_guest, again to actually retry the original
operation. This time do_trap_stage2_abort_guest calls try_handle_mmio()
and gets IO_HANDLED instead of IO_RETRY, so it will call advance_pc()
(advancing the program counter), completing the handling of this
instruction.

The user register with the read value could be set by try_handle_mmio if
try_fwd_ioserv returns IO_HANDLED instead of IO_RETRY.

Is that how the state machine is expected to work?

> What's potentially confusing is that there's a second form of
> retry, invoked by the x86 insn emulator itself when it needs to
> split complex insns (the repeated string insns being the most
> important example). This results in actually exiting back to guest
> context without having advanced rIP, but after having updated
> other register state suitably (to express the progress made so
> far).

Ah! And it seems to be exactly the same label: X86EMUL_RETRY. It would
be a good idea to differentiate between them.
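
To make the two-pass flow proposed above concrete, here is a toy,
self-contained model of it. This is only a sketch under stated
assumptions, not Xen code: every toy_* name, struct toy_vcpu and enum
req_state are hypothetical, the enum io_state values merely mirror the
labels used in the switch above, and the program counter is advanced by
a fixed 4 bytes purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Labels mirror the ones discussed in the thread; values are arbitrary. */
enum io_state { IO_ABORT, IO_HANDLED, IO_UNHANDLED, IO_RETRY };

/* Hypothetical request states, loosely modelled on an ioreq life cycle. */
enum req_state { REQ_NONE, REQ_PENDING, REQ_DONE };

/* Hypothetical per-vCPU state, just enough to show the two passes. */
struct toy_vcpu {
    enum req_state req;   /* state of the outstanding request, if any */
    uint64_t data;        /* value supplied by the device model for a read */
    uint64_t reg;         /* guest register that receives the read result */
    uint64_t pc;          /* guest program counter */
};

/* Stand-in for try_fwd_ioserv(): IO_RETRY until the request completes. */
static enum io_state toy_try_fwd_ioserv(struct toy_vcpu *v)
{
    switch ( v->req )
    {
    case REQ_NONE:
        v->req = REQ_PENDING;     /* request sent, no response yet */
        return IO_RETRY;
    case REQ_PENDING:
        return IO_RETRY;          /* still waiting for the device model */
    case REQ_DONE:
        v->reg = v->data;         /* store the read result */
        v->req = REQ_NONE;
        return IO_HANDLED;
    }
    return IO_ABORT;              /* not reached with a valid req_state */
}

/* Stand-in for the trap handler: advance the PC only on IO_HANDLED. */
static enum io_state toy_stage2_abort(struct toy_vcpu *v)
{
    enum io_state state = toy_try_fwd_ioserv(v);

    if ( state == IO_HANDLED )
        v->pc += 4;               /* assumed fixed-size instruction */

    return state;
}

/* Stand-in for the device model (QEMU) finishing the request. */
static void toy_device_model_complete(struct toy_vcpu *v, uint64_t value)
{
    assert(v->req == REQ_PENDING);
    v->data = value;
    v->req = REQ_DONE;
}
```

In this model the first toy_stage2_abort() call returns IO_RETRY and
leaves pc untouched. After toy_device_model_complete() has run, calling
toy_stage2_abort() again returns IO_HANDLED, copies the value into reg
and advances pc, which is the "retry the original operation" behaviour
asked about above.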