Re: [Xen-devel] [PATCH v3 5/7] vpci: fix execution of long running operations
>>> On 07.11.18 at 18:15, <roger.pau@xxxxxxxxxx> wrote:
> On Wed, Nov 07, 2018 at 08:06:00AM -0700, Jan Beulich wrote:
>> >>> On 07.11.18 at 12:11, <roger.pau@xxxxxxxxxx> wrote:
>> > On Mon, Nov 05, 2018 at 09:56:13AM -0700, Jan Beulich wrote:
>> >> >>> On 30.10.18 at 16:41, <roger.pau@xxxxxxxxxx> wrote:
>> >> > BAR map/unmap is a long running operation that needs to be preempted
>> >> > in order to avoid overrunning the assigned vCPU time (or even
>> >> > triggering the watchdog).
>> >> >
>> >> > Current logic for this preemption is wrong, and won't work at all for
>> >> > AMD since only Intel makes use of hvm_io_pending (and even in that
>> >> > case the current code is wrong).
>> >>
>> >> I'm having trouble understanding this, both for the AMD aspect
>> >> (it is only vvmx.c which has a function call not mirrored on the
>> >> AMD side) and for the supposed general brokenness. Without
>> >> some clarification I can't judge whether re-implementing via
>> >> tasklet is actually the best approach.
>> >
>> > hvm_io_pending itself cannot block the vCPU from executing; it's used
>> > by nvmx_switch_guest in order to prevent changing the nested VMCS if
>> > there's pending IO emulation work AFAICT.
>> >
>> > The only way I could find to actually prevent a vCPU from running
>> > while doing some work on its behalf in a preemptive way is by
>> > blocking it and using a tasklet. What's done with IOREQs is not
>> > suitable here since Xen needs to do some work instead of just waiting
>> > on an external event (an event channel from the IOREQ).
>>
>> No, there is a second means, I've just confused the functions. The
>> question is whether your vpci_process_pending() invocation
>> perhaps sits in the wrong function. handle_hvm_io_completion() is
>> what hvm_do_resume() calls, and what can prevent a guest from
>> resuming execution. The hvm_io_pending() invocation just sits on
>> a special case path down from there (through handle_pio()).
>
> Yes, handle_hvm_io_completion is the function that actually blocks the
> vCPU and waits for an event channel from the ioreq. This is however
> not suitable because it uses the following code (simplified):
>
> set_bit(_VPF_blocked_in_xen, &current->pause_flags);
> raise_softirq(SCHEDULE_SOFTIRQ);
> do_softirq();
>
> In the vPCI case Xen cannot set the vCPU as blocked_in_xen; Xen needs
> the scheduler to schedule the vCPU so the pending work can be
> processed.

Right, and I didn't say you should set the vCPU to blocked. What
I've pointed out is that the mere fact of handle_hvm_io_completion()
returning false makes hvm_do_resume() bail, resulting in another
round through softirq processing (from entry.S code) as long as
_some_ softirq is pending (here: the scheduler one).

> Then if the blocked bit is not set, the call to do_softirq would
> recurse, thus probably leading to a stack overflow if there's
> enough pending work, i.e.:
>
> <process work>
>  <do_softirq>
>   <process work>
>    <do_softirq>
>     <...>

Why would that be? The do_softirq() invocation sits on the
exit-to-guest path, explicitly avoiding any such nesting unless
there was a do_softirq() invocation somewhere in a softirq handler.

Jan
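
For illustration, here is a minimal sketch of the block-and-tasklet
scheme Roger describes above. It assumes Xen's classic tasklet
interface (a handler taking an unsigned long); vpci_do_pending_work()
and the per-vCPU vpci_task field are hypothetical stand-ins for the
real BAR map/unmap logic:

    /* Tasklet handler: runs outside guest vCPU context. */
    static void vpci_process_pending(unsigned long data)
    {
        struct vcpu *v = (struct vcpu *)data;

        /* Do the long running BAR map/unmap work on the vCPU's behalf. */
        vpci_do_pending_work(v);                 /* hypothetical helper */

        /* Work is done: let the guest vCPU run again. */
        vcpu_unpause(v);
    }

    /* Deferral site, e.g. where a BAR write triggers the remapping. */
    static void defer_map(struct vcpu *v)
    {
        tasklet_init(&v->vpci_task,              /* hypothetical field */
                     vpci_process_pending, (unsigned long)v);

        /* Keep the vCPU from resuming guest execution... */
        vcpu_pause_nosync(v);
        /* ...and have the tasklet perform the work instead. */
        tasklet_schedule(&v->vpci_task);
    }

Note the vCPU is paused rather than marked _VPF_blocked_in_xen, so the
scheduler remains in charge and no nested do_softirq() is involved.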
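
And a sketch of the alternative Jan points at: perform a bounded chunk
of work on the hvm_do_resume() path and bail with a scheduler softirq
pending, so the exit-to-guest code in entry.S runs softirq processing
and comes back, giving preemption without recursion. Here
vpci_work_pending() and vpci_do_chunk() are again hypothetical names:

    void hvm_do_resume(struct vcpu *v)
    {
        /* ... existing IOREQ completion handling ... */

        if ( vpci_work_pending(v) )              /* hypothetical helper */
        {
            /* Perform one bounded chunk of the map/unmap work. */
            vpci_do_chunk(v);                    /* hypothetical helper */

            /*
             * Bail without entering the guest: the pending scheduler
             * softirq makes the exit-to-guest path in entry.S invoke
             * do_softirq() and eventually re-run this function, so
             * there is no recursive do_softirq() and no stack growth.
             */
            raise_softirq(SCHEDULE_SOFTIRQ);
            return;
        }

        /* ... normal resume continues ... */
    }

This mirrors the way the quoted handle_hvm_io_completion() check
already prevents a guest from resuming while IOREQ work is pending.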