Re: [Xen-devel] [PATCH v3 5/7] vpci: fix execution of long running operations
On Thu, Nov 08, 2018 at 02:55:56AM -0700, Jan Beulich wrote:
> >>> On 07.11.18 at 18:15, <roger.pau@xxxxxxxxxx> wrote:
> > On Wed, Nov 07, 2018 at 08:06:00AM -0700, Jan Beulich wrote:
> >> >>> On 07.11.18 at 12:11, <roger.pau@xxxxxxxxxx> wrote:
> >> > On Mon, Nov 05, 2018 at 09:56:13AM -0700, Jan Beulich wrote:
> >> >> >>> On 30.10.18 at 16:41, <roger.pau@xxxxxxxxxx> wrote:
> >> >> > BAR map/unmap is a long running operation that needs to be
> >> >> > preempted in order to avoid overrunning the assigned vCPU time
> >> >> > (or even triggering the watchdog).
> >> >> >
> >> >> > Current logic for this preemption is wrong, and won't work at
> >> >> > all for AMD since only Intel makes use of hvm_io_pending (and
> >> >> > even in that case the current code is wrong).
> >> >>
> >> >> I'm having trouble understanding this, both for the AMD aspect
> >> >> (it is only vvmx.c which has a function call not mirrored on the
> >> >> AMD side) and for the supposed general brokenness. Without some
> >> >> clarification I can't judge whether re-implementing via tasklet
> >> >> is actually the best approach.
> >> >
> >> > hvm_io_pending itself cannot block the vCPU from executing; it's
> >> > used by nvmx_switch_guest in order to prevent changing the nested
> >> > VMCS if there's pending IO emulation work AFAICT.
> >> >
> >> > The only way I could find to actually prevent a vCPU from running
> >> > while doing some work on its behalf in a preemptive way is by
> >> > blocking it and using a tasklet. What's done with IOREQs is not
> >> > suitable here since Xen needs to do some work instead of just
> >> > wait on an external event (an event channel from the IOREQ).
> >>
> >> No, there is a second means, I've just confused the functions. The
> >> question is whether your vpci_process_pending() invocation perhaps
> >> sits in the wrong function. handle_hvm_io_completion() is what
> >> hvm_do_resume() calls, and what can prevent a guest from resuming
> >> execution. The hvm_io_pending() invocation just sits on a special
> >> case path down from there (through handle_pio()).
> >
> > Yes, handle_hvm_io_completion is the function that actually blocks
> > the vCPU and waits for an event channel from the ioreq. This is
> > however not suitable because it uses the following code
> > (simplified):
> >
> >     set_bit(_VPF_blocked_in_xen, &current->pause_flags);
> >     raise_softirq(SCHEDULE_SOFTIRQ);
> >     do_softirq();
> >
> > In the vPCI case Xen cannot set the vCPU as blocked_in_xen; Xen
> > needs the scheduler to schedule the vCPU so the pending work can be
> > processed.
>
> Right, and I didn't say you should set the vCPU to blocked. What
> I've pointed out is that the mere fact of handle_hvm_io_completion()
> returning false makes hvm_do_resume() bail, resulting in another
> round through softirq processing (from entry.S code) as long as
> _some_ softirq is pending (here: the scheduler one).

No, hvm_do_resume bailing doesn't prevent the VM from being scheduled.
Both vmx_do_resume and svm_do_resume (the callers of hvm_do_resume)
will run the guest regardless of whether hvm_do_resume has bailed
early. Note that at the point where hvm_do_resume is called there's no
turning back: both {vmx,svm}_do_resume functions are annotated as
noreturn. The only way to prevent guest execution at that point is to
raise a scheduler softirq and process it, like it's done in
wait_on_xen_event_channel.
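For completeness, the "block the vCPU and use a tasklet" approach I
mentioned above boils down to something like the following (a heavily
simplified sketch, not the actual patch code; the helpers and the
per-vCPU field names below are made up for illustration only):

    /*
     * Sketch only: block the vCPU that triggered the BAR map/unmap,
     * let a tasklet do the actual work in preemptible chunks, and
     * wake the vCPU again once all ranges have been processed.
     */
    static void vpci_map_tasklet(unsigned long arg)    /* hypothetical */
    {
        struct vcpu *v = (struct vcpu *)arg;

        while ( vpci_has_pending_ranges(v) )           /* hypothetical */
        {
            vpci_map_one_chunk(v);                     /* hypothetical */

            if ( softirq_pending(smp_processor_id()) )
            {
                /* Give the pCPU back and continue later. */
                tasklet_schedule(&v->vpci.task);       /* hypothetical field */
                return;
            }
        }

        /* All done: allow the scheduler to run the vCPU again. */
        clear_bit(_VPF_blocked_in_xen, &v->pause_flags);
        vcpu_wake(v);
    }

    /* On the vCPU that wrote the BAR or command register: */
    set_bit(_VPF_blocked_in_xen, &current->pause_flags);
    tasklet_schedule(&current->vpci.task);             /* hypothetical field */
    raise_softirq(SCHEDULE_SOFTIRQ);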
> > Then if the blocked bit is not set the call to do_softirq would
> > recurse, thus probably leading to a stack overflow if there's
> > enough pending work, i.e.:
> >
> > <process work>
> > <do_softirq>
> > <process work>
> > <do_softirq>
> > <...>
>
> Why would that be? The do_softirq() invocation sits on the
> exit-to-guest path, explicitly avoiding any such nesting unless there
> was a do_softirq() invocation somewhere in a softirq handler.

It sits on an exit-to-guest path, but the following chunk:

    raise_softirq(SCHEDULE_SOFTIRQ);
    do_softirq();

would prevent the path from ever reaching the exit-to-guest code and
would instead nest on itself, unless the vCPU is marked as blocked,
which prevents it from being scheduled and thus avoids this recursion.

Thanks,
Roger.
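PS: for reference, the exit-to-guest check we are both referring to
does roughly the following (a simplified C rendering of the entry.S
logic, not the actual code):

    /*
     * Before resuming the guest, the exit path processes any pending
     * softirq (including the scheduler one) and only enters the guest
     * once nothing is left pending.
     */
    for ( ;; )
    {
        local_irq_disable();
        if ( !softirq_pending(smp_processor_id()) )
            break;              /* nothing pending, enter the guest */
        local_irq_enable();
        do_softirq();           /* may invoke the scheduler */
    }
    /* ... VMRESUME / VMRUN ... */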