Re: [Xen-devel] [PATCH] xen/arm: introduce vwfi parameter
On 20/02/17 18:43, Stefano Stabellini wrote:
> On Mon, 20 Feb 2017, Dario Faggioli wrote:
>> On Sun, 2017-02-19 at 21:34 +0000, Julien Grall wrote:
>>> Hi Stefano,
>>>
>>> I have CCed another ARM person who has more knowledge than me on
>>> scheduling/power.
>>>
>> Ah, when I saw this, I thought you were Cc-ing my friend Juri, who
>> also works there, and is doing that stuff. :-)
>>
>>>> In both cases the vcpu is not run until the next slot, so I don't
>>>> think it should make the performance worse in multi-vcpu scenarios.
>>>> But I can do some tests to double check.
>>>
>>> Looking at your answer, I think it would be important that everyone
>>> in this thread understands the purpose of WFI and how it differs
>>> from WFE.
>>>
>>> The two instructions provide a way to tell the processor to go into
>>> a low-power state. It means the processor can turn off power to some
>>> parts (e.g. a unit, the pipeline...) to save energy.
>>>
>> [snip]
>>>
>>> For both instructions it is normal to have a higher latency when
>>> receiving an interrupt. When software uses them, it knows that
>>> there will be an impact, but overall it will expect some power to
>>> be saved. Whether the current numbers are acceptable is another
>>> question.
>>>
>> Ok, thanks for this useful information. I think I understand the idea
>> behind these two instructions/mechanisms now.
>>
>> What (I think) Stefano is proposing is providing the user (of Xen on
>> ARM) with a way of making them behave differently.
>
> That's right. It's not always feasible to change the code of the guest
> the user is running. Maybe she cannot, or maybe she doesn't want to for
> other reasons. Keep in mind that the developer of the operating system
> in this example might have had very different expectations of irq
> latency, given that, even with wfi, it is much lower on native.
>
> When irq latency is way more important than power consumption to the
> user (think of a train, or an industrial machine that needs to move
> something in a given amount of time), this option provides value to
> her at very little maintenance cost on our side.
>
> Of course, even if we introduce this option, by no means should we
> stop improving the irq latency in the normal cases.
>
>
>> Whether good or bad, I've expressed my thoughts, and it's your call
>> in the end. :-)
>> George also has a fair point, though. Using yield is a quick and
>> *most likely* effective way of achieving Linux's "idle=poll", but at
>> the same time a rather risky one, as it basically means the final
>> behavior would rely on how yield() behaves in the specific scheduler
>> the user is using, which may vary.
>>
>>> Now, regarding what you said. Let's imagine the scheduler is
>>> descheduling the vCPU until the next slot; it will run the vCPU
>>> afterwards even if no interrupt has been received.
>>>
>> There really are no slots. There sort of are in Credit1, but
>> preemption can happen inside a "slot", so I wouldn't call them that
>> there either.
>>
>>> This is a real waste of power, and it becomes worse if an interrupt
>>> is not coming for multiple slots.
>>>
>> Undeniable. :-)
>
> Of course. But if your app needs less than 3000ns of latency, then
> it's the only choice.
>
>
>>> In the case of multi-vcpu, the guest using wfi will use more slots
>>> than it was doing before. This means fewer slots for vCPUs that
>>> actually have real work to do.
>>>
>> No, because it continuously yields. So, yes, there will indeed be
>> higher scheduling overhead, but no stealing of otherwise useful
>> computation time. Not with the yield() implementations we have right
>> now in the code.
>>
>> But I'm starting to think that we should probably take a step back
>> from deep inside the scheduler, and think, first, whether or not
>> having something similar to Linux's idle=poll is something we want,
>> if only for testing, debugging, or very specific use cases.
>>
>> And only then, if the answer is yes, decide how to actually implement
>> it, whether or not to use yield, etc.

I think we want it, if the implementation is small and unintrusive. But
surely we want it to be per-domain, not system-wide?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel