Re: [Xen-devel] Xen HPET improvement proposal
On 25/10/13 13:47, Jan Beulich wrote:
>
>> Independently of the HPET issues themselves, I have identified a race
>> condition in the mwait-idle routines where a cpu which is preparing to
>> sleep can arrange for another cpu to wake it up, and have that other cpu
>> wake it up before it has enabled its mwait trigger, meaning that it will
>> idle for an arbitrary length of time in mwait. Realistically, the cpu
>> will be woken up by the time calibration rendezvous once a second, and
>> possibly by the watchdog NMI every half second.
> Which is an awfully long period of time... Looking forward to see
> further details on this.

The fix is fairly simple. The mwait code must set up the trigger on its
mwait region before arranging to be woken up. That way, if the other cpu
does wake up (early perhaps), it will activate the trigger, and we will
bounce straight back out of mwait rather than sleeping indefinitely.

Currently, there is a window between arranging to be woken up and
activating the mwait trigger where the other cpu might have already
written to the mwait region.

>> If there is not a free HPET, a cpu will need to share with another cpu.
>> If this cpu can find another HPET which will fire at an appropriate
>> time, the cpu can merely ask for it to be woken up by the HPET owner
>> when the owner wakes up. If all the HPETs are programmed to fire a
>> sufficient time into the future, one needs to be shortened. The cpu
>> should choose the soonest HPET, add itself to the owner's list of other
>> pcpus to wake, and reprogram the HPET to fire sooner. It should not
>> reprogram the HPET to point to itself.
> I think blindly looking for the one with the closest wakeup is not ideal:
> For one, on huge systems this requires you to scan through too many
> other CPUs. And taking NUMA aspects into consideration here would
> seem at the very least desirable too (i.e. prefer sharing with a CPU
> close to the one looking for a "partner").

I was actually thinking of just searching through the HPETs. There are
typically far fewer HPET channels than cpus (the most HPET channels I
have encountered in our test lab is 8).

There is also the possibility of maintaining some form of priority
structure, so that the next-to-fire HPET is trivial to identify. (My
concern here is the overhead of maintaining the priority structure.)

I see your point about NUMA, and shall consider it as I am developing
the code (although I might end up with v1 doing the dumb thing first,
before turning towards NUMA optimisation). The NUMA aspect plays the
other way round as well, with the (usually single) HPET being on the
southbridge/PCH, and therefore likely hanging off NUMA node 0.

~Andrew
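
As a rough illustration of the sharing scheme discussed above, here is a
minimal C sketch of searching the HPET channels rather than the cpus.
Everything in it is hypothetical: struct hpet_channel, hpet_share(), the
64-bit wake_mask (which assumes at most 64 pcpus) and the slack parameter
used to decide what counts as an "appropriate" time are assumptions made
for the example, not Xen's actual data structures or interfaces. Locking
and the real hardware reprogramming are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_CHANNEL (-1)

/* Hypothetical per-channel state; not Xen's real HPET channel structure. */
struct hpet_channel {
    int      owner_cpu;  /* cpu the channel interrupts, or -1 if unowned */
    uint64_t deadline;   /* absolute time at which the channel will fire */
    uint64_t wake_mask;  /* bit n set => owner must also wake pcpu n */
};

/*
 * Arrange for `cpu` to be woken at or before `wanted` by sharing an
 * already-owned channel.  Returns the index of the channel used, or
 * NO_CHANNEL if no channel currently has an owner.
 *
 * Assumption: a channel fires at an "appropriate" time if its deadline
 * lies in the window [wanted - slack, wanted].
 */
int hpet_share(struct hpet_channel *ch, size_t nr, int cpu,
               uint64_t wanted, uint64_t slack)
{
    int soonest = NO_CHANNEL;
    size_t i;

    assert(cpu >= 0 && cpu < 64);  /* wake_mask is a single 64-bit word */

    for (i = 0; i < nr; i++) {
        if (ch[i].owner_cpu < 0)
            continue;  /* unowned channels cannot wake anybody */

        /* Case 1: this channel already fires close enough to `wanted`. */
        if (ch[i].deadline <= wanted && wanted - ch[i].deadline <= slack) {
            ch[i].wake_mask |= UINT64_C(1) << cpu;
            return (int)i;
        }

        /* Remember the soonest-firing owned channel as a fallback. */
        if (soonest == NO_CHANNEL || ch[i].deadline < ch[soonest].deadline)
            soonest = (int)i;
    }

    if (soonest == NO_CHANNEL)
        return NO_CHANNEL;

    /*
     * Case 2: nothing suitable.  Join the soonest channel and, if it
     * fires too late, pull its deadline in.  The owner is deliberately
     * left unchanged: the channel keeps interrupting its owner, which
     * then wakes `cpu`.  If the channel fires earlier than the window,
     * `cpu` is simply woken early (a simplification in this sketch).
     */
    ch[soonest].wake_mask |= UINT64_C(1) << cpu;
    if (ch[soonest].deadline > wanted)
        ch[soonest].deadline = wanted;

    return soonest;
}
```

The scan is linear in the number of channels, which seems tolerable with
only a handful of them; a priority structure would make the fallback
lookup cheaper at the cost of keeping it up to date on every reprogram.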