[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 1/5] xen/domain: Remove function pointers from domain pause helpers
On 12/11/2021 09:36, Julien Grall wrote: On 11/11/2021 17:57, Andrew Cooper wrote:Retpolines are expensive, and all these do are select between the sync and nosync helpers. Pass a boolean instead, and use direct calls everywhere.To be honest, I much prefer to read the old code. I am totally not against the change but I can see how I would be ready to introduce new function pointers use in the future. Really? The only reason there are function points to begin with was because of a far more naive (and far pre-spectre) me fixing a reference counting mess in 2014 by consolidating the logic. My mistake was not spotting that the function pointers weren't actually necessary in the first place. So I think we need some guidelines on when to use function pointers in Xen. It's easy. If you are in any way unsure, they're probably the wrong answer. (Ok - I'm being a little facetious) There are concrete security improvements from not using function pointers, demonstrated by fact that JOP/COP attacks are so pervasive that all major hardware and software vendors are working on techniques (both hardware and software) to prevent forward-edge control flow integrity violations. (The mandate from the NSA to make this happen certainly helped spur things on, too.) There are also concrete performance improvements too. All competitive processors in the market today can cope with direct branches more efficiently than indirect branches, and a key principle behind profile-guided-optimsiation is to rearrange your `ptr()` function pointer call into `if ( ptr == a ) a(); else if ( ptr == b ) b(); else ptr()` based on the frequency of destinations, because this really does make orders of magnitude improvements in some cases. We have some shockingly inappropriate uses of function pointers in Xen right now (patches 4 and 5 in particular, and "x86/hvm: Remove callback from paging->flush_tlb() hook" posted today). While this specific example doesn't fall into shockingly inappropriate in my books, it is firmly in the "not appropriate" category. The more...Pause/unpause operations on behalf of dom0 are not fastpaths, so avoid exposing the __domain_pause_by_systemcontroller() internal. This actually compiles smaller than before:... the code doesn't really compile smaller on Arm: 42sh> ../scripts/bloat-o-meter xen-syms-old xen-syms add/remove: 4/2 grow/shrink: 0/6 up/down: 272/-252 (20) Function old new delta _domain_pause - 136 +136 _domain_pause_by_systemcontroller - 120 +120 domain_pause_by_systemcontroller_nosync - 8 +8 domain_pause_by_systemcontroller - 8 +8 domain_resume 136 132 -4 domain_pause_nosync 12 8 -4 domain_pause 12 8 -4 domain_pause_except_self 188 180 -8 do_domctl 5480 5472 -8 domain_kill 372 356 -16 do_domain_pause 88 - -88 __domain_pause_by_systemcontroller 120 - -120 Total: Before=966919, After=966939, chg +0.00% ARM, like x86, compiles for speed, not size. "it got a bit larger" is generally not as interesting as "it got smaller, despite everything the compiler would normally do in the opposite direction". Obviously, if the build of Xen got 10x larger then we'd have a problem, but it hasn't. Conversely, if we were compiling for size not speed, then "it got smaller" is the uninteresting direction. The truth is that Xen (both x86 and ARM) is a couple of megabytes in RAM and this literally doesn't matter these days where RAM is measured in GB and TB. We care to an extent for efficient use of the cache/etc, but noone would bat an eyelid at several MB more for a want-to-have feature. Here, you're saying that for an added 5 instructions, totalling 0.00% delta in the size of the hypervisor, you've got some reasonably frequent codepaths which will execute faster, and cannot be hijacked with a JOP/COP attack. That's a clear all-around improvement on ARM, just like it is on x86. ~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |