[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 00/12] const-ify vendor checks


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>
  • Date: Mon, 9 Feb 2026 12:56:52 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=suse.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0)
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=TKz42avQEeqgp9Afgsat2XSqKjshh55Fm/ZrKyKf2m0=; b=MeZT9w1x9HKYBK6qYcaw/V24nlbNpWqGJhvoFkTzTbNqLOSlGWAkBUdtDOjIcVxD84yNiUVvyWfHlBP9GFSe3UCli0bY+6dW26rFYu/S+f7XcnYMqlFoDcirIlMOtHxp1RFdHvh2f18J0P7eFGrT9a3HaLX7pX87ickIbiLhjg22WXg04OsTKdWFFB1/yV9gJ6xQFQDgSeasw1XGVCuy6Z3GOaVJJS9szEd+XMusDh9FD05yHDZrFquLz+ttZ7gXzf1LfSezHClw1yiXDyjzu24hyQsdr/k/5RPGZTa8QiXRKUcrVjJj+7CH0nfFcTQmfpm2iTVzIjUG3lSj8HdNBQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=UdCfqo8jicYVhAcwEjHmUXbjdKpqC+DNoao0nWtCi4A661MqI8q7JQUc6I5qLb7JZnJMpo4dLPJgQ+AwBBJxjcdkp7YqviBgn1/fzKPFFx7qRaiwEa4R0pjfncbt7yUdLCMbhh/4huPjm+dntxKig5A6cBpscbHV3FOEsswk05XWo7SH0mZs3fkSTgQTp4oFY/7riR8UcbBsE8mCpuGu31BigeWITwn9jEHKyZSW6NywKx0GWHUKVBdfZrUXsBAdqygs7q33jaJGvj+srx9H1stjMj+4xYN2MrWNTFf13sTYs2Te5w5ab2p97Zo0edJiYdsLiHk5LdJafraASA7KPg==
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Jason Andryuk <jason.andryuk@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 09 Feb 2026 11:57:20 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon Feb 9, 2026 at 11:15 AM CET, Jan Beulich wrote:
> On 09.02.2026 11:05, Alejandro Vallejo wrote:
>> On Mon Feb 9, 2026 at 10:21 AM CET, Jan Beulich wrote:
>>> On 06.02.2026 17:15, Alejandro Vallejo wrote:
>>>> High level description
>>>> ======================
>>>>
>>>> When compared to the RFC this makes a different approach The series 
>>>> introduces
>>>> cpu_vendor() which maps to a constant in the single vendor case and to
>>>> (boot_cpu_data.vendor & X86_ENABLED_VENDORS), where X86_ENABLED_VENDORS is 
>>>> a
>>>> mask of the compile-time chosen vendors. This enables the compiler to 
>>>> detect
>>>> dead-code at the uses and remove all unreachable branches, including in
>>>> switches.
>>>>
>>>> When compared to the x86_vendor_is() macro introduced in the RFC, this is
>>>> simpler. It achieves MOST of what the older macro did without touching the
>>>> switches, with a few caveats:
>>>>
>>>>   1. Compiled-out vendors cause a panic, they don't fallback onto the 
>>>> unknown
>>>>      vendor case. In retrospect, this is a much saner thing to do.
>>>
>>> I'm unconvinced here. Especially our Centaur and Shanghai support is at best
>>> rudimentary. Treating those worse when configured-out than when 
>>> configured-in
>>> feels questionable.
>> 
>> Isn't that the point of configuring out?
>
> That's what I'm unsure about.

I'm really missing what you're trying to make, sorry. How, if at all, is it
helpful for a hypervisor with a compiled out vendor to be bootable on a machine
of that vendor?

>
>> Besides the philosophical matter of whether or not a compiled-out vendor
>> should be allowed to run there's the more practical matter of what to do
>> with the x86_vendor field of boot_cpu_data. Because at that point our take
>> that cross-vendor support is forbidden is a lot weaker. If I can run an
>> AMD-hypervisor on an Intel host, what then? What policies would be allowed? 
>> If I
>> wipe x86_vendor then policies with some unknown vendor would be fine. Should 
>> the
>> leaves match too? If I do not wipe the field, should I do black magic to 
>> ensure
>> the behaviour is different depending on whether the vendor is compiled in or
>> not? What if I want to migrate a VM currently running in this hypothetical
>> hypervisor? The rules becomes seriously complex.
>> 
>> It's just a lot cleaner to take the stance that compiled out vendors can't 
>> run.
>> Then everything else is crystal clear and we avoid a universe's worth of 
>> corner
>> cases. I expect upstream Xen to support all cases (I'm skeptical about the
>> utility of the unknown vendor path, but oh well), but many downstreams might
>> benefit from killing off support for vendors they really will never touch.
>
> To them, will panic()ing (or not) make a difference?

One would hope not because the're compiling them out for a reason.
But for upstream, not panicking brings a sea of corner cases. The ones I
mentioned above is not the whole list.

Turning the question around. Who benefits from not panicking?

>
>>>>   2. equalities and inequalities have been replaced by equivalent 
>>>> (cpu_vendor() & ...)
>>>>      forms. This isn't stylistic preference. This form allows the compiler
>>>>      to merge the compared-against constant with X86_ENABLED_VENDORS, 
>>>> yielding
>>>>      much better codegen throughout the tree.
>>>>
>>>> The effect of (2) triples the delta in the full build below.
>>>>
>>>> Some differences might be attributable to the change from policy vendor 
>>>> checks
>>>> to boot_cpu_data. In the case of the emulator it caused a 400 bytes 
>>>> increase
>>>> due to the way it checks using LOTS of macro invocations, so I left that 
>>>> one
>>>> piece using the policy vendor except for the single vendor case.
>>>
>>> For the emulator I'd like to point out this question that I raised in the
>>> AVX10 series:
>> 
>> There's no optimisation shortage for the emulator. For that patch I just
>> ensure I didn't make a tricky situation worse. It is much better in the 
>> single-vendor case.
>> 
>>> "Since it'll be reducing code size, we may want to further convert
>>>  host_and_vcpu_must_have() to just vcpu_must_have() where appropriate
>>>  (should be [almost?] everywhere)."
>>>
>>> Sadly there was no feedback an that (or really on almost all of that work) 
>>> at
>>> all so far.
>> 
>> It sounds fairly orthogonal to this, unless I'm missing the point.
>
> It's largely orthogonal, except that if we had gone that route already, your
> codegen diff might look somewhat different.
>
>> In principle that would be fine. the vCPU features whose emulation requires
>> special instructions are most definitely a subset of those of the host 
>> anyway.
>> 
>> I agree many cases could be simplified as you describe.
>> 
>> I do see a worrying danger of XSA should the max policy ever exceed the
>> capabilities of the host on a feature required for emulating some instruction
>> for that very feature. Then the guest could abuse the emulator to trigger #UD
>> inside the hypervisor's emulation path.
>
> Well, that max-policy related question is why I've raised the point, rather
> than making (more) patches right away.

All jmp_rel() macros have amd_like() inside, and other checks are open-coded
in many places. The problem is that offsets into the policy pointer (which
is stored in a register) end up being global accesses to an offset from a
non-register-cached 64bit address. And that adds up. having an amd_like boolean
in the ctxt would've helped, but I went for the least intrusive solution.

I'm not sure how what you brought up would've helped for this particular codegen
matter.

Cheers,
Alejandro



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.