Re: [PATCH v3] x86/dom0: delay setting SMAP after dom0 build is done
On Thu, Aug 01, 2024 at 12:28:06PM +0200, Jan Beulich wrote:
> On 01.08.2024 11:52, Roger Pau Monne wrote:
> > @@ -1907,16 +1890,25 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p)
> >      if ( cpu_has_smep && opt_smep != SMEP_HVM_ONLY )
> >          setup_force_cpu_cap(X86_FEATURE_XEN_SMEP);
> >      if ( boot_cpu_has(X86_FEATURE_XEN_SMEP) )
> > +    {
> >          set_in_cr4(X86_CR4_SMEP);
> > +        cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
>
> Could be just "cr4_pv32_mask = X86_CR4_SMEP" now?

Yes, indeed, same below then.

> > +    }
> >
> >      if ( !opt_smap )
> >          setup_clear_cpu_cap(X86_FEATURE_SMAP);
> >      if ( cpu_has_smap && opt_smap != SMAP_HVM_ONLY )
> >          setup_force_cpu_cap(X86_FEATURE_XEN_SMAP);
> >      if ( boot_cpu_has(X86_FEATURE_XEN_SMAP) )
> > -        set_in_cr4(X86_CR4_SMAP);
> > -
> > -    cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
> > +        /*
> > +         * Set SMAP on the %cr4 mask so that it's set for APs on bringup, but
> > +         * don't set for the BSP until domain building is done.
> > +         *
> > +         * Don't set it in cr4_pv32_mask either, until it's also set on the
> > +         * BSP.  Otherwise the BUG in cr4_pv32_restore would trigger for events
> > +         * received on the BSP.
> > +         */
> > +        mmu_cr4_features |= X86_CR4_SMAP;
>
> Don't you put APs at risk this way of triggering the BUG in
> cr4_pv32_restore()?  They'll have the bit set in %cr4, but the bit
> remains clear in cr4_pv32_mask until much later.

As long as the bit is set in %cr4, but not in cr4_pv32_mask, the BUG in
cr4_pv32_restore won't hit.

> > @@ -2048,6 +2040,18 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p)
> >      if ( !dom0 )
> >          panic("Could not set up DOM0 guest OS\n");
> >
> > +    /*
> > +     * Enable SMAP only after being done with the domain building phase, as the
> > +     * PV builder switches to the domain page-tables and must be run with SMAP
> > +     * disabled.
> > +     */
> > +    if ( boot_cpu_has(X86_FEATURE_XEN_SMAP) )
> > +    {
> > +        ASSERT(mmu_cr4_features & X86_CR4_SMAP);
> > +        write_cr4(read_cr4() | X86_CR4_SMAP);
> > +        cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
> > +    }
>
> Similarly for the BSP here: if we take an NMI between setting CR4.SMAP
> and setting the bit in cr4_pv32_mask, cr4_pv32_restore() would hit the
> BUG there if I'm not mistaken.

No, having extra bits set in %cr4 that are not present in cr4_pv32_mask
won't trigger the BUG.  It's the other way around: having bits set in
cr4_pv32_mask but not in %cr4 does.  As long as %cr4 is always updated
ahead of setting the bit in cr4_pv32_mask that's fine.

See the logic in cr4_pv32_restore:

        /* Check that _all_ of the bits intended to be set actually are. */
        mov   %cr4, %rax
        and   cr4_pv32_mask(%rip), %rax
        cmp   cr4_pv32_mask(%rip), %rax
        je    1f
        [...]
        BUG

Otherwise none of this would be safe, as it's impossible to set %cr4 and
cr4_pv32_mask atomically in a single instruction.

> I further fear that switching things around won't help either.  The
> code you remove from create_dom0() looks to have the same issue.  The
> only NMI-safe sequence looks to be: clear both bits from %cr4, update
> cr4_pv32_mask as wanted, and then write %cr4 with the bits from
> cr4_pv32_mask ORed in.

No, that won't be safe, as cr4_pv32_mask would contain bits not set in
%cr4 and cr4_pv32_restore would hit the BUG.  The update order is the
other way around: first set in %cr4, then in cr4_pv32_mask.

Thanks, Roger.