
Re: [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 31 Aug 2023 10:59:31 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Paul Durrant <paul.durrant@xxxxxxxxxx>
  • Delivery-date: Thu, 31 Aug 2023 08:59:57 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Aug 31, 2023 at 09:03:18AM +0200, Jan Beulich wrote:
> On 30.08.2023 20:09, Andrew Cooper wrote:
> > On 30/08/2023 3:30 pm, Roger Pau Monné wrote:
> >> On Wed, Sep 12, 2018 at 03:09:35AM -0600, Jan Beulich wrote:
> >>> The function does two translations in one go for a single guest access.
> >>> Any failure of the first translation step (guest linear -> guest
> >>> physical), resulting in #PF, ought to take precedence over any failure
> >>> of the second step (guest physical -> host physical).
> > 
> > Erm... No?
> > 
> > There are up to 25 translation steps, assuming a memory operand
> > contained entirely within a cache line.
> > 
> > They intermix between gla->gpa and gpa->spa in a strict order.
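
FWIW, I assume that count comes from a 4-level guest walk under 4-level
EPT: the 4 guest PTE reads plus the data access make 5 guest-physical
references, and each of those takes a 4-level EPT walk plus the access
itself, i.e. 5 * (4 + 1) = 25.
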
> 
> But we're talking about an access crossing a page boundary here.
> 
> > There's not a point where the error is ambiguous, nor is there ever a
> > point where a pagewalk continues beyond a faulting condition.
> > 
> > Hardware certainly isn't wasting transistors to hold state just to see
> > whether it could progress further in order to hand back a different error...
> > 
> > 
> > When the pipeline needs to split an access, it has to generate multiple
> > adjacent memory accesses, because the unit of memory access is a cache line.
> > 
> > There is a total order of accesses in the memory queue, so any fault
> > from the first byte of the access will be delivered before any fault
> > from the first byte that moves into the next cache line.
> 
> Looks like we're fundamentally disagreeing on what we try to emulate in
> Xen. My view is that the goal ought to be to match, as closely as
> possible, how code would behave on bare metal. IOW no consideration
> of the GPA -> MA translation steps. Of course in a fully virtualized
> environment these necessarily have to occur for the page table accesses
> themselves, before the actual memory access can be carried out. But
> that's different for the leaf access itself. (In fact I'm not even sure
> the architecture guarantees that the two split accesses, or their
> associated page walks, always occur in [address] order.)
> 
> I'd also like to expand on the "we're": Considering the two R-b I got
> back at the time, both reviewers apparently agreed with my way of looking
> at things. With Roger's reply that you've responded to here, I'm
> getting the impression that he also shares that view.

Ideally the emulator should attempt to replicate the behavior a guest
gets when running with second-stage translation, so that it's not
possible to differentiate the behavior of emulating an instruction vs
executing it in non-root mode. IOW: not only take the ordering of #PF
into account, but also that of the EPT_VIOLATION vmexits.
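
To make the difference concrete, here is a standalone sketch (nothing
Xen-specific; all names and the fault model are made up) of the two
orderings being argued about, for an access that crosses a page
boundary:

#include <stdio.h>
#include <stdbool.h>

enum fault { FAULT_NONE, FAULT_PF, FAULT_EPT_VIOLATION };

/* Hypothetical per-page outcome of the two translation stages. */
struct page_xlat {
    bool la_to_gpa_ok;   /* guest linear -> guest physical */
    bool gpa_to_spa_ok;  /* guest physical -> host physical */
};

/* View A: do the linear -> physical step for every involved page
 * first, so a #PF on either page wins over any EPT problem. */
static enum fault translate_all_first(const struct page_xlat p[2])
{
    for ( unsigned int i = 0; i < 2; i++ )
        if ( !p[i].la_to_gpa_ok )
            return FAULT_PF;
    for ( unsigned int i = 0; i < 2; i++ )
        if ( !p[i].gpa_to_spa_ok )
            return FAULT_EPT_VIOLATION;
    return FAULT_NONE;
}

/* View B: strict per-page order, i.e. finish both steps for the
 * first page before looking at the second one at all. */
static enum fault strict_order(const struct page_xlat p[2])
{
    for ( unsigned int i = 0; i < 2; i++ )
    {
        if ( !p[i].la_to_gpa_ok )
            return FAULT_PF;
        if ( !p[i].gpa_to_spa_ok )
            return FAULT_EPT_VIOLATION;
    }
    return FAULT_NONE;
}

int main(void)
{
    /* First page translates fine linearly but has no usable second-stage
     * mapping; the second page faults on the linear -> physical step. */
    const struct page_xlat split[2] = {
        { .la_to_gpa_ok = true,  .gpa_to_spa_ok = false },
        { .la_to_gpa_ok = false, .gpa_to_spa_ok = true  },
    };

    printf("translate-all-first reports: %d\n", translate_all_first(split));
    printf("strict-order reports:        %d\n", strict_order(split));
    return 0;
}

The two policies only disagree for that kind of split case: the first
reports the #PF from the second page, the second reports the EPT
violation on the first page.
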

> Of course that still doesn't mean we're right and you're wrong, but if
> you think that's the case, it'll take you actually supplying arguments
> supporting your view. And since we're talking of an abstract concept
> here, resorting to how CPUs actually deal with the same situation
> isn't enough. It wouldn't be the first time that they got things
> wrong. Plus it may also require you to accept that different views are
> possible, without one being strictly wrong and the other strictly
> right.

I don't really have an answer here; given the lack of a written-down
specification from vendors, I think we should just go with whatever is
easier for us to handle in the hypervisor.

Also, this is such a corner case that I would think any guest
attempting it is likely hitting a bug or trying something fishy.

Thanks, Roger.
