
Re: [Xen-devel] [PATCH 4/4] XSA-60 security hole: flush cache when vmentry back to UC guest

To: Jan Beulich <JBeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>
From: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
Date: Mon, 25 Nov 2013 16:08:46 +0000
Accept-language: en-US
Cc: "keir@xxxxxxx" <keir@xxxxxxx>, "suravee.suthikulpanit@xxxxxxx" <suravee.suthikulpanit@xxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, "zhenzhong.duan@xxxxxxxxxx" <zhenzhong.duan@xxxxxxxxxx>, "Dugger, Donald D" <donald.d.dugger@xxxxxxxxx>, "tim@xxxxxxx" <tim@xxxxxxx>, "Auld, Will" <will.auld@xxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, "sherry.hurwitz@xxxxxxx" <sherry.hurwitz@xxxxxxx>, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>
Delivery-date: Mon, 25 Nov 2013 16:09:46 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>
Thread-index: AQHO1Y6UVOVonvTrQHKgaof9wQdN4po2MT9w
Thread-topic: [PATCH 4/4] XSA-60 security hole: flush cache when vmentry back to UC guest

>> From: Liu Jinsong <jinsong.liu@xxxxxxxxx>
>> Date: Thu, 31 Oct 2013 06:38:15 +0800
>> Subject: [PATCH 4/4] XSA-60 security hole: flush cache when vmentry
>> back to UC guest 
>> 
>> This patch flush cache when vmentry back to UC guest, to prevent
>> cache polluted by hypervisor access guest memory during UC mode.
>> 
>> The elegant way to do this is, simply add wbinvd just before vmentry.
>> However, currently wbinvd before vmentry will mysteriously trigger
>> lapic timer interrupt storm, hung booting stage for 10s ~ 60s. We
>> still didn't dig out the root cause of interrupt storm, so currently this
>> patch add flag indicating hypervisor access UC guest memory to
>> prevent interrupt storm problem. Whenever the interrupt storm got root caused
>> and fixed, the protection flag can be removed.
> 

Hi,

We redid the investigation this weekend and found the root cause of the
interrupt storm. The test data:

 =========================================================
 Test 1: add wbinvd before vmentry (at vmx_vmenter_helper())
 (XEN) uc_vmenter_count = 10607
 (XEN) uc_vmexit_count = 10607
 (XEN) EXIT-REASON      COUNT
 (XEN)        1               10463       // EXIT_REASON_EXTERNAL_INTERRUPT
 (XEN)       28                   10       // EXIT_REASON_CR_ACCESS
 (XEN)       31                 114       // EXIT_REASON_MSR_READ
 (XEN)       32                   15       // EXIT_REASON_MSR_WRITE
 (XEN)       54                     5       // EXIT_REASON_WBINVD
 (XEN) TOTAL EXIT-REASON-COUNT = 10607
 (XEN)
 (XEN) vcpu[0] vmentry count = 10492
 (XEN) vcpu[1] vmentry count = 37
 (XEN) vcpu[2] vmentry count = 40
 (XEN) vcpu[3] vmentry count = 38
 (XEN) interrupt vec 0xfa occurs 10450 times  // lapic timer
 (XEN) interrupt vec 0xfb occurs 13 times       // call function IPI


 Test 2: current patch, which does not add wbinvd before vmentry
 (XEN) uc_vmenter_count = 147
 (XEN) uc_vmexit_count = 147
 (XEN) EXIT-REASON      COUNT
 (XEN)        1                      3          // EXIT_REASON_EXTERNAL_INTERRUPT
 (XEN)       28                    10         // EXIT_REASON_CR_ACCESS
 (XEN)       31                  114         // EXIT_REASON_MSR_READ
 (XEN)       32                    15         // EXIT_REASON_MSR_WRITE
 (XEN)       54                      5         // EXIT_REASON_WBINVD
 (XEN) TOTAL EXIT-REASON-COUNT = 147
 (XEN)
 (XEN) vcpu[0] vmentry count = 45
 (XEN) vcpu[1] vmentry count = 34
 (XEN) vcpu[2] vmentry count = 34
 (XEN) vcpu[3] vmentry count = 34
 (XEN) interrupt vec 0xfa occurs 3 times  // lapic timer
 ==================================================================

From the data above:
(uc_vmenter_count of test 1) - (uc_vmenter_count of test 2) = 10607 - 147 = 10460
(intr exits of test 1)       - (intr exits of test 2)       = 10463 - 3   = 10460
That means the additional vmentries _all_ come from external interrupts:
almost every wbinvd triggers one lapic timer interrupt.
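
As a cross-check of the arithmetic, here is a small Python sketch that
recomputes the difference from the counters in the two logs. The dictionary
and function names are made up for illustration; only the numbers come from
the logs.

```python
# Exit-reason counters copied from the Test 1 and Test 2 logs.
# Keys are the VMX exit-reason numbers printed in the EXIT-REASON column.
TEST1_EXITS = {1: 10463, 28: 10, 31: 114, 32: 15, 54: 5}
TEST2_EXITS = {1: 3, 28: 10, 31: 114, 32: 15, 54: 5}


def extra_exits_by_reason(with_wbinvd, without_wbinvd):
    """Return {exit_reason: extra_count} for reasons whose counts differ."""
    reasons = set(with_wbinvd) | set(without_wbinvd)
    diff = {r: with_wbinvd.get(r, 0) - without_wbinvd.get(r, 0) for r in reasons}
    return {r: d for r, d in diff.items() if d != 0}


# Difference in total vmexits between the two runs.
total_extra = sum(TEST1_EXITS.values()) - sum(TEST2_EXITS.values())
print(total_extra)                                      # 10460
# Only exit reason 1 (external interrupt) accounts for the difference.
print(extra_exits_by_reason(TEST1_EXITS, TEST2_EXITS))  # {1: 10460}
```

The per-vcpu vmentry counts are consistent with the totals as well
(10492 + 37 + 40 + 38 = 10607 and 45 + 34 + 34 + 34 = 147).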

The root cause is that wbinvd is a _very_ time-consuming operation, so:
1. wbinvd runs at vmentry ... the timer has a good chance of expiring in
   the meantime, and since irqs are disabled at that point the interrupt
   is delayed until
2. ... vmentry back to the guest (irqs enabled), where the timer interrupt
   fires and kicks the vcpu out of the guest at once;
3. back in the hypervisor ... then vmentry and wbinvd again.
This loop repeats, with interrupt and vmexit again and again, until by luck
a wbinvd completes without the timer expiring and the loop breaks. Usually
that takes 10K~60K iterations, blocking the guest for 10s~60s.
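
The loop can be illustrated with a toy model in Python. This is only a
sketch of the timing argument, not Xen code: the function name, the
microsecond values, and the assumption that the handler re-arms the timer
one fixed period after the vmexit are all invented for illustration.

```python
def exits_before_guest_runs(flush_costs_us, timer_period_us, max_entries=100_000):
    """Count interrupt-caused vmexits before the guest gets to run.

    flush_costs_us: duration of each successive wbinvd (cycled if shorter
    than the number of entries). The flush runs with irqs disabled, so a
    timer that expires during it is only delivered once the guest is entered.
    """
    now = 0
    deadline = timer_period_us
    exits = 0
    for i in range(max_entries):
        now += flush_costs_us[i % len(flush_costs_us)]  # wbinvd, irqs off
        if deadline > now:
            return exits            # timer not expired: guest makes progress
        exits += 1                  # pending timer irq forces an immediate vmexit
        deadline = now + timer_period_us  # assumed: handler re-arms the timer
    return exits                    # gave up: the loop never broke


# Flush always longer than the timer period: the guest never runs.
print(exits_before_guest_runs([1200], 1000, max_entries=50))      # 50
# Three slow flushes, then a "lucky" fast one breaks the loop.
print(exits_before_guest_runs([1200, 1200, 1200, 800], 1000))     # 3
# Flush much shorter than the period: no spurious exits at all.
print(exits_before_guest_runs([500], 1000))                       # 0
```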

To fix the dead-like loop and the interrupt storm, an attractive approach
is to do the wbinvd at vmentry but while irqs are still enabled, so that
the lapic timer interrupt fires after wbinvd, is handled in the hypervisor,
and vmentry then goes 'cleanly' back to the guest.

Unfortunately, this triggers another, real dead loop inside the hypervisor:
1. wbinvd at vmentry ... then a lapic timer interrupt arrives, since irqs
   are enabled;
2. the timer interrupt handler raises TIMER_SOFTIRQ;
3. the softirq pending bits are checked and do_softirq runs;
4. after the softirq is handled, vmentry is redone;
--> real dead loop

So our current approach is to still do wbinvd in the irq-disabled stage of
vmentry, but to adjust the timer by delaying it a little so that it does
not fire immediately on return to the guest. With this everything works
fine, except that in rare cases the timer is delayed a little, say 1ms.
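
A minimal Python sketch of the idea, under assumptions: the patch itself is
not shown here, so the helper names, the slack parameter, and the
microsecond values below are hypothetical and only illustrate "push the
deadline past the end of the flush".

```python
def fires_on_entry(deadline_us, now_us, flush_us):
    """True if the timer would expire during the flush and so fire on guest entry."""
    return deadline_us <= now_us + flush_us


def delay_deadline(deadline_us, now_us, flush_us, slack_us):
    """Return a deadline that is no earlier than the end of the flush plus slack.

    A deadline that is already late enough is returned unchanged, so the
    timer is only ever delayed in the case that would otherwise cause an
    immediate vmexit.
    """
    return max(deadline_us, now_us + flush_us + slack_us)


now, flush, slack = 0, 1200, 100      # hypothetical values, in microseconds

# A timer due at 1000us would expire inside a 1200us flush.
print(fires_on_entry(1000, now, flush))              # True
adjusted = delay_deadline(1000, now, flush, slack)
print(adjusted)                                      # 1300
print(fires_on_entry(adjusted, now, flush))          # False

# A timer due well after the flush is left alone.
print(delay_deadline(5000, now, flush, slack))       # 5000
```

In this model the cost of the fix is bounded by the flush time plus the
slack, which matches the "delayed a little" trade-off described above.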

Patch will be sent out later.

Thanks,
Jinsong
