
RE: Installing WinPV drivers on Windows 2019



Since we have a reproducer here on an apparently up-to-date Server 2019 
installation, and the stack was:

ffff8a83`92044c78 fffff805`733d40e9 : 00000000`00000139 00000000`0000001e ffff8a83`92044fa0 ffff8a83`92044ef8 : nt!KeBugCheckEx
ffff8a83`92044c80 fffff805`733d4490 : 00000000`00000000 ffff8a83`92045220 00000000`0000000e ffffe588`a03b48a0 : nt!KiBugCheckDispatch+0x69
ffff8a83`92044dc0 fffff805`733d288e : 00000000`00001080 fffff805`732b0d31 00000020`00000000 00000000`00000000 : nt!KiFastFailDispatch+0xd0
ffff8a83`92044fa0 fffff805`734490c0 : 00000000`00000003 00000000`00000000 00000001`00000000 00000000`00000000 : nt!KiRaiseSecurityCheckFailure+0x30e
ffff8a83`92045130 fffff805`7336730c : ffff8a83`00000000 00000000`00000002 00000000`00000000 fffff805`7333a570 : nt!KiAcquireThreadStateLock+0x11fa90
ffff8a83`920451a0 fffff805`7344eef6 : ffffa700`4ff59180 ffffa700`00000000 ffff8a83`92045240 00000000`00000000 : nt!KeSetIdealProcessorThreadEx+0xd0
ffff8a83`92045220 fffff805`73339e84 : 00000000`00000200 00000000`00000000 ffff8a83`920453c9 00000000`ffffffff : nt!MiZeroInParallelWorker+0x115016
ffff8a83`92045350 fffff805`73339386 : ffff8280`00000000 00000000`00000200 00000000`00000001 fffff805`00000003 : nt!MiZeroInParallel+0x11c
ffff8a83`92045430 fffff805`73338e3a : fffff805`00000000 ffffe588`00000000 00000000`00000000 00000000`00000000 : nt!MiInitializeMdlBatchPages+0x2ae
ffff8a83`92045500 fffff805`73338c69 : 00000000`00000000 ffffe588`ab8a9010 00000000`00436d66 ffffe588`ab976dff : nt!MiAllocatePagesForMdl+0x192
ffff8a83`920456b0 fffff805`73338b8d : 00000000`00000000 ffffe588`a002cab0 ffffe588`b44ebfd0 ffffe588`ba4d8140 : nt!MmAllocatePartitionNodePagesForMdlEx+0xc9
ffff8a83`92045720 fffff805`78fdaf66 : 00000000`00000000 00000000`00000000 ffffe588`a002cab0 ffffe588`b44ebfd0 : nt!MmAllocatePagesForMdlEx+0x4d
ffff8a83`92045770 00000000`00000000 : 00000000`00000000 ffffe588`a002cab0 ffffe588`b44ebfd0 00000000`00000001 : xenvbd+0xaf66

(i.e. clearly zeroing pages for an allocation that had
MM_DONT_ZERO_ALLOCATION set) then I think we can conclude that Microsoft
has undone the fix (or that it only ever applied selectively).

Hence, for AllocatePage() in all drivers I think we should apply the
workaround patch; I don't think it will make much of a difference to
performance. For the balloon, though, I think we need to stick to a
non-zeroed allocation (to avoid needlessly faulting in PoD frames) but
have a registry override in case it also hits this buggy case.
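
A minimal sketch of the policy proposed above, written in portable C so it
can be read without the WDK. Every name in it (alloc_strategy, alloc_caller,
choose_alloc_strategy, balloon_override) is hypothetical and does not come
from the drivers. How the registry value would be read, and the idea that
the override moves the balloon onto the same workaround path, are
assumptions rather than anything decided in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; none of these exist in the WinPV drivers. */

typedef enum {
    /* MmAllocateContiguousMemorySpecifyCache, then build an MDL around it. */
    ALLOC_CONTIGUOUS_WORKAROUND,
    /* MmAllocatePagesForMdlEx with MM_DONT_ZERO_ALLOCATION. */
    ALLOC_MDL_DONT_ZERO
} alloc_strategy;

typedef enum {
    CALLER_ALLOCATE_PAGE,   /* single-page AllocatePage() helper in any driver */
    CALLER_BALLOON          /* balloon inflation path */
} alloc_caller;

/*
 * Pick an allocation strategy.
 *
 * balloon_override stands in for a value that would be read from the
 * registry (assumed); it is ignored for CALLER_ALLOCATE_PAGE.
 */
alloc_strategy
choose_alloc_strategy(alloc_caller caller, bool balloon_override)
{
    assert(caller == CALLER_ALLOCATE_PAGE || caller == CALLER_BALLOON);

    /* AllocatePage() always takes the workaround path. */
    if (caller == CALLER_ALLOCATE_PAGE)
        return ALLOC_CONTIGUOUS_WORKAROUND;

    /*
     * The balloon stays on the non-zeroed MDL allocation, to avoid
     * faulting in PoD frames, unless the override is set.
     */
    return balloon_override ? ALLOC_CONTIGUOUS_WORKAROUND
                            : ALLOC_MDL_DONT_ZERO;
}
```

With the override clear, choose_alloc_strategy(CALLER_BALLOON, false) keeps
the balloon on the non-zeroed path; setting it moves only the balloon, since
AllocatePage() already uses the workaround.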

  Paul

> -----Original Message-----
> From: Ben Chalmers <ben.chalmers@xxxxxxxxxx>
> Sent: 10 June 2020 08:31
> To: Owen Smith <owen.smith@xxxxxxxxxx>; jan.bakuwel@xxxxxxxxx; paul@xxxxxxx;
> win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: Installing WinPV drivers on Windows 2019
> 
> The fix Microsoft provided, if I remember correctly, was to not zero memory
> when MM_DONT_ZERO_ALLOCATION was set (that is, I don't think they fixed the
> underlying problem, they fixed the easily reproducible one).
> 
> If the use of MmAllocatePagesForMdlEx has been expanded to situations where
> we want zeroed memory, the same problem could, plausibly, occur.
> 
> (Unfortunately, I forget precisely what the actual problem was... something
> which required a lot of diving through stack traces and reading assembler
> to figure out, if I recall correctly.)
> 
> Ben Chalmers
> T +44 1223 225964 | M +44 7855 464069
> ben.chalmers@xxxxxxxxxx
> 
> ________________________________________
> From: win-pv-devel <win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx> on behalf of
> Owen Smith <owen.smith@xxxxxxxxxx>
> Sent: 09 June 2020 11:28
> To: jan.bakuwel@xxxxxxxxx; paul@xxxxxxx; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: RE: Installing WinPV drivers on Windows 2019
> 
> Interestingly, the Citrix drivers are using MmAllocatePagesForMdlEx and are
> not hitting this 0x139 BSOD.
> 
> The patch Paul mentioned was reverted after this update was pushed out
> https://support.microsoft.com/en-us/help/4458469/windows-10-update-kb4458469
> 
> Owen
> 
> > -----Original Message-----
> > From: win-pv-devel [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On
> > Behalf Of Jan Bakuwel
> > Sent: 09 June 2020 10:34
> > To: paul@xxxxxxx; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> > Subject: Re: Installing WinPV drivers on Windows 2019
> >
> > Hi Paul,
> >
> > On 9/06/20 7:27 pm, Paul Durrant wrote:
> > >> -----Original Message-----
> > >> From: Jan Bakuwel <jan.bakuwel@xxxxxxxxx>
> > >> Sent: 09 June 2020 03:33
> > >> To: paul@xxxxxxx; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> > >> Subject: Re: Installing WinPV drivers on Windows 2019
> > >>
> > >> Hi Paul,
> > >>> Hi Jan,
> > >>>
> > >>>     Oddly the summary analysis below fingers the network driver but
> > >>> when I pull the dump into windbg, it clearly points at a stack overflow
> > >>> in a thread starting in xenvbd...
> > >>
> > >> I'm not familiar with Windows driver debugging but trust you when you
> > >> say it is :-)
> > >>
> > >> Any suggestions where we can go from here?
> > >>
> > > Hi Jan,
> > >
> > > I think you are hitting a bug in Windows itself that was worked around by
> > > this commit in XENVIF:
> > >
> > > https://xenbits.xen.org/gitweb/?p=pvdrivers/win/xenvif.git;a=commit;h=4f85d00477b0931d4367a34de4f84d54b0f04d4e
> > >
> > > ---
> > > Replace uses of MmAllocatePagesForMdlEx in __AllocatePage
> > >
> > > Windows appears to have an edge case bug in which zeroing memory using
> > > MmAllocatePagesForMdlEx (which in Win 10 1803 happens even if you
> > > specify MM_DONT_ZERO_ALLOCATION) can cause a BSOD 139 1e.
> > >
> > > This commit uses MmAllocateContiguousMemorySpecifyCache
> > > to allocate memory instead, then builds an Mdl to wrap it up.
> > >
> > > __AllocatePages is left unchanged (as we don't want to allocate
> > > multiple contiguous pages).  This issue has not been seen outside of
> > > xenvif calls to __AllocatePage and we expect a fix to the underlying
> > > Windows problem in the near future
> > >
> > > Signed-off-by: Ben.Chalmers <ben.chalmers@xxxxxxxxxx>
> > > ---
> > >
> > > Firstly, is your installation fully updated?
> >
> > That depends on Microsoft :-) ... the server is set to auto update.
> >
> > I just hit "Check for Updates" and the server pulled in three updates:
> > one for Windows Defender, one KB4551853 and one for Adobe Flash, none of
> > them related.
> >
> > > If so then it appears Microsoft have *still* not fixed this, despite it
> > > being identified in 2018, in which case I will clone the above commit
> > > into XENVBD and also amend any other drivers using
> > > MmAllocatePagesForMdlEx() with that option.
> >
> > I think it's safe to conclude that Microsoft indeed has not fixed this.
> > Not really surprising, eh?
> >
> > Many thanks for your help, looking forward to the new drivers.
> >
> > kind regards,
> > Jan
> >
> >





 

