
Re: [Xen-devel] Fix VGA logdirty related display freezes with altp2m



On Thu, Oct 25, 2018 at 9:02 AM Razvan Cojocaru
<rcojocaru@xxxxxxxxxxxxxxx> wrote:
>
> On 10/25/18 5:55 PM, Tamas K Lengyel wrote:
> > On Thu, Oct 25, 2018 at 8:24 AM Razvan Cojocaru
> > <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> >>
> >> On 10/24/18 8:52 PM, Tamas K Lengyel wrote:
> >>> On Wed, Oct 24, 2018 at 11:31 AM Tamas K Lengyel
> >>> <tamas.k.lengyel@xxxxxxxxx> wrote:
> >>>>
> >>>> On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru
> >>>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>> On 10/24/18 8:09 PM, Tamas K Lengyel wrote:
> >>>>>> On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru
> >>>>>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>> Tamas, could you please give this a spin?
> >>>>>>>
> >>>>>>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2
> >>>>>>>
> >>>>>>> It _should_ solve the crashes.
> >>>>>>
> >>>>>> Indeed, I no longer see the crash. However, there might be some
> >>>>>> locking issue present because the whole system freezes up shortly
> >>>>>> after starting DRAKVUF on a domain - within a couple seconds. I mean
> >>>>>> Xen itself locks up: no response on the serial, dom0 screen frozen,
> >>>>>> etc.
> >>>>>
> >>>>> Do you have any type of log / backtrace / way I could reproduce it
> >>>>> without Drakvuf? All the ways I've tested it were fine (including
> >>>>> xen-access).
> >>>>
> >>>> I don't have a standalone test that produces that error. With DRAKVUF
> >>>> it is easily reproducible though. If you have a Windows guest
> >>>> installed, setting up DRAKVUF should really not be much trouble. With
> >>>> xen-access it indeed doesn't lock up but since the guest is pretty
> >>>> much unresponsive during that test I can't verify whether the VGA
> >>>> issue is now resolved or not. Also the xen-access tests are fairly
> >>>> limited and don't use all aspects of altp2m.
> >>>>
> >>>
> >>> What I see from the DRAKVUF log is that the last thing it prints is
> >>> sending a vm_event response that both enables singlestepping and
> >>> switches altp2m view. This looks to be consistent. It didn't matter if
> >>> the guest had 1 or 2 vCPUs, the freeze occurs just the same. It's
> >>> definitely racy because it doesn't happen right away; the system
> >>> works as expected for a couple seconds.
> >>
> >> After having to install clang because my GCC couldn't build Drakvuf:
> >>
> >> ../../src/plugins/plugins.h:188:1: sorry, unimplemented: non-trivial
> >> designated initializers not supported
> >
> > Please follow the instructions for compiling it; clang is a
> > requirement. I don't even know how you got past the ./configure stage
> > without clang being installed. You could also just copy-paste things
> > from the travis script directly:
> > https://github.com/tklengyel/drakvuf/blob/master/.travis.yml#L51
> >
> >>
> >> then rekall via pip, then having to mount my Windows disk to do "rekal
> >> peinfo", I finally gave up when "rekall fetch_pdb" couldn't find the
> >> debug files on the Microsoft server. :)
> >
> > If your version of Windows is that brand new then yes, Microsoft takes
> > a couple days to publish their debug information and you will just
> > have to wait or use an older version of Windows.
> >
> >>
> >> So if you could find a way to reproduce the issue with a simple
> >> libxc-based application alone (or at least with something
> >> libvmi-related, which I do have set up), I'd really appreciate it.
> >>
> >> Or maybe try to hack around with patch no 3 of the series (for a start,
> >> just revert it and see if the problem persists - of course the display
> >> will freeze) and see if there's an easy fix?
> >
> > Unfortunately I won't have time to do either of these any time soon.
> > If you are having that much trouble setting it up I can perhaps send
> > you a pre-compiled version with a version of Windows for which
> > Microsoft already published the debug info.
>
> It's a Windows 7 x64 guest. But the problem was that the right command
> line is:
>
> rekall fetch_pdb ntkrnlmp
>
> instead of the suggested "rekall fetch_pdb ntkrpamp" on the drakvuf.com
> website.

The kernel filename is specific to the version of Windows you have
installed. The instructions specify _an example_ for the 32-bit
version of Windows 7 and you will need to adjust it according to the
kernel filename. For 64-bit it is ntkrnlmp. The instructions explicitly
say that you need to use the PDB filename that was printed for your
specific kernel version.

>
> I'll try to continue - in any case should I have more trouble I'll
> contact you privately so as not to spam the list. Just wanted to leave
> this here in case someone else has this problem in the hope that it's
> useful.

Of course, also please feel free to open an issue on GitHub if you run
into something that's blocking you. Chances are if you run into it,
others would too :)

Thanks,
Tamas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

