[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Accessing Dom0 Physical memory from xen, via direct mappings (PML4:262-271)

  • To: xen-devel@xxxxxxxxxxxxx
  • From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
  • Date: Tue, 13 Mar 2012 11:13:19 -0700
  • Cc: Shriram Rajagopalan <rshriram@xxxxxxxxx>, Tim Deegan <tim@xxxxxxx>
  • Delivery-date: Tue, 13 Mar 2012 18:13:49 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

> Date: Tue, 13 Mar 2012 09:32:28 -0700
> From: Shriram Rajagopalan <rshriram@xxxxxxxxx>
> To: Tim Deegan <tim@xxxxxxx>
> Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
> Subject: Re: [Xen-devel] Accessing Dom0 Physical memory from xen, via
>       direct mappings (PML4:262-271)
> On Tue, Mar 13, 2012 at 9:08 AM, Tim Deegan <tim@xxxxxxx> wrote:
>> At 08:45 -0700 on 13 Mar (1331628358), Shriram Rajagopalan wrote:
>> > In config.h (include/asm-x86/config.h) I found this:
>> >
>> > #if __x86_64__
>> > ...
>> >  *  0xffff830000000000 - 0xffff87ffffffffff [5TB, 5*2^40 bytes, PML4:262-271]
>> >  *    1:1 direct mapping of all physical memory.
>> > ...
>> >
>> > I was wondering if it's possible for dom0 to malloc a huge chunk of memory
>> > and let xen know the starting address of this range.
>> > Inside xen, I can translate dom0 virt address to a virt address in the above
>> > range and access the entire chunk via these virtual addresses.
>> Eh, maybe?  But it's harder than you'd think.  Memory malloc()ed in dom0
>> may not be contiguous in PFN-space, and dom0 PFNs may not be contiguous
>> in MFN-space, so you can't just translate the base of the buffer and use
>> it with offsets; you have to translate again for every 4k page.  Also, you
>> need to make sure dom0 doesn't page out or relocate your user-space
>> buffer while Xen is accessing the MFNs.  mlock() might do what you need,
>> but AIUI there's no guarantee that it won't, e.g., move the buffer
>> around to defragment memory.
> Yep. I am aware of the above issues. As far as contiguity is concerned,
> I was hoping (*naively/lazily*) that if I allocate a huge chunk (1G or so)
> using posix_memalign, it would start at a page boundary and also be
> contiguous *most* of the time.  I need this setup only for some temporary
> analysis and not for a production quality system.
> And the machine has more than enough ram, with swap usage being 0 all the
> time.
>> If you do handle all that, the correct way to get at the mappings in
>> this range is with map_domain_page().  But remember, this only works
>> like this on 64-bit Xen.  On 32-bit, only a certain amount of memory can
>> be mapped at one time, so if the buffer is really big, you'll need to map
>> and unmap parts of it on demand.
> 64-bit. The comments I pointed out were under the #if __x86_64__ region.
>> But maybe back up a bit: why do you want to do this?  What's the buffer
>> for?  Is it something you could do more easily by having Xen allocate
>> the buffer and let dom0 map it?
> Well, the buffer acts as a huge log-dirty "byte" map (a byte per word).
> I am skipping the reason for doing this huge byte map, for the sake of
> brevity.
> Can I have xen allocate this huge buffer? (a byte per 8-byte word means
> about 128M for a 1G guest). And if I were to have this byte-map per-vcpu,
> it would mean 512M worth of RAM for a 4-vcpu guest.
> Is there a way I could increase the xen heap size to be able to allocate
> this much memory?
> And how do I map the xen memory in dom0? I vaguely remember seeing similar
> code in xentrace, but if you could point me in the right direction, it
> would be great.

Have you looked into XENMEM_exchange? There might be some size constraints
on this hypercall, but assuming you meet them, you could:
1. have your user-space dom0 tool mmap an fd from your kernel driver
2. have your kernel driver get_free_page() as many pages as necessary
3. call XENMEM_exchange from the driver; the hypervisor will take the mfn's
and hand back a contiguous mfn range
4. hook up the resulting mfn's into the user-space mmap

Now you have un-swappable, page-aligned, machine-contiguous memory, mapped
into your dom0 tool. Any other actor can easily identify this region by
its base mfn and page count. I don't think a guest vcpu could map it,
though, since these are dom0 pages, but the hypervisor certainly can map
it and store information as appropriate.

I bring this up because it would be far easier than having Xen remember
all the mfn's in your array, which I suspect you might need for your
byte-dirty use case.

Hope that helps
>> > The catch here is that I want this virtual address range to be
>> > accessible across all vcpu contexts in xen (whether it's servicing a
>> > hypercall from dom0 or a vmx fault caused by the Guest).
>> >
>> > So far, I have only been able to achieve the former. In the latter
>> > case, where the "current" vcpu belongs to a guest (eg in a vmx fault
>> > handler), I can't access this address range inside xen. Do I have to
>> > add EPT mappings to the guest's p2m to do this? Or can I do something
>> > else?
>> If you really have got a pointer into the 1-1 mapping it should work
>> from any vcpu.
>> But again, that's not going to work on 32-bit Xen.
>> There, you have to use map_domain_page_global() to get a mapping that
>> persists across all vcpus, and that's even more limited in how much it
>> can map at once.
>> Cheers,
>> Tim.
> cheers
> shriram

Xen-devel mailing list