
Re: [Xen-devel] PML (Page Modification Logging) design for Xen

>>> On 11.02.15 at 09:28, <kai.huang@xxxxxxxxxxxxxxx> wrote:
> - PML enable/disable for a particular domain
> PML needs to be enabled (allocate the PML buffer, initialize the PML index and
> PML base address, turn PML on in the VMCS, etc.) for all vcpus of the domain,
> as the PML buffer and PML index are per-vcpu, while the EPT table may be shared
> by vcpus. Enabling PML on only some vcpus of the domain won't work. Also, PML
> will only be enabled for the domain when it is switched to dirty logging mode,
> and it will be disabled when the domain is switched back to normal mode. As the
> number of vcpus apparently cannot change dynamically while the guest is running
> (correct me if I am wrong here), we don't have to consider enabling PML for a
> newly created vcpu while the guest is in dirty logging mode.
> After PML is enabled for the domain, we only need to clear the EPT entry's
> D-bit for guest memory in dirty logging mode. We achieve this by checking
> whether PML is enabled for the domain when p2m_ram_rw is changed to
> p2m_ram_logdirty, and updating the EPT entry accordingly. However, super pages
> are still write-protected even with PML, as we still need to split a super
> page into 4K pages in dirty logging mode.
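In EPT-entry terms, the transition described above might be sketched roughly as follows (an illustrative model using the SDM's EPT permission/dirty bit positions; `to_logdirty` and its single-entry layout are hypothetical simplifications, not Xen's actual ept_entry_t handling):

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified EPT entry bits (positions per the Intel SDM); this is an
 * illustrative model, not Xen's real ept_entry_t. */
#define EPT_READ    (1ULL << 0)
#define EPT_WRITE   (1ULL << 1)
#define EPT_EXEC    (1ULL << 2)
#define EPT_DIRTY   (1ULL << 9)

/* Transition an entry to log-dirty mode.  With PML enabled, a 4K entry
 * stays writable and only has its D bit cleared, so the first write is
 * logged by PML instead of faulting.  A superpage entry is still
 * write-protected, so the resulting EPT violation lets us split it into
 * 4K pages first (hypothetical helper mirroring the design text). */
static uint64_t to_logdirty(uint64_t e, bool pml_enabled, bool superpage)
{
    if (pml_enabled && !superpage)
        return e & ~EPT_DIRTY;      /* keep W; rely on PML logging */
    return e & ~EPT_WRITE;          /* write-protect; log via EPT fault */
}
```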

While it doesn't matter much for our immediate needs, the
documentation isn't really clear about the behavior when a 2M or
1G page gets its D bit set: Wouldn't it be rather useful to the
consumer to know of that fact (e.g. by setting some of the lower
bits of the PML entry to indicate so)?

> - PML buffer flush
> There are two places where we need to flush the PML buffer. The first is the
> PML-buffer-full VMEXIT handler (obviously), and the second is
> paging_log_dirty_op (either peek or clean): since vcpus run asynchronously
> while paging_log_dirty_op is called from userspace via hypercall, there may be
> dirty GPAs logged in vcpus' PML buffers that are not yet full. Therefore we
> should flush all vcpus' PML buffers before reporting dirty GPAs to userspace.
> We handle both cases by flushing the PML buffer at the beginning of every
> VMEXIT. This covers the first case directly, and it also covers the second:
> prior to paging_log_dirty_op, domain_pause is called, which kicks vcpus that
> are in guest mode out of guest mode by sending an IPI, which in turn causes a
> VMEXIT.
> This also keeps the log-dirty radix tree more up to date, as the PML buffer is
> flushed on every VMEXIT rather than only on the PML-buffer-full VMEXIT.
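For reference, the per-vcpu flush could look roughly like this (an illustrative sketch: the 512-entry buffer and the decrementing PML index follow the SDM, but `pml_flush`, the bitmap stand-in for the log-dirty radix tree, and all other names are hypothetical, not Xen's actual code):

```c
#include <stdint.h>

#define PML_ENTITY_NUM 512          /* 4K buffer of 8-byte GPA entries */
#define NR_PAGES 64                 /* toy guest size for the sketch */

static int dirty_bitmap[NR_PAGES];  /* stand-in for the log-dirty radix tree */

/* Flush the PML buffer of one vcpu: mark every logged GPA dirty and reset
 * the index.  Per the SDM, hardware writes entries from index 511 downward,
 * so the valid entries live at indices (pml_idx + 1) .. 511; an index of
 * 0xFFFF means the buffer overflowed, i.e. all 512 entries are valid. */
static void pml_flush(const uint64_t *pml_buf, uint16_t *pml_idx)
{
    unsigned int start = (*pml_idx == 0xFFFF) ? 0 : *pml_idx + 1;

    for (unsigned int i = start; i < PML_ENTITY_NUM; i++) {
        uint64_t gfn = pml_buf[i] >> 12;   /* entries are 4K-aligned GPAs */
        if (gfn < NR_PAGES)
            dirty_bitmap[gfn] = 1;
    }
    *pml_idx = PML_ENTITY_NUM - 1;         /* buffer is empty again */
}
```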

Is that really efficient? Flushing the buffer only as needed doesn't
seem to be a major problem (apart from the usual preemption issue
when dealing with guests with very many vCPU-s, but you certainly
recall that at this point HVM is still limited to 128).

Apart from these two remarks, the design looks okay to me.




