[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [BUG] IOTLB is not flushed in IOMMU Unmap Process



Bug Descriptions
We find that Xen does NOT flush I/O TLB when iommu_unmap_page is invoked in PV mode. Attackers may be able to use this bug to compromise Xen.
(When a PV guest OS adds a new page into its page table, the hypervisor should set the new page-table page read-only if the page is writable before, and consequently flush the IOTLB to avoid DMA attacks. )
 
The following code is from __get_page_type:
if ( unlikely((x & PGT_type_mask) != type) )
    {
        /* Special pages should not be accessible from devices. */
        struct domain *d = page_get_owner(page);
        if ( d && !is_hvm_domain(d) && unlikely(need_iommu(d)) )
        {
            if ( (x & PGT_type_mask) == PGT_writable_page )
                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
            else if ( type == PGT_writable_page )
                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
                               page_to_mfn(page),
                               IOMMUF_readable|IOMMUF_writable);
        }
}
Details
When Xen deals with a L2 page table entry update, the possible function flow is as follows:
do_mmu_update()
        ->mod_l2_entry()
                ->get_page_from_l2e()
                        ->get_page_and_type_from_pagenr
                                ->get_page_type()
                                        ->__get_page_type
                                                ->iommu_unmap_page
                                                        ->intel_iommu_unmap_page
                                                                ->dma_pte_clear_one()
                                                                        ->__intel_iommu_iotlb_flush
                                                                                ->iommu_flush_iotlb_psi
                                                                                        ->flush_iotlb_qi
We have noticed that the function: flush_iotlb_qi is supposed to complete the task of invalidating I/O TLB. However, it ALWAYS returns before really flushing the IOTLB. 
In flush_iotlb_qi
if ( flush_non_present_entry )
        {
        if ( !cap_caching_mode(iommu->cap) )
            return 1;  ß return from this point.
        else
            did = 0;
}
Similarly, the function: iommu_flush_write_buffer coming next to iommu_flush_iotlb_psi directly returns also without write buffer flushing because the if statement: !rwbf_quirk && !cap_rwbf(iommu->cap) are also true.
Note that some internal information can be found at the end of this email, in the log file.
 
Fix Suggestions
We observe that the argument: dma_old_pte_present in the function: __intel_iommu_iotlb_flush is always assigned as 0 in the above function flow, the value of flush_non_present_entry in flush_iotlb_qi is always 1, the negation of dma_old_pte_present (!dma_old_pte_present).
 
We suggest to set dma_old_pte_present as 1 or collect the value from dma_pte_present(pte).
Configurations
Our experimental platform is a HP Compaq 8100 Elite CMT PC with an Intel Core i7-870 running at 2.93 GHz with vt-d feature enabled and we use Xen version 4.2.1 as the hypervisor while domain0 uses the kernel version 3.2.0-51-generic-pae.
To obtain the same control flow, we need to do some configurations.
We have enabled the Intel VT-d technique in the BIOS.
The grub.conf of the para-guest domain 0:
menuentry 'Ubuntu GNU/LinuxXen 4.2.1' --class ubuntu --class gnu-linux --class gnu --class os --class xen {
        insmod part_msdos
        insmod ext2
        set root='(hd1,msdos2)'
  search --no-floppy --fs-uuid --set=root b23f1bc8-31e1-4590-a063-2109e68309fd
  multiboot /boot/xen-4.2.1.gz iommu=verbose iommu=dom0-strict placeholder sched=credit loglvl=all guest_loglvl=all debug_stack_lines=80 console=vga,com1 com1=115200,8n1 
        module /boot/vmlinuz-3.2.0-51-generic-pae placeholder root=UUID=b23f1bc8-31e1-4590-a063-2109e68309fd ro  quiet splash console=tty0 max_loop=128 xencons=ttyS0,115200
module  /boot/initrd.img-3.2.0-51-generic-pae
}
Also, we have printed corresponding hardware registers as well as the global initialized variables along the function flow in the log file, which strongly determines the control flow.
Here we list the corresponding outputs from the log file:
(XEN) Xen version 4.2.1 (root@) (gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3) Fri Aug 23 15:02:53 CST 2013
(XEN) Latest ChangeSet: unavailable
(XEN) Bootloader: GRUB 1.99-21ubuntu3
(XEN) Command line: iommu=verbose iommu=dom0-strict placeholder schde=credit loglvl=all guest_loglvl=all debug_stack_lines=80 console=vga,com1 com1=115200,8n1
(XEN) Intel machine check reporting enabled
(XEN) Intel VT-d supported page sizes: 4kB.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping not enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Strict
(XEN) is_hvm_domain(d):false,    need_iommu(d):true
(XEN) iommu_enabled:true,        hd->platform_ops: non-null
(XEN) iommu_passthrough:false,   d->domain_id=0
(XEN) dma_pte_present(*pte):present
(XEN) this_cpu(iommu_dont_flush_iotlb):false
(XEN) cap_pgsel_inv(iommu->cap):true,   order>cap_max_amask_val:false
(XEN) flush non present entry:true,     cap_caching_mode(iommu->cap):false
(XEN) rwbf_quirk:false, cap_rwbf(iommu->cap):false
 
If you need more information, please let us know.
Hopefully our information can contribute to the security of Xen.
 
 
Best wishes
Yueqiang & Zhi & Junqing
 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.