Re: [Xen-devel] PVH dom0 creation fails - the system freezes
On Wed, Jul 25, 2018 at 05:19:03PM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf
> > Of Roger Pau Monné
> > Sent: 25 July 2018 15:12
> > To: bercarug@xxxxxxxxxx
> > Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>; David Woodhouse
> > <dwmw2@xxxxxxxxxxxxx>; Jan Beulich <JBeulich@xxxxxxxx>;
> > abelgun@xxxxxxxxxx
> > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> >
> > On Wed, Jul 25, 2018 at 04:57:23PM +0300, bercarug@xxxxxxxxxx wrote:
> > > On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> > > > On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@xxxxxxxxxx wrote:
> > > > > On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > > >>> On 23.07.18 at 13:50, <bercarug@xxxxxxxxxx> wrote:
> > > > > > > For the last few days, I have been trying to get a PVH dom0 running,
> > > > > > > however I encountered the following problem: the system seems to
> > > > > > > freeze after the hypervisor boots, the screen goes black. I have tried to
> > > > > > > debug it via a serial console (using Minicom) and managed to get some
> > > > > > > more Xen output, after the screen turns black.
> > > > > > >
> > > > > > > I mention that I have tried to boot the PVH dom0 using different kernel
> > > > > > > images (from 4.9.0 to 4.18-rc3), different Xen versions (4.10, 4.11, 4.12).
> > > > > > >
> > > > > > > Below I attached my system / hypervisor configuration, as well as the
> > > > > > > output captured through the serial console, corresponding to the latest
> > > > > > > versions for Xen and the Linux Kernel (Xen staging and Kernel from the
> > > > > > > xen/tip tree).
> > > > > > > [...]
> > > > > > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > > > > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
> > > > > > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> > > > Can you figure out which PCI device is 00:14.0?
> > > This is the output of lspci -vvv for device 00:14.0:
> > >
> > > 00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31) (prog-if 30 [XHCI])
> > >     Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
> > >     Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
> > >     Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > >     Latency: 0
> > >     Interrupt: pin A routed to IRQ 178
> > >     Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
> > >     Capabilities: [70] Power Management version 2
> > >         Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
> > >         Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > >     Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > >         Address: 00000000fee0e000 Data: 4021
> > >     Kernel driver in use: xhci_hcd
> > >     Kernel modules: xhci_pci
> >
> > I'm afraid your USB controller is missing RMRR entries in the DMAR
> > ACPI tables, thus causing the IOMMU faults and not working properly.
> >
> > You could try to manually add some extra rmrr regions by appending:
> >
> > rmrr=0x8deb3=0:0:14.0
> >
> > To the Xen command line, and keep adding any address that pops up in
> > the iommu faults. This is of course quite cumbersome, but there's no
> > way to get the required memory addresses if the data in RMRR is
> > wrong/incomplete.
> >
>
> You could just add all E820 reserved regions in there. That will almost
> certainly cover it.
I have a prototype patch for this that attempts to identity map all
reserved regions below 4GB to the p2m. It's still a WIP, but if you
could give it a try that would help me figure out whether this fixes
your issues and is indeed something that would be good to have.

I don't really like the patch as-is because it doesn't check whether
the reserved regions added to the p2m overlap with, for example, the
LAPIC page or the PCIe MCFG regions. I will continue to work on a
safer version.

If you can give this a shot, please remove any rmrr options from the
command line and use iommu=debug in order to catch any issues.

Thanks, Roger.
---8<---
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 2c44fabf99..76a1fd6681 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -21,6 +21,8 @@
 #include <xen/keyhandler.h>
 #include <xsm/xsm.h>
 
+#include <asm/setup.h>
+
 static int parse_iommu_param(const char *s);
 static void iommu_dump_p2m_table(unsigned char key);
 
@@ -47,6 +49,8 @@ integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
  *   no-igfx                    Disable VT-d for IGD devices (insecure)
  *   no-amd-iommu-perdev-intremap Don't use per-device interrupt remapping
  *                              tables (insecure)
+ *   inclusive                  Include any memory ranges below 4GB not used
+ *                              by Xen or unusable to the iommu page tables.
  */
 custom_param("iommu", parse_iommu_param);
 bool_t __initdata iommu_enable = 1;
@@ -60,6 +64,7 @@ bool_t __read_mostly iommu_passthrough;
 bool_t __read_mostly iommu_snoop = 1;
 bool_t __read_mostly iommu_qinval = 1;
 bool_t __read_mostly iommu_intremap = 1;
+bool __read_mostly iommu_inclusive = true;
 
 /*
  * In the current implementation of VT-d posted interrupts, in some extreme
@@ -126,6 +131,8 @@ static int __init parse_iommu_param(const char *s)
             iommu_dom0_strict = val;
         else if ( !strncmp(s, "sharept", ss - s) )
             iommu_hap_pt_share = val;
+        else if ( !strncmp(s, "inclusive", ss - s) )
+            iommu_inclusive = val;
         else
             rc = -EINVAL;
 
@@ -165,6 +172,85 @@ static void __hwdom_init check_hwdom_reqs(struct domain *d)
         iommu_dom0_strict = 1;
 }
 
+static void __hwdom_init setup_inclusive_mappings(struct domain *d)
+{
+    unsigned long i, j, tmp, top, max_pfn;
+
+    BUG_ON(!is_hardware_domain(d));
+
+    max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
+    top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
+
+    for ( i = 0; i < top; i++ )
+    {
+        unsigned long pfn = pdx_to_pfn(i);
+        bool map;
+        int rc = 0;
+
+        /*
+         * Set up 1:1 mapping for dom0. Default to include only
+         * conventional RAM areas and let RMRRs include needed reserved
+         * regions. When set, the inclusive mapping additionally maps in
+         * every pfn up to 4GB except those that fall in unusable ranges.
+         */
+        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
+            continue;
+
+        if ( is_pv_domain(d) && iommu_inclusive && pfn <= max_pfn )
+            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
+        else if ( is_hvm_domain(d) && iommu_inclusive )
+            map = page_is_ram_type(pfn, RAM_TYPE_RESERVED);
+        else
+            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
+
+        if ( !map )
+            continue;
+
+        /* Exclude Xen bits */
+        if ( xen_in_range(pfn) )
+            continue;
+
+        /*
+         * If dom0-strict mode is enabled or guest type is HVM/PVH then exclude
+         * conventional RAM and let the common code map dom0's pages.
+         */
+        if ( (iommu_dom0_strict || is_hvm_domain(d)) &&
+             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
+            continue;
+
+        /* For HVM avoid memory below 1MB because that's already mapped. */
+        if ( is_hvm_domain(d) && pfn < PFN_DOWN(MB(1)) )
+            continue;
+
+        tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
+        for ( j = 0; j < tmp; j++ )
+        {
+            int ret;
+
+            if ( iommu_use_hap_pt(d) )
+            {
+                ASSERT(is_hvm_domain(d));
+                ret = set_identity_p2m_entry(d, pfn * tmp + j, p2m_access_rw,
+                                             0);
+            }
+            else
+                ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
+                                     IOMMUF_readable|IOMMUF_writable);
+
+            if ( !rc )
+                rc = ret;
+        }
+
+        if ( rc )
+            printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
+                   d->domain_id, rc);
+
+        if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
+            process_pending_softirqs();
+    }
+
+}
+
 void __hwdom_init iommu_hwdom_init(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -207,7 +293,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                    d->domain_id, rc);
     }
 
-    return hd->platform_ops->hwdom_init(d);
+    hd->platform_ops->hwdom_init(d);
+
+    if ( !iommu_passthrough )
+        setup_inclusive_mappings(d);
 }
 
 void iommu_teardown(struct domain *d)
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index fb7edfaef9..91cadc602e 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -99,6 +99,4 @@ void pci_vtd_quirk(const struct pci_dev *);
 bool_t platform_supports_intremap(void);
 bool_t platform_supports_x2apic(void);
 
-void vtd_set_hwdom_mapping(struct domain *d);
-
 #endif // _VTD_EXTERN_H_
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1710256823..569ec4aec2 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1304,12 +1304,6 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
 {
     struct acpi_drhd_unit *drhd;
 
-    if ( !iommu_passthrough && is_pv_domain(d) )
-    {
-        /* Set up 1:1 page table for hardware domain. */
-        vtd_set_hwdom_mapping(d);
-    }
-
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);
 
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index cc2bfea162..9971915349 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -32,11 +32,9 @@
 #include "../extern.h"
 
 /*
- * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
- * 1:1 iommu mappings except xen and unusable regions.
+ * iommu_inclusive_mapping: superseded by iommu=inclusive.
  */
-static bool_t __hwdom_initdata iommu_inclusive_mapping = 1;
-boolean_param("iommu_inclusive_mapping", iommu_inclusive_mapping);
+boolean_param("iommu_inclusive_mapping", iommu_inclusive);
 
 void *map_vtd_domain_page(u64 maddr)
 {
@@ -107,67 +105,3 @@ void hvm_dpci_isairq_eoi(struct domain *d, unsigned int isairq)
     }
     spin_unlock(&d->event_lock);
 }
-
-void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
-{
-    unsigned long i, j, tmp, top, max_pfn;
-
-    BUG_ON(!is_hardware_domain(d));
-
-    max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
-    top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
-
-    for ( i = 0; i < top; i++ )
-    {
-        unsigned long pfn = pdx_to_pfn(i);
-        bool map;
-        int rc = 0;
-
-        /*
-         * Set up 1:1 mapping for dom0. Default to include only
-         * conventional RAM areas and let RMRRs include needed reserved
-         * regions. When set, the inclusive mapping additionally maps in
-         * every pfn up to 4GB except those that fall in unusable ranges.
-         */
-        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
-            continue;
-
-        if ( iommu_inclusive_mapping && pfn <= max_pfn )
-            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
-        else
-            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
-
-        if ( !map )
-            continue;
-
-        /* Exclude Xen bits */
-        if ( xen_in_range(pfn) )
-            continue;
-
-        /*
-         * If dom0-strict mode is enabled then exclude conventional RAM
-         * and let the common code map dom0's pages.
-         */
-        if ( iommu_dom0_strict &&
-             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
-            continue;
-
-        tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
-        for ( j = 0; j < tmp; j++ )
-        {
-            int ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
-                                     IOMMUF_readable|IOMMUF_writable);
-
-            if ( !rc )
-                rc = ret;
-        }
-
-        if ( rc )
-            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
-                   d->domain_id, rc);
-
-        if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
-            process_pending_softirqs();
-    }
-}
-
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b876..15d6584837 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -35,6 +35,7 @@ extern bool_t iommu_snoop, iommu_qinval, iommu_intremap, iommu_intpost;
 extern bool_t iommu_hap_pt_share;
 extern bool_t iommu_debug;
 extern bool_t amd_iommu_perdev_intremap;
+extern bool iommu_inclusive;
 
 extern unsigned int iommu_dev_iotlb_timeout;
 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel