On 07/28/2010 03:36 PM, Konrad Rzeszutek Wilk wrote:
The really clean solution would be to virtualize the ACPI table for dom0 and remove the DMAR entry in this version. This would require some major work, I guess (clone at least the BIOS page containing the ACPI anchor and present a modified version to dom0).

Well, that is what it does right now. It zeros it out so that the DMAR entry is gone from the ACPI tables.
No. It changes the ORIGINAL ACPI table, not a copy of it.
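To make the distinction concrete, here is a minimal sketch of the "modify a copy" variant. It is only a toy model, not what Xen does today: `struct toy_root_table` and `copy_without()` are hypothetical names, and the table is reduced to a list of 4-byte signatures. A real RSDT/XSDT holds physical addresses of tables and a checksum that would have to be recomputed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an ACPI root table: only a list of 4-byte
 * signatures (hypothetical layout, for illustration only). */
struct toy_root_table {
    size_t count;
    char sig[8][4];
};

/* Return a heap-allocated copy of 'orig' that omits every entry whose
 * signature equals 'hide'. 'orig' is const and is never written, so
 * anything that reads the original later still sees the hidden entry. */
static struct toy_root_table *copy_without(const struct toy_root_table *orig,
                                           const char *hide)
{
    struct toy_root_table *copy;
    size_t i;

    assert(orig->count <= 8);

    copy = calloc(1, sizeof(*copy));
    if (!copy)
        return NULL;

    for (i = 0; i < orig->count; i++) {
        if (memcmp(orig->sig[i], hide, 4) == 0)
            continue;               /* drop the entry from the copy only */
        memcpy(copy->sig[copy->count], orig->sig[i], 4);
        copy->count++;
    }
    return copy;
}
```

With something along these lines, dom0 would be handed the copy while the original table keeps its DMAR entry for whatever boots afterwards, such as the crash kernel.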
I am not really sure that having a DMAR accessible to Dom0 is good. You would have two entities trying to write to the DMARs to control the IOMMU and the PCI devices. Does Xen enable the IOMMU? Do you see that in the serial log?
I don't want to let dom0 access DMAR. I want the crash kernel to be able to access it. And I think Xen does enable the IOMMU:

(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging detected.
(XEN) Intel machine check reporting enabled
(XEN) Intel VT-d Snoop Control supported.
(XEN) Intel VT-d DMA Passthrough not supported.
(XEN) Intel VT-d Queued Invalidation supported.
(XEN) Intel VT-d Interrupt Remapping supported.
(XEN) I/O virtualisation enabled
(XEN) I/O virtualisation for PV guests disabled
(XEN) x2APIC mode enabled.
The crash kernel expects a valid DMAR entry, as the following code in enable_IR_x2apic() suggests:

I don't know what that function does, nor how the error path below depends on DMAR. DMAR isn't mentioned in the below code.

Sorry, here is a larger fragment (source arch/x86/kernel/apic/apic.c):

    /* IR is required if there is APIC ID > 255 even when running
     * under KVM
     */
    if (max_physical_apicid > 255 || !kvm_para_available())
        goto nox2apic;

The if stmt is confusing. Also, what would happen if this kernel was booted on a system without VT-d (and hence no DMAR)? Presumably it *can* boot in a DMAR-less environment -- there must be something odd going on for it to end on this path for us.

Yeah, that puzzled me, too.

What is the crash? And do you see any indication that x2APIC is turned on? Do provide a serial log please.
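Since the condition was called confusing, here is a small standalone restatement of it. This is a sketch, not kernel code: `gives_up_on_x2apic()` is a hypothetical helper, and the boolean `kvm_para` stands in for the result of kvm_para_available(). I am also assuming, based on the comment in the fragment, that the test sits on the path taken when interrupt remapping (IR) could not be set up.

```c
#include <stdbool.h>

/* Hypothetical helper that mirrors the quoted condition.
 * Returns true when the kernel would jump to nox2apic:
 *   - an APIC ID above 255 exists (such IDs need IR), or
 *   - the kernel is not running as a KVM paravirtualized guest.
 * 'kvm_para' stands in for kvm_para_available(). */
static bool gives_up_on_x2apic(unsigned int max_physical_apicid, bool kvm_para)
{
    return max_physical_apicid > 255 || !kvm_para;
}
```

Read this way, whenever kvm_para_available() returns false the test is true regardless of the APIC IDs, so only a KVM guest with all APIC IDs at or below 255 gets past it. Presumably that is the case the crash kernel cannot hit when it is booted on bare metal.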
Log is attached. I did some more testing. The problem occurred on a Nehalem-EX system. I tried the same on a Nehalem-EP system and all was okay. I suspect some further problems in the ACPI tables of the EX system now. I'm not too familiar with ACPI tables. Anything I can do for further analysis?

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel