[Xen-devel] RE: VT-d device assignment may fail (regression from Xen c/s 19805:2f1fa2215e60)
>-----Original Message-----
>From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
>Sent: Friday, October 22, 2010 6:13 PM
>To: Han, Weidong
>Cc: Jiang, Yunhong; xen-devel@xxxxxxxxxxxxxxxxxxx
>Subject: VT-d device assignment may fail (regression from Xen c/s
>19805:2f1fa2215e60)
>
>Weidong,
>
>in this patch you removed a bus/devfn check around an invocation of
>domain_context_mapping_one() avoiding the attempt to call the
>function again if it was already called for this very device. This
>removal, however, conflicts with the context_present() check at the
>top of domain_context_mapping_one() - in particular, pdev->domain
>isn't set to the new owner yet, and hence the function fails.

Weidong is traveling and may comment further when he is back tomorrow.
Which removed bus/devfn check do you mean? I didn't find it in the
patch.

>
>The question now is whether some similar check should be restored,
>or whether pdev->domain should get updated earlier. This may
>need some additional consideration, since from looking at the code
>I would say that reassign_device_ownership() needs some error
>handling improvements too: Currently, partial failure isn't being
>handled properly at all (respective devices are left in a half way
>state - no longer properly assigned to Dom0, but also not yet
>assigned to DomU).
>
>I also wonder what guarantee there is for a device to exist at
><secbus>:00.0 (since if there is none, the same context_present()
>check could at least theoretically again lead to problems as it
>checks for pci_get_pdev() returning non-NULL).

Hmm, function 0 should always exist, but I didn't find anything in the
spec saying that device 0 must always be populated.

>
>Finally, isn't it inconsistent that only the original device gets its
>->domain set to the new owner and gets moved to that domain's
>device list, but neither the upstream bridge nor that bridge's
><secbus>:00.0 get handled the same way?
>What if below that

As I understand it, the bridge and <secbus>:00.0 only matter for PCI
devices, because all PCI devices behind the same PCIe-to-PCI bridge
should be assigned to the same domain. So if a device is assigned to a
domain, the bridge and <secbus>:00.0 should have the same owner, and it
is not strictly necessary to keep that information for the bridge and
<secbus>:00.0.

But it seems the current implementation misses something. Weidong,
please correct me if I'm wrong.

1) Currently the Xen hypervisor does not guarantee "atomic" assignment
of a device. I assume the tools take care of this today. But if the
tools do not guard against it, it may cause problems in the hypervisor.
For example, if the tools assign PCI device A to domain A and then try
to assign PCI device B (on the same bus as device A) to domain B, the
second assignment (to domain B) will fail because the assignment of the
PCI bridge fails, leaving device B in a half-way state, as Jan stated
above.

2) A hot-added device is owned by domain0 by default, which may cause
issues.

>bridge a device gets hot-added? Wouldn't that device
>incorrectly end up in Dom0, with no failures because the bridge
>still appears to be owned by Dom0 while it really isn't?

Yes, that is a problem. The device should be hidden from dom0.

Thanks
--jyh

>
>Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
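
For illustration only, the scenario in point 1) can be modelled outside
the hypervisor. The sketch below is not Xen code: struct toy_dev,
TOY_UNMAPPED, toy_map(), toy_reassign() and toy_demo() are all
hypothetical names, and a single integer stands in for whatever the real
context entry records. It assumes that mapping fails when an entry is
already present for a different domain, and it shows one possible way to
avoid the half-way state: snapshot the device and the bridge before
touching them and restore both if any step fails.

```c
#include <assert.h>

#define TOY_UNMAPPED (-1)   /* hypothetical marker: no context entry present */

/* Hypothetical stand-in for a device's context entry; not a Xen structure. */
struct toy_dev {
    int ctx_owner;          /* domain id the entry maps to, or TOY_UNMAPPED */
};

/* Map a device for domid; fail if it is already present for another domain. */
static int toy_map(struct toy_dev *d, int domid)
{
    if (d->ctx_owner == domid)
        return 0;           /* already mapped for this domain */
    if (d->ctx_owner != TOY_UNMAPPED)
        return -1;          /* present, but owned by someone else */
    d->ctx_owner = domid;
    return 0;
}

/* Move dev, and the bridge it sits behind, from src to dst: all or nothing. */
static int toy_reassign(struct toy_dev *dev, struct toy_dev *bridge,
                        int src, int dst)
{
    /* Snapshot both entries so a partial failure can be undone. */
    struct toy_dev saved_dev = *dev, saved_bridge = *bridge;

    dev->ctx_owner = TOY_UNMAPPED;
    if (bridge->ctx_owner == src)
        bridge->ctx_owner = TOY_UNMAPPED;

    if (toy_map(dev, dst) == 0 && toy_map(bridge, dst) == 0)
        return 0;

    /* Roll back: the device goes back to its previous owner. */
    *dev = saved_dev;
    *bridge = saved_bridge;
    return -1;
}

/* Devices A and B share one bridge; domain 0 plays the role of Dom0. */
static int toy_demo(void)
{
    struct toy_dev a = { 0 }, b = { 0 }, bridge = { 0 };

    assert(toy_reassign(&a, &bridge, 0, 1) == 0);   /* A -> domain 1 */
    assert(toy_reassign(&b, &bridge, 0, 2) == -1);  /* B -> domain 2 fails */

    /* B is back with domain 0 and the bridge still belongs to domain 1. */
    return (b.ctx_owner == 0 && bridge.ctx_owner == 1) ? 0 : 1;
}
```

Without the snapshot and restore, the second call would leave B mapped
for domain 2 while the bridge stays with domain 1, which is the half-way
state described above. The model deliberately ignores the hot-add case
in point 2).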