Re: [PATCH v1 1/3] vpci: Hide capability when it fails to initialize
On 2025/3/31 16:43, Roger Pau Monné wrote:
> On Mon, Mar 31, 2025 at 07:26:20AM +0000, Chen, Jiqian wrote:
>> On 2025/3/27 17:25, Roger Pau Monné wrote:
>>> On Thu, Mar 27, 2025 at 03:32:12PM +0800, Jiqian Chen wrote:
>>>>  #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>>>>
>>>> +static int vpci_init_cap_with_priority(struct pci_dev *pdev,
>>>> +                                       const char *priority)
>>>> +{
>>>> +    for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
>>>> +    {
>>>> +        const vpci_capability_t *capability = __start_vpci_array[i];
>>>> +        const unsigned int cap_id = capability->id;
>>>> +        unsigned int pos;
>>>> +        int rc;
>>>> +
>>>> +        if ( *(capability->priority) != *priority )
>>>> +            continue;
>>>> +
>>>> +        if ( !capability->is_ext )
>>>> +            pos = pci_find_cap_offset(pdev->sbdf, cap_id);
>>>> +        else
>>>> +            pos = pci_find_ext_capability(pdev->sbdf, cap_id);
>>>> +
>>>> +        if ( !pos )
>>>> +            continue;
>>>> +
>>>> +        rc = capability->init(pdev);
>>>> +
>>>> +        if ( rc )
>>>> +        {
>>>> +            printk(XENLOG_WARNING "%pd %pp: cap init fail rc=%d, try to hide\n",
>>>> +                   pdev->domain, &pdev->sbdf, rc);
>>>> +            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>>>> +                                   pos, capability->is_ext ? 4 : 1, NULL);
>>>
>>> Are you sure this works as intended?
>> Yes, I used failure test cases of init_msi/rebar.
>> From the "lspci" result, they were hidden from dom0.
>> But I forgot to test for domUs.
>
> I assume that's only tested with Linux?  See my comment below about
> capability ID 0 being reserved, and hence I think we should not keep
> capabilities with ID 0 on the list, as it might cause malfunctions to
> OSes.
>
>>> The capability ID 0 is marked as "reserved" in the spec, so it's
>>> unclear to me how OSes would handle finding such capability on the
>>> list - I won't be surprised if some implementations decide to
>>> terminate the walk.  It's fine to mask the capability ID for the
>>> ones that we don't want to expose, but there's further work to do
>>> IMO.
>>>
>>> The usual way to deal with masking capabilities is to short circuit
>>> the capability from the linked list, by making the previous capability
>>> "Next Capability Offset" point to the next capability in the list,
>>> thus skipping the current one.  So:
>>>
>>> capability[n - 1].next_cap = capability[n].next_cap
>>>
>>> IOW: you will need to add the handler to the previous capability on
>>> the list.  That's how it's already done in init_header().
>> Oh, I got your opinion.
>> But we may need to discuss this more.
>> In my opinion, there should be two situations:
>> First, if the device belongs to the hardware domain, there is no
>> emulation of the legacy or extended capabilities linked list, if I
>> understand the code right.
>
> Yes, but that needs to be fixed, we need to have this kind of
> emulation uniformly.
>
>> So, for this situation, I think the current implementation of my patch
>> is enough for hiding legacy or extended capabilities.
>
> It works given the current code in Linux.  As said above, I don't
> think this is fully correct according to the PCI spec.
>
>> Second, if the device belongs to a common domain, we just need to
>> consider legacy capabilities, since all extended capabilities are
>> hidden in init_header().
>> So, for this situation, I need to do what you said:
>> "capability[n - 1].next_cap = capability[n].next_cap"
>
> I'm not sure why we would want to handle the hardware domain vs
> unprivileged domains differently here.  The way to hide the
> capabilities should always be the same, like it's currently done for
> domUs.
So I first need to refactor the PCI capability list emulation code in
init_header() to serve all domains (dom0 + domUs), since the current code
only emulates the PCI capability list for domUs, right?

>
>> I am not sure if the above is right.
>>>
>>>> +            if ( rc )
>>>> +            {
>>>> +                printk(XENLOG_ERR "%pd %pp: fail to hide cap rc=%d\n",
>>>> +                       pdev->domain, &pdev->sbdf, rc);
>>>> +                return rc;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>>  void vpci_deassign_device(struct pci_dev *pdev)
>>>>  {
>>>>      unsigned int i;
>>>> @@ -128,7 +169,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
>>>>
>>>>  int vpci_assign_device(struct pci_dev *pdev)
>>>>  {
>>>> -    unsigned int i;
>>>>      const unsigned long *ro_map;
>>>>      int rc = 0;
>>>>
>>>> @@ -159,12 +199,19 @@ int vpci_assign_device(struct pci_dev *pdev)
>>>>          goto out;
>>>>  #endif
>>>>
>>>> -    for ( i = 0; i < NUM_VPCI_INIT; i++ )
>>>> -    {
>>>> -        rc = __start_vpci_array[i](pdev);
>>>> -        if ( rc )
>>>> -            break;
>>>> -    }
>>>> +    /*
>>>> +     * Capabilities with high priority like MSI-X need to
>>>> +     * be initialized before header
>>>> +     */
>>>> +    rc = vpci_init_cap_with_priority(pdev, VPCI_PRIORITY_HIGH);
>>>> +    if ( rc )
>>>> +        goto out;
>>>
>>> I understand this is not introduced by this change, but I wonder if
>>> there could be a way to ditch the priority stuff for capabilities,
>>> especially now that we only have two "priorities": before or after PCI
>>> header initialization.
>> I have an idea, but it seems like a hack.
>> Can we add a flag (maybe name it "msix_initialized") to struct vpci{}?
>> Then in vpci_make_msix_hole(), we can first check that flag; if it is
>> false, we return an error to let modify_decoding() directly return in
>> the process of init_header.
>> And at the start of init_msix(), set msix_initialized=true; at the end
>> of init_msix(), call modify_decoding() to set up the p2m.
>> Then we can remove the priorities.
>
> Maybe the initialization of the MSI-X capability could be done after
> the header, and call vpci_make_msix_hole()?  There's a bit of
> redundancy here in that the BAR is first fully mapped, and then a hole
> is punched in place of the MSI-X related tables.  Seems like the
> easier option to break the dependency of init_msix() in being called
> ahead of init_header().
You mean the sequence should be:

vpci_init_header()
vpci_init_capability() // all capabilities
vpci_make_msix_hole()

Right?

>
> Completely unrelated: looking at vpci_make_msix_hole() I see the call
> in modify_decoding() is redundant, as modify_bars() already carves the
> MSI-X regions out of the BARs.
>
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.
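
For reference, the short circuit discussed above (capability[n - 1].next_cap = capability[n].next_cap) can be sketched on a plain byte array standing in for a device's 256-byte config space. This is only an illustration and not Xen code: find_cap(), hide_cap() and the cfg image are hypothetical names made up for the example, and the sketch rewrites the image directly, whereas the thread talks about adding a read handler on the previous capability's next pointer. The only parts taken from the PCI spec are the layout: the Capabilities Pointer at 0x34, the ID byte at offset 0 and the Next Capability Offset byte at offset 1 of each legacy capability.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CAPABILITY_LIST 0x34 /* Capabilities Pointer in the header */
#define PCI_CAP_LIST_ID     0    /* ID byte within a legacy capability */
#define PCI_CAP_LIST_NEXT   1    /* Next Capability Offset byte */
#define CAP_WALK_TTL        48   /* bound the walk in case of a looped list */

/* Hypothetical helper: return the offset of capability 'id', or 0. */
unsigned int find_cap(const uint8_t cfg[256], uint8_t id)
{
    unsigned int pos = cfg[PCI_CAPABILITY_LIST] & ~3u;
    unsigned int ttl = CAP_WALK_TTL;

    while ( ttl-- && pos >= 0x40 )
    {
        if ( cfg[pos + PCI_CAP_LIST_ID] == id )
            return pos;
        pos = cfg[pos + PCI_CAP_LIST_NEXT] & ~3u;
    }

    return 0;
}

/*
 * Hypothetical helper: unlink capability 'id' by making whatever pointed
 * at it (the header pointer or the previous capability's next pointer)
 * point at its successor.  Returns the unlinked offset, or 0 if not found.
 */
unsigned int hide_cap(uint8_t cfg[256], uint8_t id)
{
    unsigned int prev = PCI_CAPABILITY_LIST; /* byte pointing at 'pos' */
    unsigned int ttl = CAP_WALK_TTL;

    assert(cfg);

    while ( ttl-- )
    {
        unsigned int pos = cfg[prev] & ~3u;

        if ( pos < 0x40 )
            break; /* end of list */

        if ( cfg[pos + PCI_CAP_LIST_ID] == id )
        {
            /* capability[n - 1].next_cap = capability[n].next_cap */
            cfg[prev] = cfg[pos + PCI_CAP_LIST_NEXT];
            return pos;
        }

        prev = pos + PCI_CAP_LIST_NEXT;
    }

    return 0;
}
```

With a list such as 0x40 (PM) -> 0x50 (MSI) -> 0x60 (PCIe), calling hide_cap(cfg, 0x05) leaves 0x40 pointing at 0x60, so a guest walking the list never sees an entry with the reserved ID 0.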