
Re: [PATCH v1 1/3] vpci: Hide capability when it fails to initialize



On Mon, Mar 31, 2025 at 09:32:02AM +0000, Chen, Jiqian wrote:
> On 2025/3/31 16:43, Roger Pau Monné wrote:
> > On Mon, Mar 31, 2025 at 07:26:20AM +0000, Chen, Jiqian wrote:
> >> On 2025/3/27 17:25, Roger Pau Monné wrote:
> >>> On Thu, Mar 27, 2025 at 03:32:12PM +0800, Jiqian Chen wrote: 
> >>>>  #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
> >>>>  
> >>>> +static int vpci_init_cap_with_priority(struct pci_dev *pdev,
> >>>> +                                       const char *priority)
> >>>> +{
> >>>> +    for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
> >>>> +    {
> >>>> +        const vpci_capability_t *capability = __start_vpci_array[i];
> >>>> +        const unsigned int cap_id = capability->id;
> >>>> +        unsigned int pos;
> >>>> +        int rc;
> >>>> +
> >>>> +        if ( *(capability->priority) != *priority )
> >>>> +            continue;
> >>>> +
> >>>> +        if ( !capability->is_ext )
> >>>> +            pos = pci_find_cap_offset(pdev->sbdf, cap_id);
> >>>> +        else
> >>>> +            pos = pci_find_ext_capability(pdev->sbdf, cap_id);
> >>>> +
> >>>> +        if ( !pos )
> >>>> +            continue;
> >>>> +
> >>>> +        rc = capability->init(pdev);
> >>>> +
> >>>> +        if ( rc )
> >>>> +        {
> >>>> +            printk(XENLOG_WARNING "%pd %pp: cap init fail rc=%d, try to hide\n",
> >>>> +                   pdev->domain, &pdev->sbdf, rc);
> >>>> +            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> >>>> +                                   pos, capability->is_ext ? 4 : 1, NULL);
> >>>
> >>> Are you sure this works as intended? 
> >> Yes, I used failure test cases of init_msi/rebar.
> >> From the "lspci" result, they were hidden from dom0.
> >> But I forgot to test for domUs.
> > 
> > I assume that's only tested with Linux?  See my comment below about
> > capability ID 0 being reserved, and hence I think we should not keep
> > capabilities with ID 0 on the list, as it might cause malfunctions to
> > OSes.
> > 
> >>> The capability ID 0 is marked as "reserved" in the spec, so it's
> >>> unclear to me how OSes would handle finding such a capability on the
> >>> list - I won't be surprised if some implementations decide to
> >>> terminate the walk.  It's fine to mask the
> >>> capability ID for the ones that we don't want to expose, but there's
> >>> further work to do IMO.
> >>>
> >>> The usual way to deal with masking capabilities is to short circuit
> >>> the capability from the linked list, by making the previous capability
> >>> "Next Capability Offset" point to the next capability in the list,
> >>> thus skipping the current one. So:
> >>>
> >>> capability[n - 1].next_cap = capability[n].next_cap
> >>>
> >>> IOW: you will need to add the handler to the previous capability on
> >>> the list.  That's how it's already done in init_header().
> >> Oh, I see your point.
> >> But we may need to discuss this more.
> >> In my opinion, there are two situations:
> >> First, if the device belongs to the hardware domain, there is no
> >> emulation of the legacy or extended capability linked lists, if I
> >> understand the code correctly.
> > 
> > Yes, but that needs to be fixed, we need to have this kind of
> > emulation uniformly.
> > 
> >> So, for this situation, I think the current implementation of my patch
> >> is enough for hiding legacy or extended capabilities.
> > 
> > It works given the current code in Linux.  As said above, I don't
> > think this is fully correct according to the PCI spec.
> > 
> >> Second, if the device belongs to a common domain, we only need to
> >> consider legacy capabilities, since all extended capabilities are
> >> hidden in init_header().
> >> So, for this situation, I need to do what you said:
> >> "capability[n - 1].next_cap = capability[n].next_cap"
> > 
> > I'm not sure why we would want to handle the hardware domain vs
> > unprivileged domains differently here.  The way to hide the
> > capabilities should always be the same, like it's currently done for
> > domUs.
> So, I first need to refactor the PCI capability list emulation code in
> init_header() so it serves all domains (dom0 + domUs), since the current
> code only emulates the PCI capability list for domUs, right?

Yes, that would be my understanding.  The current logic in
init_header() needs to be expanded/generalized so it can be used for
masking random PCI capabilities, plus adapted to work with PCIe
capabilities also.
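
To make the short-circuit idea concrete, here is a minimal standalone
sketch, not Xen code.  It models config space as a plain byte array, and
emulated_next() and hide_msi() are hypothetical names used only for
illustration.  It covers the legacy capability list only; the extended
(PCIe) list uses a different header layout and would need its own
variant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_CAPABILITY_LIST 0x34 /* header byte holding the first cap pointer */
#define PCI_CAP_LIST_ID     0    /* offset of the ID byte inside a capability */
#define PCI_CAP_LIST_NEXT   1    /* offset of the next pointer inside a capability */
#define PCI_CAP_ID_MSI      0x05

/* Hypothetical policy hook: true if a capability ID must not be exposed. */
typedef bool (*cap_hidden_fn)(uint8_t id);

/* Example policy for this sketch only: hide MSI. */
static bool hide_msi(uint8_t id)
{
    return id == PCI_CAP_ID_MSI;
}

/*
 * Value a guest should read from the pointer byte at ptr_off: the
 * position of the first capability down the chain that is not hidden,
 * or 0 if none is left.  This is
 * capability[n - 1].next_cap = capability[n].next_cap, applied
 * repeatedly so that consecutive hidden capabilities are skipped too.
 */
static uint8_t emulated_next(const uint8_t *cfg, uint8_t ptr_off,
                             cap_hidden_fn hidden)
{
    uint8_t pos = cfg[ptr_off] & ~3;
    unsigned int ttl = 48; /* bound the walk in case the list loops */

    while ( pos >= 0x40 && ttl-- )
    {
        if ( !hidden(cfg[pos + PCI_CAP_LIST_ID]) )
            return pos;
        pos = cfg[pos + PCI_CAP_LIST_NEXT] & ~3;
    }

    return 0;
}
```

The handler sits on the pointer byte of the previous element (or on the
header's capabilities pointer for the first one), so the hidden
capability's own ID never needs to be rewritten to 0.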

> > 
> >> I am not sure if above are right.
> >>>
> >>>> +            if ( rc )
> >>>> +            {
> >>>> +                printk(XENLOG_ERR "%pd %pp: fail to hide cap rc=%d\n",
> >>>> +                       pdev->domain, &pdev->sbdf, rc);
> >>>> +                return rc;
> >>>> +            }
> >>>> +        }
> >>>> +    }
> >>>> +
> >>>> +    return 0;
> >>>> +}
> >>>> +
> >>>>  void vpci_deassign_device(struct pci_dev *pdev)
> >>>>  {
> >>>>      unsigned int i;
> >>>> @@ -128,7 +169,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
> >>>>  
> >>>>  int vpci_assign_device(struct pci_dev *pdev)
> >>>>  {
> >>>> -    unsigned int i;
> >>>>      const unsigned long *ro_map;
> >>>>      int rc = 0;
> >>>>  
> >>>> @@ -159,12 +199,19 @@ int vpci_assign_device(struct pci_dev *pdev)
> >>>>          goto out;
> >>>>  #endif
> >>>>  
> >>>> -    for ( i = 0; i < NUM_VPCI_INIT; i++ )
> >>>> -    {
> >>>> -        rc = __start_vpci_array[i](pdev);
> >>>> -        if ( rc )
> >>>> -            break;
> >>>> -    }
> >>>> +    /*
> >>>> +     * Capabilities with high priority like MSI-X need to
> >>>> +     * be initialized before header
> >>>> +     */
> >>>> +    rc = vpci_init_cap_with_priority(pdev, VPCI_PRIORITY_HIGH);
> >>>> +    if ( rc )
> >>>> +        goto out;
> >>>
> >>> I understand this is not introduced by this change, but I wonder if
> >>> there could be a way to ditch the priority stuff for capabilities,
> >>> especially now that we only have two "priorities": before or after PCI
> >>> header initialization.
> >> I have an idea, but it seems like a hack.
> >> Can we add a flag (maybe named "msix_initialized") to struct vpci{}?
> >> Then vpci_make_msix_hole() can first check that flag; if it is false,
> >> we return an error so that modify_decoding() returns directly during
> >> init_header().
> >> At the start of init_msix() we set msix_initialized=true, and at the
> >> end of init_msix() we call modify_decoding() to set up the p2m.
> >> Then we can remove the priorities.
> > 
> > Maybe the initialization of the MSI-X capability could be done after
> > the header, and call vpci_make_msix_hole()?  There's a bit of
> > redundancy here in that the BAR is first fully mapped, and then a hole
> > is punched in place of the MSI-X related tables.  Seems like the
> > easier option to break the dependency on init_msix() being called
> > ahead of init_header().
> You mean the sequence should be:
> vpci_init_header()
> vpci_init_capability() // all capabilities
> vpci_make_msix_hole()
> 
> Right?

Yes, I think that would be my preference.  The call to
vpci_make_msix_hole() should be inside of init_msix().
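
As a rough illustration of that ordering only, here is a standalone
sketch, not the actual vPCI code.  Every function below is a stub named
after the real one, trace/note() exist purely to record the call order,
and error handling (including hiding a capability that fails to
initialize) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char trace[64]; /* records the call order, for illustration only */

static void note(const char *s)
{
    strncat(trace, s, sizeof(trace) - strlen(trace) - 1);
}

/* Stub: the header is set up first and maps the BARs in full. */
static int init_header(void) { note("header;"); return 0; }

/* Stub: punches the hole for the MSI-X tables in the mapped BAR. */
static void make_msix_hole(void) { note("hole;"); }

/* Stub: MSI-X init runs after the header and punches its own hole. */
static int init_msix(void) { note("msix;"); make_msix_hole(); return 0; }

/* Stub: any other capability, with no ordering requirement. */
static int init_msi(void) { note("msi;"); return 0; }

static int assign_device(void)
{
    int (*const caps[])(void) = { init_msi, init_msix };
    int rc = init_header();

    /* One pass over all capabilities, with no priorities. */
    for ( size_t i = 0; !rc && i < sizeof(caps) / sizeof(caps[0]); i++ )
        rc = caps[i]();

    return rc;
}
```

With the hole punched from inside init_msix(), the only ordering left is
"header first, then every capability", so the priority field has nothing
to encode.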

Thanks, Roger.



 

