Re: [PATCH v3 5/7] vpci: add SR-IOV support for PVH Dom0
On 4/21/26 17:43, Jan Beulich wrote:
> On 09.04.2026 16:01, Mykyta Poturai wrote:
>> From: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
>>
>> This code is expected to only be used by privileged domains,
>> unprivileged domains should not get access to the SR-IOV capability.
>>
>> Implement RW handlers for PCI_SRIOV_CTRL register to dynamically
>> map/unmap VF BARS. Recalculate BAR sizes before mapping VFs to account
>> for possible changes in the system page size register. Also force VFs to
>> always use emulated reads for command register, this is needed to
>> prevent some drivers accidentally unmapping BARs.
>
> This apparently refers to the change to vpci_init_header(). Writes are
> already intercepted. How would a read lead to accidental BAR unmap? Even
> for writes I don't see how a VF driver could accidentally unmap BARs, as
> the memory decode bit there is hardwired to 0.
>
>> Discovery of VFs is
>> done by Dom0, which must register them with Xen.
>
> If we intercept control register writes, why would we still require
> Dom0 to report the VFs that appear?
>
Sorry, I don't understand this question. You specifically requested this
to be done this way in V2. Quoting your reply from V2 below.
> Aren't you effectively busy-waiting for these 100ms, by simply returning
> "true" from vpci_process_pending() until the time has passed? This imo is
> a no-go. You want to set a timer and put the vCPU to sleep, to wake it up
> again when the timer has expired. That'll then eliminate the need for the
> not-so-nice patch 4. Question is whether we need to actually go this far
> (right away). I expect you don't mean to hand PFs to DomU-s. As long as
> we keep them in the hardware domain, can't we trust it to set things up
> correctly, just like we trust it in a number of other aspects?
> Jan
>> +static int map_vfs(const struct pci_dev *pf_pdev, uint16_t cmd)
>> +{
>> +    struct pci_dev *vf_pdev;
>> +    int rc;
>> +
>> +    ASSERT(rw_is_write_locked(&pf_pdev->domain->pci_lock));
>> +
>> +    list_for_each_entry(vf_pdev, &pf_pdev->vf_list, vf_list)
>> +    {
>> +        rc = vpci_modify_bars(vf_pdev, cmd, false);
>> +        if ( rc )
>> +        {
>> +            gprintk(XENLOG_ERR, "failed to %s VF %pp: %d\n",
>> +                    (cmd & PCI_COMMAND_MEMORY) ? "map" : "unmap",
>> +                    &vf_pdev->sbdf, rc);
>> +            return rc;
>> +        }
>> +
>> +        vf_pdev->vpci->header.guest_cmd &= ~PCI_COMMAND_MEMORY;
>> +        vf_pdev->vpci->header.guest_cmd |= (cmd & PCI_COMMAND_MEMORY);
>
> As mentioned elsewhere as well, this bit is supposed to be 0 for VFs.
There are some devices that expose VFs with the same VID/DID as the PF,
causing Linux to bind the normal (PF) driver to them and treat them like
ordinary devices. At some point such a driver does a read-modify-write of
the command register and ends up writing 0 to PCI_COMMAND_MEMORY, causing
cmd_write to unmap the BARs of that device.
I am not sure; maybe it would be better to just ignore cmd writes for VFs?
>> +    sriov_pos = pci_find_ext_capability(pf_pdev, PCI_EXT_CAP_ID_SRIOV);
>> +    ctrl = pci_conf_read16(pf_pdev->sbdf, sriov_pos + PCI_SRIOV_CTRL);
>> +
>> +    if ( (pf_pdev->domain == vf_pdev->domain) && (ctrl & PCI_SRIOV_CTRL_MSE) )
>> +    {
>> +        rc = vpci_modify_bars(vf_pdev, PCI_COMMAND_MEMORY, false);
>
> Doesn't VF enable also need to be set for the BARs to be mapped?
I don't think so. Enabling memory space maps naturally onto mapping memory
into the guest. I don't see any benefit in also requiring the VFE bit here.
--
Mykyta