[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/virtio: Handle PCI devices which Host controller is described in DT
On 19.10.22 23:38, Stefano Stabellini wrote: Hello Stefano > + Vikram > > On Wed, 19 Oct 2022, Oleksandr Tyshchenko wrote: >>>> Regarding the virtio-mmio (platform) devices, yes, we could expose them >>>> with status "disabled", and they won't get probed by default. >>>> To be honest, I have experimented with that, when I was thinking of >>>> possible hotplug for virtio-mmio devices (I know, this sounds uncommon >>>> and strange). >>>> I used Linux feature (CONFIG_OF_DYNAMIC, overlays) to update the >>>> device-tree on running guest, so the toolstack initially inserts >>>> virtio-mmio device nodes for non-boot devices >>>> with status "disabled", and at the runtime, once we receive an event for >>>> example, we change the status to "ok" and the corresponding virtio-mmio >>>> device gets probed. >>>> But again, it is not a 100% hotplug, as we need to pre-allocate memory >>>> range and interrupt in advance (when generating guest device tree). >>> Actually this is really cool! Does it work? It doesn't matter to me if >>> the virtio devices are pci or mmio as long as we can solve the "wait" >>> problem. So this could be a good solution. >> >> ... yes, it does. Initially I experimented with virtio-mmio devices, but >> today I tried with PCI host bridge as well. >> I won't describe the commands which I used to apply/remove device-tree >> overlays from the userspace as well as the context of >> dtso files I created, I will describe how that could be done from the >> kernel by using existing functionality (CONFIG_OF_DYNAMIC). >> >> As I said if we exposed the devices with status "disabled", they >> wouldn't get probed by default. Once we receive an signal >> that otherend is ready, we change the status to "ok" and the >> corresponding device gets probed. >> >> So below the test patch, which just change the status of the required >> device-tree node (as you can see the code to update the property is >> simple enough), >> I hacked "xl sysrq" for the convenience of testing. >> >> >> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c >> index 045c1805b2d5..9683ce075bc9 100644 >> --- a/drivers/xen/grant-dma-ops.c >> +++ b/drivers/xen/grant-dma-ops.c >> @@ -10,6 +10,7 @@ >> #include <linux/module.h> >> #include <linux/dma-map-ops.h> >> #include <linux/of.h> >> +#include <linux/of_platform.h> >> #include <linux/pci.h> >> #include <linux/pfn.h> >> #include <linux/xarray.h> >> @@ -379,6 +380,108 @@ bool xen_is_grant_dma_device(struct device *dev) >> return false; >> } >> >> +/* TODO: Consider using statically allocated (struct property status) */ >> +static int xen_grant_dma_enable_device(struct device_node *np) >> +{ >> + struct property *status; >> + >> + status = kzalloc(sizeof(*status), GFP_KERNEL); >> + if (!status) >> + return -ENOMEM; >> + >> + status->name = kstrdup("status", GFP_KERNEL); >> + if (!status->name) >> + return -ENOMEM; >> + >> + status->value = kstrdup("okay", GFP_KERNEL); >> + if (!status->value) >> + return -ENOMEM; >> + >> + status->length = sizeof("okay"); >> + >> + return of_update_property(np, status); >> +} >> + >> +static int xen_grant_dma_disable_device(struct device_node *np) >> +{ >> + struct property *status; >> + >> + status = kzalloc(sizeof(*status), GFP_KERNEL); >> + if (!status) >> + return -ENOMEM; >> + >> + status->name = kstrdup("status", GFP_KERNEL); >> + if (!status->name) >> + return -ENOMEM; >> + >> + status->value = kstrdup("disabled", GFP_KERNEL); >> + if (!status->value) >> + return -ENOMEM; >> + >> + status->length = sizeof("disabled"); >> + >> + return of_update_property(np, status); >> +} >> + >> +void xen_grant_dma_handle_sysrq(int key) >> +{ >> + struct device_node *np; >> + const char *path; >> + bool enable; >> + >> + printk("%s: got key %d\n", __func__, key); >> + >> + switch (key) { >> + case '0': >> + path = "/virtio@2000000"; >> + enable = true; >> + break; >> + >> + case '1': >> + path = "/virtio@2000200"; >> + enable = true; >> + break; >> + >> + case '2': >> + path = "/virtio@2000000"; >> + enable = false; >> + break; >> + >> + case '3': >> + path = "/virtio@2000200"; >> + enable = false; >> + break; >> + >> + case '4': >> + path = "/pcie@10000000"; >> + enable = true; >> + break; >> + >> + case '5': >> + path = "/pcie@10000000"; >> + enable = false; >> + break; >> + >> + default: >> + printk("%s: wrong key %d\n", __func__, key); >> + return; >> + } >> + >> + np = of_find_node_by_path(path); >> + if (!np) { >> + printk("%s: failed to find node by path %s\n", __func__, >> path); >> + return; >> + } >> + >> + if (enable) { >> + xen_grant_dma_enable_device(np); >> + printk("%s: enable %s\n", __func__, path); >> + } else { >> + xen_grant_dma_disable_device(np); >> + printk("%s: disable %s\n", __func__, path); >> + } >> +} >> + >> bool xen_virtio_mem_acc(struct virtio_device *dev) >> { >> if (IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT) || xen_pv_domain()) >> diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c >> index c16df629907e..6df96be1ea40 100644 >> --- a/drivers/xen/manage.c >> +++ b/drivers/xen/manage.c >> @@ -308,7 +308,8 @@ static void sysrq_handler(struct xenbus_watch >> *watch, const char *path, >> goto again; >> >> if (sysrq_key != '\0') >> - handle_sysrq(sysrq_key); >> + /*handle_sysrq(sysrq_key);*/ >> + xen_grant_dma_handle_sysrq(sysrq_key); >> } >> >> static struct xenbus_watch sysrq_watch = { >> diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h >> index a34f4271a2e9..c2da1bc24091 100644 >> --- a/include/xen/xen-ops.h >> +++ b/include/xen/xen-ops.h >> @@ -215,6 +215,8 @@ static inline void xen_preemptible_hcall_end(void) { } >> >> #endif /* CONFIG_XEN_PV && !CONFIG_PREEMPTION */ >> >> +void xen_grant_dma_handle_sysrq(int key); >> + >> #ifdef CONFIG_XEN_GRANT_DMA_OPS >> void xen_grant_setup_dma_ops(struct device *dev); >> bool xen_is_grant_dma_device(struct device *dev); >> (END) >> >> So how it looks like: >> >> 1. DomU boots without PCI Host bridge probed. So nothing PCI related is >> observed in DomU. >> >> cat /proc/device-tree/pcie@10000000/status >> disabled >> >> 2. I run backends in DomD and after that issue a signal to "enable" >> >> root@generic-armv8-xt-dom0:~# xl sysrq DomU 4 >> >> 3. The PCI Host bridge is probed, and all required PCI devices are >> discovered >> >> root@generic-armv8-xt-dom0:~# xl console DomU >> [ 237.407620] xen_grant_dma_handle_sysrq: got key 52 >> [ 237.408133] pci-host-generic 10000000.pcie: host bridge >> /pcie@10000000 ranges: >> [ 237.408186] pci-host-generic 10000000.pcie: MEM >> 0x0023000000..0x0032ffffff -> 0x0023000000 >> [ 237.408231] pci-host-generic 10000000.pcie: MEM >> 0x0100000000..0x01ffffffff -> 0x0100000000 >> [ 237.408313] pci-host-generic 10000000.pcie: ECAM at [mem >> 0x10000000-0x1fffffff] for [bus 00-ff] >> [ 237.408451] pci-host-generic 10000000.pcie: PCI host bridge to bus >> 0000:00 >> [ 237.408490] pci_bus 0000:00: root bus resource [bus 00-ff] >> [ 237.408517] pci_bus 0000:00: root bus resource [mem >> 0x23000000-0x32ffffff] >> [ 237.408545] pci_bus 0000:00: root bus resource [mem >> 0x100000000-0x1ffffffff pref] >> [ 237.409043] pci 0000:00:00.0: [1b36:0008] type 00 class 0x060000 >> [ 237.458045] pci 0000:00:01.0: [1af4:1041] type 00 class 0x020000 >> [ 237.502588] pci 0000:00:01.0: reg 0x20: [mem 0x00000000-0x00003fff >> 64bit pref] >> [ 237.507475] pci 0000:00:02.0: [1af4:1042] type 00 class 0x010000 >> [ 237.552706] pci 0000:00:02.0: reg 0x20: [mem 0x00000000-0x00003fff >> 64bit pref] >> [ 237.559847] pci 0000:00:01.0: BAR 4: assigned [mem >> 0x100000000-0x100003fff 64bit pref] >> [ 237.560411] pci 0000:00:02.0: BAR 4: assigned [mem >> 0x100004000-0x100007fff 64bit pref] >> [ 237.563324] virtio-pci 0000:00:01.0: Set up Xen grant DMA ops (rid >> 0x8 sid 0x1) >> [ 237.564833] virtio-pci 0000:00:01.0: enabling device (0000 -> 0002) >> [ 237.582734] virtio-pci 0000:00:02.0: Set up Xen grant DMA ops (rid >> 0x10 sid 0x1) >> [ 237.583413] virtio-pci 0000:00:02.0: enabling device (0000 -> 0002) >> [ 237.595712] virtio_blk virtio1: 4/0/0 default/read/poll queues >> [ 237.596227] virtio_net virtio0 enp0s1: renamed from eth1 >> [ 237.602499] virtio_blk virtio1: [vda] 4096000 512-byte logical blocks >> (2.10 GB/1.95 GiB) >> [ 237.606317] xen_grant_dma_handle_sysrq: enable /pcie@10000000 >> >> 4. The same way the pseudo-hotremove would work (if we change the status >> to "disabled" the corresponding device gets removed) >> >> >> If this pseudo-hotplug sounds appropriate for the dom0less, > This is great! Yes I think it is totally acceptable. Perfect! > > >> the one of the next questions would be what mechanism to use for >> signalling (event, xenstore, whatever). > For your information, we had to solve a similar issue a few months ago > to let a domU discover a newly added and directly assigned programmable > logic block. That was also done by applying DT overlays, first to Xen, > then to the domU. > > Have a look at Vikram's Xen Summit presentation: > https://urldefense.com/v3/__https://static.sched.com/hosted_files/xen2022/e8/Introduce*20Dynamic*20Device*20Node*20Programming*20for*20Xen.pdf__;JSUlJSUl!!GF_29dbcQIUBPA!yTJGjmmfrdJfoc-magnbk6EcuNspXxuQyDMzItgaV9mehJMj37w7Go_O9tDCQV3Qpij0PORZdiZsjBYROsaRNWVjPNUucg$ > [static[.]sched[.]com] > > We wrote a small xenstore-based protocol to notify the domU and also to > tranfer the overlay to it: > > https://urldefense.com/v3/__https://github.com/Xilinx/xen/blob/xlnx_rebase_4.16/docs/misc/arm/overlay.txt__;!!GF_29dbcQIUBPA!yTJGjmmfrdJfoc-magnbk6EcuNspXxuQyDMzItgaV9mehJMj37w7Go_O9tDCQV3Qpij0PORZdiZsjBYROsaRNWXs7xlouA$ > [github[.]com] > https://urldefense.com/v3/__https://github.com/Xilinx/xen/blob/xlnx_rebase_4.16/tools/helpers/get_overlay.c__;!!GF_29dbcQIUBPA!yTJGjmmfrdJfoc-magnbk6EcuNspXxuQyDMzItgaV9mehJMj37w7Go_O9tDCQV3Qpij0PORZdiZsjBYROsaRNWXoR3gEjw$ > [github[.]com] > > There is a good description starting at slide 16 in the PDF. > > > I am only sharing this as FYI. Very interesting materials and impressive work! I can't even imagine what it took to get it working. > This Virtio problem is simpler because we > already know the devices that are going to become available. We don't > need an actual DT overlay to be passed to the domU. So we could get away > with just a single interrupt or a single xenstore property. I completely agree that for virtio it is going to be simpler than for FPGA. > > >> Note that signal should only be sent if all backends which serve >> virtio-pci devices within that PCI Host bridge are ready. > Yes. That should be fine as long as all the backends are in the same > domain. I can imagine there could be difficulties if the backends are > in different domains: backend-domain-1 would have to tell dom0 that it > is ready, then backend-domain-2 would have to do the same, then dom0 > finally notifies the domU, or something like that. > > Anyway, I think this is good enough to start as a solution. Excellent! > > >>>>> Other ideas? >>>> Another (crazy?) idea is to reuse CONFIG_XEN_VIRTIO_FORCE_GRANT for >>>> dom0less system (I mean without "xen,grant-dma" bindings at all). >>>> If virtio backends are always going to run in Dom0 when we have it up >>>> and running, then it should work as domid == 0 is reserved for Dom0. >>>> If there is a need to run virtio backends in other *backend* domain (for >>>> the domain ID to be always known we could reserve an ID for it, so it >>>> would be a const value), >>>> we could probably introduce something configurable like >>>> CONFIG_XEN_VIRTIO_FORCE_GRANT_BE_DOMID with 0 by default (or cmd line >>>> option). >>> The problem in a dom0less system is not much how to tell which is the >>> backend domid, because that is known in advance and could be added to >>> device tree at boot somehow. The issue is how to ask the frontend to >>> "wait" and then how to tell the frontend to "proceed" after the backend >>> comes online. >> please see above. >> >> >> To summarize: >> >> 1. For normal case there is no problem with communicating the backend >> domid on Arm with device-tree (neither for virtio-mmio nor for virtio-pci), >> for the virtio-pci the V2 (PCI-IOMMU bindings) should be used. For the >> dom0less there won't be problem also as I understood from the discussion >> (as the configuration is known in advance). >> So I propose to concentrate on V2. > Yes I agree > > >> 2. The problem is in supporting virtio for the dom0less in general >> despite whether it is a foreign or grant mappings. >> Here we would need a (pseudo-)hotplug or some other method to start >> operating only when backend is available. > Yes I think you are right -- Regards, Oleksandr Tyshchenko
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |