
[Xen-devel] OVS Netlink zerocopy vs Xen netback zerocopy


Currently I'm working on a patchset which reintroduces grant mapping into netback. We used it before the Linux Xen bits were upstreamed, but we had to change to grant copy as the original solution was fundamentally not upstreamable. The advantage would be huge, though, as we could replace having Xen copy guest pages with mapping guest pages into Dom0. In parallel I'm working on a grant mapping optimization which makes it possible to avoid m2p_override for grant mapped pages. It causes lock contention, and we don't need it if the pages don't go to userspace. This should be a safe assumption: those pages stay in kernel space while switched by OVS, and if they end up on the local port and are delivered to the Dom0 IP stack, deliver_skb calls skb_orphan_frags, which swaps out the foreign (= grant mapped from guest) pages for local copies and notifies netback through a callback that it can give the pages back to the guest.
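For reference, the safety net I'm relying on is roughly this shape in mainline (paraphrased from memory, so treat it as a sketch rather than the exact current code): skb_orphan_frags is a no-op unless the skb carries zerocopy user pages, in which case it copies the frags into locally allocated pages:

```c
/* Sketch of the mainline helper (net/core code, paraphrased).
 * If the skb's frags are foreign/zerocopy pages (SKBTX_DEV_ZEROCOPY),
 * skb_copy_ubufs() replaces them with local copies and fires the
 * completion callback so the owner (netback in our case) can reclaim
 * the original pages. Otherwise it is a cheap no-op. */
static inline int skb_orphan_frags(struct sk_buff *skb, gfp_t gfp_mask)
{
	if (likely(!(skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY)))
		return 0;
	return skb_copy_ubufs(skb, gfp_mask);
}
```

So as long as every path that exposes the frags outside the kernel goes through this helper, netback gets its pages back before anyone else can touch them.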

And after that somewhat long introduction, here comes the main question: OVS recently introduced Netlink zerocopy, which by my understanding means that Netlink messages from the kernel are not copied but mapped to userspace. Such a message can contain a whole packet if it hasn't matched any flow in the kernel, or if the flow action says so. As far as I saw, skb_zerocopy will clone the frags from the real packet skb onto the Netlink skb. Note that the linear buffer is local memory in the netback case as well; we copy the beginning of the packet (max 128 bytes) there, and only the pages in the frags are foreign ones. I don't know the internals of Netlink well enough to say how a packet is forwarded up in this case, but it concerns me: if the pages in the skb_shinfo(skb)->frags array are still the foreign ones and userspace wants to touch that data, we are in trouble. If this is the scenario, I think the best fix would be to call skb_orphan_frags before skb_zerocopy in queue_userspace_packet, so the frags become local. Fortunately this is a corner case, as it shouldn't happen very often that the kernel sends up packets bigger than 128 bytes.
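Concretely, the change I have in mind would look something like this in queue_userspace_packet (a minimal sketch against net/openvswitch/datapath.c, assuming the mainline skb_orphan_frags and skb_zerocopy signatures; error handling and the surrounding upcall setup omitted):

```c
/* Hypothetical fragment inside queue_userspace_packet(), before the
 * packet's frags are attached to the netlink skb (user_skb).
 * skb_orphan_frags() replaces any foreign (e.g. grant-mapped) pages
 * with local copies, so userspace never sees guest memory. */
err = skb_orphan_frags(skb, GFP_ATOMIC);
if (err)
	goto out;

/* Now it is safe to share the frags with the netlink message. */
skb_zerocopy(user_skb, skb, skb->len, hlen);
```

Since skb_orphan_frags is a no-op for ordinary skbs, the cost for the common (non-foreign) case should be negligible, and the copy only happens in the already rare "packet bigger than the linear area" upcall path.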

What do you think about the solution in the last paragraph? Or do we need it at all?



Xen-devel mailing list


