Re: [Xen-devel] Request for help: passing network statistics from netback driver to Xen scheduler.

On 31 Jul 2014, at 18:38, George Dunlap <George.Dunlap@xxxxxxxxxxxxx> wrote:

> Just as a comment: I think a potential problem with this approach is
> that you'll run into a positive feedback loop.  Processing network
> traffic takes a lot of CPU time; and in particular, it needs the
> ability to process packets in a *timely* manner.  Because most
> connections are TCP, the ability to do work now *creates* work later
> (in the form of more network packets).  Not having enough CPU, or
> being delayed in when it can process packets, even by 50ms, can
> significantly reduce the amount of traffic a VM gets.  So it's likely
> that a domain that currently has a high priority will be able to
> generate more traffic for itself, maintaining its high priority; and a
> domain that currently has a low priority will not be able to send acks
> fast enough, and will continue to receive low network traffic, thus
> maintaining its low priority.
> Something to watch out for, anyway. :-)

Thank you for your feedback! I'm aware of the potential problem. However, as I
have a tight deadline, I'll address the issue when it arises.
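
To make the feedback-loop concern concrete, here is a toy model in C. It is
only a sketch, not a measurement: two domains compete for one CPU, and the
scheduler sets each domain's next CPU share in proportion to the traffic it
just achieved. The assumption that achieved traffic grows with the square of
the CPU share (standing in for TCP's sensitivity to delayed acks) is made up
for illustration, as are the names next_share and simulate.

```c
#include <stdio.h>

/*
 * One scheduling round for two domains A and B.
 * share is A's CPU share in [0, 1]; B gets 1 - share.
 * ASSUMPTION (illustrative only): achieved traffic is proportional to
 * the square of the CPU share, and the next share is proportional to
 * the traffic achieved in this round.
 */
double next_share(double share)
{
    double traffic_a = share * share;
    double traffic_b = (1.0 - share) * (1.0 - share);

    return traffic_a / (traffic_a + traffic_b);
}

/* Apply next_share() for the given number of rounds. */
double simulate(double share, int rounds)
{
    int i;

    for (i = 0; i < rounds; i++)
        share = next_share(share);
    return share;
}
```

Under this assumption an even split stays even, but any small initial
advantage grows round after round until one domain holds almost all of the
CPU, which is the lock-in described above.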

> Does the printk print the new value (i.e., "Intensity now [nnn] for
> domain M"), or just print that it's trying to do something (i.e.,
> "Incrementing network intensity")?

It's printing the new value. The xenvif_rx_action function looks like this
after my modification:

void xenvif_rx_action(struct xenvif *vif)
        s8 status;
        u16 flags;
        struct xen_netif_rx_response *resp;
        struct sk_buff_head rxq;
        struct sk_buff *skb;
        int ret;
        int nr_frags;
        int count;
        unsigned long offset;
        struct skb_cb_overlay *sco;
        int need_to_notify = 0;
        struct shared_info *shared_info = HYPERVISOR_shared_info;

        struct netrx_pending_operations npo = {
                .copy  = vif->grant_copy_op,
                .meta  = vif->meta,


        count = 0;

        while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
                vif = netdev_priv(skb->dev);
                nr_frags = skb_shinfo(skb)->nr_frags;

                sco = (struct skb_cb_overlay *)skb->cb;
                sco->meta_slots_used = xenvif_gop_skb(skb, &npo);

                count += nr_frags + 1;

                __skb_queue_tail(&rxq, skb);

                /* Filled the batch queue? */
                /* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
                if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)

        BUG_ON(npo.meta_prod > ARRAY_SIZE(vif->meta));

        if (!npo.copy_prod)

        BUG_ON(npo.copy_prod > MAX_GRANT_COPY_OPS);
        gnttab_batch_copy(vif->grant_copy_op, npo.copy_prod);

        while ((skb = __skb_dequeue(&rxq)) != NULL) {
                sco = (struct skb_cb_overlay *)skb->cb;

                vif = netdev_priv(skb->dev);

                if ((1 << vif->meta[npo.meta_cons].gso_type) &
                    vif->gso_prefix_mask) {
                        resp = RING_GET_RESPONSE(&vif->rx,

                        resp->flags = XEN_NETRXF_gso_prefix | 

                        resp->offset = vif->meta[npo.meta_cons].gso_size;
                        resp->id = vif->meta[npo.meta_cons].id;
                        resp->status = sco->meta_slots_used;


                vif->dev->stats.tx_bytes += skb->len;

                printk(KERN_EMERG "RX ACTION: %d %ld\n", vif->domid, 

                status = xenvif_check_gop(vif, sco->meta_slots_used, &npo);

                if (sco->meta_slots_used == 1)
                        flags = 0;
                        flags = XEN_NETRXF_more_data;

                if (skb->ip_summed == CHECKSUM_PARTIAL) /* local packet? */
                        flags |= XEN_NETRXF_csum_blank | 
                else if (skb->ip_summed == CHECKSUM_UNNECESSARY)
                        /* remote but checksummed. */
                        flags |= XEN_NETRXF_data_validated;

                offset = 0;
                resp = make_rx_response(vif, vif->meta[npo.meta_cons].id,
                                        status, offset,

                if ((1 << vif->meta[npo.meta_cons].gso_type) &
                    vif->gso_mask) {
                        struct xen_netif_extra_info *gso =
                                (struct xen_netif_extra_info *)

                        resp->flags |= XEN_NETRXF_extra_info;

                        gso->u.gso.type = vif->meta[npo.meta_cons].gso_type;
                        gso->u.gso.size = vif->meta[npo.meta_cons].gso_size;
                        gso->u.gso.pad = 0;
                        gso->u.gso.features = 0;

                        gso->type = XEN_NETIF_EXTRA_TYPE_GSO;
                        gso->flags = 0;

                xenvif_add_frag_responses(vif, status,
                                          vif->meta + npo.meta_cons + 1,

                RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&vif->rx, ret);

                if (ret)
                        need_to_notify = 1;


                npo.meta_cons += sco->meta_slots_used;

        if (need_to_notify)

        /* More work to do? */
        if (!skb_queue_empty(&vif->rx_queue))

> Do you really need this information to be "live" on a ms granularity?
> If not, you could have a process in dom0 wake up every several hundred
> ms, read the information from the netback thread, and then make calls
> to the scheduler to adjust priority.  It would be somewhat less
> responsive, but much easier to change (as you could simply recompile
> the dom0 process and restart it instead of having to reboot your
> host).

It doesn't have to be "live", but several hundred ms is probably too slow. I
don't really mind rebooting the machine, as I've got a fairly automated setup
for development. I'm looking for the simplest solution that'd work :)
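
For comparison, the dom0 polling approach suggested above could start from
something like the following sketch. It assumes the per-vif byte counter that
netback updates (vif->dev->stats.tx_bytes) is read through the standard sysfs
file /sys/class/net/<iface>/statistics/tx_bytes. The interface name (for
example vif1.0) and the helper names read_counter and rate_bytes_per_sec are
hypothetical, and the call into the scheduler is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Read one unsigned counter from a text file such as
 * /sys/class/net/vif1.0/statistics/tx_bytes (example path only).
 * Returns 0 on success and -1 if the file cannot be opened or parsed.
 */
int read_counter(const char *path, uint64_t *out)
{
    FILE *f;
    unsigned long long value;
    int parsed;

    assert(path != NULL && out != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    parsed = (fscanf(f, "%llu", &value) == 1);
    fclose(f);
    if (!parsed)
        return -1;

    *out = value;
    return 0;
}

/*
 * Bytes per second between two samples taken interval_ms apart.
 * Returns 0 for a zero interval or if the counter went backwards
 * (for example because the interface was recreated).
 */
uint64_t rate_bytes_per_sec(uint64_t prev, uint64_t cur,
                            unsigned int interval_ms)
{
    if (interval_ms == 0 || cur < prev)
        return 0;
    return (cur - prev) * 1000 / interval_ms;
}
```

A dom0 daemon would call read_counter once per polling interval for each vif,
pass consecutive samples to rate_bytes_per_sec, and hand the result to
whatever scheduler interface ends up being used. The polling interval bounds
how quickly the scheduler can react, which is the trade-off discussed above.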

> If your goal is just to hack together something for your project, then
> doing the shared page info is probably fine.  But if you want to
> upstream anything, you'll probably have to take a different approach.

I'm afraid my current version wouldn't be merged; it's more of a prototype.
However, if I get it working with reasonable results and there's interest in
this scheduler, I'd be more than happy to clean up the code and use whatever
approach is the right one.

Kind regards,
Marcin Długajczyk
