
Re: [Xen-devel] [PATCH 01/10] xen/arm: vpl011: Add pl011 uart emulation in Xen



On Wed, 19 Apr 2017, Bhupinder Thakur wrote:
> Hi Stefano,
> 
> Thanks for your comments.
> 
> On 19 April 2017 at 05:45, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> >> +static void vpl011_read_data(struct domain *d, uint8_t *data)
> >> +{
> >> +    unsigned long flags;
> >> +    struct vpl011_s *vpl011=&d->arch.vpl011;
> >
> > The code style is:
> >
> >     vpl011 = &d->arch.vpl011
> >
> >
> >> +    struct xencons_interface *intf=(struct xencons_interface *)vpl011->ring_buf;
> >
> > Same here. Please fix it everywhere.
> >
> I will fix the coding style here.
> 
> >
> >> +    /*
> >> +     * Initialize the data so that even if there is no data in ring buffer
> >> +     * 0 is returned.
> >> +     */
> >> +    *data = 0;
> >> +
> >> +    VPL011_LOCK(d, flags);
> >> +
> >> +    /*
> >> +     * It is expected that there will be data in the ring buffer when this
> >> +     * function is called since the guest is expected to read the data
> >> +     * register only if the RXFE flag is not set.
> >> +     * If the guest still does read even when the RXFE bit is set then 0
> >> +     * will be returned.
> >> +     */
> >> +    if ( !VPL011_IN_RING_EMPTY(intf) )
> >> +    {
> >> +        *data = intf->in[MASK_XENCONS_IDX(intf->in_cons++, intf->in)];
> >
> > For correctness, it is important to maintain the right ordering and use
> > barriers:
> >
> >         in_cons = intf->in_cons;
> >         *data = intf->in[MASK_XENCONS_IDX(in_cons, intf->in)];
> >         smp_mb();
> >         intf->in_cons = in_cons + 1;
> >
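To make the ordering concrete, here is a self-contained sketch of the
consumer side. This is a mock, not the Xen code: mock_ring and MASK_IDX
stand in for xencons_interface and MASK_XENCONS_IDX, and the GCC/Clang
builtin __sync_synchronize() stands in for smp_mb().

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 16
#define MASK_IDX(idx) ((idx) & (RING_SIZE - 1))

/* Simplified stand-in for the shared xencons_interface ring. */
struct mock_ring {
    char in[RING_SIZE];
    uint32_t in_cons, in_prod;
};

/* Full barrier as a stand-in for smp_mb(). */
#define mb() __sync_synchronize()

/* Read one byte: snapshot the index, read the slot, and only then
 * publish the new consumer index, so the producer never sees the
 * slot as free before we have finished reading it. */
static int ring_read(struct mock_ring *r, char *data)
{
    uint32_t in_cons = r->in_cons;      /* snapshot the index once */

    if ( in_cons == r->in_prod )
        return 0;                       /* ring empty */

    *data = r->in[MASK_IDX(in_cons)];   /* read the slot... */
    mb();                               /* ...before publishing */
    r->in_cons = in_cons + 1;
    return 1;
}
```

The producer side is symmetrical: fill the slot first, then smp_wmb(),
then advance the producer index.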
> >
> >> +    }
> >> +
> >> +    if ( VPL011_IN_RING_EMPTY(intf) )
> >> +    {
> >> +        vpl011->uartfr |= (RXFE);
> >> +        vpl011->uartris &= ~(RXI);
> >> +    }
> >> +
> >> +    vpl011->uartfr &= ~(RXFF);
> >> +
> >> +    VPL011_UNLOCK(d, flags);
> >
> > I think we need to notify xenconsoled here, in case the ring was
> > previously full? Now that we have read some data, xenconsoled can go
> > back and write more.
> >
> Yes this is a valid point. I will send an event here.
> 
> >
> >> +}
> >> +
> >> +static void vpl011_write_data(struct domain *d, uint8_t data)
> >> +{
> >> +    unsigned long flags;
> >> +    struct vpl011_s *vpl011=&d->arch.vpl011;
> >> +    struct xencons_interface *intf=(struct xencons_interface *)vpl011->ring_buf;
> >> +
> >> +    VPL011_LOCK(d, flags);
> >> +
> >> +    /*
> >> +     * It is expected that the ring is not full when this function is
> >> +     * called as the guest is expected to write to the data register
> >> +     * only when the TXFF flag is not set.
> >> +     * In case the guest does write even when the TXFF flag is set then
> >> +     * the data will be silently dropped.
> >> +    if ( !VPL011_OUT_RING_FULL(intf) )
> >> +    {
> >> +        intf->out[MASK_XENCONS_IDX(intf->out_prod++, intf->out)] = data;
> >> +        smp_wmb();
> >
> > Similarly, the order should be:
> >
> >     out_prod = intf->out_prod;
> >     intf->out[MASK_XENCONS_IDX(out_prod, intf->out)] = data;
> >     smp_wmb();
> >     intf->out_prod = out_prod + 1;
> >
> I will use the suggested order.
> >
> >> +    }
> >> +
> >> +    if ( VPL011_OUT_RING_FULL(intf) )
> >> +    {
> >> +        vpl011->uartfr |= (TXFF);
> >> +        vpl011->uartris &= ~(TXI);
> >> +    }
> >> +
> >> +    vpl011->uartfr |= (BUSY);
> >> +
> >> +    vpl011->uartfr &= ~(TXFE);
> >> +
> >> +    VPL011_UNLOCK(d, flags);
> >> +
> >> +    /* Raise an event to xenconsoled only if it is the first
> >> +     * character in the buffer. */
> >> +    if ( VPL011_RING_DEPTH(intf, out) == 1 )
> >> +    {
> >> +        notify_via_xen_event_channel(d,
> >> +                d->arch.hvm_domain.params[HVM_PARAM_VCONSOLE_EVTCHN]);
> >> +    }
> >
> > Why only in case it is the first character? xenconsoled will miss
> > updates if we don't send a notification, right?
> >
> Whenever xenconsoled sends an event, the vpl011_data_avail function
> makes sure that if there is data in the OUT ring buffer then it sends
> an event back to xenconsoled. This also avoids the need to send an
> event for every character written to the OUT ring buffer.

We don't have a formal protocol spec for PV console, but if we had, it
would say that the frontend (Xen in this case) should send notifications
every time there is new data to write. Thus, once per character in this
case.

I would start with a dumb implementation like that, simply notify the
other end every time. Then, in a separate patch, we can consider
introducing optimizations, which need to be well explained.


Regarding the optimization you introduced in this patch, delaying write
notifications until we receive a notification from xenconsoled, how many
notifications from Xen to xenconsoled does it actually save? xenconsoled
is going to send a notification for every read: we might end up sending
the same number of notifications, only delayed.
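To put numbers on it, here is a toy model contrasting the two policies.
This is my own mock, not the patch code: notify() stands in for
notify_via_xen_event_channel(), and the counter records the wakeups the
backend would see.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 16
#define MASK_IDX(i) ((i) & (RING_SIZE - 1))

struct out_ring {
    char out[RING_SIZE];
    uint32_t out_cons, out_prod;
    int notifications;          /* wakeups the backend would see */
};

static void notify(struct out_ring *r) { r->notifications++; }

/* "Dumb" policy: one notification per character written. */
static void write_notify_always(struct out_ring *r, char c)
{
    r->out[MASK_IDX(r->out_prod++)] = c;
    notify(r);
}

/* Patch's policy: notify only when the first character lands. */
static void write_notify_first(struct out_ring *r, char c)
{
    r->out[MASK_IDX(r->out_prod++)] = c;
    if ( r->out_prod - r->out_cons == 1 )
        notify(r);
}
```

The first-character policy sends fewer events in this toy run, but only
because the backend is then obliged to re-check the ring on every read
notification, which is exactly the delayed-notification coupling
questioned above.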


> >> +}
> >> +
> >> +static int vpl011_mmio_read(struct vcpu *v, mmio_info_t *info,
> >> +                            register_t *r, void *priv)
> >> +{
> >> +    uint8_t ch;
> >> +    struct hsr_dabt dabt = info->dabt;
> >> +    int vpl011_reg = (int)(info->gpa - GUEST_PL011_BASE);
> >> +    struct vpl011_s *vpl011 = &v->domain->arch.vpl011;
> >> +
> >> +    switch ( vpl011_reg )
> >> +    {

> >> +    case DR:
> >> +        if ( !VALID_W_SIZE(dabt.size) ) goto bad_width;
> >> +        vpl011_read_data(v->domain, &ch);
> >> +        *r = ch;
> >> +        break;
> >> +
> >> +    case RSR:
> >> +        if ( !VALID_BW_SIZE(dabt.size) ) goto bad_width;
> >> +
> >> +        /* It always returns 0 as there are no physical errors. */
> >> +        *r = 0;
> >> +        break;
> >> +
> >> +    case FR:
> >> +        if ( !VALID_BW_SIZE(dabt.size) ) goto bad_width;
> >> +        *r = (vpl011->uartfr & vpl011_reg_mask[dabt.size]);
> >> +        break;
> >> +
> >> +    case RIS:
> >> +        if ( !VALID_W_SIZE(dabt.size) ) goto bad_width;
> >> +        *r = (vpl011->uartris & vpl011_reg_mask[dabt.size]);
> >> +        break;
> >> +
> >> +    case MIS:
> >> +        if ( !VALID_W_SIZE(dabt.size) ) goto bad_width;
> >> +        *r = (vpl011->uartris & vpl011->uartimsc &
> >> +              vpl011_reg_mask[dabt.size]);
> >> +        break;
> >> +
> >> +    case IMSC:
> >> +        if ( !VALID_W_SIZE(dabt.size) ) goto bad_width;
> >> +        *r = (vpl011->uartimsc & vpl011_reg_mask[dabt.size]);
> >> +        break;
> >> +
> >> +    case ICR:
> >> +        if ( !VALID_W_SIZE(dabt.size) ) goto bad_width;
> >> +
> >> +        /* Only write is valid. */
> >> +        return 0;
> >> +
> >> +    default:
> >> +        gprintk(XENLOG_ERR, "vpl011: unhandled read r%d offset %#08x\n",
> >> +                               dabt.reg, vpl011_reg);
> >> +        return 0;
> >> +    }
> >> +
> >> +    return 1;
> >> +
> >> +bad_width:
> >> +    gprintk(XENLOG_ERR, "vpl011: bad read width %d r%d offset %#08x\n",
> >> +                       dabt.size, dabt.reg, vpl011_reg);
> >> +    domain_crash_synchronous();
> >> +    return 0;
> >> +
> >> +}
> >> +
> >> +static int vpl011_mmio_write(struct vcpu *v, mmio_info_t *info,
> >> +                             register_t r, void *priv)
> >> +{
> >> +    uint8_t ch = ((struct uartdr_reg *)&r)->data;
> >> +    struct hsr_dabt dabt = info->dabt;
> >> +    int vpl011_reg = (int)(info->gpa - GUEST_PL011_BASE);
> >> +    struct vpl011_s *vpl011 = &v->domain->arch.vpl011;
> >> +
> >> +    switch ( vpl011_reg )
> >> +    {
> >> +    case DR:
> >> +
> >> +        if ( !VALID_BW_SIZE(dabt.size) ) goto bad_width;
> >> +        vpl011_write_data(v->domain, ch);
> >> +        break;
> >> +
> >> +    case RSR: /* Nothing to clear. */
> >> +        if ( !VALID_BW_SIZE(dabt.size) ) goto bad_width;
> >> +        break;
> >> +
> >> +    case FR:
> >> +        goto write_ignore;
> >> +    case RIS:
> >> +    case MIS:
> >> +        goto word_write_ignore;
> >> +
> >> +    case IMSC:
> >> +        if ( !VALID_W_SIZE(dabt.size) ) goto bad_width;
> >> +        vpl011->uartimsc = (r & vpl011_reg_mask[dabt.size]);
> >> +        vgic_inject_vpl011_spi(v->domain);
> >> +        break;
> >> +
> >> +    case ICR:
> >> +        if ( !VALID_W_SIZE(dabt.size) ) goto bad_width;
> >> +        vpl011->uartris &= ~(r & vpl011_reg_mask[dabt.size]);
> >> +        vgic_inject_vpl011_spi(v->domain);
> >> +        break;
> >> +
> >> +    default:
> >> +        gprintk(XENLOG_ERR, "vpl011: unhandled write r%d offset %#08x\n",
> >> +                               dabt.reg, vpl011_reg);
> >> +        return 0;
> >> +    }
> >> +
> >> +    return 1;
> >> +
> >> +write_ignore:
> >> +    if ( !VALID_BW_SIZE(dabt.size) ) goto bad_width;
> >> +    return 1;
> >> +
> >> +word_write_ignore:
> >> +    if ( !VALID_W_SIZE(dabt.size) ) goto bad_width;
> >> +    return 1;
> >> +
> >> +bad_width:
> >> +    gprintk(XENLOG_ERR, "vpl011: bad write width %d r%d offset %#08x\n",
> >> +                       dabt.size, dabt.reg, vpl011_reg);
> >> +    domain_crash_synchronous();
> >> +    return 0;
> >> +
> >> +}
> >> +
> >> +static const struct mmio_handler_ops vpl011_mmio_handler = {
> >> +    .read = vpl011_mmio_read,
> >> +    .write = vpl011_mmio_write,
> >> +};
> >> +
> >> +int vpl011_map_guest_page(struct domain *d)
> >> +{
> >> +    struct vpl011_s *vpl011 = &d->arch.vpl011;
> >> +
> >> +    /* Map the guest PFN to Xen address space. */
> >> +    return prepare_ring_for_helper(d,
> >> +                                   d->arch.hvm_domain.params[HVM_PARAM_VCONSOLE_PFN],
> >> +                                   &vpl011->ring_page,
> >> +                                   &vpl011->ring_buf);
> >> +}
> >> +
> >> +static void vpl011_data_avail(struct domain *d)
> >> +{
> >> +    unsigned long flags;
> >> +    struct vpl011_s *vpl011 = &d->arch.vpl011;
> >> +    struct xencons_interface *intf=(struct xencons_interface *)vpl011->ring_buf;
> >> +
> >> +    VPL011_LOCK(d, flags);
> >> +
> >> +    /* Update the uart rx state if the buffer is not empty. */
> >> +    if ( !VPL011_IN_RING_EMPTY(intf) )
> >> +    {
> >> +        vpl011->uartfr &= ~(RXFE);
> >> +        if ( VPL011_IN_RING_FULL(intf) )
> >> +            vpl011->uartfr |= (RXFF);
> >> +        vpl011->uartris |= (RXI);
> >> +    }
> >> +
> >> +    /* Update the uart tx state if the buffer is not full. */
> >> +    if ( !VPL011_OUT_RING_FULL(intf) )
> >> +    {
> >> +        vpl011->uartfr &= ~(TXFF);
> >> +        vpl011->uartris |= (TXI);
> >> +        if ( VPL011_OUT_RING_EMPTY(intf) )
> >> +        {
> >> +            vpl011->uartfr &= ~(BUSY);
> >> +            vpl011->uartfr |= (TXFE);
> >
> > It is not a good idea to check the intf values twice: xenconsoled might
> > be modifying them, so we can get into the situation where intf->out*
> > change in between the two checks. Please read out_cons and out_prod only
> > once in this function.
> 
> Since this code is under LOCK, the IN and OUT ring buffers will not be
> updated by the guest. Specifically, the following transitions are
> ruled out:
> 
> IN ring buffer
> non-empty ----> empty (as the reader is blocked due to lock)
> 
> OUT ring buffer
> not-full ----> full (as writer is blocked due to lock).
> 
> So the code inside the if blocks remains valid even if the buffer
> state changes. The IN ring buffer can go from non-empty to full, and
> similarly the OUT ring buffer can go from full to empty.
> 
> Also, checking the latest buffer indexes (instead of checking indexes
> read into local variables) allows the pl011 state to be updated at the
> earliest opportunity.

I understand that the IN state shouldn't change. However, like you say,
the OUT state can change between the VPL011_OUT_RING_FULL and the
VPL011_OUT_RING_EMPTY checks.

It is a bad idea to try to write lockless code against a potentially
changing state. It is already hard enough without adding that
complexity; the other end can find ways to cause problems. Please take
a snapshot of the out indexes and use the local variables, rather than
"intf", to do the checks.
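A sketch of what I mean, with simplified mock structs. The names mirror
the patch, but this is not the actual Xen code: out_ring and uart_state
are stand-ins for xencons_interface and struct vpl011_s, and the
"full"/"empty" tests inline what VPL011_OUT_RING_FULL/EMPTY compute.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 16

struct out_ring { uint32_t out_cons, out_prod; };
struct uart_state { int txff, txfe, busy; };

static void update_tx_state(const struct out_ring *intf,
                            struct uart_state *u)
{
    /* One snapshot of the shared indexes... */
    uint32_t out_cons = intf->out_cons;
    uint32_t out_prod = intf->out_prod;

    /* ...and every check uses the locals, never intf again, so a
     * concurrent xenconsoled cannot change the answer between the
     * "not full" and "empty" tests. */
    if ( out_prod - out_cons != RING_SIZE )   /* ring not full */
    {
        u->txff = 0;
        if ( out_cons == out_prod )           /* ring empty */
        {
            u->busy = 0;
            u->txfe = 1;
        }
    }
}
```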


> >
> >
> >> +        }
> >> +    }
> >> +
> >> +    VPL011_UNLOCK(d, flags);
> >> +
> >> +    vgic_inject_vpl011_spi(d);
> >> +
> >> +    if ( !VPL011_OUT_RING_EMPTY(intf) )
> >> +    {
> >> +        ASSERT(d->arch.hvm_domain.params[HVM_PARAM_VCONSOLE_EVTCHN] != 0);
> >> +        notify_via_xen_event_channel(d,
> >> +                d->arch.hvm_domain.params[HVM_PARAM_VCONSOLE_EVTCHN]);
> >
> > Why are we sending notifications at this point? The out buffer hasn't
> > changed since xenconsoled sent the evtchn to Xen, so xenconsoled already
> > knows at this point that the out ring is not empty.
> 
> The OUT buffer may be empty when xenconsoled sends the event. By the
> time the event is handled, the guest may have written more data to the
> OUT buffer. This condition makes sure that if the OUT ring buffer is
> not empty then an event is sent to xenconsoled.

OUT notifications from Xen to xenconsoled are best handled where writes
actually happen: in vpl011_write_data. I could accept an optimization
like the one you introduced, but please explain the design and the
benefits in detail.


> >
> >
> >> +    }
> >> +}
> >> +
> >> +
> >> +static void vpl011_notification(struct vcpu *v, unsigned int port)
> >> +{
> >> +    vpl011_data_avail(v->domain);
> >> +}
> >> +
> >> +int domain_vpl011_init(struct domain *d,
> >> +                       struct xen_arch_domainconfig *config)
> >> +{
> >> +    int rc;
> >> +    struct vpl011_s *vpl011 = &d->arch.vpl011;
> >> +
> >> +    rc = alloc_unbound_xen_event_channel(d, 0, config->console_domid,
> >> +                                         vpl011_notification);
> >> +    if (rc < 0)
> >> +    {
> >> +        return rc;
> >> +    }
> >> +    d->arch.hvm_domain.params[HVM_PARAM_VCONSOLE_EVTCHN] = rc;
> >> +    rc = vgic_reserve_virq(d, GUEST_VPL011_SPI);
> >> +    if ( !rc )
> >> +        return rc;
> >
> > We should free the allocated evtchn in case of error here
> I will free the event channel.
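For illustration, the error-path shape I have in mind, as a
self-contained mock: alloc_chan, free_chan and reserve_virq are
hypothetical stand-ins for alloc_unbound_xen_event_channel,
free_xen_event_channel and vgic_reserve_virq, and the counter only
exists so the mock can check itself.

```c
#include <assert.h>

static int chans_alive;   /* tracks unreleased mock event channels */

static int alloc_chan(void)     { chans_alive++; return 7; }
static void free_chan(int c)    { (void)c; chans_alive--; }
static int reserve_virq(int ok) { return ok ? 0 : -1; }

/* If the second allocation fails, the first resource (the event
 * channel) must be released before returning the error. */
static int vpl011_init(int virq_ok)
{
    int rc, chan = alloc_chan();

    rc = reserve_virq(virq_ok);
    if ( rc )
    {
        free_chan(chan);   /* undo the earlier allocation on failure */
        return rc;
    }
    return 0;
}
```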


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

