|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Minios-devel] [UNIKRAFT PATCH 2/2] plat/kvm: Introducing virtio base driver
Hi Sharan,
Thanks a lot for comments and in-depth analysis! Please see my replies
inline.
On 06/29/2018 05:14 PM, Sharan Santhanam wrote:
> Hello,
>
> Please find some of the comments inline:
>
>
> On 06/27/2018 12:59 PM, Costin Lupu wrote:
>> From: Costin Lupu <costin.lup@xxxxxxxxx>
>>
>> Currently, the virtio base driver contains the implementation only for
>> virtio rings. The implementation was ported from Solo5 and adapted to
>> Unikraft APIs. Subsequent virtio drivers should depend on this base
>> driver.
>>
>> This patch also introduces the plat/drivers/ directory which should
>> contain the drivers implementations which may be used by more than a
>> single platform.
>>
>> Signed-off-by: Costin Lupu <costin.lupu@xxxxxxxxx>
>> ---
>> plat/Makefile.uk | 1 +
>> plat/common/include/pci/virtio/virtio_pci.h | 66 +++++++
>> plat/common/include/pci/virtio/virtio_ring.h | 257
>> +++++++++++++++++++++++++++
>> plat/drivers/virtio/virtio_ring.c | 153 ++++++++++++++++
>> plat/kvm/Config.uk | 8 +
>> plat/kvm/Makefile.uk | 8 +
>> 6 files changed, 493 insertions(+)
>> create mode 100644 plat/common/include/pci/virtio/virtio_pci.h
>> create mode 100644 plat/common/include/pci/virtio/virtio_ring.h
>> create mode 100644 plat/drivers/virtio/virtio_ring.c
>>
>> diff --git a/plat/Makefile.uk b/plat/Makefile.uk
>> index 6ff632c..72155bd 100644
>> --- a/plat/Makefile.uk
>> +++ b/plat/Makefile.uk
>> @@ -1,5 +1,6 @@
>> UK_PLAT_BASE := $(CONFIG_UK_BASE)/plat
>> UK_PLAT_COMMON_BASE := $(UK_PLAT_BASE)/common
>> +UK_PLAT_DRIVERS_BASE:= $(UK_PLAT_BASE)/drivers
>> $(eval $(call _import_lib,$(UK_PLAT_BASE)/xen))
>> $(eval $(call _import_lib,$(UK_PLAT_BASE)/kvm))
>> diff --git a/plat/common/include/pci/virtio/virtio_pci.h
>> b/plat/common/include/pci/virtio/virtio_pci.h
>> new file mode 100644
>> index 0000000..02bad4c
>> --- /dev/null
>> +++ b/plat/common/include/pci/virtio/virtio_pci.h
>> @@ -0,0 +1,66 @@
>> +/* SPDX-License-Identifier: ISC */
>> +/*
>> + * Authors: Dan Williams
>> + * Costin Lupu <costin.lupu@xxxxxxxxx>
>> + *
>> + * Copyright (c) 2015, IBM
>> + * Copyright (c) 2018, NEC Europe Ltd., NEC Corporation
>> + *
>> + * Permission to use, copy, modify, and/or distribute this software
>> + * for any purpose with or without fee is hereby granted, provided
>> + * that the above copyright notice and this permission notice appear
>> + * in all copies.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
>> + * WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
>> + * WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
>> + * AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR
>> + * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
>> + * OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
>> + * NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
>> + * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
>> + */
>> +/* Taken and adapted from solo5 virtio_pci.h */
>> +
>> +#ifndef __PLAT_CMN_PCI_VIRTIO_PCI_H__
>> +#define __PLAT_CMN_PCI_VIRTIO_PCI_H__
>> +
>> +#include <uk/arch/limits.h>
>> +
> Do we protect (MARCO definition) for the legacy mode of virtio vs
> nonlegacy virtio mode? As the current layout support for legacy mode.
Ack, I will add a TODO comment to emphasize that.
>> +/* virtio config space layout */
>> +#define VIRTIO_PCI_HOST_FEATURES 0 /* 32-bit r/o */
>> +#define VIRTIO_PCI_GUEST_FEATURES 4 /* 32-bit r/w */
>> +#define VIRTIO_PCI_QUEUE_PFN 8 /* 32-bit r/w */
>> +#define VIRTIO_PCI_QUEUE_SIZE 12 /* 16-bit r/o */
>> +#define VIRTIO_PCI_QUEUE_SEL 14 /* 16-bit r/w */
>> +#define VIRTIO_PCI_QUEUE_NOTIFY 16 /* 16-bit r/w */
>> +
>> +/*
>> + * Shift size used for writing physical queue address to QUEUE_PFN
>> + */
>
> The spec explicitly mentions the size of the page size to be 4096 in
> section 4.1.5.1.5.4.1.
>
> Should we not hard code it to 12 instead of combining it with
> __PAGE_SHIFT as there is also virtio-PCI which is driving this
> configuration?
Ack, I will replace it with 12.
>> +#define VIRTIO_PCI_QUEUE_ADDR_SHIFT __PAGE_SHIFT
>> +
>> +
>> +/*
>> + * The status register lets us tell the device where we are in
>> + * initialization
>> + */
>> +#define VIRTIO_PCI_STATUS 18 /* 8-bit r/w */
>> +#define VIRTIO_PCI_STATUS_ACK 0x1 /* we recognize device
>> as virtio */
>> +#define VIRTIO_PCI_STATUS_DRIVER 0x2 /* we want to drive it */
>> +#define VIRTIO_PCI_STATUS_DRIVER_OK 0x4 /* initialization is
>> complete */
> Missing definition for the device reset or device error flag (0x40). Do
> we need to care for the device status?
Ack for reset flag. Yep, we'll need these in the virtio drivers.
>> +#define VIRTIO_PCI_STATUS_FAIL 0x80 /* tell device
>> something's wrong */
>> +
>> +/*
>> + * Reading the value will return the current contents of the interrupt
>> + * status register and will also clear it. This is effectively a
>> + * read-and-acknowledge.
>> + */
>> +#define VIRTIO_PCI_ISR 19 /* 8-bit r/o */
>> +#define VIRTIO_PCI_ISR_HAS_INTR 0x1 /* interrupt is for this
>> device */
>> +#define VIRTIO_PCI_ISR_CONFIG 0x2 /* config change bit */
>> +
>
> Do we also make assumption on MSI more explicitly?
Ack, I'll add a TODO comment for revisiting this when MSI will be
implemented.
>> +/* xxx assuming msi is not configured */
>> +#define VIRTIO_PCI_CONFIG_OFF 20
>> +
>> +#endif /* __PLAT_CMN_PCI_VIRTIO_PCI_H__ */
>> diff --git a/plat/common/include/pci/virtio/virtio_ring.h
>> b/plat/common/include/pci/virtio/virtio_ring.h
>> new file mode 100644
>> index 0000000..549ae54
>> --- /dev/null
>> +++ b/plat/common/include/pci/virtio/virtio_ring.h
>> @@ -0,0 +1,257 @@
>
> As a general comment, we should use the virtio_ring.h from the official
> specification. It is available in the linux kernel repo
> include/uapi/linux/virtio_ring.h under BSD License.
I totally agree that the Linux variant is better. However, given the
short time, we'll stick to the Solo5 variant. I added a reminding TODO
comment in the header.
>> +/* SPDX-License-Identifier: BSD-3-Clause */
>> +/*
>> + * Authors: Costin Lupu <costin.lupu@xxxxxxxxx>
>> + *
>> + * Copyright (c) 2018, NEC Europe Ltd., NEC Corporation. All rights
>> reserved.
>> + *
>> + * Redistribution and use in source and binary forms, with or without
>> + * modification, are permitted provided that the following conditions
>> + * are met:
>> + *
>> + * 1. Redistributions of source code must retain the above copyright
>> + * notice, this list of conditions and the following disclaimer.
>> + * 2. Redistributions in binary form must reproduce the above copyright
>> + * notice, this list of conditions and the following disclaimer in
>> the
>> + * documentation and/or other materials provided with the
>> distribution.
>> + * 3. Neither the name of the copyright holder nor the names of its
>> + * contributors may be used to endorse or promote products derived
>> from
>> + * this software without specific prior written permission.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
>> CONTRIBUTORS "AS IS"
>> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
>> TO, THE
>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
>> PURPOSE
>> + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
>> CONTRIBUTORS BE
>> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
>> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
>> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
>> BUSINESS
>> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
>> WHETHER IN
>> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
>> OTHERWISE)
>> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
>> ADVISED OF THE
>> + * POSSIBILITY OF SUCH DAMAGE.
>> + *
>> + * THIS HEADER MAY NOT BE EXTRACTED OR MODIFIED IN ANY WAY.
>> + */
>> +/* An interface for efficient virtio implementation.
>> + *
>> + * This header is BSD licensed so anyone can use the definitions
>> + * to implement compatible drivers/servers.
>> + *
>> + * Copyright 2007, 2009, IBM Corporation
>> + * Copyright 2011, Red Hat, Inc
>> + * All rights reserved.
>> + *
>> + * Redistribution and use in source and binary forms, with or without
>> + * modification, are permitted provided that the following conditions
>> + * are met:
>> + * 1. Redistributions of source code must retain the above copyright
>> + * notice, this list of conditions and the following disclaimer.
>> + * 2. Redistributions in binary form must reproduce the above copyright
>> + * notice, this list of conditions and the following disclaimer in
>> the
>> + * documentation and/or other materials provided with the
>> distribution.
>> + * 3. Neither the name of IBM nor the names of its contributors
>> + * may be used to endorse or promote products derived from this
>> software
>> + * without specific prior written permission.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
>> CONTRIBUTORS "AS IS"
>> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
>> TO, THE
>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
>> PURPOSE
>> + * ARE DISCLAIMED. IN NO EVENT SHALL IBM OR CONTRIBUTORS BE
>> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
>> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
>> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
>> BUSINESS
>> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
>> WHETHER IN
>> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
>> OTHERWISE)
>> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
>> ADVISED OF THE
>> + * POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +#ifndef __PLAT_CMN_PCI_VIRTIO_RING_H__
>> +#define __PLAT_CMN_PCI_VIRTIO_RING_H__
>> +
>> +#include <uk/arch/limits.h>
>> +#include <uk/alloc.h>
>> +
>> +/* This marks a buffer as continuing via the next field. */
>> +#define VIRTQ_DESC_F_NEXT 1
>> +/* This marks a buffer as write-only (otherwise read-only). */
>> +#define VIRTQ_DESC_F_WRITE 2
>> +/* This means the buffer contains a list of buffer descriptors. */
>> +#define VIRTQ_DESC_F_INDIRECT 4
>> +
>> +/*
>> + * The device uses this in used->flags to advise the driver:
>> + * don't kick me when you add a buffer. It's unreliable, so
>> + * it's simply an optimization.
>> + */
>> +#define VIRTQ_USED_F_NO_NOTIFY 1
>> +
>> +/*
>> + * The driver uses this in avail->flags to advise the device:
>> + * don't interrupt me when you consume a buffer. It's unreliable, so
>> + * it's simply an optimization.
>> + */
>> +#define VIRTQ_AVAIL_F_NO_INTERRUPT 1
>> +
>> +/* Support for indirect descriptors */
>> +#define VIRTIO_F_INDIRECT_DESC 28
>> +
>> +/* Support for avail_event and used_event fields */
>> +#define VIRTIO_F_EVENT_IDX 29
>> +
>> +/* Arbitrary descriptor layouts. */
>> +#define VIRTIO_F_ANY_LAYOUT 27
>> +
>> +
>> +/*
>> + * Virtqueue descriptors: 16 bytes.
>> + * These can chain together via "next".
>> + */
> It is better to explicitly use virtio based datatype instead of generic
> u64, u32, u16 because in the specification it is mentioned that the data
> type to be little endian/guest endianess for different layout. We should
> be careful on the use of these data types.
>
> It may also be useful to provide conversion functions for guest to host
> endianess.
Ack. I added new types for little endianess (__virtio_le{16,32,64}) to
clarify which is which. For big endianess, the conversion functions
should be implemented in the future. I added a preprocessor validation
for that.
>> +struct virtq_desc {
>> + /* Address (guest-physical). */
>> + __u64 addr;
>> + /* Length. */
>> + __u32 len;
>> + /* The flags as indicated above. */
>> + __u16 flags;
>> + /* We chain unused descriptors via this, too */
>> + __u16 next;
>> +};
>> +
>> +struct virtq_avail {
>> + __u16 flags;
>> + __u16 idx;
>> + __u16 ring[];
>> + /* Only if VIRTIO_F_EVENT_IDX: __u16 used_event; */
>> +};
>> +
>> +/* __u32 is used here for ids for padding reasons. */
>> +struct virtq_used_elem {
>> + /* Index of start of used descriptor chain. */
>> + __u32 id;
>> + /* Total length of the descriptor chain which was written to. */
>> + __u32 len;
>> +};
>> +
>> +struct virtq_used {
>> + __u16 flags;
>> + volatile __u16 idx;
>> + struct virtq_used_elem ring[];
>> + /* Only if VIRTIO_F_EVENT_IDX: __u16 avail_event; */
>> +};
>> +
>
> Why did we choose 1526 as the length? Is it based on the size of the
> Ethernet frame size of 1514 and 12 for the meta data? If yes, should we
> make this choice based on this use case?
Yes, it's from the frame size + meta. I added a TODO comment in the code
to clarify that and to outline it should be changed to support buffer
sizes defined by the drivers.
>> +/* This is the max buffer length per descriptor. */
>> +#define MAX_BUFFER_LEN 1526
>> +
>> +/*
>> + * Each one of these io_buffer's map to a descriptor.
>> + * An array of io_buffer's of size virtq->num (same
>> + * as virtq->desc) is allocated during init.
>> + */
>> +struct io_buffer {
>> + __u8 data[MAX_BUFFER_LEN];
>> +
>> + /*
>> + * Data length in bytes. It is written by the driver on
>> + * a tx/write, or by the device on a rx/read on interrupt
>> + * handling (do not remove the volatile).
>> + */
>> + volatile __u32 len;
>> +
>> + /* Extra flags to be added to the corresponding descriptor. */
>> + __u16 extra_flags;
>> +};
>> +
>
> We should also consider packing the structure as we want a specific
> layout of memory allocation. As mentioned in the spec [1].
The layout structure follows the specification if one uses the helper
macros below to initialize the structure fields, therefore there is no
need for structure packing.
>> +struct virtq {
>> + unsigned int num;
>> +
>> + struct virtq_desc *desc;
>> + struct virtq_avail *avail;
>> + struct virtq_used *used;
>> + struct io_buffer *bufs;
>> +
>> + /* Keep track of available (free) descriptors */
>> + __u16 num_avail;
>> +
>> + /* Indexes in the descriptors array */
>> + __u16 last_used;
>> + __u16 next_avail;
>> +};
>> +
>> +/*
>> + * Helper macros for accessing virtqueue fields
>> + */
>> +
>> +#define VIRTQ_OFF_DESC(q) 0
>> +
>> +#define VIRTQ_OFF_AVAIL(q) ((q)->num * sizeof(struct virtq_desc))
>> +#define VIRTQ_OFF_AVAIL_RING(q) \
>> + (VIRTQ_OFF_AVAIL(q) + sizeof(struct virtq_avail))
>> +
>> +#define VIRTQ_OFF_PADDING(q) \
>> + (VIRTQ_OFF_AVAIL_RING(q) + (sizeof(__u16) * (q)->num))
>> +
>> +#define VIRTQ_OFF_USED(q) \
>> + ((VIRTQ_OFF_PADDING(q) + __PAGE_SIZE - 1) & __PAGE_MASK)
>> +#define VIRTQ_OFF_USED_RING(q) \
>> + (VIRTQ_OFF_USED(q) + sizeof(struct virtq_used))
>> +
>> +#define VIRTQ_SIZE(q) \
>> + (VIRTQ_OFF_USED_RING(q) + (sizeof(struct virtq_used_elem) *
>> (q)->num))
>> +
>> +static inline
>> +int virtq_need_event(__u16 event_idx, __u16 new_idx, __u16 old_idx)
>> +{
>> + return (new_idx - event_idx - 1) < (new_idx - old_idx);
>> +}
>> +
>> +/* Get location of event indices (only with VIRTIO_F_EVENT_IDX) */
>> +static inline __u16 *virtq_used_event(struct virtq *vq)
>> +{
>> + /*
>> + * For backwards compatibility, used event index
>> + * is at *end* of avail ring.
>> + */
>> + return &vq->avail->ring[vq->num];
>> +}
>> +
>> +static inline __u16 *virtq_avail_event(struct virtq *vq)
>> +{
>> + /*
>> + * For backwards compatibility, avail event index
>> + * is at *end* of used ring.
>> + */
>> + return (__u16 *) &vq->used->ring[vq->num];
>> +}
>> +
>> +/*
>> + * Create a descriptor chain starting at index head,
>> + * using vq->bufs also starting at index head.
>> + * @param vq Virtual queue
>> + * @param head Starting descriptor index
>> + * @param num Number of descriptors (and number of bufs).
>> + * @return 0 on success, < 0 otherwise
>> + */
>> +int virtq_add_descriptor_chain(struct virtq *vq,
>> + __u16 head, __u16 num);
>> +
>> +/*
>> + * Initializes a virtual queue
>> + * @param vq Virtual queue
>> + * @param pci_base Base in PCI configuration space
>> + * @param selector Virtual queue selector
>> + * @param a Memory allocator
>> + * @return 0 on success, < 0 otherwise
>> + */
>> +int virtq_rings_init(struct virtq *vq, __u16 pci_base, int selector,
>> + struct uk_alloc *a);
>> +
>> +/*
>> + * Deinitializes a virtual queue
>> + * @param vq Virtual queue
>> + * @param pci_base Base in PCI configuration space
>> + * @param selector Virtual queue selector
>> + * @param a Memory allocator
>> + */
>> +void virtq_rings_fini(struct virtq *vq, __u16 pci_base, int selector,
>> + struct uk_alloc *a);
>> +
>> +#endif /* __PLAT_CMN_PCI_VIRTIO_RING_H__ */
>> diff --git a/plat/drivers/virtio/virtio_ring.c
>> b/plat/drivers/virtio/virtio_ring.c
>> new file mode 100644
>> index 0000000..c82cc86
>> --- /dev/null
>> +++ b/plat/drivers/virtio/virtio_ring.c
>> @@ -0,0 +1,153 @@
>> +/* SPDX-License-Identifier: ISC */
>> +/*
>> + * Authors: Dan Williams
>> + * Martin Lucina
>> + * Ricardo Koller
>> + * Costin Lupu <costin.lupu@xxxxxxxxx>
>> + *
>> + * Copyright (c) 2015-2017 IBM
>> + * Copyright (c) 2016-2017 Docker, Inc.
>> + * Copyright (c) 2018, NEC Europe Ltd., NEC Corporation
>> + *
>> + * Permission to use, copy, modify, and/or distribute this software
>> + * for any purpose with or without fee is hereby granted, provided
>> + * that the above copyright notice and this permission notice appear
>> + * in all copies.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
>> + * WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
>> + * WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
>> + * AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR
>> + * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
>> + * OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
>> + * NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
>> + * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
>> + */
>> +/* Taken and adapted from solo5 virtio_ring.c */
>> +
>> +#include <string.h>
>> +#include <uk/print.h>
>> +#include <cpu.h>
>> +#include <pci/virtio/virtio_pci.h>
>> +#include <pci/virtio/virtio_ring.h>
>> +
>
> In section 3.2.1, the specification mention a max size of 32768.
Ack.
>> +/*
>> + * There is no official max queue size. But we've seen 4096,
>> + * so let's use the double of that.
>> + */
>> +#define VIRTQ_MAX_QUEUE_SIZE (2 * __PAGE_SIZE)
>> +
>> +
>> +/*
>> + * Create a descriptor chain starting at index head, using vq->bufs
>> + * also starting at index head.
>> + * Make sure the vq-bufs are cleaned before using them again.
>> + */
>> +int virtq_add_descriptor_chain(struct virtq *vq, __u16 head, __u16 num)
>> +{
>> + struct virtq_desc *desc;
>> + __u16 used_descs, mask, i;
>> +
>> + UK_ASSERT(vq != NULL);
>> + UK_ASSERT(num > 0);
>> +
>> + if (vq->num_avail < num) {
>> + uk_printd(DLVL_WARN, "virtq full! next_avail:%d last_used:%d\n",
>> + vq->next_avail, vq->last_used);
>
> Should it not be -ENOMEM or -ENOSPC?
Actually, num might have invalid values when `num > vq->num`. I refined
the validations and replaced the error code here with -ENOMEM to report
there is no space only if `num <= vq->num && q->num_avail < num`.
>> + return -EINVAL;
>> + }
>> +
>> + used_descs = num;
>> + mask = vq->num - 1;
>> +
>> + for (i = head; used_descs > 0; used_descs--) {
>> + struct io_buffer *buf = &vq->bufs[i];
>> +
>> + /*
>> + * The first field of a "struct io_buffer" is the "data" field,
>> + * so in the interrupt handler we can just cast this pointer
>> + * back into a 'struct io_buffer'.
>> + */
>> + UK_ASSERT(buf->data == (__u8 *) buf);
>> + UK_ASSERT(buf->len <= MAX_BUFFER_LEN);
>
> Do we assert here or report an error to an upper layer? ASSERT might be
> skipped in the optimized code?
With the current implementation, if the assertions are not satisfied
then the buffer is corrupted. This comment raises a new concern here:
how should we do the error handling in Unikraft? How tolerant should we
be in situations like this. We should define some directions for error
handling and document them.
>> +
>> + desc = &vq->desc[i];
>> + desc->addr = (__u64) buf->data;
>> + desc->len = buf->len;
>> + desc->flags = VIRTQ_DESC_F_NEXT | buf->extra_flags;
>> +
>> + i = (i + 1) & mask;
>> + desc->next = i;
>> + }
>> +
>> + /* The last descriptor in the chain does not have a next */
>> + desc->next = 0;
>> + desc->flags &= ~VIRTQ_DESC_F_NEXT;
>> +
>> + vq->num_avail -= num;
>
> I think we need a write memory barrier before the next statement as we
> transferring the buffer to the virtio device which could run on a SMP
> host. What do you think?
Ack.
>> + /* Memory barriers should be unnecessary with one processor */
>> + vq->avail->ring[vq->avail->idx & mask] = head;
>> + /* avail->idx always increments and wraps naturally at 65536 */
>> + vq->avail->idx++;
>> + vq->next_avail += num;
>> +
>> + return 0;
>> +}
>> +
>> +int virtq_rings_init(struct virtq *vq, __u16 pci_base, int selector,
>> + struct uk_alloc *a)
>> +{
>> + __u8 *data;
>> + __sz vq_size;
>> +
>> + UK_ASSERT(vq != NULL);
>> + UK_ASSERT(a != NULL);
>> +
>> + vq->last_used = vq->next_avail = 0;
>> +
>> + /* read queue size */
>> + outw(pci_base + VIRTIO_PCI_QUEUE_SEL, selector);
>> + vq->num = vq->num_avail = inw(pci_base + VIRTIO_PCI_QUEUE_SIZE);
>
> Check for zero on the num, as it is a check if the queue exist or not.
>
> For better alignment, it may be wise to check for a power of two.
Ack. I'll leave the power-of-two check for the future updates.
>> +
>> + UK_ASSERT(vq->num <= VIRTQ_MAX_QUEUE_SIZE);
>
> The specification provides a cleaner implementation for calculating the
> size of the queue. Personally, I would use that implementation as it
> improve on readability.
We shall change this when we'll adopt the Linux virtio_ring.h.
>> +
>> + vq_size = VIRTQ_SIZE(vq);
>
> Initialize data variable as the memalign return error without modifying
> the data variable.
Ack.
>> +
>> + /* allocate queue memory */
>> + uk_posix_memalign_ifpages(a, (void **) &data, __PAGE_SIZE, vq_size);
>> + if (!data)
>> + return -ENOMEM;
>> +
>> + memset(data, 0, vq_size);
>> +
>> + vq->desc = (struct virtq_desc *) (data + VIRTQ_OFF_DESC(vq));
>> + vq->avail = (struct virtq_avail *) (data + VIRTQ_OFF_AVAIL(vq));
>> + vq->used = (struct virtq_used *) (data + VIRTQ_OFF_USED(vq));
>> +
>> + /* set queue memory */
>> + outw(pci_base + VIRTIO_PCI_QUEUE_SEL, selector);
>
> The specification mentions about providing the physical address of the
> guest OS. We are currently assigning the address return from malloc,
> which will not work always.
Yep, good catch. I added a TODO comment since we don't have a function
converting virtual addresses to physical addresses.
>> + outl(pci_base + VIRTIO_PCI_QUEUE_PFN,
>> + (__u64) data >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
>> +
>> + return 0;
>> +}
>> +
>> +void virtq_rings_fini(struct virtq *vq, __u16 pci_base, int selector,
>> + struct uk_alloc *a)
>> +{
>> + __u8 *data;
>> +
>> + UK_ASSERT(vq != NULL);
>> + UK_ASSERT(a != NULL);
>> +
>> + /* reset queue memory */
>> + outw(pci_base + VIRTIO_PCI_QUEUE_SEL, selector);
>> + outl(pci_base + VIRTIO_PCI_QUEUE_PFN, 0);
>> +
>> + /* free queue memory */
>> + data = (__u8 *) vq->desc - VIRTQ_OFF_DESC(vq->num);
>> + uk_free(a, data);
>> +
>> + /* cleanup the queue */
>> + memset(vq, 0, sizeof(*vq));
>> +}
>> diff --git a/plat/kvm/Config.uk b/plat/kvm/Config.uk
>> index 967a07b..893de74 100644
>> --- a/plat/kvm/Config.uk
>> +++ b/plat/kvm/Config.uk
>> @@ -17,4 +17,12 @@ config KVM_PCI
>> select LIBUKBUS
>> help
>> PCI bus driver for probing and operating PCI devices
>> +
>> +if (KVM_PCI)
>> +config KVM_PCI_VIRTIO
>> + bool "Virtio Drivers"
>> + default n
>> + help
>> + Virtio drivers for KVM
>> +endif
>> endif
>> diff --git a/plat/kvm/Makefile.uk b/plat/kvm/Makefile.uk
>> index c1e79a2..1ae1b08 100644
>> --- a/plat/kvm/Makefile.uk
>> +++ b/plat/kvm/Makefile.uk
>> @@ -8,6 +8,7 @@ $(eval $(call addplat_s,kvm,$(CONFIG_PLAT_KVM)))
>> ##
>> $(eval $(call addplatlib,kvm,libkvmplat))
>> $(eval $(call addplatlib_s,kvm,libkvmpci,$(CONFIG_KVM_PCI)))
>> +$(eval $(call
>> addplatlib_s,kvm,libkvmpcivirtio,$(CONFIG_KVM_PCI_VIRTIO)))
>> ##
>> ## Platform library definitions
>> @@ -46,3 +47,10 @@ LIBKVMPLAT_SRCS-y +=
>> $(UK_PLAT_COMMON_BASE)/memory.c|common
>> LIBKVMPCI_ASINCLUDES-$(CONFIG_ARCH_X86_64) +=
>> -I$(UK_PLAT_COMMON_BASE)/include
>> LIBKVMPCI_CINCLUDES-$(CONFIG_ARCH_X86_64) +=
>> -I$(UK_PLAT_COMMON_BASE)/include
>> LIBKVMPCI_SRCS-y +=
>> $(UK_PLAT_COMMON_BASE)/pci_bus.c|common
>> +
>> +##
>> +## Virtio library definitions
>> +##
>> +LIBKVMPCIVIRTIO_ASINCLUDES-y += -I$(UK_PLAT_COMMON_BASE)/include
>> +LIBKVMPCIVIRTIO_CINCLUDES-y += -I$(UK_PLAT_COMMON_BASE)/include
>> +LIBKVMPCIVIRTIO_SRCS-y +=
>> $(UK_PLAT_DRIVERS_BASE)/virtio/virtio_ring.c
>>
>
> We may also need to append to export symbols for the api functions as
> virtio_ring expose function to be used by other libraries.
Let's wait until we'll have some clearer indications on that matter.
> Reference
> [1] http://docs.oasis-open.org/virtio/virtio/v1.0/cs01/virtio-v1.0-cs01.pdf
>
> Thanks & Regards
> Sharan Santhanam
_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/minios-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |