|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] libbxl: add support for pvscsi, iteration 1
On Fri, May 02, Ian Campbell wrote:
> > The backend driver uses a single_host:many_devices notation to manage
> > domU devices. The xenstore layout looks like this:
>
> Please can you add ~/device/vscsi/$DEVID/* and
> ~/backend/vif/$DOMID/$DEVID/* to docs/misc/xenstore-paths.markdown.
>
> If you were inclined to improve xen/include/public/io/vscsiif.h by using
> some of the content from here then that would be awesome too.
I think its a good idea to document what the current backend/frontend
code does, as this will also aid with review of the libxl changes.
> > This is a challenge for libxl because libxl currently handles only
> > single_host:single_device in its device helper functions.
>
> This is a bit like pci I suppose?
Maybe. But today Ondrej found a way to attach additional devices,
detaching them will most likely work as well. Basically one creates the
new vscsi-devs/dev-N/ subdir with state==1, then write state==7 to the
backend. Similar with detach, state==5 to the dev-N, then state==7 to
the backend.
> > Due to this
> > limitation in libxl, scsi-detach will remove the whole virtual scsi host
> > with all its virtual devices, instead of just the requested single
> > virtual device.
>
> That's rather unfortunate!
>
> Are multiple <vhost> allowed/supported? e.g. could I (or better, libxl
> by default) setup a device-per-host situation to alleviate this?
>
> For pcifront/back we support bus reconfiguration (via
> XenbusStateReconfigur{ing,ed}) which is how this works there. Does vscsi
> not support anything like that?
Many vhosts are supported, I just did not realize that reconfigure was
used here as well. So maybe libxl does not need further changes. I will
hack some code to show how attach/detach works in theory.
> > To support the pdev notation '/dev/$device' it is required to parse
> > sysfs to translate the path to HST:CHN:TGT:LUN notation. In its current
> > variant /sys/dev/ is used, which is available since at least Linux 3.0.
> > Support for dom0 kernels which lack this interface will be added later.
>
> Are there any such kernels which anyone cares about any more?
I have to check when /sys/dev was introduced. In the end it would be
nice to support also older dom0 kernels.
> > - xl scsi-detach will remove entire scsi hosts, unlike its xm counter part.
>
> What did xm do instead? Fail?
No, it will use the xenstore reconfigure state to indicate a change.
> > Note that the latter format is unreliable because
> > +the HOST value can change across dom0 reboots.
>
> /dev/sgN would be unreliable too I think? As might block devices not
> using the /dev/disk/by-* links you give in the example. You might be
> best saying that only /dev/disk/by-* block device references are
> reliable because all of the others might change over a reboot.
I will reword this part.
One customer reported that his SCSI device was not working with xend
because this device was accessible only via a char device. And I think
udev was able to create some by-id path for the sgN node. Have to dig
into that old report.
> > +
> > +=item C<vdev>
> > +
> > +Specifies how the SCSI device is mapped into the guest. The notation is in
> > +SCSI notation HOST:CHANNEL:TARGET:LUN. HOST in this case means a virthal
>
> Typo "virtual"
>
> > +SCSI host within the guest.
>
> All of HOST:CHANNEL:TARGET:LUN are numbers I think, are they
> hex/dec/oct? Are there limits?
I have to find out, its most likely all decimal.
> > +Right now only one option is recognized: feature-host.
> What does it do?
I have to find out ;-)
Last time I checked the command was passed as is from domU to dom0, but
I dont really know as it is not documented.
> > +=item B<scsi-attach> I<domain-id> I<pdev> I<vdev> I<[,feature-host]>
>
> Is the comma in I<[,feature-host]> literally required, as in
> xl scsi-attach mydom 0:0:0:0 0:0:0:0 ,feature-host
> ?
This is a typo I noticed after submission.
Not sure how to specify the option. Either a '0:0:0:0,feature-host' or
as two strings.
> > +Removes the vscsi device from domain specified by I<domain-id>.
> > +Note that the whole virtual SCSI host with all its devices is removed.
> > +This is a BUG!
>
> I have a feeling it would be preferable in the first instance to either
> just not provide the detach portion of this interface, or to require
> that vdev is only a host and not a specific device (... unless the host
> only has a single device, but maybe that's getting too far)
Attach/detach can probably be implemented without much work.
> > +++ b/tools/libxl/libxl.c
> > +
> > +void libxl__device_vscsi_add(libxl__egc *egc, uint32_t domid,
> > + libxl_device_vscsi *vscsi,
> > + libxl__ao_device *aodev)
> > +{
> > + STATE_AO_GC(aodev->ao);
> > + flexarray_t *front;
> > + flexarray_t *back;
> > + libxl__device *device;
> > + unsigned int rc, i;
> > +
> > + i = 2 * (4 + (3 * vscsi->num_vscsi_devs));
>
> That's quite exciting ;-)
>
> A flexarray will grow as neccessary so some sort of static best guest of
> a starting point would be fine here I think (or a comment about 2 4 and
> 3)!
Yes, that was just an attempt to avoid realloc. As usual, optimization
of non-critical code paths...
> I suppose the num_vscsi_devs here is an artefact of the short coming you
> mentioned before. The way PCI handles this is to create the "bus" (aka
> vhost here) either when the first device is attached or in the higher
> level code before it calls the individual device adds (i.e. this
> function), or something (I must confess my memory is a bit fuzzy,
> libxl__create_pci_backend might be a place to look). Having done that
> then libxl_device_pci is just a single device.
>
> I think you should look to do something similar here, so that each
> libxl_device_vscsi is actually a single device. The alternative is some
> sort of libxl_bus_vscsi thing, which would be unconventional compared
> with everything else
The thing with host:dev[,dev[,dev]] is that within the guest something
may rely on the way SCSI devices are presented (ordering, naming, no
idea). Surely the vscsi= and scsi-attach/detach parsing code could do
some sort of translation from a group of devices-per-vhost into a single
vhost:dev. But now that we have found that reconfiguration is used in
xenstore to attach/detach devices it may be possible to reuse existing
libxl helpers while preserving existing domU.cfg files and backend
drivers.
> > +libxl_device_vscsi *libxl_device_vscsi_list(libxl_ctx *ctx, uint32_t
> > domid, int *num)
> > +{
> > + GC_INIT(ctx);
> > +
> > + libxl_device_vscsi *vscsi_hosts = NULL;
> > + char *fe_path;
> > + char **dir;
> > + unsigned int ndirs = 0;
> > +
> > + fe_path = libxl__sprintf(gc, "%s/device/vscsi",
> > libxl__xs_get_dompath(gc, domid));
>
> Need to take care trusting this too much...
Not sure what you mean with this sentence?
> > @@ -3520,6 +3713,10 @@ DEFINE_DEVICE_REMOVE(vfb, destroy, 1)
> > DEFINE_DEVICE_REMOVE(vtpm, remove, 0)
> > DEFINE_DEVICE_REMOVE(vtpm, destroy, 1)
> >
> > +/* vscsi */
> > +DEFINE_DEVICE_REMOVE(vscsi, remove, 0)
> > +DEFINE_DEVICE_REMOVE(vscsi, destroy, 0)
>
> Should the second one not be (vscsi, destroy, 1) ?
Right.
> > +libxl_device_vscsi = Struct("device_vscsi", [
> > + ("backend_domid", libxl_domid),
> > + ("devid", libxl_devid),
> > + ("v_hst", uint32),
> > + ("vscsi_devs", Array(libxl_vscsi_dev, "num_vscsi_devs")),
> > + ("feature_host", bool),
>
> Not defbool (I don't know what this does...)
This is just flag for "feature-host" yes/no. Have to find out what this
backend option actually does.
> > @@ -508,6 +528,28 @@ libxl_vtpminfo = Struct("vtpminfo", [
> > ("uuid", libxl_uuid),
> > ], dir=DIR_OUT)
> >
> > +libxl_vscsiinfo = Struct("vscsiinfo", [
> > + ("backend", string),
> > + ("backend_id", uint32),
> > + ("frontend", string),
> > + ("frontend_id", uint32),
> > + ("devid", libxl_devid),
> > + ("p_hst", uint16),
> > + ("p_chn", uint16),
> > + ("p_tgt", uint16),
> > + ("p_lun", uint16),
>
> Type, also the widths here differ from the device...
>
> > + ("vscsi_dev_id", libxl_devid),
> > + ("v_hst", uint16),
> > + ("v_chn", uint16),
> > + ("v_tgt", uint16),
> > + ("v_lun", uint16),
>
> and again
Some format code was unhappy with u16, cant remember. I was just trying
to save space. Again, optimization of non-critical code paths.
> > +static char *vscsi_trim_string(char *s)
> > +{
> > + unsigned int len;
> > +
> > + while (isspace(*s))
> > + s++;
> > + len = strlen(s);
> > + while (len-- > 1 && isspace(s[len]))
> > + s[len] = '\0';
> > + return s;
>
> Doesn't seem to be anything vscsi specific here (and it seems like a
> generally useful thing. I expect at least thevif parsing would benefit
> from deploying this too (in so much as all this wouldn't be better with
> a proper parser using flex).
I figured that could be a generic helper. Have to check existing code if
it can make use of it.
> > +}
> > +
> > +static void parse_vscsi_config(libxl_device_vscsi *vscsi_host,
> > + libxl_vscsi_dev *vscsi_dev,
> > + char *buf)
> > +{
> > + char *pdev, *vdev, *fhost;
> > + unsigned int hst, chn, tgt, lun;
> > +
> > + libxl_device_vscsi_init(vscsi_host);
> > + pdev = strtok(buf, ",");
> > + vdev = strtok(NULL, ",");
> > + fhost = strtok(NULL, ",");
>
> I guess we are single threaded here so this is ok.
>
> I wonder if this might be useful to put in libxlu alongside the disk
> parser? The question is whether anything else wants to parse xm style
> vscsi strings, like libvirt perhaps. (arguably a bunch of these parsers
> here have that property and belong in libxlu).
I was wondering how libvirt support for vscsi would look like, looks
like it has to reimplement all that parsing to fill struct
libxl_domain_config. Maybe that parser could be generic, depends how
external callers deal with such parsing today.
> > + pdev = vscsi_trim_string(pdev);
> > + vdev = vscsi_trim_string(vdev);
> > +
> > + if (strncmp(pdev, "/dev/", 5) == 0) {
> > +#ifdef __linux__
>
> Urk. Normally we do this by splitting by files, e.g. into a linux
> version and a nop version and compiling the appropriate one.
>
> This seems like a candidate for libxlu too. We probably need to have a
> think about how much of this stuff is done in libxl too -- e.g. for
> disks we pass the pdev in as a string (path) and interpret it there.
> That allows us to choose between alternative backend providers etc. Also
> libxl is a slightly more palatable place to deal with Linux vs non-Linux
> bits. I think that might be the right answer here too.
There is a libxl_linux.c file, but thats for libxl not xl. So we went
with the ifdef for the time being.
> > @@ -1242,6 +1360,56 @@ static void parse_config_data(const char
> > *config_source,
> > }
> > }
> >
> > + if (!xlu_cfg_get_list(config, "vscsi", &vscsis, 0, 0)) {
> > + int cnt_vscsi_devs = 0;
> > + d_config->num_vscsis = 0;
> > + d_config->vscsis = NULL;
> > + while ((buf = xlu_cfg_get_listitem (vscsis, cnt_vscsi_devs)) !=
> > NULL) {
> > + libxl_vscsi_dev vscsi_dev = { };
> > + libxl_device_vscsi vscsi_host = { };
> > + libxl_device_vscsi *host;
> > + char *tmp_buf;
> > + int num_vscsis, host_found = 0;
> > +
> > + /*
> > + * #1: parse the devspec and place it in temporary host+dev
> > part
> > + * #2: find existing vscsi_host with number v_hst
> > + * if found, append the vscsi_dev to this vscsi_host
> > + * #3: otherwise, create new vscsi_host and append vscsi_dev
> > + * Note: v_hst does not represent the index named "num_vscsis",
> > + * it is a private index used just in the config file
>
> Would all be a lot simpler if a libxl_device_vscsi was a single device
> and all the worrying about which bus was which was in libxl
As said above, domU.cfg code like this may serve a purpose. Devices are
grouped per vhost (0: and 1: in this example):
### vscsi = [
### '/dev/sdm, 0:0:0:0',
### '/dev/sdn, 0:0:0:1',
### '/dev/sdo, 0:0:1:0',
### '/dev/sdp, 0:2:0:1 ',
### '/dev/sdq, 1:0:0:0, feature-host ',
### '/dev/sdr, 1:1:1:0, feature-host',
### ]
Thanks for the quick review!
Olaf
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |