Re: [Xen-devel] [Patch 3/7] pvSCSI driver
> diff -r f76e90b4f7ad -r a8a56c423e78 include/xen/interface/io/vscsiif.h
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/include/xen/interface/io/vscsiif.h        Thu Feb 14 11:14:46 2008 +0900
...
> +
> +#define VSCSIIF_CMND_SCSI          1    /* scsi */
> +#define VSCSIIF_CMND_SCSI_RESET    2    /* scsi */
Do you ever actually use the reset command?

> +
> +/* ----------------------------------------------------------------------
>  Definition of Ring Structures
>  ---------------------------------------------------------------------- */
> +
> +#define VSCSIIF_DEFAULT_CAN_QUEUE  256
What does this mean?

> +#define VSCSIIF_MAX_COMMAND_SIZE   16
Hmm.  SBC-3 includes a couple of 32 byte commands, which won't fit into this request structure.  Admittedly, none of them look terribly useful, but it would be unfortunate to discover that we have to do another PV scsi protocol in a year's time because some critical request doesn't fit.

> +#define VSCSIIF_SG_TABLESIZE       27
Where did 27 come from?

> +struct vscsiif_request {
> +    uint16_t rqid;
> +    uint8_t cmd;
> +    /* SCSI */
> +    uint8_t cmnd[VSCSIIF_MAX_COMMAND_SIZE];
> +    uint8_t cmd_len;
So cmd_len applies to cmnd rather than cmd, yes?

> +    uint16_t id, lun, channel;
Okay, so each of your PV devices maps to the moral equivalent of an HBA, yes?  You can have several targets attached to a single PV device?

> +    uint8_t sc_data_direction;
What does this mean?

> +    uint8_t use_sg;
From looking at your front and back implementations, I think this is actually a count of the number of segments you have in your SG table, yes?  If so, it could probably cope with being renamed to something a bit more appropriate.

Also, I think you have two bytes of implicit padding here, don't you?  We usually try to avoid that in shared structures, because (a) it's confusing, (b) it's wasteful of ring space, and (c) it can cause portability problems if guests are built with different compilers.

> +    uint32_t request_bufflen;
Is this not redundant with the scatter-gather list, below?  Or is there supposed to be some padding on the SG list which isn't included in the buffer length?

> +    /*int32_t timeout_per_command;*/
> +    struct scsiif_request_segment {
> +        grant_ref_t gref;
> +        uint16_t offset;
> +        uint16_t length;
> +    } seg[VSCSIIF_SG_TABLESIZE];
> +};
That gives you a 248 byte request structure, doesn't it?  That means you can only have 16 requests outstanding on a single page ring, which may not be enough to get good performance.

> +#define VSCSIIF_SENSE_BUFFERSIZE   96
Where did 96 come from?  SPC-4 revision 10 section 6.28 seems to imply that sense data can be up to 252 bytes.

> +
> +struct vscsiif_response {
> +    uint16_t rqid;
Another two bytes of implicit padding.

> +    int32_t rslt;
> +    uint8_t sense_len;
> +    uint8_t sense_buffer[VSCSIIF_SENSE_BUFFERSIZE];
> +};
For this sort of ring, responses have to be padded up to match request sizes.  This means that you've effectively got another 143 bytes of padding after the end of this structure, which could be used to increase the sense buffer size to 239 bytes (240 if you rearrange the response fields to avoid the padding).  That's still not really big enough, but it's at least closer.

> +
> +DEFINE_RING_TYPES(vscsiif, struct vscsiif_request, struct vscsiif_response);
> +
> +#endif
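To make the padding and sizing arithmetic above concrete, here is a rough, purely illustrative sketch (not something taken from the patch) of how the response might be rearranged along the lines suggested, assuming the 248 byte request slot computed above and the usual <stdint.h> fixed-width types; the 240 byte sense buffer is just the figure mentioned in the comment on the response structure:

    /* Illustrative sketch only -- not the structure from the patch.
     * Because ring.h puts each request and response into a union, every
     * slot is the size of the larger of the two, so with a 248 byte
     * request the response occupies 248 bytes regardless of its
     * declared size.  Putting the 32-bit result first removes the
     * implicit padding after rqid, and growing the sense buffer soaks
     * up the remaining slack in the slot. */
    struct vscsiif_response {
        int32_t  rslt;               /* 4 bytes, naturally aligned at offset 0 */
        uint16_t rqid;               /* 2 bytes                                */
        uint8_t  sense_len;          /* 1 byte                                 */
        uint8_t  sense_buffer[240];  /* 240 bytes of sense data                */
    };                               /* 247 bytes of fields; the compiler pads
                                      * the structure out to 248 for alignment */

Even that still falls short of the 252 bytes of sense data which SPC-4 allows, which is one more argument for the variable-sized approach below.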
It might be worth considering abandoning the use of fixed-sized request and response structures, and instead using a more byte-oriented protocol in which requests can be almost any size.  This avoids the need to size things like the CDB area to match the largest possible request, because each request will be precisely as large as it needs to be.  The main downsides are:

-- It's much harder to access structures directly while they're on the ring.  On the other hand, you almost certainly need to copy the request into backend-local storage before processing it anyway, to avoid TOCTOU races and the associated security problems, and making that copy handle the ring is fairly easy.

-- If you allow requests and responses to be different sizes, you need to put them on separate rings, so that you don't find that e.g. you've had a series of responses which were bigger than their associated requests, and the response area has now overflowed on top of some requests.  Again, this isn't terribly difficult to deal with.

-- The ring.h macros won't work.  You'll need to invent your own scheme.

The main alternative here is request chaining, like the netchannel protocol.  This is generally somewhat tricky to get right, and I'd expect that just going for variable-sized requests would be easier.  If you do decide to go for this then you may want to pad requests to a multiple of 8 bytes so that the memcpy()s are suitably aligned.

I think the current netchannel2 plan also calls for variable-sized messages with split front->back and back->front rings.  It might be possible to share some code there (although at present there doesn't exist any code to share).
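To make the byte-oriented idea slightly more concrete, here is a rough sketch of what a self-describing message header for such a protocol might look like; everything in it (the names, the field sizes, the idea of a type field) is an illustrative guess rather than anything taken from the patch or from netchannel2, and it assumes the split request/response rings and 8 byte padding discussed above:

    /* Illustrative sketch only.  Every message starts with a small
     * fixed header; the consumer reads len, copies that many bytes out
     * of the ring into local storage, and only then interprets the
     * payload, which also avoids the TOCTOU problems mentioned above. */
    struct vscsiif_msg {
        uint16_t type;   /* e.g. scsi command, reset, response, ...    */
        uint16_t len;    /* total size in bytes, including this header,
                          * already padded to a multiple of 8          */
        uint16_t rqid;   /* lets responses be matched to requests      */
        uint16_t pad;    /* keeps the header itself 8 bytes long       */
        /* followed by (len - 8) bytes of type-specific payload: for a
         * command, a CDB of whatever length it actually needs plus
         * however many scatter-gather segments it actually uses.      */
    };

Because requests and responses would live on separate rings, each side simply appends messages of whatever size it needs and advances its producer pointer by len, so the two directions can grow independently and the memcpy()s stay 8 byte aligned.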
I'd also strongly recommend supporting multi-page rings.  That will allow you to have more requests in flight at any one time, which should lead to better performance.

Steven.