Re: [Xen-devel] [PATCH RFC 06/12] xen-blkback: implement LRU mechanism for persistent grants
On Tue, Mar 05, 2013 at 07:10:04PM +0100, Roger Pau Monné wrote:
> On 04/03/13 21:10, Konrad Rzeszutek Wilk wrote:
> > On Thu, Feb 28, 2013 at 11:28:49AM +0100, Roger Pau Monne wrote:
> >> This mechanism allows blkback to change the number of grants
> >> persistently mapped at run time.
> >>
> >> The algorithm uses a simple LRU mechanism that removes (if needed) the
> >> persistent grants that have not been used since the last LRU run, or
> >> if all grants have been used it removes the first grants in the list
> >> (that are not in use).
> >>
> >> The algorithm has several parameters that can be tuned by the user
> >> from sysfs:
> >>
> >>  * max_persistent_grants: maximum number of grants that will be
> >>    persistently mapped.
> >>  * lru_interval: minimum interval (in ms) at which the LRU should be
> >>    run
> >>  * lru_num_clean: number of persistent grants to remove when executing
> >>    the LRU.
> >>
> >> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> >> Cc: xen-devel@xxxxxxxxxxxxx
> >> ---
> >>  drivers/block/xen-blkback/blkback.c | 207 +++++++++++++++++++++++++++--------
> >>  drivers/block/xen-blkback/common.h  |   4 +
> >>  drivers/block/xen-blkback/xenbus.c  |   1 +
> >>  3 files changed, 166 insertions(+), 46 deletions(-)
> >
> > You also should add a Documentation/sysfs/ABI/stable/sysfs-bus-xen-backend
>
> OK
>
> >>
> >> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> >> index 415a0c7..c14b736 100644
> >> --- a/drivers/block/xen-blkback/blkback.c
> >> +++ b/drivers/block/xen-blkback/blkback.c
> >> @@ -63,6 +63,44 @@ static int xen_blkif_reqs = 64;
> >>  module_param_named(reqs, xen_blkif_reqs, int, 0);
> >>  MODULE_PARM_DESC(reqs, "Number of blkback requests to allocate");
> >>
> >> +/*
> >> + * Maximum number of grants to map persistently in blkback. For maximum
> >> + * performance this should be the total numbers of grants that can be used
> >> + * to fill the ring, but since this might become too high, specially with
> >> + * the use of indirect descriptors, we set it to a value that provides good
> >> + * performance without using too much memory.
> >> + *
> >> + * When the list of persistent grants is full we clean it using a LRU
> >> + * algorithm.
> >> + */
> >> +
> >> +static int xen_blkif_max_pgrants = 352;
> >
> > And a little blurb saying why 352.
>
> Yes, this is (as you probably already realized) RING_SIZE *
> BLKIF_MAX_SEGMENTS_PER_REQUEST.
>
> >
> >> +module_param_named(max_persistent_grants, xen_blkif_max_pgrants, int, 0644);
> >> +MODULE_PARM_DESC(max_persistent_grants,
> >> +                 "Maximum number of grants to map persistently");
> >> +
> >> +/*
> >> + * The LRU mechanism to clean the lists of persistent grants needs to
> >> + * be executed periodically. The time interval between consecutive executions
> >> + * of the purge mechanism is set in ms.
> >> + */
> >> +
> >> +static int xen_blkif_lru_interval = 100;
> >
> > So every second? What is the benefit of having the user modify this? Would
> > it be better if there was a watermark system in xen-blkfront to automatically
> > figure this out? (This could be a TODO of course)
>
> Every 100ms, so every 0.1 seconds. This can have an impact on
> performance as implemented right now (if we move the purge to a separate
> thread, it's not going to have such an impact), but anyway I feel we can
> let the user tune it.

Why not tune it automatically in the backend, so it can do this by itself?
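For context on the default Roger explains above: 352 is RING_SIZE (32 for all
known blkif ring types) multiplied by BLKIF_MAX_SEGMENTS_PER_REQUEST (11). A
minimal sketch of how that could be spelled out in terms of the existing ring
constants rather than as a bare number follows; the initializer is illustrative
and is not part of the posted patch:

#include <linux/mm.h>
#include <xen/interface/io/ring.h>
#include <xen/interface/io/blkif.h>

/*
 * Illustrative only, not the posted patch: the default discussed above is
 * RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST = 32 * 11 = 352.
 * __CONST_RING_SIZE() is a compile-time constant, so the value could be
 * derived from the ring definitions instead of being hard-coded.
 */
static int xen_blkif_max_pgrants = __CONST_RING_SIZE(blkif, PAGE_SIZE) *
				   BLKIF_MAX_SEGMENTS_PER_REQUEST;

With sizeof(struct persistent_gnt) = 48, that bounds the tracking structures at
352 * 48 = 16896 bytes, the same figure given in the comment the patch removes
further down.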
> >> +module_param_named(lru_interval, xen_blkif_lru_interval, int, 0644);
> >> +MODULE_PARM_DESC(lru_interval,
> >> +"Execution interval (in ms) of the LRU mechanism to clean the list of persistent grants");
> >> +
> >> +/*
> >> + * When the persistent grants list is full we will remove unused grants
> >> + * from the list. The number of grants to be removed at each LRU execution
> >> + * can be set dynamically.
> >> + */
> >> +
> >> +static int xen_blkif_lru_num_clean = BLKIF_MAX_SEGMENTS_PER_REQUEST;
> >> +module_param_named(lru_num_clean, xen_blkif_lru_num_clean, int, 0644);
> >> +MODULE_PARM_DESC(lru_num_clean,
> >> +"Number of persistent grants to unmap when the list is full");
> >
> > Again, what does that mean to the system admin? Why would they need to
> > modify the contents of that value?
>
> Well, if you set the maximum number of grants to 1024 you might want to
> increase this to 64, or, on the other hand, if you set the maximum
> number of grants to 10 you may wish to set this to 1, so I think it is
> indeed relevant for system admins.

So why not make this automatic? A value blkback can adjust by itself as there
are fewer or more grants. This of course does not have to be part of this patch.

> > Now if this is a debug-related one for developing, then this could all be
> > done in DebugFS.
>
> >> +
> >>  /* Run-time switchable: /sys/module/blkback/parameters/ */
> >>  static unsigned int log_stats;
> >>  module_param(log_stats, int, 0644);
> >> @@ -81,7 +119,7 @@ struct pending_req {
> >>  	unsigned short		operation;
> >>  	int			status;
> >>  	struct list_head	free_list;
> >> -	DECLARE_BITMAP(unmap_seg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
> >> +	struct persistent_gnt	*persistent_gnts[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> >>  };
> >>
> >>  #define BLKBACK_INVALID_HANDLE (~0)
> >> @@ -102,36 +140,6 @@ struct xen_blkbk {
> >>  static struct xen_blkbk *blkbk;
> >>
> >>  /*
> >> - * Maximum number of grant pages that can be mapped in blkback.
> >> - * BLKIF_MAX_SEGMENTS_PER_REQUEST * RING_SIZE is the maximum number of
> >> - * pages that blkback will persistently map.
> >> - * Currently, this is:
> >> - * RING_SIZE = 32 (for all known ring types)
> >> - * BLKIF_MAX_SEGMENTS_PER_REQUEST = 11
> >> - * sizeof(struct persistent_gnt) = 48
> >> - * So the maximum memory used to store the grants is:
> >> - * 32 * 11 * 48 = 16896 bytes
> >> - */
> >> -static inline unsigned int max_mapped_grant_pages(enum blkif_protocol protocol)
> >> -{
> >> -	switch (protocol) {
> >> -	case BLKIF_PROTOCOL_NATIVE:
> >> -		return __CONST_RING_SIZE(blkif, PAGE_SIZE) *
> >> -			   BLKIF_MAX_SEGMENTS_PER_REQUEST;
> >> -	case BLKIF_PROTOCOL_X86_32:
> >> -		return __CONST_RING_SIZE(blkif_x86_32, PAGE_SIZE) *
> >> -			   BLKIF_MAX_SEGMENTS_PER_REQUEST;
> >> -	case BLKIF_PROTOCOL_X86_64:
> >> -		return __CONST_RING_SIZE(blkif_x86_64, PAGE_SIZE) *
> >> -			   BLKIF_MAX_SEGMENTS_PER_REQUEST;
> >> -	default:
> >> -		BUG();
> >> -	}
> >> -	return 0;
> >> -}
> >> -
> >> -
> >> -/*
> >>   * Little helpful macro to figure out the index and virtual address of the
> >>   * pending_pages[..]. For each 'pending_req' we have have up to
> >>   * BLKIF_MAX_SEGMENTS_PER_REQUEST (11) pages. The seg would be from 0 through
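One possible shape for the automatic adjustment Konrad asks about above would
be to derive the per-pass purge count from the configured maximum instead of
exposing a second knob. The helper below is purely a sketch (the name and the
1/16 ratio are invented for the example), though that ratio happens to
reproduce both of Roger's figures: 64 for a 1024-grant pool and 1 for a
10-grant pool.

/*
 * Hypothetical helper, not part of the posted patch: scale the number of
 * persistent grants reclaimed per LRU pass with the size of the pool, so
 * raising max_persistent_grants does not also require retuning
 * lru_num_clean.  The 1/16 ratio is illustrative only.
 */
static unsigned int lru_grants_to_clean(unsigned int max_pgrants)
{
	unsigned int num = max_pgrants / 16;

	return num ? num : 1;	/* always reclaim at least one grant */
}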
> >> @@ -251,6 +259,76 @@ static void free_persistent_gnts(struct rb_root *root, unsigned int num)
> >>  	BUG_ON(num != 0);
> >>  }
> >>
> >> +static int purge_persistent_gnt(struct rb_root *root, int num)
> >> +{
> >> +	struct gnttab_unmap_grant_ref unmap[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> >> +	struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> >> +	struct persistent_gnt *persistent_gnt;
> >> +	struct rb_node *n;
> >> +	int ret, segs_to_unmap = 0;
> >> +	int requested_num = num;
> >> +	int preserve_used = 1;
> >
> > Boolean? And perhaps 'scan_dirty'?
>
> Sure
>
> >
> >> +
> >> +	pr_debug("Requested the purge of %d persistent grants\n", num);
> >> +
> >> +purge_list:
> >
> > This could be written a bit differently to also run outside xen_blkif_schedule
> > (so a new thread). This would require using the lock mechanism and converting
> > this big loop into two smaller loops:
> > 1) one quick loop that holds the lock, to take the items off the list,
> > 2) a second one to do the gnttab_set_unmap_op operations and all the heavy
> >    free_xenballooned_pages calls.
>
> Yes, I could add a list_head to persistent_gnt, so we can take them out
> of the red-black tree and queue them in a list to be processed (unmap +
> free) after we have looped through the list, without holding the lock.
>
> > .. As this function ends up (presumably?) causing xen_blkif_schedule to be
> > doing this for some time every second, regardless of how utilized the ring
> > is - so if we are 100% busy we should not need to call this function. But
> > if we do, then we end up walking the persistent_gnt twice - once with
> > preserve_used set to true, and the other with it set to false.
> >
> > We don't really want that - so is there a way for xen_blkif_schedule to
> > do a quick determination of this caliber:
> >
> > if (RING_HAS_UNCONSUMED_REQUESTS(x) >= some_value)
> > 	wake_up(blkif->purgarator)
>
> It's not possible to tell if all grants will be in use just by looking
> at the number of active requests, since all requests might just be using
> one segment, and thus the list of persistent grants could be purged
> without problems. We could keep a count of the number of active grants
> at each time and use that to check if we can kick the purge or not.
>
> if (grants_in_use > (persistent_gnt_c - num_purge))
> 	wait(...)

Sure.

> > And the thread would just sit there until kicked into action?
>
> And when a request frees some grants it could be kicked back to action.

OK.
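Roger's two-loop idea above can be made concrete. Below is a rough sketch of
the split, assuming a purge-queue list_head added to struct persistent_gnt; the
struct layout, the lock, and the in_use flag as written here are illustrative
stand-ins rather than the code from the posted series, and a real
implementation would batch the unmap hypercalls instead of issuing them one
grant at a time:

#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/mm.h>
#include <linux/rbtree.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <xen/balloon.h>
#include <xen/grant_table.h>

/*
 * Sketch of the two-phase purge discussed above, NOT the posted patch.
 * The real struct persistent_gnt lives in drivers/block/xen-blkback/common.h
 * and would only gain the extra list_head used as purge-queue linkage.
 */
struct persistent_gnt {
	struct page		*page;
	grant_ref_t		gnt;
	grant_handle_t		handle;
	bool			in_use;
	struct rb_node		node;
	struct list_head	remove_node;	/* new: links purge candidates */
};

static void purge_persistent_gnts(struct rb_root *root, spinlock_t *lock,
				  unsigned int *pgnt_count, unsigned int num)
{
	struct persistent_gnt *pgnt, *tmp;
	struct rb_node *n;
	LIST_HEAD(purge_list);

	/* Phase 1: short critical section - just unlink idle candidates. */
	spin_lock(lock);
	for (n = rb_first(root); n && num; ) {
		pgnt = rb_entry(n, struct persistent_gnt, node);
		n = rb_next(n);
		if (pgnt->in_use)
			continue;
		rb_erase(&pgnt->node, root);
		list_add(&pgnt->remove_node, &purge_list);
		(*pgnt_count)--;
		num--;
	}
	spin_unlock(lock);

	/* Phase 2: the expensive unmap and balloon free, outside the lock. */
	list_for_each_entry_safe(pgnt, tmp, &purge_list, remove_node) {
		struct gnttab_unmap_grant_ref unmap;

		gnttab_set_unmap_op(&unmap,
				    (unsigned long)pfn_to_kaddr(page_to_pfn(pgnt->page)),
				    GNTMAP_host_map, pgnt->handle);
		if (gnttab_unmap_refs(&unmap, NULL, &pgnt->page, 1))
			pr_err("blkback: failed to unmap persistent grant\n");
		free_xenballooned_pages(1, &pgnt->page);
		list_del(&pgnt->remove_node);
		kfree(pgnt);
	}
}

A dedicated purge thread would then sleep until the dispatch path sees enough
idle grants (along the lines of the grants_in_use check quoted above) and would
be woken again when requests release grants.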