Re: [Xen-devel] [PATCH v2 1/2] Resize the MAX_NR_IO_RANGES for ioreq server
Thanks a lot, George.

On 7/6/2015 10:06 PM, George Dunlap wrote:
> On Mon, Jul 6, 2015 at 2:33 PM, Paul Durrant <Paul.Durrant@xxxxxxxxxx> wrote:
>>> -----Original Message-----
>>> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxxxxx]
>>> Sent: 06 July 2015 14:28
>>> To: Paul Durrant; George Dunlap
>>> Cc: Yu Zhang; xen-devel@xxxxxxxxxxxxx; Keir (Xen.org); Jan Beulich;
>>> Andrew Cooper; Kevin Tian; zhiyuan.lv@xxxxxxxxx
>>> Subject: Re: [Xen-devel] [PATCH v2 1/2] Resize the MAX_NR_IO_RANGES
>>> for ioreq server
>>>
>>> On 07/06/2015 02:09 PM, Paul Durrant wrote:
>>>>> -----Original Message-----
>>>>> From: dunlapg@xxxxxxxxx [mailto:dunlapg@xxxxxxxxx] On Behalf Of
>>>>> George Dunlap
>>>>> Sent: 06 July 2015 13:50
>>>>> To: Paul Durrant
>>>>> Cc: Yu Zhang; xen-devel@xxxxxxxxxxxxx; Keir (Xen.org); Jan Beulich;
>>>>> Andrew Cooper; Kevin Tian; zhiyuan.lv@xxxxxxxxx
>>>>> Subject: Re: [Xen-devel] [PATCH v2 1/2] Resize the MAX_NR_IO_RANGES
>>>>> for ioreq server
>>>>>
>>>>> On Mon, Jul 6, 2015 at 1:38 PM, Paul Durrant <Paul.Durrant@xxxxxxxxxx> wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: dunlapg@xxxxxxxxx [mailto:dunlapg@xxxxxxxxx] On Behalf Of
>>>>>>> George Dunlap
>>>>>>> Sent: 06 July 2015 13:36
>>>>>>> To: Yu Zhang
>>>>>>> Cc: xen-devel@xxxxxxxxxxxxx; Keir (Xen.org); Jan Beulich;
>>>>>>> Andrew Cooper; Paul Durrant; Kevin Tian; zhiyuan.lv@xxxxxxxxx
>>>>>>> Subject: Re: [Xen-devel] [PATCH v2 1/2] Resize the MAX_NR_IO_RANGES
>>>>>>> for ioreq server
>>>>>>>
>>>>>>> On Mon, Jul 6, 2015 at 7:25 AM, Yu Zhang <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>>>>> MAX_NR_IO_RANGES is used by ioreq server as the maximum number of
>>>>>>>> discrete ranges to be tracked. This patch changes its value to 8k,
>>>>>>>> so that more ranges can be tracked on next generation of Intel
>>>>>>>> platforms in XenGT. Future patches can extend the limit to be
>>>>>>>> toolstack tunable, and MAX_NR_IO_RANGES can serve as a default limit.
>>>>>>>>
>>>>>>>> Signed-off-by: Yu Zhang <yu.c.zhang@xxxxxxxxxxxxxxx>
>>>>>>>
>>>>>>> I said this at the Hackathon, and I'll say it here: I think this is
>>>>>>> the wrong approach.
>>>>>>>
>>>>>>> The problem here is not that you don't have enough memory ranges.
>>>>>>> The problem is that you are not tracking memory ranges, but
>>>>>>> individual pages. You need to make a new interface that allows you
>>>>>>> to tag individual gfns as p2m_mmio_write_dm, and then allow one
>>>>>>> ioreq server to get notifications for all such writes.
>>>>>>
>>>>>> I think that is conflating things. It's quite conceivable that more
>>>>>> than one ioreq server will handle write_dm pages.
>>>>>
>>>>> If we had enough types to have two page types per server then I'd
>>>>> agree with you, but we don't. What's conflating things is using an
>>>>> interface designed for *device memory ranges* to instead *track
>>>>> writes to gfns*.
>>>>
>>>> What's the difference? Are you asserting that all device memory ranges
>>>> have read side effects and therefore write_dm is not a reasonable
>>>> optimization to use? I would not want to make that assertion.
>>>
>>> Using write_dm is not the problem; it's having thousands of memory
>>> "ranges" of 4k each that I object to. Which is why I suggested adding
>>> an interface to request updates to gfns (by marking them write_dm),
>>> rather than abusing the io range interface.
>>
>> And it's the assertion that use of write_dm will only be relevant to
>> gfns, and that all such notifications only need go to a single ioreq
>> server, that I have a problem with. Whilst the use of io ranges to
>> track gfn updates is, I agree, not ideal, I think the overloading of
>> write_dm is not a step in the right direction.
>
> So there are two questions here. First of all, I certainly think that
> the *interface* should be able to be transparently extended to support
> multiple ioreq servers being able to track gfns.
>
> My suggestion was to add a hypercall that allows an ioreq server to
> say, "Please send modifications to gfn N to ioreq server X"; and that
> for the time being, only allow one such X to exist at a time per
> domain. That is, if ioreq server Y makes such a call after ioreq
> server X has done so, return -EBUSY. That way we can add support when
> we need it.

Well, I also agree the current implementation is probably not optimal.
And yes, it does seem promiscuous (hope I did not use the wrong word :))
to mix the device I/O ranges and the guest memory. But forwarding an
ioreq to the backend driver just by a p2m type? Although it would be
easy for XenGT to take this approach, I agree with Paul that this would
weaken the functionality of the ioreq server. Besides, is it appropriate
for a p2m type to be used this way? It seems strange to me.
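That said, just to make sure I understand the shape of the interface you
are suggesting, here is a rough sketch of how such a "please send writes
to gfn N to ioreq server X" call could look. Everything below -- the op
name, the structure and its fields, the helper -- is invented purely for
illustration; it is not an existing Xen interface, and not necessarily
how it would be implemented:

/*
 * Illustrative sketch only: not an existing Xen hypercall.  All names
 * below (xen_hvm_track_gfn_writes, track_gfn_writes, etc.) are made up.
 */

#include <stdint.h>
#include <errno.h>

typedef uint16_t servid_t;          /* ioreq server id (illustrative)   */

struct xen_hvm_track_gfn_writes {
    uint16_t domid;                 /* target domain                    */
    servid_t id;                    /* ioreq server X asking for writes */
    uint64_t gfn;                   /* gfn N whose writes to forward    */
    uint8_t  enable;                /* 1 = start tracking, 0 = stop     */
};

/* Per-domain record of which ioreq server owns write_dm forwarding. */
struct write_dm_owner {
    int      claimed;
    servid_t owner;
};

static int track_gfn_writes(struct write_dm_owner *w,
                            const struct xen_hvm_track_gfn_writes *op)
{
    /* For now only one server X per domain may receive notifications. */
    if ( !w->claimed )
    {
        w->claimed = 1;
        w->owner   = op->id;
    }
    else if ( w->owner != op->id )
        return -EBUSY;              /* server Y after server X: refuse  */

    /*
     * A real implementation would now flip the p2m type of op->gfn to
     * p2m_mmio_write_dm (or restore it on disable), so that guest
     * writes trap and are forwarded to w->owner.
     */
    (void)op->gfn;
    (void)op->enable;
    return 0;
}

If I read your proposal correctly, extending this later to several
servers would mostly mean replacing the single owner with a per-gfn (or
per-type) lookup, without changing the call itself.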
> In fact, you probably already have a problem with two ioreq servers,
> because (if I recall correctly) you don't know for sure when a page
> has stopped being used as a GPU pagetable.

Fortunately, we do, and these unmapped page tables will be removed from
the rangeset of the ioreq server, so the following scenario won't
happen. :)

> Consider the following scenario:
>
> 1. Two devices, served by ioreq servers 1 and 2.
> 2. The driver for the device served by ioreq server 1 allocates a page
>    and uses it as a pagetable. ioreq server 1 adds that pfn to the
>    ranges it's watching.
> 3. The driver frees the page back to the guest OS; but ioreq server 1
>    doesn't know, so it doesn't release the range.
> 4. The driver for the device served by ioreq server 2 allocates a page,
>    which happens to be the same one used before, and uses it as a
>    pagetable. ioreq server 2 tries to add that pfn to the ranges it's
>    watching.
>
> Now you have an "overlap in the range" between the two ioreq servers.
> What do you do?
>
> Regarding using write_dm for actual device memory ranges: Do you have
> any concrete scenarios in mind where you think this will be used?
> Fundamentally, write_dm looks to me like it's about tracking gfns --
> i.e., things backed by guest RAM -- not IO ranges. As such, it should
> have an interface and an implementation that reflects that.

Here, I guess your major concern about the difference between tracking
gfns and I/O ranges is that the gfns are scattered? And yes, this is why
we need more ranges inside a rangeset. The new value of the limit, 8K,
is a practical one for XenGT. In the future, we can either provide other
approaches to configure the maximum number of ranges inside an ioreq
server, or provide some xenheap allocation management routines. Is this
OK? I thought we had successfully convinced you at the hackathon, but it
seems I was wrong. Anyway, your advice is much appreciated. :)
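To put some rough numbers behind that, here is a small standalone sketch
(not the actual Xen code -- the max_ranges field and the helper names
are made up) of the direction described in the patch: MAX_NR_IO_RANGES
acting as a default cap on a per-server rangeset, with room for a
toolstack-supplied override later:

/*
 * Standalone sketch (not the actual Xen implementation) of the sizing
 * scheme discussed above: each tracked gfn is a separate 4k "range",
 * MAX_NR_IO_RANGES acts as the default cap, and a toolstack-supplied
 * value could later override it per ioreq server.
 */

#include <stdio.h>

#define MAX_NR_IO_RANGES 8192       /* the "8k" default this patch proposes */

struct ioreq_server_ranges {
    unsigned int nr_ranges;         /* ranges currently tracked             */
    unsigned int max_ranges;        /* 0 = use the compile-time default     */
};

static unsigned int range_limit(const struct ioreq_server_ranges *s)
{
    /* Hypothetical toolstack override; fall back to the default cap. */
    return s->max_ranges ? s->max_ranges : MAX_NR_IO_RANGES;
}

static int track_one_gfn(struct ioreq_server_ranges *s)
{
    /*
     * Because the write-protected gfns are scattered, each one consumes
     * a whole range entry, so the cap is effectively a cap on the
     * number of tracked pages.
     */
    if ( s->nr_ranges >= range_limit(s) )
        return -1;
    s->nr_ranges++;
    return 0;
}

int main(void)
{
    struct ioreq_server_ranges s = { .nr_ranges = 0, .max_ranges = 0 };
    unsigned int i, tracked = 0;

    /* Try to track 10000 scattered page-table pages. */
    for ( i = 0; i < 10000; i++ )
        if ( track_one_gfn(&s) == 0 )
            tracked++;

    printf("tracked %u of 10000 gfns (cap %u)\n", tracked, range_limit(&s));
    return 0;
}

Because each scattered gfn burns a whole entry, a limit sized for a
handful of contiguous device memory ranges is not enough here, which is
why the default needs to be this large and why a toolstack-tunable
per-server maximum is the natural follow-up.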
> -George

B.R.
Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel