[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] Xen/MCE: adjust for future new vMCE model
Jan Beulich wrote: >>>> On 05.07.12 at 11:20, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote: >> Jan Beulich wrote: >>>>>> On 05.07.12 at 05:30, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> >>>>>> wrote: @@ -68,7 +74,7 @@ >>>> >>>> int vmce_restore_vcpu(struct vcpu *v, uint64_t caps) { >>>> - if ( caps & ~g_mcg_cap & ~MCG_CAP_COUNT & ~MCG_CTL_P ) >>>> + if ( caps & ~g_mcg_cap & ~MCG_CAP_COUNT ) >>>> { >>>> dprintk(XENLOG_G_ERR, "%s restore: unsupported MCA >>>> capabilities" " %#" PRIx64 " for d%d:v%u >>>> (supported: %#Lx)\n", @@ -77,6 +83,12 @@ return -EPERM; >>>> } >>>> >>>> + if ( (caps & MCG_CAP_COUNT) != GUEST_BANK_NUM ) + { >>>> + dprintk(XENLOG_G_ERR, "Bank number inconsistency\n"); + >>>> return -EPERM; + } + >>>> v->arch.mcg_cap = caps; >>>> return 0; >>>> } >>> >>> These two changes, as pointed out before, need some further >>> thought and tweaking: As I said earlier, we are already shipping >>> the code in its prior form, so outright rejecting MCG_CTL_P set >>> and non-default bank counts is not a proper solution. Warning >>> about them being in an unsupported state is certainly acceptable. >>> >>> And I think the guest visible MCG_CAP value also shouldn't >>> change across migration, so tolerating (but ignoring) a higher >>> bank count seems necessary. Not sure what to do when the >>> bank count is lower (0 or 1) - for 1, all communication to the >>> guest should probably go through bank 0, while 0 should >>> probably disable vMCE for that vCPU. >>> >>> Further I just noticed that you don't touch fill_vmsr_data() at >>> all (sending patches created with diff's -p option or equivalent >>> helps spotting where individual changes belong), yet that >>> function uses the hardware bank number to fill the struct >>> bank_entry. With the intended concept, the "bank" member >>> of that structure can probably go away altogether. >>> >>> Jan >> >> >> Seems things become a little bit chaos, let's jump out from details, >> make agreement first about what's the general purpose of this >> middle-work patch. >> >> ============ >> 1. current xen vmce status >> 1.1) current xen vmce has 2 kinds of bugs: >> * functional bug: it actually doesn't work correctly for guest; >> * migration bug: partly solved by c/s 24887; >> 1.2) c/s 24887 not in formal xen release version, it's a temporary >> fix (but unfortunately has been backported to SELS11 sp2) >> ============ >> 2. target of this middle-work patch >> 2.1) it's not used to address functional bug >> 2.3) it does minimal work, just in order not to bring >> trouble/dirty to future new vMCE, so it would * remove MCG_CTL >> --> otherwise new vMCE have to add useless MCG_CTL_P and MCG_CTL, w/ >> potential model specific issue; >> * stick all 1's to MCi_CTL --> to avoid semantic issue; >> * set MCG_CTL_P=0 and 2 banks at MCG_CAP --> otherwise dirty >> code at new vMCE, like bank number issue; ============ >> >> I think in this middle-work patch, we don't need constrained by c/s >> 24887: >> 1. c/s 24887 not in formal xen release >> 2. the benefit of keep compatible w/ 24887 is minor: >> * currently all xen version have migration bug, 3.x -> 4.0 -> 4.1 >> -> 4.2 >> * keep compatible w/ 24887 only benefit one case --> migration >> from SLES11 sp2 to xen 4.2 (for both SLES11 sp2 and xen 4.2, vMCE >> actually doesn't functionally work fine to guest) >> 3. the price/hurt to future vMCE is high (to keep compatible w/ >> 24887) > > Why do you consider the price high? I think it would require pretty > minor changes. No, what I meant is not coding price. I mean the price that have to add dirty and useless change to the new vMCE is high. Just curious why we cannot simply get rid of c/s 24887? after all it only benefit SLES11 sp2. > >> Considering above, I prefer an outright cleanup in xen4.2, not >> constrained by c/s 24887 and not bring trouble/dirty to future vMCE. > > Whether or not is was appropriate for us to backport this change > early is unclear, but given that back at that time I had already > pointed out the problems and asked for work to be done to clean > it up, and given that it has taken over four months to get going > with this, I don't think you would suggest the alternative of us > having left the bug unfixed for this entire time period. Jan, I'm not challenge why you backport to SLES11 sp2. If there is anything wrong, I agree it's my fault. Currently what I concern is if we can do an outright cleanup at xen4.2 so that future vMCE could be simple and clean. > > So unless the price to pay is unreasonably high, I'd favor getting > this taken care of rather than ignored. If so, why we need this middle-work patch? It could just keep current status at xen 4.2, then start 'dirty' new vMCE implement at xen4.3 --> and the 'dirty' inherit from c/s 24887 which from code point of view would be totally removed at new vMCE. Thanks, Jinsong > > If I can find time to work on your last version of the patch towards > the direction I have in mind tomorrow, I'll do so; I'll be gone for > two weeks afterwards (and be mostly invisible for another one), > so wouldn't be able to look at this again presumably until the > beginning of August. > > Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |