[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 1/2] kexec: use hypercall_create_continuation to protect KEXEC ops
On Wed, Apr 19, 2017 at 10:47:15AM -0500, Eric DeVolder wrote: > When we concurrently try to unload and load crash > images we eventually get: > > Xen call trace: > [<ffff82d08018b04f>] machine_kexec_add_page+0x3a0/0x3fa > [<ffff82d08018b184>] machine_kexec_load+0xdb/0x107 > [<ffff82d080116e8d>] kexec.c#kexec_load_slot+0x11/0x42 > [<ffff82d08011724f>] kexec.c#kexec_load+0x119/0x150 > [<ffff82d080117c1e>] kexec.c#do_kexec_op_internal+0xab/0xcf > [<ffff82d080117c60>] do_kexec_op+0xe/0x1e > [<ffff82d08025c620>] pv_hypercall+0x20a/0x44a > [<ffff82d080260116>] cpufreq.c#test_all_events+0/0x30 > > Pagetable walk from ffff820040088320: > L4[0x104] = 00000002979d1063 ffffffffffffffff > L3[0x001] = 00000002979d0063 ffffffffffffffff > L2[0x000] = 00000002979c7063 ffffffffffffffff > L1[0x088] = 80037a91ede97063 ffffffffffffffff > > The interesting thing is that the page bits (063) look legit. > > The operation on which we blow up is us trying to write > in the L1 and finding that the L2 entry points to some > bizzare MFN. It stinks of a race, and it looks like > the issue is due to no concurrency locks when dealing > with the crash kernel space. > > Specifically we concurrently call kimage_alloc_crash_control_page > which iterates over the kexec_crash_area.start -> kexec_crash_area.size > and once found: > > if ( page ) > { > image->next_crash_page = hole_end; > clear_domain_page(_mfn(page_to_mfn(page))); > } > > clears. Since the parameters of what MFN to use are provided > by the callers (and the area to search is bounded) the the 'page' > is probably the same. So #1 we concurrently clear the > 'control_code_page'. > > The next step is us passing this 'control_code_page' to > machine_kexec_add_page. This function requires the MFNs: > page_to_maddr(image->control_code_page). > > And this would always return the same virtual address, as > the MFN of the control_code_page is inside of the > kexec_crash_area.start -> kexec_crash_area.size area. > > Then machine_kexec_add_page updates the L1 .. which can be done > concurrently and on subsequent calls we mangle it up. > > This is all a theory at this time, but testing reveals > that adding the hypercall_create_continuation() at the > kexec hypercall fixes the crash. > > NOTE: This patch follows 5c5216 (kexec: clear kexec_image slot > when unloading kexec image) to prevent crashes during > simultaneous load/unloads. > > NOTE: Consideration was given to using the existing flag > KEXEC_FLAG_IN_PROGRESS to denote a kexec hypercall in > progress. This, however, overloads the original intent of > the flag which is to denote that we are about-to/have made > the jump to the crash path. The overloading would lead to > failures in existing checks on this flag as the flag would > always be set at the top level in do_kexec_op_internal(). > For this reason, the new flag KEXEC_FLAG_HC_IN_PROGRESS > was introduced. > > While at it, fixed the #define mismatched spacing > > Signed-off-by: Eric DeVolder <eric.devolder@xxxxxxxxxx> > Reviewed-by: Bhavesh Davda <bhavesh.davda@xxxxxxxxxx> > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> > Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > Reviewed-by: Daniel Kiper <daniel.kiper@xxxxxxxxxx> > --- > v3: > - Incorporated feedback from Jan Beulich to change the > name of the new flag to KEXEC_FLAG_IN_HYPERCALL > > v2: 04/17/2017 > - Patch titled 'kexec: use hypercall_create_continuation to protect KEXEC > ops' > - Jan Beulich directed me to use a continuation method instead > of spinlock. > - Incorporated feedback from Daniel Kiper <daniel.kiper@xxxxxxxxxx> > - Incorporated feedback from Konrad Wilk <konrad.wilk@xxxxxxxxxx> > > v1: 04/10/2017 > - Patch titled 'kexec: Add spinlock for the whole hypercall' > - Used spinlock in do_kexec_op_internal() > --- > xen/common/kexec.c | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/xen/common/kexec.c b/xen/common/kexec.c > index 072cc8e..253c204 100644 > --- a/xen/common/kexec.c > +++ b/xen/common/kexec.c > @@ -50,9 +50,10 @@ static cpumask_t crash_saved_cpus; > > static struct kexec_image *kexec_image[KEXEC_IMAGE_NR]; > > -#define KEXEC_FLAG_DEFAULT_POS (KEXEC_IMAGE_NR + 0) > -#define KEXEC_FLAG_CRASH_POS (KEXEC_IMAGE_NR + 1) > -#define KEXEC_FLAG_IN_PROGRESS (KEXEC_IMAGE_NR + 2) > +#define KEXEC_FLAG_DEFAULT_POS (KEXEC_IMAGE_NR + 0) > +#define KEXEC_FLAG_CRASH_POS (KEXEC_IMAGE_NR + 1) > +#define KEXEC_FLAG_IN_PROGRESS (KEXEC_IMAGE_NR + 2) > +#define KEXEC_FLAG_IN_HYPERCALL (KEXEC_IMAGE_NR + 3) I do not think that you need change all of this right now. Just add one space after KEXEC_FLAG_IN_HYPERCALL and you are set. > static unsigned long kexec_flags = 0; /* the lowest bits are for > KEXEC_IMAGE... */ > > @@ -1193,6 +1194,9 @@ static int do_kexec_op_internal(unsigned long op, > if ( ret ) > return ret; > > + if ( test_and_set_bit(KEXEC_FLAG_IN_HYPERCALL, &kexec_flags) ) > + return hypercall_create_continuation(__HYPERVISOR_kexec_op, "lh", > op, uarg); > + I would suggest here: ASSERT(test_bit(KEXEC_FLAG_IN_HYPERCALL, &kexec_flags)); ... Daniel _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |