[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2)
On 01/19/2016 04:55 PM, Konrad Rzeszutek Wilk wrote: snip +/* Must be holding the payload_list lock. */payload lock?+static int schedule_work(struct payload *data, uint32_t cmd) +{ + /* Fail if an operation is already scheduled. */ + if ( xsplice_work.do_work ) + return -EAGAIN;Hmm, I don't think EAGAIN is correct. It will cause xen-xsplice to poll for a status update, but the operation hasn't actually been submitted.-EBUSY -EDEADLK ? I would choose -EBUSY. + + xsplice_work.cmd = cmd; + xsplice_work.data = data; + atomic_set(&xsplice_work.semaphore, -1); + atomic_set(&xsplice_work.irq_semaphore, -1); + + xsplice_work.ready = 0; + smp_wmb(); + xsplice_work.do_work = 1; + smp_wmb(); + + return 0; +} + +/* + * Note that because of this NOP code the do_nmi is not safely patchable. + * Also if we do receive 'real' NMIs we have lost them. + */ +static int mask_nmi_callback(const struct cpu_user_regs *regs, int cpu) +{ + return 1; +} + +static void reschedule_fn(void *unused) +{ + smp_mb(); /* Synchronize with setting do_work */ + raise_softirq(SCHEDULE_SOFTIRQ); +} + +static int xsplice_do_wait(atomic_t *counter, s_time_t timeout, + unsigned int total_cpus, const char *s) +{ + int rc = 0; + + while ( atomic_read(counter) != total_cpus && NOW() < timeout ) + cpu_relax(); + + /* Log & abort. */ + if ( atomic_read(counter) != total_cpus ) + { + printk(XENLOG_DEBUG "%s: %s %u/%u\n", xsplice_work.data->name, + s, atomic_read(counter), total_cpus); + rc = -EBUSY; + xsplice_work.data->rc = rc; + xsplice_work.do_work = 0; + smp_wmb(); + return rc; + } + return rc; +} + +static void xsplice_do_single(unsigned int total_cpus) +{ + nmi_callback_t saved_nmi_callback; + s_time_t timeout; + struct payload *data, *tmp; + int rc; + + data = xsplice_work.data; + timeout = data->timeout ? data->timeout : MILLISECS(30);The design doc says that a timeout of 0 means infinity.True. Lets update the document.+ printk(XENLOG_DEBUG "%s: timeout is %"PRI_stime"ms\n", data->name, + timeout / MILLISECS(1)); + + timeout += NOW(); + + if ( xsplice_do_wait(&xsplice_work.semaphore, timeout, total_cpus, + "Timed out on CPU semaphore") ) + return; + + /* "Mask" NMIs. */ + saved_nmi_callback = set_nmi_callback(mask_nmi_callback); + + /* All CPUs are waiting, now signal to disable IRQs. */ + xsplice_work.ready = 1; + smp_wmb(); + + atomic_inc(&xsplice_work.irq_semaphore); + if ( xsplice_do_wait(&xsplice_work.irq_semaphore, timeout, total_cpus, + "Timed out on IRQ semaphore.") ) + return; + + local_irq_disable();As far as I can tell, the mechanics of how this works haven't changed, the code has just been reorganized. Which means the points that Martin raised about this mechanism are still outstanding.A bit. I added the extra timeout on both of the 'spin-around' and also moved some of the barriers around. Also removed your spin-lock and used the atomic_t mechanism to synchronize. But the one thing that I didn't do was the spin on the 'workers?' that are just spinnig idly. They will do that forever if say the 'master' hasn't gone to the IRQ semaphore part. My thinking was that the 'workers' could also use the timeout feature but just multiple it by two? After looking at this again, I remembered that the algorithm I used is the same as the one used by stop_machine_run(). That function runs without timeouts at all (seemingly without problems), so why shouldn't this one? (The only reason stop_machine_run() itself isn't used for patching is because we need to enter the function without a stack, i.e. not from a tasklet.) -- Ross Lagerwall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |