Re: [Xen-devel] [PATCH] xen: sched: fix (ACPI S3) resume with cpupools with different schedulers.
On Fri, 2015-11-13 at 03:41 -0700, Jan Beulich wrote:
> > > > On 13.11.15 at 11:08, <dario.faggioli@xxxxxxxxxx> wrote:
> > Fix things by properly deallocating scheduler specific
> > data of the pCPU's pool scheduler during pCPU teardown,
> > and re-allocating them --always for &ops-- during pCPU
> > bringup.
>
> Avoiding this was done for a reason iirc: What if one such allocation
> fails (e.g. due to heap fragmentation resulting from the allocation
> order not being the exact inverse of the freeing order)?
>
I don't think I'm getting the point. cpu_schedule_up() allocates, with the alloc_pdata hook, stuff that was already freed by cpu_schedule_down() (with the free_pdata hook). It actually allocates two things, the other one being the pCPU's idle vCPU itself, but that happens only if idle_vcpu[cpu] is NULL, which should not be the case during suspend/resume.

In any case, if one of these allocations fails, it already returns ENOMEM. What happens then is, of course, the caller's call. If it happens during boot for pCPU 0, we BUG(). If it happens with non-boot pCPUs (either at boot, during hotplug, or during suspend/resume), we fail the bringup of that pCPU at the CPU_UP_PREPARE phase.

This patch introduces one more occasion for such a failure to occur, sure, but only if there really is no (proper) memory. Without this patch, the system crashes already, even when there is plenty of (proper) memory.

Furthermore, at a later point on the pCPU bringup path, schedule_cpu_switch() is called. In the scenario described in the changelog, what it will do is deallocate the existing scheduler-specific per-vCPU data and allocate a new one. If the allocation fails at that point, we fail the bringup at the CPU_ONLINE stage and (in cpu_up()) we BUG(). That does not really look better to me than what happens with the patch applied... Quite the opposite, TBH.

Actually, re-looking at schedule_cpu_switch(), the situation is probably even worse than how I described it in the changelog! In fact, still in the scenario introduced there, when we actually get to call schedule_cpu_switch() (during CPU_ONLINE), the pCPU --let it be pCPU 6-- is in the following state:
 - it is still formally in cpupool1, but its scheduler is set to &ops, i.e., to Credit1,
 - idle_vcpu[6]->sched_priv points to Credit2 vCPU data,
 - it is being switched to pool1, with ops == Credit2.

So, schedule_cpu_switch() itself will (see also the sketch below):
 - allocate a new Credit2 per-vCPU data structure,
 - make idle_vcpu[6]->sched_priv point to such new structure,
 - try to deallocate the still existing Credit2 per-vCPU data structure with Credit1's (old_ops, in the function) free_vdata hook!!
 - bad things (or so I expect!)
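To make that sequence easier to follow, here is a condensed sketch of the relevant part of schedule_cpu_switch(), paraphrased from memory rather than quoted verbatim (error handling and locking are left out, and the local names are only for illustration):

    /* schedule_cpu_switch(cpu, c), heavily condensed and paraphrased. */
    old_ops = per_cpu(scheduler, cpu);     /* Credit1 (&ops), in the example   */
    new_ops = c->sched;                    /* Credit2, i.e., pool1's scheduler */
    idle    = idle_vcpu[cpu];

    /* New per-pCPU and (idle) per-vCPU data, allocated with new_ops' hooks. */
    ppriv = SCHED_OP(new_ops, alloc_pdata, cpu);
    vpriv = SCHED_OP(new_ops, alloc_vdata, idle, idle->domain->sched_priv);

    /* Swap them in place of the old ones. */
    vpriv_old = idle->sched_priv;          /* in our scenario: Credit2 data!   */
    idle->sched_priv = vpriv;
    ppriv_old = per_cpu(schedule_data, cpu).sched_priv;
    per_cpu(schedule_data, cpu).sched_priv = ppriv;
    per_cpu(scheduler, cpu) = new_ops;

    /*
     * Free the old ones with old_ops' hooks: in our scenario, that means
     * handing Credit2's per-vCPU data to Credit1's free_vdata. Ouch.
     */
    SCHED_OP(old_ops, free_vdata, vpriv_old);
    SCHED_OP(old_ops, free_pdata, ppriv_old, cpu);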
Note, in particular, the next to last item of that list: we'd be calling Credit1's csched_free_vdata() to deallocate something that was allocated with Credit2's csched_alloc_vdata()... which looks like another rather interesting bug to me (I haven't been able to trigger it, most likely because the one shown in the changelog manifests first, but I can try, if we're curious :-D).

So, in summary, this patch fixes two bugs, one that is showing and one that is latent... Not bad! :-)

Fact is, there is an inconsistency between the pCPU's scheduler (set in cpu_schedule_up() to Credit1, in the example) and the pCPU's idle vCPU's private data (Credit2 data, in the example) which, if left in place, could explode somewhere else, at some future point. In my opinion, the best solution is to remove that inconsistency, and that is what the patch tries to do.

Another idea could be to try to completely and fully initialize the _proper_ scheduler for the pCPU in cpu_schedule_up(), but it is not guaranteed that this is doable for all schedulers; e.g., it is impossible for Credit2. Maybe, since it is only the inconsistency in the status of the idle vCPU that triggers the bug, we could at least set that straight in cpu_schedule_up() (but I'd have to try), along the lines of the sketch below... As I said, I'd have to check, and I'm fine with trying, if it is deemed worthwhile.
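Just to make that last idea a bit more concrete, what I have in mind is something along these lines. This is a purely hypothetical, untested sketch: the exact placement within cpu_schedule_up(), the error handling, and how to get hold of the pool's scheduler at that point (pool_sched below is just a placeholder) would all still need to be worked out:

    /*
     * Hypothetical: at (or near) the end of cpu_schedule_up(), if the idle
     * vCPU still carries private data belonging to the pool's scheduler,
     * free it with that scheduler's hook and re-allocate it for &ops,
     * which is the scheduler the pCPU is coming up with.
     */
    if ( idle_vcpu[cpu]->sched_priv != NULL )
    {
        SCHED_OP(pool_sched, free_vdata, idle_vcpu[cpu]->sched_priv);
        idle_vcpu[cpu]->sched_priv =
            SCHED_OP(&ops, alloc_vdata, idle_vcpu[cpu],
                     idle_vcpu[cpu]->domain->sched_priv);
        if ( idle_vcpu[cpu]->sched_priv == NULL )
            return -ENOMEM;
    }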
Let me know what you think.

Regards,
Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)