Xen project Mailing List

Re: [Xen-devel] patch "x86/cpufreq: relocate the driver register function" breaks cpu hot(un)plug

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, Dario Faggioli <dario.faggioli@xxxxxxxxxx>

From: "Wang, Wei W" <wei.w.wang@xxxxxxxxx>

Date: Sat, 10 Oct 2015 01:38:36 +0000

Accept-language: en-US

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>

Delivery-date: Sat, 10 Oct 2015 01:39:11 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Thread-index: AQHRArJg5tsTjphejEqknCFdhtRgqJ5jDu2AgADh4HA=

Thread-topic: [Xen-devel] patch "x86/cpufreq: relocate the driver register function" breaks cpu hot(un)plug

On 09/10/2015 04:01, Konrad Rzeszutek Wilk wrote:
> On Fri, Oct 09, 2015 at 06:48:23PM +0200, Dario Faggioli wrote:
> > Hey,
> >
> > As far as my bisection goes, commit
> > 49388f11d512bb92706ce046643bfbb3c1d963c9 "x86/cpufreq: relocate the
> > driver register function" prevents me from hot unplugging pCPUs.
> >
> > Xen does not crash or anything, but dom0 is stalled. In fact, with
> > current staging, here's what I see:
> >
> > root@Zhaman:~# echo 0 > /sys/devices/system/xen_cpu/xen_cpu6/online
> > [ 81.583001] INFO: rcu_sched detected stalls on CPUs/tasks: { 12} (detected by 3, t=5252 jiffies, g=1691, c=1690, q=76)
> > [ 81.583036] Task dump for CPU 12:
> > [ 81.583044] bash R running task 0 1347 1094 0x00000008
> > [ 81.583056] ffffffff00000000 0000000000000000 0000000000000000 ffff8800192c2e38
> > [ 81.583070] ffff8800008472e8 0000000000000002 ffff8800008472e8 ffff880013817858
> > [ 81.583082] 0000000000000000 00000000000081a4 ffffffff811e8137 ffff8800192c2e38
> > [ 81.583095] Call Trace:
> > [ 81.583110] [<ffffffff811e8137>] ? notify_change+0x2f7/0x390
> > [ 81.583148] [<ffffffff811c8c74>] ? do_truncate+0x74/0x90
> > [ 81.583158] [<ffffffff811e2866>] ? dput+0x26/0x230
> > [ 81.583167] [<ffffffff811d53c5>] ? terminate_walk+0x35/0x40
> > [ 81.583176] [<ffffffff811d92b1>] ? do_last+0x621/0x12c0
> > [ 81.583188] [<ffffffff8139f0e7>] ? xen_pcpu_down+0x47/0x70
> > [ 81.583199] [<ffffffff8156c64d>] ? store_online+0x9d/0xb0
> > [ 81.583210] [<ffffffff81240bfc>] ? kernfs_fop_write+0x12c/0x180
> > [ 81.583220] [<ffffffff811ca513>] ? __vfs_write+0x23/0xf0
> > [ 81.583230] [<ffffffff811cd142>] ? __sb_start_write+0x42/0xf0
> > [ 81.583241] [<ffffffff8125f711>] ? security_file_permission+0x21/0xa0
> > [ 81.583250] [<ffffffff811caea1>] ? vfs_write+0xa1/0x1c0
> > [ 81.583259] [<ffffffff811c828f>] ? filp_close+0x4f/0x70
> > [ 81.583268] [<ffffffff811cbb12>] ? SyS_write+0x42/0xb0
> > [ 81.583277] [<ffffffff811e9031>] ? __close_fd+0x71/0xb0
> > [ 81.583287] [<ffffffff815780f2>] ? system_call_fastpath+0x16/0x75
> > [ 144.555020] INFO: rcu_sched detected stalls on CPUs/tasks: { 12} (detected by 4, t=21007 jiffies, g=1691, c=1690, q=244)
> > [ 144.555046] Task dump for CPU 12:
> > [ 144.555051] bash R running task 0 1347 1094 0x00000008
> > [ 144.555059] ffffffff00000000 0000000000000000 0000000000000000 ffff8800192c2e38
> > [ 144.555068] ffff8800008472e8 0000000000000002 ffff8800008472e8 ffff880013817858
> > [ 144.555076] 0000000000000000 00000000000081a4 ffffffff811e8137 ffff8800192c2e38
> > [ 144.555084] Call Trace:
> > [ 144.555096] [<ffffffff811e8137>] ? notify_change+0x2f7/0x390
> > [ 144.555105] [<ffffffff811c8c74>] ? do_truncate+0x74/0x90
> > [ 144.555112] [<ffffffff811e2866>] ? dput+0x26/0x230
> > [ 144.555118] [<ffffffff811d53c5>] ? terminate_walk+0x35/0x40
> > [ 144.555124] [<ffffffff811d92b1>] ? do_last+0x621/0x12c0
> > [ 144.555164] [<ffffffff8139f0e7>] ? xen_pcpu_down+0x47/0x70
> > [ 144.555172] [<ffffffff8156c64d>] ? store_online+0x9d/0xb0
> > [ 144.555179] [<ffffffff81240bfc>] ? kernfs_fop_write+0x12c/0x180
> > [ 144.555186] [<ffffffff811ca513>] ? __vfs_write+0x23/0xf0
> > [ 144.555192] [<ffffffff811cd142>] ? __sb_start_write+0x42/0xf0
> > [ 144.555200] [<ffffffff8125f711>] ? security_file_permission+0x21/0xa0
> > [ 144.555206] [<ffffffff811caea1>] ? vfs_write+0xa1/0x1c0
> > [ 144.555212] [<ffffffff811c828f>] ? filp_close+0x4f/0x70
> > [ 144.555217] [<ffffffff811cbb12>] ? SyS_write+0x42/0xb0
> > [ 144.555223] [<ffffffff811e9031>] ? __close_fd+0x71/0xb0
> > [ 144.555230] [<ffffffff815780f2>] ? system_call_fastpath+0x16/0x75
> >
> > If I revert that patch, the issue goes away.
> >
> > Any ideas?

Hi Dario,

Please also remove "register_cpu_notifier(&cpu_nfb)" from the
cpufreq_register_driver function. (I found that it is already
included in cpufreq_presmp_nfb().)

Best,
Wei

> I think it is due to xen-acpi-processor re-uploading the C and P states
> whenever a CPU goes up. It also does this after S3 suspend.
>
> Anyhow it may be due to the fact that cpufreq_register_driver in Xen is now
> '__init'. If you remove that little thing, would it work?
>
> >
> > Regards,
> > Dario
> >
> > PS. yes, I'll implement a cpu hotplug/unplug testcase ASAP. :-)
> >
> > --
> > <<This happens because I choose it to happen!>> (Raistlin Majere)
> > -----------------------------------------------------------------
> > Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> > Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
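
For illustration only: the suggestion to drop the second "register_cpu_notifier(&cpu_nfb)" call can be motivated with a toy model of a notifier chain. The sketch below is hypothetical code, not Xen's notifier implementation; every name in it (toy_notifier, toy_register, toy_call_chain, toy_demo) is invented for this example. It assumes a priority-sorted, singly linked chain and shows that registering the same block twice makes the block point at itself, so an unbounded walk over the chain would never finish. Whether this is the actual mechanism behind the stall reported above is not established in this thread.

```c
#include <stddef.h>

/* Hypothetical toy notifier block; NOT Xen's struct notifier_block. */
struct toy_notifier {
    int (*call)(struct toy_notifier *nb, unsigned long action);
    struct toy_notifier *next;
    int priority;
};

/* Callback that does nothing; stands in for a CPU hotplug handler. */
static int toy_cpu_callback(struct toy_notifier *nb, unsigned long action)
{
    (void)nb;
    (void)action;
    return 0;
}

/* Insert nb into the chain, keeping higher priorities first.
 * There is deliberately no check for "nb is already on the chain". */
static void toy_register(struct toy_notifier **head, struct toy_notifier *nb)
{
    while (*head != NULL && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Walk the chain and invoke each callback. Gives up and returns -1 once
 * `limit` callbacks have run and the chain still has not ended, which is
 * how this sketch detects a cycle. Otherwise returns the number of calls. */
static int toy_call_chain(struct toy_notifier *head, unsigned long action,
                          int limit)
{
    int calls = 0;

    for (struct toy_notifier *nb = head; nb != NULL; nb = nb->next) {
        if (calls == limit)
            return -1;
        nb->call(nb, action);
        calls++;
    }
    return calls;
}

/* Register one and the same block `times` times, then walk the chain. */
static int toy_demo(int times)
{
    struct toy_notifier nfb = { toy_cpu_callback, NULL, 0 };
    struct toy_notifier *chain = NULL;

    for (int i = 0; i < times; i++)
        toy_register(&chain, &nfb);
    return toy_call_chain(chain, 0, 1000);
}
```

In this model toy_demo(1) walks a one-element chain and returns 1, while toy_demo(2) hits the bound and returns -1, because the second toy_register() call sets nfb.next to &nfb.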
