
Re: sched=null vwfi=native and call_rcu()


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: George Dunlap <George.Dunlap@xxxxxxxxxx>
  • Date: Mon, 17 Jan 2022 11:05:02 +0000
  • Cc: Dario Faggioli <dfaggioli@xxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Juergen Gross <JGross@xxxxxxxx>, "bertrand.marquis@xxxxxxx" <bertrand.marquis@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, "Volodymyr_Babchuk@xxxxxxxx" <Volodymyr_Babchuk@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Delivery-date: Mon, 17 Jan 2022 11:05:31 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: sched=null vwfi=native and call_rcu()


> On Jan 14, 2022, at 9:01 PM, Stefano Stabellini <sstabellini@xxxxxxxxxx> 
> wrote:
> 
> On Fri, 14 Jan 2022, Dario Faggioli wrote:
>> On Thu, 2022-01-06 at 17:52 -0800, Stefano Stabellini wrote:
>>> On Thu, 6 Jan 2022, Julien Grall wrote:
>>>> 
>>>> This issue and its solution have been discussed numerous times on the ML. In
>>>> short, we want to tell the RCU that CPUs running in guest context are always
>>>> quiesced. For more details, you can read the previous thread (which also
>>>> contains a link to the one before):
>>>> 
>>>> https://lore.kernel.org/xen-devel/fe3dd9f0-b035-01fe-3e01-ddf065f182ab@codiax.se/
>>> 
>>> Thanks Julien for the pointer!
>>> 
>>> Dario, I forward-ported your three patches to staging:
>>> https://gitlab.com/xen-project/people/sstabellini/xen/-/tree/rcu-quiet
>>> 
>> Hi Stefano!
>> 
>> I definitely would like to see the end of this issue, so thanks a lot
>> for your interest and your help with the patches.
>> 
>>> I can confirm that they fix the bug. Note that I had to add a small
>>> change on top to remove the ASSERT at the beginning of
>>> rcu_quiet_enter:
>>> https://gitlab.com/xen-project/people/sstabellini/xen/-/commit/6fc02b90814d3fe630715e353d16f397a5b280f9
>>> 
>> Yeah, that should be fine.
>> 
>>> Would you be up for submitting them for upstreaming? I would prefer it if
>>> you sent out the patches, because I cannot claim to understand them
>>> completely (except for the one doing the renaming :-P )
>>> 
>> Haha! So, I am up for properly submitting, but there's one problem. As
>> you've probably got, the idea here is to use transitions toward the
>> guest and inside the hypervisor as RCU quiescence and "activation"
>> points.
>> 
>> Now, on ARM, that just meant calling rcu_quiet_exit() in
>> enter_hypervisor_from_guest() and calling rcu_quiet_enter() in
>> leave_hypervisor_to_guest(). Nice and easy, and even I --and I'm
>> definitely not an ARM person-- could understand it (although with some
>> help from Julien) and put the patches together.
>> 
>> However, the problem is really arch-independent and, although it does not
>> surface equally frequently, it affects x86 as well. And for x86 the
>> situation is nowhere near as nice when it comes to identifying all
>> the places from which to call rcu_quiet_{enter,exit}().
>> 
>> And finding out where to put them, among the various functions that we
>> have in the various entry.S variants, is where I stopped. The plan was
>> to get back to it but, shameful as it sounds, I have not been able to do
>> that yet.
>> 
>> So, if anyone wants to help with this, handing over suggestions for
>> potential good spots, that would help a lot.
> 
> Unfortunately I cannot volunteer due to time and also because I wouldn't
> know where to look and I don't have a reproducer or a test environment
> on x86. I would be flying blind.
> 
> 
>> Alternatively, we can submit the series as ARM-only... But I fear that
>> the x86 side of things would then be easily forgotten. :-(
> 
> I agree with you on this, but at the same time we are having problems
> with customers in the field -- it is not like we can wait to solve the
> problem on ARM any longer. And the issue is certainly far less likely to
> happen on x86 (there is no vwfi=native, right?). In other words, I think
> it is better to have half of the solution now to solve the worst part of
> the problem than to wait more months for a full solution.

An x86 equivalent of vwfi=native could be implemented easily, but AFAIK nobody
has asked for it yet.  I agree that we need to fix it for ARM, and so in the
absence of someone with the time to fix up the x86 side, I think fixing
ARM-only is the way to go.

It would be good if we could add appropriate comments warning anyone who
implements `hlt=native` on x86 about the problems they’ll face and how to fix
them.  I'm not sure of the best place to do that; perhaps in the VMX / SVM code
that sets up the exit for HLT &c?
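
For anyone reading the archive later, the idea discussed in this thread (treat a CPU running in guest context as quiesced, so a grace period does not wait for it) can be illustrated with a rough, self-contained toy model. This is only a sketch under stated assumptions, not the code from Dario's patches: the names rcu_quiet_enter() and rcu_quiet_exit() come from the thread, but their bodies here, as well as quiet_mask, pending_mask, grace_period_start(), rcu_report_qs(), grace_period_done() and the NR_CPUS value, are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8u  /* hypothetical; kept small for the example */

/* Bit n set: CPU n is in guest context and is treated as quiesced. */
static uint32_t quiet_mask;

/* Bit n set: the current grace period is still waiting for CPU n. */
static uint32_t pending_mask;

/*
 * Toy stand-in for the hook called on the way out to the guest
 * (per the thread, leave_hypervisor_to_guest() on ARM).
 */
static void rcu_quiet_enter(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    quiet_mask |= UINT32_C(1) << cpu;
    /* Leaving the hypervisor also counts as a quiescent state. */
    pending_mask &= ~(UINT32_C(1) << cpu);
}

/*
 * Toy stand-in for the hook called on entry from the guest
 * (per the thread, enter_hypervisor_from_guest() on ARM).
 */
static void rcu_quiet_exit(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    quiet_mask &= ~(UINT32_C(1) << cpu);
}

/* Start a grace period: wait only for online CPUs not in guest context. */
static void grace_period_start(uint32_t online_mask)
{
    pending_mask = online_mask & ~quiet_mask;
}

/* A CPU running hypervisor code reports that it passed a quiescent state. */
static void rcu_report_qs(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    pending_mask &= ~(UINT32_C(1) << cpu);
}

/* The grace period is over once no CPU is left to wait for. */
static bool grace_period_done(void)
{
    return pending_mask == 0;
}
```

In this model, if CPU 1 has called rcu_quiet_enter() and then never traps back into the hypervisor, a grace period started with CPUs 0 and 1 online waits for CPU 0 only, and completes as soon as CPU 0 reports. An x86 `hlt=native` would need equivalent enter/exit hooks on every guest entry and exit path, which is the part nobody has located yet.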

 -George


 

