Re: [Xen-devel] [qemu-upstream-4.11-testing test] 136184: regressions - FAIL
Hi Stefano,

On 6/5/19 9:29 PM, Stefano Stabellini wrote:
On Wed, 5 Jun 2019, Julien Grall wrote:
Hi Stefano,

On 05/06/2019 00:11, Stefano Stabellini wrote:
On Tue, 4 Jun 2019, Julien Grall wrote:
On 6/4/19 6:39 PM, Stefano Stabellini wrote:
On Tue, 4 Jun 2019, Julien Grall wrote:

No, this patch is introducing another source of TLB conflict if the processor is caching intermediate translations (this is implementation defined).

By "another source of TLB conflict" are you referring to something new that wasn't there before? Or are you referring to the fact that we are still not following the proper sequence to update the Xen pagetable? If you are referring to the latter, wouldn't it be reasonable to say that such a problem could also have happened before 00c96d7742?

It is existent but in a different form. I can't tell whether this is bad or not because the re-ordering of the code (and therefore memory access) will affect how TLBs are used. So it is a bit of gambling here.

If I read this right, this is the same underlying issue but due to the re-ordering of the code, it could manifest differently. For instance the impact on cache lines could be different.

I am sorry, but how did you come up with a cache line difference here? It has nothing to do with cache lines; it just has to do with how the TLBs are filled at a given point. If you re-order memory access, then you may as well have a different state of the TLBs at a given point.

Is this the case? If so, I think this is a tolerable risk, as other things could affect it too, such as CONFIG options being enabled/disabled, as we have just seen with CONFIG_LIVEPATCH. It is almost "random".

See above. But yes it is almost random. The bug reported by osstest actually taught me that even if Xen may boot today on a given platform, this may not be the case tomorrow because of a slight change in the code ordering (and therefore memory access).
/!\ Below is my interpretation and does not imply I am correct ;)

However, such Arm Arm violations are mostly gathered around boot and shouldn't affect runtime. IOW, Xen would stop booting on those platforms rather than becoming unreliable. So it would not be too bad.

/!\ End

We just have to be aware of the risk we are taking with backporting the patch.

What you wrote here seems to make sense but I would like to understand the problem mentioned earlier a bit better.

What about the other older staging branches?

The only one we could consider is 4.10, but AFAICT Jan has already cut the last release for it. So I wouldn't consider any backport unless we begin to see the branch failing.

If Jan already made the last release for 4.10, then there is little point in backporting to it. However, it is not ideal to have something like 00c96d7742 in some still-maintained staging branches but not all.

Jan pointed out it is not yet released. However, we didn't get any reports of problems (aside from the Arm Arm violation) with Xen 4.10 today. So I would rather avoid such a backport in a final point release, as we risk making it more broken than it is today. I find this acceptable for Xen 4.11 because it has been proven to help. We also still have point releases afterwards if this goes wrong.

If we do the backport, I would prefer to backport it to both trees, for consistency, and because there might be machines out there where 4.10 doesn't boot with the wrong kconfig. This patch should decrease the risk of breakage.

The counterpoint here is that Xen 4.10 is going to be out of support in a few weeks. If you are about to use Xen 4.10 for your new product, then you have already made the wrong choice. Why would you use an out-of-support release? If you already use Xen 4.10, then you are probably fine running this release on your platform. Why would you take the risk of breaking it? Note that Osstest does not test Xen 4.10 (or earlier) on Thunder-X, so this does not need to be factored into the decision.
However, I see your point too. This is a judgement call -- we do not have enough data but we have to make a decision anyway. No way to tell which way is best "scientifically".

I also understand your point, however it is a bit worrying that not having enough data means we are happy to backport a patch in a final point release. I would have thought more caution would be applied during backports.

My vote is to backport to both. Jan/others please express your opinion.

To follow the vote convention:

4.11: -1
4.10: -1 (I was tempted by a -2 but if the others feel it should be backported then I will not push back).

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel