Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range
Hi Stefano,

On 21/11/2019 18:50, Stefano Stabellini wrote:
> On Tue, 19 Nov 2019, Julien Grall wrote:
>> Hi Andre,
>>
>> On 12/11/2019 12:46, Andre Przywara wrote:
>>> On Mon, 11 Nov 2019 11:01:07 -0800 (PST) Stefano Stabellini
>>> <sstabellini@xxxxxxxxxx> wrote:
>>>> On Sat, 9 Nov 2019, Julien Grall wrote:
>>>>> On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@xxxxxxxxxx> wrote:
>>>>>> On Thu, 7 Nov 2019, Peng Fan wrote:
>>>>>>> The end should be GICD_ISACTIVERN not GICD_ISACTIVER.
>>>>>>>
>>>>>>> Signed-off-by: Peng Fan <peng.fan@xxxxxxx>
>>>>>>
>>>>>> Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>>>>>
>>>>> To be honest, I am not sure the code is correct. A read of those
>>>>> registers should tell you the list of active interrupts. As we
>>>>> always return 0, this will not reflect the correct state of the GIC.
>>>>>
>>>>> I know that returning the list of active interrupts is complicated
>>>>> with the old vGIC, but I don't think silently ignoring it is a good
>>>>> idea.
>>>>
>>>> The question here is why the guest accessed those registers. What is
>>>> it trying to figure out?
>>>
>>> I see Linux querying the active state (IRQCHIP_STATE_ACTIVE) at two
>>> relevant points for ARM:
>>> - In kernel/irq/manage.c, in __synchronize_hardirq().
>>> - In KVM's arch timer emulation code.
>>> I think the latter is of no concern (yet), but the first might
>>> actually trigger. At the moment it's beyond me what it actually does,
>>> but maybe some IRQ changes (RT, threaded IRQs?) trigger this now?
>>
>> It happens I am sitting right next to Marc now, so I had a chat with
>> him about this :). Let me try to summarize the discussion here.
>>
>> __synchronize_hardirq() is used to ensure that all active interrupts
>> have been handled before continuing. When sync_chip == false, we only
>> ensure that all in-progress interrupts (from Linux's PoV) are handled
>> before continuing. sync_chip == true will additionally ensure that any
>> interrupts that were acknowledged, but not yet marked as in progress
>> by the kernel, are also handled. The assumption is that this is called
>> after the interrupt has been masked/disabled.
>> The call to __synchronize_hardirq() in free_irq() (as reported by
>> Peng's stack trace) was introduced recently (see [1]). It is not
>> entirely clear whether this would affect anyone using Linux 5.4 or
>> just a limited subset of users.
>>
>> Anyhow, this is a genuine bug, and I think returning 0 is only
>> papering over it with no long-term resolution. As Marc pointed out,
>> even the old vGIC in KVM was not spec compliant, and thankfully this
>> was fixed in the new vGIC.
>>
>> As I said in a different sub-thread, I would reluctantly be OK with
>> returning 0 as long as we add a warning for *every* access. Long-term,
>> the current vGIC should really get killed.
>
> I appreciate your intention of raising awareness and getting help in
> fixing things in the community, which is the right thing to do.
> However, I am doubtful of the usefulness of a warning to achieve the
> goal. Maybe it would be best to curate an "open issues" section
> somewhere, even just as an email after every release or as part of the
> release notes, or as a jira ticket, or a wikipage, or a document under
> docs/.

The state of the brokenness of the vGIC has been documented numerous
times on the ML (see [1]) and on JIRA (see XEN-92). And there are
probably more issues not reported (Marc mentioned a new one during the
week). Yet no-one ever looked at finishing the new vGIC. So allow me to
doubt that the documentation is going to help here...

Actually, an "open issues" document under docs/ could be a good idea
for this and other similar items.

> A warning is a blunt instrument that lacks the subtleties necessary to
> raise attention in the right way and achieve the goal of getting help
> and contributions. In particular, it risks causing problems and
> confusion for less knowledgeable users. Maybe a dprintk warning
> message (only in DEBUG builds) could avoid some of the issues, while
> still gaining the attention of savvy developers who could understand
> what it means. But I think that the "open issues" document would be
> more effective.
See above, this is not the first time the state of the vGIC has been
raised. While I agree this is a blunt instrument, all the other options
have had no effect so far. So with this solution, maybe someone will
soon realize there is effort to be put into the core architecture of
Xen.

Now, if you suggest a plan (a person, a date of completion...) for
fixing the vGIC properly, then I would be more willing to accept the
workaround suggested here.

Cheers,

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-02/msg00784.html

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel