Xen project Mailing List

Re: [Xen-devel] [PATCH RFC 0/N] xen: arm: rework early bring up

To: Julien Grall <julien.grall@xxxxxxxxxx>

From: Ian Campbell <ian.campbell@xxxxxxxxxx>

Date: Thu, 19 Sep 2013 16:51:26 +0100

Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Andre Przywara <andre.przywara@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Thu, 19 Sep 2013 15:51:44 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Thu, 2013-09-19 at 15:49 +0100, Julien Grall wrote: > On 09/18/2013 10:28 PM, Ian Campbell wrote: > > On Wed, 2013-09-18 at 22:14 +0100, Julien Grall wrote: > >> On 09/18/2013 09:22 PM, Ian Campbell wrote: > >>> On Wed, 2013-09-18 at 21:00 +0100, Julien Grall wrote: > >>>> On 09/18/2013 08:24 PM, Ian Campbell wrote: > >>>>> On Wed, 2013-09-18 at 20:22 +0100, Ian Campbell wrote: > >>>>>> On Wed, 2013-09-18 at 18:33 +0100, Julien Grall wrote: > >>>>>>> On 09/17/2013 02:37 AM, Ian Campbell wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> The following reworks early bring up on ARM to use a separate set of > >>>>>>>> boot pagetables. This simplifies things by avoiding the need to > >>>>>>>> bring up > >>>>>>>> all CPUs in lock step, which in turn allows us to do secondary CPU > >>>>>>>> bringup in C code. > >>>>>>>> > >>>>>>>> Unfortunately the main bulk of this change is a single large patch > >>>>>>>> which > >>>>>>>> is hard to decompose any further since it is basically pulling on the > >>>>>>>> thread and then knitting a new jumper from it. > >>>>>>>> > >>>>>>>> With these changes Xen now absolutely requires that the bootloader > >>>>>>>> calls > >>>>>>>> the hypervisor in HYP mode, the previous workarounds have been > >>>>>>>> removed. > >>>>>>>> For use on models a bootwrapper is now required. See > >>>>>>>> git://xenbits.xen.org/people/ianc/boot-wrapper.git > >>>>>>>> xen-arm32 > >>>>>>>> git://xenbits.xen.org/people/ianc/boot-wrapper-aarch64.git > >>>>>>>> xen-arm64 > >>>>>>>> > >>>>>>>> I have implemented support for CPU bringup on the fastmodel vexpress > >>>>>>>> platforms (v7and v8) here, I suppose it should work OK on a real > >>>>>>>> vexpress too but I've not tried it. > >>>>>>> > >>>>>>> I have just tried the patch series on the versatile express and it > >>>>>>> doesn't work. I'm using the u-boot from Andre and all the cpus already > >>>>>>> boot in hyp mode (checked without this patch series). > >>>>>> > >>>>>> :-( > >>>>>> > >>>>>> The code should essentially the same as the old kick cpus, but > >>>>>> reimplemented in C. The only difference I can spot is that the C > >>>>>> version > >>>>>> is kicking a single CPU at a time while the ASM version is kicking > >>>>>> everyone at once. Does using send_SGI_allbutself instead work as a > >>>>>> hack? > >>>>>> > >>>>>> Perhaps our idea of the CPUID is wrong or something? Your patches to > >>>>>> fixup the logical CPU id stuff might help? > >>>> > >>>> I have found the issue, in send_SGI_mask we remove the offline CPUs with > >>>> the following line: > >>>> mask &= cpumask_bits(&cpu_online_map); > >>>> > >>>> As the secondary cpu (assuming we have a board with only 2 cores) is not > >>>> yet online, we can't send SGI with our code. > >>>> I'm very surprised that the code works on the fast models. > >>> > >>> So am I, now. > >>> > >>> OK. So online mask == 0x1 and we are trying to wake mask 0x2, so the > >>> actual target list we end up using is 0... > >>> > >>> The gic manual says "When TargetListFilter is 0b00, if the CPUTargetList > >>> field is 0x00 the Distributor does not forward the interrupt to any CPU > >>> interface." (footnote to table 4-21), so it shouldn't be waking > >>> anything... It may be a model bug > >>> > >>>> To solve the issue, I think we need to create another helper which will > >>>> send an SGI wihout removing offline CPUs. > >>> > >>> Looking briefly at the callchains leading to send_SGI_mask I don't see > >>> any which deliberately pass an offline CPU (which would be pretty > >>> non-sensical apart from this one case). So we could perhaps just remove > >>> this check and make it the caller's problem. > >>> Or maybe the right check is for cpu_possible mask or something similar > >>> instead of online? > >> > >> You are right, cpu_possible mask is better. > >> I can modify this patch http://patches.linaro.org/20437/ to add the > >> check in gic_cpu_mask. > > > > Sounds reasonable. > > Hmm ... in fact this solution is not enough :/. > With my patch series "handle non-contiguous CPU ID" (mainly the patch #3 > http://patches.linaro.org/20437/), the gic cpu ID of the target cpu is > not yet initialize. > > Linux solves this issue by setting each gic cpu mask (our gic_cpu_id) to > 0xff and refine each time a new cpu boot. If we choose this solution > that would mean that all secondaries cpus will boot at the same time. > As I understand with your new boot code (patch #5), we are only able to > handle one cpu simultaneously. > > Perhaps, we need to reintroduce an holding pen which contains the MPIDR. Yeah, I was hoping to avoid that but it looks like the only sensible way forward. A least we can still do the kick late and directly onto the relocated xen, so it's not too bad. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.