[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC 0/N] xen: arm: rework early bring up



On Thu, 2013-09-19 at 15:49 +0100, Julien Grall wrote:
> On 09/18/2013 10:28 PM, Ian Campbell wrote:
> > On Wed, 2013-09-18 at 22:14 +0100, Julien Grall wrote:
> >> On 09/18/2013 09:22 PM, Ian Campbell wrote:
> >>> On Wed, 2013-09-18 at 21:00 +0100, Julien Grall wrote:
> >>>> On 09/18/2013 08:24 PM, Ian Campbell wrote:
> >>>>> On Wed, 2013-09-18 at 20:22 +0100, Ian Campbell wrote:
> >>>>>> On Wed, 2013-09-18 at 18:33 +0100, Julien Grall wrote:
> >>>>>>> On 09/17/2013 02:37 AM, Ian Campbell wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> The following reworks early bring up on ARM to use a separate set of
> >>>>>>>> boot pagetables. This simplifies things by avoiding the need to 
> >>>>>>>> bring up
> >>>>>>>> all CPUs in lock step, which in turn allows us to do secondary CPU
> >>>>>>>> bringup in C code.
> >>>>>>>>
> >>>>>>>> Unfortunately the main bulk of this change is a single large patch 
> >>>>>>>> which
> >>>>>>>> is hard to decompose any further since it is basically pulling on the
> >>>>>>>> thread and then knitting a new jumper from it.
> >>>>>>>>
> >>>>>>>> With these changes Xen now absolutely requires that the bootloader 
> >>>>>>>> calls
> >>>>>>>> the hypervisor in HYP mode, the previous workarounds have been 
> >>>>>>>> removed.
> >>>>>>>> For use on models a bootwrapper is now required. See
> >>>>>>>>           git://xenbits.xen.org/people/ianc/boot-wrapper.git 
> >>>>>>>> xen-arm32
> >>>>>>>>           git://xenbits.xen.org/people/ianc/boot-wrapper-aarch64.git 
> >>>>>>>> xen-arm64
> >>>>>>>>
> >>>>>>>> I have implemented support for CPU bringup on the fastmodel vexpress
> >>>>>>>> platforms (v7and v8) here, I suppose it should work OK on a real
> >>>>>>>> vexpress too but I've not tried it.
> >>>>>>>
> >>>>>>> I have just tried the patch series on the versatile express and it
> >>>>>>> doesn't work. I'm using the u-boot from Andre and all the cpus already
> >>>>>>> boot in hyp mode (checked without this patch series).
> >>>>>>
> >>>>>> :-(
> >>>>>>
> >>>>>> The code should essentially the same as the old kick cpus, but
> >>>>>> reimplemented in C. The only difference I can spot is that the C 
> >>>>>> version
> >>>>>> is kicking a single CPU at a time while the ASM version is kicking
> >>>>>> everyone at once. Does using send_SGI_allbutself instead work as a 
> >>>>>> hack?
> >>>>>>
> >>>>>> Perhaps our idea of the CPUID is wrong or something? Your patches to
> >>>>>> fixup the logical CPU id stuff might help?
> >>>>
> >>>> I have found the issue, in send_SGI_mask we remove the offline CPUs with
> >>>> the following line:
> >>>>     mask &= cpumask_bits(&cpu_online_map);
> >>>>
> >>>> As the secondary cpu (assuming we have a board with only 2 cores) is not
> >>>> yet online, we can't send SGI with our code.
> >>>> I'm very surprised that the code works on the fast models.
> >>>
> >>> So am I, now.
> >>>
> >>> OK. So online mask == 0x1 and we are trying to wake mask 0x2, so the
> >>> actual target list we end up using is 0...
> >>>
> >>> The gic manual says "When TargetListFilter is 0b00, if the CPUTargetList
> >>> field is 0x00 the Distributor does not forward the interrupt to any CPU
> >>> interface." (footnote to table 4-21), so it shouldn't be waking
> >>> anything... It may be a model bug
> >>>
> >>>> To solve the issue, I think we need to create another helper which will
> >>>> send an SGI wihout removing offline CPUs.
> >>>
> >>> Looking briefly at the callchains leading to send_SGI_mask I don't see
> >>> any which deliberately pass an offline CPU (which would be pretty
> >>> non-sensical apart from this one case). So we could perhaps just remove
> >>> this check and make it the caller's problem.
> >>> Or maybe the right check is for cpu_possible mask or something similar
> >>> instead of online?
> >>
> >> You are right, cpu_possible mask is better.
> >> I can modify this patch http://patches.linaro.org/20437/ to add the 
> >> check in gic_cpu_mask.
> > 
> > Sounds reasonable.
> 
> Hmm ... in fact this solution is not enough :/.
> With my patch series "handle non-contiguous CPU ID" (mainly the patch #3
> http://patches.linaro.org/20437/), the gic cpu ID of the target cpu is
> not yet initialize.
> 
> Linux solves this issue by setting each gic cpu mask (our gic_cpu_id) to
> 0xff and refine each time a new cpu boot. If we choose this solution
> that would mean that all secondaries cpus will boot at the same time.
> As I understand with your new boot code (patch #5), we are only able to
> handle one cpu simultaneously.
> 
> Perhaps, we need to reintroduce an holding pen which contains the MPIDR.

Yeah, I was hoping to avoid that but it looks like the only sensible way
forward.

A least we can still do the kick late and directly onto the relocated
xen, so it's not too bad.

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.