
Re: [Xen-devel] [xen-unstable test] 64494: regressions - FAIL



On 19/11/15 13:19, Wei Liu wrote:
> On Thu, Nov 19, 2015 at 12:47:41PM +0100, Juergen Gross wrote:
>> On 19/11/15 11:30, Ian Campbell wrote:
>>> On Thu, 2015-11-19 at 11:24 +0100, Juergen Gross wrote:
>>>> On 18/11/15 15:49, Wei Liu wrote:
>>>>> Hi Juergen
>>>>>
>>>>> Looks like there is something we missed after all.
>>>>>
>>>>> On Wed, Nov 18, 2015 at 02:31:57PM +0000, osstest service owner wrote:
>>>>>> flight 64494 xen-unstable real [real]
>>>>>> http://logs.test-lab.xenproject.org/osstest/logs/64494/
>>>>>>
>>>>>> Regressions :-(
>>>>>>
>>>>>> Tests which did not succeed and are blocking,
>>>>>> including tests which could not be run:
>>>>>
>>>>>>  test-amd64-amd64-i386-pvgrub 10 guest-start  fail REGR. vs. 64035
>>>>>
>>>>> Nov 18 05:11:19.753014 (d2) Bootstrapping...
>>>>> Nov 18 05:11:19.769108 (d2) Xen Minimal OS!
>>>>> Nov 18 05:11:19.769134 (d2)   start_info: 0xa13000(VA)
>>>>> Nov 18 05:11:19.769158 (d2)     nr_pages: 0x20000
>>>>> Nov 18 05:11:19.777046 (d2)   shared_inf: 0xca1fc000(MA)
>>>>> Nov 18 05:11:19.777072 (d2)      pt_base: 0xa16000(VA)
>>>>> Nov 18 05:11:19.785042 (d2) nr_pt_frames: 0xb
>>>>> Nov 18 05:11:19.785077 (d2)     mfn_list: 0x993000(VA)
>>>>> Nov 18 05:11:19.785108 (d2)    mod_start: 0x0(VA)
>>>>> Nov 18 05:11:19.785135 (d2)      mod_len: 0
>>>>> Nov 18 05:11:19.793047 (d2)        flags: 0x0
>>>>> Nov 18 05:11:19.793077 (d2)     cmd_line: (hd0,0)/boot/grub/menu.lst
>>>>> Nov 18 05:11:19.793108 (d2)        stack: 0x972580-0x992580
>>>>> Nov 18 05:11:19.801150 (d2) MM: Init
>>>>> Nov 18 05:11:19.801181 (d2)       _text: 0x0(VA)
>>>>> Nov 18 05:11:19.801197 (d2)      _etext: 0x7b22d(VA)
>>>>> Nov 18 05:11:19.809104 (d2)    _erodata: 0xa4000(VA)
>>>>> Nov 18 05:11:19.809123 (d2)      _edata: 0xa81a8(VA)
>>>>> Nov 18 05:11:19.809138 (d2) stack start: 0x972580(VA)
>>>>> Nov 18 05:11:19.817062 (d2)        _end: 0x992b30(VA)
>>>>> Nov 18 05:11:19.817099 (d2)   start_pfn: a24
>>>>> Nov 18 05:11:19.817125 (d2)     max_pfn: 20000
>>>>> Nov 18 05:11:19.825037 (d2) Mapping memory range 0x1000000 - 0x20000000
>>>>> Nov 18 05:11:19.825071 (d2) setting 0x0-0xa4000 readonly
>>>>> Nov 18 05:11:19.825100 (d2) skipped 1000
>>>>> Nov 18 05:11:19.833049 (d2) MM: Initialise page allocator for b1c000(b1c000)-20000000(20000000)
>>>>> Nov 18 05:11:19.833089 (d2) Page fault at linear address c00008, eip 5fc70, regs 0x98ff28, sp b1c000, our_sp 0x98fefc, code 2
>>>>> Nov 18 05:11:19.849044 (d2) Page fault in pagetable walk (access to invalid memory?).
>>>>>
>>>>> The pvgrub in use is 32 bit. 64 bit (which I myself tested) seemed to
>>>>> be working fine.
>>>>
>>>> Okay, I'm hitting this issue, too. I'll investigate further.
>>>
>>> Do we want to revert $something in the meantime? If so, what...
>>>
>>
>> The problem is really located in pvgrub:
>>
> 
> To be precise, the problem is in mini-os, which is used by rump kernel
> as well. :-(
> 
>> pvgrub is making assumptions about the page table allocation scheme
>> of the toolset starting pvgrub. It is calculating the first not yet
>> mapped pfn by:
>>
>> pfn_to_map =
>>     (start_info.nr_pt_frames - NOT_L1_FRAMES) * L1_PAGETABLE_ENTRIES;
>>
>> NOT_L1_FRAMES is 3 for 64 bit and 32 bit:
>>
>> 64 bit: 1 level 4 + 1 level 3 + 1 level 2
>> 32 bit: 1 level 3 + 2 level 2 (assuming level 2 pt are allocated
>>   for the first and the last GB)
>>
>> This is wrong now, as for 32 bit I'm allocating all level 2 page
>> tables from the lowest needed address up to 0xffffffff, resulting
>> in 4 level 2 page tables in pvgrub case.
>>
>> Setting NOT_L1_FRAMES to 5 for 32 bit pvgrub makes the system boot
>> again.
>>
>> I think it is wrong for pvgrub to assume a fixed allocation scheme
>> of the page tables. The question is how to fix it: either via
>> changing NOT_L1_FRAMES or by doing it in a clean but more complicated
>> way.
>>
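
For reference, plugging the numbers from the log above into that formula
gives a rough picture of the mismatch. This is only a sketch: the helper
names first_unmapped_pfn() and pfn_to_addr() are made up for illustration,
and it assumes 32 bit PAE with 512 entries per L1 table and 4 KiB pages.

```c
#include <assert.h>

/* Assumption: 32 bit PAE, 512 entries per L1 page table, 4 KiB pages. */
#define L1_PAGETABLE_ENTRIES 512UL
#define PAGE_SHIFT           12

/* Same formula as quoted above, with the number of non-L1 frames passed
 * in so that the old and the new value can be compared. */
static unsigned long first_unmapped_pfn(unsigned long nr_pt_frames,
                                        unsigned long not_l1_frames)
{
    assert(nr_pt_frames >= not_l1_frames);   /* guard against underflow */
    return (nr_pt_frames - not_l1_frames) * L1_PAGETABLE_ENTRIES;
}

/* Convert a page frame number to the address of its first byte. */
static unsigned long pfn_to_addr(unsigned long pfn)
{
    return pfn << PAGE_SHIFT;
}
```

With nr_pt_frames = 0xb from the log, NOT_L1_FRAMES = 3 yields pfn 0x1000,
i.e. address 0x1000000, which is where "Mapping memory range" starts. With
5 non-L1 frames the result is pfn 0xc00, i.e. address 0xc00000, which would
be consistent with the page fault at linear address c00008.
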
> 
> In the meantime, shall we revert the series and think about this a bit
> more?
> 
> How complicated will a clean fix be?

I think I can fix it today. I can just count the not_l1_frames
dynamically by walking through the higher level page tables. This
will work for 32 and 64 bit and with and without my patch series
and even for 32 bit mini-os started via grub-xen.
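
Roughly, the counting could look like the sketch below. It is a toy model
and not the actual mini-os change: the page tables live in a simulated
frame pool, and all names (pt_page, frames, alloc_frame, set_entry,
count_not_l1, build_pae_example) are hypothetical. Real code would have to
translate the machine frame number in each entry to a virtual address
instead of indexing an array.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES     512
#define PTE_PRESENT 0x1ULL
#define PAGE_SHIFT  12
#define MAX_FRAMES  16

/* One page table page; the pool index stands in for the frame number. */
typedef struct { uint64_t e[ENTRIES]; } pt_page;

static pt_page frames[MAX_FRAMES];   /* simulated, zero-initialised pool */
static unsigned next_free;

static unsigned alloc_frame(void)
{
    assert(next_free < MAX_FRAMES);
    return next_free++;
}

/* Make entry idx of table parent point at table child. */
static void set_entry(unsigned parent, unsigned idx, unsigned child)
{
    frames[parent].e[idx] = ((uint64_t)child << PAGE_SHIFT) | PTE_PRESENT;
}

/* Count the page table frames above level 1 reachable from frame,
 * which itself sits at the given level. */
static unsigned count_not_l1(unsigned frame, int level)
{
    unsigned count = 1;              /* this frame itself */

    if (level < 2)
        return 0;                    /* L1 tables are not counted */
    if (level == 2)
        return 1;                    /* children are L1 tables: stop here */
    for (unsigned i = 0; i < ENTRIES; i++) {
        uint64_t e = frames[frame].e[i];
        if (e & PTE_PRESENT)
            count += count_not_l1((unsigned)(e >> PAGE_SHIFT), level - 1);
    }
    return count;
}

/* Layout from this thread: 1 level 3 + 4 level 2 tables, plus 6 level 1
 * tables (all under the first level 2 table, to keep the model simple). */
static unsigned build_pae_example(void)
{
    unsigned l3 = alloc_frame();

    for (unsigned i = 0; i < 4; i++) {
        unsigned l2 = alloc_frame();
        set_entry(l3, i, l2);
        if (i == 0)
            for (unsigned j = 0; j < 6; j++)
                set_entry(l2, j, alloc_frame());
    }
    return l3;
}
```

Calling count_not_l1(build_pae_example(), 3) on this model gives 5, so the
number of L1 frames is nr_pt_frames minus the counted value rather than
minus a fixed constant.
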

> Does changing NOT_L1_FRAMES break mini-os boot with toolstack prior to
> your series?

Yes.

Juergen




 

