[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] PATastic fun



On 26.02.2013 18:53, Konrad Rzeszutek Wilk wrote:
> On Tue, Feb 26, 2013 at 12:43:13PM -0500, Konrad Rzeszutek Wilk wrote:
>> On Tue, Feb 26, 2013 at 06:11:42PM +0100, Stefan Bader wrote:
>>> On 26.02.2013 18:03, Konrad Rzeszutek Wilk wrote:
>>>> On Tue, Feb 26, 2013 at 04:32:37PM +0100, Stefan Bader wrote:
>>>>> On 25.02.2013 10:10, Stefan Bader wrote:
>>>>>> On 25.02.2013 04:15, Liu, Jinsong wrote:
>>>>>>> Konrad Rzeszutek Wilk wrote:
>>>>>>>> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote:
>>>>>>>>> Hi Konrad,
>>>>>>>>
>>>>>>>> Hey Stefan,
>>>>>>>>>
>>>>>>>>> here is another one from the hm-what? department:
>>>>>>>>
>>>>>>>> Heh - the really good-bug-hunting one. Lets also include Jinsong as
>>>>>>>> he has been tracking a similar one with mcelog.
>>>>>>>>>
>>>>>>>>> Colin discovered that running the attached program with the fork
>>>>>>>>> active (e.g. "./mmap-example -f 0x10000", the address can be that or
>>>>>>>>> iomem) this triggers the following weird messages: 
>>>>>>>>>
>>>>>>>>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type
>>>>>>>>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus
>>>>>>>>> [ 6824.453776] ------------[ cut here ]------------
>>>>>>>>> [ 6824.453796] WARNING: at
>>>>>>>>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774
>>>>>>>>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm:
>>>>>>>>> mmap-example Tainted: GF 
>>>>>>>>> 3.8.0-6-generic #13-Ubuntu
>>>>>>>>> [ 6824.453926] Call Trace:
>>>>>>>>> [ 6824.453944]  [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0
>>>>>>>>> [ 6824.453954]  [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20
>>>>>>>>> [ 6824.453963]  [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0
>>>>>>>>> [ 6824.453975]  [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100
>>>>>>>>> [ 6824.453985]  [<ffffffff81157459>] unmap_vmas+0x49/0x90
>>>>>>>>> [ 6824.453995]  [<ffffffff8115f808>] exit_mmap+0x98/0x170
>>>>>>>>> [ 6824.454007]  [<ffffffff810559a4>] mmput+0x64/0x100
>>>>>>>>> [ 6824.454017]  [<ffffffff810560f5>] dup_mm+0x445/0x660
>>>>>>>>> [ 6824.454027]  [<ffffffff81056d9f>]
>>>>>>>>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] 
>>>>>>>>> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] 
>>>>>>>>> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] 
>>>>>>>>> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] 
>>>>>>>>> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076]
>>>>>>>>> ---[ end trace 4918cdd0a4c9fea4 ]--- 
>>>>>>>>>
>>>>>>>>> I found that this is related to your bandaid patch
>>>>>>>>>
>>>>>>>>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1
>>>>>>>>> Author: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>>>>>>>>> Date:   Fri Feb 10 09:16:27 2012 -0500
>>>>>>>>>
>>>>>>>>>     xen/pat: Disable PAT support for now.
>>>>>>>>>
>>>>>>>>> I just do not understand how this happens. From the trace it seems
>>>>>>>>> the fork 
>>>>>>>>> fails when duplicating the VMAs (dup_mm calls mmput on failure). So
>>>>>>>>> maybe the 
>>>>>>>>> warning is just related to this. So primarily the question is how on
>>>>>>>>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are
>>>>>>>>> cleared from the supported 
>>>>>>>>> mask by the patch, so somehow I would think nothing should be able
>>>>>>>>> to set it... 
>>>>>>>>> But apparently not so.
>>>>>>>>> Not sure it is a big deal since I never saw this in normal operation
>>>>>>>>> and it 
>>>>>>>>> seems to be ok when unapping before doing the fork. It is just plain
>>>>>>>>> odd. 
>>>>>>>>
>>>>>>>> Jinsong mentioned that there is some oddity with the MTRR. Somehow the
>>>>>>>> ranges are swapped or not correct. Jinsong, could you shed some light
>>>>>>>> on what you have found so far?
>>>>>>>>
>>>>>>>
>>>>>>> Yes, Sander once also reported a similar weird warning when start 
>>>>>>> mcelog daemon, as attached.
>>>>>>>
>>>>>>> Basically, it occurs when mcelog user daemon start, 
>>>>>>> do_fork
>>>>>>>   --> copy_process
>>>>>>>     --> dup_mm
>>>>>>>       --> dup_mmap
>>>>>>>         --> copy_page_range
>>>>>>>           --> track_pfn_copy
>>>>>>>             --> reserve_pfn_range
>>>>>
>>>>> So that makes it clearer as this will do
>>>>>
>>>>> reserve_memtype(...)
>>>>> --> pat_x_mtrr_type
>>>>>   --> mtrr_type_lookup
>>>>>     --> __mtrr_type_lookup
>>>>>
>>>>> And that can return -1/0xff in case of mtrr not being 
>>>>> enabled/initialized. Which
>>>>> is not the case (given there are no messages for it in dmesg). This is 
>>>>> not equal
>>>>> to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS.
>>>>>
>>>>> It looks like the problem starts early in reserve_memtype:
>>>>>
>>>>>         if (!pat_enabled) {
>>>>>                 /* This is identical to page table setting without PAT */
>>>>>                 if (new_type) {
>>>>>                         if (req_type == _PAGE_CACHE_WC)
>>>>>                                 *new_type = _PAGE_CACHE_UC_MINUS;
>>>>>                         else
>>>>>                                 *new_type = req_type & _PAGE_CACHE_MASK;
>>>>>                 }
>>>>>                 return 0;
>>>>>         }
>>>>>
>>>>> This would be what we want, but only clearing the PWT and PCD flags from 
>>>>> the
>>>>> supported flags is not changing pat_enabled (which is 1 when PAT support 
>>>>> is
>>>>> compiled into the kernel). Unfortunately the variable is local and since 
>>>>> there
>>>>> are not any messages about PAT in dmesg I would say pat_init() is not 
>>>>> called
>>>>> either. Which might be used to disable PAT support by clearing the CPU 
>>>>> feature
>>>>> flag.
>>>>
>>>> Hm, so:
>>>>
>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>> index 39928d1..9ac70c5 100644
>>>> --- a/arch/x86/xen/enlighten.c
>>>> +++ b/arch/x86/xen/enlighten.c
>>>> @@ -379,6 +379,7 @@ static void __init xen_init_cpuid_mask(void)
>>>>  
>>>>         cpuid_leaf1_edx_mask =
>>>>                 ~((1 << X86_FEATURE_MTRR) |  /* disable MTRR */
>>>> +                 (1 << X86_FEATURE_PAT) |   /* disable PAT */
>>>>                   (1 << X86_FEATURE_ACC));   /* thermal monitoring */
>>>>  
>>>>         if (!xen_initial_domain())
>>>>
>>>>
>>>>
>>>> should do it right? Let me check on the troublesome machine I saw
>>>> it.
>>>>
>>> I could try it as well but somehow my reading of pat_init() is that if that
>>> would have been called at all, there should be some message about PAT in 
>>> dmesg.
>>> And normal dom0 boots do not show anything.
>>>
>>> It looks like pat_init only gets called from two places in mtrr code, and 
>>> that
>>> probably is not done in dom0 either. So clearing the feature is one step, 
>>> but I
>>> would think that also there needs to be a call to pat_init...
>>
>> So how about this super-complex patch:
>>
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index 39928d1..c8e1c7b 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -67,6 +67,7 @@
>>  #include <asm/hypervisor.h>
>>  #include <asm/mwait.h>
>>  #include <asm/pci_x86.h>
>> +#include <asm/pat.h>
>>  
>>  #ifdef CONFIG_ACPI
>>  #include <linux/acpi.h>
>> @@ -1417,7 +1418,14 @@ asmlinkage void __init xen_start_kernel(void)
>>       */
>>      acpi_numa = -1;
>>  #endif
>> -
>> +#ifdef CONFIG_X86_PAT
>> +    /*
>> +     * For right now disable the PAT. We should remove this once
>> +     * git commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1
>> +     * (xen/pat: Disable PAT support for now) is reverted.
>> +     */
>> +    pat_enabled = 0;
>> +#endif
>>      /* Don't do the full vcpu_info placement stuff until we have a
>>         possible map and a non-dummy shared_info. */
>>      per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
>>
>> trying it out now.
> 
> Seems to work for me with the mcelog that kept on failing.
> 
Yeah, would need to compile it but that looks to be a good solution. Somewhat
missed the fact that pat_enabled is actually exported in pat.h. Sometimes the
obvious is so blanked out. :-P

-Stefan

Attachment: signature.asc
Description: OpenPGP digital signature

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.