[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v10 08/12] xen/page_alloc: introduce preserved page flags macro


  • To: Jan Beulich <jbeulich@xxxxxxxx>, Carlo Nonato <carlo.nonato@xxxxxxxxxxxxxxx>
  • From: Andrea Bastoni <andrea.bastoni@xxxxxxxxxxxxxxx>
  • Date: Mon, 2 Dec 2024 13:51:07 +0100
  • Autocrypt: addr=andrea.bastoni@xxxxxxxxxxxxxxx; keydata= xsFNBF5Nh4sBEAC7UM3QJtjrFO3pjcMCCh04JFyCCDzLFMIqMTB1UWCLamZ9dUwIau7ScgWv 49aqbM++edVvEBmG8JHDC83DFWymvFVXBgqgcR7tHHBbg33XJKFMHvuW/kFm/67XPTFcec4L JsH5MWms9TLJbgCnaWQQMH3kztTRQaf5QcULIoHnTySKlt3WzzzHosaMO+/GNYX7vzfc4ypJ mD5SQWYDhfRefASkyxdrN6/QkPwS2vGTyVK58o2U9I27KPYvs+77JrjrNBfpnebapaYVA55C 7BvTnno5Kr6QHwA6LcnIZqefz7KxQ1n+1C5QQbmhi9S68aloGCeUo9R06UMJG79TXC2Mc68t AtSCN/HpgcvN1CSL45f/4WCDPG572ebo5M6MPcTb4ptV1SC/i+4U/3cG0LNSUap+sGRCf0Iy C5xy0KOtgoq8jesdleSy8j/3DNIMGekSYbQYMO39DfZds2XFh9lVDjG7tQcChwW+lQDPo113 ENBRftDyqJthrvmJXGyrOmn0su56qh2Zqvi5pSHWsH88vAZUJsOU+9lpohmcb3o/cQ18UXNK H/9wjo2zKHFdSylQFERHIzj6WlBp01wkTcCqtUGpxsjJHmVSyakWs3TrGXooKR9SPMxqVrD/ oCCEo9IUD9jd+TxLsp/4TzUp4ScTO/43uPgdkMekU5mRs6B6WwARAQABzS9BbmRyZWEgQmFz dG9uaSA8YW5kcmVhLmJhc3RvbmlAbWluZXJ2YXN5cy50ZWNoPsLBlAQTAQgAPgIbAwULCQgH AgYVCgkICwIEFgIDAQIeAQIXgBYhBImpnm1L3x9XIoXhD3VSShFTR9xSBQJitcUIBQkKC9f2 AAoJEHVSShFTR9xSmSoP/0W/VboJmWmLh89eIl1QRoiL1nGcti1fTN835Q2TPiSdg+nFVqy1 8oLrJaHNe5THVaSr4S2du56SASYSG6f+Y5aiQccbalUJIV7KwXr741kovTnUPUKjPoi61Bx4 DUzis0Xo0NmMnz7M1YudbQZmjoakE/wZJRB+b79+kJwrfZFcNg4DSuSvNOUeI17uapLKRQ1a ukuRgXwUpMOudKngJ8HB+16aHIQX+yfpcsanNr1nGwHSLhEPEM20DVzKiCp2cKjv9Y7zZ+6y akbJHdbRuJliyZmXaSVO8sQ1tO/W6Y/4zAjejw2c1qDKISeIlGi+o6UEVYZlKCThzrV9vYok AcJF4DlYcAuBMVYCTomovXb/9/Y48pRGlfDGrmnt+FvGVA0jYrG02oufItY2JAGzFcX1KMTG AGf1S7pOj3AvBTGJjW8d8+sXuedH51HNixJtMnzcmi+qJfPJujBW3BEZ1p0ZoDyTH/WCZF+7 B5lFvTi0DRKY6AI0TSzBdjtaOMt64fn6KXtLtaTZKDKkFUBwAShJyYcWQSp/7EO+ZJW7dWIw 1vM5AcnugonJby+q+JGgBVC2rjscu5Okl3lnviF9WLAzmRH/pDkAa3jifiUm8eS+dP+4cN6g WXL9vTF6FtFyI8sgzRlY/IX8mao5bV/P1HCwNvkBhO8C3XMaul4FwQsjzsFNBF5Nh4sBEADN J99l+vOp8LB8jDjWOhINlpgp+EcrmWOuler5QnoJUywc2zkLelQIwVGP2lFliMdLHM6DbMEX ySIzHbhw7oPRP0QRPK/6I4bXYkSQCrLyqYd0CYSbkar8YV6Xa6nGxRmP1bBv1lPFHN66D0yE /z1ScGMXyX+ZOIvH0ekIkqFvi7Ec/7a/ntfU43o2t05dmbnEKoECZgeS8SraojfKnQRpz7+P N0q45O5fMETZpIiQh1/mB12HOcklDNELcKohqVfevbknJw04Yjbcv79aGpBRqoVWWBS4TxcD CRPQZ+H0tMUVEL/MqO7tNLA1VuGpOccyFtZnC/+J/twa7iKpPIxS9Ec/LDYTddebWH+8gOmr /PkBerBXghlZpxmQUlJeQ8kyecOOc4C7ec3aUGj+x28j0+zlXFLUbjiKIEM5VowIMgDDRwA/ MDr9IJhFzHaY2VCfBnX8sgJSn62IxqREq4X3KkR/Jtxt+HYXQYLl0yva2MBplkRcwQO799o6 woAMW0uyct4+BUcKo1sBFP2x2n4NFiPEjeoH3y9baruD9iiMQsmbJ3IKqtT13crCa+bcY3ZS Oz+CymgzNdH+RabJMC3mGfKIhUQGwEHz+wyMnv16nqO49bmoCk3q5Oneo4I3XwI3QbIJr0rd QkX6oh6R0taC3naal1ZYGxs0vZK567bT5wARAQABwsF2BBgBCAAgFiEEiamebUvfH1ciheEP dVJKEVNH3FIFAl5Nh4sCGwwACgkQdVJKEVNH3FLafxAAl7pW0v6Jju19I6wtB+XNzzi5Wota 3AyWzCxO/hUHNGRV/ZufhMkNFCMNzAxbdmO56eCk9ZYf/JMLu8H1GwhV1NgQ7HL4FNXXxLZ0 ixZDik86iiSjVLtEjLuwkS4Fj9wjqevycL/t16kJduFNyxT0/XiB5UPh5NClOMonHSC+V2If Kf6l2Ic34CrA3ovkfVvBXJTzia0VgyQCikeazgPRELH8bq2WTBWfjR3/l86Y0twiYyXqBNQ8 Q2Z83mu+yt38gTanz4YuDYz7YFjvL6IU2MZ5+ByothK6Cfx4W595q81dsGcJOlcd6j3QE+ps b3SHuToWZCHZRHyKrgGDqCL5RbsK3wXgDmc48SfN+5TxenT8A1lkoOHFcQ0SV8xleiwURXHc Ao+SzyDcTfltBNjzQmntQjAfq1Lq5Ux9nfpPMgnVHu5ANWeLtwLJyh+4aPVE5hGjeCa+Ab5U MyocCYdTuAmDHB9RQdf9c+qlVYuZCg7yYlWsvId5DGZnab2MzvExayaFCJVEoCccpfrqFFiF kJ19BogE4A6VTU0ShoHYJhLg7PuEZS1oWzULZnM8sNNI72MecvfZn5Oi0ZEJhFh+HETlJnIT 7gh7CGFBxPacT8vHxmeMPod7qrvYgKW+QKhU+tAI8gkI6hHXLBg/dxn7wAwTjlX1bo+jRJyp Id5SuAU=
  • Cc: marco.solieri@xxxxxxxxxxxxxxx, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 02 Dec 2024 12:51:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi,

On 29/11/2024 12:09, Jan Beulich wrote:
> On 29.11.2024 10:32, Carlo Nonato wrote:
>> Hi Jan,
>>
>> On Thu, Nov 28, 2024 at 12:05 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>
>>> On 19.11.2024 15:13, Carlo Nonato wrote:
>>>> PGC_static, PGC_extra and PGC_broken need to be preserved when assigning a
>>>> page. Define a new macro that groups those flags and use it instead of 
>>>> or'ing
>>>> every time.
>>>>
>>>> To make preserved flags even more meaningful, they are kept also when
>>>> switching state in mark_page_free().
>>>> Enforce the removal of PGC_extra before freeing domain pages as this is
>>>> considered an error and can cause ASSERT violations.
>>>>
>>>> Signed-off-by: Carlo Nonato <carlo.nonato@xxxxxxxxxxxxxxx>
>>>> ---
>>>> v10:
>>>> - fixed commit message
>>>> v9:
>>>> - add PGC_broken to PGC_preserved
>>>> - clear PGC_extra in alloc_domheap_pages() only if MEMF_no_refcount is set
>>>> v8:
>>>> - fixed PGC_extra ASSERT fail in alloc_domheap_pages() by removing 
>>>> PGC_extra
>>>>   before freeing
>>>> v7:
>>>> - PGC_preserved used also in mark_page_free()
>>>> v6:
>>>> - preserved_flags renamed to PGC_preserved
>>>> - PGC_preserved is used only in assign_pages()
>>>> v5:
>>>> - new patch
>>>> ---
>>>>  xen/common/page_alloc.c | 19 ++++++++++++++-----
>>>>  1 file changed, 14 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
>>>> index 7b911b5ed9..34cd473150 100644
>>>> --- a/xen/common/page_alloc.c
>>>> +++ b/xen/common/page_alloc.c
>>>> @@ -160,6 +160,7 @@
>>>>  #endif
>>>>
>>>>  #define PGC_no_buddy_merge PGC_static
>>>> +#define PGC_preserved (PGC_extra | PGC_static | PGC_broken)
>>>>
>>>>  #ifndef PGT_TYPE_INFO_INITIALIZER
>>>>  #define PGT_TYPE_INFO_INITIALIZER 0
>>>> @@ -1427,12 +1428,11 @@ static bool mark_page_free(struct page_info *pg, 
>>>> mfn_t mfn)
>>>>      {
>>>>      case PGC_state_inuse:
>>>>          BUG_ON(pg->count_info & PGC_broken);
>>>> -        pg->count_info = PGC_state_free;
>>>> +        pg->count_info = PGC_state_free | (pg->count_info & 
>>>> PGC_preserved);
>>>>          break;
>>>
>>> PGC_extra doesn't want preserving here. Since it's mark_page_free(), and
>>> since PGC_extra is removed before freeing, the state change is apparently
>>> fine. But an assertion may want adding, for documentation purposes if
>>> nothing else.
>>>
>>> Alternatively it may make sense to indeed exclude PGC_extra here. In fact
>>> PGC_static doesn't need using here either, as unprepare_staticmem_pages()
>>> will explicitly set it again anyway.
>>>
>>> Hence I wonder whether the change here really is necessary (one will then
>>> be needed in the next patch aiui, when PGC_colored is introduced). Which
>>> would then eliminate the need for the final two hunks of the patch, I
>>> think.
>>>
>>>>      case PGC_state_offlining:
>>>> -        pg->count_info = (pg->count_info & PGC_broken) |
>>>> -                         PGC_state_offlined;
>>>> +        pg->count_info = (pg->count_info & PGC_preserved) | 
>>>> PGC_state_offlined;
>>>>          pg_offlined = true;
>>>>          break;
>>>
>>> I'm similarly unconvinced that anything other than PGC_broken (and
>>> subsequently perhaps PGC_colored) would need preserving here.
>>
>> Yes, we (re)checked the code and also believe that the introduction of
>> PGC_preserved is generating more confusion (and code) then the actual logical
>> help it provides.
>>
>> We'll remove this patch and integrate PGC_colored explicitly in the flags to
>> be preserved. This avoid the clumsy logic of preserving something (extra)
>> when it's not needed and then handling the special case to remove it
>> explicitly.
>> Basically my goal is to go back to v4 where this patch didn't exist.
> 
> Hmm, no, I don't think I said anything in the direction of removing 
> PGC_preserved
> again. I merely think you went too far in where it actually wants using.

Let me try to better detail the intention of what we have in mind.
Here again parts of this patch:

> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 7b911b5ed9..34cd473150 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -160,6 +160,7 @@
>  #endif
> 
>  #define PGC_no_buddy_merge PGC_static
> +#define PGC_preserved (PGC_extra | PGC_static | PGC_broken)

The granularity of preserved as implemented by v10 is different than what you 
require...

> 
>  #ifndef PGT_TYPE_INFO_INITIALIZER
>  #define PGT_TYPE_INFO_INITIALIZER 0
> @@ -1427,12 +1428,11 @@ static bool mark_page_free(struct page_info *pg, 
> mfn_t mfn)
>      {
>      case PGC_state_inuse:
>          BUG_ON(pg->count_info & PGC_broken);
> -        pg->count_info = PGC_state_free;
> +        pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved);

here (no need),

>          break;
> 
>      case PGC_state_offlining:
> -        pg->count_info = (pg->count_info & PGC_broken) |
> -                         PGC_state_offlined;
> +        pg->count_info = (pg->count_info & PGC_preserved) | 
> PGC_state_offlined;

here (only broken),

>          pg_offlined = true;
>          break;
> 
> @@ -2366,7 +2366,7 @@ int assign_pages(
> 
>          for ( i = 0; i < nr; i++ )
>          {
> -            ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
> +            ASSERT(!(pg[i].count_info & ~PGC_preserved));

here (no broken),

>              if ( pg[i].count_info & PGC_extra )
>                  extra_pages++;
>          }
> @@ -2426,7 +2426,7 @@ int assign_pages(
>          page_set_owner(&pg[i], d);
>          smp_wmb(); /* Domain pointer must be visible before updating refcnt. 
> */
>          pg[i].count_info =
> -            (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 
> 1;
> +            (pg[i].count_info & PGC_preserved) | PGC_allocated | 1;

here (no broken).

And it also leads (as noted) to the special management of extra...

> 
>          page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
>      }
> @@ -2485,6 +2485,14 @@ struct page_info *alloc_domheap_pages(
>          }
>          if ( assign_page(pg, order, d, memflags) )
>          {
> +            if ( memflags & MEMF_no_refcount )
> +            {
> +                unsigned long i;
> +
> +                for ( i = 0; i < (1UL << order); i++ )
> +                    pg[i].count_info &= ~PGC_extra;
> +            }

here (likely not need, should be impossible to trigger),

> +
>              free_heap_pages(pg, order, memflags & MEMF_no_scrub);
>              return NULL;
>          }
> @@ -2539,6 +2547,7 @@ void free_domheap_pages(struct page_info *pg, unsigned 
> int order)
>                  {
>                      ASSERT(d->extra_pages);
>                      d->extra_pages--;
> +                    pg[i].count_info &= ~PGC_extra;

and here.

The proposal would be to go back to v9, which reduces (for the PGC)
the management to colored only:

> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
...
> @@ -2382,7 +2556,7 @@ int assign_pages(
> 
>          for ( i = 0; i < nr; i++ )
>          {
> -            ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
> +            ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static | 
> PGC_colored)));
>              if ( pg[i].count_info & PGC_extra )
>                  extra_pages++;
>          }
> @@ -2442,7 +2616,8 @@ int assign_pages(
>          page_set_owner(&pg[i], d);
>          smp_wmb(); /* Domain pointer must be visible before updating refcnt. 
> */
>          pg[i].count_info =
> -            (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 
> 1;
> +            (pg[i].count_info & (PGC_extra | PGC_static | PGC_colored)) |
> +            PGC_allocated | 1;
> 
>          page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));

Thanks,
Andrea




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.