[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] mm/pdx: Add comments throughout the codebase for pdx
On 21.06.2023 16:25, Alejandro Vallejo wrote: > On Mon, Jun 19, 2023 at 05:30:20PM +0200, Jan Beulich wrote: >>> @@ -57,9 +99,25 @@ uint64_t __init pdx_init_mask(uint64_t base_addr) >>> (uint64_t)1 << (MAX_ORDER + PAGE_SHIFT)) - 1); >>> } >>> >>> -u64 __init pdx_region_mask(u64 base, u64 len) >>> +uint64_t __init pdx_region_mask(uint64_t base, uint64_t len) >>> { >>> - return fill_mask(base ^ (base + len - 1)); >>> + uint64_t last = base + len - 1; >>> + /* >>> + * The only bit that matters in base^last is the MSB. There are 2 >>> cases. >>> + * >>> + * case msb(base) < msb(last): >>> + * then fill_mask(base^last) == fill_mask(last). This is non >>> + * compressible. >>> + * case msb(base) == msb(last): >>> + * This means that there _may_ be a sequence of compressible zeroes >>> + * for all addresses between `base` and `last` iff `base` has >>> enough >>> + * trailing zeroes. That is, it's compressible when >>> + * fill_mask(base^last) < fill_mask(last) >>> + * >>> + * The resulting mask is effectively the moving bits between `base` and >>> + * `last` >>> + */ >>> + return fill_mask(base ^ last); >>> } >> >> I don't see a need for you to actually change the code here. You can >> as well introduce "last" as shorthand just for the comment. > I thought as you did initially and wrote it as such. In the end it felt > wrong to have an explanation in terms of a token not present in the code. > Furthermore, explaining what the shorthand is in the comment takes more > space than introducing `last` in the code itself. > > ``` > uint64_t last = base + len - 1; > /* > * The only bit that matters in base^last is the MSB. There are 2 cases. > ``` > vs > ``` > /* > * Let `last = base + len -1` out of convenience. > * The only bit that matters in base^last is the MSB. There are 2 cases. > ``` > > TL;DR: I didn't factor out `last` due to aesthetics (I'd rather not touch > the code in this patch, in fact), but it seems warranted in order to reduce > the impedance mismatch between this big-ish comment and the call it > describes. I'll post v2 without that adjustment in case I managed to > convince you. Otherwise feel free to adjust it on commit. > >> What I dislike in your way of putting it is the use of fill_mask(last) when >> such a call never occurs in code. Starting from the first sentence, >> can't you explain things just in terms of said MSB > I see. I can refer to the MSBs instead. Works equally well. > > e.g: > fill_mask(base^last) == fill_mask(last) > | > V > msb(fill_mask(base^last)) == msb(last) > >> (where the two cases are "set" and "clear")? > I'm not sure I understand what you mean here. Hmm, yes, I think I was getting confused here. >>> --- a/xen/include/xen/mm.h >>> +++ b/xen/include/xen/mm.h >>> @@ -31,6 +31,22 @@ >>> * (i.e. all devices assigned to) a guest share a single DMA address >>> space >>> * and, by default, Xen will ensure dfn == pfn. >>> * >>> + * pdx: Page InDeX >>> + * Indices into the frame table holding the per-page's book-keeping >>> + * metadata. A compression scheme is used and there's a non-identity >>> + * mapping between valid(mfn) <-> valid(pdx) See the comments in pdx.c >>> + * for an in-depth explanation of that mapping. >> >> The mapping may very well be (and on x86 typically is) an identity >> one. IOW you want to describe not only the compression case, but also >> the "no compression possible" one. > Point taken. I'll rephrase it slightly as "possibly non-identity" and > explicitly state the "no compression is possible" case. > >> >> PDXes also aren't just indexes to the frame table, but also to the >> direct mapping. > I had something to that effect earlier on, but I removed it because it > doesn't seem to be the case on ARM. There's a directmap_base_pdx global > that states the first pdx to be mapped on the directmap. Which would merely make it a biased index. I very much hope they eliminate holes (and not just unused leading space) from the directmap as well. >>> + * ## PDX compression >>> + * >>> + * This is a technique to avoid wasting memory on machines known to have >>> + * split their machine address space in two big discontinuous and highly >>> + * disjoint chunks. >> >> Why two? There can be any number, and in fact on the system I originally >> had data from for reference (when first writing this code) iirc there >> were 8 nodes, each with a chunk of memory far away from the other chunks. >> The compression scheme used merely requires that some "inner" bits are >> unused (always zero) in all of those ranges. > Well, our implementation only supports two and I didn't see any obvious > hints about intending to increasing that number. Where are you taking that "supports two" from? When I first wrote this code, it was tested against a system with 8 (maybe it was 4, but certainly more than 2) discontiguous regions (not counting the hole below 4G). > I see where you're coming > from, though. I can make it more general so it's not outdated if the > pfn_to_pdx()/pdx_to_pfn() pair ever increases in scope to do several holes. > > Out of curiosity (and for posterity's sake), what was/is that system? I'm not sure I'm permitted to mention that. I'm pretty sure I carefully avoided mentioning the partner of ours back at the time. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |