[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] mm/pdx: Add comments throughout the codebase for pdx
On Mon, Jun 19, 2023 at 05:30:20PM +0200, Jan Beulich wrote: > > + * ma_{top,bottom}_mask is simply a shifted pfn_{top,pdx_bottom}_mask > > where the > > + * bottom one shifts in 1s rather than 0s. > > + */ > > Nit: That 2nd bottom variable is ma_va_bottom_mask. Sure > > > @@ -57,9 +99,25 @@ uint64_t __init pdx_init_mask(uint64_t base_addr) > > (uint64_t)1 << (MAX_ORDER + PAGE_SHIFT)) - 1); > > } > > > > -u64 __init pdx_region_mask(u64 base, u64 len) > > +uint64_t __init pdx_region_mask(uint64_t base, uint64_t len) > > { > > - return fill_mask(base ^ (base + len - 1)); > > + uint64_t last = base + len - 1; > > + /* > > + * The only bit that matters in base^last is the MSB. There are 2 > > cases. > > + * > > + * case msb(base) < msb(last): > > + * then fill_mask(base^last) == fill_mask(last). This is non > > + * compressible. > > + * case msb(base) == msb(last): > > + * This means that there _may_ be a sequence of compressible zeroes > > + * for all addresses between `base` and `last` iff `base` has > > enough > > + * trailing zeroes. That is, it's compressible when > > + * fill_mask(base^last) < fill_mask(last) > > + * > > + * The resulting mask is effectively the moving bits between `base` and > > + * `last` > > + */ > > + return fill_mask(base ^ last); > > } > > I don't see a need for you to actually change the code here. You can > as well introduce "last" as shorthand just for the comment. I thought as you did initially and wrote it as such. In the end it felt wrong to have an explanation in terms of a token not present in the code. Furthermore, explaining what the shorthand is in the comment takes more space than introducing `last` in the code itself. ``` uint64_t last = base + len - 1; /* * The only bit that matters in base^last is the MSB. There are 2 cases. ``` vs ``` /* * Let `last = base + len -1` out of convenience. * The only bit that matters in base^last is the MSB. There are 2 cases. ``` TL;DR: I didn't factor out `last` due to aesthetics (I'd rather not touch the code in this patch, in fact), but it seems warranted in order to reduce the impedance mismatch between this big-ish comment and the call it describes. I'll post v2 without that adjustment in case I managed to convince you. Otherwise feel free to adjust it on commit. > What I dislike in your way of putting it is the use of fill_mask(last) when > such a call never occurs in code. Starting from the first sentence, > can't you explain things just in terms of said MSB I see. I can refer to the MSBs instead. Works equally well. e.g: fill_mask(base^last) == fill_mask(last) | V msb(fill_mask(base^last)) == msb(last) > (where the two cases are "set" and "clear")? I'm not sure I understand what you mean here. > > > --- a/xen/include/xen/mm.h > > +++ b/xen/include/xen/mm.h > > @@ -31,6 +31,22 @@ > > * (i.e. all devices assigned to) a guest share a single DMA address > > space > > * and, by default, Xen will ensure dfn == pfn. > > * > > + * pdx: Page InDeX > > + * Indices into the frame table holding the per-page's book-keeping > > + * metadata. A compression scheme is used and there's a non-identity > > + * mapping between valid(mfn) <-> valid(pdx) See the comments in pdx.c > > + * for an in-depth explanation of that mapping. > > The mapping may very well be (and on x86 typically is) an identity > one. IOW you want to describe not only the compression case, but also > the "no compression possible" one. Point taken. I'll rephrase it slightly as "possibly non-identity" and explicitly state the "no compression is possible" case. > > PDXes also aren't just indexes to the frame table, but also to the > direct mapping. I had something to that effect earlier on, but I removed it because it doesn't seem to be the case on ARM. There's a directmap_base_pdx global that states the first pdx to be mapped on the directmap. > > > + * maddr: Machine Address > > + * The physical address that corresponds to an mfn > > + * > > + * vaddr: Xen Virtual Address > > + * A virtual address of memory accesible to Xen. It is typically either > > + * an address into the direct map or to Xen's own code/data. The direct > > + * map implements several compression tricks to save memory, so an offset > > + * into it does _not_ necessarily correspond to an maddr due to pdx > > + * compression. > > We need to be careful here: If I'm not mistaken at least Arm uses vaddr > also for guest addresses. In fact I'm not sure vaddr (and perhaps even > maddr) need explaining here I'd like to have at least maddr. It's sufficiently project-specific to be otherwise confusing to find unexplained elsewhere. i.e: In other bare-metal projects that would be a paddr instead. vaddr might be trying too hard to boil the ocean as far as definitions go. I can get rid of it. > the more that nothing in this header uses either term. True. But it should be somewhere and this is the main memory-management header, where the frame number definitions are. In general, things that change together ought to stay together. > > + * ## PDX compression > > + * > > + * This is a technique to avoid wasting memory on machines known to have > > + * split their machine address space in two big discontinuous and highly > > + * disjoint chunks. > > Why two? There can be any number, and in fact on the system I originally > had data from for reference (when first writing this code) iirc there > were 8 nodes, each with a chunk of memory far away from the other chunks. > The compression scheme used merely requires that some "inner" bits are > unused (always zero) in all of those ranges. Well, our implementation only supports two and I didn't see any obvious hints about intending to increasing that number. I see where you're coming from, though. I can make it more general so it's not outdated if the pfn_to_pdx()/pdx_to_pfn() pair ever increases in scope to do several holes. Out of curiosity (and for posterity's sake), what was/is that system? > > > + * In its uncompressed form the frame table must have book-keeping metadata > > + * structures for every page between [0, max_mfn) (whether they exist or > > s/they exist/there is RAM/ ? They exist is ambiguous, true. Rewrote it as "they are backed by RAM" > > > + * not), and a similar condition exists for the direct map. We know some > > + * architectures, however, that have some sparsity in their address space, > > + * leading to a lot of wastage in the form of unused frame table entries. > > Hmm, "architectures" suggests e.g. Arm might have such, but x86 won't. > Perhaps "systems", "designs", or "system designs"? I like `systems` better. Sure. > > > @@ -13,22 +69,77 @@ extern unsigned long pfn_top_mask, ma_top_mask; > > (sizeof(*frame_table) & -sizeof(*frame_table))) > > extern unsigned long pdx_group_valid[]; > > > > -extern uint64_t pdx_init_mask(u64 base_addr); > > -extern u64 pdx_region_mask(u64 base, u64 len); > > +/** > > + * Calculates a mask covering "moving" bits of all addresses of a region > > + * > > + * e.g: > > + * base=0x1B00000000 > > + * len+base=0x1B0008200; > > + * > > + * ought to return 0x00000FFFFF; > > + * > > + * @param base Base address of the region > > + * @param len Size in octets of the region > > + * @return Mask of moving bits at the bottom of all the region addresses > > + */ > > This looks to be a copy-and-paste of pdx_region_mask()'s comment, when > the function has neither a "base" parameter, nor a and one at all. Oops. A victim of incompatible rebases. I extracted these comments from an ongoing patch series I'm working on. I'll (try to) write an actual comment for it. > > > +uint64_t pdx_init_mask(u64 base_addr); > > No u64 -> uint64_t here? > > > -extern void set_pdx_range(unsigned long smfn, unsigned long emfn); > > +/** > > + * Calculates a mask covering "moving" bits of all addresses of a region > > + * > > + * e.g: > > + * base=0x1B00000000 > > + * len+base=0x1B0008200; > > + * > > + * ought to return 0x00000FFFFF; > > I think it would help if you actually said how the return value actually > derives. The term "moving" may be understood differently be different > people, and hence such an explanation actually would also clarify what > "moving" means. Hmmm, I'd rather not explicitly state the XOR here though. I'm adding a couple more lines explaining things in terms of the i-th bit of the mask and all the region addresses. Ideally this comment ought to explain the intuition, while the comment in pdx.c explains the implementation. > > I also thing there's a 0 missing in the len+base value, without which > the result would be quite a bit different. Indeed. > > +/** > > + * Mark range between smfn and emfn is allocatable in the frame table > > + * > > + * @param smfn Start mfn > > + * @param emfn End mfn > > + */ > > +void set_pdx_range(unsigned long smfn, unsigned long emfn); > > This could do with mathematically expressing the range in the comment, > such that (in|ex)clusiveness of, in particular, emfn is clarified. Good point. Sure. > > +/** > > + * Invoked to determine if an mfn maps to a legal pdx > > I wouldn't use "pdx" here, but refer to frame_table[] instead. I can rewrite it as something along those lines, sure. > > > + * In order for it to be legal it must pass bounds, grouping and > > + * compression sanity checks. > > + * > > + * @param smfn Start mfn > > + * @param emfn End mfn > > + * @return True iff all checks pass > > + */ > > bool __mfn_valid(unsigned long mfn); > > Comment again mentions inapplicable parameters. Ack. > > > @@ -38,7 +149,16 @@ static inline unsigned long pdx_to_pfn(unsigned long > > pdx) > > #define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn)) > > #define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx)) > > > > -extern void pfn_pdx_hole_setup(unsigned long); > > +/** > > + * Initializes global variables with information about the compressible > > + * range of the current memory regions. > > + * > > + * @param mask This mask is the biggest pdx_mask of every region in the > > + * system ORed with all base addresses of every region in the > > + * system. The result is a mask where every sequence of zeroes > > + * surrounded by ones is compressible. > > + */ > > +void pfn_pdx_hole_setup(unsigned long mask); > > With the function returning void, I find "The result" problematic. How about > "This results in ..."? Sounds better. Sure. > > Btw, "surrounded by ones" isn't really necessary. We could compress shorter > sequences of zeros, so this may want re-wording a little to be as precise > as possible. > > Jan Fair. I'll tweak the definition. Alejandro
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |