Xen project Mailing List

Re: [Xen-devel] [PATCH v9 2/5] xen: introduce SYMBOLS_SUBTRACT and SYMBOLS_COMPARE

I would particularly welcome the opinion of hypervisor maintainers on my type safety point, below. Stefano Stabellini writes ("[Xen-devel] [PATCH v9 2/5] xen: introduce SYMBOLS_SUBTRACT and SYMBOLS_COMPARE"): > +/* > + * Calculate (end - start), where start and/or end are linker symbols, > + * returning a ptrdiff_t. The size is in units of start's referent. > + */ > +#define SYMBOLS_SUBTRACT(end, start) ({ > \ > + (ptrdiff_t)((uintptr_t)(end) - (uintptr_t)(start)) / sizeof(*start); > \ > +}) I'm afraid I have several problems with the details of these macros. Firstly, I really don't like the lack a type check, which was present in my proposal. For example, given: extern struct wombat _foos_start[]; extern char _end[]; and your macro definition, the two expressions SYMBOLS_SUBTRACT(_foos_start, _end) -SYMBOLS_SUBTRACT(_end, _foos_start) produce different values because one is divided by sizeof(struct wombat) and the other by sizeof(char). This is hardly desirable. Secondly, I find the argument ordering extremely confusing. With your macros DIFFERENCE(start,end) is negative but COMPARE(start,end) is positive. I suggest that you call the macro DIFFERENCE and have DIFFERENCE(start,end) be positive. Thirdly, in an earlier exchange: > On Thu, 7 Feb 2019, Ian Jackson wrote: > > FAOD, I think you should expect people to declare the linker symbols > > either as I suggested: > > > > extern const struct wombat _wombats_start[]; > > extern const struct abstract_symbol _wombats_end[]; > > > > (or along the lines of Jan's suggestion, but frankly I think that is > > going to be too hard to sort out now.) > > Yes, they are already declared this way, I would prefer to avoid > changing the declaration as part of this series. Are you sure things are not currently declared like this extern const struct wombat _wombats_start[]; extern const struct wombat _wombats_end[]; ? Looking at your v9 series, that seems to be the case. I think using the `abstract_symbol' version (or Jan's refinement of it) is important because it arranges that accidental raw pointer comparisons, which are UB, are a type error. IOW this approach moves and centralises the burden of remembering (on pain of miscompilation) that something is special about these pointers, to the point where the linker symbols are introduced into the C environment. That is one place, and one where it is easier to remember because something odd is already going on. Unless you want to propose some other approach which will reliably find these kind of linker symbol pointer comparison rule violations and ensure that they are not deployed in production code ? This type difference is why my proposal had two macros. If both start and end have the same type, only one macro is necessary. It should be one which checks the types are identical and returns a signed value in units of the type. That can be used for comparisons. If desired, we could enhance the macro so that the compiler can prove that the division can be removed when the result is compared for inequality with 0. But even with two types, it may be that there is only one macro needed because the vast majority of use sites are formulaic. You said earlier: > Most of the call sites only do things like (_end - _start) or (p > > _end). I wanted to bring up cases that are not trivial. When designing a general scheme for macros like this it is best to consider the usual case and make it straightforward to use, and bulletproof. Ie presenting unusual cases as your examples is not helpful to the design process for a macro. IMO the situation you describe in the snipped I quote is what the purpose of SYMBOL_DIFFERENCE is. For the two examples you give, always one of the arguments is _end. So if we make the type of _end be struct abstract_symbol[], SYMBOL_DIFFERENCE can (i) check that _end is of that type (ii) return an answer in units of its other (first) argument. For pointers derived from _start, it is actually straightforwardly legal to compare them with _start, or subtract _start from them. So no macrology is needed in that case. Stepping back a bit, it is indeed the second symbol referrring to the same memory object that triggers the compiler bug: the compiler wrongly assumes that two extern declarations must refer to two different objects. Making the two declarations have different types will simply prevent *all* triggers for this bug, because raw comparisons or arithmetic between differently typed pointers is a compile error. Then all that is needed is to embed the correct usage pattern into a macro (or, a sufficiently general set of correct usage patterns that ad-hoc approaches are rare). You also write: > We have a couple of cases where we are "punning wildly between pointers > and integers", for instance: > > xen/arch/arm/arm64/livepatch.c:arch_livepatch_apply > xen/arch/arm/setup.c:start_xen line 772 > xen/arch/x86/setup.c:__start_xen line 1382 > > I think it is OK to manually cast to (uintptr_t) in those cases as you > suggest. I haven't looked at these, but IMO whatever technique is used comes with a proof obligation. In the case of a macro, the proof obligation attaches to the author of the macro. The proof should show not only that correct use of the macro will result in correct code; but there should also be discussion of the risks of incorrect use of the macro - such incorrect use should be prevented if that's reasonably possible. In the case of an ad-hoc technique, the proof obligation comes with the author of the exciting code. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.