[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 5/9] vpci/header: Implement guest BAR register handlers
On 07.09.21 19:30, Jan Beulich wrote: > On 07.09.2021 15:33, Oleksandr Andrushchenko wrote: >> On 06.09.21 17:31, Jan Beulich wrote: >>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote: >>>> --- a/xen/drivers/vpci/header.c >>>> +++ b/xen/drivers/vpci/header.c >>>> @@ -400,12 +400,72 @@ static void bar_write(const struct pci_dev *pdev, >>>> unsigned int reg, >>>> static void guest_bar_write(const struct pci_dev *pdev, unsigned int >>>> reg, >>>> uint32_t val, void *data) >>>> { >>>> + struct vpci_bar *bar = data; >>>> + bool hi = false; >>>> + >>>> + if ( bar->type == VPCI_BAR_MEM64_HI ) >>>> + { >>>> + ASSERT(reg > PCI_BASE_ADDRESS_0); >>>> + bar--; >>>> + hi = true; >>>> + } >>>> + else >>>> + val &= PCI_BASE_ADDRESS_MEM_MASK; >>>> + bar->guest_addr &= ~(0xffffffffull << (hi ? 32 : 0)); >>>> + bar->guest_addr |= (uint64_t)val << (hi ? 32 : 0); >>> What you store here is not the address that's going to be used, >> bar->guest_addr is never used directly to be reported to a guest. > And this is what I question, as an approach. Your model _may_ work, > but its needlessly deviating from how people like me would expect > this to work. And if it's intended to be this way, how would I > have known? Well, I just tried to follow the model we already have in the existing code... > >> It is always used as an initial value which is then modified to reflect >> lower bits, e.g. BAR type and if prefetchable, so I think this is perfectly >> fine to have it this way. > And it is also perfectly fine to store the value to be handed > back to guests on the next read. Keeps the read path simple, > which I think can be assumed to be taken more frequently than > the write one. Plus stored values reflect reality. > > Plus - if what you say was really the case, why do you mask off > PCI_BASE_ADDRESS_MEM_MASK here? You should then simply store > the written value and do _all_ the processing in the read path. > No point having partial logic in two places. I will move the logic to the write handler, indeed we can save some CPU cycles here > >>> as >>> you don't mask off the low bits (to account for the BAR's size). >>> When a BAR gets written with all ones, all writable bits get these >>> ones stored. The address of the BAR, aiui, really changes to >>> (typically) close below 4Gb (in the case of a 32-bit BAR), which >>> is why memory / I/O decoding should be off while sizing BARs. >>> Therefore you shouldn't look for the specific "all writable bits >>> are ones" pattern (or worse, as you presently do, the "all bits >>> outside of the type specifier are ones" one) on the read path. >>> Instead mask the value appropriately here, and simply return back >>> the stored value from the read path. >> "PCI LOCAL BUS SPECIFICATION, REV. 3.0", "IMPLEMENTATION NOTE >> >> Sizing a 32-bit Base Address Register Example" says, that >> >> "Software saves the original value of the Base Address register, writes >> 0 FFFF FFFFh to the register, then reads it back." >> >> The same applies for 64-bit BARs. So what's wrong if I try to catch such >> a write when a guest tries to size the BAR? The only difference is that >> I compare as >> if ( (val & PCI_BASE_ADDRESS_MEM_MASK_32) == >> PCI_BASE_ADDRESS_MEM_MASK_32 ) >> which is because val in the question has lower bits cleared. > Because while this matches what the spec says, it's not enough to > match how hardware behaves. But we should implement as the spec says, not like buggy hardware behaves > Yet you want to mimic hardware behavior > as closely as possible here. There is (iirc) at least one other > place in the source tree were we had to adjust a similar initial > implementation to be closer to one expected by guests, Could you please point me to that code so I can consult and possibly re-use the approach? > no matter > that they may not be following the spec to the letter. Don't > forget that there may be bugs in kernels which don't surface until > the kernel gets run on an overly simplistic emulation. This is sad. And the kernel needs to be fixed then, not Xen > >>>> @@ -522,6 +582,13 @@ static int add_bar_handlers(struct pci_dev *pdev, >>>> bool is_hwdom) >>>> if ( rc ) >>>> return rc; >>>> } >>>> + /* >>>> + * It is neither safe nor secure to initialize guest's view of >>>> the BARs >>>> + * with real values which are used by the hardware domain, so >>>> assign >>>> + * all zeros to guest's view of the BARs, so the guest can perform >>>> + * proper PCI device enumeration and assign BARs on its own. >>>> + */ >>>> + bars[i].guest_addr = 0; >>> I'm afraid I don't understand the comment: Without memory decoding >>> enabled, the BARs are simple registers (with a few r/o bits). >> My first implementation was that bar->guest_addr was initialized with >> the value of bar->addr (physical BAR value), but talking on IRC with >> Roger he suggested that this might be a security issue to let guest >> a hint about physical values, so then I changed the assignment to be 0. > Well, yes, that's certainly correct. It's perhaps too unobvious to me > why one may want to use the host value here in the first place. It > simply has no meaning here. Do you want me to remove the comment? > >>>> --- a/xen/include/xen/pci_regs.h >>>> +++ b/xen/include/xen/pci_regs.h >>>> @@ -103,6 +103,7 @@ >>>> #define PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 /* 64 bit address */ >>>> #define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08 /* prefetchable? */ >>>> #define PCI_BASE_ADDRESS_MEM_MASK (~0x0fUL) >>>> +#define PCI_BASE_ADDRESS_MEM_MASK_32 (~0x0fU) >>> Please don't introduce an identical constant that's merely of >>> different type. (uint32_t)PCI_BASE_ADDRESS_MEM_MASK at the use >>> site (if actually still needed as per the comment above) would >>> seem more clear to me. >> Ok, I thought type casting is a bigger evil here > Often it is, but imo here it is not. I hope you realize that ~0x0fU > if not necessarily 0xfffffff0? We make certain assumptions on type > widths. For unsigned int we assume it to be at least 32 bits wide. > We should avoid assumptions of it being exactly 32 bits wide. Just > like we cannot (more obviously) assume the width of unsigned long. > (Which tells us that for 32-bit arches PCI_BASE_ADDRESS_MEM_MASK is > actually of the wrong type. This constant should be the same no > matter the bitness.) Ok, I will not introduce a new define and use (uint32_t) > > Jan > Thank you, Oleksandr
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |