[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] 2/3: MCA/MCE correctable error handling
On Wednesday 22 August 2007 18:13:30 Keir Fraser wrote: > I'm not sure you should use shared_info at all. This interface should be > low-rate enough that you can add a sysctl (assuming this info is applicable > for dom0 only, which it looks to be). The polling service is dom0 only. Uncorrectable errors may also be reported to a DomU. > Actually, probably a platform_op rather than a sysctl... What you propose is to copy the MCA info from Xen to guest? The MCA info structure will be used for both correctable and uncorrectable error notification (as I said in my first patch 0/3 mail). I assumed you agreed on all this as is in the patches, since you did not object/comment when this was discussed on this list in the "MCA/MCE concept" topic (and also your questions in your patch 3/3 mail were discussed there). The purpose of this discussion was to find an agreement on how to proceed. Therefore, all what I expected from you were some questions/comments about implementation details and no design/concept questions. Christoph > -- Keir > > On 22/8/07 16:47, "Christoph Egger" <Christoph.Egger@xxxxxxx> wrote: > > I fixed all this in my code and will re-submit the new patch. > > > > Christoph > > > > On Wednesday 22 August 2007 12:01:26 Jan Beulich wrote: > >>>>> "Christoph Egger" <Christoph.Egger@xxxxxxx> 22.08.07 10:47 >>> > >>> > >>> On Tuesday 21 August 2007 17:53:45 Jan Beulich wrote: > >>>>> +} __attribute__((packed)); > >>>> > >>>> I think it was a general agreement that it is not a good idea > >>>> (non-portable to non-GNU compilers) to put things like this in public > >>>> headers. I think by properly ordering things you can get away without > >>>> this (and almost without padding). > >>> > >>> Oh, you're right. I should use #pragma pack(1) instead. > >>> Is this fine with you? > >> > >> I'm afraid that wouldn't work with the compat mode header generation, as > >> the original use of #pragma pack(push, ...) was banned for (Sun) > >> compatibility reasons. Consequently, I believe the only way of doing > >> this properly is to avoid depending on compiler behavior by arranging > >> things appropriately (including padding members if needed) and avoiding > >> #pragma pack() altogether. > >> > >>>>> +struct mcinfo_global { > >>>>> ... > >>>>> + uint16_t mc_socketid; > >>>>> + uint16_t mc_coreid; > >>>>> + uint16_t mc_vcpu_id; > >>>> > >>>> Unless mc_vcpu_id is intended for that purpose, this neglects > >>>> hyperthreading (I know, AMD doesn't use this, but the interface should > >>>> be vendor neutral). > >>>> > >>>> If mc_vcpu_id is meant for that purpose, its naming is ambiguous. > >>>> > >>>> If mc_vcpu_id is meant as a vcpuid, then ordering things os that > >>>> mc_domid and mc_vcpu_id are contiguous would seem more natural (making > >>>> eventual examination in raw memory less confusing). > >>> > >>> mc_coreid is the physical core that reported the machine check > >>> information. mc_socketid is the physical socket the physical core is > >>> in. This is useful to find all other affected physical cores, when an > >>> error in the L3-Cache is reported that is shared over all cores in the > >>> chip. > >>> > >>> mc_vcpu_id is the id of the active vcpu for the domain, that reported > >>> the machine check information. Inside Xen, this is current->vcpu_id. > >>> mc_vcpu_id is needed to deal properly with the upcoming NUMA support my > >>> collegue is working on. > >> > >> Okay, but then you're really lacking a thread id here, for being filled > >> by respective Intel code (AMD code would set this to zero). > >> > >>>>> +/* sizeof(struct mcinfo_global) + 6 * sizeof(struct mcinfo_bank) == > >>>>> 200. + * This is enough space to store mc information of up to six > >>>>> banks. + */ > >>>>> +#define MCINFO_MAXSIZE (204 - sizeof(uint32_t)) > >>>> > >>>> Why don't you use the sizeof()-s from the description? If this is > >>>> really needed for some reason, then having 200 in the description and > >>>> 204 in the macro is at least confusing... > >>> > >>> MCINFO_MAXSIZE is the size for the content of the mi_data array. > >>> MCINFO_MAXSIZE + sizeof(mi_nentries) == 204. That is where is number > >>> comes from. > >> > >> I concluded that, but pointed out that seeing two different numbers made > >> me think of why this is so, whereas identical numbers would have avoided > >> that (and will likely keep others from asking later, too). > >> > >> Also, you don't say what was the reason for you to use constants instead > >> of sizeof() here. > >> > >>>>> /* Frame containing list of mfns containing list of mfns > >>>>> containing p2m. */ xen_pfn_t pfn_to_mfn_frame_list_list; > >>>>> unsigned long nmi_reason; > >>>>> + struct arch_mc_info mc_info; /* machine check information */ > >>>>> uint64_t pad[32]; > >>>>> }; > >>>> > >>>> Are you sure it is appropriate to add a member here without reducing > >>>> the padding member? > >>> > >>> You want me to replace "uint64_t pad[32];" with "uint32_t pad[13];" ? > >> > >> It would be my understanding that that's the right thing to do (assuming > >> you calculated the numbers correctly), but I'd rely on Keir to give a > >> final word on this. > >> > >> Jan -- AMD Saxony, Dresden, Germany Operating System Research Center Legal Information: AMD Saxony Limited Liability Company & Co. KG Sitz (Geschäftsanschrift): Wilschdorfer Landstr. 101, 01109 Dresden, Deutschland Registergericht Dresden: HRA 4896 vertretungsberechtigter Komplementär: AMD Saxony LLC (Sitz Wilmington, Delaware, USA) Geschäftsführer der AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |