
Re: [Xen-devel] RFC: xen config changes v4



On Fri, Feb 27, 2015 at 09:53:46AM -0800, Luis R. Rodriguez wrote:
> On Fri, Feb 27, 2015 at 6:30 AM, Juergen Gross <jgross@xxxxxxxx> wrote:
> > On 02/27/2015 02:38 PM, Stefano Stabellini wrote:
> >>
> >> On Fri, 27 Feb 2015, Juergen Gross wrote:
> >>>
> >>> On 02/27/2015 01:24 PM, Stefano Stabellini wrote:
> >>>>
> >>>> On Fri, 27 Feb 2015, Juergen Gross wrote:
> >>>>>
> >>>>> On 02/27/2015 11:11 AM, Stefano Stabellini wrote:
> >>>>>>
> >>>>>> On Fri, 27 Feb 2015, Juergen Gross wrote:
> >>>>>>>
> >>>>>>> On 02/27/2015 10:41 AM, Stefano Stabellini wrote:
> >>>>>>>>
> >>>>>>>> On Fri, 27 Feb 2015, Juergen Gross wrote:
> >>>>>>>>>
> >>>>>>>>> On 02/26/2015 06:42 PM, Stefano Stabellini wrote:
> >>>>>>>>>>
> >>>>>>>>>> On Thu, 26 Feb 2015, Luis R. Rodriguez wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Feb 26, 2015 at 11:08:20AM +0000, Stefano Stabellini wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, 26 Feb 2015, David Vrabel wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 26/02/15 04:59, Juergen Gross wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> So we are again in the situation that pv-drivers always imply the
> >>>>>>>>>>>>>> pvops kernel (PARAVIRT selected). I started the whole Kconfig rework
> >>>>>>>>>>>>>> to eliminate this dependency.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Yes.  Can you produce a series that just addresses this one issue.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In the absence of any concrete requirement for this big Kconfig reorg
> >>>>>>>>>>>>> I don't think it is helpful.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> I clearly missed some context as I didn't realize that this was the
> >>>>>>>>>>>> intended goal. Why do we want this? Please explain as it won't come
> >>>>>>>>>>>> for free.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> We have a few PV interfaces for HVM guests that need PARAVIRT in Linux
> >>>>>>>>>>>> in order to be used, for example pv_time_ops and HVMOP_pagetable_dying.
> >>>>>>>>>>>> They are critical performance improvements and from the interface
> >>>>>>>>>>>> perspective, small enough that doesn't make much sense having a
> >>>>>>>>>>>> separate KConfig option for them.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> In order to reach the goal above we necessarily need to introduce a
> >>>>>>>>>>>> differentiation in terms of PV on HVM guests in Linux:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1) basic guests with PV network, disk, etc but no PV timers, no
> >>>>>>>>>>>>         HVMOP_pagetable_dying, no PV IPIs
> >>>>>>>>>>>> 2) full PV on HVM guests that have PV network, disk, timers,
> >>>>>>>>>>>>         HVMOP_pagetable_dying, PV IPIs and anything else that makes
> >>>>>>>>>>>>         sense.
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2) is much faster than 1) on Xen and 2) is only a tiny bit slower than
> >>>>>>>>>>>> 1) on native x86
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Also don't we shove 2) down hvm guests right now? Even when everything
> >>>>>>>>>>> is built in I do not see how we opt out for HVM for 1) at run time
> >>>>>>>>>>> right now.
> >>>>>>>>>>>
> >>>>>>>>>>> If this is true then the question of motivation for this becomes even
> >>>>>>>>>>> stronger I think.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Yes, indeed there is no way to do 1) at the moment. And for good
> >>>>>>>>>> reasons, see above.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Hmm, after checking the code I'm not convinced:
> >>>>>>>>>
> >>>>>>>>> - HVMOP_pagetable_dying is obsolete on modern hardware supporting
> >>>>>>>>>       EPT/HAP
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> That might be true, but what about older hardware?
> >>>>>>>> Even on modern hardware a few workloads still run faster on shadow.
> >>>>>>>> But if HVMOP_pagetable_dying is the only reason to keep PARAVIRT for
> >>>>>>>> HVM guests, then I agree with you that we should remove it.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> - PV IPIs are not needed on single-vcpu guests
> >>>>>>>>>
> >>>>>>>>> - PARAVIRT_CLOCK doesn't need PARAVIRT (in fact the SUSEs kernel
> >>>>>>>>>       configs for all x86_64 kernels have CONFIG_PARAVIRT_CLOCK=y)
> >>>>>>>>>
> >>>>>>>>> So I think we really should enable building Xen frontends without
> >>>>>>>>> PARAVIRT, implying at least no XEN_PV and no XEN_PVH.
> >>>>>>>>>
> >>>>>>>>> I'll have a try setting up patches.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> If we are doing this as a performance improvement, I would like to
> >>>>>>>> see a couple of benchmarks (kernbench, hackbench) to show that on a
> >>>>>>>> single-vcpu guest and multi-vcpu guest (let's say 4 vcpus) disabling
> >>>>>>>> PARAVIRT leads to better performance on Xen on EPT hardware.
> >>>>>>>
> >>>>>>>
> >>>>>>> This is not meant to be a performance improvement. It is meant to
> >>>>>>> enable a standard distro kernel configured without PARAVIRT to be
> >>>>>>> able to run as a HVM guest using the pv-drivers.
> >>>>>>
> >>>>>>
> >>>>>> This is not a convincing explanation.  Debian, Ubuntu and Fedora seems
> >>>>>> to be able to cope with it just fine.
> >>>>>>
> >>>>>> Why do you want to do that, even though it will cause a performance
> >>>>>> regression and a maintenance pain?  You haven't provided a reason yet.
> >>>>>>
> >>>>>
> >>>>> Either we are talking about different things, or I really don't
> >>>>> understand your problem here. I don't want to disable something. I
> >>>>> just want to enable kernels without PARAVIRT to run under Xen better
> >>>>> than today. Being it 32 bit non-PAE kernels as Ian pointed out or
> >>>>> distro kernels like e.g. SLES and probably RHEL.
> >>>>>
> >>>>> Using PV frontends is completely orthogonal to other PV enhancements
> >>>>> like PARAVIRT_CLOCK, HVMOP_pagetable_dying or PV IPIs. So why do you
> >>>>> object enabling the PV frontends for those kernels?
> >>>>
> >>>>
> >>>> I am for it.  I would like to avoid two user visible XEN enablement
> >>>> options (XEN_FRONTEND vs. XEN_PVHVM) for x86_64 and PAE HVM guests to
> >>>> avoid configurations with just XEN_FRONTEND, that can be considered a
> >>>> performance regression compared to what we have now (on x86_64 and PAE).
> >>>
> >>>
> >>> Would you be okay with making this an expert configuration alternative
> >>> for PAE/x86_64? This would enable the possibility to use PV drivers for
> >>> native-performance-tuned kernels. I would explicitly mention the better
> >>> alternative XEN_PVHVM in the Kconfig help text.
> >>
> >>
> >> I would prefer to hide it on PAE and x86_64.
> >
> >
> > Okay, as long as it is still _possible_ somehow to configure it.
> 
> That begs the question, all this just for 32-bit non-PAE ?

There was another reason. Some distros remove CONFIG_XEN_DOM0 altogether
even though they do enable the rest of the pieces (backends, frontends, etc).

Which raises the question - why do we care about DOM0 at all?

What we care about is drivers - either frontend or backend. If we want
backends and we want PV - then we want to build a kernel that can boot as
a normal PV guest or as a dom0 PV.

Ditto for HVM - if you want to build a kernel that won't do PV but
can do backends - we should be able to do that.

Or PVH - we want a domain that can be a backend (or frontend).

That does mean that "PV" gets broken down further into concrete
pieces that have nothing to do with drivers.

The idea would be that you would just select four knobs:

 Yes/No Backend PV drivers [and maybe remove the PV part?]
 Yes/No Frontend PV drivers [and maybe remove the PV part?]
 Yes/No PV support (so utilizing the PV ABI)
 Yes/No PVH support (a stricter subset of PV ABI - with fewer pieces)

The HVM support would automatically be picked if the config had
the 'baremetal' type support - like IOAPIC, APIC, ACPI, etc.

So if you said Y, N, N, N, the kernel would only be able to
boot in HVM mode but still have pciback, netback, scsiback, blkback, and
usbback (good for a device backend). And it could be a PAE or non-PAE kernel.

If you said N,Y,Y,Y then it could boot under HVM, PV, PVH, and only
have pcifront, netfront, scsifront, blkfront, and usbfront.
(not very good for an initial domain).

And so on.
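
To make the combinations easier to eyeball, here is a rough Python sketch of
the four knobs. This is not kernel code and not a proposed Kconfig layout: the
names (XenKnobs, boot_modes, drivers, the baremetal flag) are made up for
illustration, and the driver lists are simply the ones named above.

```python
from itertools import product
from typing import List, NamedTuple

# Driver names taken from the examples in this mail.
BACKENDS = ["pciback", "netback", "scsiback", "blkback", "usbback"]
FRONTENDS = ["pcifront", "netfront", "scsifront", "blkfront", "usbfront"]


class XenKnobs(NamedTuple):
    """Hypothetical model of the four Yes/No knobs, in the order listed."""
    backend: bool   # backend drivers
    frontend: bool  # frontend drivers
    pv: bool        # PV support (uses the PV ABI)
    pvh: bool       # PVH support (stricter subset of the PV ABI)


def boot_modes(k: XenKnobs, baremetal: bool = True) -> List[str]:
    """Modes the kernel could boot in.

    Assumption: HVM is picked automatically whenever the config has the
    'baremetal' pieces (IOAPIC, APIC, ACPI, ...), modelled here as one flag.
    """
    modes = []
    if baremetal:
        modes.append("HVM")
    if k.pv:
        modes.append("PV")
    if k.pvh:
        modes.append("PVH")
    return modes


def drivers(k: XenKnobs) -> List[str]:
    """Drivers that would be built for a given knob setting."""
    built = []
    if k.backend:
        built += BACKENDS
    if k.frontend:
        built += FRONTENDS
    return built


if __name__ == "__main__":
    # Walk all 16 combinations of the four knobs.
    for combo in product([True, False], repeat=4):
        k = XenKnobs(*combo)
        flags = ",".join("Y" if v else "N" for v in k)
        print(flags, boot_modes(k), drivers(k))
```

With that model, XenKnobs(True, False, False, False) is the Y,N,N,N case
(HVM only, backends only) and XenKnobs(False, True, True, True) is the
N,Y,Y,Y case (HVM, PV, PVH with frontends only).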

I hope I haven't confused the matter?
> 
>  Luis
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel

