
Re: [Xen-devel] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu



> -----Original Message-----
> From: Kevin Wolf [mailto:kwolf@xxxxxxxxxx]
> Sent: 16 October 2015 17:43
> To: Paul Durrant
> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> devel@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx
> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> missed in qemu
> 
> On 16.10.2015 at 18:20, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Kevin Wolf [mailto:kwolf@xxxxxxxxxx]
> > > Sent: 16 October 2015 17:12
> > > To: Paul Durrant
> > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > > devel@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx
> > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > > missed in qemu
> > >
> > > On 16.10.2015 at 17:10, Paul Durrant wrote:
> > > > > -----Original Message-----
> > > > > From: Kevin Wolf [mailto:kwolf@xxxxxxxxxx]
> > > > > Sent: 16 October 2015 16:02
> > > > > To: Paul Durrant
> > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > > > > qemu-devel@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx
> > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> > > > > ahci missed in qemu
> > > > >
> > > > > On 16.10.2015 at 16:24, Paul Durrant wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Kevin Wolf [mailto:kwolf@xxxxxxxxxx]
> > > > > > > Sent: 16 October 2015 15:04
> > > > > > > To: Paul Durrant
> > > > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > > > > > > qemu-devel@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx
> > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support
> > > > > > > for ahci missed in qemu
> > > > > > >
> > > > > > > On 14.10.2015 at 14:48, Paul Durrant wrote:
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
> > > > > > > > > Sent: 14 October 2015 12:12
> > > > > > > > > To: Kevin Wolf; Stefano Stabellini
> > > > > > > > > Cc: John Snow; Anthony Perard; qemu-devel@xxxxxxxxxx;
> > > > > > > > > xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx; Paul Durrant
> > > > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support
> > > > > > > > > for ahci missed in qemu
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On 14/10/2015 11:47, Kevin Wolf wrote:
> > > > > > > > > > [ CC qemu-block ]
> > > > > > > > > >
> > > > > > > > > > On 13.10.2015 at 19:10, Stefano Stabellini wrote:
> > > > > > > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > > > > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > > > > > >>>> I added ahci disk support in libxl and using it for week seems
> > > > > > > > > >>>> that was ok, after a reply of Stefano Stabellini seems that xen
> > > > > > > > > >>>> disk unplug support only ide disks:
> > > > > > > > > >>>>
> > > > > > > > > >>>> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39c905374ee8663d5d8
> > > > > > > > > >>>>
> > > > > > > > > >>>> Today Paul Durrant told me that even if pv disk is ok also with
> > > > > > > > > >>>> ahci and the emulated one is offline can be a risk:
> > > > > > > > > >>>> http://lists.xenproject.org/archives/html/win-pv-devel/2015-10/msg00021.html
> > > > > > > > > >>>>
> > > > > > > > > >>>>
> > > > > > > > > >>>> I tried to take a fast look in qemu code but I not understand
> > > > > > > > > >>>> the needed thing for add the xen disk unplug support also for
> > > > > > > > > >>>> ahci, can someone do it or tell me useful information for do it
> > > > > > > > > >>>> please?
> > > > > > > > > >>>>
> > > > > > > > > >>>> Thanks for any reply and sorry for my bad english.
> > > > > > > > > >>>>
> > > > > > > > > >>> I'm not entirely sure what features you need AHCI to support in
> > > > > > > > > >>> order for Xen to be happy.
> > > > > > > > > >>>
> > > > > > > > > >>> I'd guess hotplugging, but where I get confused is that IDE disks
> > > > > > > > > >>> don't support hotplugging either, so I guess I'm not sure what you
> > > > > > > > > >>> need.
> > > > > > > > > >>>
> > > > > > > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > > > > > > >>
> > > > > > > > > >> Hi John,
> > > > > > > > > >>
> > > > > > > > > >> we need something like hw/i386/xen/xen_platform.c:unplug_disks
> > > > > > > > > >> but that can unplug AHCI disk. And by unplug, I mean "make
> > > > > > > > > >> disappear" like pci_piix3_xen_ide_unplug does for ide.
> > > > > > > > > > Maybe this would be the right time to stop the craziness with
> > > > > > > > > > your hybrid IDE/xendisk setup. It's a horrible thing that would
> > > > > > > > > > never happen on real hardware.
> > > > > > > >
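[Editorial illustration] The gap Stefano describes can be pictured with a small Python model. This is only a sketch under an assumption: that the unplug path selects devices by PCI class code, which is why an AHCI (SATA class) controller would be skipped by a check that only looks for the IDE class. The device names and the helper `devices_to_unplug` are hypothetical and are not QEMU code; the class codes are the standard PCI values.

```python
# PCI class codes (base class 0x01 = mass storage; subclass 0x01 = IDE, 0x06 = SATA).
PCI_CLASS_STORAGE_IDE = 0x0101
PCI_CLASS_STORAGE_SATA = 0x0106


def devices_to_unplug(devices, unpluggable_classes):
    """Return the names of devices whose PCI class is in unpluggable_classes."""
    return [name for name, pci_class in devices if pci_class in unpluggable_classes]


# Hypothetical guest: one IDE controller, one AHCI controller, one NIC (class 0x0200).
guest_devices = [
    ("ide-controller", 0x0101),
    ("ahci-controller", 0x0106),
    ("nic", 0x0200),
]

# A filter that only knows about IDE leaves the AHCI controller in place.
ide_only = devices_to_unplug(guest_devices, {PCI_CLASS_STORAGE_IDE})

# Extending the filter to the SATA class would also select the AHCI controller.
ide_and_ahci = devices_to_unplug(
    guest_devices, {PCI_CLASS_STORAGE_IDE, PCI_CLASS_STORAGE_SATA}
)

print(ide_only)      # ['ide-controller']
print(ide_and_ahci)  # ['ide-controller', 'ahci-controller']
```

The model says nothing about how a controller would actually be made to disappear from the guest; it only shows the selection step.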
> > > > > > > > Unfortunately, it's going to be difficult to remove such
> > > > > > > > 'craziness' when you don't know a priori whether the VM has PV
> > > > > > > > drivers or not.
> > > > > > >
> > > > > > > Why wouldn't you know that beforehand? I mean, even on real
> > > > > > > hardware you can have different disk interfaces (IDE, AHCI, SCSI)
> > > > > > > and you install the exact driver that your hardware needs. You
> > > > > > > just do the same thing on a VM: If your hardware is PV, you
> > > > > > > install a PV driver. If your hardware is IDE, you install an IDE
> > > > > > > driver. Whether it's PV or IDE is something that you, the user,
> > > > > > > decided when configuring the VM, so you definitely know.
> > > > > > >
> > > > > >
> > > > > > That's not necessarily true. The host admin that provisions the VM
> > > > > > does not necessarily know what OS the user of that VM will install.
> > > > > > The admin may just be providing a generic VM with an emulated CD
> > > > > > drive that the user can point at any ISO they want.
> > > > > >
> > > > > > So, as a host admin, if you provide a VM with only PV backends and
> > > > > > your user is trying to boot an OS with no PV drivers they are not
> > > > > > going to be happy, so you provide emulated devices. Then, at some
> > > > > > point later, when the user installs PV drivers, there really should
> > > > > > be some way for those drivers to start up without any need to
> > > > > > contact the host admin and have the VM reconfigured.
> > > > >
> > > > > Why only IDE and xendisk then? Maybe I have an OS that works great
> > > > > with AHCI, or virtio-blk, or an LSI SCSI controller, or a Megasas SCSI
> > > > > controller, or USB sticks, or... (and IDE will hardly ever be the
> > > > > optimal one)
> > > > >
> > > > > What about network cards? My OS might support the Xen PV one, or it
> > > > > might support rtl8139, or e1000, or virtio-net, or pcnet, or...
> > > > >
> > > > > Should we always put all of the hardware that can possibly be emulated
> > > > > in a VM just so that the one right device is definitely included even
> > > > > though we don't know what OS will be running?
> > > > >
> > > > > This is ridiculous.
> > > >
> > > > It might be, but to some extent it's reality. The reason that the
> > > > default emulated network device chosen by xl is rtl8139 is that it has
> > > > drivers in just about every OS. The same reason for IDE being the
> > > > default choice for storage.
> > >
> > > So what does this mean for a justification for the AHCI + xendisk hybrid
> > > proposal?
> > >
> > > > > Just tell your admin what virtual hardware you really need. (Or tell
> > > > > them to give you a proper interface to configure your VMs yourself.)
> > > > >
> > > >
> > > > My point is that the virtual hardware that the OS user wants will
> > > > change. Before they install PV drivers, they will need emulated
> > > > devices. After installing PV drivers they will want PV devices. Should
> > > > they really have to contact their cloud provider to make the switch,
> > > > when at the moment it happens automatically and transparently (the
> > > > AHCI problem aside)?
> > >
> > > My point is that such a magic change shouldn't happen. It doesn't happen
> > > on real hardware either and people still get things installed to non-IDE
> > > disks.
> > >
> > > There is no reason to install the OS onto a different device than will
> > > be used later. With Linux, it's no problem at all because the PV drivers
> > > are already included on the installation media anyway, and on Windows or
> > > presumably any other OS you can load and install the drivers right from
> > > the beginning.
> > >
> > > In fact, I would be surprised if using xendisk instead of IDE for
> > > installing Windows didn't result in a noticeably faster installation.
> > >
> >
> > It most certainly would, but requiring users to do it this way is likely
> > to meet some resistance I suspect.
> 
> Why do you think so? Installing the PV drivers afterwards doesn't seem
> easier than just providing them during the installation.
> 

My experience of XenServer customers tells me that any form of manual 
intervention during guest install is likely to meet with resistance, 
unfortunately.

> > > Now, if you really insist on providing a legacy interface even to guests
> > > that eventually use PV drivers, there actually are sane ways to
> > > implement this. It will be tricky to make that transition now without
> > > breaking compatibility, but it could have been done from the start.
> > >
> > > Sane means for example that you don't open the same image twice (and
> > > even read-write!) at the same time. This is a recipe for disaster and
> > > it's surprising that you don't see corrupted images more often.
> > >
> >
> > We don't because unplug is supposed to ensure the emulated device is
> > gone before the PV frontend is started
> 
> The important part is the backend, but it seems that you open the second
> instance of the image only when starting the PV frontend?

I believe this is the case, yes.
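[Editorial illustration] The ordering discussed here (the emulated device must be gone before a second instance of the same image is opened) can be sketched as a simple ownership check. This is a toy Python model, not QEMU or Xen code; `ImageRegistry`, the owner names and the image path are all hypothetical.

```python
class ImageRegistry:
    """Toy model: at most one device instance may hold an image open at a time."""

    def __init__(self):
        self._owners = {}  # image path -> name of the instance holding it open

    def open_image(self, path, owner):
        current = self._owners.get(path)
        if current is not None and current != owner:
            # Refuse a second read-write instance of the same image.
            raise RuntimeError(f"{path} is still open by {current}")
        self._owners[path] = owner

    def close_image(self, path, owner):
        # Only the current holder can release the image.
        if self._owners.get(path) == owner:
            del self._owners[path]

    def owner_of(self, path):
        return self._owners.get(path)


registry = ImageRegistry()
registry.open_image("disk.img", "emulated-ide")

# Opening the PV instance before the unplug is rejected.
try:
    registry.open_image("disk.img", "pv-backend")
    second_open_allowed = True
except RuntimeError:
    second_open_allowed = False

# After the unplug releases the emulated instance, the PV instance may open it.
registry.close_image("disk.img", "emulated-ide")
registry.open_image("disk.img", "pv-backend")
```

In this model the check is enforced by the side that opens the image, so it does not depend on the guest cooperating.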

> 
> As long as you don't enable the user to use most of qemu's functionality
> like starting block jobs (which would keep the IDE instance around even
> after unplugging the disk), it might actually be safe assuming that the
> guest cooperates. Not sure what a malicious guest could do, though, as
> nobody seems to check whether IDE is really unplugged before the second
> instance is opened.

The Windows drivers do check. After the unplug Windows is asked to re-enumerate 
the IDE buses and we make sure the disks we expect to be gone are really gone.
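[Editorial illustration] The check described above (re-enumerate, then confirm the expected disks are gone before going further) might look roughly like this. It is a minimal Python sketch, not the Windows driver code; the function names and disk names are hypothetical, and the enumeration is passed in as a callable so the sketch stays self-contained.

```python
def disks_still_present(expected_gone, enumerate_ide_disks):
    """Re-enumerate and return the disks that should be gone but are not."""
    present = set(enumerate_ide_disks())
    return sorted(present & set(expected_gone))


def may_start_pv_frontend(expected_gone, enumerate_ide_disks):
    """Allow the PV frontend to start only if every unplugged disk really vanished."""
    return not disks_still_present(expected_gone, enumerate_ide_disks)


def enumeration_after_good_unplug():
    # Hypothetical result: only the emulated CD drive remains.
    return ["cdrom0"]


def enumeration_after_failed_unplug():
    # Hypothetical result: the emulated disk is still visible.
    return ["disk0", "cdrom0"]


ok = may_start_pv_frontend(["disk0"], enumeration_after_good_unplug)
blocked = may_start_pv_frontend(["disk0"], enumeration_after_failed_unplug)
leftover = disks_still_present(["disk0"], enumeration_after_failed_unplug)
```

Note that this is a guest-side check; it does not address Kevin's point about what the host verifies before opening the second instance.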

> raw and qcow2 should be safe these days, but in
> earlier times it would probably have been possible for the guest to
> overwrite the image header and access arbitrary files on the host as
> backing file. It might still be true for other image formats.
> 
> > > So if you wanted to have a clean solution, try to think how real
> > > hardware would solve the problem. If you want me to suggest something
> > > off the top of my head, I would come up with an extended IDE device (one
> > > single device!) that provides the IDE I/O ports and additionally some
> > > MMIO BAR that enables access to PV functionality.
> > >
> > > Once you enable PV functionality, the IDE ports stop working; device
> > > reset disables the PV ring and goes back to IDE mode. No hard disk
> > > suddenly disappearing from the machine, no image corruption if the IDE
> > > device is written to before enabling PV, etc.
> > >
> >
> > That's not sufficient though. The IDE device must not be enumerated by
> > the OS and, in Windows at least, that enumeration occurs before the PV
> > frontend has started up.
> 
> The trick is that it's only a single device, so there is no second
> device that must be prevented from being enumerated. You provide a
> driver for this specific IDE controller, so Windows wouldn't even try
> the generic IDE driver when your driver is available.
> 

But the whole point is that we want Windows to use the generic IDE driver. If 
we had a driver in Windows from the outset then it would be pure PV and there'd 
be no problem :-)

  Paul

> It's kind of the same sort of IDE controller extension as Bus Master
> DMA, which just added a new BAR. If you had an old driver, it would just
> ignore the new registers. If you had a new one, it would use them. But
> in no way would the old appearance of the device simply disappear, you
> just use an extended register set on the same device.
> 
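[Editorial illustration] Kevin's single-device idea (IDE ports plus an extra register set that switches the same device into PV mode, with reset going back to IDE) can be sketched as a small state machine. This is a toy Python model under stated assumptions: `HybridDisk`, the register offset `PV_ENABLE_REG` and the method names are hypothetical, and no such device exists in QEMU as far as this thread says.

```python
from enum import Enum


class Mode(Enum):
    IDE = "ide"
    PV = "pv"


class HybridDisk:
    """One device, one backing store, two mutually exclusive access paths."""

    PV_ENABLE_REG = 0x00  # hypothetical offset inside the extra MMIO BAR

    def __init__(self):
        self.mode = Mode.IDE
        self.blocks = {}  # single backing store shared by both paths

    def ide_write(self, lba, data):
        if self.mode is not Mode.IDE:
            raise RuntimeError("IDE ports are disabled while PV mode is active")
        self.blocks[lba] = data

    def mmio_write(self, offset, value):
        # A legacy driver never touches this BAR, so the device stays in IDE mode.
        if offset == self.PV_ENABLE_REG and value == 1:
            self.mode = Mode.PV

    def pv_write(self, lba, data):
        if self.mode is not Mode.PV:
            raise RuntimeError("PV ring is not enabled")
        self.blocks[lba] = data

    def reset(self):
        # Device reset disables the PV ring and returns to IDE mode.
        self.mode = Mode.IDE


disk = HybridDisk()
disk.ide_write(0, b"boot")                 # legacy path works out of the box
disk.mmio_write(HybridDisk.PV_ENABLE_REG, 1)
disk.pv_write(1, b"data")                  # PV path works after enabling

try:
    disk.ide_write(2, b"late")             # IDE path is now refused
    ide_after_pv = True
except RuntimeError:
    ide_after_pv = False

disk.reset()
mode_after_reset = disk.mode
```

Because both paths write to the same `blocks` store, there is never a second instance of the image, which is the property Kevin is asking for. The model does not address Paul's objection that Windows should keep using its generic IDE driver.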
> > > But it's your choice. You can keep your broken hack in IDE. Just don't
> > > expect anyone to support adding new broken hacks to other devices.
> > >
> >
> > I'd prefer to have a cleaner solution and I believe I can achieve that
> > in Windows by obscuring the emulated disks using filter drivers, so
> > that's the way I'll probably go.
> 
> I wouldn't consider anything that works with two distinct disk devices
> and two separate BlockDriverStates for the same image file a clean
> solution.
> 
> Kevin
