Re: [Xen-devel] Design session report: Live-Updating Xen
On 17.07.2019 20:40, Andrew Cooper wrote:
> On 17/07/2019 14:02, Jan Beulich wrote:
>> On 17.07.2019 13:26, Andrew Cooper wrote:
>>> We do not want to be grovelling around in the old Xen's data
>>> structures, because that adds a binary A=>B translation which is
>>> per-old-version-of-Xen, meaning that you need a custom build of
>>> each target Xen which depends on the currently-running Xen, or have
>>> to maintain a matrix of old versions which will be dependent on the
>>> local changes, and therefore not suitable for upstream.
>>
>> Now the question is what alternative you would suggest. By you
>> saying "the pinned state lives in the migration stream", I assume
>> you mean to imply that Dom0 state should be handed from old to
>> new Xen via such a stream (minus raw data page contents)?
>
> Yes, and this is explicitly identified in the bullet point saying "We
> do only rely on domain state and no internal xen state".
>
> In practice, it is going to be far more efficient to have Xen
> serialise/deserialise the domain register state etc., than to bounce
> it via hypercalls. By the time you're doing that in Xen, adding Dom0
> as well is trivial.

So I must be missing some context here: How could hypercalls come into
the picture at all when it comes to "migrating" Dom0?

>>> The in-guest evtchn data structure will accumulate events just like
>>> a posted interrupt descriptor. Real interrupts will queue in the
>>> LAPIC during the transition period.
>>
>> Yes, that'll work as long as interrupts remain active from Xen's
>> POV. But if there's concern about a blackout period for HVM/PVH,
>> then surely there would also be such for PV.
>
> The only fix for that is to reduce the length of the blackout period.
> We can't magically inject interrupts half way through the xen-to-xen
> transition, because we can't run vcpus at that point in time.

Hence David's proposal to "re-inject". We'd have to record them during
the blackout period, and inject once Dom0 is all set up again.

>>>> Re-using large data structures (or arrays thereof) may also turn
>>>> out useful in terms of latency until the new Xen actually becomes
>>>> ready to resume.
>>>
>>> When it comes to optimising the latency, there is a fair amount we
>>> might be able to do ahead of the critical region, but I still think
>>> this would be better done in terms of a "clean start" in the new
>>> Xen to reduce binary dependencies.
>>
>> Latency actually is only one aspect (albeit the larger the host, the
>> more relevant it is). Sufficient memory to have both old and new
>> copies of the data structures in place, plus the migration stream,
>> is another. This would especially become relevant when even DomU-s
>> were to remain in memory, rather than getting saved/restored.
>
> But we're still talking about something which is on a multi-MB scale,
> rather than multi-GB scale.

On multi-TB systems frame_table[] is a multi-GB table: it holds one
struct page_info (about 32 bytes on x86) per 4KiB page, i.e. close to
1% of host RAM, and hence tens of GB on a multi-TB host. And with boot
times often scaling (roughly) with system size, live updating is
(I guess) all the more interesting on bigger systems.

Jan
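
As an illustration of the record-based handoff discussed above, here is
a minimal sketch in C. All names, types, and layouts are invented for
the purpose of the example; this is not Xen's actual live-update or
migration-stream interface. The point is that records are
self-describing and versioned, so the new Xen only needs to understand
the record format, never the old Xen's internal data structures:

    /*
     * Hypothetical sketch: the old Xen serialises per-vCPU register
     * state into self-describing, versioned records; the new Xen
     * consumes whichever record versions it understands.
     */
    #include <stdint.h>
    #include <string.h>

    #define LU_REC_VCPU_REGS  1   /* invented record type */

    struct lu_record_hdr {
        uint32_t type;     /* what the payload describes */
        uint32_t version;  /* payload layout version, not Xen version */
        uint64_t length;   /* payload bytes following this header */
    };

    struct lu_vcpu_regs_v1 {
        uint32_t vcpu_id;
        uint64_t rip, rsp, rflags;
        /* ... remaining architectural state ... */
    };

    /* Old Xen: append one record per vCPU to the stream buffer. */
    static size_t lu_save_vcpu(uint8_t *buf,
                               const struct lu_vcpu_regs_v1 *regs)
    {
        struct lu_record_hdr hdr = {
            .type = LU_REC_VCPU_REGS,
            .version = 1,
            .length = sizeof(*regs),
        };

        memcpy(buf, &hdr, sizeof(hdr));
        memcpy(buf + sizeof(hdr), regs, sizeof(*regs));
        return sizeof(hdr) + sizeof(*regs);
    }

The consuming side would dispatch on hdr.type and hdr.version, and can
skip records it does not recognise via hdr.length, which is what
removes the per-old-version-of-Xen coupling objected to above.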
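
Likewise, a minimal sketch, again with invented names, of the "record
during the blackout, re-inject once Dom0 is all set up again" approach
for interrupts that arrive while no vCPU can run:

    #include <stdint.h>

    #define NR_VECTORS 256

    /* One bit per vector, latched across the xen-to-xen transition. */
    static uint64_t pending[NR_VECTORS / 64];

    /* During the blackout: latch the vector instead of injecting it. */
    static void blackout_record_irq(unsigned int vector)
    {
        pending[vector / 64] |= 1ULL << (vector % 64);
    }

    /* After the new Xen has restored Dom0: replay everything latched. */
    static void blackout_reinject(void (*inject)(unsigned int vector))
    {
        for (unsigned int v = 0; v < NR_VECTORS; v++)
            if (pending[v / 64] & (1ULL << (v % 64)))
                inject(v);
    }

As with a posted interrupt descriptor, multiple arrivals of the same
vector during the blackout collapse into a single bit, and hence a
single re-injection.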