[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] device-tree: optimize dt_device_for_passthrough()


  • To: Julien Grall <julien@xxxxxxx>, "Orzel, Michal" <michal.orzel@xxxxxxx>, Grygorii Strashko <gragst.linux@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Grygorii Strashko <grygorii_strashko@xxxxxxxx>
  • Date: Wed, 12 Feb 2025 12:16:05 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=SHy/JMCn/bf/bvBaFQ5KcynEv6NkeHeNrmVO5RhcjTw=; b=OG7cntWXkAkauLeYm1d2Gv6gWcOq4kZYU3b/UTAK1O5gDwTSt5Clt8AS2UWDQx1Ds/sP5V9UU2ZlM3lsuZL8rYebcC+iWp9QqAJb8fqyKwH9tOsR/8kVNexR+wCAegpEBjzU+sejbA46ea9a5UUFeYmqOR7QhlwiV+xiTHX2X0uvmCNifoYnpkoMw03qlOg9qWGkJXOGxuBd3exAnuAnKspLHvulIyyvo2zFiOtHwfXMGXrEpeC3YqL0EKRFxU/ZI86a8vc3SCJ11CNk3cWyTy4ZCCJpsDgp3nO1fE3+BqpW4t4/1pZ3ZmMAc7EEd2TepW9BC7t+2nIAbNOd61t4lQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=ysQ7IesBRpZqR0Ywm5HgfjJElE5d91pz9GH74x651MLu5dyaxdJmlZzpcnq3Rz1IpZTazC4Vy9gTjifOPBki8BcK6VFSdO7AkBe4L8q3TZyGCl/80EsGaowBfpAkS7s6e5uBoybA1Sx1HGsqgvMu6EdECS6GcBw9tPmWeakaUfE81Nd8bG6B9xnu/AXrtN2kCKgOwCViW+gaa95WdATnu5Qpcqg1OUpHUS+C3AKlQpGRH8AK+nRdZuDUg2U1XewL/N8FAMx4HOZjB5pUNHXN/mm+QLdu8JTXy+C9EGtPV6yPVXdjr7wbXsPWcy7+L46Ea2aXoA22LL0J1RlaahSAVw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=epam.com;
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>
  • Delivery-date: Wed, 12 Feb 2025 10:16:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>



On 12.02.25 11:04, Julien Grall wrote:
Hi,

On 11/02/2025 15:14, Grygorii Strashko wrote:


On 11.02.25 14:32, Julien Grall wrote:


On 11/02/2025 11:57, Orzel, Michal wrote:
On 11/02/2025 12:18, Grygorii Strashko wrote:


The dt_device_for_passthrough() is called many times during Xen
initialization and Dom0 creation. On every call it traverses struct
dt_device_node properties list and compares compares properties name with
double "compares"

10x


"xen,passthrough" which is runtime overhead. This can be optimized by
Are you sure? Looking at the calls, it's almost only used at boot except for dt
overlay.

marking dt_device_node as passthrough while unflattening DT.

This patch introduced new struct dt_device_node property "is_passthrough"
which is filled if "xen,passthrough" property is present while unflattening
DT and dt_device_for_passthrough() just return it's value.
In the past we were skeptical about adding new fields to the dt_device_node
structure for use cases like this one. I would say this optimization is not
worth it. Also, why would you optimize dt_device_for_passthrough but not
e.g. dt_device_is_available.

So we are trading speed with memory usage. It looks like we may be using a 
padding, although I didn't check whether the existing structure could be 
packed...

Actually I see it consumes nothing due to alignments:
- before sizeof(dt_device_node)=144
- after sizeof(dt_device_node)=144

But it could be made bool is_passthrough:1; together with other bools, and 
probably moved at the end of struct dt_device_node.

By the way, below diff can save 8 bytes on arm64 per struct dt_device_node.



You can check with other Arm maintainers.

Before forging an opinion, I'd like to see some numbers showing the performance 
improvement. Also, is this impacting only boot?

Sry, indeed only boot, need to be more specific.

My DT: ~700 nodes
Number of calls till the end of create_dom0():
(XEN) =============== dt_device_for_passthrough=2684 
dt_device_is_available=1429.

I've enabled console_timestamps=boot and did 5 boots and calculated average - 
it gives ~20 microseconds improvement.

This doesn't seem to be worth it. But I also don't know what's the normal boot 
time on your system... I guess we are still in seconds?

Yes. in general if exclude SILO timeout.

(XEN) [    0.433789] ***************************************************
(XEN) [    0.434588] WARNING: SILO mode is not enabled.
(XEN) [    0.435204] It has implications on the security of the system,
(XEN) [    0.435992] unless the communications have been forbidden between
(XEN) [    0.436813] untrusted domains.
(XEN) [    0.437256] ***************************************************
(XEN) [    0.438055] 3... 2... 1...
(XEN) [    3.438566] *** Serial input to DOM0 (type 'CTRL-a' three times to 
switch input)
(XEN) [    3.439559] Freed 400kB init memory.



Also, I agree with Michal, if this is a concern for 
dt_device_device_for_passthrough(). Then it would be a concern for a few others 
calls using dt_find_property(). Are you going to modify all of them?

I follow the rule one patch one functional change. Regarding further optimization - I think only 
generic properties can be optimized and personally I see now only "xen,passthrough" and 
may be "status".

Ok. 20 microseconds. it's probably more like a measurement error margin.
Please advice if i should continue or drop?

See above. Regardless that, would you be able to provide a bit more information 
of your end goal?Are you intending to be able to boot Xen/dom0/dom0less guest 
in less than N milliseconds?
How far are you from that target? Are those numbers done on the latest Xen?

It's more like result of code studying from my side as Xen newbie.
I've considered it as kinda "obvious" change - calc some value once is better 
then do the same calculations many times.
Ok, it's not obvious. I'll drop it.


Asking because there are probably bigger place where optimization can be done 
first.

Thanks,
-grygorii



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.