Re: DMA restriction and NUMA node number
On 13.07.2021 11:26, Julien Grall wrote:
> On 13/07/2021 04:19, Wei Chen wrote:
>> I am doing some NUMA testing on Xen. And I find the DMA restriction is
>> based on NUMA node number [1].
>>
>>     if ( !dma_bitsize && (num_online_nodes() > 1) )
>>         dma_bitsize = arch_get_dma_bitsize();
>>
>> On Arm64, we will set dma_bitsize [2] to 0, which means we don't need to
>> reserve DMA memory. But when num_online_nodes > 1, dma_bitsize will be
>> overridden to 32. This may be caused by the Arm64 version of
>> arch_get_dma_bitsize; it may be a simple implementation that is not
>> NUMA aware.
>>
>> But I am still quite curious about why the DMA restriction depends on
>> the NUMA node number.

So really do you mean "node count", not "node number"?

>> In Arm64, dma_bitsize does not change when the NUMA node changes. So we
>> didn't expect arch_get_dma_bitsize to be called here.
>>
>> I copied Keir's commit message from 2008. It seems this code was
>> considered only for x86 when he was working on it. But I'm not an x86
>> expert, so I hope the Xen x86 folks can give some help. Understanding
>> this will help us to
>
> It is best to CC the relevant person so they know you have requested
> their input. I have added the x86 maintainers to the thread.
>
>> do some adaptations to Arm in subsequent modifications : )
>>
>> commit accacb43cb7f16e9d1d8c0e58ea72c9d0c32cec2
>> Author: Keir Fraser <keir.fraser@xxxxxxxxxx>
>> Date: Mon Jul 28 16:40:30 2008 +0100
>>
>>     Simplify 'dma heap' logic.
>>
>>     1. Only useful for NUMA systems, so turn it off on non-NUMA systems
>>        by default.
>>     2. On NUMA systems, by default relate the DMA heap size to NUMA
>>        node 0 memory size (so that not all of node 0's memory ends up
>>        being 'DMA heap').
>>     3. Remove the 'dma emergency pool'. It's less useful now that
>>        running out of low memory isn't as fatal as it used to be
>>        (e.g., when we needed to be able to allocate low-memory PAE
>>        page directories).

So: on x86, memory starts from 0, and we want to be cautious with giving
out memory that may be needed for special purposes (first and foremost
DMA). With the buddy allocator working from high addresses to lower ones,
low addresses will be used last (unless specifically requested) without
any further precautions when not taking NUMA into account. This in
particular covers the case of just a single NUMA node. When taking NUMA
into account, we need to be more careful: if a single node contains the
majority (or all) of the more precious memory, we want to prefer
non-local allocations over exhausting the more precious memory ranges.
Hence we need to set aside some largely arbitrary amount, allocation of
which would happen only after also exhausting all other nodes' memory.

I hope I have suitably reconstructed the thinking back then. And yes,
there are x86 implications in here.

Jan
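To make the ordering in Jan's explanation concrete, here is a minimal
sketch in plain C. It is not Xen code: the two-node layout, the sizes
and the pick_node() helper are invented for illustration, and only the
two-pass ordering mirrors the reasoning above. Allocations without a
DMA requirement first look for memory above the 1 << dma_bitsize
boundary on the local node, then on remote nodes, and only as a last
resort take pages from the set-aside low range.

/* Minimal sketch, not Xen code: the node layout, sizes and pick_node()
 * are invented for illustration.  Only the two-pass ordering mirrors
 * the explanation above: prefer memory above the DMA boundary on any
 * node before handing out pages below it. */
#include <stdint.h>
#include <stdio.h>

#define NR_NODES 2

struct node {
    uint64_t free_above_dma;  /* free bytes above the 4GiB boundary
                                 (i.e. above 1 << 32 for dma_bitsize = 32) */
    uint64_t free_below_dma;  /* free bytes below that boundary */
};

static struct node nodes[NR_NODES] = {
    /* node 0 owns all of the low, DMA-capable memory */
    { .free_above_dma = 1ULL << 30, .free_below_dma = 3ULL << 30 },
    /* node 1 has high memory only */
    { .free_above_dma = 2ULL << 30, .free_below_dma = 0 },
};

/* Pick a node for an allocation that has no DMA requirement. */
static int pick_node(int local, uint64_t bytes)
{
    /* Pass 1: local node first, then the others, but only using memory
     * above the DMA boundary.  This is the "prefer non-local over
     * exhausting the precious low range" rule. */
    for ( int i = 0; i < NR_NODES; i++ )
    {
        int n = (local + i) % NR_NODES;
        if ( nodes[n].free_above_dma >= bytes )
        {
            nodes[n].free_above_dma -= bytes;
            return n;
        }
    }
    /* Pass 2: only now fall back to the set-aside low range. */
    for ( int i = 0; i < NR_NODES; i++ )
    {
        int n = (local + i) % NR_NODES;
        if ( nodes[n].free_below_dma >= bytes )
        {
            nodes[n].free_below_dma -= bytes;
            return n;
        }
    }
    return -1;  /* out of memory */
}

int main(void)
{
    /* Five 1GiB allocations local to node 0: node 0's high memory goes
     * first, then node 1's, and only then node 0's low memory. */
    for ( int i = 0; i < 5; i++ )
        printf("allocation %d -> node %d\n", i, pick_node(0, 1ULL << 30));
    return 0;
}

Built with any C99-or-later compiler, the five allocations come out as
nodes 0, 1, 1, 0, 0: local high memory first, then remote memory, and
the reserved low range only once everything else is gone. This is just
a model of the policy described in the mail; Xen's real allocator
expresses the same idea through dma_bitsize and the order in which its
heap memory is handed out, not through an explicit helper like this.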