Xen project Mailing List

Re: Proposal for deviations in static analyser findings

From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>

Date: Tue, 18 Oct 2022 16:11:43 +0000

Accept-language: en-GB, en-US

Arc-authentication-results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=lists.xenproject.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com])

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none

Arc-message-signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ZFQ3hU6pgKRN9KLaAKrTOdnpabWGjgVOhZI8BI+ZFYE=; b=XPSq14Yw4Kk2xzZk2ihkWdMSOb4Wio2Q5WPhG8plVO/5y13FtzhbYKz/8R01yx9vbyloIE6JMFZ+lgwS9jGozDADD9MtXPSh/bnHKQ4DYZqrzrqbN37R/dK66xhYowf+3dGvKqFkX+kSj5qoSVPCoVrmttHWn8NsfhTUhT63YZpuslvBXWzWwcYN82soTx4I/cqVNZacNAPuHi+HmsbMzysBuQtV7idlXlG51PMVgeYBDpXWvJbjNZshW/xzYpVDvkPKF2R0PLC3hRtChqqXP25vLIt7AFl8HSkBaZMatfFhlpvQ155HXMHuufL1WLl94Ies+ZFqauyqc3tmRHHmbg==

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ZFQ3hU6pgKRN9KLaAKrTOdnpabWGjgVOhZI8BI+ZFYE=; b=a2gmSSLm8HJzbvbH5g7CL11xcyGn2bmVXmmj5mveWiZBij/1jEZYBNSZKvF+9vCZ3to4k4KnPu8x6lRhYdUkHKazKkwdjD3zPfn2qf3G32hyYRYg/iyPhhC3/j/5iGMWx2bG0GIQ8/J65uF7fva7qDiTQL/wZV6AoDgSPjuNCW2D0zTQUZt+ajWnXKr6QyyYt0NhCRTpq1bZqPXjSmjOac66Z8uKaZe9LV+dhA++P0i+xu5goypXSmT6vELtjqVPCriztpE3+CC4QJ2Ai/JAe98BSNCBf1XC9npgjqqHy03tNVPBcjfDg7jthRoZUAa15QU8QPOJdlDI359bnL1GmQ==

Arc-seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=eumjy+BzVu7eRBaFndoRLmBo+Re90O6CzPOEzmoCLszG+jWFTeeSkPjovTnGfANpSABZCLd7B9KTZ5kH4J8v8MoEvnniMmkaulUZCkBRZmNoNToSbEZf8zErPGYsTQ3F4Px7DjgHuDf2z3UUXtw1SPN2bNoWROvHmIvMe/5lCMY+E7x7bDaqbVo1ADWWDua6jevuszxQkR82u637P5htOdegOAaeo6EHGWGlND/fq0+5xqTXIqnQwO34NH3f3C1MRsjIPuKVVWw49klJrkd4Yrvp87BoFxJr6EySQ/Uwz23pRqYl/H7id77pEW5R/6P3vP1Q/CV4j+3jroF5jRdCKg==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Koje0jNsrMBjrc6b4Keh8d8UGbCaOm7MHmWiEMcOAjXyfatT8TG0i6DYol+T2gIO70ECUD5QyN68pWFldge1yPuVrzptxbzhJ3pk4+3Kq/VV4Rkc8fSG0TXXxVEL+wta28fQLz9PX0gY5vi3aIlh2EvY7x8GeIgwMcp/FevIondozJkcM+vpPtLhAQTpsGBaHLzhBCdgQyhXB2jP4n2bJTuykg/sesKvMnD0sTlQ2xVVxdrQtRT+FlbU0eZAxuaHBEFMChElGPa6L8H7pyLFgAEl10o8Z+gmZmhj3fWTH4u01Us5Mbcixh+jGABu2Rom4aZLtqkNFZWGy8+Px2sqsw==

Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;

Cc: Luca Fancellu <Luca.Fancellu@xxxxxxx>, Wei Chen <Wei.Chen@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 18 Oct 2022 16:12:17 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Nodisclaimer: true

Original-authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;

Thread-index: AQHY3lPHM2Tyba3hCkeKMW07GDOb2a4L9DeAgAAnewCAABdGAIAIGdUAgAADdICAAAu3gA==

Thread-topic: Proposal for deviations in static analyser findings

Hi, > On 18 Oct 2022, at 16:29, Jan Beulich <jbeulich@xxxxxxxx> wrote: > > On 18.10.2022 17:17, Luca Fancellu wrote: >>> On 13 Oct 2022, at 12:34, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>> On 13.10.2022 12:11, Luca Fancellu wrote: >>>>> On 13 Oct 2022, at 08:50, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>>>> On 12.10.2022 18:00, Luca Fancellu wrote: >>>>>> Entries in the database should never be removed, even if they are not >>>>>> used >>>>>> anymore in the code (if a patch is removing or modifying the faulty >>>>>> line). >>>>>> This is to make sure that numbers are not reused which could lead to >>>>>> conflicts >>>>>> with old branches or misleading justifications. >>>>> >>>>> Can we add provisions for shrinking such entries to e.g. just their "id" >>>>> line? Or is the intention to be able to re-use such an entry if a matching >>>>> instance appears again later? >>>> >>>> I prefer to don’t shrink it, the name itself is not very long, even using >>>> many digits of the incremental >>>> number, it removes also the dependency on the file name. >>> >>> Name length isn't relevant here, and I have no idea what dependency on a >>> file name you're thinking of. My question is a scalability one: Over time >>> the table will grow large. If all entries remain there in full forever, >>> table size may become unwieldy. >> >> Ok I misunderstood your question, now I understand what you are asking, we >> could remove the content >> of the “analyser” dictionary for sure, because if there is not anymore a >> link with the current code. >> >> Regarding removing the “name” and “text”, could it be that at some point we >> can introduce in the code >> a violation that requires the same justification provided by the “orphan” >> entry? >> In that case we could reuse that entry without creating a new one that will >> only waste space. >> What is the opinion on this? > > Well, yes, this is the one case that I, too, was wondering about. It's not > clear to me whether it wouldn't be better to allocate a fresh ID in such a > case. For traceability and release handling I think it is important that: - we never reuse the same ID in any case - we keep IDs in the database even if there is no occurrence in xen code as this will prevent adding a new ID if an existing one can be reused In a certification context, when a justification has been validated and agreed it will make life a lot easier to not modify it and reuse it in the future if it is needed. From our point of view, having a clear and simple way of handling those will make backports a lot easier. >>>>>> Here a brief explanation of the field inside an object of the "content" >>>>>> array: >>>>>> - id: it is a unique string that is used to refer to the finding, many >>>>>> finding >>>>>> can be tagged with the same id, if the justification holds for any >>>>>> applied >>>>>> case. >>>>>> It tells the tool to substitute a Xen in-code comment having this >>>>>> structure: >>>>>> /* SAF-0-safe [...] \*/ >>>>>> - analyser: it is an object containing pair of key-value strings, the >>>>>> key is >>>>>> the analyser, so it can be cppcheck, coverity or eclair. The value is the >>>>>> proprietary id corresponding on the finding, for example when coverity is >>>>>> used as analyser, the tool will translate the Xen in-code coment in this >>>>>> way: >>>>>> /* SAF-0-safe [...] \*/ -> /* coverity[coverity-id] \*/ >>>>> >>>>> In here, where would coverity-id come from? And how does the >>>>> transformation >>>>> here match up with the value of the "coverity": field in the table? >>>> >>>> I can put an example of that, as you pointed out it could be difficult to >>>> get where >>>> this proprietary tool ID comes from. >>>> >>>> The proprietary ID (Coverity in this case) comes from the report it >>>> produces: >>>> >>>> […] >>>> <file path>:<line number>: >>>> 1. proprietary_ID: […] >>>> […] >>>> >>>> after we see the finding, we take that ID, we put it in the “analyser” >>>> dictionary as: >>>> >>>> […] >>>> "id":”SAF-2-safe", >>>> “analyser”: { >>>> “coverity”: “proprietary_ID" >>>> }, >>>> […] >>>> >>>> So in the source code we will have: >>>> >>>> /* SAF-2-safe [optional text] */ >>>> C code affected line; >>>> >>>> And when the analysis will be performed, the tool (coverity for example) >>>> will run on this source code: >>>> >>>> /* coverity[proprietary_ID] */ >>>> C code affected line; >>>> >>>> The tool will write a report and will suppress the finding with >>>> “proprietary_ID” that comes in the C code >>>> line after the comment. >>> >>> Let me add some background to my earlier comment: >>> >>> If we wanted to add such IDs to the table, then I guess this would result in >>> a proliferation of entries. If my observations haven't misguided me, >>> Coverity might re-use the same ID for multiple similar new issues found in a >>> single run, but it would not re-use them across runs. Hence irrespective of >>> their similarity, multiple table entries would be needed just because of the >>> different Coverity IDs. >> >> Coverity will use every run the same id for the same class of violation, for >> example >> misra_c_2012_rule_8_6_violation for violation of rule 8.6. > > Hmm, I've never seen such. I always saw it use numeric IDs, and we've > actually been putting these in commits when addressing their findings. Here I think you are mixing a finding inside the code with the type associated. We might have several findings of the same type but with different justifications. > >>>> After the analysis, the source code will return as the original (with the >>>> SAF-* tag). >>> >>> While you mention something similar also as step 3 in the original document >>> near the top, I'm afraid I don't understand what this "return as the >>> original" >>> means. If you want to run the tool on an altered (comments modified) source >>> tree, what I'd expect you to do is clone the sources into a throw-away tree, >>> massage the comments, run the tool, and delete the massaged tree. >>>>>> if the object doesn't have a key-value, then the corresponding in-code >>>>>> comment won't be translated. >>>>> >>>>> Iirc at least Coverity ignores certain instances of what it might consider >>>>> violations (fall-through in switch() statements in particular) in case >>>>> _any_ comment is present. Therefore may I suggest that such comments be >>>>> deleted (really: replaced by a blank line, to maintain correct line >>>>> numbering) if there's no matching key-value pair? >>>> >>>> Yes the line won’t be altered if there is no match. This to ensure the >>>> correct line >>>> numbering is not affected. >>> >>> "won't be altered" is the opposite of what I've been asking to consider: >>> Observing that comments _regardless_ of their contents may silence findings, >>> the suggestion is to remove comments (leaving a blank line) when there's no >>> entry for the targeted tool in the table entry. >> >> Why? The tag comment won’t do anything, it would act as a blank line from >> the analyser >> perspective. > > The _tag_ won't do anything, but as said any _comment_ may have an effect. I am not sure I follow this one but in any case we can choose to anyway substitute the tag with something like /* Not applicable */. Cheers Bertrand > > Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.