[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Proposal for deviations in static analyser findings


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Date: Tue, 18 Oct 2022 16:11:43 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=lists.xenproject.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com])
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ZFQ3hU6pgKRN9KLaAKrTOdnpabWGjgVOhZI8BI+ZFYE=; b=XPSq14Yw4Kk2xzZk2ihkWdMSOb4Wio2Q5WPhG8plVO/5y13FtzhbYKz/8R01yx9vbyloIE6JMFZ+lgwS9jGozDADD9MtXPSh/bnHKQ4DYZqrzrqbN37R/dK66xhYowf+3dGvKqFkX+kSj5qoSVPCoVrmttHWn8NsfhTUhT63YZpuslvBXWzWwcYN82soTx4I/cqVNZacNAPuHi+HmsbMzysBuQtV7idlXlG51PMVgeYBDpXWvJbjNZshW/xzYpVDvkPKF2R0PLC3hRtChqqXP25vLIt7AFl8HSkBaZMatfFhlpvQ155HXMHuufL1WLl94Ies+ZFqauyqc3tmRHHmbg==
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ZFQ3hU6pgKRN9KLaAKrTOdnpabWGjgVOhZI8BI+ZFYE=; b=a2gmSSLm8HJzbvbH5g7CL11xcyGn2bmVXmmj5mveWiZBij/1jEZYBNSZKvF+9vCZ3to4k4KnPu8x6lRhYdUkHKazKkwdjD3zPfn2qf3G32hyYRYg/iyPhhC3/j/5iGMWx2bG0GIQ8/J65uF7fva7qDiTQL/wZV6AoDgSPjuNCW2D0zTQUZt+ajWnXKr6QyyYt0NhCRTpq1bZqPXjSmjOac66Z8uKaZe9LV+dhA++P0i+xu5goypXSmT6vELtjqVPCriztpE3+CC4QJ2Ai/JAe98BSNCBf1XC9npgjqqHy03tNVPBcjfDg7jthRoZUAa15QU8QPOJdlDI359bnL1GmQ==
  • Arc-seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=eumjy+BzVu7eRBaFndoRLmBo+Re90O6CzPOEzmoCLszG+jWFTeeSkPjovTnGfANpSABZCLd7B9KTZ5kH4J8v8MoEvnniMmkaulUZCkBRZmNoNToSbEZf8zErPGYsTQ3F4Px7DjgHuDf2z3UUXtw1SPN2bNoWROvHmIvMe/5lCMY+E7x7bDaqbVo1ADWWDua6jevuszxQkR82u637P5htOdegOAaeo6EHGWGlND/fq0+5xqTXIqnQwO34NH3f3C1MRsjIPuKVVWw49klJrkd4Yrvp87BoFxJr6EySQ/Uwz23pRqYl/H7id77pEW5R/6P3vP1Q/CV4j+3jroF5jRdCKg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Koje0jNsrMBjrc6b4Keh8d8UGbCaOm7MHmWiEMcOAjXyfatT8TG0i6DYol+T2gIO70ECUD5QyN68pWFldge1yPuVrzptxbzhJ3pk4+3Kq/VV4Rkc8fSG0TXXxVEL+wta28fQLz9PX0gY5vi3aIlh2EvY7x8GeIgwMcp/FevIondozJkcM+vpPtLhAQTpsGBaHLzhBCdgQyhXB2jP4n2bJTuykg/sesKvMnD0sTlQ2xVVxdrQtRT+FlbU0eZAxuaHBEFMChElGPa6L8H7pyLFgAEl10o8Z+gmZmhj3fWTH4u01Us5Mbcixh+jGABu2Rom4aZLtqkNFZWGy8+Px2sqsw==
  • Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Cc: Luca Fancellu <Luca.Fancellu@xxxxxxx>, Wei Chen <Wei.Chen@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 18 Oct 2022 16:12:17 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Original-authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Thread-index: AQHY3lPHM2Tyba3hCkeKMW07GDOb2a4L9DeAgAAnewCAABdGAIAIGdUAgAADdICAAAu3gA==
  • Thread-topic: Proposal for deviations in static analyser findings

Hi,

> On 18 Oct 2022, at 16:29, Jan Beulich <jbeulich@xxxxxxxx> wrote:
> 
> On 18.10.2022 17:17, Luca Fancellu wrote:
>>> On 13 Oct 2022, at 12:34, Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>> On 13.10.2022 12:11, Luca Fancellu wrote:
>>>>> On 13 Oct 2022, at 08:50, Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>>> On 12.10.2022 18:00, Luca Fancellu wrote:
>>>>>> Entries in the database should never be removed, even if they are not 
>>>>>> used
>>>>>> anymore in the code (if a patch is removing or modifying the faulty 
>>>>>> line).
>>>>>> This is to make sure that numbers are not reused which could lead to 
>>>>>> conflicts
>>>>>> with old branches or misleading justifications.
>>>>> 
>>>>> Can we add provisions for shrinking such entries to e.g. just their "id"
>>>>> line? Or is the intention to be able to re-use such an entry if a matching
>>>>> instance appears again later?
>>>> 
>>>> I prefer to don’t shrink it, the name itself is not very long, even using 
>>>> many digits of the incremental
>>>> number, it removes also the dependency on the file name.
>>> 
>>> Name length isn't relevant here, and I have no idea what dependency on a
>>> file name you're thinking of. My question is a scalability one: Over time
>>> the table will grow large. If all entries remain there in full forever,
>>> table size may become unwieldy.
>> 
>> Ok I misunderstood your question, now I understand what you are asking, we 
>> could remove the content
>> of the “analyser” dictionary for sure, because if there is not anymore a 
>> link with the current code.
>> 
>> Regarding removing the “name” and “text”, could it be that at some point we 
>> can introduce in the code
>> a violation that requires the same justification provided by the “orphan” 
>> entry?
>> In that case we could reuse that entry without creating a new one that will 
>> only waste space.
>> What is the opinion on this?
> 
> Well, yes, this is the one case that I, too, was wondering about. It's not
> clear to me whether it wouldn't be better to allocate a fresh ID in such a
> case.

For traceability and release handling I think it is important that:
- we never reuse the same ID in any case
- we keep IDs in the database even if there is no occurrence in xen code as 
this will prevent adding a new ID if an existing one can be reused

In a certification context, when a justification has been validated and agreed 
it will make life a lot easier to not modify it and reuse it in the future if 
it is needed.
From our point of view, having a clear and simple way of handling those will 
make backports a lot easier.


>>>>>> Here a brief explanation of the field inside an object of the "content" 
>>>>>> array:
>>>>>> - id: it is a unique string that is used to refer to the finding, many 
>>>>>> finding
>>>>>> can be tagged with the same id, if the justification holds for any 
>>>>>> applied
>>>>>> case.
>>>>>> It tells the tool to substitute a Xen in-code comment having this 
>>>>>> structure:
>>>>>> /* SAF-0-safe [...] \*/
>>>>>> - analyser: it is an object containing pair of key-value strings, the 
>>>>>> key is
>>>>>> the analyser, so it can be cppcheck, coverity or eclair. The value is the
>>>>>> proprietary id corresponding on the finding, for example when coverity is
>>>>>> used as analyser, the tool will translate the Xen in-code coment in this 
>>>>>> way:
>>>>>> /* SAF-0-safe [...] \*/ -> /* coverity[coverity-id] \*/
>>>>> 
>>>>> In here, where would coverity-id come from? And how does the 
>>>>> transformation
>>>>> here match up with the value of the "coverity": field in the table?
>>>> 
>>>> I can put an example of that, as you pointed out it could be difficult to 
>>>> get where
>>>> this proprietary tool ID comes from.
>>>> 
>>>> The proprietary ID (Coverity in this case) comes from the report it 
>>>> produces:
>>>> 
>>>> […]
>>>> <file path>:<line number>:
>>>> 1. proprietary_ID: […]
>>>> […]
>>>> 
>>>> after we see the finding, we take that ID, we put it in the “analyser” 
>>>> dictionary as:
>>>> 
>>>> […]
>>>> "id":”SAF-2-safe",
>>>> “analyser”: {
>>>>    “coverity”: “proprietary_ID"
>>>> },
>>>> […]
>>>> 
>>>> So in the source code we will have:
>>>> 
>>>> /* SAF-2-safe [optional text] */
>>>> C code affected line;
>>>> 
>>>> And when the analysis will be performed, the tool (coverity for example) 
>>>> will run on this source code:
>>>> 
>>>> /* coverity[proprietary_ID] */
>>>> C code affected line;
>>>> 
>>>> The tool will write a report and will suppress the finding with 
>>>> “proprietary_ID” that comes in the C code
>>>> line after the comment.
>>> 
>>> Let me add some background to my earlier comment:
>>> 
>>> If we wanted to add such IDs to the table, then I guess this would result in
>>> a proliferation of entries. If my observations haven't misguided me,
>>> Coverity might re-use the same ID for multiple similar new issues found in a
>>> single run, but it would not re-use them across runs. Hence irrespective of
>>> their similarity, multiple table entries would be needed just because of the
>>> different Coverity IDs.
>> 
>> Coverity will use every run the same id for the same class of violation, for 
>> example
>> misra_c_2012_rule_8_6_violation for violation of rule 8.6.
> 
> Hmm, I've never seen such. I always saw it use numeric IDs, and we've
> actually been putting these in commits when addressing their findings.

Here I think you are mixing a finding inside the code with the type associated.
We might have several findings of the same type but with different 
justifications.

> 
>>>> After the analysis, the source code will return as the original (with the 
>>>> SAF-* tag).
>>> 
>>> While you mention something similar also as step 3 in the original document
>>> near the top, I'm afraid I don't understand what this "return as the 
>>> original"
>>> means. If you want to run the tool on an altered (comments modified) source
>>> tree, what I'd expect you to do is clone the sources into a throw-away tree,
>>> massage the comments, run the tool, and delete the massaged tree.
>>>>>> if the object doesn't have a key-value, then the corresponding in-code
>>>>>> comment won't be translated.
>>>>> 
>>>>> Iirc at least Coverity ignores certain instances of what it might consider
>>>>> violations (fall-through in switch() statements in particular) in case
>>>>> _any_ comment is present. Therefore may I suggest that such comments be
>>>>> deleted (really: replaced by a blank line, to maintain correct line
>>>>> numbering) if there's no matching key-value pair?
>>>> 
>>>> Yes the line won’t be altered if there is no match. This to ensure the 
>>>> correct line
>>>> numbering is not affected.
>>> 
>>> "won't be altered" is the opposite of what I've been asking to consider:
>>> Observing that comments _regardless_ of their contents may silence findings,
>>> the suggestion is to remove comments (leaving a blank line) when there's no
>>> entry for the targeted tool in the table entry.
>> 
>> Why? The tag comment won’t do anything, it would act as a blank line from 
>> the analyser
>> perspective.
> 
> The _tag_ won't do anything, but as said any _comment_ may have an effect.

I am not sure I follow this one but in any case we can choose to anyway 
substitute the tag with something like /* Not applicable */.

Cheers
Bertrand

> 
> Jan


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.