[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: Proposal for deviations in static analyser findings
Hi, > On 18 Oct 2022, at 16:29, Jan Beulich <jbeulich@xxxxxxxx> wrote: > > On 18.10.2022 17:17, Luca Fancellu wrote: >>> On 13 Oct 2022, at 12:34, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>> On 13.10.2022 12:11, Luca Fancellu wrote: >>>>> On 13 Oct 2022, at 08:50, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>>>> On 12.10.2022 18:00, Luca Fancellu wrote: >>>>>> Entries in the database should never be removed, even if they are not >>>>>> used >>>>>> anymore in the code (if a patch is removing or modifying the faulty >>>>>> line). >>>>>> This is to make sure that numbers are not reused which could lead to >>>>>> conflicts >>>>>> with old branches or misleading justifications. >>>>> >>>>> Can we add provisions for shrinking such entries to e.g. just their "id" >>>>> line? Or is the intention to be able to re-use such an entry if a matching >>>>> instance appears again later? >>>> >>>> I prefer to don’t shrink it, the name itself is not very long, even using >>>> many digits of the incremental >>>> number, it removes also the dependency on the file name. >>> >>> Name length isn't relevant here, and I have no idea what dependency on a >>> file name you're thinking of. My question is a scalability one: Over time >>> the table will grow large. If all entries remain there in full forever, >>> table size may become unwieldy. >> >> Ok I misunderstood your question, now I understand what you are asking, we >> could remove the content >> of the “analyser” dictionary for sure, because if there is not anymore a >> link with the current code. >> >> Regarding removing the “name” and “text”, could it be that at some point we >> can introduce in the code >> a violation that requires the same justification provided by the “orphan” >> entry? >> In that case we could reuse that entry without creating a new one that will >> only waste space. >> What is the opinion on this? > > Well, yes, this is the one case that I, too, was wondering about. It's not > clear to me whether it wouldn't be better to allocate a fresh ID in such a > case. For traceability and release handling I think it is important that: - we never reuse the same ID in any case - we keep IDs in the database even if there is no occurrence in xen code as this will prevent adding a new ID if an existing one can be reused In a certification context, when a justification has been validated and agreed it will make life a lot easier to not modify it and reuse it in the future if it is needed. From our point of view, having a clear and simple way of handling those will make backports a lot easier. >>>>>> Here a brief explanation of the field inside an object of the "content" >>>>>> array: >>>>>> - id: it is a unique string that is used to refer to the finding, many >>>>>> finding >>>>>> can be tagged with the same id, if the justification holds for any >>>>>> applied >>>>>> case. >>>>>> It tells the tool to substitute a Xen in-code comment having this >>>>>> structure: >>>>>> /* SAF-0-safe [...] \*/ >>>>>> - analyser: it is an object containing pair of key-value strings, the >>>>>> key is >>>>>> the analyser, so it can be cppcheck, coverity or eclair. The value is the >>>>>> proprietary id corresponding on the finding, for example when coverity is >>>>>> used as analyser, the tool will translate the Xen in-code coment in this >>>>>> way: >>>>>> /* SAF-0-safe [...] \*/ -> /* coverity[coverity-id] \*/ >>>>> >>>>> In here, where would coverity-id come from? And how does the >>>>> transformation >>>>> here match up with the value of the "coverity": field in the table? >>>> >>>> I can put an example of that, as you pointed out it could be difficult to >>>> get where >>>> this proprietary tool ID comes from. >>>> >>>> The proprietary ID (Coverity in this case) comes from the report it >>>> produces: >>>> >>>> […] >>>> <file path>:<line number>: >>>> 1. proprietary_ID: […] >>>> […] >>>> >>>> after we see the finding, we take that ID, we put it in the “analyser” >>>> dictionary as: >>>> >>>> […] >>>> "id":”SAF-2-safe", >>>> “analyser”: { >>>> “coverity”: “proprietary_ID" >>>> }, >>>> […] >>>> >>>> So in the source code we will have: >>>> >>>> /* SAF-2-safe [optional text] */ >>>> C code affected line; >>>> >>>> And when the analysis will be performed, the tool (coverity for example) >>>> will run on this source code: >>>> >>>> /* coverity[proprietary_ID] */ >>>> C code affected line; >>>> >>>> The tool will write a report and will suppress the finding with >>>> “proprietary_ID” that comes in the C code >>>> line after the comment. >>> >>> Let me add some background to my earlier comment: >>> >>> If we wanted to add such IDs to the table, then I guess this would result in >>> a proliferation of entries. If my observations haven't misguided me, >>> Coverity might re-use the same ID for multiple similar new issues found in a >>> single run, but it would not re-use them across runs. Hence irrespective of >>> their similarity, multiple table entries would be needed just because of the >>> different Coverity IDs. >> >> Coverity will use every run the same id for the same class of violation, for >> example >> misra_c_2012_rule_8_6_violation for violation of rule 8.6. > > Hmm, I've never seen such. I always saw it use numeric IDs, and we've > actually been putting these in commits when addressing their findings. Here I think you are mixing a finding inside the code with the type associated. We might have several findings of the same type but with different justifications. > >>>> After the analysis, the source code will return as the original (with the >>>> SAF-* tag). >>> >>> While you mention something similar also as step 3 in the original document >>> near the top, I'm afraid I don't understand what this "return as the >>> original" >>> means. If you want to run the tool on an altered (comments modified) source >>> tree, what I'd expect you to do is clone the sources into a throw-away tree, >>> massage the comments, run the tool, and delete the massaged tree. >>>>>> if the object doesn't have a key-value, then the corresponding in-code >>>>>> comment won't be translated. >>>>> >>>>> Iirc at least Coverity ignores certain instances of what it might consider >>>>> violations (fall-through in switch() statements in particular) in case >>>>> _any_ comment is present. Therefore may I suggest that such comments be >>>>> deleted (really: replaced by a blank line, to maintain correct line >>>>> numbering) if there's no matching key-value pair? >>>> >>>> Yes the line won’t be altered if there is no match. This to ensure the >>>> correct line >>>> numbering is not affected. >>> >>> "won't be altered" is the opposite of what I've been asking to consider: >>> Observing that comments _regardless_ of their contents may silence findings, >>> the suggestion is to remove comments (leaving a blank line) when there's no >>> entry for the targeted tool in the table entry. >> >> Why? The tag comment won’t do anything, it would act as a blank line from >> the analyser >> perspective. > > The _tag_ won't do anything, but as said any _comment_ may have an effect. I am not sure I follow this one but in any case we can choose to anyway substitute the tag with something like /* Not applicable */. Cheers Bertrand > > Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |