[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Proposal: List files in Xen originated from external sources


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Luca Fancellu <Luca.Fancellu@xxxxxxx>
  • Date: Mon, 6 Feb 2023 12:21:48 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Nk01eoEzdydpuKXtKkiVQl/biFAmUsr/YQFF4kLQvgQ=; b=NlZB0wtqfLv3gl4e3h+RC8YMzE2GQpLJYFUIhNWvi7tXGNMocvJM2fj/RYPxWYlPjkIq42M2ARnyauRRNOHdFkf3/d/+mlQykcrPpwDQ+SOxfDbAMuVdSWT/QqrtDsbY8Uv/Pb+jTdcjVl67ktql0Y2W38DD41GpZiXtnFoFx1w1WKpDK6y17kJjfp3Art8jJBIoDMHkaVbt3PF9i7hOCQBM+Ztqam/AB/7YOrwUClLOraf+izOQb7/VwBbJEmZXBEI1TXJVoHRGK9piSVP5f0yDd9lu15/hpu+eLQh3ovmpGtrGMm3kdvTwKwBEC1QGCY2bZycaZbObbjiYBMQn9Q==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=RP+TP4aIw+RfsgA7is46xI8CyUhqZkbicGWA6SOUfk5mbVIuOaPdA3gb09FNUKqbmkwLGXHAsNjyo0dIi2FtsbmV8DFV8UQ7lE6ji+hHySHFQOYPq6YbNDMlVn8vdrpsVp6w1qxBh8/ow8ULcZKxLrvBPVTScSOWke+zYjq18wGwLqUHh5+1+sSXeMPY+Q1E6+1ClynYLR+GLdT4zM5GodFefYTQL0/WIw80FIZSqVTSnWUa4bbJYVO+tRlrk5gob3BBO/DLu/KhTd2n0mV+6xvPGI3rjlfEEaJMLFIVLch6emac9KeZzEeYxJriVToJcOcPfA1M2u12Q/8RrQJJEA==
  • Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 06 Feb 2023 12:22:57 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Original-authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Thread-index: AQHZN7IkxuQrtwTv+0WBDKkxqc3Ztq7BtLQAgAAnBYA=
  • Thread-topic: Proposal: List files in Xen originated from external sources

Hi Jan,

Thank you for leaving your feedbacks, really appreciated.

> On 6 Feb 2023, at 10:01, Jan Beulich <jbeulich@xxxxxxxx> wrote:
> 
> On 03.02.2023 10:30, Luca Fancellu wrote:
>> I’m starting a proposal for the external files that needs to be removed from 
>> the MISRA scan,
>> the work was originally started by Michal here:
>> https://patchwork.kernel.org/project/xen-devel/patch/20221116092032.4423-1-michal.orzel@xxxxxxx/
>> so I started by that thread, the aim of this work is to have an initial 
>> format to start as soon as possible to
>> exclude the external files from the MISRA scan (at least initially from 
>> cppcheck).
>> 
>> I think we can use the JSON format because it’s easy to integrate it with 
>> python and it’s easy to add/remove
>> fields from the structure without interfering with the other elements during 
>> time, so we can define a structure
>> now but if in the future we see the needs for additional field, we can just 
>> add them without changes to the
>> analysis script.
>> 
>> In my opinion many of these fields can be left empty in a first push, to let 
>> the script exclude the files and during
>> time with the contributions of the community we can add the missing 
>> informations.
> 
> I'm having trouble making a connection between a file having an entry here
> and the need/desire to include it in scanning. Can you clarify please why
> you see an implication from one to the other here?

My understanding is that an external file should not be included in the report, 
because we can’t do direct changes
so there might be situations where the fix can’t be included in the original 
project and back ported afterwards because
the original project doesn’t apply the same coding standard, or code that needs 
to be justified simply can’t be justified
because the tag we use doesn’t make sense in the original project.
So we decided (?) to leave this burden to the final user that might just 
consult the list of external files to fix/justify what
is not compliant with the coding standard that needs to be followed.

> 
>> I think it’s easier for the community to pick an entry, do some research to 
>> fill the gaps and push, instead to wait
>> until having all the informations before adding the entry.
> 
> Maybe it's easier, but it's then also less useful. For example I view
> it as quite relevant what the origin of a file is. That may, for
> example, have an implication on whether "backport" is sensible to set
> to "true" (origin projects could, after all, be largely unmaintained,
> and hence it may be difficult to get any changes into there).

It’s a first step to unblock the code scanner to do properly its job, that is 
produce a report of findings we need to fix/justify
without the problem of having external files in the report (that can’t be 
touched).

So I get your point that a list of files without other data is not really 
useful, but it enables the community member to share
the burden of finding the history of it, currently we know which files are 
external (most of the times by experience or just by looking
into its codestyle), we list them and from time to time community members can 
pick one entry and do some archeology on it
to reconstruct its origin.
We might end up eventually to recognise that for a particular file we could 
just make direct modifications
because it can’t be anymore subject to back ports (origin diverged too much 
maybe, but I’m open to suggestions about this case).

My personal opinion is that we can’t give the burden of doing that from A to Z 
for every entry to the contributors, it will results on
not having anything at all, the community should spread the burden in these 
situations, it’s not like a feature that introduces issues
in Xen and the person should make it complete from the beginning, it’s 
documentation that can be produced during time and contributions
are welcome.

> 
>> This is my first though for the fields, let me know yours:
>> 
>> docs/misra/external-files.json:
>> {
>> "version": "1.0”,
>> "content": [
>>   {
>>     "path": "relative/path/from/xen/“,
>>     "backport": True/False,
>>     "origin_project": "URL to origin project",
>>     "origin_path": "relative path in the original project",
>>     "origin_revision": "revision in original project”
>>   }
>> ]
>> }
>> 
>> 
>> Maybe, documentation for this file and the fields can reside in 
>> docs/misra/external-files.rst.
> 
> Couldn't documentation of the fields live at the top of the file itself?

Unfortunately not, JSON is a bit strict about that.

> 
>> Here follows the explanation for the fields:
>> 
>> path: is a relative path from the xen base folder, it can refer to a file or 
>> it can be a path to an entire folder
>>         (Taking as example libfdt)
>> 
>> backport: it’s a boolean flag that says if the file is subject to backport, 
>> so every file where this field is true
>>                won’t be included in the analysis. A file is subject to 
>> backport when it’s external and it doesn’t
>>                accept direct changes (changes needs to be made in the 
>> original project and then backported
>>                to Xen)
>> 
>> origin_project: url of the repository where this file was originated
> 
> Personally I'd drop "url of" - the specification of the origin needs to be
> precise, but I'm not convinced it absolutely needs to come in the form of
> a URL. What would imo be more important to mandate is that the reference
> be to the "canonical" copy of the project, not e.g. some github clone.

Sure, could you explain it a bit more, maybe with an example?

> 
>> origin_path: relative path in the original project, mostly linked to the 
>> original_revision field
>> 
>> origin_revision: revision of the changes in the repository when this file 
>> was taken.
> 
> I guess this needs clarifying: What revision is meant here? The one originally
> imported, the one last updated from, or yet something else? Keep in mind that
> we may also be selectively importing changes, in which case any particular
> "revision" can easily end up misleading. The format likely also wants
> specifying, e.g. (like we do for Fixes: tags) the first 12 digits of a commit
> hash. For said selective updating this might then allow for something like
> <base-hash>+<cherry-pick>+...

This is a good point, what would it be the best format in your opinion?
Maybe we should have the field as an array of revisions?

"origin_revision": [
   "Revision of latest backport”,
   [...]
   "originally imported revision”,
]

What should be the best to capture the history of the file?


> 
> Jan

> On 6 Feb 2023, at 09:49, Jan Beulich <jbeulich@xxxxxxxx> wrote:
> 
> On 03.02.2023 20:55, Stefano Stabellini wrote:
>> On Fri, 3 Feb 2023, Luca Fancellu wrote:
>>> And then, if the file is not subject to backport and now we “own” the file, 
>>> there is no more the need to list it
>>> in the external file, a commit search can reveal when it was converted to 
>>> Xen codestyle and removed from
>>> the external file list so we don’t lose the history.
>> 
>> Yes, I agree. If a file is coverted to Xen coding style and follows
>> MISRA rules, then there is no need to list the file in
>> docs/misra/external-files.json.
>> 
>> 
>>> So we would end up excluding just all the file listed in external-file.json 
>>> by the analysis.
>> 
>> Right, I think so too
> 
> Personally I think this is too focused on Misra. There are other uses of
> such data, like automatically identifying changes to the origin file(s)
> which we ought to be pulling in (recall that we've been doing a pretty
> bad job in this regard).

Yes, we could find a place outside the /misra/ folder

> 
> Jan





 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.