[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] ubsan: Introduce CONFIG_UBSAN_FATAL to panic on UBSAN failure



On Tue, 28 Nov 2023, Julien Grall wrote:
> On 28/11/2023 18:15, Michal Orzel wrote:
> > On 28/11/2023 18:09, Julien Grall wrote:
> > > On 28/11/2023 18:00, Michal Orzel wrote:
> > > > On 28/11/2023 17:14, Julien Grall wrote:
> > > > > On 27/11/2023 15:41, Michal Orzel wrote:
> > > > > > Introduce the CONFIG_UBSAN_FATAL option to cater to scenarios where
> > > > > > prompt
> > > > > > attention to undefined behavior issues, notably during CI test runs,
> > > > > > is
> > > > > > essential. When enabled, this option causes Xen to panic upon
> > > > > > detecting
> > > > > > UBSAN failure (as the last step in ubsan_epilogue()).
> > > > > 
> > > > > I have mixed opinions on this patch. This would be a good one if we
> > > > > had
> > > > > a Xen mostly free from UBSAN behavior. But this is not the case at
> > > > > least
> > > > > on arm32. So we would end up to stop at the first error. IOW, we would
> > > > > need to fix the first error before we can see the next one.
> > > > Well, this patch introduces a config option disabled by default.
> > > 
> > > I understood this is disabled by default... I am pointing out that I am
> > > not convinced about the usefulness until we are at the stage where Xen
> > > is normally not reporting any USBAN error.
> > > 
> > > > If we end up enabling it for CI for reasons* stated by Andrew, then the
> > > > natural way
> > > > of handling such issues is to do the investigation locally.
> > > 
> > > This will not always be possible. One example is when you are only able
> > > to reproduce some of the USBAN errors on a specific HW.
> > > 
> > > > Then, you are not forced
> > > > to select this option and you can see all the UBSAN issues if you want.
> > > 
> > > See above, I got that point. I am mostly concerned about the implication
> > > in the CI right now.
> > > 
> > > > 
> > > > > 
> > > > > So I feel it would be best if the gitlab CI jobs actually check for
> > > > > USBAN in the logs and fail if there are any. With that, we can get the
> > > > > full list of UBSAN issues on each job.
> > > > Well, I prefer Andrew suggestion (both [1] and on IRC), hence this
> > > > patch.
> > > > 
> > > > *my plan was to first fix the UBSAN issues and then enable this option
> > > > for CI.
> > > 
> > > That would have been useful to describe your goal after "---". With that
> > > in mind, then I suggest to revisit this patch once all the UBSAN issues
> > > in a normal build are fixed.
> > But this patch does not enable this option for CI automatically, right?
> 
> Correct.
> 
> > Why are you so keen to push it back?
> 
> I have been pushing back because your commit message refers to the CI
> specifically and today this would not really work (read as I would not be
> happy if this option is enabled in the CI right now at least on arm32).
> 
> If you want to fail a CI job for UBSAN today, then we need to find a different
> approach so we can get the full list of UBSAN error rather than fixing one,
> retry and then next one.
> 
> > Is it because you see no point in this option other than for the upstream CI
> > loop?
> 
> Even in the upstream CI loop, I am a little unsure about this approach. At
> least, I feel I would want to see all the reports at once in the CI.
> 
> But this is not really a strong feeling.
> 
> > I find it useful on a day-to-day
> > Xen runs, and I would for sure enable it by default in my config not to miss
> > UBSAN failures.
> 
> Fair enough. I view USBAN issues the same a WARN_ON. They all need to be
> investigated. So now you have an inconsistent policy.
> 
> Are you are also intending to switch WARN_ON() to fatal? If not, then why
> would UBSAN warnings more important that the other warnings?

I think it would be useful to be able to turn WARN_ONs into BUG_ONs
selectively by component. We could turn all WARN_ONs into BUG_ONs but
that's a bit extreme.

It is true that this patch is a bit... "ad-hoc". But given its
simplicity it doesn't hurt and I think it is nice-to-have for UBSAN. So
I would go with that.

But if we want something more sophisticated, here is an idea. The
concept is that one could specify warn=arch/arm/domain_build.c in the
Xen command line to change a WARN_ON in arch/arm/domain_build.c into a
BUG. I tested it with a WARN_ON I added on purpose and it works.


diff --git a/xen/common/symbols.c b/xen/common/symbols.c
index 133b580768..c78d2963c3 100644
--- a/xen/common/symbols.c
+++ b/xen/common/symbols.c
@@ -20,6 +20,10 @@
 #include <public/platform.h>
 #include <xen/guest_access.h>
 #include <xen/errno.h>
+#include <xen/param.h>
+
+char opt_warn[30] = "";
+string_param("warn", opt_warn);
 
 #ifdef SYMBOLS_ORIGIN
 extern const unsigned int symbols_offsets[];
diff --git a/xen/include/xen/lib.h b/xen/include/xen/lib.h
index 1793be5b6b..60e14cdb85 100644
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -12,12 +12,19 @@
 #include <xen/xmalloc.h>
 #include <xen/string.h>
 
+extern char opt_warn[30];
+
 #define BUG_ON(p)  do { if (unlikely(p)) BUG();  } while (0)
 #define WARN_ON(p)  ({                  \
     bool ret_warn_on_ = (p);            \
                                         \
     if ( unlikely(ret_warn_on_) )       \
-        WARN();                         \
+    {                                   \
+        if ( !strcmp(__FILE__, opt_warn) ) \
+            BUG();                      \
+        else                            \
+            WARN();                     \
+    }                                   \
     unlikely(ret_warn_on_);             \
 })
 



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.