Re: [RFC PATCH V1 04/12] xen/arm: Introduce arch specific bits for IOREQ/DM features
Hi Julien,

I added logic to properly handle the return value of handle_hvm_io_completion(), as you had suggested. For test purposes I simulated handle_hvm_io_completion() returning false sometimes (I couldn't trigger a real "pending I/O" failure during testing) to see how the new logic behaved. I assume I can take this solution for the non-RFC series(?)

@@ -2275,6 +2282,16 @@ static void check_for_vcpu_work(void)
  */
 void leave_hypervisor_to_guest(void)
 {
+#ifdef CONFIG_IOREQ_SERVER
+    /*
+     * XXX: Check the return. Shall we call that in
+     * continue_running and context_switch instead?
+     * The benefits would be to avoid calling
+     * handle_hvm_io_completion on every return.
+     */

Yeah, that could be a simple and good optimization.

Well, it is not as simple as it sounds :). handle_hvm_io_completion() is the function in charge of marking the vCPU as waiting for I/O. So we would at least need to split the function. I wrote this TODO because I wasn't sure about the complexity of handle_hvm_io_completion(current). Looking at it again, the main complexity is the looping over the IOREQ servers. I think it would be better to optimize handle_hvm_io_completion() rather than trying to hack context_switch() or continue_running().

Well, is the idea in the proposed dirty test patch below close to what you expect? The patch optimizes handle_hvm_io_completion() to avoid extra actions if the vCPU's domain doesn't have an ioreq_server; alternatively, the check could be moved out of handle_hvm_io_completion() to avoid calling that function at all.

This looks ok to me.

BTW, the TODO also suggests checking the return value of handle_hvm_io_completion(), but I am not completely sure we can simply just return from leave_hypervisor_to_guest() at this point. Could you please share your opinion?

From my understanding, handle_hvm_io_completion() may return false if there is pending I/O or a failure.

It seems so, yes.

In the former case, I think we want to call handle_hvm_io_completion() later on, possibly after we call do_softirq().
I am wondering whether check_for_vcpu_work() could return whether there is more work to do on behalf of the vCPU. So we could have:

    do
    {
        check_for_pcpu_work();
    } while ( check_for_vcpu_work() );

The implementation of check_for_vcpu_work() would be:

    if ( !handle_hvm_io_completion() )
        return true;

    /* Rest of the existing code */

    return false;

Thank you, will give it a try. Can we behave the same way for both "pending I/O" and "failure", or do we need to distinguish them?

We don't need to distinguish them. In both cases, we will want to process softirqs. In all the failure cases, the domain will have crashed. Therefore the vCPU will be unscheduled.

Got it.

Probably we need some sort of safe timeout/number of attempts in order to not spin forever?

Well, anything based on a timeout/number of attempts is flaky. How do you know whether the I/O is just taking a "long time" to complete? A vCPU shouldn't continue until an I/O has completed. This is not very different from what a processor would do.

In Xen's case, if an I/O never completes then it most likely means that something went horribly wrong with the Device Emulator, so it is most likely not safe to continue. In HW, when there is a device failure, the OS may receive an SError (this is implementation defined) and could act accordingly if it is able to recognize the issue.

It *might* be possible to send a virtual SError, but there are a couple of issues with it:
    * How do you detect a failure?
    * SErrors are implementation defined. You would need to teach your OS (or the firmware) how to deal with them.

I would expect quite a bit of effort in order to design and implement it. For now, it is probably best to just let the vCPU spin forever. This wouldn't be an issue for Xen, as do_softirq() would be called at every loop.

Thank you for the clarification. Fair enough and sounds reasonable.
---
 xen/arch/arm/traps.c         | 36 ++++++++++++++++++++++--------------
 xen/common/hvm/ioreq.c       |  9 ++++++++-
 xen/include/asm-arm/domain.h |  1 +
 xen/include/xen/hvm/ioreq.h  |  5 +++++
 4 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 974c744..f74b514 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2264,12 +2264,26 @@ static void check_for_pcpu_work(void)
  * Process pending work for the vCPU. Any call should be fast or
  * implement preemption.
  */
-static void check_for_vcpu_work(void)
+static bool check_for_vcpu_work(void)
 {
     struct vcpu *v = current;

+#ifdef CONFIG_IOREQ_SERVER
+    if ( hvm_domain_has_ioreq_server(v->domain) )
+    {
+        bool handled;
+
+        local_irq_enable();
+        handled = handle_hvm_io_completion(v);
+        local_irq_disable();
+
+        if ( !handled )
+            return true;
+    }
+#endif
+
     if ( likely(!v->arch.need_flush_to_ram) )
-        return;
+        return false;

     /*
      * Give a chance for the pCPU to process work before handling the vCPU
@@ -2280,6 +2294,8 @@ static void check_for_vcpu_work(void)
     local_irq_enable();
     p2m_flush_vm(v);
     local_irq_disable();
+
+    return false;
 }

 /*
@@ -2290,20 +2306,12 @@ static void check_for_vcpu_work(void)
  */
 void leave_hypervisor_to_guest(void)
 {
-#ifdef CONFIG_IOREQ_SERVER
-    /*
-     * XXX: Check the return. Shall we call that in
-     * continue_running and context_switch instead?
-     * The benefits would be to avoid calling
-     * handle_hvm_io_completion on every return.
-     */
-    local_irq_enable();
-    handle_hvm_io_completion(current);
-#endif
     local_irq_disable();

-    check_for_vcpu_work();
-    check_for_pcpu_work();
+    do
+    {
+        check_for_pcpu_work();
+    } while ( check_for_vcpu_work() );

     vgic_sync_to_lrs();

diff --git a/xen/common/hvm/ioreq.c b/xen/common/hvm/ioreq.c
index 7e1fa23..81b41ab 100644
--- a/xen/common/hvm/ioreq.c
+++ b/xen/common/hvm/ioreq.c
@@ -38,9 +38,15 @@ static void set_ioreq_server(struct domain *d, unsigned int id,
                              struct hvm_ioreq_server *s)
 {
     ASSERT(id < MAX_NR_IOREQ_SERVERS);
-    ASSERT(!s || !d->arch.hvm.ioreq_server.server[id]);
+    ASSERT((!s && d->arch.hvm.ioreq_server.server[id]) ||
+           (s && !d->arch.hvm.ioreq_server.server[id]));

     d->arch.hvm.ioreq_server.server[id] = s;
+
+    if ( s )
+        d->arch.hvm.ioreq_server.nr_servers++;
+    else
+        d->arch.hvm.ioreq_server.nr_servers--;
 }

 /*
@@ -1415,6 +1421,7 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
 void hvm_ioreq_init(struct domain *d)
 {
     spin_lock_init(&d->arch.hvm.ioreq_server.lock);
+    d->arch.hvm.ioreq_server.nr_servers = 0;

     arch_hvm_ioreq_init(d);
 }

diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 6a01d69..484bd1a 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -68,6 +68,7 @@ struct hvm_domain
     struct {
         spinlock_t lock;
         struct hvm_ioreq_server *server[MAX_NR_IOREQ_SERVERS];
+        unsigned int nr_servers;
     } ioreq_server;

     bool_t qemu_mapcache_invalidate;

diff --git a/xen/include/xen/hvm/ioreq.h b/xen/include/xen/hvm/ioreq.h
index 40b7b5e..8f78852 100644
--- a/xen/include/xen/hvm/ioreq.h
+++ b/xen/include/xen/hvm/ioreq.h
@@ -23,6 +23,11 @@

 #include <asm/hvm/ioreq.h>

+static inline bool hvm_domain_has_ioreq_server(const struct domain *d)
+{
+    return (d->arch.hvm.ioreq_server.nr_servers > 0);
+}
+
 #define GET_IOREQ_SERVER(d, id) \
     (d)->arch.hvm.ioreq_server.server[id]
--
2.7.4

--
Regards,

Oleksandr Tyshchenko