Hi Keir:
 I spent more time on how the event channel works. I now understand that an event is bound to
 an irq with a call to request_irq. When an event is sent, the other side of the channel runs into
 asm_do_IRQ->generic_handle_irq->generic_handle_irq_desc->handle_level_irq
 (that is, it invokes desc->handle_irq, which for an evtchn is handle_level_irq).
 I noticed that handle_level_irq clears the event mask and pending bits.
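
 For context, here is roughly how a pvops driver ends up on that path; a minimal sketch assuming
 the usual drivers/xen/events.c interface (the handler name, device name and port variable below
 are made up for illustration):

 ------------------------------------------------------sketch: binding an evtchn---
 /* bind_evtchn_to_irqhandler() allocates a dynirq, wires it to
  * handle_level_irq via the evtchn irq_chip, and calls request_irq()
  * internally, so the action runs at the end of the chain above. */
 #include <linux/interrupt.h>
 #include <xen/events.h>

 static irqreturn_t my_evtchn_handler(int irq, void *dev_id)
 {
         /* Reached via asm_do_IRQ -> generic_handle_irq -> handle_level_irq. */
         return IRQ_HANDLED;
 }

 static int bind_example(unsigned int evtchn_port)
 {
         int irq = bind_evtchn_to_irqhandler(evtchn_port, my_evtchn_handler,
                                             0, "my-evtchn", NULL);
         return irq < 0 ? irq : 0;    /* negative errno on failure */
 }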
 
 I have one more piece of analysis to discuss.
 
 Attached is the evtchn dump from the physical server while a VM is hung. Domain 10 is the hung domain.
 We can see domain 10's VCPU info at the bottom of the log; it has flags = 4, which means
 _VPF_blocked_in_xen.
 
 (XEN) VCPU information and callbacks for domain 10:
 (XEN)     VCPU0: CPU11 [has=F] flags=4 poll=0 upcall_pend = 00, upcall_mask = 00 dirty_cpus={} cpu_affinity={4-15}
 (XEN)     paging assistance: shadowed 2-on-3
 (XEN)     No periodic timer
 (XEN)     Notifying guest (virq 1, port 0, stat 0/-1/0)
 (XEN)     VCPU1: CPU9 [has=T] flags=0 poll=0 upcall_pend = 00, upcall_mask = 00 dirty_cpus={9} cpu_affinity={4-15}
 (XEN)     paging assistance: shadowed 2-on-3
 (XEN)     No periodic timer
 (XEN)     Notifying guest (virq 1, port 0, stat 0/-1/0)
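
 As a sanity check on that flags value: 4 == 1<<2, and bit 2 is _VPF_blocked_in_xen as I read the
 vcpu pause-flag bits in xen/include/xen/sched.h (bit positions can differ between trees, so treat
 this as illustrative rather than verbatim):

 ----------------------------------------------------sketch: decoding flags=4---
 /* Pause-flag bits as I read them in xen/include/xen/sched.h. */
 #define _VPF_blocked         0   /* blocked waiting for an event          */
 #define _VPF_down            1   /* vcpu is offline                       */
 #define _VPF_blocked_in_xen  2   /* blocked on an event consumed by Xen   */
 #define VPF_blocked_in_xen   (1UL << _VPF_blocked_in_xen)   /* == 4       */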
 
 And its domain event channel info is:
 (XEN) Domain 10 polling vCPUs: {No periodic timer}
 (XEN) Event channel information for domain 10:
 (XEN)     port [p/m]
 (XEN)        1 [0/1]: s=3 n=0 d=0 p=105 x=1
 (XEN)        2 [0/1]: s=3 n=1 d=0 p=106 x=1
 (XEN)        3 [0/0]: s=3 n=0 d=0 p=104 x=0
 (XEN)        4 [0/1]: s=2 n=0 d=0 x=0
 (XEN)        5 [0/0]: s=6 n=0 x=0
 (XEN)        6 [0/0]: s=2 n=0 d=0 x=0
 (XEN)        7 [0/0]: s=3 n=0 d=0 p=107 x=0
 (XEN)        8 [0/0]: s=3 n=0 d=0 p=108 x=0
 (XEN)        9 [0/0]: s=3 n=0 d=0 p=109 x=0
 (XEN)       10 [0/0]: s=3 n=0 d=0 p=110 x=0
 
 Based on our situation, we are only interested in the event channels whose consumer_is_xen is 1,
 shown as "x=1" here, that is ports 1 and 2. According to the log, the other side of these channels
 is domain 0, ports 105 and 106.
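
 For reference, this is how I read the dump columns against Xen's struct evtchn; a simplified
 sketch (names abbreviated from my reading of xen/include/xen/sched.h, not the verbatim definition):

 -------------------------------------------------sketch: struct evtchn vs. the dump---
 #define ECS_FREE         0   /* channel available for use                   */
 #define ECS_RESERVED     1   /* channel reserved                            */
 #define ECS_UNBOUND      2   /* waiting to bind to a remote domain          */
 #define ECS_INTERDOMAIN  3   /* bound to another domain (ports 1-3, 7-10)   */
 #define ECS_PIRQ         4   /* bound to a physical IRQ                     */
 #define ECS_VIRQ         5   /* bound to a virtual IRQ                      */
 #define ECS_IPI          6   /* bound to a virtual IPI (port 5 above)       */

 struct evtchn_sketch {
     unsigned char  state;            /* printed as "s="                      */
     unsigned char  consumer_is_xen;  /* printed as "x=" (1: Xen consumes it) */
     unsigned short notify_vcpu_id;   /* printed as "n="                      */
     struct {
         unsigned short remote_port;  /* printed as "p="                      */
         unsigned short remote_domid; /* printed as "d="                      */
     } interdomain;                   /* valid when state == ECS_INTERDOMAIN  */
 };
 /* The [p/m] columns are not in struct evtchn: they are the per-port pending
  * and mask bits read from the domain's shared_info page. */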
 
 Taking a look at domain 0's event channels on ports 105 and 106, I find that on port 105 the pending
 bit is 1 (in [1/0], the first bit refers to pending and is 1, the second refers to mask and is 0).
 
 (XEN)      105 [1/0]: s=3 n=2 d=10 p=1 x=0
 (XEN)      106 [0/0]: s=3 n=2 d=10 p=2 x=0
 
 In all, we have a domain U vcpu blocked on _VPF_blocked_in_xen, so it must have set the pending bit.
 Given that pending is still 1, it looks like the irq was never triggered, am I right?
 If it had been triggered, the pending bit should have been cleared (line 361).
 
 ------------------------------/linux-2.6-pvops.git/kernel/irq/chip.c---
 354 void
 355 handle_level_irq(unsigned int irq, struct irq_desc *desc)
 356 {
 357         struct irqaction *action;
 358         irqreturn_t action_ret;
 359
 360         spin_lock(&desc->lock);
 361         mask_ack_irq(desc, irq);
 362
 363         if (unlikely(desc->status & IRQ_INPROGRESS))
 364                 goto out_unlock;
 365         desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
 366         kstat_incr_irqs_this_cpu(irq, desc);
 367
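
 For the evtchn irq_chip, mask_ack_irq() at line 361 boils down to masking the port and clearing
 its pending bit in the shared_info page. A minimal standalone sketch of that effect (not the
 pvops source; the real code lives in drivers/xen/events.c and uses sync bitops on
 HYPERVISOR_shared_info):

 ------------------------------------------------sketch: evtchn mask_ack effect---
 #define BITS_PER_LONG_SK (8 * sizeof(unsigned long))

 struct shared_info_sketch {
     unsigned long evtchn_pending[4096 / BITS_PER_LONG_SK];   /* size illustrative */
     unsigned long evtchn_mask[4096 / BITS_PER_LONG_SK];
 };

 static void set_port_bit(unsigned long *bits, unsigned int port)
 {
     bits[port / BITS_PER_LONG_SK] |= 1UL << (port % BITS_PER_LONG_SK);
 }

 static void clear_port_bit(unsigned long *bits, unsigned int port)
 {
     bits[port / BITS_PER_LONG_SK] &= ~(1UL << (port % BITS_PER_LONG_SK));
 }

 /* Net effect of mask_ack_irq() for an event channel port. */
 static void evtchn_mask_ack_sketch(struct shared_info_sketch *s, unsigned int port)
 {
     set_port_bit(s->evtchn_mask, port);       /* mask the port      */
     clear_port_bit(s->evtchn_pending, port);  /* ack: clear pending */
 }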
 
 BTW, qemu still works fine while the VM is hung. Below is its strace output.
 There is not much difference from other healthy qemu instances, other than that every select() times out.
 -------------------
 select(14, [3 7 11 12 13], [], [], {0, 10000}) = 0 (Timeout)
 clock_gettime(CLOCK_MONOTONIC, {673470, 59535265}) = 0
 clock_gettime(CLOCK_MONOTONIC, {673470, 59629728}) = 0
 clock_gettime(CLOCK_MONOTONIC, {673470, 59717700}) = 0
 clock_gettime(CLOCK_MONOTONIC, {673470, 59806552}) = 0
 select(14, [3 7 11 12 13], [], [], {0, 10000}) = 0 (Timeout)
 clock_gettime(CLOCK_MONOTONIC, {673470, 70234406}) = 0
 clock_gettime(CLOCK_MONOTONIC, {673470, 70332116}) = 0
 clock_gettime(CLOCK_MONOTONIC, {673470, 70419835}) = 0
 
 
 
 
 > Date: Mon, 20 Sep 2010 10:35:46 +0100
 > Subject: Re: VM hung after running sometime
 > From: keir.fraser@xxxxxxxxxxxxx
 > To: tinnycloud@xxxxxxxxxxx
 > CC: xen-devel@xxxxxxxxxxxxxxxxxxx; jbeulich@xxxxxxxxxx
 >
 > On 20/09/2010 10:15, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:
 >
 > > Thanks Keir.
 > >
 > > You're right, after I deeply looked into the wait_on_xen_event_channel, it is
 > > smart enough
 > > to avoid the race I assumed.
 > >
 > > How about prepare_wait_on_xen_event_channel ?
 > > Currently Istill don't know when it will be invoked.
 > > Could enlighten me?
 >
 > As you can see it is called from hvm_send_assist_req(), hence it is called
 > whenever an ioreq is sent to qemu-dm. Note that it is called *before*
 > qemu-dm is notified -- hence it cannot race the wakeup from qemu, as we will
 > not get woken until qemu-dm has done the work, and it cannot start the work
 > until it is notified, and it is not notified until after
 > prepare_wait_on_xen_event_channel has been executed.
 >
 > -- Keir
 >
 > >
 > >> Date: Mon, 20 Sep 2010 08:45:21 +0100
 > >> Subject: Re: VM hung after running sometime
 > >> From: keir.fraser@xxxxxxxxxxxxx
 > >> To: tinnycloud@xxxxxxxxxxx
 > >> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; jbeulich@xxxxxxxxxx
 > >>
 > >> On 20/09/2010 07:00, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:
 > >>
 > >>> When IO is not ready, domain U in VMEXIT->hvm_do_resume might invoke
 > >>> wait_on_xen_event_channel
 > >>> (where it is blocked in _VPF_blocked_in_xen).
 > >>>
 > >>> Here is my assumption of event missed.
 > >>>
 > >>> step 1: hvm_do_resume execute 260, and suppose p->state is STATE_IOREQ_READY
 > >>> or STATE_IOREQ_INPROCESS
 > >>> step 2: then in cpu_handle_ioreq is in line 547, it execute line 548 so
 > >>> quickly before hvm_do_resume execute line 270.
 > >>> Well, the event is missed.
 > >>> In other words, the _VPF_blocked_in_xen is cleared before it is actually
 > >>> setted, and Domian U who is blocked
 > >>> might never get unblocked, it this possible?
 > >>
 > >> Firstly, that code is very paranoid and it should never actually be the case
 > >> that we see STATE_IOREQ_READY or STATE_IOREQ_INPROCESS in hvm_do_resume().
 > >> Secondly, even if you do, take a look at the implementation of
 > >> wait_on_xen_event_channel() -- it is smart enough to avoid the race you
 > >> mention.
 > >>
 > >> -- Keir
 > >&
 gt;
 > >>
 > >
 >
 >
 