
Re: [Xen-devel] xen/arm: Software Step ARMv8 - PC stuck on instruction



Hi Florian,

On 03/08/17 17:00, Florian Jakobsmeier wrote:
> regarding your previous mails. I was able to single step every instruction
> of my module. The problem (or rather the solution) was to _disable_ the IRQ
> interrupts from within my guest module. This solves the problem of
> singlestepping a module which previously ended in a spinlock. But it does
> not solve the problem with system that is singlestepped with enabled IRQ's,
> as it still will be locked within a spinlock.

If I understand correctly: if you try to single-step a spinlock then you get
stuck in a loop. The code you want to single-step doesn't take any spinlocks,
but if you take an IRQ, the IRQ handler does.

[...]

> This is my module code which is executed in the DomU:
> 
>> int init_module()
>> {
>>     printk(KERN_INFO "###     Init address 0x%lx\n", &init_module);
>>     printk(KERN_INFO "        Set function hook\n");
>>     patch_function_hook();
>>
>>     printk(KERN_INFO "        Starting singlestep\n");
>>
>>     local_irq_disable();
>>     __asm__ __volatile__ ("SMC 1");
>>     __asm__ __volatile__ ("SMC 1");

(Back-to-back SMCs: this explains why I thought you were missing the PC-advance
logic.)


>>     if(!already_trapped)
>>     {
>>         __asm__ __volatile__ ("NOP");
>>         __asm__ __volatile__ ("NOP");
>>         __asm__ __volatile__ ("NOP");
>>
>>         //Just for keeping module busy while singlestep
>>         for ( c = 0 ; c < n ; c++ )
>>
> When executing these exact steps, it is possible to singlestep the whole
> module. Without the local_irq_disable() the system will stop the module
> execution right after the first SMC.

What do you mean by 'stop'? The options are:
 * It fails to make progress because it keeps taking IRQs instead.
 * It fails to make progress because it takes a single-step exception on the
   same instruction forever.
 * Something else.

[...]

> But: It's not possible to singlestep a system as long as the VM IRQ's are
> enabled. If we would activate single stepping with enabled interrupts, we
> will be locked in the mentioned spinlock.
> Because of this it is not possible to singlestep other application.
> Additionally it is not possible to print anything while singlestepping
> because, as far as I understood, the system will wait within a spinlock
> until the terminal is free to print.
>
> Do you have any idea why it's not possible to escape the lock while
> singlestepping? Like I mentioned, my guess is on timer interrupts, which
> should unlock the spinlock but generate problems with singlestep enabled at
> the same time. This would also explain why i can observe the control flow
> of my guest module with IRQ's being disabled.

I suspect you are trying to single-step Linux's spinlocks which use
load-exclusive/store-exclusive. (There is an annotated example in the ARM-ARM
'K9.3.1 Acquiring a lock').
LDAXR sets the 'exclusive monitor' and STXR only succeeds if the exclusive
monitor is still set. If another CPU accesses the memory protected by the
exclusive monitor, the monitor is cleared. This is how the spinlock code knows
it has to re-read its value and try to take the lock again.
Changing exception level also clears the exclusive monitor, so taking a
single-step exception between an LDAXR/STXR pair means the store fails and the
loop has to be retried.

Ideally, you should avoid single-stepping atomic instruction sequences: the
single-step mechanism makes them non-atomic, the CPU detects this, and the code
then retries.
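
To make the retry loop concrete, here is a toy model in plain C. It is only a
sketch of the behaviour described above, not real hardware semantics and not Xen
or Linux code: every name in it (toy_cpu, toy_ldaxr, toy_stxr, toy_lock) is
made up for illustration, and it assumes that each single-step trap clears the
monitor.

```c
/* Toy model of the exclusive monitor. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_cpu {
    bool monitor_set;     /* stands in for the local exclusive monitor */
    bool single_step;     /* when true, trap after every instruction */
    unsigned long traps;  /* number of simulated single-step exceptions */
};

/* Assumption: the trap and return change exception level, which clears
 * the monitor. */
static void toy_retire_instruction(struct toy_cpu *cpu)
{
    if (cpu->single_step) {
        cpu->monitor_set = false;
        cpu->traps++;
    }
}

/* Models LDAXR: read the value and set the monitor. */
static uint32_t toy_ldaxr(struct toy_cpu *cpu, const uint32_t *addr)
{
    uint32_t val = *addr;

    cpu->monitor_set = true;
    toy_retire_instruction(cpu);
    return val;
}

/* Models STXR: store only if the monitor is still set.
 * Returns 0 on success and 1 on failure, like the status register. */
static int toy_stxr(struct toy_cpu *cpu, uint32_t *addr, uint32_t val)
{
    int status = 1;

    if (cpu->monitor_set) {
        *addr = val;
        status = 0;
    }
    cpu->monitor_set = false;
    toy_retire_instruction(cpu);
    return status;
}

/* Try to take a lock (0 = free, 1 = held), giving up after max_tries.
 * Returns the number of tries used, or -1 if the lock was never taken. */
static int toy_lock(struct toy_cpu *cpu, uint32_t *lock, int max_tries)
{
    assert(max_tries > 0);

    for (int tries = 1; tries <= max_tries; tries++) {
        if (toy_ldaxr(cpu, lock) != 0)
            continue;                 /* held by someone else: re-read */
        if (toy_stxr(cpu, lock, 1) == 0)
            return tries;             /* store succeeded: lock taken */
    }
    return -1;
}
```

With single_step clear, toy_lock() takes a free lock on the first try. With
single_step set, the monitor is always gone by the time toy_stxr() runs, so the
loop never succeeds however many tries it is given, which matches the 'stuck
in a spinlock' symptom.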

As a workaround (for just the case you are looking at), you could ask Xen's vgic
to empty the list registers (LRs) so that there are no pending guest interrupts,
but this doesn't guarantee that you won't single-step atomic code elsewhere.


The best way to fix this is for Xen to emulate the load/store exclusives. You
will need to ensure that no other vcpu for that domain is running between the
load and store. (This stops them from modifying the protected value, and means
they must retry their own load/store exclusives if they were in the same code).

You also have the problem that another vcpu may be holding the lock, so if the
single-stepped vcpu is not making progress with the emulated-and-single-stepped
sequence, you should run each other vcpu in the domain to see if they release
the lock.
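
A rough sketch of that idea, again as a toy model in plain C rather than
anything resembling Xen's real interfaces. All of the names (toy_domain,
toy_emul_ldxr, toy_emul_stxr, toy_run_others, toy_stepped_lock) are
hypothetical, and 'running the other vcpus' is reduced to vcpu 1 dropping the
lock:

```c
/* Toy model of emulated load/store exclusives. All names are hypothetical. */
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_VCPUS 2

struct toy_vcpu {
    bool paused;
    bool excl_valid;      /* emulated monitor, kept by the emulator */
    uint32_t *excl_addr;  /* address the emulated monitor covers */
};

struct toy_domain {
    struct toy_vcpu vcpu[TOY_NR_VCPUS];
};

static void toy_pause_others(struct toy_domain *d, int self, bool pause)
{
    for (int i = 0; i < TOY_NR_VCPUS; i++)
        if (i != self)
            d->vcpu[i].paused = pause;
}

/* Emulated load-exclusive: stop the other vcpus, then record the address. */
static uint32_t toy_emul_ldxr(struct toy_domain *d, int self, uint32_t *addr)
{
    toy_pause_others(d, self, true);
    d->vcpu[self].excl_valid = true;
    d->vcpu[self].excl_addr = addr;
    return *addr;
}

/* Emulated store-exclusive: returns 0 on success, 1 on failure.
 * The other vcpus are resumed either way. */
static int toy_emul_stxr(struct toy_domain *d, int self, uint32_t *addr,
                         uint32_t val)
{
    struct toy_vcpu *v = &d->vcpu[self];
    int status = 1;

    if (v->excl_valid && v->excl_addr == addr) {
        *addr = val;
        status = 0;
    }
    v->excl_valid = false;
    toy_pause_others(d, self, false);
    return status;
}

/* Stand-in for letting the other vcpus run: vcpu 1 releases the lock. */
static void toy_run_others(struct toy_domain *d, uint32_t *lock)
{
    if (!d->vcpu[1].paused)
        *lock = 0;
}

/* vcpu 0 is being single-stepped and wants the lock (0 = free, 1 = held).
 * Returns the number of tries used, or -1 if it gave up. */
static int toy_stepped_lock(struct toy_domain *d, uint32_t *lock,
                            int max_tries)
{
    for (int tries = 1; tries <= max_tries; tries++) {
        if (toy_emul_ldxr(d, 0, lock) != 0) {
            /* Held elsewhere: drop our claim and let the holder run. */
            d->vcpu[0].excl_valid = false;
            toy_pause_others(d, 0, false);
            toy_run_others(d, lock);
            continue;
        }
        if (toy_emul_stxr(d, 0, lock, 1) == 0)
            return tries;
    }
    return -1;
}
```

Because the emulated monitor lives in the emulator's own state, a trap between
the load and the store does not clear it. The cost is that every other vcpu in
the domain is stalled for the duration of each emulated pair.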

In Linux, the waiting CPU enters a low-power state by issuing a 'wfe', and the
releasing CPU wakes it with a 'sev'. You may end up trapping these as you try to
step them.

This is going to be tricky to get right. I'm not aware of anything that allows
single-stepping of atomic regions like this today.


Thanks,

James
