[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device



>>> On 14.10.15 at 07:12, <yang.z.zhang@xxxxxxxxx> wrote:
> Jan Beulich wrote on 2015-10-13:
>>>>> On 13.10.15 at 07:27, <yang.z.zhang@xxxxxxxxx> wrote:
>>> Jan Beulich wrote on 2015-10-12:
>>>>>>> On 12.10.15 at 03:42, <yang.z.zhang@xxxxxxxxx> wrote:
>>>>> So, my suggestion is that we can rely on user to not assign the
>>>>> ATS device if hypervisor says it cannot support such device. For
>>>>> example, if hypervisor find the invalidation isn't completed in 1
>>>>> second, then hypervisor can crash itself and tell the user this
>>>>> ATS device needs more than 1 second invalidation time which is not
>>>>> support by
>>> Xen.
>>>> 
>>>> Crashing the hypervisor in such a case is a security issue, i.e. is
>>>> not
>>> 
>>> Indeed. Crashing the guest is more reasonable.
>>> 
>>>> an acceptable thing (and the fact that we panic() on timeout expiry
>>>> right now isn't really acceptable either). If crashing the
>>>> offending guest was sufficient to contain the issue, that might be an 
>>>> option.
>>>> Else
>>> 
>>> I think it should be sufficient (any concern from you?).
>> 
>> Having looked at the code, it wasn't immediately clear whether that
>> would work. After all there one would think there would be a reason
>> for the code panic()ing right now instead.
> 
> What the panic()ing refer to here?

E.g. what we have in queue_invalidate_wait():

        while ( poll_slot != QINVAL_STAT_DONE )
        {
            if ( NOW() > (start_time + DMAR_OPERATION_TIMEOUT) )
            {
                print_qi_regs(iommu);
                panic("queue invalidate wait descriptor was not executed");
            }
            cpu_relax();
        }

>>> But if solution 1 is acceptable, I prefer it since most of ATS
>>> devices are still able to play with Xen.
>> 
>> With a multi-millisecond spin, solution 1 would imo be acceptable only
>> as a transitional measure.
> 
> What does the transitional measure mean? Do you mean we still need the async 
> flush for ATS issue or we can adapt solution 1 before ATS spec changed?

As long as the multi-millisecond spins aren't going to go away by
other means, I think conversion to async mode is ultimately
unavoidable.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.