
Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device



Jan Beulich wrote on 2015-10-13:
>>>> On 13.10.15 at 07:27, <yang.z.zhang@xxxxxxxxx> wrote:
>> Jan Beulich wrote on 2015-10-12:
>>>>>> On 12.10.15 at 03:42, <yang.z.zhang@xxxxxxxxx> wrote:
>>>> So, my suggestion is that we can rely on user to not assign the
>>>> ATS device if hypervisor says it cannot support such device. For
>>>> example, if hypervisor find the invalidation isn't completed in 1
>>>> second, then hypervisor can crash itself and tell the user this
>>>> ATS device needs more than 1 second invalidation time which is not
>>>> supported by Xen.
>>> 
>>> Crashing the hypervisor in such a case is a security issue, i.e. is
>>> not
>> 
>> Indeed. Crashing the guest is more reasonable.
>> 
>>> an acceptable thing (and the fact that we panic() on timeout expiry
>>> right now isn't really acceptable either). If crashing the
>>> offending guest was sufficient to contain the issue, that might be an
>>> option. Else
>> 
>> I think it should be sufficient (any concern from you?).
> 
> Having looked at the code, it wasn't immediately clear whether that
> would work. After all, one would think there would be a reason
> for the code panic()ing right now instead.

What does the panic()ing refer to here?

> 
>> Hypervisor can
>> crash the guest with hint that the device may need long time to
>> complete the invalidation or that the device may be bad. And the user should add
>> the device to a blacklist to disallow assignment again.
>> 
>>> ripping out ATS support (and limiting spin time below what there is
>>> currently) may be the only alternative to fixing it.
>> 
>> Yes, it is another solution, considering ATS devices are rare currently.
>> For spin time, 10ms should be enough in both solutions.
> 
> But 10ms is awfully much already. Which is why I've been advocating
> async flushing independent of ATS.

Agreed. Technically speaking, async flush is the best solution. But considering
its complexity and the benefit it brings, a compromise solution may be better.
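
To make the compromise concrete, here is a minimal sketch of a bounded wait
that contains a timeout to the owning guest instead of calling panic(). This
is not Xen code: every name in it (wait_for_flush, sim_device, sim_guest,
flagged_slow) is hypothetical, the clock is simulated, and the 10ms budget is
just the figure discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; this is a simulation, not Xen code. */

enum flush_result {
    FLUSH_DONE,          /* invalidation completed within the budget */
    FLUSH_GUEST_CRASHED  /* budget exceeded: only the owning guest is killed */
};

struct sim_device {
    uint64_t completes_at_us; /* simulated time at which the flush completes */
    bool flagged_slow;        /* hint for the user: slow or bad device */
};

struct sim_guest {
    bool crashed;
};

/* Simulated clock, advanced by one microsecond per poll. */
static uint64_t now_us;

static bool flush_complete(const struct sim_device *dev)
{
    return now_us >= dev->completes_at_us;
}

/*
 * Spin until the device reports completion or the budget runs out.
 * On expiry, crash the owning guest and leave a hint on the device,
 * rather than bringing down the whole hypervisor.
 */
static enum flush_result
wait_for_flush(struct sim_device *dev, struct sim_guest *owner,
               uint64_t budget_us)
{
    assert(budget_us > 0);
    uint64_t deadline = now_us + budget_us;

    while (!flush_complete(dev)) {
        if (now_us >= deadline) {
            owner->crashed = true;
            dev->flagged_slow = true;
            return FLUSH_GUEST_CRASHED;
        }
        now_us++; /* stands in for one polling iteration */
    }
    return FLUSH_DONE;
}
```

With a budget of 10000 (10ms), a device that completes after 5ms returns
FLUSH_DONE, while one that needs 2 seconds gets its guest crashed and is
flagged so the user can blacklist it. The spin itself is still synchronous,
which is exactly the cost an async flush would avoid.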

> 
>> But if solution 1 is acceptable, I prefer it since most of ATS
>> devices are still able to play with Xen.
> 
> With a multi-millisecond spin, solution 1 would imo be acceptable only
> as a transitional measure.

What do you mean by a transitional measure? Do you mean we still need the async
flush for the ATS issue, or that we can adopt solution 1 until the ATS spec changes?

> 
> Jan


Best regards,
Yang


