
Re: [Xen-devel] VT-d flush timeout

Jan Beulich wrote on 2014-08-19:
>>>> "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> 08/19/14 3:34 AM >>>
>> Jan Beulich wrote on 2014-08-18:
>>>>>> "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> 08/18/14 4:01 AM >>>
>>>> The only place it doesn't use this timeout mechanism is queue-based
>>>> invalidation. I think the reason is that the maximum number of
>>>> queue entries is 2^15, and we don't know how much time is really
>>>> needed to flush 2^15 entries. So it is better not to use a timeout
>>>> here. Likewise, on the Xen side, we will only remove the timeout in
>>>> the QI flush function and spin instead.
>>> What would that buy us? You spin either way. According to what was
>>> discussed on our weekly calls so far, the intention for a first
>>> step was to reduce the current 1s timeout to a value large enough
>>> to cover what the IOMMU really requires (in the non-ATS cases only,
>>> of course).
>> I don't get your point. What's the difference between the 1s and the
>> large enough value? Since hardware completes the flush quickly, it
>> will never spin longer than the real flush time in the normal case.
>> The 1s spin only happens in the abnormal case.
> Right, but we can't ignore this abnormal case. In particular we can't
> exclude that a DomU with a device assigned may have ways to (perhaps
> indirectly) affect the completion time of the flushes.

I don't think the timeout value helps in this case.

>> My only concern is that, for QI flush, the spin time depends on the
>> length of the queue. I am not sure whether 1s is enough for the worst
>> case, and I think we should remove the 1s timeout in QI flush. I think
>> this is also the reason Linux doesn't use a timeout mechanism in QI flush.
> First of all I think both Linux and Xen in the majority of cases wait
> for completion of just individual queue entries. I.e. I'm not sure if
> the practical worst case really is equal to the theoretical one. And

This is my guess from Linux's implementation, but it may be wrong.

> then removing a timeout just to allow _longer_ spinning isn't really a
> step forward. If the timeout isn't big enough, the only solution is to
> immediately replace it with asynchronous handling.

Agree. I think it is better to leave it as it is until we have the asynchronous

> Jan

Best regards,
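
To make the two positions above concrete, here is a minimal sketch of a bounded spin-wait on a completion status word. It is not the Xen or Linux code: `struct fake_iommu`, `fake_tick()`, `wait_flush_timeout()`, `STATUS_DONE` and `POLL_STEP_NS` are all hypothetical names, and the "hardware" is simulated with a fake clock so the behaviour is deterministic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an IOMMU with a polled completion status.
 * Nothing here is taken from Xen or Linux; it only models the idea. */
struct fake_iommu {
    uint64_t now_ns;      /* simulated clock */
    uint64_t done_at_ns;  /* time at which the "hardware" finishes the flush */
    uint32_t status;      /* status word that software polls */
};

#define STATUS_DONE  1u
#define POLL_STEP_NS 1000u  /* assumed cost of one polling iteration */

/* Advance the simulated clock by one poll step and let the fake
 * hardware write the status word once its completion time is reached. */
static void fake_tick(struct fake_iommu *iommu)
{
    iommu->now_ns += POLL_STEP_NS;
    if (iommu->now_ns >= iommu->done_at_ns)
        iommu->status = STATUS_DONE;
}

/* Spin until the flush completes or timeout_ns has elapsed.
 * Returns true on completion, false on timeout. */
static bool wait_flush_timeout(struct fake_iommu *iommu, uint64_t timeout_ns)
{
    uint64_t start = iommu->now_ns;

    while (iommu->status != STATUS_DONE) {
        if (iommu->now_ns - start >= timeout_ns)
            return false;   /* abnormal case: give up after the bound */
        fake_tick(iommu);
    }
    return true;            /* normal case: spun only as long as the flush took */
}
```

In this model, a flush that completes after 5us spins for 5us whether the bound is 1s or 1ms, which is the point made about the normal case. The bound only matters when the status word is never written, and there it decides how long the CPU is held, which is the concern raised about the abnormal case.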
