Re: [Xen-devel] [PATCH 01/10] xen/arm: vpl011: Add pl011 uart emulation in Xen
Hi Bhupinder,

On 26/04/2017 08:49, Bhupinder Thakur wrote:
>>>> Regarding the optimization you introduced in this patch, delaying
>>>> write notifications until we receive a notification from
>>>> xenconsoled, how many notifications from xen to xenconsoled does it
>>>> actually save? xenconsoled is going to send a notification for
>>>> every read: we might end up sending the same number of
>>>> notifications, only delayed.
>>>
>>> In the PV console design, the events from the guest are sent to
>>> xenconsole for every chunk of data. Since in the pl011 emulation the
>>> data comes in bytes only, it would generate a lot of events to
>>> xenconsole. To reduce the flurry of events, this optimisation was
>>> added.
>>>
>>> Xenconsole sends an event in the following conditions:
>>>
>>> 1. There is data available for Xen to process
>>> 2. It has finished processing the data and Xen can send more data
>>>
>>> In the 2nd case, xenconsole will keep reading the data from the ring
>>> buffer until it goes empty. At that point, it would send an event to
>>> Xen. Between the sending of this event and the processing of this
>>> event by Xen, there could be more data added for xenconsole to
>>> process. While handling an event, Xen will check for that condition,
>>> and if there is data to be processed by xenconsole, it would send an
>>> event.
>>>
>>> Also, sending delayed events helps with the rate limit check in
>>> xenconsole. If there are too many events, they may be masked by
>>> xenconsole. I could test whether this rate limit check is really
>>> getting hit with and without this optimisation.
>>
>> I understand the idea behind it; my question is whether this approach
>> was actually verified by any scientific measurements. Did you run a
>> test to count how many notifications were skipped thanks to this
>> optimization? If so, what was the scenario? How many notifications
>> were saved? If you didn't run a test, I suggest you do :-)
>
> Today I did some instrumentation and counted the number of events sent
> by Xen to xenconsole and how many are really processed by xenconsole.
>
> I could not see any difference in the number of events processed by
> xenconsole with or without the optimization. The total number of
> events processed by xenconsole was about 500 for the complete guest
> boot up to the login prompt (for both the optimised and the
> non-optimised case).
>
> Although Xen calls notify_via_xen_event_channel() far more often
> without the optimisation (about 12000 times until the guest login
> prompt comes up), this does not translate into that many events being
> sent to xenconsole. With the optimization, the number of calls to
> notify_via_xen_event_channel() drops to about 500.
>
> I believe the reason could be that if an event is still pending with
> xenconsole when the next event comes via
> notify_via_xen_event_channel(), then all such events are coalesced
> and delivered to xenconsole as a single event.
>
> So the optimization does not save any processing in xenconsole,
> though it does save the overhead of calling
> notify_via_xen_event_channel() very frequently in Xen.

I don't see any issue with calling notify_via_xen_event_channel() many
times, because you should do it for every batch sent. Yes, the batch
consists of only one character, but this is how a UART is designed.

You rely on the behavior of notify_via_xen_event_channel(). What if it
changes in the future? Then you may miss events and data.

So I would rather avoid this premature optimization and see how it is
behaving. If it is too slow, then we can think about it.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
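
For readers following the thread, the coalescing effect described above
can be illustrated with a toy model. This is a sketch under stated
assumptions, not Xen or xenconsoled code: the EventChannel class, the
simulate() helper, and the parameters (1200 bytes written, consumer
waking once every 10 bytes) are all hypothetical and chosen only for
illustration. The one property modelled is a single pending flag per
channel, so that notifying an already-pending channel delivers nothing
extra.

```python
class EventChannel:
    """Toy model of one event channel with a single pending flag."""

    def __init__(self):
        self.pending = False
        self.notify_calls = 0  # how often the sender called notify()
        self.delivered = 0     # how often the receiver actually saw an event

    def notify(self):
        # Setting an already-set flag changes nothing: this is the coalescing.
        self.notify_calls += 1
        self.pending = True

    def poll(self):
        """Receiver side: consume the pending flag if it is set."""
        if not self.pending:
            return False
        self.pending = False
        self.delivered += 1
        return True


def simulate(total_bytes, bytes_per_wakeup, notify_on_empty_only):
    """Write total_bytes one at a time; the consumer wakes every
    bytes_per_wakeup bytes (an assumed cadence) and drains the ring.

    Returns (notify_calls, events_delivered, bytes_consumed).
    """
    chan = EventChannel()
    ring = 0       # unread bytes currently in the ring
    consumed = 0
    for i in range(1, total_bytes + 1):
        was_empty = ring == 0
        ring += 1
        # Either notify for every byte, or only on the empty -> non-empty edge.
        if not notify_on_empty_only or was_empty:
            chan.notify()
        if i % bytes_per_wakeup == 0 and chan.poll():
            consumed += ring
            ring = 0
    # Final wakeup so that trailing bytes are not left behind.
    if chan.poll():
        consumed += ring
        ring = 0
    return chan.notify_calls, chan.delivered, consumed


if __name__ == "__main__":
    print("notify per byte :", simulate(1200, 10, False))
    print("notify on empty :", simulate(1200, 10, True))
```

In this model both variants deliver the same number of events to the
consumer and only the number of notify() calls differs. It also shows
the dependency raised in the reply: the equivalence holds only as long
as the channel keeps coalescing pending notifications.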