[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 2/4] Map: backport find_opt/update from 4.06
Thanks for this interesting experiment. My recollection was correct that exceptions are very fast but only if you don't capture stack traces (which we we do capture). With backtraces the difference is not massive. It still does pay off to use exceptions when they are raised rarely as the it saves the lookup for mem(). And moving to OCaml 4.06 improves performance. -- C ________________________________________ From: Edwin Torok Sent: 28 August 2020 18:43 To: Christian Lindig; xen-devel@xxxxxxxxxxxxxxxxxxxx Cc: Ian Jackson; dave@xxxxxxxxxx; wl@xxxxxxx Subject: Re: [PATCH v4 2/4] Map: backport find_opt/update from 4.06 On Fri, 2020-08-28 at 10:30 +0200, Christian Lindig wrote: > +let find_opt k t = > > + (* avoid raising exceptions, they can be expensive *) > > + if mem k t then Some (find k t) else None > > > > I disagree with this argument. Exceptions in OCaml are cheap because > they don't walk the stack and cut to the exception handler directly. > Is there a reason why they could be expensive here? In any case, the > code is correct. > Interesting question, it is best to measure. The answer depends on whether the key is found or not in the map. I'll change the impl to the one with find+catch, I think we might be looking up keys that are present more often than those that are missing (although the 4.06 series will make this moot, 4.06 version is faster than both approaches, regardless whether key is present or not). To sum up the measurements below: If the key is not found then my approach is faster (takes only 80% of time), if the key is found then find+catch is faster (again an approx 80% of the time taken). I based my comment on the documentation of raise_notrace, which says: "A faster version raise which does not record the backtrace.", which implies that recording the backtrace has a measurable perf impact. One could argue that if performance matters backtraces should be turned off in production, but I think the value of having backtraces when some hard-to-reprodue bug occurs outweights any perf penalty. We should try to use exceptions only for unexpected situations though, not finding a value in a map doesn't qualify. See the attachment for a small micro-benchmark: $ dune exec --profile=release -- ./updatet.exe raise Estimated testing time 20s (2 benchmarks x 10s). Change using '-quota'. ┌───────────────┬──────────┬────────────┐ │ Name │ Time/Run │ Percentage │ ├───────────────┼──────────┼────────────┤ │ raise │ 33.52ns │ 100.00% │ │ raise_notrace │ 19.16ns │ 57.16% │ └───────────────┴──────────┴────────────┘ So raising with a backtrace is measurably slower, taking the backtrace spends some CPU cycles. $ dune exec --profile=release -- ./updatet.exe find-opt Estimated testing time 1m30s (9 benchmarks x 10s). Change using '- quota'. ┌──────────────────────────┬──────────┬────────────┐ │ Name │ Time/Run │ Percentage │ ├──────────────────────────┼──────────┼────────────┤ │ find_opt 4.06:10 │ 49.10ns │ 24.06% │ │ find_opt 4.06:100 │ 115.38ns │ 56.55% │ │ find_opt 4.06:1000 │ 161.27ns │ 79.03% │ │ find_opt=mem+find:10 │ 50.48ns │ 24.74% │ │ find_opt=mem+find:100 │ 110.39ns │ 54.10% │ │ find_opt=mem+find:1000 │ 162.48ns │ 79.63% │ │ find_opt=find+catch:10 │ 89.10ns │ 43.67% │ │ find_opt=find+catch:100 │ 160.80ns │ 78.80% │ │ find_opt=find+catch:1000 │ 204.04ns │ 100.00% │ └──────────────────────────┴──────────┴────────────┘ 4.06 and mem+find take 80% of the time of find+catch. But of course if the key is actually found in the map then we have this: edwin@edvin-tower:~/uex % dune exec --profile=release -- ./updatet.exe find-opt-found Estimated testing time 1m30s (9 benchmarks x 10s). Change using '- quota'. ┌──────────────────────────┬──────────┬─────────┬────────────┐ │ Name │ Time/Run │ mWd/Run │ Percentage │ ├──────────────────────────┼──────────┼─────────┼────────────┤ │ find_opt 4.06:10 │ 38.38ns │ 2.00w │ 52.65% │ │ find_opt 4.06:100 │ 20.66ns │ 2.00w │ 28.35% │ │ find_opt 4.06:1000 │ 20.63ns │ 2.00w │ 28.30% │ │ find_opt=mem+find:10 │ 72.89ns │ 2.00w │ 100.00% │ │ find_opt=mem+find:100 │ 39.06ns │ 2.00w │ 53.59% │ │ find_opt=mem+find:1000 │ 39.07ns │ 2.00w │ 53.60% │ │ find_opt=find+catch:10 │ 49.54ns │ 2.00w │ 67.97% │ │ find_opt=find+catch:100 │ 33.01ns │ 2.00w │ 45.29% │ │ find_opt=find+catch:1000 │ 32.97ns │ 2.00w │ 45.23% │ └──────────────────────────┴──────────┴─────────┴────────────┘ In this case find+catch is faster. And here is update for a key that is not present: $ dune exec --profile=release -- ./updatet.exe update Estimated testing time 1m30s (9 benchmarks x 10s). Change using '- quota'. ┌───────────────────────────────────┬──────────┬─────────┬────────────┐ │ Name │ Time/Run │ mWd/Run │ Percentage │ ├───────────────────────────────────┼──────────┼─────────┼────────────┤ │ update 4.06:10 │ 79.96ns │ 24.00w │ 17.27% │ │ update 4.06:100 │ 171.96ns │ 48.00w │ 37.15% │ │ update 4.06:1000 │ 243.95ns │ 66.00w │ 52.70% │ │ update=find+catch+add/remove:10 │ 183.46ns │ 24.00w │ 39.63% │ │ update=find+catch+add/remove:100 │ 340.00ns │ 48.00w │ 73.45% │ │ update=find+catch+add/remove:1000 │ 462.89ns │ 66.00w │ 100.00% │ │ update=mem+find+add/remove:10 │ 126.06ns │ 24.00w │ 27.23% │ │ update=mem+find+add/remove:100 │ 274.79ns │ 48.00w │ 59.36% │ │ update=mem+find+add/remove:1000 │ 401.62ns │ 66.00w │ 86.76% │ └───────────────────────────────────┴──────────┴─────────┴────────────┘ Here 4.06 is a clear win, and mem+add is faster than find+catch+add. Estimated testing time 1m30s (9 benchmarks x 10s). Change using '- quota'. ┌───────────────────────────────────┬──────────┬─────────┬────────────┐ │ Name │ Time/Run │ mWd/Run │ Percentage │ ├───────────────────────────────────┼──────────┼─────────┼────────────┤ │ update 4.06:10 │ 72.76ns │ 24.00w │ 31.25% │ │ update 4.06:100 │ 164.69ns │ 48.00w │ 70.74% │ │ update 4.06:1000 │ 232.79ns │ 66.00w │ 100.00% │ │ update=find+catch+add/remove:10 │ 133.24ns │ 23.00w │ 57.24% │ │ update=find+catch+add/remove:100 │ 118.76ns │ 35.00w │ 51.02% │ │ update=find+catch+add/remove:1000 │ 161.22ns │ 59.00w │ 69.26% │ │ update=mem+find+add/remove:10 │ 156.29ns │ 23.00w │ 67.14% │ │ update=mem+find+add/remove:100 │ 122.98ns │ 35.00w │ 52.83% │ │ update=mem+find+add/remove:1000 │ 161.53ns │ 59.00w │ 69.39% │ └───────────────────────────────────┴──────────┴─────────┴────────────┘ Interestingly here the 4.06 implementation is actually slower and not much difference between my other two. Best regards, --Edwin
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |