[Xen-devel] [PATCH 0/1] qemu-qdisk: indirect descriptors
In the meantime I tried an implementation of indirect descriptors for qemu, described further in the next mail. It is based on the current staging branch of qemu.

In my tests I did not observe an improvement. As the block size increases, the bandwidth starts to decrease earlier than on the staging branch, especially for higher values of iodepth[1].

I ran it under gprof and all the results are available on my github[2]; below is part of the flat profile for staging and for indirect descriptors when fio is run with iodepth=256 and bs=256 for 300 sec. In the indirect descriptors implementation the ioreq_unmap function takes a larger share of the time (10.13% vs 9.58%) despite a much smaller number of calls (83570 vs 1186036).

I also checked whether it cooperates better with grant copy, running vmstat at the same time, but then memory is rapidly exhausted and the system starts to swap out/in (parts of the listings are below). That is not the case for the plain grant copy implementation.

I also tried different values of MAX_INDIRECT_SEGMENTS from the set {256, 128, 64, 32, 16}, without much difference.

I would appreciate any suggestions on how to approach the problem.

flat profiles:

indirect descriptors
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 13.19      1.12     1.12   653798     0.00     0.00  get_clock_realtime
 10.13      1.98     0.86    83570     0.00     0.00  ioreq_unmap
  4.77      2.38     0.41 31245461     0.00     0.00  rcu_read_unlock
  4.12      2.73     0.35    83423     0.00     0.00  ioreq_map
  3.65      3.04     0.31 20900170     0.00     0.00  phys_page_find
  3.12      3.31     0.27 20886790     0.00     0.00  address_space_rw
  2.24      3.50     0.19 20886790     0.00     0.00  address_space_translate
  2.00      3.67     0.17 10849312     0.00     0.00  test_and_clear_bit
  1.88      3.83     0.16 31245456     0.00     0.00  rcu_read_lock
  1.71      3.98     0.14 41773586     0.00     0.00  memory_access_is_direct
  1.65      4.12     0.14 10330994     0.00     0.00  xen_map_cache_unlocked
  1.59      4.25     0.14 20886785     0.00     0.00  address_space_translate_internal
  1.53      4.38     0.13 10339152     0.00     0.00  cpu_inw
  1.41      4.50     0.12 10458730     0.00     0.00  find_portio
  1.30      4.61     0.11 10389053     0.00     0.00  cpu_physical_memory_rw
  1.12      4.71     0.10 10358655     0.00     0.00  qemu_get_ram_block
  1.06      4.79     0.09 31245450     0.00     0.00  xen_enabled
  1.06      4.88     0.09 10447242     0.00     0.00  portio_read
  1.06      4.97     0.09   237496     0.00     0.00  cpu_ioreq_pio
  1.06      5.07     0.09     1557     0.00     0.00  vnc_refresh_server_sur

staging
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 11.51      1.61     1.61   970388     0.00     0.00  get_clock_realtime
  9.58      2.95     1.34  1186036     0.00     0.00  ioreq_unmap
  5.50      3.72     0.77  1187881     0.00     0.00  ioreq_map
  4.15      4.30     0.58 31195245     0.00     0.00  rcu_read_unlock
  2.50      4.65     0.35 31195243     0.00     0.00  rcu_read_lock
  2.50      5.00     0.35 20866261     0.00     0.00  phys_page_find
  1.79      5.25     0.25 20852888     0.00     0.00  address_space_rw
  1.36      5.44     0.19  4912499     0.00     0.00  qemu_coroutine_switch
  1.22      5.61     0.17 20852881     0.00     0.00  address_space_translate
  1.22      5.78     0.17  6141137     0.00     0.00  bdrv_is_inserted
  1.07      5.93     0.15  2455277     0.00     0.00  tracked_request_end
  1.07      6.08     0.15  1187877     0.00     0.00  ioreq_parse
  1.00      6.22     0.14 20852887     0.00     0.00  address_space_translate_internal
  1.00      6.36     0.14  2456463     0.00     0.00  qemu_aio_unref
  1.00      6.50     0.14  2456156     0.00     0.00  qemu_coroutine_enter
  0.93      6.63     0.13 41705784     0.00     0.00  memory_access_is_direct

vmstat listings:

grant map
procs -------memory------- -swap- -----io------ --system--- ------cpu------
 r  b swpd free buff cache si  so     bi     bo    in    cs us sy id  wa st
 1  1   11   62 2775   638  0   0   4052    244 16250 14124  6 20 71   2  1
 1  0   11   58 2779   638  0   0   4308      0 16227 14254  7 18 74   1  0
 1  0   11   56 2781   638  0   0   2320   1456 16310 14124  6 19 74   0  1
 1  0   13   67 2776   631  0   1   3924   1372 14720 14019  6 20 74   0  1
 1  0   13   66 2779   631  0   0   2768      0 16105 14038  6 19 74   0  0
 1  0   13   63 2782   631  0   0   3000      0 14471 14002  6 19 74   0  0
 1  0   13   58 2786   632  0   0   3988     36 12383 13135  7 19 73   1  0
 1  0   13   56 2789   632  0   0   2488    116 12417 13853  6 20 74   0  0
 1  0   13   61 2788   627  0   0   2556    296 12402 13382  7 20 73   0  0
 2  0   13   59 2791   627  0   0   2552      0 16114 14085  7 18 74   0  1
 1  0   13   56 2793   627  0   0   2320      0 16155 14092  5 20 75   0  0
 1  0   14   69 2787   621  0   1   2848   1248 16766 14480  7 19 73   1  1
 1  0   14   65 2792   620  0   0   4356      6 16369 14136  6 20 74   0  0
 1  0   14   62 2795   621  0   0   3020      0 16079 14079  7 19 74   0  1
 1  0   14   59 2798   621  0   0   2964      0 16229 14084  5 19 75   0  1
 1  0   14   57 2800   621  0   0   2172      0 16454 14257  6 18 75   0  0
 2  0   15   69 2794   614  0   0   3024    712 16416 14241  7 18 73   1  1
 1  0   15   67 2797   615  0   0   2936     32 16168 14084  6 19 74   0  0

grant map with indirect descriptors
procs -------memory------- -swap- -----io------ --system--- ------cpu------
 r  b swpd free buff cache si  so     bi     bo    in    cs us sy id  wa st
 2  1    0   89 1900  1477  0   0   5760     24  9876 11438  5 19 76   0  0
 2  0    0   83 1906  1478  0   0   5568      0  8829 12670  5 19 76   1  0
 1  0    0   78 1911  1477  0   0   4736      0  8649 11291  5 19 76   0  1
 1  0    0   73 1916  1478  0   0   5120    984  8746 11946  5 20 75   0  0
 1  0    0   66 1922  1478  0   0   6016      0  8959 11785  6 18 76   0  1
 1  0    0   61 1927  1478  0   0   5312     32  9031 11559  5 18 76   1  0
 2  0    0   56 1932  1477  0   0   4608      0  9170 12156  5 19 75   0  1
 2  0    0   63 1937  1466  0   0   4992     28  8205 11871  5 21 74   0  0
 2  0    0   57 1942  1466  0   0   4928      0  8249 12198  5 18 76   0  0
 1  0    0   67 1948  1450  0   0   5376      8 10813 11381  6 20 74   0  0
 2  0    0   63 1952  1450  0   0   4288    192  9651 11814  5 20 70   4  1
 1  0    0   59 1956  1450  0   0   4096      0  8960 12058  4 19 76   0  0
 1  0    0   68 1962  1434  0   0   5184     12  9207 12089  5 20 75   0  1
 1  0    0   64 1966  1434  0   0   4096      0  8433 12016  5 20 75   0  0
 1  0    0   60 1970  1434  0   0   4224    140 10919 10750  5 18 76   0  0
 1  0    0   55 1976  1434  0   0   5440      0  8362 12207  5 19 76   0  0
 1  0    0   60 1980  1425  0   0   3776     64  8437 12020  6 19 74   1  1
 1  0    0   55 1984  1425  0   0   4416      0  8902 11962  6 17 76   0  0

grant copy
procs -------memory------- -swap- -----io------ --system--- ------cpu------
 r  b swpd free buff cache si  so     bi     bo    in    cs us sy id  wa st
 3  1    0   63 2789   760  0   0   2268      8 36651 32671  6 19 74   0  1
 0  0    0   62 2791   760  0   0   1240      4 36465 33543  5 18 75   1  1
 1  0    0   61 2792   760  0   0   1584      0 36237 32312  4 21 74   0  1
 3  0    0   59 2794   760  0   0   1628      0 36475 32888  4 20 75   0  1
 2  0    0   57 2796   760  0   0   1968      0 34898 33329  5 19 75   0  1
 0  0    0   55 2798   759  0   0   1948      0 31510 31938  4 20 75   0  1
 1  0    0   66 2794   753  0   0   2244     12 36692 34147  5 18 75   1  1
 1  1    0   64 2796   753  0   0   1792     20 29159 32907  5 18 76   0  1
 1  0    0   62 2798   753  0   0   2416      0 37445 35323  2 19 77   1  1
 2  0    0   59 2800   753  0   0   2188      0 35741 32670  4 20 76   0  1
 1  0    0   58 2802   753  0   0   1772      0 36770 34468  4 17 78   0  1
 0  0    0   56 2803   752  0   0   1260      0 36317 33152  4 19 76   0  1
 1  0    0   55 2805   753  0   0   1216      0 36364 32263  4 19 76   0  1
 4  0    0   67 2802   743  0   0   1068     16 35886 32045  5 19 75   1  1
 2  0    0   65 2805   743  0   0   2928      0 28347 33364  5 20 74   0  1
 2  0    0   62 2807   743  0   0   1944      0 36737 35010  4 17 78   0  1
 1  0    0   61 2808   743  0   0   1540      0 35855 31968  3 20 76   0  1
 1  0    0   58 2810   743  0   0   2268      0 36047 31639  4 20 75   0  1

grant copy with indirect descriptors
procs -------memory------- -swap- -----io------ --system--- ------cpu------
 r  b swpd free buff cache si  so     bi     bo    in    cs us sy id  wa st
 0 29  296   55    9   150  0 153 124752 167260 11989 13103  2  7  6  84  0
 0 28  296   53    9   156  0   0   2692    620   596   852  0  0  0 100  0
 0 27  296   72    1   159  0   0   7896   2072   675   882  0  2  0  98  0
 2 11  313   53   13   146  0  17  35392  17032  2920  3640  1  5 10  83  0
 0 13  324   58    1   139  0  10  25240  10688  2270  2584  0  3 22  74  0
 0 27  447   72    0   117  0 122 126116 120204 10926 12598  1  5  4  90  0
 1 24  450   67    0   130  0   3  11968   3608   772  1283  0  2  5  93  0
 0 21  486   71    0   133  0  35   5176  34968   596   952  0  2 46  53  0
 0 18  486   62    0   141  0   0   9352      0   652  1029  0  1 35  64  0
 0 14  488   80    0   146  0   2  12068   2124   584   706  1  2 26  71  0
 0 27  619   59    8   104  0 131 126264 128104 10722 12416  1  7  5  87  0
 0 22  619   78    0   111  0   0  16652     28  1267  1865  0  2  0  98  0
 0 25  800   68    5    81  0 180 166844 176752 13875 16661  1  6  0  92  0
 0 19  800   55    1   100  0   0  14436    200   763   909  0  1  0  99  0
 2 15  801   57    6   105  0   1  16308   1080  1103  1461  0  4 17  79  0
 0 28  832   68    0    82  0  30 179036  30080 14029 16746  2  8  6  83  0
 0 17  831   57    0    94  0   0  14140      0  1082  1316  0  2 13  86  0
 1 18  849   68    1   101  0  17   9908  17444   691   873  0  2  6  92  0

[1] https://docs.google.com/spreadsheets/d/1E6AMiB8ceJpExL6jWpH9u2yy6DZxzhmDUyFf-eUuJ0c/edit#gid=1390267663
[2] https://github.com/paulina-szubarczyk/xen-benchmark/tree/master/gprof

Thanks and regards,
Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel