[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 6/6] tools/libxl: Adjust datacopiers POLLHUP handling when the fd is also readable
On 01/04/15 11:34, Roger Pau Monnà wrote: > Hello, > > El 30/03/15 a les 12.40, Ian Campbell ha escrit: >> On Thu, 2015-03-26 at 16:20 +0100, Roger Pau Monnà wrote: >>> El 16/03/15 a les 14.29, Ross Lagerwall ha escrit: >>>> From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >>>> >>>> POLLHUP|POLLIN is a valid revent to receive when there is readable data in >>>> a >>>> pipe, but the writable fd has been closed. This occurs in migration v2 >>>> when >>>> the legacy conversion process (which transforms the data inline) completes >>>> and >>>> exits successfully. >>>> >>>> In the case that there is data to read, suppress the POLLHUP. POSIX states >>>> that the hangup state is latched[1], which means it will reoccur on >>>> subsequent >>>> poll() calls. The datacopier is thus provided the opportunity to read >>>> until >>>> EOF, if possible. >>>> >>>> A POLLHUP on its own is treated exactly as before, indicating a different >>>> error with the fd. >>>> >>>> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html >>> This patch breaks booting pygrub guests on FreeBSD, it seems to trigger >>> some kind of error when pygrub exits: >>> >>> libxl: error: libxl_bootloader.c:596:bootloader_copyfail: unexpected eof >>> copying bootloader output >>> libxl: error: libxl_create.c:1129:domcreate_rebuild_done: cannot (re-)build >>> domain: -3 >>> >>> I will try to look into it, have you tested if the same happens on >>> Linux? >> Ross/Andrew, have you tried this on Linux? >> >> Roger, Does xl -vvv or even strace give any clues as to the sequences of >> events which lead to the error? > Here is the output of xl -vvv, it doesn't seem to contain any useful > info apart from what I've already posted: > > # xl -vvv create debian.cfg > Parsing config from debian.cfg > libxl: debug: libxl_create.c:1521:do_domain_create: ao 0x802834080: create: > how=0x0 callback=0x0 poller=0x80281f0a0 > libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk > vdev=xvda spec.backend=unknown > libxl: debug: libxl_device.c:298:libxl__device_disk_set_backend: Disk > vdev=xvda, using backend phy > libxl: debug: libxl_create.c:924:initiate_domain_create: running bootloader > libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk > vdev=(null) spec.backend=phy > libxl: debug: libxl.c:3065:libxl__device_disk_local_initiate_attach: locally > attaching PHY disk /dev/zvol/tank/debian > libxl: debug: libxl_bootloader.c:411:bootloader_disk_attached_cb: Config > bootloader value: pygrub > libxl: debug: libxl_bootloader.c:427:bootloader_disk_attached_cb: Checking > for bootloader in libexec path: /usr/local/lib/xen/bin/pygrub > libxl: debug: libxl_create.c:1537:do_domain_create: ao 0x802834080: > inprogress: poller=0x80281f0a0, flags=i > libxl: debug: libxl_event.c:577:libxl__ev_xswatch_register: watch > w=0x8028367a0 wpath=/local/domain/2 token=3/0: register slotnum=3 > libxl: debug: libxl_event.c:1942:libxl__ao_progress_report: ao 0x802834080: > progress report: ignored > libxl: debug: libxl_bootloader.c:539:bootloader_gotptys: executing > bootloader: /usr/local/lib/xen/bin/pygrub > libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: > /usr/local/lib/xen/bin/pygrub > libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: > --output=/var/run/xen/bootloader.2.out > libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: > --output-format=simple0 > libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: > --output-directory=/var/run/xen/bootloader.2.d > libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: > /dev/zvol/tank/debian > libxl: debug: libxl_event.c:514:watchfd_callback: watch w=0x8028367a0 > wpath=/local/domain/2 token=3/0: event epath=/local/domain/2 > libxl: error: libxl_bootloader.c:596:bootloader_copyfail: unexpected eof > copying bootloader output > libxl: debug: libxl_bootloader.c:638:bootloader_finished: bootloader completed > libxl: debug: libxl_event.c:615:libxl__ev_xswatch_deregister: watch > w=0x8028367a0 wpath=/local/domain/2 token=3/0: deregister slotnum=3 > libxl: error: libxl_create.c:1138:domcreate_rebuild_done: cannot (re-)build > domain: -3 > libxl: info: libxl.c:1699:devices_destroy_cb: forked pid 763 for destroy of > domain 2 > libxl: debug: libxl_event.c:1766:libxl__ao_complete: ao 0x802834080: > complete, rc=-3 > libxl: debug: libxl_event.c:1738:libxl__ao__destroy: ao 0x802834080: destroy > xc: debug: hypercall buffer: total allocations:44 total releases:44 > xc: debug: hypercall buffer: current allocations:0 maximum allocations:2 > xc: debug: hypercall buffer: cache current size:2 > xc: debug: hypercall buffer: cache hits:35 misses:2 toobig:7 > > And this is (I think) the relevant output from ktrace/kdump: > > 766 xl RET read 20/0x14 > 766 xl CALL read(0xc,0x80280702f,0x3cd) > 766 xl RET read -1 errno 35 Resource temporarily unavailable > 766 xl CALL gettimeofday(0x7fffffffde90,0) > 766 xl RET gettimeofday 0 > 766 xl CALL poll(0x802816180,0x6,0xffffffff) > 766 xl RET poll 1 > 766 xl CALL gettimeofday(0x7fffffffde90,0) > 766 xl RET gettimeofday 0 > 766 xl CALL write(0x10,0x802807014,0x1b) > 766 xl GIO fd 16 wrote 27 bytes > 0x0000 1b5b 3f31 6c1b 3e1b 5b32 343b 3148 0000 0000 000d 1b5b > |.[?1l.>.[24;1H.......[| > 0x0016 3f31 6c1b 3e |?1l.>| > > 766 xl RET write 27/0x1b > 766 xl CALL gettimeofday(0x7fffffffde90,0) > 766 xl RET gettimeofday 0 > 766 xl CALL poll(0x802816180,0x5,0xffffffff) > 766 xl RET poll 1 > 766 xl PSIG SIGCHLD caught handler=0x80114bdf0 mask=0x0 > code=CLD_EXITED > 766 xl CALL sigprocmask(SIG_SETMASK,0x7fffffffd81c,0) > 766 xl RET sigprocmask 0 > 766 xl CALL write(0xe,0x800af1fd2,0x1) > 766 xl GIO fd 14 wrote 1 byte > "\0" > 766 xl RET write 1 > 766 xl CALL sigreturn(0x7fffffffd440) > 766 xl RET sigreturn JUSTRETURN > 766 xl CALL gettimeofday(0x7fffffffde90,0) > 766 xl RET gettimeofday 0 > 766 xl CALL read(0xc,0x802807014,0x3e8) > 766 xl GIO fd 12 read 0 bytes > "" > 766 xl RET read 0 > 766 xl CALL write(0x2,0x7fffffffd0b0,0x7) > 766 xl GIO fd 2 wrote 7 bytes > "libxl: " > 766 xl RET write 7 > 766 xl CALL write(0x2,0x7fffffffd0b0,0x7) > 766 xl GIO fd 2 wrote 7 bytes > "error: " > >> Did/does commit 916735e1814d "tools/libxl: Restore errnoval behavior for >> datacopier callback" make any difference? > No, I still get the same error. I will try to take a closer look into this > when I finish some pending stuff... This works on Linux libxl: debug: libxl_event.c:577:libxl__ev_xswatch_register: watch w=0xb50870 wpath=/local/domain/11 token=3/0: register slotnum=3 libxl: debug: libxl_event.c:1942:libxl__ao_progress_report: ao 0xb53700: progress report: ignored libxl: debug: libxl_bootloader.c:539:bootloader_gotptys: executing bootloader: /usr/lib/xen/bin/pygrub libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: /usr/lib/xen/bin/pygrub libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: --args=root=/dev/xvda2 ro libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: --output=/var/run/xen/bootloader.11.out libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: --output-format=simple0 libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: --output-directory=/var/run/xen/bootloader.11.d libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: /dev/vm/andrewcoop-jessie-disk libxl: debug: libxl_event.c:514:watchfd_callback: watch w=0xb50870 wpath=/local/domain/11 token=3/0: event epath=/local/domain/11 libxl: debug: libxl_bootloader.c:638:bootloader_finished: bootloader completed libxl: debug: libxl_bootloader.c:138:bootloader_result_command: bootloader output contained kernel /var/run/xen/bootloader.11.d/boot_kernel.t2OsEV libxl: debug: libxl_bootloader.c:138:bootloader_result_command: bootloader output contained ramdisk /var/run/xen/bootloader.11.d/boot_ramdisk.3i2X1b libxl: debug: libxl_bootloader.c:138:bootloader_result_command: bootloader output contained args root=/dev/xvda2 ro elevator=noop root=/dev/xvda2 ro libxl: debug: libxl_bootloader.c:651:bootloader_finished: bootloader execution successful As 7e9ec5 has positively been identified as the cause of the regression on FreeBSD, that presumably means that FreeBSD won't permit a read() of an FD which has POLLHUP set. POSIX appears to specifically allow this action. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |