
Re: [Xen-devel] FreeBSD PVH guest support



On Mon, Oct 28, 2013 at 02:35:03PM +0100, Roger Pau Monné wrote:
> Hello,
> 
> The Xen community is working on a new virtualization mode (or maybe I 
> should say an extension of HVM) to be able to run PV guests inside HVM 
> containers without requiring a device-model (Qemu). One of the 
> advantages of this new virtualization mode is that it is now much 
> easier to port guests to run under it (as compared to pure PV guests).
> 
> Given that FreeBSD already supports PVHVM, adding PVH support is quite 
> easy: we only need some glue for the PV entry point and then support 
> for letting some early init functions diverge from the native path 
> (like fetching the e820 map or starting the APs).
> 
> The attached patch contains all these changes, and allows an SMP FreeBSD 
> guest to fully boot (and AFAIK work) under this new PVH mode. The patch 
> can also be found on my git repo:
> 
> git://xenbits.xen.org/people/royger/freebsd.git pvh_v2

Awesome! That is really fantastic!
> 
> The patch touches quite a lot of the early init, so I've Cc'ed the 
> people who maintain those areas so they can review it.
> 
> To test it, a patched Xen is necessary, since the PVH changes are not 
> yet merged into upstream Xen. I've collected the patches for PVH guest 
> support from George Dunlap (v13) and fixed some bugs on top of them; 
> the tree can be found at:
> 
> git://xenbits.xen.org/people/royger/xen.git fix_pvh
> 
> For those curious, here is a dmesg of a FreeBSD PVH guest booting:
> 
> GDB: no debug ports present
> KDB: debugger backends: ddb
> KDB: current backend: ddb
> SMAP type=01 base=0000000000000000 len=0000000138800000
> ACPI BIOS Error (bug): A valid RSDP was not found (20130823/tbxfroot-223)
> APIC: Using the Xen PV enumerator.
> SMP: Added CPU 0 (BSP)
> SMP: Added CPU 2 (AP)
> SMP: Added CPU 4 (AP)
> SMP: Added CPU 6 (AP)
> SMP: Added CPU 8 (AP)
> SMP: Added CPU 10 (AP)
> SMP: Added CPU 12 (AP)
> Copyright (c) 1992-2013 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>       The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 11.0-CURRENT #420: Mon Oct 28 13:07:53 CET 2013
>     root@odin:/usr/obj/usr/src/sys/GENERIC amd64
> FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
> WARNING: WITNESS option enabled, expect reduced performance.
> Hypervisor: Origin = "XenVMMXenVMM"
> Calibrating TSC clock ... TSC clock: 3066775691 Hz
> CPU: Intel(R) Xeon(R) CPU           W3550  @ 3.07GHz (3066.78-MHz K8-class CPU)
>   Origin = "GenuineIntel"  Id = 0x106a5  Family = 0x6  Model = 0x1a  Stepping = 5
>   Features=0x1fc98b75<FPU,DE,TSC,MSR,PAE,CX8,APIC,SEP,CMOV,PAT,CLFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT>
>   Features2=0x80982201<SSE3,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,HV>
>   AMD Features=0x20100800<SYSCALL,NX,LM>
>   AMD Features2=0x1<LAHF>
> real memory  = 5242880000 (5000 MB)
> Physical memory chunk(s):
> 0x0000000000010000 - 0x00000000001fffff, 2031616 bytes (496 pages)
> 0x0000000002708000 - 0x0000000130864fff, 5068148736 bytes (1237341 pages)
> avail memory = 5035581440 (4802 MB)
> INTR: Adding local APIC 2 as a target
> INTR: Adding local APIC 4 as a target
> INTR: Adding local APIC 6 as a target
> INTR: Adding local APIC 8 as a target
> INTR: Adding local APIC 10 as a target
> INTR: Adding local APIC 12 as a target
> FreeBSD/SMP: Multiprocessor System Detected: 7 CPUs
> FreeBSD/SMP: 1 package(s) x 7 core(s)
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  2
>  cpu2 (AP): APIC ID:  4
>  cpu3 (AP): APIC ID:  6
>  cpu4 (AP): APIC ID:  8
>  cpu5 (AP): APIC ID: 10
>  cpu6 (AP): APIC ID: 12
> XEN: CPU 0 has VCPU ID 0
> XEN: CPU 1 has VCPU ID 1
> XEN: CPU 2 has VCPU ID 2
> XEN: CPU 3 has VCPU ID 3
> XEN: CPU 4 has VCPU ID 4
> XEN: CPU 5 has VCPU ID 5
> XEN: CPU 6 has VCPU ID 6
> x86bios:  IVT 0x000000-0x0004ff at 0xfffff80000000000
> x86bios: SSEG 0x010000-0x010fff at 0xfffffe012e79d000
> x86bios:  ROM 0x0a0000-0x0fefff at 0xfffff800000a0000
> random device not loaded; using insecure entropy
> ULE: setup cpu 0
> ULE: setup cpu 1
> ULE: setup cpu 2
> ULE: setup cpu 3
> ULE: setup cpu 4
> ULE: setup cpu 5
> ULE: setup cpu 6
> Event-channel device installed.
> snd_unit_init() u=0x00ff8000 [512] d=0x00007c00 [32] c=0x000003ff [1024]
> feeder_register: snd_unit=-1 snd_maxautovchans=16 latency=5 feeder_rate_min=1 feeder_rate_max=2016000 feeder_rate_round=25
> wlan: <802.11 Link Layer>
> Hardware, VIA Nehemiah Padlock RNG: VIA Padlock RNG not present
> Hardware, Intel IvyBridge+ RNG: RDRAND is not present
> null: <null device, zero device>
> Falling back to <Software, Yarrow> random adaptor
> random: <Software, Yarrow> initialized
> nfslock: pseudo-device
> kbd0 at kbdmux0
> module_register_init: MOD_LOAD (vesa, 0xffffffff80d21c60, 0) error 19
> io: <I/O>
> VMBUS: load
> mem: <memory>
> hpt27xx: RocketRAID 27xx controller driver v1.1
> hptrr: RocketRAID 17xx/2xxx SATA controller driver v1.2
> hptnr: R750/DC7280 controller driver v1.0
> ACPI BIOS Error (bug): A valid RSDP was not found (20130823/tbxfroot-223)
> ACPI: Table initialisation failed: AE_NOT_FOUND
> ACPI: Try disabling either ACPI or apic support.
> xenstore0: <XenStore> on motherboard
> Grant table initialized
> xc0: <Xen Console> on motherboard
> xen_et0: <Xen PV Clock> on motherboard
> Event timer "XENTIMER" frequency 1000000000 Hz quality 950
> Timecounter "XENTIMER" frequency 1000000000 Hz quality 950
> xen_et0: registered as a time-of-day clock (resolution 10000000us, adjustment 5.000000000s)
> pvcpu0: <Xen PV CPU> on motherboard
> pvcpu1: <Xen PV CPU> on motherboard
> pvcpu2: <Xen PV CPU> on motherboard
> pvcpu3: <Xen PV CPU> on motherboard
> pvcpu4: <Xen PV CPU> on motherboard
> pvcpu5: <Xen PV CPU> on motherboard
> pvcpu6: <Xen PV CPU> on motherboard
> legacy_pcib_identify: no bridge found, adding pcib0 anyway
> pcib0 pcibus 0 on motherboard
> pci0: <PCI bus> on pcib0
> pci0: domain=0, physical bus=0
> cpu0 on motherboard
> cpu1 on motherboard
> cpu2 on motherboard
> cpu3 on motherboard
> cpu4 on motherboard
> cpu5 on motherboard
> cpu6 on motherboard
> isa0: <ISA bus> on motherboard
> qpi0: <QPI system bus> on motherboard
> ex_isa_identify()
> isa_probe_children: disabling PnP devices
> isa_probe_children: probing non-PnP devices
> fb: new array size 4
> sc0: <System console> on isa0
> sc0: MDA <16 virtual consoles, flags=0x100>
> sc0: fb0, kbd0, terminal emulator: scteken (teken terminal)
> vga0: <Generic ISA VGA> at port 0x3b0-0x3bb iomem 0xb0000-0xb7fff on isa0
> isa_probe_children: probing PnP devices
> Device configuration finished.
> procfs registered
> Timecounters tick every 1.000 msec
> vlan: initialized, using hash tables with chaining
> tcp_init: net.inet.tcp.tcbhashsize auto tuned to 65536
> lo0: bpf attached
> hpt27xx: no controller detected.
> hptrr: no controller detected.
> hptnr: no controller detected.
> xenbusb_front0: <Xen Frontend Devices> on xenstore0
> xenbusb_add_device: Device device/suspend/event-channel ignored. State 6
> xn0: <Virtual Network Interface> at device/vif/0 on xenbusb_front0
> xn0: bpf attached
> xn0: Ethernet address: 00:16:3e:0b:a4:b1
> xenbusb_back0: <Xen Backend Devices> on xenstore0
> xctrl0: <Xen Control Device> on xenstore0
> xn0: backend features: feature-sg feature-gso-tcp4
> xbd0: 20480MB <Virtual Block Device> at device/vbd/51712 on xenbusb_front0
> xbd0: features: flush, write_barrier
> xbd0: synchronize cache commands enabled.
> GEOM: new disk xbd0
> random: unblocking device.
> Netvsc initializing... SMP: AP CPU #5 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #6 Launched!
> SMP: AP CPU #4 Launched!
> TSC timecounter discards lower 1 bit(s)
> Timecounter "TSC-low" frequency 1533387845 Hz quality -100
> WARNING: WITNESS option enabled, expect reduced performance.
> Trying to mount root from ufs:/dev/xbd0p2 []...
> start_init: trying /sbin/init
> Setting hostuuid: c9230f36-1a54-489e-877c-1d15b8f463e9.
> Setting hostid: 0xd52252c7.
> ZFS filesystem version: 5
> ZFS storage pool version: features support (5000)
> Entropy harvesting: interrupts ethernet point_to_pointsha256: /kernel: No 
> such file or directory
>  kickstart.
> Starting file system checks:
> /dev/xbd0p2: FILE SYSTEM CLEAN; SKIPPING CHECKS
> /dev/xbd0p2: clean, 2213647 free (17111 frags, 274567 blocks, 0.4% fragmentation)
> Mounting local file systems:.
> Writing entropy file:.
> xn0: link state changed to DOWN
> xn0: link state changed to UP
> Starting Network: lo0 xn0.
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>       options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
>       inet6 ::1 prefixlen 128
>       inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
>       inet 127.0.0.1 netmask 0xff000000
>       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> xn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>       options=503<RXCSUM,TXCSUM,TSO4,LRO>
>       ether 00:16:3e:0b:a4:b1
>       nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>       media: Ethernet manual
>       status: active
> Starting devd.
> Starting dhclient.
> DHCPDISCOVER on xn0 to 255.255.255.255 port 67 interval 7
> DHCPOFFER from 172.16.1.1
> DHCPREQUEST on xn0 to 255.255.255.255 port 67
> DHCPACK from 172.16.1.1
> bound to 172.16.1.149 -- renewal in 43200 seconds.
> add net ::ffff:0.0.0.0: gateway ::1
> add net ::0.0.0.0: gateway ::1
> add net fe80::: gateway ::1
> add net ff02::: gateway ::1
> ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib
> 32-bit compatibility ldconfig path: /usr/lib32
> Creating and/or trimming log files.
> Starting syslogd.
> No core dumps found.
> lock order reversal:
>  1st 0xfffffe012e861e28 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:3050
>  2nd 0xfffff80005b87c00 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284
> KDB: stack backtrace:
> X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfffffe012fb8c410
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe012fb8c4c0
> witness_checkorder() at witness_checkorder+0xd23/frame 0xfffffe012fb8c550
> _sx_xlock() at _sx_xlock+0x75/frame 0xfffffe012fb8c590
> ufsdirhash_add() at ufsdirhash_add+0x3b/frame 0xfffffe012fb8c5d0
> ufs_direnter() at ufs_direnter+0x688/frame 0xfffffe012fb8c690
> ufs_vinit() at ufs_vinit+0x33f3/frame 0xfffffe012fb8c890
> VOP_MKDIR_APV() at VOP_MKDIR_APV+0xf0/frame 0xfffffe012fb8c8c0
> kern_mkdirat() at kern_mkdirat+0x1ff/frame 0xfffffe012fb8cae0
> amd64_syscall() at amd64_syscall+0x265/frame 0xfffffe012fb8cbf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe012fb8cbf0
> --- syscall (136, FreeBSD ELF64, sys_mkdir), rip = 0x80092faaa, rsp = 0x7fffffffd788, rbp = 0x7fffffffdc70 ---
> Clearing /tmp (X related).
> Updating motd:.
> Configuring syscons: keymap blanktime.
> Performing sanity check on sshd configuration.
> Starting sshd.
> Starting cron.
> Starting background file system checks in 60 seconds.
> 
> Mon Oct 28 13:22:52 CET 2013
> 
> FreeBSD/amd64 (Amnesiac) (xc0)

> From 16de1566ada65e5838105870df576ab8258ed8b6 Mon Sep 17 00:00:00 2001
> From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> Date: Mon, 14 Oct 2013 18:33:17 +0200
> Subject: [PATCH] Xen x86 PVH support
> 
> This is still very experimental, and PVH support has not yet been
> merged into upstream Xen.
> 
> PVH mode is basically a PV guest inside an HVM container, and shares
> a great amount of code with PVHVM. The main difference is the way the
> guest is started: PVH uses the PV start sequence, jumping directly
> into the kernel entry point in long mode with page tables already set
> up. The main work of this patch consists of making the environment as
> similar as possible to what native FreeBSD expects, and then adding
> hooks to the PV ops when necessary.
> 
> sys/amd64/amd64/locore.S:
>  * Add PV entry point, hypervisor_page and the necessary elfnotes.
> 
> sys/amd64/amd64/machdep.c:
>  * Add hooks to replace bare metal operations that should use a PV
>   helper. This includes:
>    - Preload metadata
>    - i8254_init and i8254_delay
>    - Fetching the e820 memory map
>    - Reserving the MP bootstrap region
> 
>  * Create a DELAY function that uses the PV hooks.
>  * Introduce a new hammer_time_xen that sets the necessary stuff when
>    running in PVH mode.
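For readers skimming the commit message, the hook approach described above can be sketched in isolation. This is only a minimal userland illustration, assuming a cut-down table with two members; the names early_ops, pv_set_early_ops and generic_delay are hypothetical and are not the ones used in the patch, which defines struct init_ops and xen_pv_set_init_ops() further down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified hook table (the real init_ops has more members). */
struct early_ops {
	void (*early_delay_init)(void);
	void (*early_delay)(int usec);
};

/* Call counters, only here so the dispatch can be observed. */
int native_delay_calls;
int pv_delay_calls;

static void native_delay_init(void) { }
static void native_delay(int usec) { (void)usec; native_delay_calls++; }
static void pv_delay_init(void) { }
static void pv_delay(int usec) { (void)usec; pv_delay_calls++; }

/* Default table: the bare metal implementations. */
static struct early_ops early_ops = {
	.early_delay_init =	native_delay_init,
	.early_delay =		native_delay,
};

/* The PV entry path would call this before entering the common init code. */
void
pv_set_early_ops(void)
{
	early_ops.early_delay_init = pv_delay_init;
	early_ops.early_delay = pv_delay;
}

/* Common code dispatches through the table and never names the backend. */
void
generic_delay(int usec)
{
	assert(early_ops.early_delay != NULL);
	early_ops.early_delay(usec);
}
```

Calling generic_delay() before pv_set_early_ops() reaches the native stub, and afterwards reaches the PV stub. The point of the design is that the common init path stays free of Xen-specific conditionals.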
> 
> sys/amd64/amd64/mp_machdep.c:
>  * Introduce a hook to replace start_all_aps.
>  * Introduce a lapic_disabled variable to prevent polluting the code
>    with Xen-specific gates.
> 
> sys/amd64/include/asmacros.h:
>  * Copy the ELFNOTE macro from the i386 Xen PV port.
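For anyone not familiar with what the ELFNOTE macro emits: each note is a fixed header (namesz, descsz, type) followed by the name and the descriptor, each padded to a 4-byte boundary. The following is a rough C sketch of that layout, not code from the patch; note_build and roundup4 are made-up helper names, and the type value passed in is whatever the caller chooses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed header that precedes every ELF note. */
struct note_hdr {
	uint32_t namesz;	/* length of the name, including the NUL */
	uint32_t descsz;	/* length of the descriptor */
	uint32_t type;		/* note type, interpreted by the consumer */
};

/* Round up to the 4-byte alignment used between note fields. */
static size_t
roundup4(size_t n)
{
	return ((n + 3) & ~(size_t)3);
}

/*
 * Serialize one note into buf (hypothetical helper).
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t
note_build(uint8_t *buf, size_t buflen, const char *name, uint32_t type,
    const void *desc, uint32_t descsz)
{
	struct note_hdr hdr;
	size_t namesz, total;

	assert(buf != NULL && name != NULL && desc != NULL);
	namesz = strlen(name) + 1;
	total = sizeof(hdr) + roundup4(namesz) + roundup4(descsz);
	if (total > buflen)
		return (0);

	memset(buf, 0, total);		/* zero the padding bytes */
	hdr.namesz = (uint32_t)namesz;
	hdr.descsz = descsz;
	hdr.type = type;
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), name, namesz);
	memcpy(buf + sizeof(hdr) + roundup4(namesz), desc, descsz);
	return (total);
}
```

With the name "Xen" (4 bytes including the NUL) and the 8-byte string "FreeBSD", the note occupies 12 + 4 + 8 = 24 bytes, which matches the namesz/descsz arithmetic the assembler does with the 1f/2f/3f/4f labels in the macro.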
> 
> sys/amd64/include/clock.h:
> sys/i386/include/clock.h:
>  * Prototypes for the xen early delay initialization and usage.
> 
> sys/amd64/include/cpu.h:
>  * Introduce a new cpu hook to init APs.
> 
> sys/amd64/include/sysarch.h:
>  * Declare the init_ops structure.
> 
> sys/amd64/include/xen/hypercall.h:
> sys/i386/include/xen/hypercall.h:
>  * Switch to the PV style hypercall mechanism for HVM also.
> 
> sys/conf/files:
>  * Make the PV console available on XENHVM also.
> 
> sys/conf/files.amd64:
>  * Include the new files for the PVH port.
> 
> sys/dev/xen/console/console.c:
> sys/dev/xen/console/xencons_ring.c:
>  * Gate the PV console attach so it is only used on PV ports.
>  * Use HYPERVISOR_start_info instead of xen_start_info.
>  * Use HYPERVISOR_event_channel_op to kick the event channel before
>    Xen interrupts are set up.
> 
> sys/dev/xen/control/control.c:
>  * Use the PV shutdown on PVH.
> 
> sys/dev/xen/timer/timer.c:
>  * Pass a vcpu_info to xen_fetch_vcpu_time; this allows using this
>    function at very early init, before per-cpu vcpu_info is set.
>  * Remove critical_{enter/exit} from xen_fetch_vcpu_time so it can be
>    used at early boot; place them in the callers instead.
>  * Introduce two new functions, xen_delay_init and xen_delay that can
>    be used at early boot to implement the generic DELAY function.
> 
> sys/i386/i386/locore.s:
>  * Reserve space for the hypercall page.
> 
> sys/i386/i386/machdep.c:
>  * Create a generic DELAY function.
> 
> sys/i386/xen/xen_machdep.c:
>  * Set HYPERVISOR_start_info.
> 
> sys/x86/isa/clock.c:
>  * Rename the generic DELAY function to i8254_delay.
> 
> sys/x86/x86/delay.c:
>  * Put generic delay helpers here, get_tsc and delay_tc.
> 
> sys/x86/x86/local_apic.c:
>  * Prevent the local APIC from attaching when running in PVH mode.
> 
> sys/x86/xen/hvm.c:
>  * Set the start_all_aps hook.
>  * Fix the setting of the hypercall page now that we are using the
>    same mechanism as the PV port.
>  * Initialize Xen CPU hooks for the PVH port.
>  * Introduce the xen_early_printf debug function, which prints
>    directly to the hypervisor console.
> 
> sys/x86/xen/mptable.c:
>  * Create a dummy PV CPU enumerator for the PVH port.
> 
> sys/x86/xen/pv.c:
>  * Implement the PV functions for the early boot hooks,
>    parse_preload_data and fetch_e820_map.
>  * Implement the PV function for the start_all_aps hook.
> 
> sys/x86/xen/pvcpu.c:
>  * Dummy Xen PV CPU device, that we use to set the per-cpu pc_device.
> 
> sys/xen/gnttab.c:
>  * Allocate resume_frames for the PVH port.
> 
> sys/xen/interface/arch-x86/xen.h:
>  * Interface change for the PVH port (not used on FreeBSD).
> 
> sys/xen/pv.h:
>  * Header that exports the specific PV functions.
> 
> sys/xen/xen-os.h:
>  * Declare prototypes for the newly added functions.
> 
> sys/xen/xenstore/xenstore.c:
>  * Make the xenstore driver hang from both xenpci and the nexus when
>    running XENHVM, because we don't have a xenpci device on the PVH
>    port.
>  * Gate xenstore addition to parent == xenpci in the HVM case.
> ---
>  sys/amd64/amd64/locore.S           |   53 ++++++++
>  sys/amd64/amd64/machdep.c          |  179 ++++++++++++++++++++++----
>  sys/amd64/amd64/mp_machdep.c       |   27 +++--
>  sys/amd64/include/asmacros.h       |   26 ++++
>  sys/amd64/include/clock.h          |    6 +
>  sys/amd64/include/cpu.h            |    1 +
>  sys/amd64/include/sysarch.h        |   19 +++
>  sys/amd64/include/xen/hypercall.h  |    7 -
>  sys/conf/files                     |    4 +-
>  sys/conf/files.amd64               |    4 +
>  sys/conf/files.i386                |    1 +
>  sys/dev/xen/console/console.c      |   23 +++-
>  sys/dev/xen/console/xencons_ring.c |   15 ++-
>  sys/dev/xen/control/control.c      |   37 +++---
>  sys/dev/xen/timer/timer.c          |   59 +++++++--
>  sys/i386/i386/locore.s             |    9 ++
>  sys/i386/i386/machdep.c            |    9 ++
>  sys/i386/include/clock.h           |    6 +
>  sys/i386/include/xen/hypercall.h   |    7 -
>  sys/i386/xen/xen_machdep.c         |    4 +-
>  sys/x86/isa/clock.c                |   53 +--------
>  sys/x86/x86/delay.c                |   95 ++++++++++++++
>  sys/x86/x86/local_apic.c           |    8 +-
>  sys/x86/xen/hvm.c                  |   93 ++++++++++----
>  sys/x86/xen/mptable.c              |  136 ++++++++++++++++++++
>  sys/x86/xen/pv.c                   |  247 ++++++++++++++++++++++++++++++++++++
>  sys/x86/xen/pvcpu.c                |   98 ++++++++++++++
>  sys/xen/gnttab.c                   |   21 +++-
>  sys/xen/interface/arch-x86/xen.h   |   11 ++-
>  sys/xen/pv.h                       |   29 ++++
>  sys/xen/xen-os.h                   |    8 +
>  sys/xen/xenstore/xenstore.c        |   32 ++++--
>  32 files changed, 1141 insertions(+), 186 deletions(-)
>  create mode 100644 sys/x86/x86/delay.c
>  create mode 100644 sys/x86/xen/mptable.c
>  create mode 100644 sys/x86/xen/pv.c
>  create mode 100644 sys/x86/xen/pvcpu.c
>  create mode 100644 sys/xen/pv.h
> 
> diff --git a/sys/amd64/amd64/locore.S b/sys/amd64/amd64/locore.S
> index 55cda3a..e04cc48 100644
> --- a/sys/amd64/amd64/locore.S
> +++ b/sys/amd64/amd64/locore.S
> @@ -31,6 +31,12 @@
>  #include <machine/pmap.h>
>  #include <machine/specialreg.h>
>  
> +#ifdef XENHVM
> +#include <xen/xen-os.h>
> +#define __ASSEMBLY__
> +#include <xen/interface/elfnote.h>
> +#endif
> +
>  #include "assym.s"
>  
>  /*
> @@ -86,3 +92,50 @@ NON_GPROF_ENTRY(btext)
>       ALIGN_DATA                      /* just to be sure */
>       .space  0x1000                  /* space for bootstack - temporary stack */
>  bootstack:
> +
> +#ifdef XENHVM
> +/* Xen */
> +.section __xen_guest
> +     ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz, "FreeBSD")
> +     ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz, "HEAD")
> +     ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz, "xen-3.0")
> +     ELFNOTE(Xen, XEN_ELFNOTE_VIRT_BASE,      .quad,  KERNBASE)
> +     ELFNOTE(Xen, XEN_ELFNOTE_PADDR_OFFSET,   .quad,  KERNBASE) /* Xen honours elf->p_paddr; compensate for this */
> +     ELFNOTE(Xen, XEN_ELFNOTE_ENTRY,          .quad,  xen_start)
> +     ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, .quad,  hypercall_page)
> +     ELFNOTE(Xen, XEN_ELFNOTE_HV_START_LOW,   .quad,  HYPERVISOR_VIRT_START)
> +     ELFNOTE(Xen, XEN_ELFNOTE_FEATURES,       .asciz, "writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel|hvm_callback_vector")
> +     ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE,       .asciz, "yes")
> +     ELFNOTE(Xen, XEN_ELFNOTE_L1_MFN_VALID,   .long,  PG_V, PG_V)
> +     ELFNOTE(Xen, XEN_ELFNOTE_LOADER,         .asciz, "generic")
> +     ELFNOTE(Xen, XEN_ELFNOTE_SUSPEND_CANCEL, .long,  0)
> +     ELFNOTE(Xen, XEN_ELFNOTE_BSD_SYMTAB,     .asciz, "yes")
> +
> +     .text
> +.p2align PAGE_SHIFT, 0x90    /* Hypercall_page needs to be PAGE aligned */
> +
> +NON_GPROF_ENTRY(hypercall_page)
> +     .skip   0x1000, 0x90    /* Fill with "nop"s */
> +
> +NON_GPROF_ENTRY(xen_start)
> +     /* Don't trust what the loader gives for rflags. */
> +     pushq   $PSL_KERNEL
> +     popfq
> +
> +     /* Parameters for the xen init function */
> +     movq    %rsi, %rdi              /* start_info  (arg 1) */
> +     movq    %rsp, %rsi              /* xenstack    (arg 2) */
> +
> +     /* Use our own stack */
> +     movq    $bootstack,%rsp
> +     xorl    %ebp, %ebp
> +
> +     /* u_int64_t hammer_time_xen(start_info_t *si, u_int64_t xenstack); */
> +     call    hammer_time_xen
> +     movq    %rax, %rsp              /* set up kstack for mi_startup() */
> +     call    mi_startup              /* autoconfiguration, mountroot etc */
> +
> +     /* NOTREACHED */
> +0:   hlt
> +     jmp     0b
> +#endif
> diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c
> index 2b2e47f..b649def 100644
> --- a/sys/amd64/amd64/machdep.c
> +++ b/sys/amd64/amd64/machdep.c
> @@ -127,6 +127,7 @@ __FBSDID("$FreeBSD$");
>  #include <machine/reg.h>
>  #include <machine/sigframe.h>
>  #include <machine/specialreg.h>
> +#include <machine/sysarch.h>
>  #ifdef PERFMON
>  #include <machine/perfmon.h>
>  #endif
> @@ -147,10 +148,20 @@ __FBSDID("$FreeBSD$");
>  #include <isa/isareg.h>
>  #include <isa/rtc.h>
>  
> +#ifdef XENHVM
> +/* Xen */
> +#include <xen/xen-os.h>
> +#include <xen/hvm.h>
> +#include <xen/pv.h>
> +#endif
> +
>  /* Sanity check for __curthread() */
>  CTASSERT(offsetof(struct pcpu, pc_curthread) == 0);
>  
>  extern u_int64_t hammer_time(u_int64_t, u_int64_t);
> +#ifdef XENHVM
> +extern u_int64_t hammer_time_xen(start_info_t *, u_int64_t);
> +#endif
>  
>  extern void printcpuinfo(void);      /* XXX header file */
>  extern void identify_cpu(void);
> @@ -166,6 +177,23 @@ static int  set_fpcontext(struct thread *td, const mcontext_t *mcp,
>      char *xfpustate, size_t xfpustate_len);
>  SYSINIT(cpu, SI_SUB_CPU, SI_ORDER_FIRST, cpu_startup, NULL);
>  
> +/* Preload data parse function */
> +static caddr_t native_parse_preload_data(u_int64_t);
> +
> +/* Native function to fetch the e820 map */
> +static void native_fetch_e820_map(caddr_t, struct bios_smap **, u_int32_t *);
> +
> +/* Default init_ops implementation. */
> +struct init_ops init_ops = {
> +     .parse_preload_data =   native_parse_preload_data,
> +     .early_delay_init =     i8254_init,
> +     .early_delay =          i8254_delay,
> +     .fetch_e820_map =       native_fetch_e820_map,
> +#ifdef SMP
> +     .mp_bootaddress =       mp_bootaddress,
> +#endif
> +};
> +
>  /*
>   * The file "conf/ldscript.amd64" defines the symbol "kernphys".  Its value is
>   * the physical address at which the kernel is loaded.
> @@ -216,6 +244,15 @@ struct mem_range_softc mem_range_softc;
>  
>  struct mtx dt_lock;  /* lock for GDT and LDT */
>  
> +void
> +DELAY(int n)
> +{
> +     if (delay_tc(n))
> +             return;
> +
> +     init_ops.early_delay(n);
> +}
> +
>  static void
>  cpu_startup(dummy)
>       void *dummy;
> @@ -1408,6 +1445,24 @@ add_smap_entry(struct bios_smap *smap, vm_paddr_t *physmap, int *physmap_idxp)
>       return (1);
>  }
>  
> +static void
> +native_fetch_e820_map(caddr_t kmdp, struct bios_smap **smap, u_int32_t *size)
> +{
> +     /*
> +      * get memory map from INT 15:E820, kindly supplied by the
> +      * loader.
> +      *
> +      * subr_module.c says:
> +      * "Consumer may safely assume that size value precedes data."
> +      * ie: an int32_t immediately precedes smap.
> +      */
> +     *smap = (struct bios_smap *)preload_search_info(kmdp,
> +         MODINFO_METADATA | MODINFOMD_SMAP);
> +     if (*smap == NULL)
> +             panic("No BIOS smap info from loader!");
> +     *size = *((u_int32_t *)*smap - 1);
> +}
> +
>  /*
>   * Populate the (physmap) array with base/bound pairs describing the
>   * available physical memory in the system, then test this memory and
> @@ -1433,19 +1488,8 @@ getmemsize(caddr_t kmdp, u_int64_t first)
>       basemem = 0;
>       physmap_idx = 0;
>  
> -     /*
> -      * get memory map from INT 15:E820, kindly supplied by the loader.
> -      *
> -      * subr_module.c says:
> -      * "Consumer may safely assume that size value precedes data."
> -      * ie: an int32_t immediately precedes smap.
> -      */
> -     smapbase = (struct bios_smap *)preload_search_info(kmdp,
> -         MODINFO_METADATA | MODINFOMD_SMAP);
> -     if (smapbase == NULL)
> -             panic("No BIOS smap info from loader!");
> +     init_ops.fetch_e820_map(kmdp, &smapbase, &smapsize);
>  
> -     smapsize = *((u_int32_t *)smapbase - 1);
>       smapend = (struct bios_smap *)((uintptr_t)smapbase + smapsize);
>  
>       for (smap = smapbase; smap < smapend; smap++)
> @@ -1467,7 +1511,8 @@ getmemsize(caddr_t kmdp, u_int64_t first)
>  
>  #ifdef SMP
>       /* make hole for AP bootstrap code */
> -     physmap[1] = mp_bootaddress(physmap[1] / 1024);
> +     if (init_ops.mp_bootaddress)
> +             physmap[1] = init_ops.mp_bootaddress(physmap[1] / 1024);
>  #endif
>  
>       /*
> @@ -1681,6 +1726,98 @@ do_next:
>       msgbufp = (struct msgbuf *)PHYS_TO_DMAP(phys_avail[pa_indx]);
>  }
>  
> +static caddr_t
> +native_parse_preload_data(u_int64_t modulep)
> +{
> +     caddr_t kmdp;
> +
> +     preload_metadata = (caddr_t)(uintptr_t)(modulep + KERNBASE);
> +     preload_bootstrap_relocate(KERNBASE);
> +     kmdp = preload_search_by_type("elf kernel");
> +     if (kmdp == NULL)
> +             kmdp = preload_search_by_type("elf64 kernel");
> +     boothowto = MD_FETCH(kmdp, MODINFOMD_HOWTO, int);
> +     kern_envp = MD_FETCH(kmdp, MODINFOMD_ENVP, char *) + KERNBASE;
> +#ifdef DDB
> +     ksym_start = MD_FETCH(kmdp, MODINFOMD_SSYM, uintptr_t);
> +     ksym_end = MD_FETCH(kmdp, MODINFOMD_ESYM, uintptr_t);
> +#endif
> +
> +     return (kmdp);
> +}
> +
> +#ifdef XENHVM
> +/*
> + * First function called by the Xen PVH boot sequence.
> + *
> + * Set some Xen global variables and prepare the environment so it is
> + * as similar as possible to what native FreeBSD init function expects.
> + */
> +u_int64_t
> +hammer_time_xen(start_info_t *si, u_int64_t xenstack)
> +{
> +     u_int64_t physfree;
> +     u_int64_t *PT4 = (u_int64_t *)xenstack;
> +     u_int64_t *PT3 = (u_int64_t *)(xenstack + PAGE_SIZE);
> +     u_int64_t *PT2 = (u_int64_t *)(xenstack + 2 * PAGE_SIZE);
> +     int i;
> +
> +     KASSERT((si != NULL && xenstack != 0),
> +             ("invalid start_info or xenstack"));
> +
> +     xen_early_printf("FreeBSD PVH running on %s\n", si->magic);
> +
> +     /* We use 3 pages of xen stack for the boot pagetables */
> +     physfree = xenstack + 3 * PAGE_SIZE - KERNBASE;
> +
> +     /* Setup Xen global variables */
> +     HYPERVISOR_start_info = si;
> +     HYPERVISOR_shared_info =
> +             (shared_info_t *)(si->shared_info + KERNBASE);
> +
> +     /*
> +      * Setup some misc global variables for Xen devices
> +      *
> +      * XXX: devices that need these specific variables should
> +      *      be rewritten to fetch this info by themselves from the
> +      *      start_info page.
> +      */
> +     console_page =
> +             (char *)(ptoa(si->console.domU.mfn) + KERNBASE);
> +     xen_store = (struct xenstore_domain_interface *)
> +                 (ptoa(si->store_mfn) + KERNBASE);
> +
> +     xen_domain_type = XEN_PV_DOMAIN;
> +     vm_guest = VM_GUEST_XEN;
> +
> +     /*
> +      * Use the stack Xen gives us to build the page tables
> +      * as native FreeBSD expects to find them (created
> +      * by the boot trampoline).
> +      */
> +     for (i = 0; i < 512; i++) {
> +             /* Each slot of the level 4 pages points to the same level 3 page */
> +             PT4[i] = ((u_int64_t)&PT3[0]) - KERNBASE;
> +             PT4[i] |= PG_V | PG_RW | PG_U;
> +
> +             /* Each slot of the level 3 pages points to the same level 2 page */
> +             PT3[i] = ((u_int64_t)&PT2[0]) - KERNBASE;
> +             PT3[i] |= PG_V | PG_RW | PG_U;
> +
> +             /* The level 2 page slots are mapped with 2MB pages for 1GB. */
> +             PT2[i] = i * (2 * 1024 * 1024);
> +             PT2[i] |= PG_V | PG_RW | PG_PS | PG_U;
> +     }
> +     load_cr3(((u_int64_t)&PT4[0]) - KERNBASE);
> +
> +     /* Set the hooks for early functions that diverge from bare metal */
> +     xen_pv_set_init_ops();
> +
> +     /* Now we can jump into the native init function */
> +     return hammer_time(0, physfree);
> +}
> +#endif
> +
>  u_int64_t
>  hammer_time(u_int64_t modulep, u_int64_t physfree)
>  {
> @@ -1705,17 +1842,7 @@ hammer_time(u_int64_t modulep, u_int64_t physfree)
>        */
>       proc_linkup0(&proc0, &thread0);
>  
> -     preload_metadata = (caddr_t)(uintptr_t)(modulep + KERNBASE);
> -     preload_bootstrap_relocate(KERNBASE);
> -     kmdp = preload_search_by_type("elf kernel");
> -     if (kmdp == NULL)
> -             kmdp = preload_search_by_type("elf64 kernel");
> -     boothowto = MD_FETCH(kmdp, MODINFOMD_HOWTO, int);
> -     kern_envp = MD_FETCH(kmdp, MODINFOMD_ENVP, char *) + KERNBASE;
> -#ifdef DDB
> -     ksym_start = MD_FETCH(kmdp, MODINFOMD_SSYM, uintptr_t);
> -     ksym_end = MD_FETCH(kmdp, MODINFOMD_ESYM, uintptr_t);
> -#endif
> +     kmdp = init_ops.parse_preload_data(modulep);
>  
>       /* Init basic tunables, hz etc */
>       init_param1();
> @@ -1799,10 +1926,10 @@ hammer_time(u_int64_t modulep, u_int64_t physfree)
>       lidt(&r_idt);
>  
>       /*
> -      * Initialize the i8254 before the console so that console
> +      * Initialize the early delay before the console so that console
>        * initialization can use DELAY().
>        */
> -     i8254_init();
> +     init_ops.early_delay_init();
>  
>       /*
>        * Initialize the console before we print anything out.
> diff --git a/sys/amd64/amd64/mp_machdep.c b/sys/amd64/amd64/mp_machdep.c
> index 4ef4b3d..44c2a45 100644
> --- a/sys/amd64/amd64/mp_machdep.c
> +++ b/sys/amd64/amd64/mp_machdep.c
> @@ -90,7 +90,8 @@ extern  struct pcpu __pcpu[];
>  
>  /* AP uses this during bootstrap.  Do not staticize.  */
>  char *bootSTK;
> -static int bootAP;
> +int bootAP;
> +bool lapic_disabled = false;
>  
>  /* Free these after use */
>  void *bootstacks[MAXCPU];
> @@ -122,9 +123,12 @@ u_long *ipi_rendezvous_counts[MAXCPU];
>  static u_long *ipi_hardclock_counts[MAXCPU];
>  #endif
>  
> +int native_start_all_aps(void);
> +
>  /* Default cpu_ops implementation. */
>  struct cpu_ops cpu_ops = {
> -     .ipi_vectored = lapic_ipi_vectored
> +     .ipi_vectored = lapic_ipi_vectored,
> +     .start_all_aps = native_start_all_aps,
>  };
>  
>  extern inthand_t IDTVEC(fast_syscall), IDTVEC(fast_syscall32);
> @@ -138,7 +142,7 @@ extern int pmap_pcid_enabled;
>  static volatile cpuset_t ipi_nmi_pending;
>  
>  /* used to hold the AP's until we are ready to release them */
> -static struct mtx ap_boot_mtx;
> +struct mtx ap_boot_mtx;
>  
>  /* Set to 1 once we're ready to let the APs out of the pen. */
>  static volatile int aps_ready = 0;
> @@ -165,7 +169,6 @@ static int cpu_cores;                     /* cores per package */
>  
>  static void  assign_cpu_ids(void);
>  static void  set_interrupt_apic_ids(void);
> -static int   start_all_aps(void);
>  static int   start_ap(int apic_id);
>  static void  release_aps(void *dummy);
>  
> @@ -569,7 +572,7 @@ cpu_mp_start(void)
>       assign_cpu_ids();
>  
>       /* Start each Application Processor */
> -     start_all_aps();
> +     cpu_ops.start_all_aps();
>  
>       set_interrupt_apic_ids();
>  }
> @@ -707,7 +710,8 @@ init_secondary(void)
>       wrmsr(MSR_SF_MASK, PSL_NT|PSL_T|PSL_I|PSL_C|PSL_D);
>  
>       /* Disable local APIC just to be sure. */
> -     lapic_disable();
> +     if (!lapic_disabled)
> +             lapic_disable();
>  
>       /* signal our startup to the BSP. */
>       mp_naps++;
> @@ -733,7 +737,7 @@ init_secondary(void)
>  
>       /* A quick check from sanity claus */
>       cpuid = PCPU_GET(cpuid);
> -     if (PCPU_GET(apic_id) != lapic_id()) {
> +     if (!lapic_disabled && PCPU_GET(apic_id) != lapic_id()) {
>               printf("SMP: cpuid = %d\n", cpuid);
>               printf("SMP: actual apic_id = %d\n", lapic_id());
>               printf("SMP: correct apic_id = %d\n", PCPU_GET(apic_id));
> @@ -749,7 +753,8 @@ init_secondary(void)
>       mtx_lock_spin(&ap_boot_mtx);
>  
>       /* Init local apic for irq's */
> -     lapic_setup(1);
> +     if (!lapic_disabled)
> +             lapic_setup(1);
>  
>       /* Set memory range attributes for this CPU to match the BSP */
>       mem_range_AP_init();
> @@ -764,7 +769,7 @@ init_secondary(void)
>       if (cpu_logical > 1 && PCPU_GET(apic_id) % cpu_logical != 0)
>               CPU_SET(cpuid, &logical_cpus_mask);
>  
> -     if (bootverbose)
> +     if (!lapic_disabled && bootverbose)
>               lapic_dump("AP");
>  
>       if (smp_cpus == mp_ncpus) {
> @@ -908,8 +913,8 @@ assign_cpu_ids(void)
>  /*
>   * start each AP in our list
>   */
> -static int
> -start_all_aps(void)
> +int
> +native_start_all_aps(void)
>  {
>       vm_offset_t va = boot_address + KERNBASE;
>       u_int64_t *pt4, *pt3, *pt2;
> diff --git a/sys/amd64/include/asmacros.h b/sys/amd64/include/asmacros.h
> index 1fb592a..ce8dce4 100644
> --- a/sys/amd64/include/asmacros.h
> +++ b/sys/amd64/include/asmacros.h
> @@ -201,4 +201,30 @@
>  
>  #endif /* LOCORE */
>  
> +#ifdef __STDC__
> +#define ELFNOTE(name, type, desctype, descdata...) \
> +.pushsection .note.name                 ;       \
> +  .align 4                              ;       \
> +  .long 2f - 1f         /* namesz */    ;       \
> +  .long 4f - 3f         /* descsz */    ;       \
> +  .long type                            ;       \
> +1:.asciz #name                          ;       \
> +2:.align 4                              ;       \
> +3:desctype descdata                     ;       \
> +4:.align 4                              ;       \
> +.popsection
> +#else /* !__STDC__, i.e. -traditional */
> +#define ELFNOTE(name, type, desctype, descdata) \
> +.pushsection .note.name                 ;       \
> +  .align 4                              ;       \
> +  .long 2f - 1f         /* namesz */    ;       \
> +  .long 4f - 3f         /* descsz */    ;       \
> +  .long type                            ;       \
> +1:.asciz "name"                         ;       \
> +2:.align 4                              ;       \
> +3:desctype descdata                     ;       \
> +4:.align 4                              ;       \
> +.popsection
> +#endif /* __STDC__ */
> +
>  #endif /* !_MACHINE_ASMACROS_H_ */
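
For anyone checking the ELFNOTE macro above, the layout it emits can be reasoned about with a small userland sketch. This is only an illustration, assuming `.align 4` means a 4-byte boundary (as it does for GNU as on x86 ELF targets); the names ending in `_sketch` are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three .long fields that open every ELF note. */
struct elf_note_hdr_sketch {
	uint32_t namesz;	/* bytes in the name, including the NUL */
	uint32_t descsz;	/* bytes in the descriptor payload */
	uint32_t type;		/* note type, owner specific */
};

/* Round n up to the next multiple of 4 (assumed meaning of ".align 4"). */
static size_t
align4_sketch(size_t n)
{
	return ((n + 3) & ~(size_t)3);
}

/* Bytes one note occupies: header, padded name, padded descriptor. */
static size_t
elf_note_size_sketch(size_t namesz, size_t descsz)
{
	assert(namesz > 0);	/* .asciz always emits at least the NUL */
	return (sizeof(struct elf_note_hdr_sketch) +
	    align4_sketch(namesz) + align4_sketch(descsz));
}
```

With a name of "Xen" (4 bytes including the NUL) and an 8-byte descriptor, the sketch gives 12 + 4 + 8 = 24 bytes.
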
> diff --git a/sys/amd64/include/clock.h b/sys/amd64/include/clock.h
> index d7f7d82..e7817ab 100644
> --- a/sys/amd64/include/clock.h
> +++ b/sys/amd64/include/clock.h
> @@ -25,6 +25,12 @@ extern int smp_tsc;
>  #endif
>  
>  void i8254_init(void);
> +void i8254_delay(int);
> +#ifdef XENHVM
> +void xen_delay_init(void);
> +void xen_delay(int);
> +#endif
> +int  delay_tc(int);
>  
>  /*
>   * Driver to clock driver interface.
> diff --git a/sys/amd64/include/cpu.h b/sys/amd64/include/cpu.h
> index 3d9ff531..ed9f1db 100644
> --- a/sys/amd64/include/cpu.h
> +++ b/sys/amd64/include/cpu.h
> @@ -64,6 +64,7 @@ struct cpu_ops {
>       void (*cpu_init)(void);
>       void (*cpu_resume)(void);
>       void (*ipi_vectored)(u_int, int);
> +     int  (*start_all_aps)(void);
>  };
>  
>  extern struct        cpu_ops cpu_ops;
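
The new start_all_aps hook follows the usual ops-table pattern: a statically initialised default that early init may override once the guest type is known. A minimal userland sketch of that pattern follows, under the assumption that the override happens before the first call; every name ending in `_sketch` is hypothetical and the stubs only return a tag so the caller can tell which one ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userland stand-in for struct cpu_ops; only the new hook is shown. */
struct cpu_ops_sketch {
	int (*start_all_aps)(void);
};

/* Hypothetical stubs standing in for the native and Xen PV AP starters. */
static int native_start_sketch(void) { return (1); }
static int xen_pv_start_sketch(void) { return (2); }

/* Default table, mirroring the static initialiser in mp_machdep.c. */
static struct cpu_ops_sketch ops_sketch = {
	.start_all_aps = native_start_sketch,
};

/* Early init picks the implementation once the guest type is known. */
static void
select_ops_sketch(bool pv_guest)
{
	ops_sketch.start_all_aps = pv_guest ?
	    xen_pv_start_sketch : native_start_sketch;
}

/* The caller never names an implementation, it only uses the table. */
static int
mp_start_sketch(void)
{
	assert(ops_sketch.start_all_aps != NULL);
	return (ops_sketch.start_all_aps());
}
```

The point of the indirection is that cpu_mp_start() stays unchanged whichever starter is selected.
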
> diff --git a/sys/amd64/include/sysarch.h b/sys/amd64/include/sysarch.h
> index cd380d4..27fd3ba 100644
> --- a/sys/amd64/include/sysarch.h
> +++ b/sys/amd64/include/sysarch.h
> @@ -4,3 +4,22 @@
>  /* $FreeBSD$ */
>  
>  #include <x86/sysarch.h>
> +
> +#include <machine/pc/bios.h>
> +/*
> + * Struct containing pointers to init functions whose
> + * implementation is run time selectable.  Selection can be made,
> + * for example, based on detection of a BIOS variant or
> + * hypervisor environment.
> + */
> +struct init_ops {
> +     caddr_t (*parse_preload_data)(u_int64_t);
> +     void    (*early_delay_init)(void);
> +     void    (*early_delay)(int);
> +     void    (*fetch_e820_map)(caddr_t, struct bios_smap **, u_int32_t *);
> +#ifdef SMP
> +     u_int   (*mp_bootaddress)(u_int);
> +#endif
> +};
> +
> +extern struct init_ops init_ops;
> diff --git a/sys/amd64/include/xen/hypercall.h b/sys/amd64/include/xen/hypercall.h
> index a1b2a5c..499fb4d 100644
> --- a/sys/amd64/include/xen/hypercall.h
> +++ b/sys/amd64/include/xen/hypercall.h
> @@ -51,15 +51,8 @@
>  #define CONFIG_XEN_COMPAT    0x030002
>  #define __must_check
>  
> -#ifdef XEN
>  #define HYPERCALL_STR(name)                                  \
>       "call hypercall_page + ("STR(__HYPERVISOR_##name)" * 32)"
> -#else
> -#define HYPERCALL_STR(name)                                  \
> -     "mov $("STR(__HYPERVISOR_##name)" * 32),%%eax; "\
> -     "add hypercall_stubs(%%rip),%%rax; "                    \
> -     "call *%%rax"
> -#endif
>  
>  #define _hypercall0(type, name)                      \
>  ({                                           \
> diff --git a/sys/conf/files b/sys/conf/files
> index f3e298c..6040447 100644
> --- a/sys/conf/files
> +++ b/sys/conf/files
> @@ -2508,8 +2508,8 @@ dev/xe/if_xe_pccard.c           optional xe pccard
>  dev/xen/balloon/balloon.c    optional xen | xenhvm
>  dev/xen/blkfront/blkfront.c  optional xen | xenhvm
>  dev/xen/blkback/blkback.c    optional xen | xenhvm
> -dev/xen/console/console.c    optional xen
> -dev/xen/console/xencons_ring.c       optional xen
> +dev/xen/console/console.c    optional xen | xenhvm
> +dev/xen/console/xencons_ring.c       optional xen | xenhvm
>  dev/xen/control/control.c    optional xen | xenhvm
>  dev/xen/netback/netback.c    optional xen | xenhvm
>  dev/xen/netfront/netfront.c  optional xen | xenhvm
> diff --git a/sys/conf/files.amd64 b/sys/conf/files.amd64
> index 1914c48..bd52e8f 100644
> --- a/sys/conf/files.amd64
> +++ b/sys/conf/files.amd64
> @@ -554,5 +554,9 @@ x86/x86/mptable_pci.c             optional        mptable pci
>  x86/x86/msi.c                        optional        pci
>  x86/x86/nexus.c                      standard
>  x86/x86/tsc.c                        standard
> +x86/x86/delay.c                      standard
>  x86/xen/hvm.c                        optional        xenhvm
>  x86/xen/xen_intr.c           optional        xen | xenhvm
> +x86/xen/mptable.c            optional        xenhvm
> +x86/xen/pvcpu.c                      optional        xenhvm
> +x86/xen/pv.c                 optional        xenhvm
> diff --git a/sys/conf/files.i386 b/sys/conf/files.i386
> index e259659..15a3aae 100644
> --- a/sys/conf/files.i386
> +++ b/sys/conf/files.i386
> @@ -577,5 +577,6 @@ x86/x86/mptable_pci.c             optional apic native pci
>  x86/x86/msi.c                        optional apic pci
>  x86/x86/nexus.c                      standard
>  x86/x86/tsc.c                        standard
> +x86/x86/delay.c                      standard
>  x86/xen/hvm.c                        optional xenhvm
>  x86/xen/xen_intr.c           optional xen | xenhvm
> diff --git a/sys/dev/xen/console/console.c b/sys/dev/xen/console/console.c
> index 65a0e7d..86dc2a4 100644
> --- a/sys/dev/xen/console/console.c
> +++ b/sys/dev/xen/console/console.c
> @@ -69,11 +69,14 @@ struct mtx              cn_mtx;
>  static char wbuf[WBUF_SIZE];
>  static char rbuf[RBUF_SIZE];
>  static int rc, rp;
> -static unsigned int cnsl_evt_reg;
> +unsigned int cnsl_evt_reg;
>  static unsigned int wc, wp; /* write_cons, write_prod */
>  xen_intr_handle_t xen_intr_handle;
>  device_t xencons_dev;
>  
> +/* Virt address of the shared console page */
> +char *console_page;
> +
>  #ifdef KDB
>  static int   xc_altbrk;
>  #endif
> @@ -113,6 +116,9 @@ static struct ttydevsw xc_ttydevsw = {
>  static void
>  xc_cnprobe(struct consdev *cp)
>  {
> +     if (!xen_pv_domain())
> +             return;
> +
>       cp->cn_pri = CN_REMOTE;
>       sprintf(cp->cn_name, "%s0", driver_name);
>  }
> @@ -175,7 +181,7 @@ static void
>  xc_cnputc(struct consdev *dev, int c)
>  {
>  
> -     if (xen_start_info->flags & SIF_INITDOMAIN)
> +     if (HYPERVISOR_start_info->flags & SIF_INITDOMAIN)
>               xc_cnputc_dom0(dev, c);
>       else
>               xc_cnputc_domu(dev, c);
> @@ -206,8 +212,7 @@ xcons_putc(int c)
>               xcons_force_flush();
>  #endif               
>       }
> -     if (cnsl_evt_reg)
> -             __xencons_tx_flush();
> +     __xencons_tx_flush();
>       
>       /* inform start path that we're pretty full */
>       return ((wp - wc) >= WBUF_SIZE - 100) ? TRUE : FALSE;
> @@ -217,6 +222,10 @@ static void
>  xc_identify(driver_t *driver, device_t parent)
>  {
>       device_t child;
> +
> +     if (!xen_pv_domain())
> +             return;
> +
>       child = BUS_ADD_CHILD(parent, 0, driver_name, 0);
>       device_set_driver(child, driver);
>       device_set_desc(child, "Xen Console");
> @@ -245,7 +254,7 @@ xc_attach(device_t dev)
>       cnsl_evt_reg = 1;
>       callout_reset(&xc_callout, XC_POLLTIME, xc_timeout, xccons);
>      
> -     if (xen_start_info->flags & SIF_INITDOMAIN) {
> +     if (HYPERVISOR_start_info->flags & SIF_INITDOMAIN) {
>               error = xen_intr_bind_virq(dev, VIRQ_CONSOLE, 0, NULL,
>                                          xencons_priv_interrupt, NULL,
>                                          INTR_TYPE_TTY, &xen_intr_handle);
> @@ -309,7 +318,7 @@ __xencons_tx_flush(void)
>               sz = wp - wc;
>               if (sz > (WBUF_SIZE - WBUF_MASK(wc)))
>                       sz = WBUF_SIZE - WBUF_MASK(wc);
> -             if (xen_start_info->flags & SIF_INITDOMAIN) {
> +             if (HYPERVISOR_start_info->flags & SIF_INITDOMAIN) {
>                       HYPERVISOR_console_io(CONSOLEIO_write, sz, &wbuf[WBUF_MASK(wc)]);
>                       wc += sz;
>               } else {
> @@ -424,7 +433,7 @@ xcons_force_flush(void)
>  {
>       int        sz;
>  
> -     if (xen_start_info->flags & SIF_INITDOMAIN)
> +     if (HYPERVISOR_start_info->flags & SIF_INITDOMAIN)
>               return;
>  
>       /* Spin until console data is flushed through to the domain controller. */
> diff --git a/sys/dev/xen/console/xencons_ring.c b/sys/dev/xen/console/xencons_ring.c
> index 3701551..3046498 100644
> --- a/sys/dev/xen/console/xencons_ring.c
> +++ b/sys/dev/xen/console/xencons_ring.c
> @@ -32,9 +32,9 @@ __FBSDID("$FreeBSD$");
>  
>  #define console_evtchn       console.domU.evtchn
>  xen_intr_handle_t console_handle;
> -extern char *console_page;
>  extern struct mtx              cn_mtx;
>  extern device_t xencons_dev;
> +extern unsigned int cnsl_evt_reg;
>  
>  static inline struct xencons_interface *
>  xencons_interface(void)
> @@ -60,6 +60,7 @@ xencons_ring_send(const char *data, unsigned len)
>       struct xencons_interface *intf; 
>       XENCONS_RING_IDX cons, prod;
>       int sent;
> +     struct evtchn_send send = { .port = HYPERVISOR_start_info->console.domU.evtchn };
>  
>       intf = xencons_interface();
>       cons = intf->out_cons;
> @@ -76,7 +77,11 @@ xencons_ring_send(const char *data, unsigned len)
>       wmb();
>       intf->out_prod = prod;
>  
> -     xen_intr_signal(console_handle);
> +     if (cnsl_evt_reg)
> +             xen_intr_signal(console_handle);
> +     else
> +             HYPERVISOR_event_channel_op(EVTCHNOP_send, &send);
> +
>  
>       return sent;
>  
> @@ -125,11 +130,11 @@ xencons_ring_init(void)
>  {
>       int err;
>  
> -     if (!xen_start_info->console_evtchn)
> +     if (!HYPERVISOR_start_info->console_evtchn)
>               return 0;
>  
>       err = xen_intr_bind_local_port(xencons_dev,
> -         xen_start_info->console_evtchn, NULL, xencons_handle_input, NULL,
> +         HYPERVISOR_start_info->console_evtchn, NULL, xencons_handle_input, NULL,
>           INTR_TYPE_MISC | INTR_MPSAFE, &console_handle);
>       if (err) {
>               return err;
> @@ -145,7 +150,7 @@ void
>  xencons_suspend(void)
>  {
>  
> -     if (!xen_start_info->console_evtchn)
> +     if (!HYPERVISOR_start_info->console_evtchn)
>               return;
>  
>       xen_intr_unbind(&console_handle);
> diff --git a/sys/dev/xen/control/control.c b/sys/dev/xen/control/control.c
> index a9f8d1b..35c923d 100644
> --- a/sys/dev/xen/control/control.c
> +++ b/sys/dev/xen/control/control.c
> @@ -317,21 +317,6 @@ xctrl_suspend()
>       EVENTHANDLER_INVOKE(power_resume);
>  }
>  
> -static void
> -xen_pv_shutdown_final(void *arg, int howto)
> -{
> -     /*
> -      * Inform the hypervisor that shutdown is complete.
> -      * This is not necessary in HVM domains since Xen
> -      * emulates ACPI in that mode and FreeBSD's ACPI
> -      * support will request this transition.
> -      */
> -     if (howto & (RB_HALT | RB_POWEROFF))
> -             HYPERVISOR_shutdown(SHUTDOWN_poweroff);
> -     else
> -             HYPERVISOR_shutdown(SHUTDOWN_reboot);
> -}
> -
>  #else
>  
>  /* HVM mode suspension. */
> @@ -447,6 +432,21 @@ xctrl_halt()
>       shutdown_nice(RB_HALT);
>  }
>  
> +static void
> +xen_pv_shutdown_final(void *arg, int howto)
> +{
> +     /*
> +      * Inform the hypervisor that shutdown is complete.
> +      * This is not necessary in HVM domains since Xen
> +      * emulates ACPI in that mode and FreeBSD's ACPI
> +      * support will request this transition.
> +      */
> +     if (howto & (RB_HALT | RB_POWEROFF))
> +             HYPERVISOR_shutdown(SHUTDOWN_poweroff);
> +     else
> +             HYPERVISOR_shutdown(SHUTDOWN_reboot);
> +}
> +
>  /*------------------------------ Event Reception -----------------------------*/
>  static void
>  xctrl_on_watch_event(struct xs_watch *watch, const char **vec, unsigned int len)
> @@ -529,10 +529,9 @@ xctrl_attach(device_t dev)
>       xctrl->xctrl_watch.callback_data = (uintptr_t)xctrl;
>       xs_register_watch(&xctrl->xctrl_watch);
>  
> -#ifndef XENHVM
> -     EVENTHANDLER_REGISTER(shutdown_final, xen_pv_shutdown_final, NULL,
> -                           SHUTDOWN_PRI_LAST);
> -#endif
> +     if (xen_pv_domain())
> +             EVENTHANDLER_REGISTER(shutdown_final, xen_pv_shutdown_final, NULL,
> +                                   SHUTDOWN_PRI_LAST);
>  
>       return (0);
>  }
> diff --git a/sys/dev/xen/timer/timer.c b/sys/dev/xen/timer/timer.c
> index 824c75b..13bd852 100644
> --- a/sys/dev/xen/timer/timer.c
> +++ b/sys/dev/xen/timer/timer.c
> @@ -59,6 +59,9 @@ __FBSDID("$FreeBSD$");
>  #include <machine/_inttypes.h>
>  #include <machine/smp.h>
>  
> +/* For the declaration of clock_lock */
> +#include <isa/rtc.h>
> +
>  #include "clock_if.h"
>  
>  static devclass_t xentimer_devclass;
> @@ -234,18 +237,16 @@ xen_fetch_vcpu_tinfo(struct vcpu_time_info *dst, struct vcpu_time_info *src)
>   *       it happens to be less than another CPU's previously determined value.
>   */
>  static uint64_t
> -xen_fetch_vcpu_time(void)
> +xen_fetch_vcpu_time(struct vcpu_info *vcpu)
>  {
>       struct vcpu_time_info dst;
>       struct vcpu_time_info *src;
>       uint32_t pre_version;
>       uint64_t now;
>       volatile uint64_t last;
> -     struct vcpu_info *vcpu = DPCPU_GET(vcpu_info);
>  
>       src = &vcpu->time;
>  
> -     critical_enter();
>       do {
>               pre_version = xen_fetch_vcpu_tinfo(&dst, src);
>               barrier();
> @@ -266,16 +267,19 @@ xen_fetch_vcpu_time(void)
>               }
>       } while (!atomic_cmpset_64(&xen_timer_last_time, last, now));
>  
> -     critical_exit();
> -
>       return (now);
>  }
>  
>  static uint32_t
>  xentimer_get_timecount(struct timecounter *tc)
>  {
> +     uint32_t xen_time;
> +
> +     critical_enter();
> +     xen_time = (uint32_t)xen_fetch_vcpu_time(DPCPU_GET(vcpu_info)) & UINT_MAX;
> +     critical_exit();
>  
> -     return ((uint32_t)xen_fetch_vcpu_time() & UINT_MAX);
> +     return (xen_time);
>  }
>  
>  /**
> @@ -305,7 +309,12 @@ xen_fetch_wallclock(struct timespec *ts)
>  static void
>  xen_fetch_uptime(struct timespec *ts)
>  {
> -     uint64_t uptime = xen_fetch_vcpu_time();
> +     uint64_t uptime;
> +
> +     critical_enter();
> +     uptime = xen_fetch_vcpu_time(DPCPU_GET(vcpu_info));
> +     critical_exit();
> +
>       ts->tv_sec = uptime / NSEC_IN_SEC;
>       ts->tv_nsec = uptime % NSEC_IN_SEC;
>  }
> @@ -354,7 +363,7 @@ xentimer_intr(void *arg)
>       struct xentimer_softc *sc = (struct xentimer_softc *)arg;
>       struct xentimer_pcpu_data *pcpu = DPCPU_PTR(xentimer_pcpu);
>  
> -     pcpu->last_processed = xen_fetch_vcpu_time();
> +     pcpu->last_processed = xen_fetch_vcpu_time(DPCPU_GET(vcpu_info));
>       if (pcpu->timer != 0 && sc->et.et_active)
>               sc->et.et_event_cb(&sc->et, sc->et.et_arg);
>  
> @@ -415,7 +424,9 @@ xentimer_et_start(struct eventtimer *et,
>       do {
>               if (++i == 60)
>                       panic("can't schedule timer");
> -             next_time = xen_fetch_vcpu_time() + first_in_ns;
> +             critical_enter();
> +             next_time = xen_fetch_vcpu_time(DPCPU_GET(vcpu_info)) + first_in_ns;
> +             critical_exit();
>               error = xentimer_vcpu_start_timer(cpu, next_time);
>       } while (error == -ETIME);
>  
> @@ -573,6 +584,36 @@ xentimer_suspend(device_t dev)
>       return (0);
>  }
>  
> +/*
> + * Xen delay early init
> + */
> +void xen_delay_init(void)
> +{
> +     /* Init the clock lock */
> +     mtx_init(&clock_lock, "clk", NULL, MTX_SPIN | MTX_NOPROFILE);
> +}
> +/*
> + * Xen PV DELAY function
> + *
> + * When running in PVH mode we don't have an emulated i8254, so
> + * make use of the Xen time info in order to code a simple DELAY
> + * function that can be used during early boot.
> + */
> +void xen_delay(int n)
> +{
> +     uint64_t end_ns;
> +     uint64_t current;
> +
> +     end_ns = xen_fetch_vcpu_time(&HYPERVISOR_shared_info->vcpu_info[0]);
> +     end_ns += n * NSEC_IN_USEC;
> +
> +     for (;;) {
> +             current = xen_fetch_vcpu_time(&HYPERVISOR_shared_info->vcpu_info[0]);
> +             if (current >= end_ns)
> +                     break;
> +     }
> +}
> +
>  static device_method_t xentimer_methods[] = {
>       DEVMETHOD(device_identify, xentimer_identify),
>       DEVMETHOD(device_probe, xentimer_probe),
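
The xen_delay() loop added above is a plain deadline spin on a nanosecond clock. The same shape can be tried in userland; the sketch below assumes POSIX clock_gettime() with CLOCK_MONOTONIC as a stand-in for the Xen vcpu time info, and all names ending in `_sketch` or `_SKETCH` are hypothetical.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_USEC_SKETCH	1000ULL
#define NSEC_PER_SEC_SKETCH	1000000000ULL

/* Monotonic time in nanoseconds; stands in for xen_fetch_vcpu_time(). */
static uint64_t
now_ns_sketch(void)
{
	struct timespec ts;
	int rc;

	rc = clock_gettime(CLOCK_MONOTONIC, &ts);
	assert(rc == 0);
	return ((uint64_t)ts.tv_sec * NSEC_PER_SEC_SKETCH +
	    (uint64_t)ts.tv_nsec);
}

/* Spin until at least usec microseconds have elapsed. */
static void
delay_us_sketch(int usec)
{
	uint64_t end_ns;

	if (usec <= 0)
		return;
	/* Widen before multiplying so the deadline is computed in 64 bits. */
	end_ns = now_ns_sketch() + (uint64_t)usec * NSEC_PER_USEC_SKETCH;
	while (now_ns_sketch() < end_ns)
		;
}
```

The deadline is computed once up front, so the loop body is a single clock read and comparison.
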
> diff --git a/sys/i386/i386/locore.s b/sys/i386/i386/locore.s
> index 68cb430..bd136b1 100644
> --- a/sys/i386/i386/locore.s
> +++ b/sys/i386/i386/locore.s
> @@ -898,3 +898,12 @@ done_pde:
>  #endif
>  
>       ret
> +
> +#ifdef XENHVM
> +/* Xen Hypercall page */
> +     .text
> +.p2align PAGE_SHIFT, 0x90    /* Hypercall_page needs to be PAGE aligned */
> +
> +NON_GPROF_ENTRY(hypercall_page)
> +     .skip   0x1000, 0x90    /* Fill with "nop"s */
> +#endif
> diff --git a/sys/i386/i386/machdep.c b/sys/i386/i386/machdep.c
> index c430316..8bd9a8e 100644
> --- a/sys/i386/i386/machdep.c
> +++ b/sys/i386/i386/machdep.c
> @@ -254,6 +254,15 @@ struct mtx icu_lock;
>  
>  struct mem_range_softc mem_range_softc;
>  
> +void
> +DELAY(int n)
> +{
> +     if (delay_tc(n))
> +             return;
> +
> +     i8254_delay(n);
> +}
> +
>  static void
>  cpu_startup(dummy)
>       void *dummy;
> diff --git a/sys/i386/include/clock.h b/sys/i386/include/clock.h
> index d980ec7..287b2c8 100644
> --- a/sys/i386/include/clock.h
> +++ b/sys/i386/include/clock.h
> @@ -22,6 +22,12 @@ extern int tsc_is_invariant;
>  extern int   tsc_perf_stat;
>  
>  void i8254_init(void);
> +void i8254_delay(int);
> +#ifdef XENHVM
> +void xen_delay_init(void);
> +void xen_delay(int);
> +#endif
> +int  delay_tc(int);
>  
>  /*
>   * Driver to clock driver interface.
> diff --git a/sys/i386/include/xen/hypercall.h b/sys/i386/include/xen/hypercall.h
> index edc13f4..1c15b0f 100644
> --- a/sys/i386/include/xen/hypercall.h
> +++ b/sys/i386/include/xen/hypercall.h
> @@ -40,15 +40,8 @@
>  #define CONFIG_XEN_COMPAT    0x030002
>  
>  
> -#if defined(XEN)
>  #define HYPERCALL_STR(name)                                     \
>          "call hypercall_page + ("STR(__HYPERVISOR_##name)" * 32)"
> -#else
> -#define HYPERCALL_STR(name)                                     \
> -        "mov hypercall_stubs,%%eax; "                           \
> -        "add $("STR(__HYPERVISOR_##name)" * 32),%%eax; "        \
> -        "call *%%eax"
> -#endif
>  
>  #define _hypercall0(type, name)                 \
>  ({                                              \
> diff --git a/sys/i386/xen/xen_machdep.c b/sys/i386/xen/xen_machdep.c
> index 7049be6..1b1c74d 100644
> --- a/sys/i386/xen/xen_machdep.c
> +++ b/sys/i386/xen/xen_machdep.c
> @@ -89,6 +89,7 @@ IDTVEC(div), IDTVEC(dbg), IDTVEC(nmi), IDTVEC(bpt), IDTVEC(ofl),
>  
>  int xendebug_flags; 
>  start_info_t *xen_start_info;
> +start_info_t *HYPERVISOR_start_info;
>  shared_info_t *HYPERVISOR_shared_info;
>  xen_pfn_t *xen_machine_phys = machine_to_phys_mapping;
>  xen_pfn_t *xen_phys_machine;
> @@ -744,7 +745,7 @@ void initvalues(start_info_t *startinfo);
>  struct xenstore_domain_interface;
>  extern struct xenstore_domain_interface *xen_store;
>  
> -char *console_page;
> +extern char *console_page;
>  
>  void *
>  bootmem_alloc(unsigned int size) 
> @@ -927,6 +928,7 @@ initvalues(start_info_t *startinfo)
>       HYPERVISOR_vm_assist(VMASST_CMD_enable, VMASST_TYPE_4gb_segments_notify);
>  #endif       
>       xen_start_info = startinfo;
> +     HYPERVISOR_start_info = startinfo;
>       xen_phys_machine = (xen_pfn_t *)startinfo->mfn_list;
>  
>       IdlePTD = (pd_entry_t *)((uint8_t *)startinfo->pt_base + PAGE_SIZE);
> diff --git a/sys/x86/isa/clock.c b/sys/x86/isa/clock.c
> index a12e175..a5aed1c 100644
> --- a/sys/x86/isa/clock.c
> +++ b/sys/x86/isa/clock.c
> @@ -247,61 +247,13 @@ getit(void)
>       return ((high << 8) | low);
>  }
>  
> -#ifndef DELAYDEBUG
> -static u_int
> -get_tsc(__unused struct timecounter *tc)
> -{
> -
> -     return (rdtsc32());
> -}
> -
> -static __inline int
> -delay_tc(int n)
> -{
> -     struct timecounter *tc;
> -     timecounter_get_t *func;
> -     uint64_t end, freq, now;
> -     u_int last, mask, u;
> -
> -     tc = timecounter;
> -     freq = atomic_load_acq_64(&tsc_freq);
> -     if (tsc_is_invariant && freq != 0) {
> -             func = get_tsc;
> -             mask = ~0u;
> -     } else {
> -             if (tc->tc_quality <= 0)
> -                     return (0);
> -             func = tc->tc_get_timecount;
> -             mask = tc->tc_counter_mask;
> -             freq = tc->tc_frequency;
> -     }
> -     now = 0;
> -     end = freq * n / 1000000;
> -     if (func == get_tsc)
> -             sched_pin();
> -     last = func(tc) & mask;
> -     do {
> -             cpu_spinwait();
> -             u = func(tc) & mask;
> -             if (u < last)
> -                     now += mask - last + u + 1;
> -             else
> -                     now += u - last;
> -             last = u;
> -     } while (now < end);
> -     if (func == get_tsc)
> -             sched_unpin();
> -     return (1);
> -}
> -#endif
> -
>  /*
>   * Wait "n" microseconds.
>   * Relies on timer 1 counting down from (i8254_freq / hz)
>   * Note: timer had better have been programmed before this is first used!
>   */
>  void
> -DELAY(int n)
> +i8254_delay(int n)
>  {
>       int delta, prev_tick, tick, ticks_left;
>  #ifdef DELAYDEBUG
> @@ -317,9 +269,6 @@ DELAY(int n)
>       }
>       if (state == 1)
>               printf("DELAY(%d)...", n);
> -#else
> -     if (delay_tc(n))
> -             return;
>  #endif
>       /*
>        * Read the counter first, so that the rest of the setup overhead is
> diff --git a/sys/x86/x86/delay.c b/sys/x86/x86/delay.c
> new file mode 100644
> index 0000000..7ea70b1
> --- /dev/null
> +++ b/sys/x86/x86/delay.c
> @@ -0,0 +1,95 @@
> +/*-
> + * Copyright (c) 1990 The Regents of the University of California.
> + * Copyright (c) 2010 Alexander Motin <mav@xxxxxxxxxxx>
> + * All rights reserved.
> + *
> + * This code is derived from software contributed to Berkeley by
> + * William Jolitz and Don Ahn.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 4. Neither the name of the University nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + *
> + *   from: @(#)clock.c       7.2 (Berkeley) 5/12/91
> + */
> +
> +#include <sys/cdefs.h>
> +__FBSDID("$FreeBSD$");
> +
> +/* Generic x86 routines to handle delay */
> +
> +#include <sys/param.h>
> +#include <sys/systm.h>
> +#include <sys/timetc.h>
> +#include <sys/proc.h>
> +#include <sys/kernel.h>
> +#include <sys/sched.h>
> +
> +#include <machine/clock.h>
> +#include <machine/cpu.h>
> +
> +static u_int
> +get_tsc(__unused struct timecounter *tc)
> +{
> +
> +     return (rdtsc32());
> +}
> +
> +int
> +delay_tc(int n)
> +{
> +     struct timecounter *tc;
> +     timecounter_get_t *func;
> +     uint64_t end, freq, now;
> +     u_int last, mask, u;
> +
> +     tc = timecounter;
> +     freq = atomic_load_acq_64(&tsc_freq);
> +     if (tsc_is_invariant && freq != 0) {
> +             func = get_tsc;
> +             mask = ~0u;
> +     } else {
> +             if (tc->tc_quality <= 0)
> +                     return (0);
> +             func = tc->tc_get_timecount;
> +             mask = tc->tc_counter_mask;
> +             freq = tc->tc_frequency;
> +     }
> +     now = 0;
> +     end = freq * n / 1000000;
> +     if (func == get_tsc)
> +             sched_pin();
> +     last = func(tc) & mask;
> +     do {
> +             cpu_spinwait();
> +             u = func(tc) & mask;
> +             if (u < last)
> +                     now += mask - last + u + 1;
> +             else
> +                     now += u - last;
> +             last = u;
> +     } while (now < end);
> +     if (func == get_tsc)
> +             sched_unpin();
> +     return (1);
> +}
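
The core of delay_tc() is the wraparound-safe accumulation of ticks from a masked counter. Pulled out on its own, it can be sketched as below. This is an illustration only, assuming the mask has the form 2^k - 1 as timecounter masks do; the helper names are hypothetical and do not exist in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ticks elapsed between two successive reads of a counter that wraps
 * at mask + 1. Both reads are assumed to be already masked.
 */
static uint64_t
ticks_elapsed_sketch(uint32_t last, uint32_t u, uint32_t mask)
{
	assert(last <= mask && u <= mask);
	if (u < last)		/* the counter wrapped once in between */
		return ((uint64_t)mask - last + u + 1);
	return ((uint64_t)u - last);
}

/* Number of ticks to wait for usec microseconds at freq_hz. */
static uint64_t
ticks_for_usec_sketch(uint64_t freq_hz, int usec)
{
	assert(usec >= 0);
	return (freq_hz * (uint64_t)usec / 1000000);
}
```

For example, with a 16-bit counter going from 0xfff0 to 0x0010 the first helper gives 32 ticks, and a 1 GHz counter needs 5000 ticks for 5 microseconds. The scheme only works if the counter is polled more often than it wraps, which the busy loop guarantees.
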
> diff --git a/sys/x86/x86/local_apic.c b/sys/x86/x86/local_apic.c
> index 8c8eef6..d8d7701 100644
> --- a/sys/x86/x86/local_apic.c
> +++ b/sys/x86/x86/local_apic.c
> @@ -1368,9 +1368,13 @@ apic_setup_io(void *dummy __unused)
>       if (retval != 0)
>               printf("%s: Failed to setup I/O APICs: returned %d\n",
>                   best_enum->apic_name, retval);
> -#ifdef XEN
> -     return;
> +
> +#if defined(XEN) || defined(XENHVM)
> +     /* There's no lapic on PV Xen */
> +     if (xen_pv_domain())
> +             return;
>  #endif
> +
>       /*
>        * Finish setting up the local APIC on the BSP once we know how to
>        * properly program the LINT pins.
> diff --git a/sys/x86/xen/hvm.c b/sys/x86/xen/hvm.c
> index 72811dc..be15594 100644
> --- a/sys/x86/xen/hvm.c
> +++ b/sys/x86/xen/hvm.c
> @@ -35,15 +35,21 @@ __FBSDID("$FreeBSD$");
>  #include <sys/proc.h>
>  #include <sys/smp.h>
>  #include <sys/systm.h>
> +#include <sys/lock.h>
> +#include <sys/mutex.h>
> +#include <sys/reboot.h>
>  
>  #include <vm/vm.h>
>  #include <vm/pmap.h>
> +#include <vm/vm_kern.h>
> +#include <vm/vm_extern.h>
>  
>  #include <dev/pci/pcivar.h>
>  
>  #include <machine/cpufunc.h>
>  #include <machine/cpu.h>
>  #include <machine/smp.h>
> +#include <machine/stdarg.h>
>  
>  #include <x86/apicreg.h>
>  
> @@ -52,6 +58,9 @@ __FBSDID("$FreeBSD$");
>  #include <xen/gnttab.h>
>  #include <xen/hypervisor.h>
>  #include <xen/hvm.h>
> +#ifdef __amd64__
> +#include <xen/pv.h>
> +#endif
>  #include <xen/xen_intr.h>
>  
>  #include <xen/interface/hvm/params.h>
> @@ -97,6 +106,11 @@ extern void pmap_lazyfix_action(void);
>  /* Variables used by mp_machdep to perform the bitmap IPI */
>  extern volatile u_int cpu_ipi_pending[MAXCPU];
>  
> +#ifdef __amd64__
> +/* Native AP start used on PVHVM */
> +extern int native_start_all_aps(void);
> +#endif
> +
>  /*---------------------------------- Macros ----------------------------------*/
>  #define      IPI_TO_IDX(ipi) ((ipi) - APIC_IPI_INTS)
>  
> @@ -119,7 +133,10 @@ enum xen_domain_type xen_domain_type = XEN_NATIVE;
>  struct cpu_ops xen_hvm_cpu_ops = {
>       .ipi_vectored   = lapic_ipi_vectored,
>       .cpu_init       = xen_hvm_cpu_init,
> -     .cpu_resume     = xen_hvm_cpu_resume
> +     .cpu_resume     = xen_hvm_cpu_resume,
> +#ifdef __amd64__
> +     .start_all_aps = native_start_all_aps,
> +#endif
>  };
>  
>  static MALLOC_DEFINE(M_XENHVM, "xen_hvm", "Xen HVM PV Support");
> @@ -157,8 +174,9 @@ DPCPU_DEFINE(xen_intr_handle_t, ipi_handle[nitems(xen_ipis)]);
>  
>  /*------------------ Hypervisor Access Shared Memory Regions -----------------*/
>  /** Hypercall table accessed via HYPERVISOR_*_op() methods. */
> -char *hypercall_stubs;
> +extern char *hypercall_page;
>  shared_info_t *HYPERVISOR_shared_info;
> +start_info_t *HYPERVISOR_start_info;
>  
>  #ifdef SMP
>  /*---------------------------- XEN PV IPI Handlers ---------------------------*/
> @@ -522,7 +540,7 @@ xen_setup_cpus(void)
>  {
>       int i;
>  
> -     if (!xen_hvm_domain() || !xen_vector_callback_enabled)
> +     if (!xen_vector_callback_enabled)
>               return;
>  
>  #ifdef __amd64__
> @@ -558,7 +576,7 @@ xen_hvm_cpuid_base(void)
>   * Allocate and fill in the hypcall page.
>   */
>  static int
> -xen_hvm_init_hypercall_stubs(void)
> +xen_hvm_init_hypercall_stubs(enum xen_hvm_init_type init_type)
>  {
>       uint32_t base, regs[4];
>       int i;
> @@ -567,7 +585,7 @@ xen_hvm_init_hypercall_stubs(void)
>       if (base == 0)
>               return (ENXIO);
>  
> -     if (hypercall_stubs == NULL) {
> +     if (init_type == XEN_HVM_INIT_COLD) {
>               do_cpuid(base + 1, regs);
>               printf("XEN: Hypervisor version %d.%d detected.\n",
>                   regs[0] >> 16, regs[0] & 0xffff);
> @@ -577,18 +595,9 @@ xen_hvm_init_hypercall_stubs(void)
>        * Find the hypercall pages.
>        */
>       do_cpuid(base + 2, regs);
> -     
> -     if (hypercall_stubs == NULL) {
> -             size_t call_region_size;
> -
> -             call_region_size = regs[0] * PAGE_SIZE;
> -             hypercall_stubs = malloc(call_region_size, M_XENHVM, M_NOWAIT);
> -             if (hypercall_stubs == NULL)
> -                     panic("Unable to allocate Xen hypercall region");
> -     }
>  
>       for (i = 0; i < regs[0]; i++)
> -             wrmsr(regs[1], vtophys(hypercall_stubs + i * PAGE_SIZE) + i);
> +             wrmsr(regs[1], vtophys(&hypercall_page + i * PAGE_SIZE) + i);
>  
>       return (0);
>  }
> @@ -677,8 +686,6 @@ xen_hvm_disable_emulated_devices(void)
>       if (inw(XEN_MAGIC_IOPORT) != XMI_MAGIC)
>               return;
>  
> -     if (bootverbose)
> -             printf("XEN: Disabling emulated block and network devices\n");
>       outw(XEN_MAGIC_IOPORT, XMI_UNPLUG_IDE_DISKS|XMI_UNPLUG_NICS);
>  }
>  
> @@ -691,7 +698,12 @@ xen_hvm_init(enum xen_hvm_init_type init_type)
>       if (init_type == XEN_HVM_INIT_CANCELLED_SUSPEND)
>               return;
>  
> -     error = xen_hvm_init_hypercall_stubs();
> +     if (xen_pv_domain()) {
> +             /* hypercall page is already set in the PV case */
> +             error = 0;
> +     } else {
> +             error = xen_hvm_init_hypercall_stubs(init_type);
> +     }
>  
>       switch (init_type) {
>       case XEN_HVM_INIT_COLD:
> @@ -701,6 +713,12 @@ xen_hvm_init(enum xen_hvm_init_type init_type)
>               setup_xen_features();
>               cpu_ops = xen_hvm_cpu_ops;
>               vm_guest = VM_GUEST_XEN;
> +#ifdef __amd64__
> +             if (xen_pv_domain())
> +                     cpu_ops.start_all_aps = xen_pv_start_all_aps;
> +             else
> +#endif
> +                     printf("XEN: Disabling emulated block and network devices\n");
>               break;
>       case XEN_HVM_INIT_RESUME:
>               if (error != 0)
> @@ -715,10 +733,13 @@ xen_hvm_init(enum xen_hvm_init_type init_type)
>       }
>  
>       xen_vector_callback_enabled = 0;
> -     xen_domain_type = XEN_HVM_DOMAIN;
> -     xen_hvm_init_shared_info_page();
>       xen_hvm_set_callback(NULL);
> -     xen_hvm_disable_emulated_devices();
> +
> +     if (!xen_pv_domain()) {
> +             xen_domain_type = XEN_HVM_DOMAIN;
> +             xen_hvm_init_shared_info_page();
> +             xen_hvm_disable_emulated_devices();
> +     }
>  } 
>  
>  void
> @@ -749,10 +770,11 @@ xen_set_vcpu_id(void)
>       struct pcpu *pc;
>       int i;
>  
> -     /* Set vcpu_id to acpi_id */
> +     /* Set vcpu_id to acpi_id for PVHVM guests */
>       CPU_FOREACH(i) {
>               pc = pcpu_find(i);
> -             pc->pc_vcpu_id = pc->pc_acpi_id;
> +             if (xen_hvm_domain())
> +                     pc->pc_vcpu_id = pc->pc_acpi_id;
>               if (bootverbose)
>                       printf("XEN: CPU %u has VCPU ID %u\n",
>                              i, pc->pc_vcpu_id);
> @@ -790,6 +812,31 @@ xen_hvm_cpu_init(void)
>               DPCPU_SET(vcpu_info, vcpu_info);
>  }
>  
> +/*----------------------------- Debug functions ------------------------------*/
> +#define PRINTK_BUFSIZE 1024
> +static int
> +vprintk(const char *fmt, __va_list ap)
> +{
> +     int retval, len;
> +     static char buf[PRINTK_BUFSIZE];
> +
> +     retval = vsnprintf(buf, PRINTK_BUFSIZE - 1, fmt, ap);
> +     buf[retval] = 0;
> +     len = strlen(buf);
> +     retval = HYPERVISOR_console_io(CONSOLEIO_write, len, (char *)buf);
> +     return retval;
> +}
> +
> +void
> +xen_early_printf(const char *fmt, ...)
> +{
> +     __va_list ap;
> +
> +     va_start(ap, fmt);
> +     vprintk(fmt, ap);
> +     va_end(ap);
> +}
> +
>  SYSINIT(xen_hvm_init, SI_SUB_HYPERVISOR, SI_ORDER_FIRST, xen_hvm_sysinit, NULL);
>  #ifdef SMP
>  SYSINIT(xen_setup_cpus, SI_SUB_SMP, SI_ORDER_FIRST, xen_setup_cpus, NULL);
> diff --git a/sys/x86/xen/mptable.c b/sys/x86/xen/mptable.c
> new file mode 100644
> index 0000000..8916314
> --- /dev/null
> +++ b/sys/x86/xen/mptable.c
> @@ -0,0 +1,136 @@
> +/*-
> + * Copyright (c) 2003 John Baldwin <jhb@xxxxxxxxxxx>
> + * Copyright (c) 2013 Roger Pau Monné <roger.pau@xxxxxxxxxx>
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of the author nor the names of any co-contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +#include <sys/cdefs.h>
> +__FBSDID("$FreeBSD$");
> +
> +#include <sys/param.h>
> +#include <sys/systm.h>
> +#include <sys/bus.h>
> +#include <sys/kernel.h>
> +#include <sys/smp.h>
> +#include <sys/pcpu.h>
> +#include <vm/vm.h>
> +#include <vm/pmap.h>
> +
> +#include <machine/intr_machdep.h>
> +#include <machine/apicvar.h>
> +
> +#include <machine/cpu.h>
> +#include <machine/smp.h>
> +
> +#include <xen/xen-os.h>
> +#include <xen/hypervisor.h>
> +
> +#include <xen/interface/vcpu.h>
> +
> +static int xenpv_probe(void);
> +static int xenpv_probe_cpus(void);
> +static int xenpv_setup_local(void);
> +static int xenpv_setup_io(void);
> +
> +static struct apic_enumerator xenpv_enumerator = {
> +     "Xen PV",
> +     xenpv_probe,
> +     xenpv_probe_cpus,
> +     xenpv_setup_local,
> +     xenpv_setup_io
> +};
> +
> +/*
> + * Probe for the Xen PV enumerator; always reports the same fixed priority.
> + */
> +static int
> +xenpv_probe(void)
> +{
> +     return (-100);
> +}
> +
> +/*
> + * Enumerate vCPUs by asking the hypervisor which vCPU IDs are valid.
> + */
> +static int
> +xenpv_probe_cpus(void)
> +{
> +     int i, ret;
> +
> +     for (i = 0; i < MAXCPU; i++) {
> +             ret = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL);
> +             if (ret >= 0)
> +                     cpu_add((i * 2), (i == 0));
> +     }
> +
> +     return (0);
> +}
> +
> +/*
> + * Set the vCPU ID of the BSP.
> + */
> +static int
> +xenpv_setup_local(void)
> +{
> +     PCPU_SET(vcpu_id, 0);
> +     return (0);
> +}
> +
> +/*
> + * Nothing to do: no I/O APICs are enumerated here.
> + */
> +static int
> +xenpv_setup_io(void)
> +{
> +     return (0);
> +}
> +
> +static void
> +xenpv_register(void *dummy __unused)
> +{
> +     if (xen_pv_domain()) {
> +             apic_register_enumerator(&xenpv_enumerator);
> +     }
> +}
> +SYSINIT(xenpv_register, SI_SUB_TUNABLES - 1, SI_ORDER_FIRST, xenpv_register, NULL);
> +
> +/*
> + * Set up per-CPU vCPU IDs.
> + */
> +static void
> +xenpv_set_ids(void *dummy)
> +{
> +     struct pcpu *pc;
> +     int i;
> +
> +     CPU_FOREACH(i) {
> +             pc = pcpu_find(i);
> +             pc->pc_vcpu_id = i;
> +     }
> +     return;
> +}
> +SYSINIT(xenpv_set_ids, SI_SUB_CPU, SI_ORDER_MIDDLE, xenpv_set_ids, NULL);
> diff --git a/sys/x86/xen/pv.c b/sys/x86/xen/pv.c
> new file mode 100644
> index 0000000..6756dec
> --- /dev/null
> +++ b/sys/x86/xen/pv.c
> @@ -0,0 +1,247 @@
> +/*
> + * Copyright (c) 2004 Christian Limpach.
> + * Copyright (c) 2004-2006,2008 Kip Macy
> + * Copyright (c) 2013 Roger Pau Monné <roger.pau@xxxxxxxxxx>
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +#include <sys/cdefs.h>
> +__FBSDID("$FreeBSD$");
> +
> +#include <sys/param.h>
> +#include <sys/bus.h>
> +#include <sys/kernel.h>
> +#include <sys/malloc.h>
> +#include <sys/proc.h>
> +#include <sys/smp.h>
> +#include <sys/systm.h>
> +#include <sys/lock.h>
> +#include <sys/mutex.h>
> +#include <sys/reboot.h>
> +
> +#include <vm/vm.h>
> +#include <vm/pmap.h>
> +#include <vm/vm_kern.h>
> +#include <vm/vm_extern.h>
> +
> +#include <dev/pci/pcivar.h>
> +
> +#include <machine/cpufunc.h>
> +#include <machine/cpu.h>
> +#include <machine/smp.h>
> +#include <machine/tss.h>
> +#include <machine/sysarch.h>
> +#include <machine/clock.h>
> +
> +#include <x86/apicreg.h>
> +
> +#include <xen/xen-os.h>
> +#include <xen/features.h>
> +#include <xen/gnttab.h>
> +#include <xen/hypervisor.h>
> +#include <xen/hvm.h>
> +#include <xen/pv.h>
> +#include <xen/xen_intr.h>
> +
> +#include <xen/interface/hvm/params.h>
> +#include <xen/interface/vcpu.h>
> +
> +#define MAX_E820_ENTRIES     128
> +
> +/*--------------------------- Forward Declarations ---------------------------*/
> +static caddr_t xen_pv_parse_preload_data(u_int64_t);
> +static void xen_pv_fetch_e820_map(caddr_t, struct bios_smap **, u_int32_t *);
> +
> +/*---------------------------- Extern Declarations ---------------------------*/
> +/* Variables used by amd64 mp_machdep to start APs */
> +extern struct mtx ap_boot_mtx;
> +extern void *bootstacks[];
> +extern char *doublefault_stack;
> +extern char *nmi_stack;
> +extern void *dpcpu;
> +extern int bootAP;
> +extern char *bootSTK;
> +extern bool lapic_disabled;
> +
> +/*-------------------------------- Global Data -------------------------------*/
> +/* Xen init_ops implementation. */
> +struct init_ops xen_init_ops = {
> +     .parse_preload_data =   xen_pv_parse_preload_data,
> +     .early_delay_init =     xen_delay_init,
> +     .early_delay =          xen_delay,
> +     .fetch_e820_map =       xen_pv_fetch_e820_map,
> +};
> +
> +static struct
> +{
> +     const char      *ev;
> +     int             mask;
> +} howto_names[] = {
> +     {"boot_askname",        RB_ASKNAME},
> +     {"boot_single",         RB_SINGLE},
> +     {"boot_nosync",         RB_NOSYNC},
> +     {"boot_halt",           RB_HALT},
> +     {"boot_serial",         RB_SERIAL},
> +     {"boot_cdrom",          RB_CDROM},
> +     {"boot_gdb",            RB_GDB},
> +     {"boot_gdb_pause",      RB_RESERVED1},
> +     {"boot_verbose",        RB_VERBOSE},
> +     {"boot_multicons",      RB_MULTIPLE},
> +     {NULL,  0}
> +};
> +
> +static struct bios_smap xen_smap[MAX_E820_ENTRIES];
> +
> +static int
> +start_xen_ap(int cpu)
> +{
> +     struct vcpu_guest_context *ctxt;
> +     int ms, cpus = mp_naps;
> +
> +     ctxt = malloc(sizeof(*ctxt), M_TEMP, M_NOWAIT | M_ZERO);
> +     if (ctxt == NULL)
> +             panic("unable to allocate memory");
> +
> +     ctxt->flags = VGCF_IN_KERNEL;
> +     ctxt->user_regs.rip = (unsigned long) init_secondary;
> +     ctxt->user_regs.rsp = (unsigned long) bootSTK;
> +
> +     /* Set the CPU to use the same page tables and CR4 value */
> +     ctxt->ctrlreg[3] = KPML4phys;
> +     ctxt->ctrlreg[4] = rcr4();
> +
> +     if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
> +             panic("unable to initialize CPU#%d\n", cpu);
> +
> +     free(ctxt, M_TEMP);
> +
> +     /* Launch the vCPU */
> +     if (HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL))
> +             panic("unable to start AP#%d\n", cpu);
> +
> +     /* Wait up to 5 seconds for it to start. */
> +     for (ms = 0; ms < 5000; ms++) {
> +             if (mp_naps > cpus)
> +                     return 1;       /* return SUCCESS */
> +             DELAY(1000);
> +     }
> +
> +     return 0;
> +}
> +
> +int
> +xen_pv_start_all_aps(void)
> +{
> +     int cpu;
> +
> +     mtx_init(&ap_boot_mtx, "ap boot", NULL, MTX_SPIN);
> +     lapic_disabled = true;
> +
> +     for (cpu = 1; cpu < mp_ncpus; cpu++) {
> +
> +             /* allocate and set up an idle stack data page */
> +             bootstacks[cpu] = (void *)kmem_malloc(kernel_arena,
> +                 KSTACK_PAGES * PAGE_SIZE, M_WAITOK | M_ZERO);
> +             doublefault_stack = (char *)kmem_malloc(kernel_arena,
> +                 PAGE_SIZE, M_WAITOK | M_ZERO);
> +             nmi_stack = (char *)kmem_malloc(kernel_arena, PAGE_SIZE,
> +                 M_WAITOK | M_ZERO);
> +             dpcpu = (void *)kmem_malloc(kernel_arena, DPCPU_SIZE,
> +                 M_WAITOK | M_ZERO);
> +
> +             bootSTK = (char *)bootstacks[cpu] + KSTACK_PAGES * PAGE_SIZE - 8;
> +             bootAP = cpu;
> +
> +             /* attempt to start the Application Processor */
> +             if (!start_xen_ap(cpu))
> +                     panic("AP #%d failed to start!", cpu);
> +
> +             CPU_SET(cpu, &all_cpus);        /* record AP in CPU map */
> +     }
> +
> +     return mp_naps;
> +}
> +
> +/*
> + * Functions to convert the "extra" parameters passed by Xen
> + * into FreeBSD boot options (from the i386 Xen port).
> + */
> +static char *
> +xen_setbootenv(char *cmd_line)
> +{
> +     char *cmd_line_next;
> +
> +     /* Skip leading spaces */
> +     for (; *cmd_line == ' '; cmd_line++);
> +
> +     for (cmd_line_next = cmd_line; strsep(&cmd_line_next, ",") != NULL;);
> +     return (cmd_line);
> +}
> +
> +static int
> +xen_boothowto(char *envp)
> +{
> +     int i, howto = 0;
> +
> +     /* get equivalents from the environment */
> +     for (i = 0; howto_names[i].ev != NULL; i++)
> +             if (getenv(howto_names[i].ev) != NULL)
> +                     howto |= howto_names[i].mask;
> +     return (howto);
> +}
> +
> +static caddr_t
> +xen_pv_parse_preload_data(u_int64_t modulep)
> +{
> +     /* Parse the extra boot information given by Xen */
> +     if (HYPERVISOR_start_info->cmd_line)
> +             kern_envp = xen_setbootenv(HYPERVISOR_start_info->cmd_line);
> +     boothowto |= xen_boothowto(kern_envp);
> +
> +     return (NULL);
> +}
> +
> +static void
> +xen_pv_fetch_e820_map(caddr_t kmdp, struct bios_smap **smap, u_int32_t *size)
> +{
> +     struct xen_memory_map memmap;
> +     int rc;
> +
> +     /* Fetch the E820 map from Xen */
> +     memmap.nr_entries = MAX_E820_ENTRIES;
> +     set_xen_guest_handle(memmap.buffer, xen_smap);
> +     rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
> +     if (rc)
> +             panic("unable to fetch Xen E820 memory map");
> +
> +     *smap = xen_smap;
> +     *size = memmap.nr_entries * sizeof(xen_smap[0]);
> +}
> +
> +void
> +xen_pv_set_init_ops(void)
> +{
> +     /* Init ops for Xen PV */
> +     init_ops = xen_init_ops;
> +}
> diff --git a/sys/x86/xen/pvcpu.c b/sys/x86/xen/pvcpu.c
> new file mode 100644
> index 0000000..00e063b
> --- /dev/null
> +++ b/sys/x86/xen/pvcpu.c
> @@ -0,0 +1,98 @@
> +/*
> + * Copyright (c) 2013 Roger Pau Monné <roger.pau@xxxxxxxxxx>
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +#include <sys/cdefs.h>
> +__FBSDID("$FreeBSD$");
> +
> +#include <sys/param.h>
> +#include <sys/systm.h>
> +#include <sys/bus.h>
> +#include <sys/kernel.h>
> +#include <sys/module.h>
> +#include <sys/pcpu.h>
> +#include <sys/smp.h>
> +
> +#include <xen/xen-os.h>
> +
> +static void
> +xenpvcpu_identify(driver_t *driver, device_t parent)
> +{
> +     int i;
> +
> +     if (!xen_pv_domain())
> +             return;
> +
> +     CPU_FOREACH(i)
> +             BUS_ADD_CHILD(parent, 0, "pvcpu", i);
> +}
> +
> +static int
> +xenpvcpu_probe(device_t dev)
> +{
> +     if (!xen_pv_domain())
> +             return (ENXIO);
> +
> +     device_set_desc(dev, "Xen PV CPU");
> +     return (0);
> +}
> +
> +static int
> +xenpvcpu_attach(device_t dev)
> +{
> +     struct pcpu *pc;
> +     int cpu;
> +
> +     cpu = device_get_unit(dev);
> +     pc = pcpu_find(cpu);
> +     pc->pc_device = dev;
> +     return (0);
> +}
> +
> +static int
> +xenpvcpu_detach(device_t dev)
> +{
> +
> +     return (0);
> +}
> +
> +static device_method_t xenpvcpu_methods[] = {
> +     DEVMETHOD(device_identify, xenpvcpu_identify),
> +     DEVMETHOD(device_probe, xenpvcpu_probe),
> +     DEVMETHOD(device_attach, xenpvcpu_attach),
> +     DEVMETHOD(device_detach, xenpvcpu_detach),
> +     DEVMETHOD_END
> +};
> +
> +static driver_t xenpvcpu_driver = {
> +     "pvcpu",
> +     xenpvcpu_methods,
> +     0,
> +};
> +
> +devclass_t xenpvcpu_devclass;
> +
> +DRIVER_MODULE(xenpvcpu, nexus, xenpvcpu_driver, xenpvcpu_devclass, 0, 0);
> +MODULE_DEPEND(xenpvcpu, nexus, 1, 1, 1);
> diff --git a/sys/xen/gnttab.c b/sys/xen/gnttab.c
> index 03c32b7..909378a 100644
> --- a/sys/xen/gnttab.c
> +++ b/sys/xen/gnttab.c
> @@ -25,6 +25,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/lock.h>
>  #include <sys/malloc.h>
>  #include <sys/mman.h>
> +#include <sys/limits.h>
>  
>  #include <xen/xen-os.h>
>  #include <xen/hypervisor.h>
> @@ -607,6 +608,7 @@ gnttab_resume(void)
>  {
>       int error;
>       unsigned int max_nr_gframes, nr_gframes;
> +     void *alloc_mem;
>  
>       nr_gframes = nr_grant_frames;
>       max_nr_gframes = max_nr_grant_frames();
> @@ -614,11 +616,20 @@ gnttab_resume(void)
>               return (ENOSYS);
>  
>       if (!resume_frames) {
> -             error = xenpci_alloc_space(PAGE_SIZE * max_nr_gframes,
> -                 &resume_frames);
> -             if (error) {
> -                     printf("error mapping gnttab share frames\n");
> -                     return (error);
> +             if (xen_pv_domain()) {
> +                     alloc_mem = contigmalloc(max_nr_gframes * PAGE_SIZE,
> +                                              M_DEVBUF, M_NOWAIT, 0,
> +                                              ULONG_MAX, PAGE_SIZE, 0);
> +                     KASSERT((alloc_mem != NULL),
> +                             ("unable to alloc memory for gnttab"));
> +                     resume_frames = vtophys(alloc_mem);
> +             } else {
> +                     error = xenpci_alloc_space(PAGE_SIZE * max_nr_gframes,
> +                         &resume_frames);
> +                     if (error) {
> +                             printf("error mapping gnttab share frames\n");
> +                             return (error);
> +                     }
>               }
>       }
>  
> diff --git a/sys/xen/interface/arch-x86/xen.h b/sys/xen/interface/arch-x86/xen.h
> index 1c186d7..6cc15d3 100644
> --- a/sys/xen/interface/arch-x86/xen.h
> +++ b/sys/xen/interface/arch-x86/xen.h
> @@ -147,7 +147,16 @@ struct vcpu_guest_context {
>      struct cpu_user_regs user_regs;         /* User-level CPU registers     */
>      struct trap_info trap_ctxt[256];        /* Virtual IDT                  */
>      unsigned long ldt_base, ldt_ents;       /* LDT (linear address, # ents) */
> -    unsigned long gdt_frames[16], gdt_ents; /* GDT (machine frames, # ents) */
> +    union {
> +        struct {
> +            /* PV: GDT (machine frames, # ents).*/
> +            unsigned long gdt_frames[16], gdt_ents;
> +        } pv;
> +        struct {
> +            /* PVH: GDTR addr and size */
> +            unsigned long gdtaddr, gdtsz;
> +        } pvh;
> +    } u;
>      unsigned long kernel_ss, kernel_sp;     /* Virtual TSS (only SS1/SP1)   */
>      /* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
>      unsigned long ctrlreg[8];               /* CR0-CR7 (control registers)  */
> diff --git a/sys/xen/pv.h b/sys/xen/pv.h
> new file mode 100644
> index 0000000..bbb1048
> --- /dev/null
> +++ b/sys/xen/pv.h
> @@ -0,0 +1,29 @@
> +/*
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * $FreeBSD$
> + */
> +
> +#ifndef      __XEN_PV_H__
> +#define      __XEN_PV_H__
> +
> +int  xen_pv_start_all_aps(void);
> +void xen_pv_set_init_ops(void);
> +
> +#endif       /* __XEN_PV_H__ */
> \ No newline at end of file
> diff --git a/sys/xen/xen-os.h b/sys/xen/xen-os.h
> index 95e8c6a..d3dccad 100644
> --- a/sys/xen/xen-os.h
> +++ b/sys/xen/xen-os.h
> @@ -53,6 +53,11 @@ void force_evtchn_callback(void);
>  extern int gdtset;
>  
>  extern shared_info_t *HYPERVISOR_shared_info;
> +extern start_info_t *HYPERVISOR_start_info;
> +
> +/* XXX: we need to get rid of this and use HYPERVISOR_start_info directly */
> +extern struct xenstore_domain_interface *xen_store;
> +extern char *console_page;
>  
>  enum xen_domain_type {
>       XEN_NATIVE,             /* running on bare hardware    */
> @@ -80,6 +85,9 @@ xen_hvm_domain(void)
>       return (xen_domain_type == XEN_HVM_DOMAIN);
>  }
>  
> +/* Debug function, prints directly to hypervisor console */
> +void xen_early_printf(const char *, ...);
> +
>  #ifndef xen_mb
>  #define xen_mb() mb()
>  #endif
> diff --git a/sys/xen/xenstore/xenstore.c b/sys/xen/xenstore/xenstore.c
> index d404862..b9885af 100644
> --- a/sys/xen/xenstore/xenstore.c
> +++ b/sys/xen/xenstore/xenstore.c
> @@ -1082,6 +1082,19 @@ xs_init_comms(void)
>  static void
>  xs_identify(driver_t *driver, device_t parent)
>  {
> +     const char *parent_name;
> +
> +     if (!xen_domain())
> +             return;
> +
> +     /*
> +      * On HVM domains we get called twice: once from the nexus and
> +      * again after the xenpci device is attached. We should only
> +      * attach after the xenpci device has been added.
> +      */
> +     parent_name = device_get_name(parent);
> +     if (xen_hvm_domain() && strncmp(parent_name, "xenpci", 6) != 0)
> +             return;
>  
>       BUS_ADD_CHILD(parent, 0, "xenstore", 0);
>  }
> @@ -1147,13 +1160,15 @@ xs_attach(device_t dev)
>       /* Initialize the interface to xenstore. */
>       struct proc *p;
>  
> -#ifdef XENHVM
> -     xs.evtchn = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN);
> -     xs.gpfn = hvm_get_parameter(HVM_PARAM_STORE_PFN);
> -     xen_store = pmap_mapdev(xs.gpfn * PAGE_SIZE, PAGE_SIZE);
> -#else
> -     xs.evtchn = xen_start_info->store_evtchn;
> -#endif
> +     if (xen_hvm_domain()) {
> +             xs.evtchn = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN);
> +             xs.gpfn = hvm_get_parameter(HVM_PARAM_STORE_PFN);
> +             xen_store = pmap_mapdev(xs.gpfn * PAGE_SIZE, PAGE_SIZE);
> +     } else if (xen_pv_domain()) {
> +             xs.evtchn = HYPERVISOR_start_info->store_evtchn;
> +     } else {
> +             panic("Unknown domain type, cannot initialize xenstore\n");
> +     }
>  
>       TAILQ_INIT(&xs.reply_list);
>       TAILQ_INIT(&xs.watch_events);
> @@ -1263,9 +1278,8 @@ static devclass_t xenstore_devclass;
>   
>  #ifdef XENHVM
>  DRIVER_MODULE(xenstore, xenpci, xenstore_driver, xenstore_devclass, 0, 0);
> -#else
> -DRIVER_MODULE(xenstore, nexus, xenstore_driver, xenstore_devclass, 0, 0);
>  #endif
> +DRIVER_MODULE(xenstore, nexus, xenstore_driver, xenstore_devclass, 0, 0);
>  
>  /*------------------------------- Sysctl Data --------------------------------*/
>  /* XXX Shouldn't the node be somewhere else? */
> -- 
> 1.7.7.5 (Apple Git-26)
> 
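
A small note on vprintk() above: vsnprintf() returns the length the
formatted string would have had without truncation, so for a message
longer than the buffer, buf[retval] lands outside buf. Below is a minimal
userland sketch of clamping that length, assuming standard C99
vsnprintf() semantics. format_clamped() is a made-up name used only for
illustration (it is not part of the patch), and the buffer is kept tiny
on purpose so truncation is easy to trigger:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Deliberately small for the demonstration; the patch uses 1024. */
#define PRINTK_BUFSIZE 16

/*
 * Hypothetical helper, not part of the patch: format into buf and return
 * the number of bytes actually stored, excluding the terminating NUL.
 */
static size_t
format_clamped(char *buf, size_t bufsize, const char *fmt, ...)
{
	va_list ap;
	int retval;

	if (bufsize == 0)
		return (0);

	va_start(ap, fmt);
	retval = vsnprintf(buf, bufsize, fmt, ap);
	va_end(ap);

	if (retval < 0) {
		/* Formatting error: hand back an empty string. */
		buf[0] = '\0';
		return (0);
	}
	/* vsnprintf() reports the untruncated length, so clamp it. */
	if ((size_t)retval >= bufsize)
		return (bufsize - 1);
	return ((size_t)retval);
}
```

With something along these lines the clamped length could be handed
straight to HYPERVISOR_console_io(CONSOLEIO_write, ...) without the extra
strlen() call, though I have not tried that against the patched tree.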

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel




 

