Xen project Mailing List

>>> Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> 09/16/15 11:02 PM >>> >+struct xsplice { >+ const char *name; /* A sensible name for the patch. Up to 40 characters. >*/ Any reason not to make this a 40-character array right in the structure then? It would seem to me the fewer relocation the better. >+ const char *build_id; /* ID of the hypervisor this binary was built >against. */ For this one this might be true to, if the linker can be made emit the build ID at place we want it (which seems unlikely). >+ uint32_t version; /* Version of payload. */ >+ uint32_t id_size; /* Size of the ID. */ Which "the ID"? >+ struct xsplice_code *new; /* Pointer to the new code & data to be >patched. */ >+ struct xsplice_code *old; /* Pointer to the old code & data to be checked >against. */ Considering you use "const" above, any reason not to here (and elsewhere below)? >+struct xsplice_code { >+ struct xsplice_reloc *relocs; /* How to patch it. */ >+ struct xsplice_section *sections; /* Safety data. */ >+ struct xsplice_patch *patches; /* Patch code and data */ >+ uint32_t n_relocs; >+ uint32_t n_sections; >+ uint32_t n_patches; >+ uint8_t pad[28]; /* Must be zero. */ >+}; >+</pre> >+ >+The size of this structure is 64 bytes. >+ >+There can be at most two of those structures in the payload. >+One for the `new` and another for the `old` (optional). >+ >+If it is for the old code the relocs, and relocs_end values will be ignored. s/relocs_end/n_relocs/ >+### xsplice_reloc >+ >+The `xsplice_code` defines an array of these structures. As such >+an singular structure defines an singular point where to patch the >+hypervisor. >+ >+The structure contains the address of the hypervisor (if known), >+the symbol associated with this address, how the patching is to >+be done, and platform specific details. >+ >+The `isns_added` is an value to be used to compute the new offset I think you mean addend here. And what is isns supposed to stand for? >+#define XSPLICE_SECTION_TEXT 0x00000001 /* Section is in .text */ >+#define XSPLICE_SECTION_RODATA 0x00000002 /* Section is in .rodata */ >+#define XSPLICE_SECTION_DATA 0x00000004 /* Section is in .data */ >+#define XSPLICE_SECTION_STRING 0x00000008 /* Section is in .str */ As said before - this enumeration of sections seems too limiting to me. Plus - who is ".str"? >+#define XSPLICE_PATCH_INLINE_TEXT 0x1 >+#define XSPLICE_PATCH_INLINE_DATA 0x2 Why do code and data need separate flags here? >+<pre> >+struct xsplice_symbol { >+ const char *name; /* The ELF name of the symbol. */ >+ const char *label; /* A unique xSplice name for the symbol. */ >+ uint8_t pad[16]; /* Must be zero. */ >+}; >+</pre> >+ >+The size of this structure is 32 bytes. I only notice this now - do these statements make sense, given that they assume a 64-bit architecture? >+### xsplice_reloc_howto >+ >+The howto defines in the detail the change. It contains the type, >+whether the relocation is relative, the size of the relocation, >+bitmask for which parts of the instruction or data are to be replaced, >+amount the final relocation is shifted by (to drop unwanted data), and >+whether the replacement should be interpreted as signed value. >+ >+The structure is as follow: >+ >+<pre> >+#define XSPLICE_HOWTO_INLINE 0x1 /* It is an inline replacement. */ >+#define XSPLICE_HOWTO_RELOC_PATCH 0x2 /* Add a trampoline. */ >+ >+#define XSPLICE_HOWTO_FLAG_PC_REL 0x1 /* Is PC relative. */ >+#define XSPLICE_HOWOT_FLAG_SIGN 0x2 /* Should the new value be treated >as signed value. */ >+ >+struct xsplice_reloc_howto { >+ uint32_t howto; /* XSPLICE_HOWTO_* */ >+ uint32_t flag; /* XSPLICE_HOWTO_FLAG_* */ >+ uint32_t size; /* Size, in bytes, of the item to be relocated. */ >+ uint32_t r_shift; /* The value the final relocation is shifted right >by; used to drop unwanted data from the relocation. */ Surely a uint8_t would suffice here? >+ uint64_t mask; /* Bitmask for which parts of the instruction or data >are replaced with the relocated value. */ Does that mean "size" is limited to 8? The example below only uses the trivial values (r_shift=0, mask=~0), so the real meaning of these isn't clear. In particular I can't see what good right shifting a relocated value can do. Furthermore all this is very x86-centric; (almost) all other architectures I know a little about won't get away with such simple relocations, since immediates/displacements are scattered around an instruction word there. >+## Signature checking requirements. >+ >+The signature checking requires that the layout of the data in memory >+**MUST** be same for signature to be verified. This means that the payload >+data layout in ELF format **MUST** match what the hypervisor would be >+expecting such that it can properly do signature verification. >+ >+The signature is based on the all of the payloads continuously laid out >+in memory. The signature is to be appended at the end of the ELF payload >+prefixed with the string '~Module signature appended~\n', followed by >+an signature header then followed by the signature, key identifier, and >signers >+name. Is all of this following some (de-facto) standard? Or else why don't you simply assign a section for this, which is left empty initially and would get filled by the signing tool? If so, naming the reference might be a good idea. Oh, I see you do ... >+</pre> >+(Note that this has been borrowed from Linux module signature code.). ... here. Is the intention then to use the same tool(s) Linux uses for signing? >+## Hypercalls >+ >+We will employ the sub operations of the system management hypercall (sysctl). >+There are to be four sub-operations: >+ >+ * upload the payloads. >+ * listing of payloads summary uploaded and their state. >+ * getting an particular payload summary and its state. >+ * command to apply, delete, or revert the payload. >+ * querying of the hypervisor build ID. >+ >+Most of the actions are asynchronous therefore the caller is responsible >+to verify that it has been applied properly by retrieving the summary of it >+and verifying that there are no error codes associated with the payload. >+ >+We **MUST** make some of them asynchronous due to the nature of patching >+it requires every physical CPU to be lock-step with each other. ... may require ... - I see no reason the exclude a model where this is not necessary (for example using per-CPU page tables, swapping patched for unpatched pages at the right moment on each CPU individually). >+### Basic type: struct xen_xsplice_id >+ >+Most of the hypercalls employ an shared structure called `struct >xen_xsplice_id` >+which contains: >+ >+ * `name` - pointer where the string for the id is located. >+ * `size` - the size of the string >+ * `_pad` - padding - to be zero. >+ >+The structure is as follow: >+ >+<pre> >+#define XEN_XSPLICE_ID_SIZE 128 >+struct xen_xsplice_id { >+ XEN_GUEST_HANDLE_64(char) name; /* IN, pointer to name. */ >+ uint32_t size; /* IN, size of name. May be upto > >+ XEN_XSPLICE_ID_SIZE. */ I.e. uint8_t or uint16_t would do. Of course I agree that there's no foreseeable use for the two or three extra unused bytes thus regained. >+### XEN_SYSCTL_XSPLICE_GET (1) >+ >+Retrieve an status of an specific payload. This caller provides: >+ >+ * A `struct xen_xsplice_id` called `id` which has the unique id. >+ * A `struct xen_xsplice_status` structure which has all members >+ set to zero: That is: >+ * `status` *MUST* be set to zero. Is there a particular reason for this? >+Upon completion the `struct xen_xsplice_status` is updated. >+ >+ * `_pad` - reserved for further usage. >+ * `status` - whether it has been: >+ * *XSPLICE_STATUS_LOADED* (0x1) has been loaded. >+ * *XSPLICE_STATUS_PROGRESS* (0x2) acting on the >**XEN_SYSCTL_XSPLICE_ACTION** command. >+ * *XSPLICE_STATUS_CHECKED* (0x4) the ELF payload safety checks passed. >+ * *XSPLICE_STATUS_APPLIED* (0x8) loaded, checked, and applied. >+ * *XSPLICE_STATUS_REVERTED* (0x10) loaded, checked, applied and then also >reverted. >+ * Negative values is an error. The error would be of EXX format. XEN_Exx now that we have them, I suppose? >+struct xen_xsplice_status { >+#define XSPLICE_STATUS_LOADED 0x01 >+#define XSPLICE_STATUS_PROGRESS 0x02 >+#define XSPLICE_STATUS_CHECKED 0x04 >+#define XSPLICE_STATUS_APPLIED 0x08 >+#define XSPLICE_STATUS_REVERTED 0x10 >+ /* Any negative value is an error. The error would be in -EXX format. */ >+ int32_t status; /* OUT, On IN has to be zero. */ >+ uint32_t _pad; /* IN: Must be zero. */ >+}; What's the padding field needed for here? >+### XEN_SYSCTL_XSPLICE_LIST (2) >+ >+Retrieve an array of abbreviated status and names of payloads that are loaded >in the >+hypervisor. >+ >+The caller provides: >+ >+ * `version`. Initially (on first hypercall) *MUST* be zero. "first" seems ambiguous here: The first time after boot, or the first in a sequence to obtain the full set? It's not really clear to me how you mean to use this... >+ * `idx` index iterator. On first call *MUST* be zero, subsequent calls >varies. >+ * `nr` the max number of entries to populate. >+ * `_pad` - *MUST* be zero. >+ * `status` virtual address of where to write `struct xen_xsplice_status` >+ structures. *MUST* allocate up to `nr` of them. >+ * `id` - virtual address of where to write the unique id of the payload. >+ *MUST* allocate up to `nr` of them. Each *MUST* be of >+ **XEN_XSPLICE_ID_SIZE** size. >+ * `len` - virtual address of where to write the length of each unique id >+ of the payload. *MUST* allocate up to `nr` of them. Each *MUST* be >+ of sizeof(uint32_t) (4 bytes). s/up to/at least/g >+Note that due to the asynchronous nature of hypercalls the domain might have >+added or removed the number of payloads making this information stale. ... the control domain might have added or removed a number ... ? >+struct xen_sysctl_xsplice_action { >+ xen_xsplice_id_t id; /* IN, name of the patch. */ >+ uint32_t cmd; /* IN: XSPLICE_ACTION_* */ >+ uint32_t _pad; /* IN: MUST be zero. */ >+ uint64_t time; /* IN, upper bound of time (ms) for the operation to >take. */ Surely a 32-bit quantity would suffice here (worth up to 4M seconds, i.e. over 3 years). >+#define XEN_SYSCTL_XSPLICE_INFO_BUILD_ID 0 >+struct xen_sysctl_xsplice_info { >+ uint32_t cmd; /* IN: XEN_SYSCTL_XSPLICE_INFO_* >*/ >+ uint32_t size; /* IN: Size of info: OUT: Amount >of >+ bytes filed out in info. */ >+ union { >+ XEN_GUEST_HANDLE_64(char) info; /* OUT: Requested information. */ > >+ } u; >+}; Considering that earlier operation descriptions referred to this one for retrieving more information after failure, I can't see what further information there is being made available. >+### Alternative assembler >+ >+Alternative assembler is a mechanism to use different instructions depending >+on what the CPU supports. This is done by providing multiple streams of code >+that can be patched in - or if the CPU does not support it - padded with >+`nop` operations. The alternative assembler macros cause the compiler to >+expand the code to place a most generic code in place - emit a special >+ELF .section header to tag this location. During run-time the hypervisor >+can leave the areas alone or patch them with an better suited opcodes. >+ >+However these sections are part of .init. and as such can't reasonably be >+subject to patching. This patching may, however, result in verification failure when checking original code. Multiple patch variants may be needed to cover for all possible cases. >+### .rodata sections >+ >+The patching might require strings to be updated as well. As such we must be >+also able to patch the strings as needed. This sounds simple - but the >compiler >+has a habit of coalescing strings that are the same - which means if we >in-place >+alter the strings - other users will be inadvertently affected as well. Coalescing (according to my understanding of the word) wouldn't be a problem. Compilers and linker fold (partially) duplicate strings where possible. >+This is also where pointers to functions live - and we may need to patch this >+as well. And switch-style jump tables. >+ >+To guard against that we must be prepared to do patching similar to >+trampoline patching or in-line depending on the flavour. If we can >+do in-line patching we would need to: >+ >+ * alter `.rodata` to be writeable. >+ * inline patch. >+ * alter `.rodata` to be read-only. >+ >+If are doing trampoline patching we would need to: >+ >+ * allocate a new memory location for the string. >+ * all locations which use this string will have to be updated to use the >+ offset to the string. >+ * mark the region RO when we are done. Except that we don't currently write-protect .rodata or .text (and you also don't say the same for .text patching). >+### Patching code which is in the stack. >+ >+We should not patch the code which is on the stack. That can lead >+to corruption. I think we did away with all code placement on the stack. Or do you mean code _referenced_ by return addresses on the stack? Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.

Re: [Xen-devel] [PATCH v1 1/5] xsplice: Design document.