|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v1 1/5] xsplice: Design document.
>>> Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> 09/16/15 11:02 PM >>>
>+struct xsplice {
>+ const char *name; /* A sensible name for the patch. Up to 40 characters.
>*/
Any reason not to make this a 40-character array right in the structure then?
It would seem to me the fewer relocation the better.
>+ const char *build_id; /* ID of the hypervisor this binary was built
>against. */
For this one this might be true to, if the linker can be made emit the build ID
at
place we want it (which seems unlikely).
>+ uint32_t version; /* Version of payload. */
>+ uint32_t id_size; /* Size of the ID. */
Which "the ID"?
>+ struct xsplice_code *new; /* Pointer to the new code & data to be
>patched. */
>+ struct xsplice_code *old; /* Pointer to the old code & data to be checked
>against. */
Considering you use "const" above, any reason not to here (and elsewhere below)?
>+struct xsplice_code {
>+ struct xsplice_reloc *relocs; /* How to patch it. */
>+ struct xsplice_section *sections; /* Safety data. */
>+ struct xsplice_patch *patches; /* Patch code and data */
>+ uint32_t n_relocs;
>+ uint32_t n_sections;
>+ uint32_t n_patches;
>+ uint8_t pad[28]; /* Must be zero. */
>+};
>+</pre>
>+
>+The size of this structure is 64 bytes.
>+
>+There can be at most two of those structures in the payload.
>+One for the `new` and another for the `old` (optional).
>+
>+If it is for the old code the relocs, and relocs_end values will be ignored.
s/relocs_end/n_relocs/
>+### xsplice_reloc
>+
>+The `xsplice_code` defines an array of these structures. As such
>+an singular structure defines an singular point where to patch the
>+hypervisor.
>+
>+The structure contains the address of the hypervisor (if known),
>+the symbol associated with this address, how the patching is to
>+be done, and platform specific details.
>+
>+The `isns_added` is an value to be used to compute the new offset
I think you mean addend here. And what is isns supposed to stand for?
>+#define XSPLICE_SECTION_TEXT 0x00000001 /* Section is in .text */
>+#define XSPLICE_SECTION_RODATA 0x00000002 /* Section is in .rodata */
>+#define XSPLICE_SECTION_DATA 0x00000004 /* Section is in .data */
>+#define XSPLICE_SECTION_STRING 0x00000008 /* Section is in .str */
As said before - this enumeration of sections seems too limiting to me. Plus -
who is ".str"?
>+#define XSPLICE_PATCH_INLINE_TEXT 0x1
>+#define XSPLICE_PATCH_INLINE_DATA 0x2
Why do code and data need separate flags here?
>+<pre>
>+struct xsplice_symbol {
>+ const char *name; /* The ELF name of the symbol. */
>+ const char *label; /* A unique xSplice name for the symbol. */
>+ uint8_t pad[16]; /* Must be zero. */
>+};
>+</pre>
>+
>+The size of this structure is 32 bytes.
I only notice this now - do these statements make sense, given that
they assume a 64-bit architecture?
>+### xsplice_reloc_howto
>+
>+The howto defines in the detail the change. It contains the type,
>+whether the relocation is relative, the size of the relocation,
>+bitmask for which parts of the instruction or data are to be replaced,
>+amount the final relocation is shifted by (to drop unwanted data), and
>+whether the replacement should be interpreted as signed value.
>+
>+The structure is as follow:
>+
>+<pre>
>+#define XSPLICE_HOWTO_INLINE 0x1 /* It is an inline replacement. */
>+#define XSPLICE_HOWTO_RELOC_PATCH 0x2 /* Add a trampoline. */
>+
>+#define XSPLICE_HOWTO_FLAG_PC_REL 0x1 /* Is PC relative. */
>+#define XSPLICE_HOWOT_FLAG_SIGN 0x2 /* Should the new value be treated
>as signed value. */
>+
>+struct xsplice_reloc_howto {
>+ uint32_t howto; /* XSPLICE_HOWTO_* */
>+ uint32_t flag; /* XSPLICE_HOWTO_FLAG_* */
>+ uint32_t size; /* Size, in bytes, of the item to be relocated. */
>+ uint32_t r_shift; /* The value the final relocation is shifted right
>by; used to drop unwanted data from the relocation. */
Surely a uint8_t would suffice here?
>+ uint64_t mask; /* Bitmask for which parts of the instruction or data
>are replaced with the relocated value. */
Does that mean "size" is limited to 8? The example below only uses
the trivial values (r_shift=0, mask=~0), so the real meaning of these
isn't clear. In particular I can't see what good right shifting a
relocated value can do. Furthermore all this is very x86-centric;
(almost) all other architectures I know a little about won't get away
with such simple relocations, since immediates/displacements are
scattered around an instruction word there.
>+## Signature checking requirements.
>+
>+The signature checking requires that the layout of the data in memory
>+**MUST** be same for signature to be verified. This means that the payload
>+data layout in ELF format **MUST** match what the hypervisor would be
>+expecting such that it can properly do signature verification.
>+
>+The signature is based on the all of the payloads continuously laid out
>+in memory. The signature is to be appended at the end of the ELF payload
>+prefixed with the string '~Module signature appended~\n', followed by
>+an signature header then followed by the signature, key identifier, and
>signers
>+name.
Is all of this following some (de-facto) standard? Or else why don't
you simply assign a section for this, which is left empty initially and
would get filled by the signing tool? If so, naming the reference
might be a good idea. Oh, I see you do ...
>+</pre>
>+(Note that this has been borrowed from Linux module signature code.).
... here. Is the intention then to use the same tool(s) Linux uses for
signing?
>+## Hypercalls
>+
>+We will employ the sub operations of the system management hypercall (sysctl).
>+There are to be four sub-operations:
>+
>+ * upload the payloads.
>+ * listing of payloads summary uploaded and their state.
>+ * getting an particular payload summary and its state.
>+ * command to apply, delete, or revert the payload.
>+ * querying of the hypervisor build ID.
>+
>+Most of the actions are asynchronous therefore the caller is responsible
>+to verify that it has been applied properly by retrieving the summary of it
>+and verifying that there are no error codes associated with the payload.
>+
>+We **MUST** make some of them asynchronous due to the nature of patching
>+it requires every physical CPU to be lock-step with each other.
... may require ... - I see no reason the exclude a model where this
is not necessary (for example using per-CPU page tables, swapping
patched for unpatched pages at the right moment on each CPU
individually).
>+### Basic type: struct xen_xsplice_id
>+
>+Most of the hypercalls employ an shared structure called `struct
>xen_xsplice_id`
>+which contains:
>+
>+ * `name` - pointer where the string for the id is located.
>+ * `size` - the size of the string
>+ * `_pad` - padding - to be zero.
>+
>+The structure is as follow:
>+
>+<pre>
>+#define XEN_XSPLICE_ID_SIZE 128
>+struct xen_xsplice_id {
>+ XEN_GUEST_HANDLE_64(char) name; /* IN, pointer to name. */
>+ uint32_t size; /* IN, size of name. May be upto
>
>+ XEN_XSPLICE_ID_SIZE. */
I.e. uint8_t or uint16_t would do. Of course I agree that there's no
foreseeable use for the two or three extra unused bytes thus
regained.
>+### XEN_SYSCTL_XSPLICE_GET (1)
>+
>+Retrieve an status of an specific payload. This caller provides:
>+
>+ * A `struct xen_xsplice_id` called `id` which has the unique id.
>+ * A `struct xen_xsplice_status` structure which has all members
>+ set to zero: That is:
>+ * `status` *MUST* be set to zero.
Is there a particular reason for this?
>+Upon completion the `struct xen_xsplice_status` is updated.
>+
>+ * `_pad` - reserved for further usage.
>+ * `status` - whether it has been:
>+ * *XSPLICE_STATUS_LOADED* (0x1) has been loaded.
>+ * *XSPLICE_STATUS_PROGRESS* (0x2) acting on the
>**XEN_SYSCTL_XSPLICE_ACTION** command.
>+ * *XSPLICE_STATUS_CHECKED* (0x4) the ELF payload safety checks passed.
>+ * *XSPLICE_STATUS_APPLIED* (0x8) loaded, checked, and applied.
>+ * *XSPLICE_STATUS_REVERTED* (0x10) loaded, checked, applied and then also
>reverted.
>+ * Negative values is an error. The error would be of EXX format.
XEN_Exx now that we have them, I suppose?
>+struct xen_xsplice_status {
>+#define XSPLICE_STATUS_LOADED 0x01
>+#define XSPLICE_STATUS_PROGRESS 0x02
>+#define XSPLICE_STATUS_CHECKED 0x04
>+#define XSPLICE_STATUS_APPLIED 0x08
>+#define XSPLICE_STATUS_REVERTED 0x10
>+ /* Any negative value is an error. The error would be in -EXX format. */
>+ int32_t status; /* OUT, On IN has to be zero. */
>+ uint32_t _pad; /* IN: Must be zero. */
>+};
What's the padding field needed for here?
>+### XEN_SYSCTL_XSPLICE_LIST (2)
>+
>+Retrieve an array of abbreviated status and names of payloads that are loaded
>in the
>+hypervisor.
>+
>+The caller provides:
>+
>+ * `version`. Initially (on first hypercall) *MUST* be zero.
"first" seems ambiguous here: The first time after boot, or the first in
a sequence to obtain the full set? It's not really clear to me how you
mean to use this...
>+ * `idx` index iterator. On first call *MUST* be zero, subsequent calls
>varies.
>+ * `nr` the max number of entries to populate.
>+ * `_pad` - *MUST* be zero.
>+ * `status` virtual address of where to write `struct xen_xsplice_status`
>+ structures. *MUST* allocate up to `nr` of them.
>+ * `id` - virtual address of where to write the unique id of the payload.
>+ *MUST* allocate up to `nr` of them. Each *MUST* be of
>+ **XEN_XSPLICE_ID_SIZE** size.
>+ * `len` - virtual address of where to write the length of each unique id
>+ of the payload. *MUST* allocate up to `nr` of them. Each *MUST* be
>+ of sizeof(uint32_t) (4 bytes).
s/up to/at least/g
>+Note that due to the asynchronous nature of hypercalls the domain might have
>+added or removed the number of payloads making this information stale.
... the control domain might have added or removed a number ... ?
>+struct xen_sysctl_xsplice_action {
>+ xen_xsplice_id_t id; /* IN, name of the patch. */
>+ uint32_t cmd; /* IN: XSPLICE_ACTION_* */
>+ uint32_t _pad; /* IN: MUST be zero. */
>+ uint64_t time; /* IN, upper bound of time (ms) for the operation to
>take. */
Surely a 32-bit quantity would suffice here (worth up to 4M seconds,
i.e. over 3 years).
>+#define XEN_SYSCTL_XSPLICE_INFO_BUILD_ID 0
>+struct xen_sysctl_xsplice_info {
>+ uint32_t cmd; /* IN: XEN_SYSCTL_XSPLICE_INFO_*
>*/
>+ uint32_t size; /* IN: Size of info: OUT: Amount
>of
>+ bytes filed out in info. */
>+ union {
>+ XEN_GUEST_HANDLE_64(char) info; /* OUT: Requested information. */
>
>+ } u;
>+};
Considering that earlier operation descriptions referred to this one
for retrieving more information after failure, I can't see what further
information there is being made available.
>+### Alternative assembler
>+
>+Alternative assembler is a mechanism to use different instructions depending
>+on what the CPU supports. This is done by providing multiple streams of code
>+that can be patched in - or if the CPU does not support it - padded with
>+`nop` operations. The alternative assembler macros cause the compiler to
>+expand the code to place a most generic code in place - emit a special
>+ELF .section header to tag this location. During run-time the hypervisor
>+can leave the areas alone or patch them with an better suited opcodes.
>+
>+However these sections are part of .init. and as such can't reasonably be
>+subject to patching.
This patching may, however, result in verification failure when
checking original code. Multiple patch variants may be needed to
cover for all possible cases.
>+### .rodata sections
>+
>+The patching might require strings to be updated as well. As such we must be
>+also able to patch the strings as needed. This sounds simple - but the
>compiler
>+has a habit of coalescing strings that are the same - which means if we
>in-place
>+alter the strings - other users will be inadvertently affected as well.
Coalescing (according to my understanding of the word) wouldn't
be a problem. Compilers and linker fold (partially) duplicate strings
where possible.
>+This is also where pointers to functions live - and we may need to patch this
>+as well. And switch-style jump tables.
>+
>+To guard against that we must be prepared to do patching similar to
>+trampoline patching or in-line depending on the flavour. If we can
>+do in-line patching we would need to:
>+
>+ * alter `.rodata` to be writeable.
>+ * inline patch.
>+ * alter `.rodata` to be read-only.
>+
>+If are doing trampoline patching we would need to:
>+
>+ * allocate a new memory location for the string.
>+ * all locations which use this string will have to be updated to use the
>+ offset to the string.
>+ * mark the region RO when we are done.
Except that we don't currently write-protect .rodata or .text (and
you also don't say the same for .text patching).
>+### Patching code which is in the stack.
>+
>+We should not patch the code which is on the stack. That can lead
>+to corruption.
I think we did away with all code placement on the stack. Or do you
mean code _referenced_ by return addresses on the stack?
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |