[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 4/6] gzip: refactor state tracking


  • To: "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 17 Apr 2024 19:45:12 +0100
  • Autocrypt: addr=andrew.cooper3@xxxxxxxxxx; keydata= xsFNBFLhNn8BEADVhE+Hb8i0GV6mihnnr/uiQQdPF8kUoFzCOPXkf7jQ5sLYeJa0cQi6Penp VtiFYznTairnVsN5J+ujSTIb+OlMSJUWV4opS7WVNnxHbFTPYZVQ3erv7NKc2iVizCRZ2Kxn srM1oPXWRic8BIAdYOKOloF2300SL/bIpeD+x7h3w9B/qez7nOin5NzkxgFoaUeIal12pXSR Q354FKFoy6Vh96gc4VRqte3jw8mPuJQpfws+Pb+swvSf/i1q1+1I4jsRQQh2m6OTADHIqg2E ofTYAEh7R5HfPx0EXoEDMdRjOeKn8+vvkAwhviWXTHlG3R1QkbE5M/oywnZ83udJmi+lxjJ5 YhQ5IzomvJ16H0Bq+TLyVLO/VRksp1VR9HxCzItLNCS8PdpYYz5TC204ViycobYU65WMpzWe LFAGn8jSS25XIpqv0Y9k87dLbctKKA14Ifw2kq5OIVu2FuX+3i446JOa2vpCI9GcjCzi3oHV e00bzYiHMIl0FICrNJU0Kjho8pdo0m2uxkn6SYEpogAy9pnatUlO+erL4LqFUO7GXSdBRbw5 gNt25XTLdSFuZtMxkY3tq8MFss5QnjhehCVPEpE6y9ZjI4XB8ad1G4oBHVGK5LMsvg22PfMJ ISWFSHoF/B5+lHkCKWkFxZ0gZn33ju5n6/FOdEx4B8cMJt+cWwARAQABzSlBbmRyZXcgQ29v cGVyIDxhbmRyZXcuY29vcGVyM0BjaXRyaXguY29tPsLBegQTAQgAJAIbAwULCQgHAwUVCgkI CwUWAgMBAAIeAQIXgAUCWKD95wIZAQAKCRBlw/kGpdefoHbdD/9AIoR3k6fKl+RFiFpyAhvO 59ttDFI7nIAnlYngev2XUR3acFElJATHSDO0ju+hqWqAb8kVijXLops0gOfqt3VPZq9cuHlh IMDquatGLzAadfFx2eQYIYT+FYuMoPZy/aTUazmJIDVxP7L383grjIkn+7tAv+qeDfE+txL4 SAm1UHNvmdfgL2/lcmL3xRh7sub3nJilM93RWX1Pe5LBSDXO45uzCGEdst6uSlzYR/MEr+5Z JQQ32JV64zwvf/aKaagSQSQMYNX9JFgfZ3TKWC1KJQbX5ssoX/5hNLqxMcZV3TN7kU8I3kjK mPec9+1nECOjjJSO/h4P0sBZyIUGfguwzhEeGf4sMCuSEM4xjCnwiBwftR17sr0spYcOpqET ZGcAmyYcNjy6CYadNCnfR40vhhWuCfNCBzWnUW0lFoo12wb0YnzoOLjvfD6OL3JjIUJNOmJy RCsJ5IA/Iz33RhSVRmROu+TztwuThClw63g7+hoyewv7BemKyuU6FTVhjjW+XUWmS/FzknSi dAG+insr0746cTPpSkGl3KAXeWDGJzve7/SBBfyznWCMGaf8E2P1oOdIZRxHgWj0zNr1+ooF /PzgLPiCI4OMUttTlEKChgbUTQ+5o0P080JojqfXwbPAyumbaYcQNiH1/xYbJdOFSiBv9rpt TQTBLzDKXok86M7BTQRS4TZ/ARAAkgqudHsp+hd82UVkvgnlqZjzz2vyrYfz7bkPtXaGb9H4 Rfo7mQsEQavEBdWWjbga6eMnDqtu+FC+qeTGYebToxEyp2lKDSoAsvt8w82tIlP/EbmRbDVn 7bhjBlfRcFjVYw8uVDPptT0TV47vpoCVkTwcyb6OltJrvg/QzV9f07DJswuda1JH3/qvYu0p vjPnYvCq4NsqY2XSdAJ02HrdYPFtNyPEntu1n1KK+gJrstjtw7KsZ4ygXYrsm/oCBiVW/OgU g/XIlGErkrxe4vQvJyVwg6YH653YTX5hLLUEL1NS4TCo47RP+wi6y+TnuAL36UtK/uFyEuPy wwrDVcC4cIFhYSfsO0BumEI65yu7a8aHbGfq2lW251UcoU48Z27ZUUZd2Dr6O/n8poQHbaTd 6bJJSjzGGHZVbRP9UQ3lkmkmc0+XCHmj5WhwNNYjgbbmML7y0fsJT5RgvefAIFfHBg7fTY/i kBEimoUsTEQz+N4hbKwo1hULfVxDJStE4sbPhjbsPCrlXf6W9CxSyQ0qmZ2bXsLQYRj2xqd1 bpA+1o1j2N4/au1R/uSiUFjewJdT/LX1EklKDcQwpk06Af/N7VZtSfEJeRV04unbsKVXWZAk uAJyDDKN99ziC0Wz5kcPyVD1HNf8bgaqGDzrv3TfYjwqayRFcMf7xJaL9xXedMcAEQEAAcLB XwQYAQgACQUCUuE2fwIbDAAKCRBlw/kGpdefoG4XEACD1Qf/er8EA7g23HMxYWd3FXHThrVQ HgiGdk5Yh632vjOm9L4sd/GCEACVQKjsu98e8o3ysitFlznEns5EAAXEbITrgKWXDDUWGYxd pnjj2u+GkVdsOAGk0kxczX6s+VRBhpbBI2PWnOsRJgU2n10PZ3mZD4Xu9kU2IXYmuW+e5KCA vTArRUdCrAtIa1k01sPipPPw6dfxx2e5asy21YOytzxuWFfJTGnVxZZSCyLUO83sh6OZhJkk b9rxL9wPmpN/t2IPaEKoAc0FTQZS36wAMOXkBh24PQ9gaLJvfPKpNzGD8XWR5HHF0NLIJhgg 4ZlEXQ2fVp3XrtocHqhu4UZR4koCijgB8sB7Tb0GCpwK+C4UePdFLfhKyRdSXuvY3AHJd4CP 4JzW0Bzq/WXY3XMOzUTYApGQpnUpdOmuQSfpV9MQO+/jo7r6yPbxT7CwRS5dcQPzUiuHLK9i nvjREdh84qycnx0/6dDroYhp0DFv4udxuAvt1h4wGwTPRQZerSm4xaYegEFusyhbZrI0U9tJ B8WrhBLXDiYlyJT6zOV2yZFuW47VrLsjYnHwn27hmxTC/7tvG3euCklmkn9Sl9IAKFu29RSo d5bD8kMSCYsTqtTfT6W4A3qHGvIDta3ptLYpIAOD2sY3GYq2nf3Bbzx81wZK14JdDDHUX2Rs 6+ahAA==
  • Cc: Jason Andryuk <jason.andryuk@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • Delivery-date: Wed, 17 Apr 2024 18:45:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17/04/2024 3:37 pm, Daniel P. Smith wrote:
> diff --git a/xen/common/gzip/gunzip.c b/xen/common/gzip/gunzip.c
> index 1b448d6e3655..8178a05a0190 100644
> --- a/xen/common/gzip/gunzip.c
> +++ b/xen/common/gzip/gunzip.c
> @@ -4,18 +4,25 @@
>  #include <xen/lib.h>
>  #include <xen/mm.h>
>  
> -static unsigned char *__initdata window;
> -
>  #define WSIZE           0x80000000U
>  
> -static unsigned char *__initdata inbuf;
> -static unsigned int __initdata insize;
> +struct gzip_state {
> +    unsigned char *window;
> +
> +    unsigned char *inbuf;
> +    unsigned int insize;
> +
> +    /* Index of next byte to be processed in inbuf: */
> +    unsigned int inptr;

perform_gunzip() passes in an unsigned long size, which means it's
truncated in this state.

Please at least make these size_t.  Life is too short to deal with the
fallout of this going wrong.

>  
> -/* Index of next byte to be processed in inbuf: */
> -static unsigned int __initdata inptr;
> +    /* Bytes in output buffer: */
> +    unsigned int outcnt;

See later, but I think the comment for outcnt is wrong.

>  
> -/* Bytes in output buffer: */
> -static unsigned int __initdata outcnt;
> +    long bytes_out;

See later, but I don't think this can be signed.  It's only used to
cross-check the header-reported length, which is done as an unsigned long.

> +
> +    unsigned long bb;      /* bit buffer */
> +    unsigned bk;           /* bits in bit buffer */

As this is effectively new code, "unsigned int" please.

> @@ -27,7 +34,7 @@ typedef unsigned char   uch;
>  typedef unsigned short  ush;
>  typedef unsigned long   ulg;
>  
> -#define get_byte()      (inptr < insize ? inbuf[inptr++] : fill_inbuf())
> +#define get_byte()     (s->inptr < s->insize ? s->inbuf[s->inptr++] : 
> fill_inbuf())

This is a bit weird:
> $ git grep -w get_byte common/gzip
> common/gzip/gunzip.c:40:#define get_byte()     (s->inptr < s->insize ?
> s->inbuf[s->inptr++] : fill_inbuf())
> common/gzip/inflate.c:224:#define NEXTBYTE(s)  ({ int v = get_byte();
> if (v < 0) goto underrun; (uch)v; })
> common/gzip/inflate.c:1131:    flags  = (uch)get_byte();

NEXTBYTE() gets an s parameter, but doesn't pass it to get_byte() and
instead relies on state always having the name 's' in scope.

I'd suggest turning get_byte() into a proper static function.  It will
be more readable that throwing extra ()'s around s in the existing macro.

Except...  fill_inbuf() is a fatal error anyway, so that can be folded
together to simplify the result.


> @@ -72,16 +78,16 @@ static __init void flush_window(void)
>      unsigned int n;
>      unsigned char *in, ch;
>  
> -    in = window;
> -    for ( n = 0; n < outcnt; n++ )
> +    in = s->window;
> +    for ( n = 0; n < s->outcnt; n++ )
>      {
>          ch = *in++;
>          c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
>      }
>      crc = c;
>  
> -    bytes_out += (unsigned long)outcnt;
> -    outcnt = 0;
> +    s->bytes_out += (unsigned long)s->outcnt;

I can't figure out why this was written this way, even originally.

AFAICT, outcnt doesn't even match it's comment.

/* Bytes in output buffer: */

It looks like it's the number of bytes in window.  This also matches the
"wp" define which I guess is "window position".

Anyway, irrespective of naming, the cast is useless so lets drop it
while modifying the line.

> diff --git a/xen/common/gzip/inflate.c b/xen/common/gzip/inflate.c
> index 512d9bf0ee2e..5735bbcf7eb4 100644
> --- a/xen/common/gzip/inflate.c
> +++ b/xen/common/gzip/inflate.c
> @@ -119,7 +119,7 @@ static char rcsid[] = "#Id: inflate.c,v 0.14 1993/06/10 
> 13:27:04 jloup Exp #";
>  
>  #endif /* !__XEN__ */
>  
> -#define slide window
> +#define slide s->window

Given only 8 uses, I'd expand this in place and drop the macro.

But, there's an entire comment block discussing "slide" which I think is
wrong for Xen's implementation.  That wants to go too, I think.

>  
>  /*
>   * Huffman code lookup table entry--this entry is four bytes for machines
> @@ -143,12 +143,12 @@ struct huft {
>  static int huft_build OF((unsigned *, unsigned, unsigned,
>                            const ush *, const ush *, struct huft **, int *));
>  static int huft_free OF((struct huft *));
> -static int inflate_codes OF((struct huft *, struct huft *, int, int));
> -static int inflate_stored OF((void));
> -static int inflate_fixed OF((void));
> -static int inflate_dynamic OF((void));
> -static int inflate_block OF((int *));
> -static int inflate OF((void));
> +static int inflate_codes OF((struct gzip_state *, struct huft *, struct huft 
> *, int, int));
> +static int inflate_stored OF((struct gzip_state *));
> +static int inflate_fixed OF((struct gzip_state *));
> +static int inflate_dynamic OF((struct gzip_state *));
> +static int inflate_block OF((struct gzip_state *, int *));
> +static int inflate OF((struct gzip_state *));

These are the only users of OF, and it turns out it's a no-op macro. 
This code would be far-less WTF-worthy if it was expanded as part of
patch 1.

>  
>  /*
>   * The inflate algorithm uses a sliding 32 K byte window on the uncompressed
> @@ -162,8 +162,8 @@ static int inflate OF((void));
>   * must be in unzip.h, included above.
>   */
>  /* unsigned wp;             current position in slide */
> -#define wp outcnt
> -#define flush_output(w) (wp=(w),flush_window())
> +#define wp s->outcnt
> +#define flush_output(s, w) (wp=(w),flush_window(s))

The combination of these two isn't safe, I don't think, especially as
one caller passes wp into flush_output().

There are very few users of wp, so expand it in line, and turn
flush_output() into a static function.  In particular, it will make ...

> @@ -560,8 +557,8 @@ static int __init inflate_codes(
>  
>  
>      /* make local copies of globals */
> -    b = bb;                       /* initialize bit buffer */
> -    k = bk;
> +    b = s->bb;                   /* initialize bit buffer */
> +    k = s->bk;
>      w = wp;                       /* initialize window position */

... this look less weird.  I'd suggest dropping the remark about "globals".


> @@ -1157,18 +1157,18 @@ static int __init gunzip(void)
>          error("Input has invalid flags");
>          return -1;
>      }
> -    NEXTBYTE(); /* Get timestamp */
> -    NEXTBYTE();
> -    NEXTBYTE();
> -    NEXTBYTE();
> +    NEXTBYTE(s); /* Get timestamp */
> +    NEXTBYTE(s);
> +    NEXTBYTE(s);
> +    NEXTBYTE(s);
>  
> -    (void)NEXTBYTE();  /* Ignore extra flags for the moment */
> -    (void)NEXTBYTE();  /* Ignore OS type for the moment */
> +    (void)NEXTBYTE(s);  /* Ignore extra flags for the moment */
> +    (void)NEXTBYTE(s);  /* Ignore OS type for the moment */

I'd drop these (void) casts. They're not different to the timestamp above.

> @@ -1224,13 +1224,13 @@ static int __init gunzip(void)
>          error("crc error");
>          return -1;
>      }
> -    if (orig_len != bytes_out) {
> +    if (orig_len != s->bytes_out) {
>          error("length error");
>          return -1;
>      }
>      return 0;
>  
> - underrun:   /* NEXTBYTE() goto's here if needed */
> + underrun:   /* NEXTBYTE(s) goto's here if needed */
>      error("out of input data");
>      return -1;

Just for completeness, this is a dead path because fill_inbuf() is
implemented as panic().

Fixing this wants putting on a todo list somewhere.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.