
Re: [Xen-devel] [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs



On 09/15/2017 02:07 PM, Wei Liu wrote:
> On Fri, Aug 25, 2017 at 05:43:38PM +0100, George Dunlap wrote:
>> Finding aggregate coverage for a set of test files means running each
>> afl-generated test case through the harness.  At the moment, this is
>> done by re-executing afl-harness-cov with each input file.  When a
>> large number of test cases have been generated, this can take a
>> significant amount of time; a recent test with 30k total files
>> generated by 4 parallel fuzzers took over 7 minutes.
>>
>> The vast majority of this time is taken up with 'exec', however.
>> Since the harness is already designed to loop over multiple inputs for
>> llvm "persistent mode", just allow it to take a large number of inputs
>> on the same command line when *not* running in llvm "persistent mode".  Then the
>> command can be efficiently executed like this:
>>
>>   ls */queue/id* | xargs $path/afl-harness-cov
>>
>> For the above-mentioned test on 30k files, the time to generate
>> coverage data was reduced from 7 minutes to under 30 seconds.
>>
>> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxx>
>> ---
>> CC: Ian Jackson <ian.jackson@xxxxxxxxxx>
>> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
>> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> CC: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>>  tools/fuzz/README.afl                             |  7 +++++++
>>  tools/fuzz/x86_instruction_emulator/afl-harness.c | 23 ++++++++++++++++-------
>>  2 files changed, 23 insertions(+), 7 deletions(-)
>>
>> diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
>> index 0d955b2687..e8c23d734c 100644
>> --- a/tools/fuzz/README.afl
>> +++ b/tools/fuzz/README.afl
>> @@ -49,6 +49,13 @@ generate coverage data.  To do this, use the target `afl-cov`:
>>  
>>      $ make afl-cov #produces afl-harness-cov
>>  
>> +In order to speed up the process of checking total coverage,
>> +`afl-harness-cov` can take several test inputs on its command-line;
>> +the speed-up effect should be similar to that of using afl-clang-fast.
>> +You can use xargs to do this most efficiently, like so:
>> +
>> +    $ ls queue/id* | xargs $path/afl-harness-cov
>> +
>>  NOTE: Please also note that the coverage instrumentation hard-codes
>>  the absolute path for the instrumentation read and write files in the
>>  binary; so coverage data will always show up in the build directory no
>> diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> index 51e0183356..79f8aec653 100644
>> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> @@ -16,6 +16,8 @@ int main(int argc, char **argv)
>>  {
>>      size_t size;
>>      FILE *fp = NULL;
>> +    int count = 0;
>> +    int max;
> 
> unsigned int.
> 
>>  
>>      setbuf(stdin, NULL);
>>      setbuf(stdout, NULL);
>> @@ -42,8 +44,7 @@ int main(int argc, char **argv)
>>              break;
>>  
>>          case '?':
>> -        usage:
>> -            printf("Usage: %s $FILE | [--min-input-size]\n", argv[0]);
>> +            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
>>              exit(-1);
>>              break;
>>  
>> @@ -54,21 +55,27 @@ int main(int argc, char **argv)
>>          }
>>      }
>>  
>> -    if ( optind == argc ) /* No positional parameters.  Use stdin. */
>> +    max = argc - optind;
>> +
>> +    if ( !max ) /* No positional parameters.  Use stdin. */
>> +    {
>> +        max = 1;
>>          fp = stdin;
>> -    else if ( optind != (argc - 1) )
>> -        goto usage;
>> +    }
>>  
>>      if ( LLVMFuzzerInitialize(&argc, &argv) )
>>          exit(-1);
>>  
>>  #ifdef __AFL_HAVE_MANUAL_CONTROL
>>      while ( __AFL_LOOP(1000) )
>> +#else
>> +    for( count = 0; count < max; count++ )
>>  #endif
>>      {
>>          if ( fp != stdin ) /* If not using stdin, open the provided file. */
>>          {
>> -            fp = fopen(argv[optind], "rb");
>> +            printf("Opening file %s\n", argv[optind]);
> 
> optind + count
> 
>> +            fp = fopen(argv[optind + count], "rb");
>>              if ( fp == NULL )
>>              {
>>                  perror("fopen");
>> @@ -87,7 +94,9 @@ int main(int argc, char **argv)
>>          if ( !feof(fp) || size > INPUT_SIZE )
>>          {
>>              printf("Input too large\n");
>> -            exit(-1);
>> +            if ( optind + 1 == argc )
> 
> What is this for?

Only call exit() when the file is too large if the current testcase is
the last one.

One of the "interesting testcases" that AFL finds is always "The file
was too large".  Without this clause, "afl-harness-cov [blah] queue/id*"
would process testcases until it found that one, then stop processing,
rather than processing the full set of them.

We could make this something like "if ( max == 1 )" (only exit if not in
'batch mode') if you think that would be better.
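
A minimal sketch of that "max == 1" alternative, to make the intended behaviour concrete.  This is not the patch itself: run_batch() is a hypothetical helper, the array of sizes stands in for the files the harness would actually open, and the INPUT_SIZE value here is a placeholder rather than the harness's real limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: placeholder for the harness's real INPUT_SIZE limit. */
#define INPUT_SIZE 4096

/*
 * Hypothetical helper, not part of the patch: walk a batch of input
 * sizes the way the harness loop walks its files.  Returns the number
 * of inputs that would be handed to the fuzz target, and sets *fatal
 * when the real harness would have called exit(-1).
 */
static unsigned int run_batch(const size_t *sizes, unsigned int max,
                              bool *fatal)
{
    unsigned int count, ran = 0;

    assert(sizes != NULL && fatal != NULL);
    *fatal = false;

    for ( count = 0; count < max; count++ )
    {
        if ( sizes[count] > INPUT_SIZE )
        {
            printf("Input too large\n");
            if ( max == 1 ) /* Not in batch mode: keep the old behaviour. */
            {
                *fatal = true;
                break;
            }
            continue; /* Batch mode: skip this input and carry on. */
        }

        ran++; /* The real harness would run the test case here. */
    }

    return ran;
}
```

With three inputs where the middle one is oversized, the sketch skips it and still processes the other two; with a single oversized input it reports the fatal condition, as the harness does today.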

 -George


 

