How to break everything by fuzz testing

155 points by MD87 5 years ago | 37 comments

TwoBit 5 years ago
My favorite personal fuzzing story is from 1987 when a friend said his x86 graphics drawing program was solid, and I said OK and smashed both hands on the keyboard and it insta-crashed.
- MaxBarraclough 5 years ago
  Mark Twain put it best: The weakest of all weak things is a virtue which has not been tested in the fire.
thechao 5 years ago
> The fix for this was fairly straightforward - I just made the library keep a record of the previously visited IFDs and bail out if it found a loop.
If you just want to detect loops, keep a “+1” pointer that you use to increment through the data; also, keep a “+2” pointer that is advanced twice each time your “+1” pointer advances: either your “+2” pointer hits the end, or it becomes equal to your “+1” pointer — in which case you have a loop.
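A minimal sketch of that two-pointer check in Python, for anyone who wants to try it. The `successor` callable and the offset tables in the usage note are made up for illustration (this is not the article's library), and it assumes the end of the chain is signalled by `None`.

```python
def has_cycle(start, successor):
    """Return True if following `successor` from `start` ever loops.

    `successor(node)` is a hypothetical callable that returns the next
    node in the chain, or None when the chain ends.
    """
    slow = fast = start
    while True:
        # Advance the "+2" pointer two steps; reaching the end means no loop.
        for _ in range(2):
            fast = successor(fast)
            if fast is None:
                return False
        # Advance the "+1" pointer one step. It cannot hit the end here,
        # because the fast pointer has already passed this way.
        slow = successor(slow)
        if slow == fast:
            return True
```

For example, `has_cycle(8, {8: 20, 20: 36}.get)` is False, while `has_cycle(8, {8: 20, 20: 36, 36: 20}.get)` is True. Compared with recording every visited IFD, this needs constant memory, at the cost of following each link more than once.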
- colatkinson 5 years ago
  Also known as the "Tortoise and Hare Algorithm!"
  https://en.wikipedia.org/wiki/Cycle_detection#Floyd's_Tortoi...
- andrepd 5 years ago
  That's quite clever!
bhaak 5 years ago
If you haven't read the whole article, you should do that.
There's a funny plot twist at the end.
3 bug reports by discovering 1 bug. What a bargain! :-D
oweiler 5 years ago
In university we had to write a simple fuzzer which extracted options from man pages and ran the corresponding command with randomized but valid options. Didn't take long until we found the first bug in one of the tested commands.
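A rough sketch of what such a fuzzer might look like in Python. This is a guess at the shape of the assignment, not the original code: the regular expression, the function names, and the "killed by a signal or hung" crash criterion are all assumptions, and it ignores options that take arguments.

```python
import random
import re
import subprocess

# Matches short options like -a and long options like --block-size,
# but not dashes inside ordinary words such as "non-empty".
OPTION_RE = re.compile(r"(?<![\w-])(-[A-Za-z0-9]|--[A-Za-z0-9][\w-]*)")

def extract_options(man_text):
    """Collect option names from man-page lines that start with a dash."""
    options = set()
    for line in man_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("-"):
            options.update(OPTION_RE.findall(stripped))
    return sorted(options)

def random_invocation(command, options, rng):
    """Build an argv list with a random subset of the known options."""
    count = rng.randint(0, len(options))
    return [command] + rng.sample(options, count)

def looks_like_crash(argv, timeout=5):
    """Run the command once and report whether it hung or died on a signal."""
    try:
        proc = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return True  # assumption: a hang counts as a finding
    # On POSIX, a negative return code means the child was killed by a signal.
    return proc.returncode < 0
```

The man-page text could come from something like `subprocess.run(["man", "ls"], capture_output=True, text=True).stdout`. Random options can be destructive for some commands, so a scratch directory or container is a sensible place to run this.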
- elktea 5 years ago
  that's a good assignment - extra marks for reporting the bugs?
Psyladine 5 years ago
"A QA engineer walks into a bar. Orders a beer. Orders 0 beers. Orders 99999999999 beers. Orders a lizard. Orders -1 beers. Orders a ueicbksjdhd.
First real customer walks in and asks where the bathroom is. The bar bursts into flames, killing everyone."
pfdietz 5 years ago
A very funny thing about fuzzing is how random input testing used to be so looked down upon by the software testing community. Read old (1970s) testing books and you'll see comments like "random testing is the worst kind of testing". I still saw this even as recently as a decade ago.
- praptak 5 years ago
  Fuzzing is not random testing though. It's directed random testing.
  - pfdietz 5 years ago
    The original fuzzing was as random + black box as it gets.
    - UncleMeat 5 years ago
      Yes, and things like coverage guided fuzzing have completely revolutionized things. Prior to directed fuzzing, it was okay but largely unimpressive. Now it blazes through code structures that were previously used as motivating examples for symbolic execution. It is a meaningfully different technique today.
- m000 5 years ago
  That piece of advice was probably valid in the 1970s: The computers were far too slow and far too expensive for any kind of random testing to make sense. Fuzzing became popular when multi-core CPUs became commonplace and RAM more affordable.
- dnautics 5 years ago
  You really shouldn't do random testing. Fuzzing is better, but property testing (fuzzing with shrinkage) is even better.
  - pfdietz 5 years ago
    The original prejudice was against any sort of randomness in testing. Manually constructed tests were seen as superior. That may have been true when computer time was dear, but the bias persisted into the latest edition of a well known book on software testing, published after (for example) Csmith had been released.
snazz 5 years ago
Fuzzing is fun! If you're doing it on your personal computer (as opposed to a cloud VM somewhere), I'd suggest putting the testcase output directory on a spinning-rust hard drive that you don't care about instead of your (presumably much more expensive) internal SSD. It creates an impressive number of disk writes.
I've been thinking about fuzzing JavaScript code (not attacking V8 or SpiderMonkey, but the JS code itself). While JavaScript might not be vulnerable to buffer overflows and format string vulnerabilities, it certainly can have logic issues, unhandled exceptions, and DoS vulnerabilities that are exposed by fuzzing.
I took a look at the most-depended-on NPM packages. I'll try writing test harnesses on functions that take user input. Does anyone have any ideas for packages that could use some fuzz testing?
- segfaultbuserr 5 years ago
  > I'd suggest putting the testcase output directory on a spinning-rust hard drive that you don't care about instead of your (presumably much more expensive) internal SSD.
  Even better, use the /dev/shm RAM disk if your memory is more than enough (although you should probably create an additional RAM disk with a size limit if you don't want a runaway program to accidentally drain your RAM). On a modern development machine, taking 2 GiB out for testcase output is usually not a problem, and there's often a significant acceleration.
- vimslayer 5 years ago
  If you are interested in finding possible security holes, you could try finding prototype pollution bugs in basically any library that somehow handles user input. Utility libraries like lodash and underscore, argument parsers like yargs, minimist, others like moment, handlebars, DB/ODM tools like Mongoose, Knex, etc.
  You'd look for code where input would be able to modify Object.prototype (or I guess some other constructor's prototype) unintentionally (and it's basically always unintentional).
  Example of such vulnerability found in Minimist https://snyk.io/vuln/SNYK-JS-MINIMIST-559764
  These issues are a constant pain in the JS ecosystem and you wouldn't be the only one using fuzzing to try to find them.
- someguyorother 5 years ago
  If the files you are fuzzing are small, then you could just create a couple of gigs of tmpfs ramdisk with something like "mount -t tmpfs -o size=2G none /mnt/somewhere" and put your fuzz directory on there.
  Then the impressive number of writes are all to memory, which should pose no problem.
jansan 5 years ago
It can be difficult to evaluate the result of a test. We solved this by using an existing (of course inferior) library that uses a different algorithm for the same task (different algorithm so it fails at different tests). We would run the same test with both libraries and compare the results. If they were different, we had to find a way to decide which library failed or maybe evaluate those failed cases manually.
jansan 5 years ago
I have used fuzz testing to make my Bezier intersection library more robust against edge cases. The test would try to find all intersections between a random pair of curves that lie within certain bounds (you probably know that there can be up to nine intersections between two cubic Bezier curves). At the beginning it failed at approx. 1 in 100,000 randomly generated curve pairs; now I am at a point where there is not a single failure in a billion tests.
My problem was how to decide if a test failed, because this would not be a crash, but failing to find an intersection between the curves. So I compared against an existing library which uses a completely different algorithm, which means the other library fails at other test cases than my own library. If the results in a test case were different, one must have failed and by testing against the found intersections I could easily decide which one.
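The same compare-two-implementations idea can be sketched on a much smaller problem than Bezier intersection. The Python below swaps in integer square roots purely for illustration: two implementations using different algorithms are run on the same inputs, and when they disagree, each answer is checked against the definition to decide which one failed. All names here are hypothetical.

```python
import math
import random

def isqrt_float(n):
    """Implementation A: goes through floating point, so it can lose precision."""
    return int(math.sqrt(n))

def isqrt_newton(n):
    """Implementation B: integer-only Newton iteration."""
    if n == 0:
        return 0
    x = 1 << ((n.bit_length() + 1) // 2)  # initial guess, never below sqrt(n)
    while True:
        y = (x + n // x) // 2
        if y >= x:
            return x
        x = y

def is_valid(n, r):
    """Independent check: r is the floor of sqrt(n)."""
    return r >= 0 and r * r <= n < (r + 1) * (r + 1)

def differential_test(inputs):
    """Return (input, [names of failing implementations]) for each disagreement."""
    failures = []
    for n in inputs:
        a, b = isqrt_float(n), isqrt_newton(n)
        if a != b:
            results = (("float", a), ("newton", b))
            culprits = [name for name, r in results if not is_valid(n, r)]
            failures.append((n, culprits))
    return failures

if __name__ == "__main__":
    rng = random.Random(1234)
    sample = [rng.randrange(2**64) for _ in range(100_000)]
    for n, culprits in differential_test(sample)[:5]:
        print(n, culprits)
```

As with the curve libraries, the comparison only flags a disagreement. Deciding which side is wrong relies on having a cheap way to verify a claimed answer, here the `is_valid` check.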
luord 5 years ago
I love case studies like this. They are the best way to show why the subject matter at hand is important and worth investing time into.
ToFab123 5 years ago
Interesting. Are there any fuzzing libraries for C#?
- thewebcount 5 years ago
  Can you call C# from C? If so, then you can just use any C fuzzing library and have it call your C# code. I do this with C++ and Objective-C using clang's libFuzzer. You write a single C function that takes a pointer to a buffer and a length, and pass the data to whatever you want. I just write a C wrapper that calls my Objective-C or C++ functions with the data.
  - snazz 5 years ago
    Doesn't libFuzzer only require `extern "C" int LLVMFuzzerTestOneInput(...` to fuzz C++ code? What else does your C wrapper do beyond that? Google puts their fuzz tests right alongside the rest of the Chromium source code, which is C++.
- gnud 5 years ago
  There's a "port" of AFL - https://github.com/Metalnem/sharpfuzz
mebr 5 years ago
My summary of this blog post: plenty of random input data can reveal code bugs. The kind of bugs that would probably take a lot of time to think of and write unit tests for, in advance.
- userbinator 5 years ago
  > The kind of bugs that would probably take a lot of time to think of and write unit tests for, in advance.
  Would it? Maybe it's because I've had a "low-level upbringing", but whenever I'm writing parsing code for a file format, "assume any byte of data you read can have any value" is the norm. The rest of it follows from there.
  - mebr 5 years ago
    Let's go one step higher: keeping track of the state with a state machine. When designing/coding with correctness in mind, I try to stay focused and not think of edge cases. Otherwise I will end up spending more time coming up with edge cases and what can go wrong. I'm not lazy, I'm almost certain of that. But I do feel time is a limited resource and want to add more value per hour spent working. Maybe this is more a case of: if it can be automated, then automate it.
  - jra_samba_org 5 years ago
    Yeah, I've gotten to the point where I can't do any arithmetic on any values without immediately adding integer wrap tests afterwards.
    - mcswell 5 years ago
      Reminds me of using a slide rule. You normally push the inner part (the C scale) to the right, line up the 1 on the C scale with the first number you're multiplying on the D scale, then look on the C scale for the second number you're multiplying, and read the result off the D scale immediately below that.
      But when the result is more than 10, you've wrapped: your answer is off the D scale. So now you have to push the inner part back to the left, and line up the 10 (usually marked as 1, at the right-hand end) on the C scale with the first number on the D scale. And remember to add 1 to the exponent.
      I've seen slide rules where the D scale goes slightly beyond 10 (like 10.1), so if the result was just a tiny bit over 10, you wouldn't need to wrap.
    - MaxBarraclough 5 years ago
      Something the C# language gets right is that it can be configured to throw on overflow, or to wrap-around.
  - a1369209993 5 years ago
    Although the really nasty bugs (like this one) happen when the individual bytes are all sensible, but the meta-level structure of the file is toxic.
WrtCdEvrydy 5 years ago
It just gets worse and worse....