Friday, January 10, 2014

FFmpeg and a thousand fixes



At Google, security is a top priority - not only for our own products, but across the entire Internet. That’s why members of the Google Security Team and other Googlers frequently perform audits of software and report the resulting findings to the respective vendors or maintainers, as shown in the official “Vulnerabilities - Application Security” list. We also try to employ the extensive computing power of our data centers in order to solve some of the security challenges by performing large-scale automated testing, commonly known as fuzzing.

One internal fuzzing effort that we have been running continuously for the past two years targets FFmpeg, a large cross-platform solution, written in C, to record, convert and stream audio and video. It is used in multiple applications and software libraries such as Google Chrome, MPlayer, VLC and xine. We started relatively small, with trivial mutation algorithms, some 500 cores and input media samples gathered from readily available sources such as the samples.mplayerhq.hu sample base and the FFmpeg FATE regression testing suite. Later on, we moved to more complex and effective mutation methods, 2000 cores and an input corpus extended with sample files that improve the overall code coverage.
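
To give a sense of what a "trivial mutation algorithm" can look like, here is a minimal Python sketch. It is only an illustration and not the fuzzer used in this effort: the function name `mutate`, the `ratio` parameter and its default value are assumptions made for the example.

```python
import random


def mutate(sample: bytes, rng: random.Random, ratio: float = 0.01) -> bytes:
    """Return a copy of `sample` with a small fraction of bytes overwritten.

    Hypothetical helper: `ratio` is the approximate share of bytes to replace
    with random values; at least one byte is touched for non-empty input.
    """
    if not sample:
        return sample
    data = bytearray(sample)
    count = max(1, int(len(data) * ratio))
    for _ in range(count):
        pos = rng.randrange(len(data))   # pick a random offset in the file
        data[pos] = rng.randrange(256)   # overwrite it with a random byte
    return bytes(data)
```

Each mutated sample would then be written to disk and handed to the program under test, and any crash recorded together with the input that triggered it. Passing an explicitly seeded `random.Random` makes a given mutation reproducible.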

Following more than two years of work, we are happy to announce that the FFmpeg project has incorporated more than a thousand fixes to bugs (including some security issues) that we have discovered in the project so far:

$ git log | grep Jurczyk | grep -c Coldwind
1120

This event clearly marks an important milestone in our ongoing fuzzing effort.

FFmpeg robustness and security have clearly improved over time. When we started the fuzzing process and had initial results, we contacted the project maintainer - Michael Niedermayer - who submitted the first fix on the 24th of January, 2012 (see commit c77be3a35a0160d6af88056b0899f120f2eef38e). Over the two years since then, we have carried out several dozen fuzzing iterations (each typically resulting in fewer crashes than the previous ones), identifying bugs of a number of different classes:
  • NULL pointer dereferences, 
  • Invalid pointer arithmetic leading to SIGSEGV due to unmapped memory access, 
  • Out-of-bounds reads and writes to stack, heap and static-based arrays, 
  • Invalid free() calls, 
  • Double free() calls over the same pointer, 
  • Division errors, 
  • Assertion failures, 
  • Use of uninitialized memory. 
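
Several of these classes show up as a fatal signal in the crashing process, which allows a rough first triage of results. The Python sketch below illustrates the idea under stated assumptions: the `classify_exit` helper and the hint strings are hypothetical, and the only behavior relied on is that `subprocess` reports a child killed by signal N with a return code of -N on POSIX systems. Some classes, such as use of uninitialized memory, generally do not crash on their own and need extra instrumentation to detect.

```python
import signal

# Rough, illustrative mapping from fatal signals to the bug classes above.
SIGNAL_HINTS = {
    signal.SIGSEGV: "invalid memory access (e.g. NULL dereference, out-of-bounds access)",
    signal.SIGFPE: "arithmetic error (e.g. integer division by zero)",
    signal.SIGABRT: "abort (e.g. failed assertion, invalid or double free caught by the allocator)",
}


def classify_exit(returncode: int) -> str:
    """Describe a child's exit status as reported by subprocess on POSIX."""
    if returncode >= 0:
        return "no fatal signal"
    try:
        sig = signal.Signals(-returncode)  # -N means "killed by signal N"
    except ValueError:
        return "unknown signal"
    return f"{sig.name}: {SIGNAL_HINTS.get(sig, 'unclassified')}"
```

For example, the `returncode` attribute of the result of `subprocess.run(...)` can be passed directly to `classify_exit`.
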
We have simultaneously worked with the developers of Libav, an independent fork of FFmpeg, in order to bring both projects to an equally high level of robustness and security. Today, Libav is at 413 fixes and the library is slowly but surely catching up with FFmpeg.

We are continuously improving our corpus and fuzzing methods and will continue to work with both FFmpeg and Libav to ensure the highest quality of software that millions of users rely on through multiple media players. Until we can declare both projects "fuzz clean", we recommend that people refrain from using either of them to process untrusted media files. When processing such files is absolutely required, use privilege separation on your PC or in your production environment.

Of course, we would not be able to do this without the hard work of all the developers involved in the fixing process. If you are interested in the effort, please keep an eye on the master branches for commits marked as "Found by Mateusz "j00ru" Jurczyk and Gynvael Coldwind" and watch out for new stable versions of the software packages.

For more details, see the “FFmpeg and a thousand fixes” posts on the authors’ personal blogs.