Monday, October 8, 2007

Auditing open source software



Google encourages its employees to contribute back to the open source community, and there is no exception in Google's Security Team. Let's look at some interesting open source vulnerabilities that were located and fixed by members of Google's Security team. It is interesting to classify and aggregate the code flaws leading to the vulnerabilities, to see if any particular type of flaw is more prevalent.
  1. JDK. In May 2007, I released details on an interesting bug in the ICC profile parser in Sun's JDK. The bug is particularly interesting because it could be exploited by an evil image. Most previous JDK bugs involve a user having to run a whole evil applet. The key parts of code which demonstrate the bug are as follows:

    TagOffset = SpGetUInt32 (&Ptr);
    if (ProfileSize < TagOffset)
      return SpStatBadProfileDir;
    ...
    TagSize = SpGetUInt32 (&Ptr);
    if (ProfileSize < TagOffset + TagSize)
      return SpStatBadProfileDir;
    ...
    Ptr = (KpInt32_t *) malloc ((unsigned int)numBytes+HEADER);

    Both TagSize and TagOffset are untrusted unsigned 32-bit values pulled out of images being parsed. They are added together, causing a classic integer overflow condition and the bypass of the size check. A subsequent additional integer overflow in the allocation of a buffer leads to a heap-based buffer overflow.

  2. gunzip. In September 2006, my colleague Tavis Ormandy reported some interesting vulnerabilities in the gunzip decompressor. They were triggered when an evil compressed archive is decompressed. A lot of programs will automatically pass compressed data through gunzip, making it an interesting attack. The key parts of the code which demonstrate one of the bugs are as follows:

    ush count[17], weight[17], start[18], *p;
    ...
    for (i = 0; i < (unsigned)nchar; i++) count[bitlen[i]]++;

    Here, the stack-based array "count" is indexed by values in the "bitlen" array. These values are under the control of data in the incoming untrusted compressed data, and were not checked for being within the bounds of the "count" array. This led to corruption of data on the stack.


  3. libtiff. In August 2006, Tavis reported a range of security vulnerabilities in the libtiff image parsing library. A lot of image manipulation programs and services will be using libtiff if they handle TIFF format files. So, an evil TIFF file could compromise a lot of desktops or even servers. The key parts of the code which demonstrate one of the bugs are as follows:

    if (sp->cinfo.d.image_width != segment_width ||
        sp->cinfo.d.image_height != segment_height) {
      TIFFWarningExt(tif->tif_clientdata, module,
        "Improper JPEG strip/tile size, expected %dx%d, got %dx%d",

    Here, a TIFF file containing a JPEG image is being processed. In this case, both the TIFF header and the embedded JPEG image contain their own copies of the width and height of the image in pixels. This check above notices when these values differ, issues a warning, and continues. The destination buffer for the pixels is allocated based on the TIFF header values, and it is filled based on the JPEG values. This leads to a buffer overflow if a malicious image file contains a JPEG with larger dimensions than those in the TIFF header. Presumably the intent here was to support broken files where the embedded JPEG had smaller dimensions than those in the TIFF header. However, the consequences of larger dimensions that those in the TIFF header had not been considered.

We can draw some interesting conclusions from these bugs. The specific vulnerabilities are integer overflows, out-of-bounds array accesses and buffer overflows. However, the general theme is using an integer from an untrusted source without adequately sanity checking it. Integer abuse issues are still very common in code, particular code which is decoding untrusted binary data or protocols. We recommend being careful using any such code until it has been vetted for security (by extensive code auditing, fuzz testing, or preferably both). It is also important to watch for security updates for any decoding software you use, and keep patching up to date.