2009-08-29:

PHP getimagesize internals (part 2): GIF

php:security:easy
The time has come to write the second part of the PHP getimagesize story (yes, that means there was a first part *grin*). This time I'll focus more on what getimagesize is supposed to do: acquiring the image size from different file formats. I'll also write about why you should NOT use getimagesize to validate whether an uploaded file is really an image.

OK, let's start with a (not-so-)short list of formats supported by getimagesize (in no particular order):

- GIF
- JPEG
- PNG
- SWF
- SWC (from 4.3.0, requires statically linked zlib)
- PSD (from 4.0.6)
- BMP (from 4.0.6)
- TIFF (from 4.2.0)
- JPC (from 4.3.2)
- JP2 (from 4.3.2)
- JPX (from 4.3.2)
- JB2 (from 4.3.2)
- IFF (from 4.3.0)
- WBMP (from 4.3.2)
- XBM (from 4.3.2)
- ICO (from 5.3.0)

In this and the next few posts I'll write about the implementation of each format handling function (these functions are called php_handle_FORMATNAME). Today I'll start with GIF, and in the following days I'll write about JPEG, ICO, BMP and others.

The main target will be to create a byte sequence that passes through getimagesize WITHOUT ANY ERROR (I expect the function to return a proper array) but IS NOT a proper image. The second target is to create an image that has a different size than the one getimagesize reports (for example, getimagesize says the image is 10x10, but browsers in fact show it as 200x200; this will be very browser dependent).

1. GIF


The GIF format handling is placed in a function called php_handle_gif. To reach this function one first has to successfully pass through the php_getimagetype function, which additionally must return IMAGE_FILETYPE_GIF. For that to happen, the following condition must be met:

if (!memcmp(filetype, php_sig_gif, 3)) {
 return IMAGE_FILETYPE_GIF;
}

Here php_sig_gif is just "GIF", so only the first three bytes of the file are compared. With php_getimagetype out of the way, only php_handle_gif is left:

static struct gfxinfo *php_handle_gif (php_stream * stream TSRMLS_DC)
{
 struct gfxinfo *result = NULL;
 unsigned char dim[5];

 if (php_stream_seek(stream, 3, SEEK_CUR)) /* (1) */
   return NULL;

 if (php_stream_read(stream, dim, sizeof(dim)) != sizeof(dim)) /* (2) */
   return NULL;

 result = (struct gfxinfo *) ecalloc(1, sizeof(struct gfxinfo));
 result->width    = (unsigned int)dim[0] | (((unsigned int)dim[1])<<8);
 result->height   = (unsigned int)dim[2] | (((unsigned int)dim[3])<<8);
 result->bits     = dim[4]&0x80 ? ((((unsigned int)dim[4])&0x07) + 1) : 0;
 result->channels = 3; /* always */

 return result;
}


As one can see, this function is rather short (compared to a full GIF decoder implementation), and it works this way:
1. Skip 3 bytes (the if marked (1)) - this is the GIF version, which according to the GIF standards should be 87a or 89a; it is not validated at all, so we can put any 3 bytes there
2. Read 5 bytes (the if marked (2)) - this reads a part of the Logical Screen Descriptor (LSD) block
3. Set the width, height and bits fields according to the part of the LSD that was read
And that's all. There are no additional checks of whether the format is correct, or even whether the file contains anything beyond these first 11 bytes (3 for the signature, 3 for the version, 5 for the LSD fragment).
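
To make the steps above concrete, here is a minimal userland sketch of the same logic in PHP. It is not the PHP source, only my mirror of the two C snippets quoted above; the function name sketch_gif_info and the returned array keys are made up for this illustration.

```php
<?php
// Userland mirror of the checks described above (a sketch, not PHP's own code).
// sketch_gif_info() is a hypothetical name used only for this illustration.
function sketch_gif_info(string $bytes): ?array
{
    // 3 signature bytes + 3 version bytes + 5 LSD bytes are all that is read.
    if (strlen($bytes) < 11) {
        return null;
    }
    // php_getimagetype step: only the first 3 bytes are compared with "GIF".
    if (strncmp($bytes, "GIF", 3) !== 0) {
        return null;
    }
    // (1) the 3 version bytes at offsets 3..5 are skipped, never validated;
    // (2) 5 bytes of the Logical Screen Descriptor are read from offset 6.
    $f = unpack('vwidth/vheight/Cflags', substr($bytes, 6, 5));
    return [
        'width'    => $f['width'],   // little-endian 16-bit
        'height'   => $f['height'],  // little-endian 16-bit
        'bits'     => ($f['flags'] & 0x80) ? (($f['flags'] & 0x07) + 1) : 0,
        'channels' => 3,             // hard-coded, as in the C code
    ];
}
```

Note that nothing after offset 10 is ever touched, so any trailing garbage (or no trailing data at all) gives the same answer.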

So, as one can clearly see, in the case of the GIF format (and it will be shown that this holds for other formats too) getimagesize goes straight to the width/height/etc., ignoring everything else, hence it cannot be used to validate whether a stream of bytes is actually a legitimate image. And yes, some PHP coders tend to use that function this way.

How to create a non-image byte sequence that will pass through getimagesize without an error, returning some width/height as well? It's simple:

$data = "GIFxxx" . pack("vvC", 1024, 768, 0xff);
$a = getimagesize("data://text/plain;base64," . base64_encode($data));
var_dump($a);


The result:
array(7) {
 [0]=>
 int(1024)
 [1]=>
 int(768)
 [2]=>
 int(1)
 [3]=>
 string(25) "width="1024" height="768""
 ["bits"]=>
 int(8)
 ["channels"]=>
 int(3)
 ["mime"]=>
 string(9) "image/gif"
}


What about an image that has a different size than the one reported by getimagesize?

This is the place where we should ask ourselves: how much can we bend the GIF standard so that browser X would still show the image correctly (in our own definition of correctness)? The answer depends on browser X's implementation of the GIF decoder, of course.
There is also a second, more precise question: how to make getimagesize return, for example, 10x10 while the browser displays a 256x192 image?

In case of the GIF format the answer to the second question is simple.

As one may know, a GIF image is composed of one global logical image (described in the LSD block, which is what getimagesize reads) and a number of rectangular "physical" images ("physical" as in "actually having some content data") which are "rendered" onto the logical image space at decoding. What is important to us is that each "physical" image has its own size descriptor, which is normally equal to or smaller than the logical image size. In the simplest case the GIF file consists of one "physical" image which is the size of the logical image and has its top-left corner placed at (0,0), so that it fills the whole logical image.
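
To see both sizes side by side, the block layout can be walked by hand. The following is only a sketch under my reading of the GIF block structure (header, LSD, optional Global Color Table, then extensions and Image Descriptors); the function name sketch_gif_sizes and the 'logical'/'frame' keys are hypothetical, and it stops at the first "physical" image instead of decoding anything.

```php
<?php
// Hypothetical helper: returns the logical screen size and the size of the
// first Image Descriptor, or null if the block structure cannot be followed.
function sketch_gif_sizes(string $gif): ?array
{
    if (strlen($gif) < 13 || strncmp($gif, "GIF", 3) !== 0) {
        return null;
    }
    // Logical Screen Descriptor starts at offset 6: width, height, flags,
    // then background index and aspect ratio (7 bytes in total).
    $lsd = unpack('vwidth/vheight/Cflags', substr($gif, 6, 5));
    $pos = 13;
    if ($lsd['flags'] & 0x80) {
        // Skip the Global Color Table: 3 bytes per entry, 2^(N+1) entries.
        $pos += 3 * (1 << (($lsd['flags'] & 0x07) + 1));
    }
    $len = strlen($gif);
    while ($pos < $len) {
        $type = ord($gif[$pos]);
        if ($type === 0x2C) {
            // Image Descriptor: separator, left, top, width, height, flags.
            if ($pos + 10 > $len) {
                return null;
            }
            $id = unpack('vleft/vtop/vwidth/vheight', substr($gif, $pos + 1, 8));
            return [
                'logical' => [$lsd['width'], $lsd['height']],
                'frame'   => [$id['width'], $id['height']],
            ];
        }
        if ($type !== 0x21) {
            return null; // trailer (0x3B) or something unexpected
        }
        // Extension: introducer, label, then sub-blocks until a zero-length one.
        $pos += 2;
        while ($pos < $len && ($size = ord($gif[$pos])) !== 0) {
            $pos += 1 + $size;
        }
        $pos += 1; // step over the block terminator
    }
    return null;
}
```

For a well-behaved single-frame GIF the two pairs are equal; a mismatch between 'logical' and 'frame' is exactly the situation discussed in the rest of this post.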

Let's create a GIF image that has 10x10 written in the LSD block and 256x192 in a "physical" Image Descriptor block (the easiest way to do it is to take a 256x192 GIF file and modify the LSD with a hex editor). With a little luck some browsers may resize the logical image to the size of the largest "physical" image (or do something similar).
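
If you prefer a script to a hex editor, the same modification is a four-byte overwrite. This is a minimal sketch; sketch_patch_lsd is a hypothetical helper name, and the file names in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: overwrite the logical screen size stored at offsets
// 6..9 of a GIF, leaving every Image Descriptor untouched.
function sketch_patch_lsd(string $gif, int $width, int $height): string
{
    if (strlen($gif) < 13 || strncmp($gif, "GIF", 3) !== 0) {
        throw new InvalidArgumentException('input does not start with a GIF header');
    }
    // Width and height are little-endian 16-bit values, hence pack('vv').
    return substr_replace($gif, pack('vv', $width, $height), 6, 4);
}

// Usage (placeholder file names):
// file_put_contents('hack.gif', sketch_patch_lsd(file_get_contents('cat.gif'), 10, 10));
```

Since getimagesize only looks at the LSD, it should report 10x10 for the patched file, while the Image Descriptor still says 256x192.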

An example modified GIF looks like this (dear reader: if the image below is displayed as 256x192, it means that your browser has acted as we predicted, resizing the logical image):

[image: LSD smaller than ID hack]


(The image is of course taken from http://icanhascheezburger.com/ ;>)

As it turns out (empirically), the above image is shown as 256x192 by Firefox, Opera and Google Chrome. On the other hand, Konqueror crops the "physical" image to 10x10, and IE refuses to display it (however, it reserves a 256x192 rectangle on the page).

The getimagesize returns the following info about the above image:

array(7) {
 [0]=>
 int(10)
 [1]=>
 int(10)
 [2]=>
 int(1)
 [3]=>
 string(22) "width="10" height="10""
 ["bits"]=>
 int(8)
 ["channels"]=>
 int(3)
 ["mime"]=>
 string(9) "image/gif"
}


And that would be it for today.

P.S. Answering the yet-unstated question: yeees, there are a lot of PHP scripts out there that use getimagesize to check if the uploaded image isn't too big. Guess they don't work too well, now do they? ;>

UPDATE:
As said by TeMPOraL on the Polish side of the mirror (the translation is mine):
explorer.exe in Windows Vista does not show the size of the test GIF. IrfanView (4.23) treats it as 256x192 both in size info and display. Also the SE K800i phone behaves the same way.

UPDATE 2:
I've done a couple more tests on some apps:
- gqview has trouble deciding whether to display the whole image or just a part of it; when run from the command line (gqview image.gif) it shows a mostly-black 256x192 rectangle with a 10x10 image rendered in the top-left corner, however in the preview panel one can see the whole image, and if you select image.gif from the preview panel it also renders as 256x192 (now that's interesting behavior ;>)
- GIMP imho behaved in the most professional way possible - it created a 10x10 image with a 256x192 layer, reading the whole data into the layer (but displaying only 10x10 at a time); 1:0 for GIMP ;>
- KDE Dolphin (something like explorer, but for KDE) shows 10x10
- so does Gwenview
- OpenOffice.org shows 256x192, just like Firefox and Opera

Comments:

2009-08-29 10:07:13 = plusvic
{
Safari also shows the image as 256x192 ;)
}
2009-08-30 10:05:15 = j00ru
{
Hi,
I believe there's a minor typo in the supported formats' list ("do" has not been correctly translated).
Anyway, interesting post!
}
2009-08-31 21:58:18 = Gynvael Coldwind
{
@plusvic
Thx ;>

@j00ru
Shh, maybe no one will notice ;>
}
