Week 10

Updated on 28 Dec 2018

ZIP Files

A zip file is a special type of file that acts as a container for other files and/or directories. It may contain a single file or thousands, and that matters because it is much easier to handle one file than many.

Windows Explorer allows you to create zipped folders, and view / extract the contents of one; whether you created the zip file or not.

PHP has a built-in class, ZipArchive, for creating zip files, reading their contents and extracting the files.

Question:

  • Do you know what a zip file is, and if so, how you might use it in web-programming?

ZIP File Basics

Reading, writing and extracting all follow the same basic steps:

  1. Create a Zip object
  2. Open the Zip object (with the zip filename)
  3. Do our stuff
  4. Close the Zip object

$zip = new ZipArchive();

if ($zip->open('test1.zip', ZipArchive::CREATE) !== true) {
    exit("cannot open file");
}

//--do our stuff

$zip->close();

Okay, so let’s have a look at what we are doing. The first step is creating a ZipArchive object, which is straightforward. You should recognize the code (using new) as being object-oriented.

$zip = new ZipArchive();

The next step is to open a zip file. In this example we are actually creating a zip file, which you probably could deduce from the second parameter in the open method.

$zip->open('test1.zip', ZipArchive::CREATE);

There are 4 predefined Zip constants that can be used for the second parameter.

Constant               Meaning
ZipArchive::OVERWRITE  Always start a new archive; this mode will overwrite the file if it already exists.
ZipArchive::CREATE     Create the archive if it does not exist.
ZipArchive::EXCL       Error if the archive already exists.
ZipArchive::CHECKCONS  Perform additional consistency checks on the archive, and error if they fail.

Using the OVERWRITE and CREATE constants is only necessary if you wish to write to a zip file that you have to create. If the zip file already exists, and you want to add more files to it (or you want to read the contents / extract the files) then you don’t need to pass anything to the second parameter.
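As a rough sketch of that difference, the snippet below creates an archive with ZipArchive::CREATE and then re-opens it with no second argument to add another entry. The temp-directory path and the entry names are made up for the illustration, and addFromString (which adds an entry from a string rather than from a file on disk) is used only to keep the example self-contained.

```php
<?php
// Hypothetical location for the demo archive; any writable path would do.
$zipPath = sys_get_temp_dir() . '/open_modes_' . uniqid() . '.zip';

// 1) The archive does not exist yet, so we ask for it to be created.
$zip = new ZipArchive();
if ($zip->open($zipPath, ZipArchive::CREATE) !== true) {
    throw new RuntimeException('cannot create ' . $zipPath);
}
$zip->addFromString('first.txt', 'added when the archive was created');
$zip->close();

// 2) The archive exists now, so no second argument is needed to add to it.
$zip = new ZipArchive();
if ($zip->open($zipPath) !== true) {
    throw new RuntimeException('cannot re-open ' . $zipPath);
}
$zip->addFromString('second.txt', 'added to the existing archive');
$zip->close();

// 3) Re-open once more, this time only to count the entries.
$zip = new ZipArchive();
if ($zip->open($zipPath) !== true) {
    throw new RuntimeException('cannot re-open ' . $zipPath);
}
$entryCount = $zip->numFiles;
$zip->close();

echo 'Entries in archive: ' . $entryCount . PHP_EOL;
```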

Many things can go wrong when you are dealing with the file system, and a compressed archive adds a few more. So it is no surprise that open reports whether it succeeded: it returns true on success, and otherwise one of 9 different error codes.

The PHP documentation for ZipArchive::open lists the 9 error codes that can be returned. The documentation for the predefined constants explains what those error codes mean.
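To make the error codes a little more concrete, here is a minimal sketch that turns a few of them into readable messages. The function name describeZipOpenError and the message wording are my own inventions for this example (they are not part of PHP); only the ZipArchive::ER_* constants come from the class itself, and the file name is one that is assumed not to exist.

```php
<?php
// Hypothetical helper: maps some of the codes returned by open() to text.
function describeZipOpenError(int $code): string
{
    switch ($code) {
        case ZipArchive::ER_EXISTS:
            return 'The file already exists.';
        case ZipArchive::ER_NOENT:
            return 'No such file.';
        case ZipArchive::ER_NOZIP:
            return 'The file is not a zip archive.';
        case ZipArchive::ER_OPEN:
            return 'The file could not be opened.';
        case ZipArchive::ER_READ:
            return 'The file could not be read.';
        default:
            return 'Unrecognised error code: ' . $code;
    }
}

// Assumed not to exist, and opened without flags so nothing is created.
$missingPath = sys_get_temp_dir() . '/no_such_archive_' . uniqid() . '.zip';

$zip = new ZipArchive();
$result = $zip->open($missingPath);

if ($result !== true) {
    echo describeZipOpenError($result) . PHP_EOL;
} else {
    $zip->close();
}
```

The remaining codes from the documentation could be added as further case labels in the same way.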

The last part of our example closes the zip file. This is necessary to ‘save’ any changes that we’ve done.

$zip->close();

Questions:

  • When we created test1.zip, whereabouts is it being created?
  • What sort of error codes do you think open might return?
  • Why is the error check with open using !== operator instead of != ? What is the difference?

Writing to a ZIP File

The previous example is easy to expand upon for writing to a zip file. We simply replace the ‘do our stuff’ comment with the code below.

$zip->addFile('files/pic1.jpg', 'pic1_z.jpg');

The first parameter is the file that we are adding. The path is relative, so it is resolved against the current working directory, which for a web request is normally the directory of the running script.

www_root/
  zip_add.php
  ...
  files/
    pic1.jpg

The second parameter is the name that the file will take inside the zip archive. In most cases you’d probably keep the same name, but in my example I’ve chosen a different name. You can even specify a directory location inside the archive where the file should be placed! For example:

$zip->addFile('files/pic1.jpg',
              'myDir/pic1_z.jpg');

If you have multiple files to add, simply call addFile once for each file.

The addFile method returns a Boolean: true on success or false on error.
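Putting those pieces together, the sketch below adds several files in a loop and checks the return value of addFile each time. The file names (a.txt, b.txt) and the temporary working directory are assumptions made so the example can run anywhere; in the lecture example you would point at files/pic1.jpg instead.

```php
<?php
// Build a throwaway directory with two small files to add (hypothetical names).
$workDir = sys_get_temp_dir() . '/zip_add_demo_' . uniqid();
mkdir($workDir . '/files', 0777, true);
file_put_contents($workDir . '/files/a.txt', 'first file');
file_put_contents($workDir . '/files/b.txt', 'second file');

$zip = new ZipArchive();
if ($zip->open($workDir . '/demo.zip', ZipArchive::CREATE) !== true) {
    throw new RuntimeException('cannot open file');
}

$added = array();
foreach (array('a.txt', 'b.txt') as $name) {
    $source = $workDir . '/files/' . $name;   // file on disk
    $entry  = 'myDir/' . $name;               // name inside the archive

    // addFile returns true on success or false on error.
    if ($zip->addFile($source, $entry)) {
        $added[] = $entry;
    } else {
        echo 'Could not add ' . $source . PHP_EOL;
    }
}

// Closing the archive is what actually writes it to disk.
$zip->close();

echo 'Added: ' . implode(', ', $added) . PHP_EOL;
```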

Reading a ZIP File

Reading a zip file could mean one of two things. Most people would associate ‘reading’ a ZIP file with getting the filenames that are in the archive. But it could also mean reading the contents of the file(s) in the archive. In this course we will be doing the former, as we rarely need to read the contents of a file without extracting it.

The following example replaces the ‘do our stuff’ comment to display a nicely formatted unordered list of the contents of a zip archive.

echo '<ul>';
for ($i = 0; $i < $zip->numFiles; $i++) {
    // Escape the entry name so special characters cannot break the HTML.
    echo '<li>' . htmlspecialchars($zip->getNameIndex($i)) . '</li>';
}
echo '</ul>';

A couple of new things have been introduced in this example. numFiles is a property of the ZipArchive class that stores the number of entries in the zip file. As we can see in the example, it is very useful for providing the upper bound of a loop.

getNameIndex is a method that returns the name of an entry given its index number. The name includes any subfolders that are part of the entry name.

We should be able to deduce from the code that the entries are indexed from zero, like an array; hence the loop runs from 0 up to (but not including) numFiles.
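If you want to try the loop without hunting for a zip file first, the following sketch builds a tiny archive and then collects its entry names into an array. The entry names and contents are placeholders I made up, and addFromString is used only so that no real files are needed.

```php
<?php
// Create a small archive to read back (hypothetical path and entry names).
$zipPath = sys_get_temp_dir() . '/zip_read_demo_' . uniqid() . '.zip';

$zip = new ZipArchive();
if ($zip->open($zipPath, ZipArchive::CREATE) !== true) {
    throw new RuntimeException('cannot create archive');
}
$zip->addFromString('readme.txt', 'placeholder text');
$zip->addFromString('myDir/pic1_z.jpg', 'placeholder, not a real image');
$zip->close();

// Re-open the archive (no second argument needed) and read the names.
$zip = new ZipArchive();
if ($zip->open($zipPath) !== true) {
    throw new RuntimeException('cannot open archive');
}

$names = array();
for ($i = 0; $i < $zip->numFiles; $i++) {
    // Indexes run from 0 to numFiles - 1.
    $names[] = $zip->getNameIndex($i);
}
$zip->close();

foreach ($names as $index => $name) {
    echo $index . ': ' . $name . PHP_EOL;
}
```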

ZipArchive Properties

The ZipArchive object gives the programmer access to 5 properties. We’ve already seen one of them, numFiles, in the previous code example. The others are status, statusSys, filename and comment. I leave it as an exercise for you to find out what these mean, and how you might use them.

Questions:

  • What do you think the filename property would hold?
  • There is a method called locateName which returns the index of a file in the archive. How might this be useful?
  • What would happen if we called getNameIndex with an index that wasn’t actually in the archive?

Extracting Files from a ZIP

Extracting files from a zip archive follows the same principles that were employed in the previous two examples: we just need to replace our ‘do our stuff’ comment with our extraction. An example is shown below.

$dir = dirname($_SERVER['SCRIPT_FILENAME']);
$zip->extractTo($dir);

This is a two-step process. First we choose the directory that the zip is going to be extracted into, and then we pass that value as a parameter to the extractTo method. This could be a hard-coded value or a dynamic value like in the example.

Using a dynamic directory value the way I have means that anyone can copy my example code and it will work. Hard-coding the directory means that the user needs to have the same directory structure otherwise the extractTo method won’t work.

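$_SERVER['SCRIPT_FILENAME'] is meant for PHP scripts running via a web server such as Apache or IIS; from the command line it may contain only a relative path. Non-web-based scripts can use the __FILE__ constant, which usually gives you the same result (__DIR__ is shorthand for dirname(__FILE__)).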

The extractTo method has an optional second parameter. When this parameter is not used (as in my previous example) the entire archive is extracted. If you use the second parameter you can specify which files are to be extracted as shown in the example below.

$zip->extractTo($dir, array('myDir/pic1_z.jpg', 'pic2_z.jpg'));

It is important to note that the second parameter can take either a single entry name (a string, no array) or an array of entry names. Each name must also be given exactly as it appears inside the archive, including any directory, as with 'myDir/pic1_z.jpg' above.
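Here is a self-contained sketch of a selective extraction that you can run and then inspect. The archive, its three placeholder entries and the temporary output directory are all assumptions for the demonstration; the script finishes by listing whatever ended up in the output directory so you can see the result for yourself.

```php
<?php
// Hypothetical working area: an archive plus an empty output directory.
$baseDir = sys_get_temp_dir() . '/zip_extract_demo_' . uniqid();
$destDir = $baseDir . '/out';
mkdir($destDir, 0777, true);
$zipPath = $baseDir . '/demo.zip';

// Build an archive with three placeholder entries.
$zip = new ZipArchive();
if ($zip->open($zipPath, ZipArchive::CREATE) !== true) {
    throw new RuntimeException('cannot create archive');
}
$zip->addFromString('myDir/pic1_z.jpg', 'placeholder 1');
$zip->addFromString('pic2_z.jpg', 'placeholder 2');
$zip->addFromString('notes.txt', 'not requested below');
$zip->close();

// Extract only two of the entries, named exactly as they appear in the archive.
$zip = new ZipArchive();
if ($zip->open($zipPath) !== true) {
    throw new RuntimeException('cannot open archive');
}
$ok = $zip->extractTo($destDir, array('myDir/pic1_z.jpg', 'pic2_z.jpg'));
$zip->close();

// List every file that now exists under the output directory.
$extracted = array();
$iterator = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($destDir, FilesystemIterator::SKIP_DOTS)
);
foreach ($iterator as $file) {
    $extracted[] = substr($file->getPathname(), strlen($destDir) + 1);
}
sort($extracted);

echo ($ok ? 'Extracted:' : 'Extraction failed.') . PHP_EOL;
foreach ($extracted as $path) {
    echo '  ' . $path . PHP_EOL;
}
```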

Questions:

  • When would $_SERVER['SCRIPT_FILENAME'] and __FILE__ give different results?
  • With the extractTo example above that contains a directory, would your computer need to have this directory for the code to work? What would happen?

ZIP miscellaneous

The ZipArchive class has several other methods / concepts that we’re not covering in this lecture, but are worth mentioning.

It is possible to insert and retrieve comments for the archive and/or individual entries. I’m not sure what purpose this serves, and I don’t know how or if a user can view a comment.

There are deleteIndex and deleteName methods available if you need to delete a file from the archive. The method name should give you a hint as to what sort of argument the method is expecting.

renameIndex and renameName allow you to rename an entry inside the archive, and statIndex and statName give you status information about a file (e.g. the original size, the compressed size, the compression method used, etc.).
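Although we are not covering these methods in detail, a quick sketch may help you see how they fit the usual open / do our stuff / close pattern. The archive path and entry names are invented for the example. The array keys size and comp_size are two of the values that statName returns.

```php
<?php
// Build a small archive to experiment on (hypothetical path and entry names).
$zipPath = sys_get_temp_dir() . '/zip_misc_demo_' . uniqid() . '.zip';

$zip = new ZipArchive();
if ($zip->open($zipPath, ZipArchive::CREATE) !== true) {
    throw new RuntimeException('cannot create archive');
}
$zip->addFromString('keep.txt', str_repeat('zip ', 500));
$zip->addFromString('old_name.txt', 'rename me');
$zip->addFromString('unwanted.txt', 'delete me');
$zip->close();

// Re-open it, look at one entry, then rename one entry and delete another.
$zip = new ZipArchive();
if ($zip->open($zipPath) !== true) {
    throw new RuntimeException('cannot open archive');
}
$stat = $zip->statName('keep.txt');
echo 'keep.txt: ' . $stat['size'] . ' bytes originally, '
    . $stat['comp_size'] . ' bytes compressed' . PHP_EOL;

$zip->renameName('old_name.txt', 'new_name.txt');
$zip->deleteName('unwanted.txt');
$zip->close();   // the changes are saved when the archive is closed

// Open it one last time to see which entries are left.
$zip = new ZipArchive();
if ($zip->open($zipPath) !== true) {
    throw new RuntimeException('cannot open archive');
}
$remaining = array();
for ($i = 0; $i < $zip->numFiles; $i++) {
    $remaining[] = $zip->getNameIndex($i);
}
$zip->close();

echo 'Remaining entries: ' . implode(', ', $remaining) . PHP_EOL;
```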

Question:

  • What steps should I take when investigating how to use archive comments?