Friday, November 11, 2011

Create ZIP Files From An ASP.NET Application


Introduction


The ZIP file format is a popular, decades-old format used for file compression and archiving. ZIP files commonly have a .zip extension and are used to reduce the size of one or more files and/or to archive multiple files and folders into a single file. Additionally, the contents of a ZIP file can optionally be encrypted so that they are viewable only by those who know the password. Both Microsoft's Windows and Apple's Mac OS provide built-in operating system support for opening, reading, and creating ZIP files.
In a recent project I needed to create ZIP files from an ASP.NET application on the fly. Specifically, there was a web page that listed a series of data files that were created by an external process. Users visiting this page select a file to download, at which point their browser displays a dialog box that lets them open the file or save it to their hard drive. This user interface worked for most of our users, as they were only interested in downloading one or two files at most. However, some of our users needed to download upwards of 20 files. For them, clicking a download link, saving the file to their hard drive, and repeating those steps 20 times was frustrating and time-consuming. To improve this user interface we created the notion of "download profiles," which allow users to associate a name - like "Accounting Files" - with a collection of file types that are available for download. After creating a "download profile," a user can choose to download all available files that belong to that profile. Doing so creates a ZIP file with the appropriate files and displays a dialog box in the user's browser, allowing them to open or save the ZIP. With this enhancement, our power users can now download their 20 files with one mouse click.
This article starts with a look at different ways to create ZIP files in an ASP.NET application, but then focuses on using the free and open-source DotNetZip library. Read on to learn more!


An Overview of Compression and Archiving


The ZIP file format - invented by Phil Katz in 1989 - is one of the most widely used compression and archiving file formats. As noted in the Introduction, operating systems like Microsoft Windows and Apple Mac OS support the ZIP file format natively. There are also countless third-party applications for creating, opening, and reading ZIP files, including WinRAR, WinZip, and 7-Zip, among many others.
The .NET Framework has offered only spotty support for compression and archiving throughout its history. In the .NET Framework 2.0, Microsoft added support for the DEFLATE compression algorithm and the gzip file format. The DEFLATE compression algorithm provides a lossless way to compress content, whereas the gzip file format defines a format for a compressed file, typically compressed using DEFLATE. The two classes in the .NET Framework's System.IO.Compression namespace - DeflateStream and GZipStream - enable developers to compress or decompress data using DEFLATE and to compress or decompress data using the gzip file format. gzip was designed to compress a single file and therefore is not suitable for creating an archive of compressed files. (In plain English, gzip and ZIP are two totally different things; you would not use the .NET Framework's GZipStream class to create a ZIP file.)
Support for creating ZIP files was added to the .NET Framework in version 3.0 with the introduction of the System.IO.Packaging namespace and its ZipPackage class. With the ZipPackage class you can programmatically create ZIP files. However, the ZipPackage class does not support ZIP features like encryption and password protection, among others.
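To make the gzip-versus-ZIP distinction concrete, here is a minimal sketch that uses only the .NET Framework's GZipStream class to compress and then decompress a block of data in memory. The class and method names (GzipSketch, Compress, Decompress) are made up for illustration. Notice that there is nowhere to record a file name or to add a second file: the result is one compressed stream, not an archive.

```csharp
using System.IO;
using System.IO.Compression;

// Illustrative helper; these names are not part of the .NET Framework.
public static class GzipSketch
{
    // Compresses a byte array into a single gzip stream.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream writes the remaining compressed data

            // ToArray still works after the MemoryStream has been closed
            return output.ToArray();
        }
    }

    // Reverses Compress, returning the original bytes.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            // Manual copy loop, since Stream.CopyTo is not available before .NET 4
            var buffer = new byte[4096];
            int read;
            while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);

            return output.ToArray();
        }
    }
}
```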
After a bit of research, I decided to pass on the .NET Framework's ZipPackage class and instead use the DotNetZip library. DotNetZip is a free, open-source library built atop the .NET Framework 3.5 that allows .NET developers to create, read, extract, and update ZIP files. The remainder of this article looks at two scenarios that show using the DotNetZip library to programmatically create ZIP files:
  • The first one is similar to the project I worked on - namely, the user is presented with a list of files to download. After selecting the files to download and clicking the "Download" button, the ASP.NET page creates a ZIP file containing the selected files and returns it to the user, upon which she can open it or save it to her hard drive.
  • The second one shows how to take a file uploaded by a user, place it in a ZIP file, and then save that ZIP file on the web server's file system (rather than saving the original upload).
The complete code for these two scenarios (in both C# and VB) is available for download at the end of this article. Please note that to run the demo - or to use DotNetZip in your own project - you'll need to be using ASP.NET 3.5 or beyond and you'll need to have the DotNetZip assembly (Ionic.Zip.dll) in the Bin folder of your web application. This assembly (Version 1.9) can be found in the download available at the end of this article; alternatively, you can get the most recent version from the DotNetZip homepage.

Downloading Multiple Files as a Single ZIP


ZIP files are a convenient way to combine multiple files into a single archive. This capability proves especially useful in certain web-based scenarios, such as the scenario I laid out in the Introduction. Consider a website that allows its visitors to download files. Rather than requiring users to download files one at a time, we may want to let the user select a set of files to download, and then deliver those in a single ZIP file. This is quite easy to accomplish using the DotNetZip library.
The demo application available for download at the end of this article includes a folder named ~/DownloadLibrary, in which are files users can download. To download these files, a user visits the ~/DownloadFiles.aspx web page, where they are shown a list of the files in the ~/DownloadLibrary folder with a check box next to each file. The user then selects one or more files, clicks the "Download Now!" button, and is delivered a single ZIP file that contains the files selected for download. What's more, if the user provides a password, the ZIP file's contents are encrypted and the person opening the ZIP file must provide the correct password in order to extract or view the contents of the ZIP.
The screen shot below shows the ~/DownloadFiles.aspx web page when viewed through a browser.





The list of files in the screen shot above is displayed using the CheckBoxList Web control. This list is populated in the Page_Load event handler by using the DirectoryInfo class to retrieve the list of files in the ~/DownloadLibrary folder.
Creating a ZIP file with DotNetZip is intuitive. Just follow these simple steps:
  1. Create an instance of the ZipFile class,
  2. Specify any configuration information for the ZIP file. For example, if you want to encrypt the ZIP file's contents you'll typically specify the encryption algorithm to use (via the Encryption property) as well as the password to use to decrypt the ZIP file (using the Password property),
  3. Add content to the ZIP file using the ZipFile class's AddFile or AddEntry methods. Use the AddFile method to add the contents of a file on disk to the ZIP file. Use the AddEntry method to add an entry into the ZIP file based on string or binary data that you supply.
  4. Call the ZipFile class's Save method, saving the contents to disk or a stream.
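Outside of a web page, the four steps above can be sketched as a small helper that writes a ZIP to disk. This is only an illustration: the FourStepSketch class, the CreateZip method, its parameters, and the notes.txt entry are hypothetical names, and the code assumes Ionic.Zip.dll is referenced by the project.

```csharp
using System.Text;
using Ionic.Zip;   // DotNetZip (Ionic.Zip.dll)

// Illustrative helper; not part of DotNetZip or the article's demo.
public static class FourStepSketch
{
    // Creates a ZIP at zipPath holding one file from disk plus one generated text entry.
    public static void CreateZip(string zipPath, string fileToAdd, string password)
    {
        // STEP 1: create the ZipFile instance
        using (var zip = new ZipFile())
        {
            // STEP 2: configure the ZIP (here, optional password protection)
            if (!string.IsNullOrEmpty(password))
            {
                zip.Password = password;
                zip.Encryption = EncryptionAlgorithm.PkzipWeak;
            }

            // STEP 3: add a file from disk (placed in the root of the ZIP)
            // and an entry built from a string
            zip.AddFile(fileToAdd, string.Empty);
            zip.AddEntry("notes.txt", "Created by FourStepSketch.", Encoding.ASCII);

            // STEP 4: save the ZIP to disk
            zip.Save(zipPath);
        }
    }
}
```

A call such as FourStepSketch.CreateZip(@"C:\Temp\demo.zip", @"C:\Temp\report.txt", null) would produce an unencrypted ZIP; both paths are placeholders.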
Let's walk through implementing each of these steps for the download example. Before we create the ZIP file, though, we first need to tell the browser that we're going to be sending back a ZIP file. Remember, this example prompts the user to select a number of files. Selecting these files and clicking the "Download Now!" button causes a postback. Instead of returning the HTML for the page, the web server is going to respond with the binary contents of the ZIP file that was just created that contains the files the user selected. Therefore, we need to tell the browser that it's going to be getting back a ZIP file (rather than a web page) and that it should prompt the user whether to open or save the content. This is done by specifying values for the response's Content-Type and Content-Disposition headers:
// Tell the browser we're sending a ZIP file!
var downloadFileName = string.Format("YourDownload-{0}.zip", DateTime.Now.ToString("yyyy-MM-dd-HH_mm_ss"));
Response.ContentType = "application/zip";
Response.AddHeader("Content-Disposition", "attachment; filename=" + downloadFileName);

...

Note that the Content-Disposition header specifies a filename; this filename is the filename the browser suggests the user save the file as. Here, we use the filename YourDownload-date.zip, where date contains the four-digit year, two-digit month, two-digit day, two-digit hour, two-digit minute, and two-digit second specifying the (web server's) date and time when the download was requested. For example, the filename might look like: YourDownload-2010-09-29-15_28_05.zip.
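As a quick way to check the format string, the following sketch builds the same filename from a fixed DateTime rather than DateTime.Now. The DownloadNameSketch class and BuildFileName method are hypothetical names. Passing CultureInfo.InvariantCulture is my own addition, made on the assumption that the filename should not vary with the web server's regional settings.

```csharp
using System;
using System.Globalization;

// Illustrative helper; not part of the article's demo code.
public static class DownloadNameSketch
{
    // Builds the suggested download filename for a given request time.
    public static string BuildFileName(DateTime requestedAt)
    {
        // yyyy, MM, dd, HH, mm and ss are zero-padded; "-" and "_" are copied literally
        var stamp = requestedAt.ToString("yyyy-MM-dd-HH_mm_ss", CultureInfo.InvariantCulture);
        return string.Format("YourDownload-{0}.zip", stamp);
    }
}
```

Calling DownloadNameSketch.BuildFileName(new DateTime(2010, 9, 29, 15, 28, 5)) returns YourDownload-2010-09-29-15_28_05.zip.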
With the header stuff out of the way, we're ready to create the ZipFile object! Start by creating an instance of the ZipFile class in a using statement like so:

// Zip the contents of the selected files (STEP 1)
using (var zip = new ZipFile())
{
   ...
}

Next, we need to determine whether to encrypt the contents of the ZIP file. If the user supplied a password then we want to encrypt the ZIP and limit access to those who know the password. This is done by setting the ZipFile object's Encryption and Password properties:

using (var zip = new ZipFile())
{
   // Add the password protection, if specified (STEP 2)
   if (!string.IsNullOrEmpty(txtZIPPassword.Text))
   {
      zip.Password = txtZIPPassword.Text;
      zip.Encryption = EncryptionAlgorithm.PkzipWeak;
   }

   ...
}

Note that if a password is supplied, the ZIP file's contents are encrypted using the PkzipWeak encryption algorithm. This is the encryption algorithm that was specified in the original specification of the ZIP format and is the only encryption algorithm that is natively supported across ZIP reading programs. Unfortunately, this algorithm is known to be exploitable and is not recommended for protecting sensitive information. You can alternatively use 128- or 256-bit AES encryption by setting the Encryption property to EncryptionAlgorithm.WinZipAes128 or EncryptionAlgorithm.WinZipAes256, respectively; however, not all vendors support this stronger form of encryption. For example, you cannot extract files using Windows' built-in ZIP file program if the ZIP was encrypted using AES. For more information see DotNetZip's EncryptionAlgorithm technical documentation.
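If you would rather let the caller choose between compatibility and strength, the choice can be isolated in one place. The sketch below is an assumption-laden illustration: the EncryptionSketch class, the ApplyPassword method, and the useAes flag are hypothetical, and only the Password and Encryption properties and the EncryptionAlgorithm values come from DotNetZip.

```csharp
using Ionic.Zip;   // DotNetZip (Ionic.Zip.dll)

// Illustrative helper; not part of DotNetZip or the article's demo.
public static class EncryptionSketch
{
    // Applies password protection to a ZipFile.
    // Call this before adding entries, mirroring the order used in this article.
    public static void ApplyPassword(ZipFile zip, string password, bool useAes)
    {
        // No password means the ZIP is left unencrypted
        if (string.IsNullOrEmpty(password))
            return;

        zip.Password = password;

        // AES is stronger but not readable by every ZIP tool;
        // PkzipWeak is widely readable but known to be exploitable
        zip.Encryption = useAes
            ? EncryptionAlgorithm.WinZipAes256
            : EncryptionAlgorithm.PkzipWeak;
    }
}
```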
We're now ready to add content to the ZIP file! In this example I wanted to add both the selected files and a README.txt file that lists the files included in the ZIP. There is no such README.txt file to include in the ZIP; instead, I will build a string (readMeMessage) and then add it to the ZIP file. The following code accomplishes both of these aims: it starts by defining the readMeMessage string variable. Next, it enumerates the ListItems in the CheckBoxList and, for each checked one, adds the file to the ZIP and notes the file in the readMeMessage string. After adding all of the files a new entry named README.txt is added to the ZIP containing the contents of the readMeMessage variable.

using (var zip = new ZipFile())
{
   ...

   // Construct the contents of the README.txt file that will be included in this ZIP
   var readMeMessage = string.Format("Your ZIP file {0} contains the following files:{1}{1}", downloadFileName, Environment.NewLine);

   // Add the checked files to the ZIP (STEP 3)
   foreach (ListItem li in cblFiles.Items)
      if (li.Selected)
      {
         // Record the file that was included in readMeMessage
         readMeMessage += string.Concat("\t* ", li.Text, Environment.NewLine);

         // Now add the file to the ZIP
         zip.AddFile(li.Value, "Your Files");
      }

   // Add the README.txt file to the ZIP (STEP 3)
   zip.AddEntry("README.txt", readMeMessage, Encoding.ASCII);

   ...
}

Note that when calling the AddFile method I have specified two input parameters. The first one (li.Value) is the path to the file to download (the CheckBoxList has been configured such that the value of each ListItem contains the full file path of the file, such as C:\MySites\ZipDemo\DownloadLibrary\puppy.jpg). The second parameter specifies where in the ZIP file's folder hierarchy to place the file. A ZIP file's contents can be arranged into folders. If you do not supply this second parameter, DotNetZip creates a folder structure that corresponds to the folder structure of the file on the web server's file system. Consider adding the file C:\MySites\ZipDemo\DownloadLibrary\puppy.jpg to the ZIP file without specifying a path in the AddFile method call. In this case, the ZIP file would contain a folder named MySites, with a subfolder named ZipDemo, with a subfolder named DownloadLibrary, in which would be the file puppy.jpg.
To have the file placed in the root of the ZIP file, call the AddFile method and pass in an empty string as the second parameter, like so: AddFile(fileToAdd, string.Empty). In the code above, I instruct the AddFile method to create a folder in the ZIP file named Your Files, into which to place the user's selected files.
After adding the user-selected files, I add one more entry to the ZIP file, a dynamically-created file named README.txt that contains the contents of the readMeMessage string. The AddEntry method can be used to add new content to the ZIP, both string content (like in the example above) and binary content. This README.txt file will be placed in the root of the ZIP file. (To place the README.txt file in the Your Files folder you would specify the path name in the first input parameter, like so: zip.AddEntry(@"Your Files\README.txt", ...).)
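One way to see how these path arguments affect the archive is to add the same file in two places, add a generated entry, and then list the entry names DotNetZip records. This is a rough sketch rather than code from the demo: the EntryPathSketch class and ListEntryNames method are hypothetical, and the exact form of the recorded names (for example, which slash is used as the separator) is left for you to observe rather than asserted here.

```csharp
using System.Collections.Generic;
using System.Text;
using Ionic.Zip;   // DotNetZip (Ionic.Zip.dll)

// Illustrative helper; not part of DotNetZip or the article's demo.
public static class EntryPathSketch
{
    // Adds fileOnDisk to an in-memory ZipFile in two locations, adds a README,
    // and returns the entry names recorded for the archive.
    public static List<string> ListEntryNames(string fileOnDisk)
    {
        var names = new List<string>();

        using (var zip = new ZipFile())
        {
            // Empty string: place the file in the root of the ZIP
            zip.AddFile(fileOnDisk, string.Empty);

            // Folder name: place the file in a "Your Files" folder inside the ZIP
            zip.AddFile(fileOnDisk, "Your Files");

            // A path in the entry name places generated content in that folder
            zip.AddEntry(@"Your Files\README.txt", "Sample README text.", Encoding.ASCII);

            foreach (ZipEntry entry in zip.Entries)
                names.Add(entry.FileName);
        }

        return names;
    }
}
```

Nothing is written to disk here because Save is never called; the ZipFile object simply tracks the entries that would be saved.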
With the ZIP file's contents defined, the final step is to save the ZIP file. The ZIP file can be saved to the web server's file system or written to a stream. In this case we want to blast the contents of the ZIP file down to the requesting browser; to accomplish that, we write the ZIP out to the Response object's OutputStream:

using (var zip = new ZipFile())
{
   ...

   // Send the contents of the ZIP back to the output stream (STEP 4)
   zip.Save(Response.OutputStream);
}

With this code in place in the "Download Now!" Button's Click event handler, the visitor is returned a ZIP file that includes their selected files and, if specified, is encrypted and protected by a password.

Automatically ZIPping Uploaded Files


The previous example illustrated the archival qualities of a ZIP file - a means for packaging a collection of files (and folders) into a single file. ZIP files also are quite useful for compression, as they can dramatically reduce the size of certain file types. The ZIP compression is most pronounced on files that do not already employ some sort of compression, especially text files, web pages, and BMP images. Files like MP3s, other ZIP files, and JPG images already use compression and therefore do not enjoy much further compression from the ZIP file format.
If you run a website where users frequently upload large, uncompressed files, you may consider storing not the actual uploaded file, but rather a ZIP file containing the uploaded content. Whether this makes sense for your application depends on a variety of parameters, but the good news is that implementing such functionality is quite easy with DotNetZip. Imagine a web page that has a FileUpload control through which users can upload content from their desktop. The following code takes the user's uploaded file and saves it to the ~/Uploads folder on the web server.

var saveToFilename = Server.MapPath("~/Uploads/" + fupUploadAndZip.FileName);
fupUploadAndZip.SaveAs(saveToFilename);

The first line of code creates a variable named saveToFilename that contains the full physical path to where the uploaded file should be saved on the web server's file system. Specifically, it says to save it in the ~/Uploads folder using the same name as the name of the file the user uploaded. (This naming strategy is not, generally, a good idea, since multiple users may upload different files with the same name.) Nice and simple. But what if we wanted to first ZIP the uploaded file and then save just that ZIP file to the web server's file system? That's actually quite simple, too, as the following code illustrates.
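If name collisions are a concern, one possible approach is to prefix the uploaded file's name with a GUID before saving. The sketch below is just one option among many; the UploadNameSketch class and BuildUniqueName method are hypothetical names, and the GUID-prefix scheme is my own assumption rather than something the demo uses.

```csharp
using System;
using System.IO;

// Illustrative helper; not part of the article's demo code.
public static class UploadNameSketch
{
    // Returns a name that is very unlikely to clash with another user's upload.
    public static string BuildUniqueName(string uploadedFileName)
    {
        // Keep only the file name, dropping any directory information
        var safeName = Path.GetFileName(uploadedFileName);

        // "N" formats the GUID as 32 hexadecimal digits with no dashes
        return Guid.NewGuid().ToString("N") + "_" + safeName;
    }
}
```

In the page, the result could be passed to Server.MapPath in place of the raw file name, for example Server.MapPath("~/Uploads/" + UploadNameSketch.BuildUniqueName(fupUploadAndZip.FileName)).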

// Determine the name of file to save the upload to
var saveToFilename = Server.MapPath("~/Uploads/" + Path.GetFileNameWithoutExtension(fupUploadAndZip.FileName) + ".zip");

// ZIP the uploaded contents and save the ZIP to disk!
using (var zip = new ZipFile())
{
   zip.AddEntry(fupUploadAndZip.FileName, fupUploadAndZip.FileContent);

   zip.Save(saveToFilename);
}

This time, the saveToFilename is a little more involved. Instead of saving the file to the ~/Uploads folder using the same name as the uploaded file, we instead want to use the same name, but use the extension .zip. The Path.GetFileNameWithoutExtension method returns the file name (less the extension), to which we concatenate the desired extension.
Next, we create a ZipFile object and add the uploaded content using the AddEntry method. The first input parameter to AddEntry specifies the name of the file in the ZIP file. Here we use the uploaded file name. The second input parameter specifies the binary contents; here we pass in the FileUpload control's FileContent property, which returns a stream containing the binary contents of the just-uploaded file. To conclude, we use the Save method to save the ZIP file to the ~/Uploads folder. Easy!
(An example: if the user uploaded a file named MyNovel.pdf we would create a ZIP file named MyNovel.zip that contained a single entry named MyNovel.pdf with the contents of the uploaded PDF. This ZIP file would then be saved in the ~/Uploads folder on the web server's file system.)
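The same logic can be exercised without a web page by treating any readable Stream as the upload. The following sketch does that. The ZipUploadSketch class, the ZipStreamToFolder method, and its parameters are hypothetical names, and the comment about when the stream is read reflects my understanding of DotNetZip's behavior rather than anything the demo depends on.

```csharp
using System.IO;
using Ionic.Zip;   // DotNetZip (Ionic.Zip.dll)

// Illustrative helper; not part of DotNetZip or the article's demo.
public static class ZipUploadSketch
{
    // Wraps the content stream in a ZIP saved to the given folder
    // and returns the full path of the ZIP that was written.
    public static string ZipStreamToFolder(string uploadedFileName, Stream content, string folder)
    {
        // MyNovel.pdf becomes an entry named MyNovel.pdf inside MyNovel.zip
        var entryName = Path.GetFileName(uploadedFileName);
        var zipPath = Path.Combine(folder, Path.GetFileNameWithoutExtension(entryName) + ".zip");

        using (var zip = new ZipFile())
        {
            zip.AddEntry(entryName, content);

            // As I understand it, the stream is read during Save,
            // so it must remain open until this call returns
            zip.Save(zipPath);
        }

        return zipPath;
    }
}
```

In the upload page this would be called with fupUploadAndZip.FileName, fupUploadAndZip.FileContent, and Server.MapPath("~/Uploads").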