Watermarking PDF documents using HttpHandlers

Published: 03 Oct 2007
By: Deepak Raghavan

This article shows how to watermark PDFs using the open source PDF generation library iTextSharp, C#, and ASP.NET.

Introduction

Web applications often serve PDF documents: printable versions of pages, custom forms, manuals, and so on. When a document contains critical or sensitive information, its pages are typically watermarked with the company logo or another custom image. Some off-the-shelf commercial products offer this feature. This article looks at an implementation that uses the open source PDF generation library iTextSharp, C#, and ASP.NET.

Using Http Handlers

When a client sends an HTTP request to the web server, IIS sees it first. If the requested resource has an .aspx, .asmx, or .ashx extension, IIS routes the request to the ASP.NET runtime, because these extensions are registered with IIS to be handled by it. The request then passes through the ASP.NET HTTP pipeline. It is first processed by a series of HTTP modules, such as Caching, Authentication, Authorization, and SessionState, and is then handed to an HttpHandler factory. The factory chooses a handler based on the file extension, and that handler creates the response: HTML output for an .aspx request, a SOAP response for an .asmx request, and custom processing for an .ashx request.
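The handler-selection step can be pictured with a small toy model. This is only a sketch of the idea, not the ASP.NET implementation: the class ToyHandlerFactory, its Resolve method, and the handler descriptions are hypothetical names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Toy model only: these names are hypothetical and are not ASP.NET types.
public static class ToyHandlerFactory
{
    // Extension-to-handler table, mirroring the mapping described in the text.
    private static readonly Dictionary<string, string> Handlers =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".aspx", "page handler (HTML output)" },
            { ".asmx", "web service handler (SOAP response)" },
            { ".ashx", "custom handler" }
        };

    // Picks a handler description from the extension of the request path.
    public static string Resolve(string requestPath)
    {
        string extension = Path.GetExtension(requestPath);
        string handler;
        if (Handlers.TryGetValue(extension, out handler))
        {
            return handler;
        }
        // Extensions that are not in the table never reach a handler here.
        return "not routed to ASP.NET";
    }
}
```

In this model, ToyHandlerFactory.Resolve("/TestPdfRender/Default.aspx") returns "page handler (HTML output)", while a request for "/TestPdfRender/Test.pdf" returns "not routed to ASP.NET". Registering the .pdf extension, as the rest of the article does, corresponds to adding an entry to this table.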

When a PDF file is requested through a browser, IIS by default serves it as a static file, and the browser typically opens it with a viewer such as Adobe Acrobat Reader. In this article, we want to intercept the request for the PDF file, add a watermark to the document, and stream the result back to the browser. To get this custom behavior, we perform the following steps:

  • Map the .pdf extension to the ASP.NET runtime in IIS for the web application.
  • Create a custom handler that implements IHttpHandler to process .pdf requests.
  • Register the custom handler for .pdf files in the web.config file.

iTextSharp

iTextSharp is an open source .NET port of iText, a PDF library for Java applications. The library can be used to automate PDF creation and manipulation. In this article, the custom HTTP handler uses the iTextSharp API to read the requested PDF, add an image to each page, and stream the result back to the browser.

Pdf Handler

All custom handlers must implement the IHttpHandler interface shown below.

Listing 1: The IHttpHandler interface

public interface IHttpHandler
{
    bool IsReusable { get; }

    void ProcessRequest(HttpContext context);
}

The IHttpHandler interface has two members to implement:

  • IsReusable indicates whether the IHttpHandler instance can be reused for other requests.
  • ProcessRequest contains the main processing logic for the associated request type.

To create the custom handler, create a class library in Visual Studio 2005 and add a class, PdfHandler, that implements IHttpHandler. The source code of the PdfHandler implementation is shown below.

Listing 2: PdfHandler Implementing IHttpHandler

using System.IO;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;

namespace Handlers
{
    public class PdfHandler : IHttpHandler
    {
        /// <summary>
        /// Reads the requested PDF file, adds the watermark and streams the modified PDF file.
        /// </summary>
        /// <param name="context">The context of the current HTTP request.</param>
        public void ProcessRequest(HttpContext context)
        {
            MemoryStream outputStream = new MemoryStream();
            // Open the PDF file that the request URL maps to on disk.
            PdfReader pdfReader = new PdfReader(context.Request.PhysicalPath);
            int numberOfPages = pdfReader.NumberOfPages;
            // The stamper writes the modified document into outputStream.
            PdfStamper pdfStamper = new PdfStamper(pdfReader, outputStream);
            // A relative path is resolved against the directory of the requested file.
            Image image = Image.GetInstance(context.Server.MapPath("watermark.jpg"));
            image.SetAbsolutePosition(250, 300);
            // Page numbers are 1-based.
            for (int i = 1; i <= numberOfPages; i++)
            {
                // The "under" layer places the image beneath the existing page content.
                PdfContentByte waterMarkContent = pdfStamper.GetUnderContent(i);
                waterMarkContent.AddImage(image);
            }
            // Closing the stamper finishes writing the PDF to outputStream.
            pdfStamper.Close();
            pdfReader.Close();
            byte[] content = outputStream.ToArray();
            outputStream.Close();
            context.Response.ContentType = "application/pdf";
            context.Response.BinaryWrite(content);
            context.Response.End();
        }

        /// <summary>
        /// Marks the handler reusable across multiple requests
        /// </summary>
        public bool IsReusable
        {
            get { return true; }
        }
    }
}

The classes used in the ProcessRequest method and their brief descriptions are listed in the table below:

Table 1: Classes within iTextSharp API and their descriptions

Class Name     | Usage Description
---------------+------------------------------------------------------------------
PdfReader      | Reads and parses a PDF document at the provided path or URL string.
PdfStamper     | Used to add new content to multiple pages of a document. The content is an instance of PdfContentByte.
PdfContentByte | Object containing user-positioned text and/or graphics on a PDF page.

In Listing 2, the IsReusable property always returns true. This tells ASP.NET that it may keep a PdfHandler instance and reuse it for other requests instead of creating a new instance for each request. This is safe because the handler keeps no state in instance fields and shares no resources or data across requests.
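By contrast, a handler that keeps per-request data in instance fields should not be marked reusable. The following is a minimal sketch of that case; StatefulHandler and its requestedFile field are hypothetical names used only for illustration and are not part of the sample code.

```csharp
using System.Web;

// Hypothetical handler, for illustration only.
public class StatefulHandler : IHttpHandler
{
    // Per-request data held in an instance field.
    private string requestedFile;

    public void ProcessRequest(HttpContext context)
    {
        requestedFile = context.Request.PhysicalPath;
        context.Response.ContentType = "text/plain";
        context.Response.Write("Requested: " + requestedFile);
    }

    // Returning false asks ASP.NET not to reuse this instance,
    // so the field above is never carried over to another request.
    public bool IsReusable
    {
        get { return false; }
    }
}
```

PdfHandler avoids this concern by keeping everything it needs in local variables inside ProcessRequest.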

Test Web Application

The demo web application is created with the following steps:

  • Create a new web site in a directory of your choice and name it TestPdfRender.
  • Add the binary output of the custom handler as a reference.
  • Add the image used for watermarking. We have used the sample image provided on the iTextSharp website, but any image can be used instead.
  • Modify the web.config file by adding the following code inside the <system.web> section. The path *.pdf matches every request for a .pdf file, and verb="*" means that the handler is used for all HTTP verbs, such as GET, POST, HEAD, TRACE, and DEBUG.
    <httpHandlers>
      <add verb="*" path="*.pdf"
           type="Handlers.PdfHandler, Handlers" />
    </httpHandlers>
  • Create a virtual directory called TestPdfRender in IIS and point it to the web site.
  • In the project properties of TestPdfRender, under Start Options, set the Server section to "Use custom server" and set the Base URL to the location of the test web site in IIS (in our case, http://localhost/TestPdfRender).
  • Map the .pdf extension in IIS: right-click TestPdfRender and choose Properties; on the Virtual Directory tab, click Configuration; add a new extension .pdf and set Executable to the ASP.NET ISAPI DLL ("c:\windows\microsoft.net\framework\v2.0.50727\aspnet_isapi.dll"). Keep the default values for the remaining settings, then click OK to close all open configuration windows.

Demo

The application is ready to be tested. The included sample code uses a test PDF document and an image file, both of which can be replaced. Open "Default.aspx" and add this test hyperlink control:

<asp:HyperLink ID="HyperLink1" runat="server" NavigateUrl="Test.pdf"
               Target="_blank">Pdf Document</asp:HyperLink>

Now, set Default.aspx as the start page. When the link is clicked, the demo file opens with the image watermarked in the center of each page. The position of the image can be changed by altering the coordinates passed in this line of the ProcessRequest method in Listing 2:

image.SetAbsolutePosition(250, 300);
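Fixed coordinates center the image only for one combination of page size and image size. The sketch below shows the arithmetic for computing a centered position instead; WatermarkLayout and CenterOnPage are hypothetical helper names, and the sizes used in the usage note are assumed values chosen for illustration.

```csharp
// Hypothetical helper, for illustration only.
public static class WatermarkLayout
{
    // Returns { x, y }: the lower-left corner at which an image of the given
    // size is centered on a page of the given size. All values are in points,
    // measured from the lower-left corner of the page.
    public static float[] CenterOnPage(float pageWidth, float pageHeight,
                                       float imageWidth, float imageHeight)
    {
        float x = (pageWidth - imageWidth) / 2f;
        float y = (pageHeight - imageHeight) / 2f;
        return new float[] { x, y };
    }
}
```

For a US Letter page (612 x 792 points) and an assumed image size of 112 x 192 points, CenterOnPage(612f, 792f, 112f, 192f) yields x = 250 and y = 300. Inside the loop of ProcessRequest, the page size could be taken from pdfReader.GetPageSize(i) and the image size from image.ScaledWidth and image.ScaledHeight, with the result passed to image.SetAbsolutePosition before the image is added to the page.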

Summary

In this article, we have seen how to implement a custom HttpHandler to watermark PDF documents. The core of the implementation is the ProcessRequest method of the PdfHandler class. The handler can be used across multiple web applications, provided that the .pdf extension is mapped in IIS and the handler is referenced in the web.config file of each application.

References

Reposted from: http://dotnetslackers.com/articles/aspnet/WatermarkingPDFDocumentsUsingHttpHandlers.aspx
