ASPAlliance: Articles, reviews, and samples for .NET Developers
URL:
http://aspalliance.com/articleViewer.aspx?aId=1447&pId=-1
Creating PDFs with C# using Ghostscript
by Bhuban Mohan Mishra

Introduction

Portable Document Format (PDF) is a file format from Adobe that lets a document be distributed across different systems while preserving its layout. It has become a standard for the secure and reliable distribution and exchange of electronic documents around the world. It preserves the fonts, images, graphics, and layout of any source document, regardless of the application and platform used to create it, thus making it cross-platform and cross-browser compatible.

With the increased use of PDF as a universal format for sharing documents and running a paperless office, commercial applications are now commonly expected to convert documents from other formats to PDF. In this article we will discuss how we can use Ghostscript to convert various documents into PDF.

Ghostscript

Ghostscript is a suite of software, written in C, that interprets the PostScript language and the PDF file format and can convert PostScript files to PDF and vice versa. Though its options are limited, it can be used to convert many document formats into PDF.

Though several versions of Ghostscript are available, we will use GNU Ghostscript, which is distributed under the GNU General Public License. The latest version of Ghostscript (we will use v8.56) can be downloaded from SourceForge. Remember to get the installer for Windows.

Installing Ghostscript

Download the installer "gs856w32.exe" and run it on Windows. The default installation location is "C:\Program Files\gs". The files required for development and the command line tools can be found in "C:\Program Files\gs\gs8.56\bin", and the printer installation files can be found in "C:\Program Files\gs\gs8.56\lib".

Installing PDF Printer Driver

The easiest way to install the printer driver is the "Add Printer Wizard", reached from Start Menu --> Printers and Faxes --> Add a printer. Alternatively, we can launch the same wizard from the command line. At the Windows Run prompt, type the following:

rundll32.exe printui.dll,PrintUIEntry /il

This will launch the Add Printer Wizard.

Figure 1

Add a local Printer. Uncheck the "Automatically Detect and Install my Plug and Play Printer" option.

Figure 2

Now we have to create a new Port. Select "Local Port" from the drop down list.

Figure 3

Now enter the local port name. Set it to "C:\GSOUTPUT.PS".

Figure 4

On the next screen, click the "Have Disk" button. Now browse to the lib directory of the Ghostscript installation and select the "ghostpdf.inf" file. This file holds the driver details for the Ghostscript printer.

Figure 5

On the next screen, provide the printer name. Let us keep the default, "Ghostscript PDF".

Figure 6

Skip the remaining pages of the wizard. Now we are ready to install the Ghostscript printer.

Figure 7

During installation, Windows will display the "Windows Logo" warning. Click "Continue Anyway" to install the driver.

Figure 8

Now we are ready to print PostScript files through the installed printer. Whenever we print a document on this printer, the PostScript version of that document is written to our GSOUTPUT.PS file, which can be opened with any PostScript viewer or converter.

Converting documents to PDF

Converting documents to PDF is a two-step process. First, we convert the document to a PostScript file, and then we convert this PostScript file to PDF. The only limitation is that a document to be converted must have an application associated with it. That means, if you want to convert a .doc or an .odt file to PDF, you need to have an application installed on the system that can read that file. For example, we need Microsoft Word, or simply Microsoft Word Viewer, for a .doc file and OpenOffice Writer for an .odt file.

The reason is that we need an application that can actually read the document format and produce printer-friendly output. This output is then converted to a PostScript file by the installed Ghostscript printer. We then use gswin32c.exe, the command line utility provided by Ghostscript, to convert the PostScript file to a PDF document.

The Process

Let us create a small Windows application in C# to make this whole process a reality. First, put a Textbox and a Button to browse and point to a document that we wish to convert to PDF. Then we can have another button that initiates the PDF conversion process.

Another important part of the process is gswin32c.exe, which converts the PostScript file to PDF. We therefore need to copy this file to the bin\Debug folder of the project.

Figure 9

As stated earlier, the conversion is a two-step process: one step converts the document to PostScript and the other converts the PostScript file to PDF. So, we will have two functions, convertToPs() and convertToPdf(), that implement these two steps.

convertToPs()

If we look closely at the code in Listing 1, we can see that we are using the installed PDF printer to print the file. As the Ghostscript PDF printer is a PostScript printer, it writes the printable form of the file to GSOUTPUT.PS.

Listing 1: convertToPs()

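// Requires: using System; using System.Diagnostics;
public void convertToPs(string file)
{
  using (Process printProcess = new Process())
  {
    printProcess.StartInfo.FileName = file;
    // "printto" sends the document to the named printer instead of the default one
    printProcess.StartInfo.Verb = "printto";
    printProcess.StartInfo.Arguments = "\"Ghostscript PDF\"";
    // Verbs are only honored when the process is started through the Windows shell
    printProcess.StartInfo.UseShellExecute = true;
    printProcess.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
    printProcess.StartInfo.CreateNoWindow = true;
    printProcess.Start();

    // Wait until the PostScript file is created
    try
    {
      printProcess.WaitForExit();
    }
    catch (InvalidOperationException)
    {
      // The application was already running, so no new process was started; ignore
    }
  }
}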

This method takes in one argument, the full path of the file which needs to be converted to PDF. Then we create a new Process, with the following StartInfo:

·         FileName: The full path of the file to be converted to PDF. This method converts it to PostScript.

·         Verb: We are using "printto" as the process verb. The default verb for printing a document is "print", but that sends the document to the default printer. To use the installed "Ghostscript PDF" printer when it has not been set as the default printer, we need to pass "printto" as the verb and the printer name as the argument.

·         Arguments: The name of the installed Ghostscript printer.

These are the required properties that are to be set, to convert our document to postscript. We can set other properties as per our requirement.

Now we are ready to start the process. The time taken to convert the document to PostScript depends on the size of the document, so we call the WaitForExit() method to wait until the process completes. We have also wrapped that call in a try block that catches the InvalidOperationException and deliberately does nothing with it.

Reason: Whenever we launch a document, the operating system invokes the associated application as a new process. Let us take an example. We are converting a .doc file to PDF, so the OS invokes Microsoft Word to open the document for printing. Now suppose Microsoft Word is already running and we are editing a different document in it. In this case, no new process is started; the running instance is reused to open and print the new document. Because no new process was started, our Process object has no process associated with it, and WaitForExit() throws an InvalidOperationException. Thus, we catch this exception and ignore it so that the job proceeds normally.
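
When the exception is swallowed in this way, WaitForExit() gives no indication of when GSOUTPUT.PS has actually been written. The following is a minimal sketch of one possible workaround, assuming it is acceptable to poll the port file until its size stops changing. The class name SpoolFileWaiter and the method WaitForStableFile are hypothetical and are not part of the article's project or of Ghostscript.

```csharp
using System;
using System.IO;
using System.Threading;

public static class SpoolFileWaiter
{
    // Hypothetical helper: returns true once the file exists, is non-empty,
    // and has the same length on two consecutive polls. Returns false if
    // the timeout elapses first.
    public static bool WaitForStableFile(string path, TimeSpan timeout, TimeSpan pollInterval)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        long lastLength = -1;

        while (DateTime.UtcNow < deadline)
        {
            // A fresh FileInfo is created on each pass so the length is re-read from disk
            FileInfo info = new FileInfo(path);
            if (info.Exists)
            {
                long length = info.Length;
                if (length > 0 && length == lastLength)
                {
                    return true;
                }
                lastLength = length;
            }
            Thread.Sleep(pollInterval);
        }
        return false;
    }
}
```

A caller could invoke WaitForStableFile(@"C:\GSOUTPUT.PS", TimeSpan.FromMinutes(2), TimeSpan.FromSeconds(1)) after convertToPs() returns. An unchanged size is only a heuristic for "printing has finished", so the timeout and poll interval would need tuning for large documents.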

convertToPdf()

This method is the final step of our application. It converts the PostScript file generated by the Ghostscript printer in the previous function to a PDF document.

Here, we create a process that opens the Windows command prompt. We then run gswin32c.exe, the command line utility provided by Ghostscript, to convert the PostScript file created by the previous function to PDF.

The function convertToPdf() in Listing 2 takes one argument: the full path of the output .pdf file to be created. Thus, if we want the PDF file to have the same name as the input file, we can simply replace the extension of the input file name with .pdf.
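
The extension swap described above can be sketched with Path.ChangeExtension from System.IO. The class name PdfPathHelper and the method GetPdfPath are hypothetical names used only for illustration.

```csharp
using System.IO;

public static class PdfPathHelper
{
    // Hypothetical helper: keeps the directory and file name of the input
    // document and replaces (or appends) the extension with ".pdf".
    public static string GetPdfPath(string inputPath)
    {
        return Path.ChangeExtension(inputPath, ".pdf");
    }
}
```

The result of GetPdfPath(textBoxValue), where textBoxValue stands for the path chosen in the form, could then be passed to convertToPdf() as the output path.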

Listing 2: convertToPdf()

// Requires: using System; using System.Diagnostics; using System.IO;
private string convertToPdf(string outputPath)
{
  // Quote the output path so that paths containing spaces survive the command line
  string command = "gswin32c -q -dNOPAUSE -sDEVICE=pdfwrite " +
    "-sOutputFile=\"" + outputPath + "\" -fc:\\gsoutput.ps";

  ProcessStartInfo info = new ProcessStartInfo("cmd");
  // gswin32c.exe was copied next to the application, so run from that folder
  info.WorkingDirectory = System.AppDomain.CurrentDomain.BaseDirectory;
  info.CreateNoWindow = true;
  info.UseShellExecute = false;
  info.RedirectStandardInput = true;
  info.RedirectStandardOutput = true;

  using (Process pdfProcess = new Process())
  {
    pdfProcess.StartInfo = info;
    pdfProcess.Start();

    StreamWriter writer = pdfProcess.StandardInput;
    StreamReader reader = pdfProcess.StandardOutput;
    writer.AutoFlush = true;

    // Send the Ghostscript command to the command prompt, then close its input
    writer.WriteLine(command);
    writer.Close();

    // Everything the command prompt and Ghostscript wrote to standard output
    string ret = reader.ReadToEnd();
    pdfProcess.WaitForExit();
    return ret;
  }
}

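
As a variation on Listing 2, Ghostscript can also be started directly instead of through the command prompt. The following is a minimal sketch under that assumption, not the article's method. The class GhostscriptRunner and its methods are hypothetical names, and -dBATCH is added here so that Ghostscript exits once the input file has been processed.

```csharp
using System;
using System.Diagnostics;

public static class GhostscriptRunner
{
    // Hypothetical helper: builds the gswin32c argument string.
    // Both paths are quoted so that spaces in them are preserved.
    public static string BuildArguments(string psPath, string pdfPath)
    {
        return "-q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite " +
               "-sOutputFile=\"" + pdfPath + "\" \"" + psPath + "\"";
    }

    // Hypothetical helper: runs gswin32c.exe directly and returns its exit code.
    // gsExePath is assumed to point at the copy of gswin32c.exe in use.
    public static int Convert(string gsExePath, string psPath, string pdfPath)
    {
        ProcessStartInfo info =
            new ProcessStartInfo(gsExePath, BuildArguments(psPath, pdfPath));
        info.UseShellExecute = false;
        info.CreateNoWindow = true;

        using (Process gs = Process.Start(info))
        {
            gs.WaitForExit();
            return gs.ExitCode;
        }
    }
}
```

Starting the executable directly avoids parsing the command prompt's banner and echo out of the captured output, and the exit code gives the caller a simple success check.
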
Conclusion

This was a small example that you can use to convert your documents to PDF format. Though it cannot replace Adobe Acrobat Professional, it will help you a lot with basic operations. Moreover, you can build your own PDF writer at no cost.

There are other methods available in Ghostscript that can be used to manipulate PostScript and PDF files. Refer to the SDK documentation for the advanced processes. You can experiment with Ghostscript and PDF if you are comfortable with C and C++.

Thanks,

Bhuban M. Mishra

Mindfire solutions



©Copyright 1998-2021 ASPAlliance.com