Image and PDF Compression in C#

GdPicture.NET SDK enables you to dramatically reduce the file size of PDF documents, with a focus on font optimization, data compression, and image analysis.

PDF optimization involves chaining several compression algorithms to overcome the limitations of any single compression scheme. It also involves removing unwanted or unused objects from the PDF.

To compress a PDF document, follow these steps:

  1. Create a GdPicturePDFReducer object.

  2. Configure the metadata of the resulting PDF document with the following properties of the PDFReducerConfiguration object:

    Property Name  Description
    Author         Specifies the author of the resulting PDF document.
    Producer       Specifies the producer of the resulting PDF document.
    ProducerName   Specifies the name of the producer of the resulting PDF document.
    Title          Specifies the title of the resulting PDF document.

  3. Configure the compression process with the following properties of the PDFReducerConfiguration object:

    Property Name           Description
    DownscaleImages         Specifies whether to downscale images. The default value is true.
    DownscaleResolution     Specifies the resolution to which images are downscaled. The default value is 150.
    DownscaleResolutionMRC  Specifies the resolution for downscaling the background layer by the mixed raster content (MRC) engine. The default value is 100.
    EnableCharRepair        Specifies whether to perform character repair during bitonal conversion. The default value is false.
    EnableColorDetection    Specifies whether to perform color detection on images. The default value is true.
    EnableJBIG2             Specifies whether to use the JBIG2 compression scheme to compress bitonal images. The default value is true.
    EnableJPEG2000          Specifies whether to use the JPEG2000 compression scheme to compress the images. The default value is true.
    EnableMRC               Specifies whether to use MRC for compressing the content of the source PDF. The default value is false.
    EnableParallelization   Specifies whether to use multiple cores to speed up the process. Threads are dynamically allocated based on the CPU resources available in real time. The default value is true.
    FastWebView             Specifies whether to optimize the PDF for online distribution (linearized PDF). The default value is false.
    ImageQuality            Specifies the quality of the compressed images. The default value is PDFReducerImageQuality.ImageQualityMedium.
    JBIG2PMSThreshold       Specifies the threshold, between 0 and 1, for pattern matching and substitution (PMS) in the JBIG2 encoder. Any value lower than 1 may lead to lossy compression. The default value is 0.75.
    MaxBitmapPerPage        Specifies the maximum number of bitmap images per page.
    OutputFormat            A member of the PDFReducerPDFVersion enumeration that specifies the version and the conformance level of the output PDF document. The default value is PDFReducerPDFVersion.PdfVersion15.
    PackDocument            Specifies whether to pack the PDF to reduce its size. The default value is true.
    PackFonts               Specifies whether to pack the PDF fonts to reduce their size. The default value is true.
    PreserveSmoothing       Specifies whether the MRC engine preserves smoothing between different layers. The default value is true.
    RecompressImages        Specifies whether to recompress the images. The default value is true.
    RemoveAnnotations       Specifies whether to remove annotations. The default value is false.
    RemoveBlankPages        Specifies whether to remove blank pages. The default value is false.
    RemoveBookmarks         Specifies whether to remove bookmarks. The default value is false.
    RemoveEmbeddedFiles     Specifies whether to remove embedded files. The default value is false.
    RemoveFormFields        Specifies whether to remove form fields. The default value is false.
    RemoveHyperlinks        Specifies whether to remove hyperlinks. The default value is false.
    RemoveJavaScript        Specifies whether to remove JavaScript. The default value is false.
    RemoveMetadata          Specifies whether to remove metadata. The default value is false.
    RemovePagePieceInfo     Specifies whether to remove the page PieceInfo dictionary used to hold private application data. The default value is true.
    RemovePageThumbnails    Specifies whether to remove page thumbnails. The default value is false.
    UnembedFonts            Specifies whether to remove embedded font data. The default value is false.

  4. Run the compression process with the ProcessDocument method. This method takes the paths to the source and output PDF files as its parameters.
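The four steps can be sketched end to end with the MRC and FastWebView options from the table, which the later examples do not exercise. This is a minimal sketch and not a definitive implementation: it assumes the SDK is already referenced and licensed in your project, that its types live in a `GdPicture14` namespace, and that `ProcessDocument` returns a `GdPictureStatus` value. None of these assumptions is stated in this guide, so check them against the SDK reference.

```csharp
using System;
using GdPicture14; // Assumption: namespace of the GdPicture.NET 14 assemblies.

public static class MrcCompressionSketch
{
    public static void Main()
    {
        // Step 1: create the reducer.
        using GdPicturePDFReducer reducer = new GdPicturePDFReducer();

        // Step 2: configure the metadata of the resulting PDF document.
        reducer.PDFReducerConfiguration.Title = "MRC Compression";

        // Step 3: configure the compression process.
        reducer.PDFReducerConfiguration.EnableMRC = true;             // Off by default.
        reducer.PDFReducerConfiguration.DownscaleResolutionMRC = 100; // Documented default, set explicitly.
        reducer.PDFReducerConfiguration.PreserveSmoothing = true;     // Documented default, set explicitly.
        reducer.PDFReducerConfiguration.FastWebView = true;           // Linearize for online distribution.

        // Step 4: run the process.
        // Assumption: the method returns a GdPictureStatus value.
        GdPictureStatus status = reducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");
        Console.WriteLine(status == GdPictureStatus.OK
            ? "Compression finished."
            : $"Compression failed: {status}");
    }
}
```

Setting the two default values explicitly is only for readability. Leaving those lines out should give the same configuration.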

General Optimization of PDF Documents

The example below focuses on general aspects of PDF optimization such as content removal and font optimization:

using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
// Configure the metadata of the resulting PDF document.
gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK";
gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit";
gdpicturePDFReducer.PDFReducerConfiguration.Title = "PDF Optimization";

// Specify the version and the conformance level of the output PDF document.
gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;

// Configure the compression process by removing document elements.
gdpicturePDFReducer.PDFReducerConfiguration.RemoveAnnotations = true;
gdpicturePDFReducer.PDFReducerConfiguration.RemoveBlankPages = true;
gdpicturePDFReducer.PDFReducerConfiguration.RemoveBookmarks = true;
gdpicturePDFReducer.PDFReducerConfiguration.RemoveEmbeddedFiles = true;
gdpicturePDFReducer.PDFReducerConfiguration.RemoveFormFields = true;
gdpicturePDFReducer.PDFReducerConfiguration.RemoveHyperlinks = true;
gdpicturePDFReducer.PDFReducerConfiguration.RemoveJavaScript = true;
gdpicturePDFReducer.PDFReducerConfiguration.RemoveMetadata = true;
gdpicturePDFReducer.PDFReducerConfiguration.RemovePageThumbnails = true;

// Optimize the output file size by packing fonts.
gdpicturePDFReducer.PDFReducerConfiguration.PackFonts = true;

// Optimize the output file size by packing the document.
gdpicturePDFReducer.PDFReducerConfiguration.PackDocument = true;

// Run the compression process.
gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

'VB.NET version of the example above.
Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
    'Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK"
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit"
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "PDF Optimization"

    'Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting

    'Configure the compression process by removing document elements.
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveAnnotations = True
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveBlankPages = True
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveBookmarks = True
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveEmbeddedFiles = True
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveFormFields = True
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveHyperlinks = True
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveJavaScript = True
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveMetadata = True
    gdpicturePDFReducer.PDFReducerConfiguration.RemovePageThumbnails = True

    'Optimize the output file size by packing fonts.
    gdpicturePDFReducer.PDFReducerConfiguration.PackFonts = True

    'Optimize the output file size by packing the document.
    gdpicturePDFReducer.PDFReducerConfiguration.PackDocument = True

    'Run the compression process.
    gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
End Using
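
To see how much a given configuration actually saves, you can compare the two files once the process has run. The helper below is a hypothetical sketch that uses only `System.IO`. The `CompressionReport` class and its `ReductionPercent` method are illustrative names and are not part of the SDK.

```csharp
using System;
using System.IO;

public static class CompressionReport
{
    // Percentage by which the output is smaller than the source.
    // The result is negative if the output grew.
    public static double ReductionPercent(long sourceBytes, long outputBytes)
    {
        if (sourceBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(sourceBytes), "Source size must be positive.");
        return 100.0 * (sourceBytes - outputBytes) / sourceBytes;
    }

    public static void Main()
    {
        // Same paths as in the example above; adjust them to your environment.
        string sourcePath = @"C:\temp\source.pdf";
        string outputPath = @"C:\temp\output.pdf";

        if (!File.Exists(sourcePath) || !File.Exists(outputPath))
        {
            Console.WriteLine("Run the compression process first so that both files exist.");
            return;
        }

        long sourceBytes = new FileInfo(sourcePath).Length;
        long outputBytes = new FileInfo(outputPath).Length;
        Console.WriteLine(
            $"{sourceBytes} -> {outputBytes} bytes ({ReductionPercent(sourceBytes, outputBytes):F1}% smaller)");
    }
}
```

Comparing sizes this way makes it easier to judge whether removing elements such as bookmarks or form fields is worth the loss of functionality for a particular document.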

Recompressing Images

Compress PDF documents by recompressing existing images in a file. For example, decreasing unnecessarily high resolutions can dramatically reduce the file size without affecting the viewing experience.
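
The effect of downscaling can be estimated with simple arithmetic: the number of pixels in an image grows with the square of its resolution, so a scan at 600 reduced to 200 holds roughly one ninth of the original pixels before any recompression. The sketch below illustrates that estimate only. It assumes the resolution values are expressed in dots per inch, which this guide does not state, and `DownscaleEstimate` is a hypothetical helper, not part of the SDK.

```csharp
using System;

public static class DownscaleEstimate
{
    // Factor by which the pixel count shrinks when an image is resampled
    // from sourceDpi to targetDpi. Returns 1 when the image is already at
    // or below the target, because there is nothing to downscale.
    public static double PixelReductionFactor(double sourceDpi, double targetDpi)
    {
        if (sourceDpi <= 0 || targetDpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(sourceDpi), "Resolutions must be positive.");
        if (targetDpi >= sourceDpi)
            return 1.0;

        double ratio = sourceDpi / targetDpi;
        return ratio * ratio; // Width and height both shrink by the same ratio.
    }

    public static void Main()
    {
        Console.WriteLine(PixelReductionFactor(600, 200)); // prints 9
        Console.WriteLine(PixelReductionFactor(300, 150)); // prints 4
    }
}
```

The actual file size reduction also depends on the compression scheme and the image quality setting, so treat the factor as an upper-level estimate of the pixel data removed, not as a prediction of the final size.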

The example below shows how to recompress images:

using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
// Configure the metadata of the resulting PDF document.
gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK";
gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit";
gdpicturePDFReducer.PDFReducerConfiguration.Title = "Re-Compress Images";

// Specify the version and the conformance level of the output PDF document.
gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;

// Recompress images to obtain a better compression ratio.
gdpicturePDFReducer.PDFReducerConfiguration.RecompressImages = true;
gdpicturePDFReducer.PDFReducerConfiguration.ImageQuality = PDFReducerImageQuality.ImageQualityHigh;

// Reduce the image size by decreasing the image resolution.
gdpicturePDFReducer.PDFReducerConfiguration.DownscaleImages = true;
gdpicturePDFReducer.PDFReducerConfiguration.DownscaleResolution = 200;

// Run the compression process.
gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

'VB.NET version of the example above.
Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
    'Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK"
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit"
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "Re-Compress Images"

    'Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting

    'Recompress images to obtain a better compression ratio.
    gdpicturePDFReducer.PDFReducerConfiguration.RecompressImages = True
    gdpicturePDFReducer.PDFReducerConfiguration.ImageQuality = PDFReducerImageQuality.ImageQualityHigh

    'Reduce the image size by decreasing the image resolution.
    gdpicturePDFReducer.PDFReducerConfiguration.DownscaleImages = True
    gdpicturePDFReducer.PDFReducerConfiguration.DownscaleResolution = 200

    'Run the compression process.
    gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
End Using

Controlling Image Compression

The PDF specification allows for seven compression schemes, all of which can be used to compress images. For example, two popular compression schemes are the following:

  • JBIG2 for bitonal images (usually black and white).

  • JPEG2000 for 24-bit color and 8-bit grayscale images.

The example below uses both of these schemes to compress images in a PDF document:

using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
// Configure the metadata of the resulting PDF document.
gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK";
gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit";
gdpicturePDFReducer.PDFReducerConfiguration.Title = "Image Compression";

// Specify the version and the conformance level of the output PDF document.
gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;

// Enable automatic color detection.
gdpicturePDFReducer.PDFReducerConfiguration.EnableColorDetection = true;

// Repair characters.
gdpicturePDFReducer.PDFReducerConfiguration.EnableCharRepair = true;

// Control image compression.
gdpicturePDFReducer.PDFReducerConfiguration.EnableJPEG2000 = true;
gdpicturePDFReducer.PDFReducerConfiguration.EnableJBIG2 = true;
gdpicturePDFReducer.PDFReducerConfiguration.JBIG2PMSThreshold = 0.65f;

// Run the compression process.
gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

'VB.NET version of the example above.
Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
    'Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "GdPicture.NET PDF Reducer SDK"
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "PSPDFKit"
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "Image Compression"

    'Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting

    'Enable automatic color detection.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableColorDetection = True

    'Repair characters.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableCharRepair = True

    'Control image compression.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableJPEG2000 = True
    gdpicturePDFReducer.PDFReducerConfiguration.EnableJBIG2 = True
    gdpicturePDFReducer.PDFReducerConfiguration.JBIG2PMSThreshold = 0.65F

    'Run the compression process.
    gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
End Using
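
The example above sets JBIG2PMSThreshold to 0.65, a value that the property description flags as potentially lossy. When bitonal content must be preserved as faithfully as possible, a more conservative variant might look like the sketch below. It assumes a `GdPicture14` namespace and an SDK that is already referenced and licensed. It also relies only on the documented statement that values lower than 1 may be lossy, so it does not guarantee that the overall output is lossless, since color images can still be recompressed.

```csharp
using GdPicture14; // Assumption: namespace of the GdPicture.NET 14 assemblies.

public static class ConservativeJbig2Sketch
{
    public static void Main()
    {
        using GdPicturePDFReducer reducer = new GdPicturePDFReducer();

        // Keep JBIG2 for bitonal images, but avoid thresholds below 1,
        // which the property description marks as potentially lossy.
        reducer.PDFReducerConfiguration.EnableJBIG2 = true;
        reducer.PDFReducerConfiguration.JBIG2PMSThreshold = 1f;

        // Leave character repair at its documented default so glyph shapes are not altered.
        reducer.PDFReducerConfiguration.EnableCharRepair = false;

        // Run the compression process.
        reducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");
    }
}
```

The trade-off is a larger output file: pattern matching and substitution saves space precisely by replacing similar-looking symbols with a shared one.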