LAST REVISED: 06/10/19 22:34
CPC TOOL
USER'S GUIDE
VERSION 5.2

CPC Tool, the CPC compressor/decompressor application, allows for command-line conversion of documents between a variety of image file formats including TIFF, CALS, JEDMICS, and Cartesian's own CPC format. CPC Tool accepts a series of input image files, each holding one or more pages in one of the supported formats, and generates a single file that contains all of the pages in all of the input files. This document provides information about Version 5.2.x and later of CPC Tool.



1. Basic Operation

CPC Tool is a command-line program. To run CPC Tool, you must use a command-line interpreter such as an MS-DOS console window under Microsoft Windows or one of the many shell programs available for Unix and MacOS X. The command-line arguments are used to specify the input and output files for the conversion, as well as other conversion parameters.


1.1. Installation

The CPC Tool executable for the appropriate architecture should be downloaded and stored in a file named "cpctool" (or "cpctool.exe" under Windows).

On Unix and MacOS systems, you also need to mark the downloaded file as executable, using the chmod command. For example, if you saved the downloaded file as "/usr/local/bin/cpctool", the following command would mark the file as executable:

chmod +x /usr/local/bin/cpctool


1.2. Obtaining a command summary

If run without command-line arguments, CPC Tool will display a brief help message enumerating the available command-line switches:

Usage: CPCTool [<params>] ifile1 [ifile2...]
 where: ifiles are the pages to be encoded
        <params> is any combination of the following flags:
  -B[atch]                   Convert each input file to a separate output file
  -C[lobberMode] <val>       Set batch clobber mode (query, skip, overwrite)
  -D[irectoryMode] <val>     Set batch descend mode (ignore, descend, package)
  -E[xtractMetaXML]          Generate an XML description of the document meta-data
  -F[ormat] <val>            Specify explicit output format (cpc, pbm, tiff)
  -G[etPageCount]            Display the page counts of each input document
  -I[nputFilter] <val>       Input filter for batch mode (cpc, pbm, tiff, any)
  -M[etaXML] <val>           Set the document meta-data to that of the specified XML file
  -N[umPages] <val>          The number of pages to copy
  -O[utFile] <val>           Output file name
  -Q[uiet]                   Disable normal messages
  -R[esolution] <val>        Specify page resolution for output dictionary
  -R[eviewLicense]           Review the end-user license agreement for CPC Tool
  -S[kipPages] <val>         Number of pages to skip in the input stream
  -T[imings]                 Time execution
  -V[ersion]                 Display version and exit

All switches are case-insensitive and can be abbreviated to their shortest unambiguous prefix. This means that every switch other than -Resolution and -ReviewLicense can be abbreviated to its first letter.
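The abbreviation rule can be illustrated with a short Python sketch. It only mirrors the rule as stated above, using the switch names from the help message; it is not CPC Tool's own argument parser, and the function name resolve_switch is made up for this illustration.

```python
# Switch names copied from the help message above.
SWITCHES = [
    "Batch", "ClobberMode", "DirectoryMode", "ExtractMetaXML", "Format",
    "GetPageCount", "InputFilter", "MetaXML", "NumPages", "OutFile",
    "Quiet", "Resolution", "ReviewLicense", "SkipPages", "Timings", "Version",
]


def resolve_switch(arg):
    """Return the full switch name matched by a possibly abbreviated argument.

    Matching is case-insensitive. A prefix shared by several switches is
    rejected as ambiguous (hypothetical helper, not part of CPC Tool).
    """
    prefix = arg.lstrip("-").lower()
    if not prefix:
        raise ValueError("empty switch")
    matches = [name for name in SWITCHES if name.lower().startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown switch: {arg}")
    raise ValueError(f"ambiguous switch {arg}: {', '.join(matches)}")


print(resolve_switch("-o"))        # OutFile
print(resolve_switch("-RES"))      # Resolution
print(resolve_switch("-extract"))  # ExtractMetaXML
```

Under this rule, -r and -re are rejected because both -Resolution and -ReviewLicense begin with those letters.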


1.3. Supported Formats

CPC Tool supports a number of different image file formats. Not all formats are supported with the same capabilities. Some formats (e.g., CPC) support only binary (black-and-white) images. Other formats (e.g., TIFF) can support color or grayscale images as well as binary images. Certain formats (e.g., CALS) are read-only -- CPC Tool can read images in these formats, but cannot generate images in these formats. Other formats (e.g., PDF) are write-only -- CPC Tool can write images in these formats, but cannot decode images in these formats.

The following table describes the file formats supported by CPC Tool. The Color column describes whether or not CPC Tool supports color and grayscale images in the format; the Read column describes whether or not CPC Tool can read images in the format; the Write column describes whether or not CPC Tool can write images in the format.

Format           Extensions          Color  Read  Write
CPC              cpc, cpi            -      Yes   Yes
TIFF             tiff, tif           Yes    Yes   Yes
CALS             cals, cal, ct1      -      Yes   -
JEDMICS C4       c4, ct4             -      Yes   -
Portable Bitmap  pbm, pgm, ppm, pnm  Yes    Yes   Yes
PDF              pdf                 -      -     Yes


1.4. Specifying Input Files

The filenames of the input files are specified as individual command-line arguments. At least one input file must be specified. If multiple input files are specified, the pages from the various input files will be combined into a single output document. If an input file contains multiple pages, all of the pages will be copied to the output document. (To copy selected pages from an input document, use the -NumPages and -SkipPages switches.)

The format of the input files is determined on the basis of the file contents; the file extension is not used to determine input file format.

1.4.1. Wildcards

Input files may be specified via wildcards.


1.5. Specifying the Output File

The output file is specified using the -OutFile filename switch, where filename is the name of the output file. CPC Tool derives the output file format from the output filename extension. The recognized extensions are defined above in the file format table.

The output file format can be specified explicitly via the -Format command-line switch.

If the output filename extension is not a recognized format name, the output format defaults to CPC, unless the first input file is in CPC format, in which case the output format defaults to TIFF.
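The extension rule and its fallback can be sketched in Python as follows. This is an illustration of the rule as described, not CPC Tool's code. Two points are assumptions: that extensions are matched case-insensitively, and that the extensions of read-only formats (such as cals or c4) count as unrecognized for output. An explicit -Format switch is not modeled.

```python
import os

# Writable formats and their extensions, taken from the file format table.
WRITABLE_EXTENSIONS = {
    "cpc": "CPC", "cpi": "CPC",
    "tiff": "TIFF", "tif": "TIFF",
    "pbm": "Portable Bitmap", "pgm": "Portable Bitmap",
    "ppm": "Portable Bitmap", "pnm": "Portable Bitmap",
    "pdf": "PDF",
}


def infer_output_format(out_name, first_input_is_cpc):
    """Guess the output format from the output file name (hypothetical helper).

    Falls back to CPC, or to TIFF when the first input file is a CPC file.
    """
    # Assumption: case-insensitive extension matching.
    ext = os.path.splitext(out_name)[1].lstrip(".").lower()
    if ext in WRITABLE_EXTENSIONS:
        return WRITABLE_EXTENSIONS[ext]
    return "TIFF" if first_input_is_cpc else "CPC"


print(infer_output_format("out.pdf", first_input_is_cpc=False))  # PDF
print(infer_output_format("out.dat", first_input_is_cpc=False))  # CPC
print(infer_output_format("out.dat", first_input_is_cpc=True))   # TIFF
```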

If the output file name is "@", the output data is discarded, as if piped to /dev/null on Unix.

If neither a format nor a file is specified, image data is sent to a null image sink. This may be useful for timing decompression.


1.6. Extracting Pages from a Multi-Page Document

The -SkipPages switch allows specification of a number of pages to skip in the input file. The -NumPages switch allows specification of the number of pages (following any skipped pages) to process. Using these switches, single pages or subsequences of pages can be extracted from multi-page files.

Note: These switches may only be used in conjunction with a single input file.
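As a sketch of how the two switches combine, the following Python helper builds an argument list that copies a run of pages starting at a given 1-based page. The helper name and the file names are hypothetical; only the -SkipPages, -NumPages, and -OutFile switches come from this guide.

```python
def extract_pages_args(infile, outfile, first_page, page_count):
    """Build a cpctool argument list that copies page_count pages,
    starting at the 1-based page first_page, from a single input file."""
    if first_page < 1 or page_count < 1:
        raise ValueError("first_page and page_count must be at least 1")
    args = ["cpctool", infile]
    skip = first_page - 1  # pages before first_page are skipped
    if skip:
        args += ["-SkipPages", str(skip)]
    args += ["-NumPages", str(page_count), "-OutFile", outfile]
    return args


# Pages 3 through 5 of a hypothetical report.tiff:
print(" ".join(extract_pages_args("report.tiff", "pages3to5.tiff", 3, 3)))
```

The resulting list can be passed to subprocess.run to perform the conversion on a machine where cpctool is installed.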


1.7. Specifying Image Resolution

Many image formats provide space for image resolution information. By default, CPC Tool will copy this resolution information to the output document. The -Resolution switch allows explicit specification of the image resolution for output documents. (If the output format does not support resolution information, this switch is ignored.)

The resolution value is specified in dots per inch. If the horizontal and vertical resolutions are the same, the resolution is specified as a single number. For example, to specify 300 dpi as both the horizontal and vertical resolution, use -Resolution 300.

If the horizontal and vertical resolutions differ, supply both values separated by an x. For example, to specify a horizontal resolution of 200 dpi and a vertical resolution of 100 dpi, use -Resolution 200x100.

By default, the specification is used as the default resolution for all pages in the output document. To specify the resolution of a specific page, add a /n suffix, where n is the 1-based page number. For example, to set the resolution specification for page 1 to 400 dpi, use -Resolution 400/1.

You can concatenate multiple resolution specifications together by separating them with semi-colons. For example, the argument -Resolution 300;200/1;400/3 specifies that page 1 is 200 dpi; page 3 is 400 dpi; and all other pages in the document are 300 dpi.

Note: This flag has no effect whatsoever on the image data. It merely modifies the standard header information that accompanies the image data in these formats.
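The resolution syntax described above can be assembled programmatically. The Python sketch below is one way to do so; the function names are invented for illustration, and combining the x form with a /n page suffix (for example 200x100/2) is an assumption that the guide does not confirm.

```python
def format_dpi(dpi):
    """Format one resolution: an int, or a (horizontal, vertical) pair."""
    if isinstance(dpi, tuple):
        horizontal, vertical = dpi
        if horizontal != vertical:
            return f"{horizontal}x{vertical}"
        dpi = horizontal
    return str(dpi)


def resolution_spec(default=None, per_page=None):
    """Build the value for -Resolution from a document default and a
    {page_number: dpi} mapping (page numbers are 1-based)."""
    parts = []
    if default is not None:
        parts.append(format_dpi(default))
    for page in sorted(per_page or {}):
        if page < 1:
            raise ValueError("page numbers are 1-based")
        parts.append(f"{format_dpi(per_page[page])}/{page}")
    if not parts:
        raise ValueError("no resolution given")
    return ";".join(parts)


print(resolution_spec(300, {1: 200, 3: 400}))  # 300;200/1;400/3
print(resolution_spec((200, 100)))             # 200x100
```

When typing such a value directly into a Unix shell, quote it, because an unquoted semicolon ends the command.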


1.8. Determining Document Page Counts

The -GetPageCount option causes CPC Tool to display the number of pages in each of the specified input files. The page counts are printed to standard output. Conversions are not performed.

The -OutFile parameter is not used. To save the page counts in a file, use output redirection (e.g., cpctool -get xx.cpc > pagecount.txt).

This option is available in CPC Tool version 5.1.9 and later.


1.9. Examples



2. Output Format Modifiers

The format of the output file may be specified explicitly using the -Format formatname switch, where formatname is one of the extensions listed in the file format table.

If an explicit format is specified without an output file, the output is directed to the standard output. Currently, only PBM format output may be directed to the standard output. (All diagnostic and progress messages are sent to standard error, except for timing results. If you send to standard output with timing enabled, the ASCII timing message will be appended to the end of the image data.)

Several output formats accept additional format-specific modifiers. These modifiers are discussed below. Format modifiers are constructed by appending a colon, followed by the modifiers, to the format name. For example, the argument -Format foo:bar would apply the bar modifier to the mythical foo output format.


2.1. CPC Modifiers

ProgressiveCPC is a variant of the CPC format that allows page images to be rendered more quickly at a very small loss in compression (only 0.7% in our QA test suite). Except for extremely space-critical applications, ProgressiveCPC format is generally the preferred format. By default, CPC Tool will use the ProgressiveCPC variant for CPC output documents. For extremely space-critical applications, the original modifier may be used to select the OriginalCPC variant.

2.1.1. Examples

The following command will convert the input file (in.tiff) to the OriginalCPC variant and store the results in out.cpc.

CPCTool in.tiff -o out.cpc -f cpc:original


2.2. PDF Modifiers

The following PDF format modifiers are accepted. Multiple modifiers should be separated by semi-colons.

2.2.1. Examples

The following command will convert the input file (in.tiff) to PDF format and store the results in out.pdf. The PDF file will contain thumbnails and will specify that the initial display of the document should be scaled to fit the window.

CPCTool in.tiff -o out.pdf -f pdf:thumbnails;scale=Fit


2.3. TIFF Modifiers

By default, CPC Tool uses CCITT G4 compression when encoding binary (black-and-white) TIFF images and LZW compression when encoding grayscale or color TIFF images. The following modifiers can be used to select alternate compression methods:

If you are creating multi-page TIFF files that contain both binary (black-and-white) and non-binary (color or grayscale) images, you might need to specify different compression method modifiers for the two categories of images. To do so, specify the binary compression method modifier, followed by the non-binary compression method modifier, separated by a forward slash (/). For example, the composite modifier Uncompressed/Lzw would cause CPC Tool to encode binary images without compression, while encoding color and grayscale images with LZW compression.

To specify the default compression method for an image category, use the Default modifier. For example, the composite modifier Default/Uncompressed would cause CPC Tool to encode binary images with CCITT G4 compression, while encoding color and grayscale images without compression.

TIFF modifiers are available in version 5.2 or later of CPC Tool. Prior to version 5.2, CPC Tool always used CCITT G4 compression for binary images and encoded color or grayscale images without compression.

2.3.1. Examples

The following command will convert the input file (in.cpc) to TIFF format and store the results in out.tiff. Binary images will be encoded with CCITT G4 compression; grayscale and color images will be encoded without compression.

CPCTool in.cpc -o out.tiff -f tiff:default/uncompressed

This was the default mode of operation prior to CPC Tool version 5.2.
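The modifier syntax used throughout this section (a format name, a colon, then the modifiers) can be composed with a small helper. The following Python sketch is illustrative only: the function names are invented, semicolon separation is documented here only for PDF modifiers, and no modifier names are validated because the sketch knows only the ones quoted in this guide.

```python
def format_arg(format_name, modifiers=()):
    """Return the value for -Format: the bare format name, or the name
    followed by a colon and the semicolon-separated modifiers."""
    modifiers = [m for m in modifiers if m]
    if not modifiers:
        return format_name
    return format_name + ":" + ";".join(modifiers)


def tiff_compression(binary="Default", non_binary="Default"):
    """Composite TIFF modifier: the binary method, a slash, then the
    method for color and grayscale images."""
    return f"{binary}/{non_binary}"


print(format_arg("cpc", ["original"]))                 # cpc:original
print(format_arg("pdf", ["thumbnails", "scale=Fit"]))  # pdf:thumbnails;scale=Fit
print(format_arg("tiff", [tiff_compression("Default", "Uncompressed")]))
```

The last call prints tiff:Default/Uncompressed, which matches the example above apart from letter case.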


3. Document Meta-data

Image files often contain arbitrary meta-data in addition to the images themselves. For example, an image file may contain an ASCII description of the file's contents. CPC Tool provides options to retrieve and set this meta-data.

CPC Tool models the meta-data as a set of named values (i.e., a dictionary). Each named value can be associated with either a specific page within the document or the document as a whole.


3.1. XML Description

The meta-data dictionary is described by XML. Currently, there is no DTD.

3.1.1. Tag: dictionary

The entire meta-data dictionary must be enclosed in a <dictionary> block:

<?xml version="1.0" ?>
<dictionary>
    list of meta-data dictionary entries
</dictionary>

The dictionary tag does not accept any attributes.

3.1.2. Tag: entry

Each name/value pair in the dictionary is described by an <entry> block:

<entry name="Description" page="3">
This is a description of page 3.
</entry>

The tag has one required attribute, name, which specifies the name portion of the name/value pair. The contents of the <entry> block specify the value portion of the name/value pair.

The tag accepts an optional page attribute, specifying the page to which the name/value pair applies. Pages are numbered from one. If the page attribute is omitted (or is zero), the name/value pair is not page-specific (i.e., it applies to the entire document).

The tag accepts an optional enc attribute, specifying the encoding of the block's contents. Currently, the only recognized value for the enc attribute is b64 indicating that the block's contents have been base64 encoded. This is useful if the value portion of the name/value pair contains significant amounts of binary data.
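A meta-data description of this form can be generated with any XML library. The sketch below uses Python's standard xml.etree.ElementTree and base64 modules; the function name build_metadata_xml and the tuple layout of its input are choices made for this illustration, not part of CPC Tool.

```python
import base64
import xml.etree.ElementTree as ET


def build_metadata_xml(entries):
    """Build a <dictionary> document from (name, value, page) tuples.

    A page of None or 0 means the entry applies to the whole document.
    Values given as bytes are base64 encoded and marked with enc="b64".
    """
    root = ET.Element("dictionary")
    for name, value, page in entries:
        attrs = {"name": name}
        if page:
            attrs["page"] = str(page)
        entry = ET.SubElement(root, "entry", attrs)
        if isinstance(value, bytes):
            entry.set("enc", "b64")
            entry.text = base64.b64encode(value).decode("ascii")
        else:
            entry.text = value
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" ?>\n' + body


xml_text = build_metadata_xml([
    ("Title", "A sample document", None),
    ("Workflow", b"data", None),
    ("Source", "Direct Scan", 1),
])
print(xml_text)
```

The text returned by the helper can be written to a file and supplied to CPC Tool with the -MetaXML switch.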

3.1.3. Example

Consider the following XML meta-data description:

<?xml version="1.0" ?>
<dictionary>
<entry name="Title">A sample document</entry>
<entry name="Workflow" enc="b64">ZGF0YQ==</entry>
<entry name="Source" page="1">Direct Scan</entry>
<entry name="Source" page="2">Email</entry>
</dictionary>

This example describes a meta-data dictionary containing four entries:

Name      Value                  Page
Title     A sample document      Applies to the entire document
Workflow  data (Base64 encoded)  Applies to the entire document
Source    Direct Scan            Applies to page 1
Source    Email                  Applies to page 2
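The same interpretation can be reproduced in code. The following Python sketch parses the example description with the standard xml.etree.ElementTree module and decodes base64 values. The function name and the use of page 0 for document-wide entries are conventions of this sketch. Whether CPC Tool treats leading or trailing whitespace in a value as significant is not stated, so the sketch leaves text values untouched.

```python
import base64
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" ?>
<dictionary>
<entry name="Title">A sample document</entry>
<entry name="Workflow" enc="b64">ZGF0YQ==</entry>
<entry name="Source" page="1">Direct Scan</entry>
<entry name="Source" page="2">Email</entry>
</dictionary>"""


def parse_metadata_xml(xml_text):
    """Return {(name, page): value}; page 0 stands for the whole document."""
    result = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        page = int(entry.get("page", "0"))
        text = entry.text or ""
        if entry.get("enc") == "b64":
            value = base64.b64decode(text)  # bytes, e.g. b"data"
        else:
            value = text
        result[(entry.get("name"), page)] = value
    return result


for key, value in parse_metadata_xml(SAMPLE).items():
    print(key, value)
```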


3.2. Extracting Document Meta-data

The -ExtractMetaXML option causes CPC Tool to display an XML description of the meta-data for the specified input file. The XML description is printed to standard output. Only a single input file can be specified. Conversions are not performed.

The -OutFile parameter is not used. To save the XML in a file, use output redirection (e.g., cpctool -extract xx.cpc > meta.xml).


3.3. Setting Document Meta-Data

The -MetaXML filename option causes CPC Tool to load the meta-data from the XML description contained in the file filename. The meta-data is stored in the output file of the requested conversion operation.

This option is only supported for output files in the CPC file format.

The specified meta-data overrides any meta-data that would normally be stored in the output file of the conversion operation. However, any named values normally produced by the conversion operation that are distinct from those specified by the XML file will remain in the output file. To produce an exact duplicate of the XML file in the output file, precede the name of the XML file with an = (e.g., -MetaXML =meta.xml).


3.4. Examples



4. Batch Processing

In addition to wildcards, CPC Tool has several command line switches which simplify batch conversion of large image repositories. Batch mode is enabled using the -Batch command line argument. The default behavior in batch mode is to convert each input file to a separate output file. If no explicit output format is specified (via the -Format switch), each CPC input file is converted to a TIFF output file and each non-CPC input file is converted to a CPC file. If an explicit output format is specified, each input file is converted to that format.

Under the default behavior, there is a one-to-one correspondence between input files and output files. The output file name is constructed from the input file name by stripping the input file extension and appending the file extension appropriate for the output format. For example, the input file abc/junk.tiff would be converted to the CPC output file abc/junk.cpc.


4.1. Output Directory

A batch output directory may be specified via the -OutFile dirname switch. The output directory is prepended to all output file names, and any missing sub-directories are created by CPC Tool. For example, if the output directory is specified as -OutFile outdir, the input file abc/junk.tiff is converted to the CPC output file outdir/abc/junk.cpc; if outdir or outdir/abc do not exist, they are created by CPC Tool.

Note: If an output directory is specified, all of the command line file arguments must be specified as relative pathnames; absolute pathnames may not be used in conjunction with an output directory.
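The naming rules for batch output files can be summarized in a few lines of Python. This is a sketch of the rules as described in this section, not CPC Tool's implementation; the helper name is invented, and the caller supplies the output extension because the guide does not say which extension batch mode picks for each format.

```python
import posixpath


def batch_output_path(input_path, output_ext, outdir=None):
    """Map a batch input file to its output file name.

    The input extension is replaced by output_ext; if an output
    directory is given, it is prepended to the relative input path.
    """
    stem, _ = posixpath.splitext(input_path)
    output = f"{stem}.{output_ext}"
    if outdir is None:
        return output
    if posixpath.isabs(input_path):
        raise ValueError(
            "absolute input paths cannot be combined with an output directory")
    return posixpath.join(outdir, output)


print(batch_output_path("abc/junk.tiff", "cpc"))            # abc/junk.cpc
print(batch_output_path("abc/junk.tiff", "cpc", "outdir"))  # outdir/abc/junk.cpc
```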


4.2. ClobberMode

If an output file already exists, CPC Tool will query the user as to how to proceed. The query allows the user to specify whether the output file should be skipped or overwritten. The user can also specify that the entire job should be aborted, that all subsequent existing files should be skipped, or that all subsequent existing files should be overwritten.

By specifying a clobber-mode, -ClobberMode mode, you can specify the overwrite behavior on the command line, avoiding the interactive queries. Mode can be one of the following: query (the default), skip, or overwrite.



4.3. DirectoryMode

If an input file is a directory, CPC Tool will convert each of the image files contained in that directory to a separate output file. For example, if the directory docs contains three files, p1.tiff, p2.tiff, and p3.tiff, the command CPCTool -o output -batch docs would create three output files output/docs/p1.cpc, output/docs/p2.cpc, and output/docs/p3.cpc.

By default, any sub-directories contained in the specified input directories are ignored. The default handling of directories can be modified using the -DirectoryMode mode switch. The mode specifier must be one of the following: ignore (the default), descend, or package.


4.3.1. Example

For example, assume the following directory structure:

By default, when processing Docs, CPC Tool would ignore the sub-directories Doc1 and Doc2. Hence, the command CPCTool -batch -f cpc Docs would produce:

When the directory mode is set to descend, CPC Tool will process the sub-directories Doc1 and Doc2. Hence, the command CPCTool -batch -f cpc -dir descend Docs would produce the following:

Whereas, the command CPCTool -batch -f cpc -dir package Docs would produce the following:


4.4. InputFilter

By default, CPC Tool will attempt to convert every image file contained in the specified directories and sub-directories. By specifying an input filter, -InputFilter type, you can restrict, by file format, the image files considered by CPC Tool as candidates for conversion. Type must be one of the following: cpc, tiff, cals, c4, pbm, or any. The input filter causes CPC Tool to ignore all image files that do not match the specified type. (any is used to specify no input filter and is the default.) The input filter is only used when searching directories; it does not apply to files specified on the command line.


4.5. Position-Sensitive Arguments

The batch modification arguments can, alternatively, be specified in a position-sensitive manner, by replacing the leading '-' with a '+'. Unlike normal command line arguments, position-sensitive arguments only apply to those input files specified after the position-sensitive argument on the command line. Furthermore, there can be multiple position-sensitive instances of a particular argument. A particular input file is always processed according to the most recently preceding position-sensitive argument(s) on the command line.

For example, the command CPCTool -batch Doc1 +clobber skip Doc2 +clobber overwrite Doc3 would process Doc1 with the default clobber-mode (query); Doc2 with the clobber-mode set to skip; and Doc3 with the clobber-mode set to overwrite.
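The scoping rule can be made concrete with a small simulation. The Python sketch below walks an argument list and reports which clobber-mode would apply to each input file. It is a deliberately narrow illustration: it understands only +ClobberMode (in any abbreviation) and switches that take no value, such as -batch, and it is not CPC Tool's actual parser.

```python
def clobber_modes(args, default="query"):
    """Return [(input_file, clobber_mode)] for an argument list that
    uses position-sensitive +ClobberMode switches."""
    current = default
    result = []
    it = iter(args)
    for arg in it:
        is_clobber = (arg.startswith("+") and len(arg) > 1
                      and "clobbermode".startswith(arg[1:].lower()))
        if is_clobber:
            value = next(it, None)  # the mode follows the switch
            if value is None:
                raise ValueError(f"{arg} requires a value")
            current = value
        elif arg.startswith(("-", "+")):
            continue  # value-less switches only; others are out of scope
        else:
            result.append((arg, current))
    return result


command = ["-batch", "Doc1", "+clobber", "skip", "Doc2",
           "+clobber", "overwrite", "Doc3"]
for name, mode in clobber_modes(command):
    print(name, mode)
```

For the command in the example above, the sketch pairs Doc1 with query, Doc2 with skip, and Doc3 with overwrite.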


5. Batteries

When you install CPC Tool, you will be able to use it in evaluation mode. Evaluation mode allows you to convert a total of 1000 pages of non-CPC input. (There is no limit to the number of CPC input pages which can be converted.)

Once you have used CPC Tool to convert 1000 pages of non-CPC input, you will need a conversion battery to perform additional conversions. Conversion batteries provide conversion capabilities for a fixed number of non-CPC input pages on a specific computer.

To order a battery, you must supply the CPC Tool machine ID of the computer on which you want to use CPC Tool. The CPC Tool machine ID is displayed when you run CPC Tool with no command-line arguments.


5.1. Installation

To install a virtual battery, run CPC Tool with the battery's file name as the only command-line argument. For example, if the battery is in a file named Cpc.Battery, you would run the command:

CPCTool Cpc.Battery

Once you have installed the battery, you can dispose of the battery file.


5.2. Discharge

You can also discharge your existing CPC Tool batteries and return the unused portion for a replacement or rebate. The command:

CPCTool -f DischargeBattery -o Cpc.DischargeBattery input-file

will discharge the remaining pages in your CPC Tool battery, leaving the discharge record in the file Cpc.DischargeBattery. Simply email the discharge record to Cartesian Products for a rebate or replacement battery.

A valid image file must be specified as the input-file. The file is not used in the discharge process.

Warning: The discharge operation cannot be undone. Once you have discharged your battery, you will not be able to perform non-CPC input file conversions without installing a new battery.

© 1998-2005 Cartesian Products, Inc.