LAST REVISED: 06/10/19 22:34
CPC TOOL
USER'S GUIDE
VERSION 5.2
CPC Tool, the CPC compressor/decompressor application, allows
for command-line conversion of documents between a variety of image file
formats including TIFF, CALS, JEDMICS,
and Cartesian's own CPC format.
CPC Tool accepts a series of input image files,
each holding one or more pages in one of the supported formats, and generates
a single file that contains all of the pages in all of the input files.
This document provides information about
Version 5.2.x and
later of CPC Tool.
CPC Tool is a command-line program. To run CPC Tool,
you must use a command-line interpreter such as an MS-DOS console window
under Microsoft
Windows or one of the many shell programs available for Unix and MacOS X.
The command-line arguments are used to specify the input and output files for the
conversion, as well as other conversion parameters.
The CPC Tool executable for the appropriate architecture should be downloaded
and stored in a file named "cpctool" (or "cpctool.exe" under Windows).
On Unix and MacOS systems, you also need to mark the downloaded file as executable, using
the chmod command. For example, if you saved the downloaded file as "/usr/local/bin/cpctool",
the following command would mark the file as executable:
chmod +x /usr/local/bin/cpctool
If run without command-line arguments, CPC Tool will display
a brief help message enumerating the available command-line switches:
Usage: CPCTool [<params>] ifile1 [ifile2...]
where: ifiles are the pages to be encoded
<params> is any combination of the following flags:
-B[atch] Convert each input file to a separate output file
-C[lobberMode] <val> Set batch clobber mode (query, skip, overwrite)
-D[irectoryMode] <val> Set batch descend mode (ignore, descend, package)
-E[xtractMetaXML] Generate an XML description of the document meta-data
-F[ormat] <val> Specify explicit output format (cpc, pbm, tiff)
-G[etPageCount] Display the page counts of each input document
-I[nputFilter] <val> Input filter for batch mode (cpc, pbm, tiff, any)
-M[etaXML] <val> Set the document meta-data to that of the specified XML file
-N[umPages] <val> The number of pages to copy
-O[utFile] <val> Output file name
-Q[uiet] Disable normal messages
-R[esolution] <val> Specify page resolution for output dictionary
-R[eviewLicense] Review the end-user license agreement for CPC Tool
-S[kipPages] <val> Number of pages to skip in the input stream
-T[imings] Time execution
-V[ersion] Display version and exit
All switches are case-insensitive and can be abbreviated to their shortest unambiguous prefix.
This means that all switches other than -Resolution and
-ReviewLicense can be abbreviated to their first letter.
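The prefix rule can be illustrated with a short Python sketch. This is not CPC Tool's own code: resolve_switch is a hypothetical helper, and the switch names are simply copied from the help message above.

```python
# Switch names copied from the help message above.
SWITCHES = [
    "Batch", "ClobberMode", "DirectoryMode", "ExtractMetaXML", "Format",
    "GetPageCount", "InputFilter", "MetaXML", "NumPages", "OutFile",
    "Quiet", "Resolution", "ReviewLicense", "SkipPages", "Timings", "Version",
]


def resolve_switch(arg):
    """Return the full switch name for an abbreviation such as '-o'.

    Hypothetical helper: raises ValueError when the prefix matches no
    switch, or matches more than one switch.
    """
    prefix = arg.lstrip("-").lower()
    if not prefix:
        raise ValueError("empty switch")
    # Case-insensitive prefix match against every known switch.
    matches = [name for name in SWITCHES if name.lower().startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous switch: {arg!r}")
    return matches[0]


print(resolve_switch("-o"))        # OutFile
print(resolve_switch("-EXTRACT"))  # ExtractMetaXML
print(resolve_switch("-res"))      # Resolution
```

In this model, -r and -re are rejected as ambiguous because both Resolution and ReviewLicense start with those letters; -res and -rev are the shortest prefixes that tell them apart.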
CPC Tool supports a number of different image file formats. Not all formats
are supported with the same capabilities. Some formats (e.g., CPC)
support only binary (black-and-white) images. Other formats (e.g., TIFF)
can support color or grayscale images as well as binary images.
Certain formats (e.g., CALS) are read-only -- CPC Tool can read images
in these formats, but cannot generate images in these formats. Other formats
(e.g., PDF) are write-only -- CPC Tool can write images in these formats,
but cannot decode images in these formats.
The following table describes the file formats supported by CPC Tool.
The Color column describes whether or not
CPC Tool supports color and grayscale images in the format; the Read
column describes whether or not
CPC Tool can read images in the format; the Write column describes
whether or not CPC Tool can write images in the format.
Format | Extensions | Color | Read | Write |
CPC | cpc, cpi | - | Yes | Yes |
TIFF | tiff, tif | Yes | Yes | Yes |
CALS | cals, cal, ct1 | - | Yes | - |
JEDMICS C4 | c4, ct4 | - | Yes | - |
Portable Bitmap | pbm, pgm, ppm, pnm | Yes | Yes | Yes |
PDF | pdf | - | - | Yes |
- Color and black-and-white images can be freely interspersed within a single
multi-page TIFF file.
- CPC Tool will not convert color or grayscale images to black-and-white formats.
The filenames of the input files are specified as individual command-line arguments.
At least one input file must be specified. If multiple input files are specified,
the pages from the various input files will be combined into a single output document. If
an input file contains multiple pages, all of the pages will be copied to the output document.
(To copy selected pages from an input document, use the
-NumPages or -SkipPages switches.)
The format of the input files is determined on the basis of the file
contents; the file extension is not used to determine input file
format.
Input files may be specified via wildcards.
- Windows OS: Under Windows, all of the normal
DOS wildcard operators are supported. Within a filename, the question mark
(?) wildcard can be used to represent any single character, and the asterisk
(*) wildcard can be used to represent any string of zero or more characters.
Enclosing an argument in double quotation marks (" ") suppresses the wildcard
expansion. Within quoted arguments, you can represent quotation marks literally
by preceding the double quotation mark character with a backslash (\).
If no matches are found for the wildcard argument, the argument is passed
literally.
- Unix OS: Under Unix, the syntax and semantics of wildcards are defined
by the shell.
The output file is specified using the -OutFile filename switch, where
filename is the name of the output file. CPC Tool derives the
output file format from the output filename extension. The recognized extensions are
defined above in the file format table.
The output file format can be specified explicitly via the
-Format command-line switch.
If the output filename extension is not a recognized format name, the output format
defaults to CPC, unless the first input file is in CPC format, in which case
the output format defaults to TIFF.
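The selection rule described above can be modelled in a few lines of Python. This is a sketch under assumptions, not CPC Tool's implementation: output_format is a hypothetical helper, the pgm, ppm, and pnm extensions are assumed to map to the pbm format name, and a read-only extension such as .cal is simply treated as unrecognized because the text above does not say how it is handled.

```python
import os

# Writable formats from the file format table, keyed by extension.
# Assumption: all Portable Bitmap extensions map to the "pbm" format name.
WRITABLE_EXTENSIONS = {
    "cpc": "cpc", "cpi": "cpc",
    "tiff": "tiff", "tif": "tiff",
    "pbm": "pbm", "pgm": "pbm", "ppm": "pbm", "pnm": "pbm",
    "pdf": "pdf",
}


def output_format(out_name, first_input_is_cpc, explicit_format=None):
    """Model how the output format is chosen (hypothetical helper)."""
    if explicit_format:
        # An explicit -Format value wins; drop anything after a colon.
        return explicit_format.split(":", 1)[0].lower()
    ext = os.path.splitext(out_name)[1].lstrip(".").lower()
    if ext in WRITABLE_EXTENSIONS:
        return WRITABLE_EXTENSIONS[ext]
    # Unrecognized extension: CPC, unless the first input is already CPC.
    return "tiff" if first_input_is_cpc else "cpc"


print(output_format("out.cpi", first_input_is_cpc=False))  # cpc
print(output_format("out.dat", first_input_is_cpc=True))   # tiff
```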
If the output file name is "@", the output data is discarded, as if
piped to /dev/null on Unix.
If neither a format nor a file is specified, image data is sent to
a null image sink. This may be useful for timing decompression.
The -SkipPages switch allows specification of a number of pages to
skip in the input file. The -NumPages switch allows specification
of the number of pages (following any skipped pages) to process.
Using these switches, single pages or subsequences of pages can be
extracted from multi-page files.
These switches may only be used in conjunction with a single input
file.
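As a worked example of the arithmetic, the following Python sketch converts a 1-based, inclusive page range into the two switch values. The helper name page_range_args is hypothetical.

```python
def page_range_args(first_page, last_page):
    """Return the switches selecting pages first_page..last_page (1-based, inclusive)."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    skip = first_page - 1               # pages before the range are skipped
    count = last_page - first_page + 1  # pages in the range are copied
    return ["-SkipPages", str(skip), "-NumPages", str(count)]


print(" ".join(page_range_args(2, 2)))  # -SkipPages 1 -NumPages 1
print(" ".join(page_range_args(3, 5)))  # -SkipPages 2 -NumPages 3
```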
Many image formats provide space for image resolution information.
By default, CPC Tool will copy this resolution information
to the output document. The -Resolution switch
allows explicit specification of the image resolution for output documents. (If the output
format does not support resolution information, this switch is ignored.)
The resolution value is specified in dots per inch. If the horizontal and vertical
resolutions are the same, the resolution is specified as a single number. For example,
to specify 300 dpi as both the horizontal and
vertical resolution, use -Resolution 300.
If the horizontal and vertical
resolutions differ, supply both values separated by an x. For example,
to specify a horizontal resolution of 200 dpi and a vertical resolution of 100 dpi, use
-Resolution 200x100.
By default, the specification is used as the default resolution for all pages in the
output document. To specify the resolution of a specific page, add a /n suffix,
where n is the 1-based page number. For example, to
set the resolution specification for page 1 to 400 dpi, use -Resolution 400/1.
You can concatenate multiple resolution specifications together by separating
them with semi-colons. For example, the argument
-Resolution 300;200/1;400/3
specifies that page 1 is 200 dpi; page 3 is 400 dpi; and all other pages in the
document are 300 dpi.
This flag has no effect whatsoever on the image data. It merely modifies the
standard header information that accompanies the image data in these formats.
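To make the resolution syntax concrete, here is a minimal Python sketch that parses a specification string into a default resolution and a set of per-page overrides. The function parse_resolution is hypothetical and only mirrors the syntax described above; it does not validate input the way CPC Tool might.

```python
def parse_resolution(spec):
    """Parse a -Resolution argument into (default, per_page).

    default is an (x_dpi, y_dpi) tuple, or None if no default is given;
    per_page maps 1-based page numbers to (x_dpi, y_dpi) tuples.
    """
    default = None
    per_page = {}
    for item in spec.split(";"):
        value, _, page = item.partition("/")    # optional /n page suffix
        x, _, y = value.lower().partition("x")  # optional HxV form
        dpi = (int(x), int(y) if y else int(x))
        if page:
            per_page[int(page)] = dpi
        else:
            default = dpi
    return default, per_page


print(parse_resolution("300;200/1;400/3"))
# ((300, 300), {1: (200, 200), 3: (400, 400)})
print(parse_resolution("200x100"))
# ((200, 100), {})
```

In most Unix shells a semi-colon ends a command, so an argument such as 300;200/1;400/3 will typically need to be enclosed in quotation marks when typed on the command line.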
The -GetPageCount option causes CPC Tool to display the
number of pages in each of the specified input files. The page counts are printed
to standard output. Conversions are not performed.
The -OutFile parameter is not used. To save the page counts in a file, use
output redirection (e.g., cpctool -get xx.cpc > pagecount.txt).
This option is available in CPC Tool version 5.1.9 and later.
- To compress a series of three pages in single-page
TIFF files (file1.tiff, file2.tiff, file3.tiff) into a single
multi-page CPC file (output.cpi), execute
CPCTool file1.tiff file2.tiff file3.tiff -o output.cpi
- To decompress the resulting file into a single file in multi-page TIFF
Group 4 format (reconstructed.tiff), execute
CPCTool output.cpi -o reconstructed.tiff
- To generate the TIFF reconstruction of page 2 from
output.cpi, execute:
CPCTool output.cpi -s 1 -n 1 -o page2only.tiff
- To create a single CPC document (out.cpc)
from all of the files in the current directory whose names
begin with page and end with .tiff, execute
CPCTool page*.tiff -o out.cpc
The format of the output file may be specified explicitly
using the -Format formatname switch, where formatname
is one of the extensions listed in the file format table.
If an explicit format is specified without an output file,
the output is directed to the standard output. Currently, only PBM format
output may be directed to the standard output.
(All diagnostic and progress messages are sent to standard
error, except for timing results. If you send image data to standard output
with timing enabled, the ASCII timing message will be appended to the
end of the image data.)
Several output formats
accept additional format-specific modifiers. These modifiers are
discussed below. Format modifiers are constructed by appending a colon, followed
by the modifiers, to the format name. For example, the argument
-Format foo:bar would apply the bar modifier
to the mythical foo output format.
ProgressiveCPC is a variant of the CPC format that allows page images to
be rendered more quickly at a very small loss in compression (only 0.7% in our QA test
suite). Except for extremely space-critical applications,
ProgressiveCPC format is generally the preferred format.
By default, CPC Tool will use the ProgressiveCPC
variant for CPC output documents.
For extremely space-critical applications, the original modifier may be
used to select the OriginalCPC variant.
The following command will convert the input file (in.tiff) to the
OriginalCPC variant and store the results in out.cpc.
CPCTool in.tiff -o out.cpc -f cpc:original
The following PDF format modifiers are accepted. Multiple modifiers should be
separated by semi-colons.
- thumbnails: Include thumbnails in the output. By default, CPC Tool
does not include thumbnails.
- page=n: Set the initial view page to n. Pages are
numbered from 1.
- scale=n: Set the initial display scale to n, which
specifies a floating-point magnification factor. For example, the modifier
scale=0.25 would cause a compliant PDF Viewer to initially display
the document at 1/4 size. The scale may also be specified as one of the
following strings:
  - FitWidth: Cause a compliant PDF Viewer to initially display the
  first page scaled to fit within the width of the viewing window.
  - FitHeight: Cause a compliant PDF Viewer to initially display the
  first page scaled to fit within the height of the viewing window.
  - Fit: Cause a compliant PDF Viewer to initially display the first
  page scaled to fit within the height and width of the viewing window.
The following command will convert the input file (in.tiff) to
PDF format and store the results in out.pdf. The PDF file will contain
thumbnails and will specify that the initial display of the document should
be scaled to fit the window.
CPCTool in.tiff -o out.pdf -f pdf:thumbnails;scale=Fit
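The colon and semi-colon syntax is easy to mistype, so the following Python sketch assembles a -Format value from a format name and a list of modifiers. The helper format_argument is hypothetical. The sketch also uses shlex.quote from the Python standard library to show the quoted form, since most Unix shells would otherwise treat the semi-colon as the end of the command.

```python
import shlex


def format_argument(format_name, modifiers=()):
    """Join a format name and its modifiers into one -Format value."""
    if not modifiers:
        return format_name
    return format_name + ":" + ";".join(modifiers)


value = format_argument("pdf", ["thumbnails", "scale=Fit"])
print(value)               # pdf:thumbnails;scale=Fit
print(shlex.quote(value))  # 'pdf:thumbnails;scale=Fit'
```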
By default, CPC Tool uses CCITT G4 compression when encoding
binary (black-and-white) TIFF images and LZW compression when
encoding grayscale or color TIFF images. The following modifiers can be used to select alternate
compression methods:
- LZW: Encode binary images with LZW compression rather than
CCITT G4. Typically, CCITT G4 compression
results in smaller TIFF files than LZW compression. However,
if the images contain predominantly half-tones or other graphics, LZW may
result in slightly smaller TIFF files than CCITT G4.
- Uncompressed: Do not compress the images. This will almost always
result in the largest TIFF files. However, if the images are predominantly random noise, it
is possible that turning off compression will result in slightly smaller TIFF files.
If you are creating multi-page TIFF files that contain both binary (black-and-white)
and non-binary (color or grayscale) images, you might need to specify different compression
method modifiers for the two categories of images. To do so, specify the binary compression
method modifier, followed by the non-binary compression method modifier, separated by a forward
slash (/). For example, the composite modifier Uncompressed/Lzw would cause
CPC Tool
to encode binary images without compression, while encoding color and grayscale images
with LZW compression.
To specify the default compression method for an image category, use the Default
modifier. For example, the composite modifier Default/Uncompressed would cause
CPC Tool
to encode binary images with CCITT G4 compression, while encoding color and grayscale images
without compression.
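The composite form can be read as a pair of choices, one per image category. The Python sketch below models only that composite form; resolve_composite is a hypothetical helper, and the labels ccitt-g4, lzw, and uncompressed are illustrative names rather than values CPC Tool reports.

```python
# Default compression per image category, as described above.
DEFAULTS = {"binary": "ccitt-g4", "non-binary": "lzw"}


def resolve_composite(modifier):
    """Resolve a 'binary/non-binary' composite modifier to compression methods."""
    binary, sep, non_binary = modifier.lower().partition("/")
    if not sep:
        raise ValueError("expected a composite modifier such as 'Default/Uncompressed'")
    chosen = {"binary": binary, "non-binary": non_binary}
    # 'default' selects the category's default method; anything else is kept.
    return {cat: DEFAULTS[cat] if m == "default" else m for cat, m in chosen.items()}


print(resolve_composite("Default/Uncompressed"))
# {'binary': 'ccitt-g4', 'non-binary': 'uncompressed'}
print(resolve_composite("Uncompressed/Lzw"))
# {'binary': 'uncompressed', 'non-binary': 'lzw'}
```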
TIFF modifiers are available in version 5.2 or later of CPC Tool.
Prior to version 5.2, CPC Tool always used CCITT G4 compression
for binary images and encoded color or grayscale images without compression.
The following command will convert the input file (in.cpc) to
TIFF format and store the results in out.tiff. Binary images
will be encoded with CCITT G4 compression; grayscale and color images
will be encoded without compression.
CPCTool in.cpc -o out.tiff -f tiff:default/uncompressed
This was the default mode
of operation prior to CPC Tool version 5.2.
Image files often contain arbitrary meta-data in addition to the images themselves.
For example, an image file may contain an ASCII description of the file's contents.
CPC Tool provides options to retrieve and set this meta-data.
CPC Tool models the meta-data as a set
of named values (i.e., a dictionary). Each named value can be associated
with either a specific page within the document
or the document as a whole.
The meta-data dictionary is described by XML. Currently, there is no DTD.
The entire meta-data dictionary must
be enclosed in a <dictionary> block:
<?xml version="1.0" ?>
<dictionary>
list of meta-data dictionary entries
</dictionary>
The dictionary tag does not accept any attributes.
Each name/value pair in the dictionary is described by an
<entry> block:
<entry name="Description" page="3">
This is a description of page 3.
</entry>
The <entry> tag has one required attribute: the name portion
of the name/value pair (name). The contents of
the <entry> block specify the value portion
of the name/value pair.
The tag accepts an optional page attribute, specifying
the page to which the name/value pair applies. Pages are numbered
from one. If the page attribute is omitted (or is zero), the name/value
pair is not page-specific (i.e., it applies to the entire document).
The tag accepts an optional enc attribute, specifying
the encoding of the block's contents. Currently, the only recognized
value for the enc attribute is b64, indicating that
the block's contents have been base64 encoded.
This is useful if the value portion of the name/value pair contains significant
amounts of binary data.
Consider the following XML meta-data description:
<?xml version="1.0" ?>
<dictionary>
<entry name="Title">A sample document</entry>
<entry name="Workflow" enc="b64">ZGF0YQ==</entry>
<entry name="Source" page="1">Direct Scan</entry>
<entry name="Source" page="2">Email</entry>
</dictionary>
This example describes a meta-data dictionary containing
four entries:
Name | Value | Page |
Title | A sample document | Applies to the entire document |
Workflow | data (Base64 encoded) | Applies to the entire document |
Source | Direct Scan | Applies to page 1 |
Source | Email | Applies to page 2 |
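The sample dictionary can be read with any XML parser. The following Python sketch uses the standard xml.etree.ElementTree and base64 modules to reproduce the four entries above. The function read_dictionary is hypothetical, and stripping surrounding whitespace from each value is an assumption; this guide does not say whether CPC Tool treats that whitespace as significant.

```python
import base64
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" ?>
<dictionary>
<entry name="Title">A sample document</entry>
<entry name="Workflow" enc="b64">ZGF0YQ==</entry>
<entry name="Source" page="1">Direct Scan</entry>
<entry name="Source" page="2">Email</entry>
</dictionary>"""


def read_dictionary(xml_text):
    """Return {(name, page): value}; page 0 means the entire document."""
    entries = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        text = (entry.text or "").strip()  # assumption: outer whitespace is not significant
        if entry.get("enc") == "b64":
            value = base64.b64decode(text)  # decoded values are bytes
        else:
            value = text
        page = int(entry.get("page", "0"))  # a missing page attribute means document-wide
        entries[(entry.get("name"), page)] = value
    return entries


for key, value in read_dictionary(SAMPLE).items():
    print(key, value)
```

Keying each entry by both name and page keeps the two Source entries distinct, which matches the page-specific behavior described above.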
The -ExtractMetaXML option causes CPC Tool to display an XML
description of the meta-data for the specified input file. The
XML description is printed to standard output.
Only a single
input file can be specified. Conversions are not performed.
The -OutFile parameter is not used. To save the XML in a file, use
output redirection (e.g., cpctool -extract xx.cpc > meta.xml).
The -MetaXML filename option causes CPC Tool to load the
meta-data from the XML description contained in the file filename.
The meta-data is stored in the output file of the requested conversion operation.
This option is only supported for output files in the CPC file format.
The specified meta-data overrides any meta-data that would normally be stored
in the output file of the conversion operation. However, any named values
normally produced by the conversion operation that are
distinct from those specified by the XML file will remain in the output file.
To produce an exact duplicate of the XML file in the output file, precede the name of the
XML file with an = (e.g., -MetaXML =meta.xml).
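The difference between the two forms can be modelled as a dictionary merge versus a dictionary replacement. The Python sketch below is only an illustration: stored_metadata is a hypothetical helper, entries are keyed by (name, page) with page 0 meaning the entire document, and the entry names used in the example are invented.

```python
def stored_metadata(generated, from_xml, exact=False):
    """Model the dictionary written to the output file.

    generated: entries the conversion would normally store.
    from_xml:  entries loaded via -MetaXML.
    exact:     True models the '=' prefix (-MetaXML =meta.xml).
    """
    if exact:
        return dict(from_xml)  # only the XML entries survive
    merged = dict(generated)
    merged.update(from_xml)    # XML entries win on matching keys
    return merged


# Invented entry names, for illustration only.
generated = {("Title", 0): "old title", ("Scanner", 0): "unit 7"}
from_xml = {("Title", 0): "new title"}

print(stored_metadata(generated, from_xml))
# {('Title', 0): 'new title', ('Scanner', 0): 'unit 7'}
print(stored_metadata(generated, from_xml, exact=True))
# {('Title', 0): 'new title'}
```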
- To extract the XML description of the meta-data contained in the
file named foo.cpc, execute
CPCTool -extract foo.cpc
- To copy the images in in.cpc to the file out.cpc, augmenting
the meta-data with the meta-data described in the file meta.xml, execute
CPCTool in.cpc -o out.cpc -meta meta.xml
- To copy the images in in.cpc to the file out.cpc, replacing
the meta-data with the meta-data described in the file meta.xml, execute
CPCTool in.cpc -o out.cpc -meta =meta.xml
In addition to wildcards, CPC Tool has several command line switches which
simplify batch conversion of large image repositories. Batch mode is enabled
using the -Batch command line argument. The default behavior in
batch mode is to convert each input file to a separate
output file. If no explicit output format is specified (via the
-Format switch), each CPC input file is converted to a TIFF output
file and each non-CPC input file is converted to a CPC file.
If an explicit output format is specified, each input file is
converted to that format.
Under the default behavior, there is a one-to-one correspondence between
input files and output files. The output file name is constructed from
the input file name by stripping the input file extension and appending
the file extension appropriate for the output format. For example, the
input file abc/junk.tiff would be converted to the CPC output
file abc/junk.cpc.
A batch output directory may be specified via the -OutFile dirname
switch. The output directory is prepended to all output file names, and
any missing sub-directories are created by CPC Tool. For example, if
the output directory is specified as -OutFile outdir, the input file
abc/junk.tiff is converted to the CPC output file outdir/abc/junk.cpc;
if outdir or outdir/abc do not exist, they are created by CPC Tool.
Note: If an output directory is specified, all of the command line
file arguments must be specified as relative pathnames; absolute
pathnames may not be used in conjunction with an output directory.
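The naming rule can be sketched in Python as follows. The helper batch_output_path is hypothetical, and the sketch assumes forward-slash path names as used in the examples above.

```python
import posixpath


def batch_output_path(input_path, out_ext, out_dir=None):
    """Derive the batch-mode output path for one input file."""
    stem, _ = posixpath.splitext(input_path)  # strip the input extension
    output = stem + "." + out_ext             # append the output extension
    if out_dir is None:
        return output
    if posixpath.isabs(input_path):
        raise ValueError("absolute input paths cannot be combined with an output directory")
    return posixpath.join(out_dir, output)    # prepend the output directory


print(batch_output_path("abc/junk.tiff", "cpc"))            # abc/junk.cpc
print(batch_output_path("abc/junk.tiff", "cpc", "outdir"))  # outdir/abc/junk.cpc
```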
If an output file already exists, CPC Tool will query the user as to how
to proceed. The query allows the user to specify whether the output file
should be skipped or overwritten. The user can also specify that the entire
job should be aborted, or that all subsequent existing files should be
skipped or overwritten.
By specifying a clobber-mode, -ClobberMode mode, you can
specify the overwrite behavior on the command line, avoiding the interactive
queries. Mode can be one of the following:
- skip: Causes CPC Tool to skip any conversion jobs for which the
output file already exists. This is useful when performing batch operations
on a directory structure which has already been partially converted.
- overwrite: Causes CPC Tool to overwrite the pre-existing file.
- query: Causes CPC Tool to interactively query the user. This is
the default behavior.
If an input file is a directory, CPC Tool will convert each of the image
files contained in that directory to a separate output file. For example,
if the directory docs contains three files, p1.tiff, p2.tiff,
and p3.tiff, the command
CPCTool -o output -batch docs
would create three output files: output/docs/p1.cpc, output/docs/p2.cpc,
and output/docs/p3.cpc.
By default, any sub-directories contained in the specified input directories
are ignored. The default handling of directories can be modified using
the -DirectoryMode mode switch. The mode
specifier must be one of the following:
- descend: Searches for image files to convert in all
descendant directories of the specified directory. Each
image file is converted to a separate output file as described above.
- package: Package all of the files found in a directory into a single
output file containing the concatenation of all of the pages contained
in all of the input files. The input file names are sorted alpha-numerically.
The output file name is identical to the directory name with the appropriate
format extension appended. While in package mode, CPC Tool will
also recursively descend into any sub-directories looking for additional
sub-directories to package.
- ignore: Ignore all sub-directories. This is the default.
For example, assume the following directory structure:
- Docs
  - Intro1.tiff
  - Intro2.tiff
  - Doc1
    - Page1.tiff
    - Page2.tiff
  - Doc2
    - Page1.tiff
    - Page2.tiff
    - Page3.tiff
By default, when processing Docs, CPC Tool would ignore the sub-directories
Doc1 and Doc2. Hence, the command CPCTool -batch -f cpc
Docs would produce:
- Docs
  - Intro1.tiff
  - Intro1.cpc
  - Intro2.tiff
  - Intro2.cpc
  - Doc1
    - Page1.tiff
    - Page2.tiff
  - Doc2
    - Page1.tiff
    - Page2.tiff
    - Page3.tiff
When the directory mode is set to descend, CPC Tool will process
the sub-directories Doc1 and Doc2. Hence, the command CPCTool
-batch -f cpc -dir descend Docs would produce the following:
- Docs
  - Intro1.tiff
  - Intro1.cpc
  - Intro2.tiff
  - Intro2.cpc
  - Doc1
    - Page1.tiff
    - Page1.cpc
    - Page2.tiff
    - Page2.cpc
  - Doc2
    - Page1.tiff
    - Page1.cpc
    - Page2.tiff
    - Page2.cpc
    - Page3.tiff
    - Page3.cpc
Whereas, the command CPCTool -batch -f cpc -dir package Docs would
produce the following:
- Docs
  - Intro1.tiff
  - Intro2.tiff
  - Doc1
    - Page1.tiff
    - Page2.tiff
  - Doc1.cpc (contains Doc1/Page1.tiff and Doc1/Page2.tiff)
  - Doc2
    - Page1.tiff
    - Page2.tiff
    - Page3.tiff
  - Doc2.cpc (contains Doc2/Page1.tiff, Doc2/Page2.tiff, and Doc2/Page3.tiff)
- Docs.cpc (contains Intro1.tiff and Intro2.tiff)
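The three directory modes can be compared side by side with a small Python model. This is a sketch, not CPC Tool's algorithm: plan_outputs is a hypothetical helper that works on a list of relative path names rather than a real file system, Doc1 is taken to hold Page1.tiff and Page2.tiff as the descend and package listings indicate, and "sorted alpha-numerically" is approximated by a plain string sort.

```python
import posixpath
from collections import defaultdict

# Relative paths of the image files in the example directory structure.
FILES = [
    "Docs/Intro1.tiff", "Docs/Intro2.tiff",
    "Docs/Doc1/Page1.tiff", "Docs/Doc1/Page2.tiff",
    "Docs/Doc2/Page1.tiff", "Docs/Doc2/Page2.tiff", "Docs/Doc2/Page3.tiff",
]


def plan_outputs(files, root, mode="ignore", ext="cpc"):
    """Map each planned output file to the list of input files it holds."""
    if mode not in ("ignore", "descend", "package"):
        raise ValueError(f"unknown directory mode: {mode!r}")
    plan = {}
    if mode == "package":
        # One output per directory, named after the directory itself.
        by_dir = defaultdict(list)
        for path in files:
            by_dir[posixpath.dirname(path)].append(path)
        for directory, members in by_dir.items():
            plan[directory + "." + ext] = sorted(members)
        return plan
    for path in files:
        if mode == "ignore" and posixpath.dirname(path) != root:
            continue  # files in sub-directories are skipped
        plan[posixpath.splitext(path)[0] + "." + ext] = [path]
    return plan


for mode in ("ignore", "descend", "package"):
    print(mode, sorted(plan_outputs(FILES, "Docs", mode)))
```

In this model, ignore plans two output files, descend plans seven, and package plans three (Docs.cpc, Docs/Doc1.cpc, and Docs/Doc2.cpc), matching the listings above.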
By default, CPC Tool will attempt to convert every image file contained
in the specified directories and sub-directories. By specifying an input
filter, -InputFilter type, you can restrict, by file format,
the image files considered by CPC Tool as candidates for conversion. Type
must be one of the following: cpc, tiff, cals, c4,
pbm, or any. The input filter causes CPC Tool to ignore all
image files that do not match the specified type. (any is used to
specify no input filter and is the default.) The input filter is only used
when searching directories; it does not apply to files specified on the
command line.
The batch modification arguments can, alternatively, be specified in a
position-sensitive manner, by replacing the leading '-' with a '+'. Unlike
normal command line arguments, position-sensitive arguments only apply
to those input files specified after the position-sensitive argument on
the command line. Furthermore, there can be multiple position-sensitive
instances of a particular argument. A particular input file is always processed
according to the most recently preceding position-sensitive argument(s)
on the command line.
For example, the command CPCTool -batch Doc1 +clobber skip Doc2 +clobber
overwrite Doc3 would process Doc1 with the default clobber-mode
(query); Doc2 with the clobber-mode set to skip; and
Doc3 with the clobber-mode set to overwrite.
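The left-to-right behavior of position-sensitive arguments can be illustrated with a short Python sketch. The function assign_clobber_modes is hypothetical; for brevity it recognizes only the literal spelling +clobber and does not model abbreviations or the other batch switches.

```python
def assign_clobber_modes(args, default="query"):
    """Pair each input file with the clobber mode in effect at its position."""
    mode = default
    result = []
    i = 0
    while i < len(args):
        if args[i].lower() == "+clobber":
            mode = args[i + 1]  # applies to the input files that follow
            i += 2
        else:
            result.append((args[i], mode))
            i += 1
    return result


print(assign_clobber_modes(
    ["Doc1", "+clobber", "skip", "Doc2", "+clobber", "overwrite", "Doc3"]))
# [('Doc1', 'query'), ('Doc2', 'skip'), ('Doc3', 'overwrite')]
```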
When you install CPC Tool, you will be able to use it
in evaluation mode. Evaluation mode allows you to convert a total
of 1000 pages of non-CPC input. (There is no limit to the
number of CPC input pages which can be converted.)
Once you have used CPC Tool
to convert 1000 pages of non-CPC input, you will need a
conversion battery to perform additional conversions.
Conversion batteries provide conversion capabilities for a fixed number of
non-CPC input pages on a specific computer.
To order a battery,
you must supply the CPC Tool machine ID of the computer on which you want
to use CPC Tool. The CPC Tool machine ID is displayed when you
run CPC Tool with no command-line arguments.
To install a virtual battery,
run CPC Tool with the battery's file name as the only command-line argument.
For example, if the battery is in a file named Cpc.Battery, you would
run the command:
CPCTool Cpc.Battery
Once you have installed the battery, you can dispose of the battery file.
You can also discharge your existing CPC Tool
batteries and return the unused portion for a replacement or rebate.
The command:
CPCTool -f DischargeBattery -o Cpc.DischargeBattery input-file
will discharge the remaining pages in your CPC Tool battery, leaving the discharge
record in the file Cpc.DischargeBattery. Simply email the discharge
record to Cartesian Products for a rebate or replacement battery.
A valid image file must be specified as the input-file. The file
is not used in the discharge process.
Warning: The discharge operation cannot be undone. Once you
have discharged your battery, you will not be able to perform non-CPC
input file conversions without installing a new battery.