Rapid Deployment Data Streams
VERSION 4.1
[The following sections are generated directly from the source file: RapidCpc.h
Revision: 1.6 Date: 1996/12/26 04:01:26
©1996 Cartesian Products, Inc. All rights reserved.]
The CPC Rapid Deployment API defines a framework for building CPC compression
and decompression applications in which the raw CPC data is stored in files.
While files serve as a suitable storage framework for many environments, there
are certain situations in which the application requires greater control of the storage
and retrieval process. For example, in a network environment, there may not exist a
file-based abstraction of network I/O, instead requiring the application to use special network-defined
functions for the transfer of data. Similarly, a database environment is likely to require
that the application use special database-defined functions for storage and retrieval.
To provide this level of application control, CPC defines a polymorphic abstraction
of a file, known as a Data Stream, and performs all data storage and retrieval
by manipulating application-supplied instances of this abstraction. This document describes
the Rapid Deployment Data Streams API, which allows a developer to implement
application-specific Data Streams, and thereby take complete control of the storage and retrieval of CPC data.
This document assumes that the reader is already familiar with the Rapid Deployment API.
Cartesian recommends that developers use that API
for situations in which a file-based API is appropriate. It is simpler, and the CPC-specific portions of
the application will be smaller.
A Data Stream can be viewed abstractly as a contiguous array of bytes that can
be read or written at arbitrary offsets. The methods used to manipulate a Data Stream are
polymorphic in that each Data Stream supplies its own implementation. The Stream user
(e.g., the CPC library) is totally unaware of the actual type of storage that sits behind
each Data Stream. It could be a local disk
file, a memory buffer, a transactional database, an RPC circuit, or some other entity. The
Stream user merely
tells the Stream that it wants to read or write data; the actual procedures used to perform the
operation are totally up to the specific Stream's implementation.
The CPC library provides a set of functions for the generic manipulation of Data Streams. The following
sections describe these manipulation interfaces.
This document describes only those functions that may be needed for
interacting with the Rapid Deployment API.
Opaque Type: DataStr
|
The opaque data type used to represent a Data Stream to a Stream user.
|
typedef struct DataStr DataStr; |
|
A Data Stream maintains a latched error state, which can be used by the
application to provide descriptive errors to the end-user. The error state consists
of a pointer to a string that describes the error, and a categorization of the error
as a hard error or soft error. A soft error can be cleared and will be
overridden by a hard error. A hard error cannot be cleared and is not overridden
by subsequent hard errors.
After an error is latched, it is up to the Stream's implementation as to whether or
not subsequent I/Os fail.
Function Definition: strGetError
|
Returns 0 if the Stream, str, is not in error. Otherwise, returns a
zero-terminated ASCII string describing the nature of the error. The returned string
will remain valid even after the Stream is closed.
|
The returned string is owned by the CPC library and should not be
modified or deallocated by the application.
|
char const * strGetError(DataStr *str); |
|
Function Definition: strSetError
|
Put the Data Stream, str, into the error state described by err.
If isHard is non-zero, the error is a permanent hard error. Otherwise,
the error is a transient soft error. This specific error will be latched if the Stream
is not currently in error, or if the Stream is in a soft error state and isHard
is non-zero.
|
err must be a valid pointer even after str is
closed. Typically, this restriction is met by using string constants.
|
void strSetError(DataStr *str, char const* err, unsigned isHard); |
|
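The latching rule described above can be modeled in a few lines. The following is a minimal sketch, not the library's implementation: the `ErrLatch` type and the `latch_error` and `clear_soft_error` helpers are hypothetical names introduced only for illustration, and the clearing helper is an assumption, since this section does not show how a soft error is cleared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the error fields of a Data Stream. */
typedef struct {
    char const *error;        /* NULL when the Stream is not in error */
    unsigned    errorIsHard;  /* non-zero if the latched error is hard */
} ErrLatch;

/* Latch err following the rule given for strSetError: accept it if no
 * error is latched, or if a soft error is latched and the new one is hard. */
static void latch_error(ErrLatch *l, char const *err, unsigned isHard)
{
    assert(l != NULL && err != NULL);
    if (l->error == NULL || (!l->errorIsHard && isHard)) {
        l->error = err;
        l->errorIsHard = (isHard != 0);
    }
}

/* Assumed helper: clear a soft error. A hard error stays latched.
 * Returns non-zero if the Stream is error-free afterwards. */
static unsigned clear_soft_error(ErrLatch *l)
{
    assert(l != NULL);
    if (l->error != NULL && l->errorIsHard) {
        return 0;
    }
    l->error = NULL;
    l->errorIsHard = 0;
    return 1;
}
```

Under this model, a second soft error does not replace the first, a hard error replaces a latched soft error, and nothing replaces a latched hard error.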
When the application is done with a Stream, the Stream must be closed.
This is typically done automatically when the application invokes
cpcEnc_destroy or cpcDec_destroy.
Function Definition: strClose
|
Close the Data Stream, str, deallocating or detaching all of its resources.
If propagateClose is non-zero, any subsidiary resources owned
by the Stream are also closed and deallocated. Otherwise, the Stream merely
detaches itself from any subsidiary resources. (The definition of
subsidiary resources is determined by the implementation of the Stream.)
|
Returns 0 if the Stream was closed without error (and
the Stream was not in error at the time of the close).
Otherwise, returns a zero-terminated ASCII string describing the nature of the error.
|
On return from this routine, str is no longer valid and should
not be used on any subsequent operations.
|
The returned string is owned by the CPC library. It should not be
modified or deallocated by the application.
|
char const *strClose(DataStr *str, unsigned propagateClose); |
|
A Data Stream Agent is the entity that implements the behavior of a
Data Stream. Typically, there is a many-to-one relationship between
Data Streams and Data Stream Agents. For example, there can be several open
FooBar Data Streams, but there is only one FooBar Agent.
Borrowing from object-oriented terminology, we refer to a set of Data Streams
that share a common Agent as a Class of Data Stream.
A Data Stream Agent provides six methods, two of which are optional.
This section describes the methods.
As described below, some of the methods are allowed to return errors.
In addition to returning an error, the Agent can also call strSetError
to provide an Agent-specific message describing the error. (If the Agent does
not call strSetError, a generic error description is used.)
The Agent must maintain an independent seek pointer for each Stream.
The seek pointer determines the offset within the Stream at which to perform the next read or write.
DataStrAgent Method: GetPos
|
Prototype: int (*GetPos)(DataStr *str)
|
Example: GetPos_File |
|
Returns the current position of the seek pointer for str,
or a negative value if str is in error and the current position
of the seek pointer cannot be ascertained.
|
DataStrAgent Method: SetPos
|
Prototype: unsigned (*SetPos)(DataStr *str, unsigned long pos)
|
Example: SetPos_File |
|
Set the current position of the seek pointer for str to pos.
|
Returns non-zero if the operation was successful, zero otherwise.
|
The Agent is not required to return
an error on a failed seek; it can wait until the I/O is attempted
and return an error there.
|
The following two methods are used to perform I/O on the Stream.
After each I/O, the seek pointer should automatically advance by the
number of bytes transferred.
DataStrAgent Method: Read
|
Prototype: int (*Read)(DataStr *str, void *buf, Ulong cnt)
|
Example: Read_File |
|
Read cnt bytes of data from the Stream, str, into buf.
The data should be read from the current position of the Stream's seek pointer.
|
Returns the number of bytes of data transferred to buf, or a negative
value if an error occurred. On return, the Agent should advance the seek pointer
by the number of bytes transferred.
|
A CpcEncoder will never attempt to read from its output Data Stream. Hence, an
encode-only application need not implement this method.
|
If all of the requested data is not yet available, the Agent should block the
caller until it is available. The Agent should only return a value less than cnt
if an error occurs or the end of the Stream has been reached.
|
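Some backing stores, such as a network connection, may hand back fewer bytes than requested even though more data is on the way. One way to meet the blocking requirement above is to loop until the full count has arrived. The sketch below assumes a hypothetical low-level callback type, `PartialReadFn`, and a toy source, `trickle_read`, that delivers at most three bytes per call. Neither is part of the CPC API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical low-level read: returns the bytes read (possibly fewer
 * than cnt), 0 at end of data, or a negative value on error. */
typedef long (*PartialReadFn)(void *ctx, void *buf, unsigned long cnt);

/* Keep calling rd until cnt bytes are gathered, the end of the data is
 * reached, or an error occurs. Returns bytes transferred, or -1 on error. */
static int read_fully(PartialReadFn rd, void *ctx, void *buf, unsigned long cnt)
{
    unsigned char *dst = buf;
    unsigned long done = 0;
    assert(rd != NULL && buf != NULL);
    while (done < cnt) {
        long n = rd(ctx, dst + done, cnt - done);
        if (n < 0) { return -1; }   /* error */
        if (n == 0) { break; }      /* end of Stream: short count */
        done += (unsigned long)n;
    }
    return (int)done;
}

/* Toy source for demonstration: yields at most 3 bytes per call. */
typedef struct {
    unsigned char const *data;
    unsigned long len, pos;
} ChunkSource;

static long trickle_read(void *ctx, void *buf, unsigned long cnt)
{
    ChunkSource *s = ctx;
    unsigned long left = s->len - s->pos;
    unsigned long n = (cnt < 3) ? cnt : 3;
    if (n > left) { n = left; }
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (long)n;
}
```

An Agent's Read method could call a helper like `read_fully` and return its result, so that a short count reaches the caller only at the end of the Stream or on error.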
DataStrAgent Method: Write
|
Prototype: int (*Write)(DataStr *str, void *buf, Ulong cnt)
|
Example: Write_File |
|
Write cnt bytes of data from buf into the Stream,
str. The data should be written at the current position of the Stream's
seek pointer.
|
Returns the number of bytes of data
transferred from buf, or a negative value if an error occurred.
On return, the Agent should advance
the seek pointer by the number of bytes transferred.
|
A CpcDecoder will never attempt to write to its input Data Stream.
Hence, a decode-only application need not implement this method.
|
The Agent should only return a value less than cnt
if an error occurs.
|
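To make the seek-pointer rules for Read and Write concrete, here is a minimal sketch of the two transfers against a fixed-size memory buffer. The `MemState` structure and the `mem_read` and `mem_write` functions are hypothetical. They take the Agent's own state directly rather than a DataStr pointer so that the example compiles without the SDK headers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical per-Stream state for a fixed-size memory buffer. */
typedef struct {
    unsigned char *base;  /* start of the buffer */
    unsigned long  cap;   /* capacity of the buffer in bytes */
    unsigned long  len;   /* bytes currently stored (len <= cap) */
    unsigned long  pos;   /* seek pointer */
} MemState;

/* Read up to cnt bytes at the seek pointer. The count is short only
 * when the end of the Stream is reached. */
static int mem_read(MemState *m, void *buf, unsigned long cnt)
{
    unsigned long avail;
    assert(m != NULL && buf != NULL);
    if (m->pos > m->len) { return -1; }   /* seek pointer past the end */
    avail = m->len - m->pos;
    if (cnt > avail) { cnt = avail; }
    memcpy(buf, m->base + m->pos, cnt);
    m->pos += cnt;                        /* advance by bytes transferred */
    return (int)cnt;
}

/* Write cnt bytes at the seek pointer. A write that does not fit is
 * treated as an error and transfers nothing. */
static int mem_write(MemState *m, void const *buf, unsigned long cnt)
{
    assert(m != NULL && buf != NULL);
    if (m->pos > m->len || cnt > m->cap - m->pos) { return -1; }
    memcpy(m->base + m->pos, buf, cnt);
    m->pos += cnt;                        /* advance by bytes transferred */
    if (m->pos > m->len) { m->len = m->pos; }
    return (int)cnt;
}
```

In a real Agent these bodies would sit behind the Read and Write method signatures shown above, with the DataStr pointer mapped to the Agent's per-Stream structure.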
The following method is invoked when a Stream is closed, allowing the Agent
to deallocate the resources which the Stream is consuming. This
includes deallocation of the DataStr structure itself (since the
structure is allocated by the Agent).
The method is passed a parameter indicating whether or not subsidiary resources
of the Stream should also be closed. The definition of subsidiary resource is up to the Agent.
The general idea is that a subsidiary resource is one that could potentially continue to be used
after the Stream is closed. For example, an Agent that interacts with an open file might consider
the open file to be a subsidiary resource. This would allow the Stream user to close the Stream
without closing the open file.
There is no requirement that the Agent consider any of the Stream's
resources to be subsidiary.
DataStrAgent Method: Close
|
Prototype: char const *(*Close)(DataStr *str, Boolean propagateClose)
|
Example: Close_File |
|
Close the Data Stream, str, deallocating
its resources. If propagateClose is non-zero, any subsidiary resources owned
by the Stream are also closed and deallocated. Otherwise, the Stream is merely
detached from its subsidiary resources.
|
Returns 0 if the Stream was closed without error.
Otherwise, returns a pointer to a zero-terminated ASCII string describing
the nature of the error.
|
On return from this routine, str is assumed to be deallocated and will
not be passed to any subsequent method invocations.
|
Since on return from this routine the Stream is closed, the returned
string must be valid after the Stream is deallocated. Typically, this
restriction is met by using string constants.
|
The remaining methods are optional.
The Agent can (optionally) provide a method that is invoked on any change
to the error state of a Stream. The Agent can use strGetError to
retrieve the specific error.
The Agent can (optionally) provide a method to query the total number of bytes
contained in the Stream.
This method is not used in the Rapid Deployment API. There is
no reason to implement it.
DataStrAgent Method: GetLen
|
Prototype: int (*GetLen)(DataStr *str)
|
|
Returns the total number of bytes of data contained in
str, or a negative value if the length of the
Stream is not known.
|
An Agent is defined by the DataStrAgent structure, which contains pointers
to the functions which implement the Agent's methods.
Type Definition: DataStrAgent
|
Defines the Agent implementations of the methods for a Data Stream Class.
The semantics of the methods were discussed in the preceding sections.
If an Agent does not implement an optional method, it should set the
corresponding pointer to zero.
|
|
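The structure's definition does not appear in this section. As an illustration of the general idea only, the sketch below shows a table of method pointers shared by several Streams, with an optional method left as zero. The names `SketchAgent`, `SketchStr`, and the helper functions are hypothetical, and the field layout is an assumption rather than the layout of the real DataStrAgent.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchStr SketchStr;

/* Hypothetical method table; NOT the real DataStrAgent layout. */
typedef struct {
    int (*GetPos)(SketchStr *str);  /* required */
    int (*GetLen)(SketchStr *str);  /* optional: zero if not implemented */
} SketchAgent;

struct SketchStr {
    SketchAgent const *agent;  /* shared by every Stream of the Class */
    int                pos;    /* per-Stream seek pointer */
};

static int sketch_get_pos_impl(SketchStr *str) { return str->pos; }

/* One Agent serves many Streams. GetLen is left unimplemented. */
static SketchAgent const sketch_agent = { sketch_get_pos_impl, 0 };

/* Generic caller: dispatches through the table and never sees the
 * storage that sits behind the Stream. */
static int sketch_get_pos(SketchStr *str)
{
    assert(str != NULL && str->agent != NULL);
    return str->agent->GetPos(str);
}

/* Optional methods are checked for a zero pointer before use. */
static int sketch_get_len(SketchStr *str)
{
    assert(str != NULL && str->agent != NULL);
    if (str->agent->GetLen == 0) { return -1; }  /* length not known */
    return str->agent->GetLen(str);
}
```

Two Streams created with `&sketch_agent` share one table but keep separate seek pointers, which mirrors the many-to-one relationship between Data Streams and Agents.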
Typically, the Agent will provide some sort of factory function for creating Data
Streams of its Class (for example, see fileStr_open). The Agent is responsible
for allocating the necessary instance memory for the Stream, including a DataStr
to represent the Stream generically. The following function must be invoked by the Agent
to initialize a newly allocated Stream.
Each of the Agent methods is passed a DataStr pointer
identifying the Stream instance on which to perform
the method. (This is the same pointer that was passed by the Agent to strInit.)
If the Agent maintains any per-Stream data structures,
it will need some mechanism for mapping the DataStr pointer
to its own per-Stream structure. One simple solution to this problem is to embed
the DataStr as the first field of a larger structure which contains the
Agent-specific Stream information. This allows the Agent to map the
DataStr pointer to its own structure by direct pointer coercion (since
both structures always start at the same address).
There is no requirement that an Agent be implemented in this manner.
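The embedding technique can be sketched as follows. The `DataStr` defined here is a stand-in with a placeholder field so that the example compiles on its own, and `PosStr`, `pos_from_generic`, and `pos_get` are hypothetical names. In C, a pointer to a structure, suitably converted, points to its first member and vice versa, which is what makes the coercion valid.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the library's DataStr; the real fields are not relied on. */
typedef struct DataStr { int placeholder; } DataStr;

/* Hypothetical Agent-specific Stream: the generic part comes first. */
typedef struct {
    DataStr       gen;  /* must remain the first field */
    unsigned long pos;  /* per-Stream seek pointer */
} PosStr;

/* Map the generic pointer back to the Agent's own structure. */
static PosStr *pos_from_generic(DataStr *str)
{
    assert(str != NULL);
    return (PosStr *)str;
}

/* A GetPos-style method written against the generic pointer. */
static int pos_get(DataStr *str)
{
    return (int)pos_from_generic(str)->pos;
}
```

The Agent hands out `&s.gen` to the Stream user and recovers its own structure from that pointer in every method.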
Type Definition: DataStr
|
The generic structure of a Data Stream. The Agent should never have to directly manipulate the fields of this
structure. The structure is exposed to allow Agents to embed it in their
own internal structures.
|
The semantics and fields of this structure are not defined
by this API. They are subject to change without notice.
|
struct DataStr {
DataStrAgent const *agent;
unsigned char normalizeValues,
exceptionsEnabled, errorIsHard;
char const *error;
}; |
|
The procedures for using Data Streams to compress and decompress
CPC images are virtually identical to the procedures described in the
Rapid Deployment API. The only difference is the function used to create the
CpcEncoder or CpcDecoder.
Applications that compress CPC image data use the CpcEncoder object
of the Rapid Deployment API.
The only differences from the procedures described there are:
-
The application should use cpcEnc_createFromStream
rather than cpcEnc_createFromFile.
-
The closeOutput parameter of cpcEnc_destroy determines whether
or not the underlying output Data Stream is also closed (via strClose). If
closeOutput is non-zero, the underlying Data Stream is also closed (passing
a non-zero value for the propagateClose parameter of strClose).
Function Definition: cpcEnc_createFromStream
|
Create a CPC encoder that sends its compressed CPC data to the
Data Stream, sink, starting at the Stream's current seek position.
If progressive is non-zero, the document is encoded using the
CPC-Progressive format. Otherwise, it is encoded using the
CPC-Normal format.
|
Returns a pointer to the encoder, or 0 if the encoder could not be
created. The only reason for failure is that memory could not be allocated.
|
The CPC encoder never tries to read from sink.
|
|
Applications that decompress CPC image data use the CpcDecoder
object of the
Rapid Deployment API.
The only differences from the procedures described there are:
-
The application should use cpcDec_createFromStream rather than
cpcDec_createFromFile.
-
The closeInput parameter of cpcDec_destroy determines whether
or not the underlying input Data Stream is also closed (via strClose). If
closeInput is non-zero, the underlying Data Stream is also closed (passing
a non-zero value for the propagateClose parameter of strClose).
Function Definition: cpcDec_createFromStream
|
Create a CPC decoder that reads its compressed CPC data from the
Data Stream, data, starting at the Stream's current seek position.
If sequential is non-zero, the decoder is configured for sequential
access to the pages. Otherwise, the decoder is configured for random access.
(On large documents, random access uses an additional 750k of memory.)
|
Returns a pointer to the decoder, or 0 if the decoder could not be
created. The only reason for failure is that memory could not be
allocated.
|
The CPC decoder never tries to write to data.
|
|
In applications that deal with multiple image formats, it is often desirable to
determine the particular format of an incoming Data Stream by examining the
contents of the Stream. The following function can be used to determine if the
data contained in a Stream appears to be CPC-formatted image data.
[The following sections are generated directly from the source file: CpcStr.c
Revision: 1.4 Date: 1996/12/26 21:57:31
©1996 Cartesian Products, Inc. All rights reserved.]
In this section, we develop a full sample application, which
compresses and decompresses CPC data using the Rapid Deployment
Data Streams API.
The code in this example is taken directly from the CpcStr sample
application of the Rapid Deployment SDK.
In order to use the Data Stream API, we need to implement a Stream Agent.
In this section, we implement an Agent that uses files for the underlying
Stream storage and retrieval, referred to as the
File Agent.
The File Agent is provided for expository purposes only. It provides no
additional functionality over the file-based Rapid Deployment API. (In fact, the File Agent is
a simplification of the Agent used to implement the file-based API.)
We use the ANSI stdio API for manipulating the files. Hence,
in order to process the Stream methods, we will need to know the
FILE pointer corresponding to each open Stream. As recommended in
§ Instances, the File Agent augments the generic Data Stream
with additional information by embedding a DataStr as the
first field of its Agent-specific data structure.
Type Definition: FileStr
|
Describes an open File Stream. gen contains the generic description
of the stream. fp is a pointer to the open
file that stands behind the stream.
|
typedef struct { DataStr gen; FILE *fp; } FileStr; |
|
Since the first field of a FileStr is the DataStr, both structures
will start at the same address. Hence, the mapping of a DataStr pointer
to a FileStr pointer can be performed by simple pointer coercion.
When an error occurs in a stdio operation, we
set the error state of the Stream to be the stdio
description of the error.
Function Definition: SetError
|
Set the error state for str to be the error contained
in the underlying stdio file. dfltMsg is used
as the error if the underlying file does not contain a known error code
(or does not appear to be in error).
|
static void SetError(DataStr *str, char const *dfltMsg)
{
    FILE *fp = SubClass(str)->fp;
    if(ferror(fp) && errno < sys_nerr) {
        dfltMsg = sys_errlist[errno];
    }
    strSetError(str, dfltMsg, 0);
} |
|
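sys_nerr and sys_errlist are not part of ANSI C and are deprecated or absent in some C libraries. Where they are unavailable, one possible substitute is a small table of string constants, which also satisfies the requirement that the error string remain valid after the Stream is closed. The sketch below is an assumption rather than part of the sample: `stable_errmsg` is a hypothetical helper, and the errno names it uses are POSIX, not ANSI C.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Map an errno value to a string constant. String constants have static
 * storage, so the result stays valid after the Stream is closed.
 * Codes that are not listed fall back to dfltMsg. */
static char const *stable_errmsg(int code, char const *dfltMsg)
{
    assert(dfltMsg != NULL);
    switch (code) {
    case ENOENT: return "No such file or directory";
    case EACCES: return "Permission denied";
    case ENOSPC: return "No space left on device";
    case EIO:    return "Input/output error";
    default:     return dfltMsg;
    }
}
```

A variant of SetError could then pass `stable_errmsg(errno, dfltMsg)` to strSetError when ferror reports an error on the file.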
This section provides the implementation of the Agent methods.
A seek pointer (with the appropriate semantics) is maintained by the underlying
stdio file. Hence, we use the file's seek pointer to implement the
Stream's seek pointer.
Function Definition: SetPos_File
|
Set the current position of the seek pointer for str to pos.
|
Returns non-zero if the seek was successful, zero otherwise.
|
static unsigned
SetPos_File(DataStr *str, unsigned long pos)
{
    FILE *fp = SubClass(str)->fp;
    unsigned worked = !fseek(fp, pos, SEEK_SET);
    if(!worked) { SetError(str, "Seek error"); }
    return worked;
} |
|
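The GetPos method names GetPos_File as its example, but that function's listing does not appear in this section. A minimal sketch of how such a method might look, assuming it relies on ftell, is given below. The `DataStr` stand-in and the `SubClass` macro definition are assumptions made so that the example compiles on its own. The sample's actual code may differ, for instance by latching an error on the Stream when ftell fails.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Stand-ins so the sketch compiles on its own; in the sample these come
 * from the SDK header and the File Agent. */
typedef struct DataStr { int placeholder; } DataStr;
typedef struct { DataStr gen; FILE *fp; } FileStr;
#define SubClass(str) ((FileStr *)(str))  /* assumed definition */

/* Return the seek pointer of the underlying file, or -1 if it cannot be
 * determined or does not fit in an int. */
static int GetPos_File(DataStr *str)
{
    long pos;
    assert(str != NULL);
    pos = ftell(SubClass(str)->fp);
    if (pos < 0 || pos > INT_MAX) {
        return -1;
    }
    return (int)pos;
}
```

Because the stdio file already maintains the seek pointer, the method only has to report it.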
The following two functions implement the File Agent's data transfer methods.
Function Definition: Read_File
|
Read cnt bytes of data from str into buf. The data is
read from the current position of the Stream's seek pointer.
|
Returns the number of bytes of data transferred to buf, or a negative
value if an error occurred. On return, the seek pointer is advanced
by the number of bytes transferred.
|
static int
Read_File(DataStr *str, void *buf, unsigned long cnt)
{
    /* fread returns a short count on an error or end-of-file.
       We use ferror to disambiguate the two cases. */
    FILE *fp = SubClass(str)->fp;
    size_t numRead = fread(buf, 1, cnt, fp);
    if(numRead==cnt || !ferror(fp)) {
        return numRead;
    }
    /* Set the error state of the stream. */
    SetError(str, "fread error");
    return -1;
} |
|
Function Definition: Write_File
|
Write cnt bytes of data from buf into the stream,
str. The data is written at the current position of the Stream's seek pointer.
|
Returns the number of bytes of data transferred from buf, or
a negative value if an error occurred. On return, the seek pointer
is advanced by the number of bytes transferred.
|
static int
Write_File(DataStr *str, void *buf, unsigned long cnt)
{
    /* We always consider a short write to be an error. */
    FILE *fp = SubClass(str)->fp;
    size_t numWritten = fwrite(buf, 1, cnt, fp);
    if(numWritten == cnt) {
        return numWritten;
    }
    /* Set the error state of the stream. */
    SetError(str, "fwrite error");
    return -1;
} |
|
The File Agent defines the file to be a subsidiary resource,
and hence, will optionally leave the file open after the Stream is closed.
Function Definition: Close_File
|
Close the Data Stream, str, deallocating
its resources. If closeFp is non-zero, the underlying file owned
by the Stream is also closed. Otherwise, the Stream merely
detaches itself from the underlying file.
|
Returns 0 if the Stream was closed without error.
Otherwise, returns a pointer to a zero-terminated ASCII string
describing the nature of the error.
|
static char const *
Close_File(DataStr *str, unsigned closeFp)
{
    /* Flush the file and latch any final error. */
    char const *err;
    FILE *fp = SubClass(str)->fp;
    if(fflush(fp)) { SetError(str, "fflush error"); }
    /* Close the file if requested. */
    if(closeFp) { fclose(fp); }
    /* Deallocate the str (which we allocated in fileStr_open). */
    err = strGetError(str);
    free(str);
    return err;
} |
|
The Agent is defined by initializing a DataStrAgent structure with
pointers to our internal methods.
The File Agent provides the following function to create a File
Stream and attach it to an open stdio file.
Function Definition: fileStr_open
|
Create a new instance of a File Stream and attach it to the
open file, fp.
|
Returns a pointer to the newly created Stream, or zero if an error occurred.
The Stream's seek pointer is positioned at the same offset as the seek
pointer for fp.
|
DataStr *fileStr_open(FILE *fp)
{
    FileStr *str;
    /* Fail if the file is invalid. */
    if(!fp) { return 0; }
    /* Allocate a new FileStr. */
    str = malloc(sizeof(*str));
    if(!str) { return 0; }
    /* Initialize the generic portion. */
    strInit(&str->gen, &Agent_File);
    /* Attach it to the file and return the generic pointer. */
    str->fp = fp;
    return &str->gen;
} |
|
Now that we have a Stream Agent, we are ready to implement the application.
The sample program will decode a CPC input file and re-encode each input page
to produce a CPC output file. Since CPC is a lossy compression algorithm, the
resultant output will (potentially) differ from the input.
(Regenerative CPC codings typically stabilize after 2-4 generations.
Once the coding has stabilized, subsequent generations are bitwise identical.)
First, we implement a function to create a CpcDecoder that uses a
File Stream as its input.
Function Definition: OpenDecoder
|
Returns a pointer to a CpcDecoder that reads its raw CPC
data from the file named name, or zero if the CpcDecoder cannot
be created.
|
static CpcDecoder *OpenDecoder(char const *name)
{
    /* Open the source file and attach it to a File Stream. If
       we are unable to open the Stream, give up. */
    CpcDecoder *cpc;
    DataStr *str = fileStr_open(fopen(name, "rb"));
    if(!str) {
        fprintf(stderr, "Unable to open <%s>\n", name);
        return 0;
    }
    /* Check for a CPC signature. (This is not really necessary, since the
       CpcDecoder would detect the error.) */
    if(!cpcDec_checkSignature(str)) {
        fprintf(stderr, "<%s> does not contain CPC data\n",
                name);
        strClose(str, 1/*closeInput*/);
        return 0;
    }
    /* Create the CPC decoder, using the File Stream as
       the data source. */
    cpc = cpcDec_createFromStream(str, 1 /*sequential*/);
    if(!cpc) {
        fprintf(stderr, "Unable to create CPC decoder\n");
        strClose(str, 1/*closeInput*/);
        return 0;
    }
    return cpc;
} |
|
Next, we implement a function to create a CpcEncoder that uses a
File Stream as its output.
Function Definition: OpenEncoder
|
Returns a pointer to a CpcEncoder that writes its raw CPC data
to the file named name, or zero if the CpcEncoder cannot be created.
|
static CpcEncoder *OpenEncoder(char const *name)
{
    /* Create the output file and attach it to a File Stream. If
       we are unable to open the Stream, give up. */
    CpcEncoder *cpc;
    DataStr *str = fileStr_open(fopen(name, "wb"));
    if(!str) {
        fprintf(stderr, "Unable to create <%s>\n", name);
        return 0;
    }
    /* Create the CPC encoder, using the File Stream as the data sink. */
    cpc = cpcEnc_createFromStream(str, 1/*CPC-Progressive*/);
    if(!cpc) {
        fprintf(stderr, "Unable to create CPC encoder\n");
        strClose(str, 1/*closeOutput*/);
        return 0;
    }
    return cpc;
} |
|
Finally, we implement the entry-point for the application, main.
The application accepts two command-line parameters. The first
specifies the name of the CPC input file. The second specifies the
name of the CPC output file. The application decompresses each page of
the input file and re-encodes it into the output file.
Function Definition: main
|
The entry point for the application.
|
int main(int argc, char **argv)
{
    CpcEncoder *encoder; CpcDecoder *decoder;
    char const *err; unsigned long i;
    /* There must be exactly two arguments. */
    if(argc != 3) {
        fprintf(stderr,
                "Usage: %s <inCpcFile> <outCpcFile>\n",
                argv[0]);
        return -1;
    }
    /* Open the encoder and decoder. If either fails, give up. */
    decoder = OpenDecoder(argv[1]);
    encoder = OpenEncoder(argv[2]);
    if(!decoder || !encoder) { return -1; }
    /* Iterate over the pages in the input document. */
    for(i=0; i<cpcDec_getPageCount(decoder); i++) {
        /* Retrieve the page from the decoder. */
        ImBitMap *ibm = cpcDec_getPage(decoder, i);
        if(!ibm) {
            /* i is an unsigned long, so it is printed with %lu. */
            fprintf(stderr, "Get Page %lu failed (%s)\n",
                    i+1, cpcDec_getError(decoder));
            return -1;
        }
        /* Add the page to the encoder. */
        cpcEnc_addPage(encoder, ibm); ibm_destroy(ibm);
    }
    /* Destroy the decoder, checking for errors. */
    err = cpcDec_destroy(decoder, 1/*close stream*/);
    if(err) {
        fprintf(stderr, "Cpc decoder error: <%s>\n", err);
        return -1;
    }
    /* Destroy the encoder, checking for errors. */
    err = cpcEnc_destroy(encoder, 1/*close stream*/);
    if(err) {
        fprintf(stderr, "Cpc encoder error: <%s>\n", err);
        return -1;
    }
    return 0;
} |
|
Cartesian Products, Inc.
cpi@cartesianinc.com