) as a backward-encoded Mobipocket variable-width integer.
+
+Only a few bits have been identified:
+
+bit     Data at end of records
+0x0001  Multi-byte character overlaps
+0x0002  Some data to help with indexing
+0x0004  Some data about uncrossable breaks
+
+
+Multibyte character overlap
+---------------------------
+
+When bit 1 of the Extra Data Flags field is set, each record is followed by a
+trailing entry containing any extra bytes necessary to complete a multibyte
+character which crosses the record boundary. The bytes do not participate in
+compression, regardless of which compression scheme is used for the file. However,
+unlike the trailing data bytes, the multibytes (including the count byte) do
+get included in any encryption. The overlapping bytes then re-appear as normal
+content at the beginning of the following record. The trailing entry ends with
+a byte containing a count of the overlapping bytes plus additional flags.
+
+offset  bytes  content                  comments
+0       0-3    N terminal bytes of a
+               multibyte character
+N       1      Size & flags             bits 1-2 encode N, use of bits 3-8 is unknown
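+
+As a rough illustration of the layout above, the following Python sketch splits
+a record that ends in a multibyte overlap entry. The function name is
+hypothetical. It assumes that "bits 1-2" means the two least significant bits of
+the final byte, and that any other trailing entries have already been removed.
+

```python
def split_multibyte_overlap(record: bytes) -> tuple[bytes, bytes]:
    """Return (content, overlap_bytes) for a record ending in an overlap entry."""
    if not record:
        raise ValueError("empty record")
    # Assumption: bits 1-2 are the two low-order bits of the final byte.
    n = record[-1] & 0x03
    if n + 1 > len(record):
        raise ValueError("overlap count exceeds record length")
    count_pos = len(record) - 1          # position of the size & flags byte
    content = record[: count_pos - n]    # everything before the overlap bytes
    overlap = record[count_pos - n : count_pos]
    return content, overlap
```

+
+For example, a record of b"abc\xa9\x01" would yield b"abc" as content and
+b"\xa9" as the single overlapping byte.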
+
+
+PalmDOC Compression
+-------------------
+
+PalmDOC uses LZ77 compression techniques. DOC files can contain only compressed
+text. The format does not allow for any text formatting. This keeps files small,
+in keeping with the Palm philosophy. However, extensions to the format can use
+tags, such as HTML or PML, to include formatting within text. These extensions
+to PalmDoc are not interchangeable and are the basis for most eBook Reader
+formats on Palm devices.
+
+LZ77 algorithms achieve compression by replacing portions of the data with
+references to matching data that has already passed through both encoder and
+decoder. A match is encoded by a pair of numbers called a length-distance pair,
+which is equivalent to the statement "each of the next length characters is
+equal to the character exactly distance characters behind it in the uncompressed
+stream." (The "distance" is sometimes called the "offset" instead.)
+
+In the PalmDoc format, a length-distance pair is always encoded by a two-byte
+sequence. Of the 16 bits that make up these two bytes, 11 bits go to encoding
+the distance, 3 go to encoding the length, and the remaining two are used to
+make sure the decoder can identify the first byte as the beginning of such a
+two-byte sequence. The exact algorithm needed to decode the compressed text can
+be found on the PalmDOC page.
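+
+The bit split described above can be sketched in Python as follows. This is not
+taken from this document: the marker pattern (top two bits "10"), the order of
+the distance and length bits, and the length bias of 3 are assumptions based on
+commonly published PalmDOC decoders, and both function names are hypothetical.
+

```python
def decode_pair(b1: int, b2: int) -> tuple[int, int]:
    """Decode a two-byte length-distance pair into (distance, length)."""
    # Assumption: the two marker bits are the top bits of the first byte, "10".
    if b1 & 0xC0 != 0x80:
        raise ValueError("first byte does not start a length-distance pair")
    value = ((b1 << 8) | b2) & 0x3FFF   # drop the two marker bits
    distance = value >> 3               # upper 11 bits
    length = (value & 0x07) + 3         # lower 3 bits, assumed bias of 3
    return distance, length


def apply_pair(output: bytearray, distance: int, length: int) -> None:
    """Append `length` bytes, each copied from `distance` bytes behind."""
    if distance == 0 or distance > len(output):
        raise ValueError("distance points outside the decoded data")
    for _ in range(length):
        # Copy byte by byte so that overlapping matches repeat correctly.
        output.append(output[-distance])
```

+
+Copying one byte at a time matters when the length exceeds the distance, since
+the match then reads bytes that the same copy has just written.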
+
+PalmDOC data is always divided into 4096 byte blocks and the blocks are acted
+upon independently.
+
+PalmDOC does have support for bookmarks. These pointers are named and refer to
+an offset location in a file. If the file is edited these locations may no
+longer refer to the correct locations. Some reading programs allow the user to
+enter or edit these bookmarks while others treat them as a TOC. Some reading
+programs may ignore them entirely. They are stored at the end of the file itself
+so the full file needs to be scanned when loaded to find them.
+
+
+MBP
+---
+
+This is the extension used on a side file (auxiliary) for MOBI formatted eBooks.
+It is used to store metadata used by the library software and also to store
+user entered data like bookmarks, annotations, last read position. This file is
+created automatically by the reader program when the eBook is first opened and
+has a .mbp extension. The Library management software in MobiPocket uses this
+file to get information displayed in the library window such as title and author
+so that it won't have to open the larger eBook file.
+
diff --git a/format_docs/pdb/palmdoc.txt b/format_docs/pdb/palmdoc.txt
new file mode 100644
index 0000000000..0df62ae2e2
--- /dev/null
+++ b/format_docs/pdb/palmdoc.txt
@@ -0,0 +1,25 @@
+PalmDoc Format
+--------------
+
+The format is that of a standard Palm Database Format file. The header of that
+format includes the name of the database (usually the book title and sometimes
+a portion of the author's name), which is up to 31 bytes of data. This string of
+characters is terminated with a 0 in the C style. The files are identified by a
+Creator ID of REAd and a Type of TEXt.
+
+
+Record 0
+--------
+
+The first record in the Palm Database Format gives more information about the
+PalmDOC file, and contains 16 bytes.
+
+bytes  content           comments
+
+2      Compression       1 = no compression, 2 = PalmDOC compression (see below)
+2      Unused            Always zero
+4      text length       Uncompressed length of the entire text of the book
+2      record count      Number of PDB records used for the text of the book.
+2      record size       Maximum size of each record containing text, always 4096
+4      Current Position  Current reading position, as an offset into the uncompressed text
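+
+A minimal Python sketch of reading these 16 bytes is shown below. The class and
+function names are hypothetical, and big-endian byte order is assumed because
+that is the order used for values in Palm Database files.
+

```python
import struct
from typing import NamedTuple


class PalmDocRecord0(NamedTuple):
    compression: int       # 1 = no compression, 2 = PalmDOC compression
    unused: int            # always zero
    text_length: int       # uncompressed length of the whole text
    record_count: int      # number of PDB records holding text
    record_size: int       # maximum text record size, always 4096
    current_position: int  # reading position in the uncompressed text


# 2 + 2 + 4 + 2 + 2 + 4 = 16 bytes, big-endian (assumed).
RECORD0 = struct.Struct(">HHIHHI")


def parse_record0(data: bytes) -> PalmDocRecord0:
    """Unpack the first 16 bytes of record 0."""
    return PalmDocRecord0(*RECORD0.unpack_from(data, 0))
```
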
+
diff --git a/format_docs/pdb/pdb_format.txt b/format_docs/pdb/pdb_format.txt
new file mode 100644
index 0000000000..e6837ac2ad
--- /dev/null
+++ b/format_docs/pdb/pdb_format.txt
@@ -0,0 +1,104 @@
+Format
+------
+
+A PDB file can be broken into multiple parts: the header, record 0 and data.
+Values stored within the various parts are in big-endian byte order. The data
+part is broken down into multiple sections. The section count and offsets
+are referenced in the PDB header. Sections can be no more than 65505 bytes in
+length.
+
+
+Layout
+------
+
+PDB files take the format: DB header followed by record 0, which
+contains format-specific information, followed by data.
+
+ DB Header
+0 Record 0
+.
+. Data (broken down into sections)
+.
+
+
+Palm Database Header Format
+
+bytes  content             comments
+
+32     name                database name. This name is 0 terminated in the
+                           field and will be used as the file name on a
+                           computer. For eBooks this usually contains the
+                           title and may have the author depending on the
+                           length available.
+
+2      attributes          bit field.
+                           0x0002 Read-Only
+                           0x0004 Dirty AppInfoArea
+                           0x0008 Backup this database (i.e. no conduit exists)
+                           0x0010 (16 decimal) Okay to install newer over
+                                  existing copy, if present on PalmPilot
+                           0x0020 (32 decimal) Force the PalmPilot to reset
+                                  after this database is installed
+                           0x0040 (64 decimal) Don't allow copy of file to be
+                                  beamed to other Pilot.
+
+2      version             file version
+
+4      creation date       No. of seconds since start of January 1, 1904.
+
+4      modification date   No. of seconds since start of January 1, 1904.
+
+4      last backup date    No. of seconds since start of January 1, 1904.
+
+4      modificationNumber
+
+4      appInfoID           offset to start of Application Info (if present)
+                           or null
+
+4      sortInfoID          offset to start of Sort Info (if present) or null
+
+4      type                See above table. (For Applications this data will
+                           be 'appl')
+
+4      creator             See above table. This program will be launched if
+                           the file is tapped
+
+4      uniqueIDseed        used internally to identify record
+
+4      nextRecordListID    Only used when in-memory on Palm OS. Always set to
+                           zero in stored files.
+
+2      number of Records   number of records in the file - N
+
+8N     record Info List
+
+       start of record
+       info entry          Repeat N times to end of record info entry
+
+4      record Data Offset  the offset from the start of the PDB of this record
+
+1      record Attributes   bit field. The least significant four bits are used
+                           to represent the category values. These are the
+                           categories used to split the databases for viewing
+                           on the screen. A few of the 16 categories are
+                           pre-defined but the user can add their own. There
+                           is an undefined category for use if the user or
+                           programmer hasn't set this.
+                           0x10 (16 decimal) Secret record bit.
+                           0x20 (32 decimal) Record in use (busy bit).
+                           0x40 (64 decimal) Dirty record bit.
+                           0x80 (128, unsigned decimal) Delete record on
+                                next HotSync.
+
+3      UniqueID            The unique ID for this record. Often just a
+                           sequential count from 0
+
+       end of record
+       info entry
+
+2?     Gap to data         traditionally 2 zero bytes to Info or raw data
+
+?      Records             The actual data in the file. AppInfoArea (if
+                           present), SortInfoArea (if present) and then
+                           records sequentially
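+
+The header layout above might be read with a Python sketch along these lines.
+All names are hypothetical and the returned dictionary is only one possible
+shape. The fixed part of the header adds up to 78 bytes and each record info
+entry to 8 bytes, with every value big-endian as stated earlier.
+

```python
import struct
from datetime import datetime, timedelta

# name, attributes, version, three dates, modificationNumber, appInfoID,
# sortInfoID, type, creator, uniqueIDseed, nextRecordListID, record count.
PDB_HEADER = struct.Struct(">32sHHIIIIII4s4sIIH")   # 78 bytes
RECORD_INFO = struct.Struct(">IB3s")                # offset, attributes, unique ID
PALM_EPOCH = datetime(1904, 1, 1)


def palm_time(seconds: int) -> datetime:
    """Convert seconds since the start of January 1, 1904 to a datetime."""
    return PALM_EPOCH + timedelta(seconds=seconds)


def parse_pdb_header(data: bytes) -> dict:
    """Parse the PDB header and the record info list that follows it."""
    (name, attributes, version, created, modified, backed_up,
     _mod_number, app_info, sort_info, type_, creator,
     _uid_seed, _next_list, num_records) = PDB_HEADER.unpack_from(data, 0)
    records = []
    pos = PDB_HEADER.size
    for _ in range(num_records):
        offset, attrs, uid = RECORD_INFO.unpack_from(data, pos)
        records.append({
            "offset": offset,            # from the start of the PDB
            "category": attrs & 0x0F,    # least significant four bits
            "flags": attrs & 0xF0,       # secret / busy / dirty / delete bits
            "unique_id": int.from_bytes(uid, "big"),
        })
        pos += RECORD_INFO.size
    return {
        "name": name.split(b"\0", 1)[0],  # name is 0 terminated in the field
        "attributes": attributes,
        "version": version,
        "created": palm_time(created),
        "modified": palm_time(modified),
        "backed_up": palm_time(backed_up),
        "app_info": app_info,
        "sort_info": sort_info,
        "type": type_,
        "creator": creator,
        "records": records,
    }
```
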
+
diff --git a/format_docs/pdb/pdb_types.txt b/format_docs/pdb/pdb_types.txt
new file mode 100644
index 0000000000..5d6d39c897
--- /dev/null
+++ b/format_docs/pdb/pdb_types.txt
@@ -0,0 +1,34 @@
+Palm Database File Code
+-----------------------
+
+Reader Type Code
+
+Adobe Reader .pdfADBE
+PalmDOC TEXtREAd
+BDicty BVokBDIC
+DB (Database program) DB99DBOS
+eReader PNRdPPrs
+eReader DataPPrs
+FireViewer (ImageViewer) vIMGView
+HanDBase PmDBPmDB
+InfoView InfoINDB
+iSilo ToGoToGo
+iSilo 3 SDocSilX
+JFile JbDbJBas
+JFile Pro JfDbJFil
+LIST DATALSdb
+MobileDB Mdb1Mdb1
+MobiPocket BOOKMOBI
+Plucker DataPlkr
+QuickSheet DataSprd
+SuperMemo SM01SMem
+TealDoc TEXtTlDc
+TealInfo InfoTlIf
+TealMeal DataTlMl
+TealPaint DataTlPt
+ThinkDB dataTDBP
+Tides TdatTide
+TomeRaider ToRaTRPW
+Weasel zTXTGPlm
+WordSmith BDOCWrdS
+
diff --git a/format_docs/pdb/plucker.html b/format_docs/pdb/plucker.html
new file mode 100644
index 0000000000..07f7b926ca
--- /dev/null
+++ b/format_docs/pdb/plucker.html
@@ -0,0 +1,2122 @@
+
+
+
+
+The Plucker Document Format
+
+Introduction
+
+
+This document is the official description of the
+Plucker format.
+
+
+Overview
+
+The Plucker document format supports a multi-page (in the Web sense of 'page') hyperlinked information structure containing both 'rich' text and images. Links may be internal to the document or link to other documents. External links, in standard URL form, may be included and displayed, but not followed. Images may either be embedded in a text page, as with HTML, or be included as separate stand-alone pages.
+
+Plucker documents are structured so that they can be used both with a file-system-oriented operating system such as Unix or Windows, and with the PalmOS, a non-file-system-oriented OS. To this end, they always begin with a standard PalmOS record database prefix, which consists of four parts: the database header, a record-id list, an AppInfo block, and a SortInfo block. The Plucker format does not use the SortInfo block, which is therefore null, and consequently occupies no space in the document prefix.
+
+The record database prefix is then followed by a sequence of application-specific records. In a Plucker document, this sequence consists of one index record, followed by a series of data records. The index record contains information about the data records, along with some global information, such as the type of compression used. Each data record contains either a page, an image, or data about the document, such as bookmarks or URL data.
+
+The format is big-endian; any multi-byte numeric values specified in this document are big-endian. Images are stored in the Palm image format; for more information on this format please consult http://www.palmos.com/dev/tech/docs/.
+
+The Database Prefix
+
+
+
+The database header is a fixed-size structure of 72 bytes. It contains the name of the database, the Plucker version number, various timestamps (creation, modification, last backup), and several flags. All timestamps are given using the PalmOS standard, seconds since 12:00 AM on January 1, 1904.
+
+
+
+
+| Field | Bytes | Type | Notes |
+| docName | 32 | String | Must contain a NUL-terminated 7-bit ASCII string (only character codes 0x20-0x7E are valid) giving the name of the document. Because of the terminating NUL character at end, only 31 bytes can actually be used for the name of the document. The first 26 bytes of this string are used by Plucker as a unique ID for the document; names should be unique in the first 26 characters. |
+| flags | 2 | Bitfield | Most bits in this field are unused. Unused bits should be set to zero on document creation, but reader software should not expect them to stay at this value. Valid bits are as follows. All numeric values given are big-endian. |
+    | Name | Value | Meaning |
+    | CopyPrevention | 0x0040 | Indicates that system should not allow copying of this document. |
+    | Launchable | 0x0200 | Indicates that this document should be presented as a first-class object on desktop renderings. If this bit is set, an AppInfo block must be included. |
+    | Backup | 0x0008 | Indicates that this document should be backed up, if the system includes such a capability. |
+| version | 2 | Numeric | Version of the Plucker format used in this document. Must have the value 1. |
+| creationDate | 4 | Timestamp | Time of document creation |
+| modificationDate | 4 | Timestamp | Time document last modified |
+| unused1 | 8 | Numeric | Must be zero at document creation, but any specific value should not be relied upon. |
+| appInfoOffset | 4 | Numeric | Either zero, if no appInfo is present, or the offset from the beginning of the document to the start of the appInfo block. |
+| sortInfoId | 4 | Numeric | Must be zero. |
+| magic | 8 | String | Must be the 8 ISO Latin-1 characters "DataPlkr". No terminating NUL character. |
+| unused2 | 4 | Numeric | Must be zero at document creation, but any specific value should not be relied upon. |
+
+
+
+
+
+
+This list consists of a six-byte list header, followed by one ID entry for each data record in the document. The list header has the structure:
+
+
+| Field | Bytes | Type | Notes |
+| nextRecordListID | 4 | Numeric | Must be zero. |
+| numRecords | 2 | Numeric | Number of records in the document, including the index record. |
+
+
+
+This is then followed by numRecords entries of the following structure:
+
+
+| Field | Bytes | Type | Notes |
+| recordOffset | 4 | Numeric | Number of bytes from the start of the document to the beginning of the record |
+| attributes | 1 | Bitfield | Record attributes -- should be zero. |
+| uniqueID | 3 | Numeric | A local (document-specific) unique ID for the record. This is not used by Plucker (because it is not preserved by PalmOS through beaming of a document), but must still be different for each record. |
+
+
+
+
+Finally, there are two bytes of zero-padding to bring the structure alignment back to 4 bytes.
+
+
+
+Typically, this is only present when the launchable flag is set in the flags field of the database header. No Plucker data aside from icon display information and a versioning string is stored in this block. This block has the following structure:
+
+
+| Field | Bytes | Type | Notes |
+| signature | 4 | Numeric | Must contain the value 0x6C6E6368. |
+| hdrVersion | 2 | Numeric | Must have the value 3. |
+| hdrEncoding | 2 | Numeric | Must have the value 0. |
+| verStrWords | 2 | Numeric | The number of two-byte words following, containing the version string. |
+| verStr | 2 * verStrWords | String | NUL-terminated ISO Latin-1 string, padded at end if necessary with a zero byte to an even-byte boundary, containing a version string to display to the user containing version information for the document. |
+| pqaTitleWords | 2 | Numeric | The number of two-byte words in the following pqaTitleStr. |
+| pqaTitleStr | 2 * pqaTitleWords | String | NUL-terminated ISO Latin-1 string, padded at end if necessary with a zero byte to an even-byte boundary, containing a title string for iconic display of the document. |
+| iconWords | 2 | Numeric | Number of two-byte words in the following icon image. |
+| icon | 2 * iconWords | Image | Image (32x32) in Palm image format to be used as an icon to represent the document on a desktop-style display. The image may not use a custom color map. |
+| smIconWords | 2 | Numeric | Number of two-byte words in the following icon image. |
+| smIcon | 2 * smIconWords | Image | Small image (15x9) in Palm image format to be used as an icon to represent the document on a desktop-style display. The image may not use a custom color map. |
+
+
+
+
+
+This record includes info about the compression type used
+for the Plucker document and also what IDs the reserved records use.
+The viewer will use this record to know where to look for the
+reserved records and whether it must have support for ZLib
+compression. This record should always be the first record in
+the Plucker document (i.e. at index 0).
+
+
+
+| Field | Bytes | Type | Notes |
+| uid | 2 | Numeric | unique ID for record, always 0x0001 |
+| version | 2 | Numeric | 0x0002 if data is ZLib compressed, 0x0001 if DOC compressed |
+| records | 2 | Numeric | number of reserved records |
+| reserved | 4*records | Numeric | reserved ID array |
+
+
+
+
+The reserved ID array consists of a series of name/ID pairs,
+where the ID is the unique ID (2 bytes) for
+the record and the name is a value (2 bytes)
+from the following list.
+
+
+- home.html = 0
+- external bookmarks = 1
+- URL handling = 2
+- default category = 3
+- additional metadata = 4
+- page list metadata = 5
+- sorted URL name data = 6
+- external anchor name data = 7
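+
+As an illustration, the index record described above could be decoded with a
+Python sketch like the one below. The function and dictionary names are
+hypothetical. The sketch assumes that each pair stores the 2-byte name value
+first and the 2-byte record ID second, which the text only implies by calling
+them name/ID pairs.
+

```python
import struct

# Name values for the reserved records, as listed above.
RESERVED_NAMES = {
    0: "home.html",
    1: "external bookmarks",
    2: "URL handling",
    3: "default category",
    4: "additional metadata",
    5: "page list metadata",
    6: "sorted URL name data",
    7: "external anchor name data",
}
COMPRESSION = {0x0001: "DOC", 0x0002: "ZLIB"}


def parse_index_record(data: bytes) -> dict:
    """Decode the index record: uid, version, and the reserved ID array."""
    uid, version, count = struct.unpack_from(">HHH", data, 0)
    reserved = {}
    for i in range(count):
        # Assumption: the name value precedes the record ID in each pair.
        name, record_uid = struct.unpack_from(">HH", data, 6 + 4 * i)
        reserved[RESERVED_NAMES.get(name, name)] = record_uid
    return {
        "uid": uid,
        "compression": COMPRESSION.get(version, version),
        "reserved": reserved,
    }
```
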
+
+
+
+
+
+There are several different types of data records.
+
+
+
+
+
+Each data record starts with a header, having the following structure:
+
+
+
+| Field | Bytes | Type | Notes |
+| uid | 2 | Numeric | Unique ID for record. IDs must be sorted in increasing order. Currently the ID is not allowed to be 0xFFFF. Moreover, some earlier versions of the viewer had a bug that crashed on records numbered 0x8000-0xFFFE. |
+| paragraphs | 2 | Numeric | number of paragraphs |
+| size | 2 | Numeric | total length of data before compression |
+| type | 1 | Numeric | Data type. Must be one of the following: |
+    | Data type | Value |
+    | DATATYPE_PHTML | 0 |
+    | DATATYPE_PHTML_COMPRESSED | 1 |
+    | DATATYPE_TBMP | 2 |
+    | DATATYPE_TBMP_COMPRESSED | 3 |
+    | DATATYPE_MAILTO | 4 |
+    | DATATYPE_LINK_INDEX | 5 |
+    | DATATYPE_LINKS | 6 |
+    | DATATYPE_LINKS_COMPRESSED | 7 |
+    | DATATYPE_BOOKMARKS | 8 |
+    | DATATYPE_CATEGORY | 9 |
+    | DATATYPE_METADATA | 10 |
+    | DATATYPE_STYLE_SHEET | 11 |
+    | DATATYPE_FONT_PAGE | 12 |
+    | DATATYPE_TABLE | 13 |
+    | DATATYPE_TABLE_COMPRESSED | 14 |
+    | DATATYPE_COMPOSITE_IMAGE | 15 |
+    | DATATYPE_PAGELIST_METADATA | 16 |
+    | DATATYPE_SORTED_URL_INDEX | 17 |
+    | DATATYPE_SORTED_URL | 18 |
+    | DATATYPE_SORTED_URL_COMPRESSED | 19 |
+    | DATATYPE_EXT_ANCHOR_INDEX | 20 |
+    | DATATYPE_EXT_ANCHOR | 21 |
+    | DATATYPE_EXT_ANCHOR_COMPRESSED | 22 |
+| flags | 1 | Bitfield | Bit-mapped record flags. Valid bits are as follows (all numeric values given are big-endian); unused bits should be set to zero. |
+    | Name | Value | Meaning |
+    | Continued Record | 0x01 | A value of one indicates that the record is continued by the fragment in the next sequential record of the same type. This value is applicable to the following data types: DATATYPE_PHTML, DATATYPE_PHTML_COMPRESSED. A value of zero indicates that the record is not to be continued (i.e. there are no fragments beyond this one, or this is the last one). |
+    | Navigation Metadata | 0x02 | A value of one indicates that the text or image data in this record is followed by additional navigation metadata. |
+
+
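+
+The 8-byte data record header above lends itself to a short Python sketch. The
+names used here are hypothetical, and the set of compressed types is simply
+collected from the data type values whose names end in _COMPRESSED.
+

```python
import struct
from typing import NamedTuple


class DataRecordHeader(NamedTuple):
    uid: int         # unique ID for the record
    paragraphs: int  # number of paragraphs
    size: int        # total length of data before compression
    type: int        # one of the DATATYPE_* values
    flags: int       # bit-mapped record flags


HEADER = struct.Struct(">HHHBB")   # 2 + 2 + 2 + 1 + 1 = 8 bytes, big-endian
FLAG_CONTINUED = 0x01
FLAG_NAVIGATION = 0x02
# PHTML, TBMP, LINKS, TABLE, SORTED_URL and EXT_ANCHOR compressed variants.
COMPRESSED_TYPES = frozenset({1, 3, 7, 14, 19, 22})


def parse_data_record_header(record: bytes) -> DataRecordHeader:
    """Unpack the header found at the start of every data record."""
    return DataRecordHeader(*HEADER.unpack_from(record, 0))


def is_compressed(header: DataRecordHeader) -> bool:
    return header.type in COMPRESSED_TYPES


def is_continued(header: DataRecordHeader) -> bool:
    return bool(header.flags & FLAG_CONTINUED)


def has_navigation_metadata(header: DataRecordHeader) -> bool:
    return bool(header.flags & FLAG_NAVIGATION)
```
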
+
+
+This data format supports two forms of compression, DOC and ZLIB. That part of a data record that occurs after the header is compressed as a single chunk. All compressed records in a single document must use the same compression format. Compressed records may be mixed with uncompressed records. In a compressed record, the length of the compressed data must be less than its uncompressed length.
+DOC compression is the format invented for early Palm usage.
+ZLIB compression uses the ZLib format documented in Internet RFCs 1950 and 1951. See also http://www.gzip.org/zlib/manual.html for a description of the library used to perform the compression and decompression.
+Plucker documents may be keyed to a specific string of 40 or fewer ASCII characters, called the owner-id. When such a key is specified, zlib compression must be used in the document. When an owner-id is specified, the beginning of each zlib-compressed data segment is XOR'ed with a value derived from the key, after compression, and must be XOR'ed again with the derived value before being decompressed. If an owner-id is specified for a document, the metadata record must exist, and must contain an OwnerID subrecord giving the CRC-32 of the owner-id string.
+The derived value mentioned above is a 40-byte value constructed by forming 10 strings by concatenating the owner-id string with itself 2, 3, 4, 5, 6, 7, 8, 9, 10, and 11 times, then taking the CRC-32 values of each of these concatenations, then packing those 32-bit values in big-endian order into a 40-byte buffer.
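+
+The key derivation just described can be sketched in Python as follows. The
+function names are hypothetical. The sketch assumes that the CRC-32 meant here
+is the one computed by zlib, and that "the beginning" of a segment means at
+most its first 40 bytes; neither point is spelled out above.
+

```python
import struct
import zlib


def owner_id_key(owner_id: str) -> bytes:
    """Build the 40-byte derived value from an owner-id string."""
    raw = owner_id.encode("ascii")
    if len(raw) > 40:
        raise ValueError("owner-id must be 40 or fewer ASCII characters")
    # CRC-32 of the owner-id concatenated with itself 2, 3, ..., 11 times.
    crcs = [zlib.crc32(raw * n) & 0xFFFFFFFF for n in range(2, 12)]
    return struct.pack(">10I", *crcs)   # ten 32-bit values, big-endian


def xor_prefix(segment: bytes, key: bytes) -> bytes:
    """XOR the start of a segment with the key (assumed: up to len(key) bytes)."""
    n = min(len(segment), len(key))
    mixed = bytes(a ^ b for a, b in zip(segment[:n], key))
    return mixed + segment[n:]
```

+
+Because XOR is its own inverse, applying xor_prefix twice with the same key
+returns the original segment, which matches the encode and decode steps above.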
+
+
+
+
+
+For text data the data record header is followed by a series of paragraph
+headers, each representing a paragraph block in the text data. This series of paragraph headers is then followed by the compressed or uncompressed text data. Each paragraph header has the form:
+
+
+
+| Field | Bytes | Type | Notes |
+| size | 2 | Numeric | Total length of paragraph before compression. NOTE: No text data should be larger than 32k. If the original document is larger than 32k, then the parser must split it into several records. |
+| attributes | 2 | Bitfield | Paragraph info. The high-order 13 bits are reserved for future use and should be set to zero; the 3 low-order bits contain a numeric value in the range [0..7] giving the amount of extra paragraph spacing (2*value pixels). |
+
+
+
+
+The (uncompressed) text data contains a character stream of ISO Latin-1 characters, interspersed with 'functions'.
+
+A function is introduced in the text stream by a NULL character (0x00), followed by a one-byte function code
+and up to 7 bytes of data. The 3 LSB of the function code represent the
+remaining function data length; the 5 MSB denote the actual function
+code. The following functions are valid:
+
+
+
+| Function Code | Description | Bytes | Arguments |
+| 0x0A | Page link begins | 2 | record ID |
+| 0x0B | Targeted page link begins | 3 | record ID, target |
+| 0x0C | Paragraph link begins | 4 | record ID, paragraph number |
+| 0x0D | Targeted paragraph link begins | 5 | record ID, paragraph number, target |
+| 0x08 | Link ends | 0 | no data |
+| 0x11 | Set font | 1 | font specifier |
+| 0x1A | Embedded image | 2 | image record ID |
+| 0x22 | Set margin | 2 | left margin, right margin |
+| 0x29 | Alignment of text | 1 | alignment |
+| 0x33 | Horizontal rule | 3 | 8-bit height, 8-bit width (pixels), 8-bit width (%, 1-100) |
+| 0x38 | New line | 0 | no data |
+| 0x40 | Italic text begins | 0 | no data |
+| 0x48 | Italic text ends | 0 | no data |
+| 0x53 | Set text color | 3 | 8-bit red, 8-bit green, 8-bit blue |
+| 0x5C | Multiple embedded image | 4 | alternate image record ID, image record ID |
+| 0x60 | Underline text begins | 0 | no data |
+| 0x68 | Underline text ends | 0 | no data |
+| 0x70 | Strike-through text begins | 0 | no data |
+| 0x78 | Strike-through text ends | 0 | no data |
+| 0x83 | 16-bit Unicode character | 3 | alternate text length, 16-bit unicode character |
+| 0x85 | 32-bit Unicode character | 5 | alternate text length, 32-bit unicode character |
+| 0x8E | Begin custom font span | 6 | font page record ID, X page position, Y page position |
+| 0x8C | Adjust custom font glyph position | 4 | X page position, Y page position |
+| 0x8A | Change font page | 2 | font record ID |
+| 0x88 | End custom font span | 0 | no data |
+| 0x90 | Begin new table row | 0 | no data |
+| 0x92 | Insert table (or table link) | 2 | table record ID |
+| 0x97 | Table cell data | 7 | 8-bit alignment, 16-bit image record ID, 8-bit columns, 8-bit rows, 16-bit text length |
+| 0x9A | Exact link modifier | 2 | Paragraph Offset (The Exact Link Modifier modifies a Paragraph Link or Targeted Paragraph Link function to specify an exact byte offset within the paragraph. This function must be followed immediately by the function it modifies). |
+
+
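+
+To illustrate how functions are embedded in the text stream, here is a minimal
+Python sketch that separates plain text from functions. The function name and
+the tuple shapes it returns are hypothetical. It relies only on the rule stated
+above that the 3 least significant bits of the function byte give the number of
+data bytes that follow.
+

```python
def split_functions(text: bytes) -> list:
    """Split an uncompressed text stream into text runs and functions.

    Returns ("text", bytes) and ("func", function_byte, data) tuples.
    """
    items = []
    pos = 0
    start = 0                      # start of the current plain-text run
    while pos < len(text):
        if text[pos] != 0x00:
            pos += 1
            continue
        if pos > start:
            items.append(("text", text[start:pos]))
        if pos + 1 >= len(text):
            raise ValueError("stream ends inside a function")
        code = text[pos + 1]
        length = code & 0x07       # 3 LSB: length of the function data
        data = text[pos + 2 : pos + 2 + length]
        if len(data) != length:
            raise ValueError("function data is truncated")
        items.append(("func", code, data))
        pos += 2 + length          # skip the data so its bytes are not rescanned
        start = pos
    if start < len(text):
        items.append(("text", text[start:]))
    return items
```

+
+Skipping over the data bytes is important, because arguments such as a record
+ID may themselves contain 0x00 bytes.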
+
+
+The function arguments have the following definitions:
+
+
+
+| Argument | Bytes | Notes |
+| record ID | 2 | This is either a reference to a record in the Plucker document (that is, a real record ID), or an index into the list of URLs, for URLs which have not been included in the document. |
+| image record ID | 2 | reference to image in Plucker document |
+| paragraph number | 2 | paragraph number (starting from 0) to jump to, or an index into the external anchor name data if the record ID is a pseudo-record ID for a URL which has not been included in the document. |
+| font specifier | 1 | The font concept used in Plucker is that of a 'standard' font, along with bold and italic versions of that font. There is no font notion corresponding to HTML's <BIG> or <SMALL>. In this markup, boldness and size are specified with a font specifier; italic is specified with a separate function code. There are currently 12 font specification values, with the following meanings (the actual PalmOS fonts used by the Palm viewer are also given): |
+    | Value | Description | PalmOS 2.x | PalmOS 3.x |
+    | 0 | Regular text. | stdFont | stdFont |
+    | 1 | Suitable for <H1> HTML tags. | boldFont | largeBoldFont |
+    | 2 | Suitable for <H2> HTML tags. | boldFont | largeBoldFont |
+    | 3 | Suitable for <H3> HTML tags. | boldFont | largeFont |
+    | 4 | Suitable for <H4> HTML tags. | boldFont | largeFont |
+    | 5 | Suitable for <H5> HTML tags. | stdFont | boldFont |
+    | 6 | Suitable for <H6> HTML tags. | stdFont | boldFont |
+    | 7 | Regular text, but bold. | stdFont | boldFont |
+    | 8 | Fixed-width text, suitable for <TT> HTML tags. | stdFont | fixedWidthFont |
+    | 9 | Small normal text, suitable for <SMALL> HTML tags. | stdFont | stdFont |
+    | 10 | Small subscript text, suitable for <SUB> HTML tags. | stdFont | stdFont |
+    | 11 | Small superscript text, suitable for <SUP> HTML tags. | stdFont | stdFont |
+| left margin | 1 | left margin in pixels |
+| right margin | 1 | right margin in pixels |
+| alignment | 1 | alignment code (left = 0, right = 1, center = 2, justify = 3) |
+| height | 1 | height of horizontal rule in pixels; if not given, a default value of 2 pixels will be used |
+| width (pixels) | 1 | width in pixels, should be 0 if percentage value should be used |
+| width (%) | 1 | width as the percentage between the current left and right margins. The default is 100% |
+| alternate text length | 1 | When a Unicode character not representable in ISO-Latin-1 is encountered in an HTML document, a Unicode-character function code is inserted, with the 16-bit or 32-bit value of the character. This is followed by an "alternate representation" of the character in ISO-Latin-1 text. This parameter gives the length, in bytes, of the alternate text span. If the viewer can present the Unicode character directly, display of the alternate text should be suppressed. |
+| 16 or 32 bit Unicode character | 2, 4 | When a Unicode character not representable in ISO-Latin-1 is encountered in an HTML document, a Unicode-character function code is inserted, with the 16-bit or 32-bit Unicode character code for the character, which this parameter supplies. This is followed by an "alternate representation" of the character in ISO-Latin-1 text. If the viewer can present the Unicode character directly, display of the alternate text should be suppressed. |
+| target | 1 | The target parameter of a link function allows an alternate default target view for a link to be specified. By default, a link will always open in the same view as the current content location. Valid link targets are as follows: |
+    | Value | Description |
+    | 0 | Default View. If specified, the link will be opened in the default view as determined by the reader. This value causes a Targeted Paragraph or Page Link to behave identically to a standard Paragraph or Page Link. |
+    | 1 | Primary View. Specifies that the link will be opened in the primary window regardless of current content location. |
+    | 2 | Secondary View/Popup View. Specifies that the link will be opened in the secondary or popup view regardless of current content location. |
+| Paragraph Offset | 2 | specifies an exact byte offset within a paragraph relative to the beginning of the paragraph. |
+
+
+
+
+
+
+The image data consists of an image in Palm image format, compressed or uncompressed as specified in the document's index record. The image may in addition be internally compressed, via any of the compression techniques allowed in the Palm image format. The fundamental size of an image must be less than 480,000; this size is calculated by multiplying the width (in pixels) by the height (in pixels) by the depth (in bits).
+
+If the fundamental size is greater than 480,000, most parsers can be told to create a Multi-image group. This is a group of image records consisting of parts of the image which the viewer displays as one image. The parts are standard Image data records and the Multi-image record tells how many columns and rows the image has, and the record numbers of the parts.
+
+
+
+| Field | Bytes | Type | Notes |
+| columns | 2 | Numeric | number of columns in this image |
+| rows | 2 | Numeric | number of rows in this image |
+| image record IDs | 2 * columns * rows | Numeric | References to images in Plucker document. There are (columns * rows) images listed here |
+
+
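+
+The size rule and the Multi-image layout above can be sketched in Python as
+shown below. All names are hypothetical. The sketch takes the bytes that follow
+the data record header, and it assumes the image record IDs are stored row by
+row, which the table does not state.
+

```python
import struct

MAX_FUNDAMENTAL_SIZE = 480_000


def fundamental_size(width: int, height: int, depth: int) -> int:
    """Width (pixels) times height (pixels) times depth (bits)."""
    return width * height * depth


def fits_in_one_record(width: int, height: int, depth: int) -> bool:
    """True if the fundamental size is less than 480,000."""
    return fundamental_size(width, height, depth) < MAX_FUNDAMENTAL_SIZE


def parse_multi_image(body: bytes) -> list:
    """Return the image record IDs of a Multi-image record as a list of rows."""
    columns, rows = struct.unpack_from(">HH", body, 0)
    ids = struct.unpack_from(f">{columns * rows}H", body, 4)
    # Assumption: IDs are stored row by row (row-major order).
    return [list(ids[r * columns:(r + 1) * columns]) for r in range(rows)]
```
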
+
+
+
+
+
+
+This data is optionally appended to the end of a text or image data record based on the
+setting of the Navigation Metadata flag in the record header. If the Navigation Metadata flag
+is set to one, the image or text data is immediately followed by the following data structures.
+
+
+NOTE: If navigation data is appended to a record then the last two bytes
+in the record shall contain the byte offset from the beginning of the record to the start of the
+navigation data.
+
+
+
+
+| Field | Bytes | Type | Notes |
+| anchor name offset | 2 | Numeric | Byte offset from the beginning of the metadata to the anchor name table for this record or 0xffff if there is no anchor name data. |
+| pagelist offset | 2 | Numeric | Byte offset from the beginning of the metadata to the page list table for this record or 0xffff if there is no page list data. |
+| hierarchy offset | 2 | Numeric | Byte offset from the beginning of the metadata to the hierarchy table for this record or 0xffff if there is no hierarchy data. |
+| topic offset | 2 | Numeric | Byte offset from the beginning of the metadata to the list of topics associated with this record or 0xffff if there is no topic data. |
+| Title Strings | 2+ | String sequence | A series of concatenated NUL-terminated strings in the following order: |
+    - Long Record Title - The title of the record. This title should allow the record to be identified out of context.
+    - Short Record Title - A title string which allows the record to be identified in context.
+    If a given string in the list is not defined, an empty string (NUL) must still be entered in the appropriate location in the string sequence.
+
+
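+
+A rough Python sketch of locating the navigation data and reading its offsets
+block and title strings follows. The names are hypothetical. The sketch assumes
+it is given the complete record bytes in uncompressed form, and it converts the
+table offsets into positions relative to the start of the record.
+

```python
import struct

NO_DATA = 0xFFFF
TABLE_NAMES = ("anchor_names", "pagelist", "hierarchy", "topics")


def parse_navigation_metadata(record: bytes) -> dict:
    """Read the offsets block and title strings of the navigation metadata."""
    # The last two bytes hold the offset from the start of the record
    # to the start of the navigation data.
    (start,) = struct.unpack_from(">H", record, len(record) - 2)
    offsets = struct.unpack_from(">4H", record, start)
    tables = {
        name: (None if off == NO_DATA else start + off)
        for name, off in zip(TABLE_NAMES, offsets)
    }
    # Two NUL-terminated title strings follow the four offsets.
    pos = start + 8
    titles = []
    for _ in range(2):
        end = record.index(b"\0", pos)
        titles.append(record[pos:end])
        pos = end + 1
    return {"tables": tables, "long_title": titles[0], "short_title": titles[1]}
```
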
+
+
+The offsets block and title strings are followed by a series of tables.
+
+The anchor name table specifies the offset of the anchor names within the text record.
+
+
+
+| Field | Bytes | Type | Notes |
+| anchor names | 2 | Numeric | Number of anchor names. |
+| anchor name data | 2 * anchor names | Numeric Array | An array of 2-byte offsets, each giving the offset associated with an anchor name relative to the beginning of the text record. The order of the offsets corresponds to the order of the strings in the anchor name string sequence below. |
+| anchor name strings | variable size | String sequence | A concatenated sequence of NUL-terminated strings, each an anchor name. The position of an anchor name string in the string sequence is its index into the anchor name data. |
+
+
+
+
+
+The page list table contains the uid of the previous and next records relative
+to this record for one or more page lists. Each page list, combined with the
+page lists from other Incoming Navigation data records, defines one or more
+unique linear navigation schemes for the document. The default scheme is
+always associated with a list id of 0.
+
+
+
+| Field | Bytes | Type | Notes |
+| pagelists | 2 | Numeric | Number of page list entries for this record. |
+| pagelist data | 6 * pagelists | Page List Data | Block of data containing an array of Page List Data (described below). |
+
+
+
+
+
+The page list data consists of a series of structures containing a list id followed by the
+unique ID of the previous and next records associated with that list id.
+
+
+
+| Field | Bytes | Type | Notes |
+| list id | 2 | Numeric | The list id for this list. A list id of 0 should be used for the default linear ordering. |
+| prev uid | 2 | Numeric | The uid of the previous record in the series for this list id, or 0xffff if this is the first record in the series. |
+| next uid | 2 | Numeric | The uid of the next record in the series for this list id, or 0xffff if this is the last record in the series. |
+
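The page list table and its 6-byte entries could be read along the following lines. This is only a sketch: big-endian byte order is an assumption, and the class and function names are hypothetical.

```python
import struct
from typing import List, NamedTuple, Optional

END_OF_SERIES = 0xFFFF  # marks "no previous" or "no next" record


class PageListEntry(NamedTuple):
    list_id: int
    prev_uid: Optional[int]
    next_uid: Optional[int]


def parse_page_list_table(table: bytes) -> List[PageListEntry]:
    """Decode the entry count and the 6-byte Page List Data structures."""
    (count,) = struct.unpack_from(">H", table, 0)  # assumed big-endian
    entries = []
    for i in range(count):
        list_id, prev_uid, next_uid = struct.unpack_from(">3H", table, 2 + 6 * i)
        entries.append(PageListEntry(
            list_id,
            None if prev_uid == END_OF_SERIES else prev_uid,
            None if next_uid == END_OF_SERIES else next_uid,
        ))
    return entries
```

An entry whose prev_uid is None is the first record of that list; one whose next_uid is None is the last.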
+
+
+
+The hierarchy table specifies the unique ID of each text record above this record in
+the document hierarchy that serves as an index leading to the current record.
+
+
+
+| Field | Bytes | Type | Notes |
+| levels | 2 | Numeric | Number of levels above the current record in the hierarchy. |
+| hierarchy data | 2 * levels | Numeric Array | An array of 2-byte uids, each corresponding to the text record that serves as the index at a given level in the document hierarchy relative to the current record. The order of each uid in the array corresponds to the order of its string description in the string sequence below. |
+| hierarchy strings | variable size | String sequence | An abbreviated string that identifies the level index. This string should be as short as possible, ideally only a few characters. |
+
+
+
+
+The topic table provides a list of topics associated with the record.
+
+
+
+| Field | Bytes | Type | Notes |
+| topics | 2 | Numeric | Number of topics in the topic string sequence. |
+| topic strings | 0+ | String sequence | A concatenated sequence of one or more NUL-terminated ISO Latin-1 strings. Each string represents a topic associated with this text record. |
+
+
+
+
+
+
+The mailto data contains information about the e-mail addresses that are
+referenced by the mailto anchors. All offsets are counted
+from the end of the header.
+
+
+
+| Field | Bytes | Type | Notes |
+| to offset | 2 | Numeric | Offset to TO string |
+| cc offset | 2 | Numeric | Offset to CC string |
+| subject offset | 2 | Numeric | Offset to SUBJECT string |
+| body offset | 2 | Numeric | Offset to BODY string |
+| strings | 0+ | String sequence | A concatenated sequence of one or more NUL-terminated US-ASCII strings. Each contains a header value, which follows the constraints on header values laid down in IETF RFC 2822. Header folding is not allowed. Any of the four headers shown above may be absent; header values should be accessed via the above offsets. |
+
+
+
+
+
+
+Optionally, URL information for the records in the document may be stored. This information includes URL strings both for the pages actually included in the document, and for those pages excluded from the document. This information is conceptually stored as a sequence of strings, where the position of the URL in the sequence corresponds to the record ID of its page in the document. In the case of a page which is not actually included in the document, a pseudo-record-ID is assigned, greater than any actual record IDs in the document, and the URL of that page is associated with that pseudo-record-ID.
+
+In practice, there are two kinds of records used to store the URL strings, the URL handling data record, which serves as an index into the sequence of strings, and the URL data record, one or more of which contain the actual strings.
+
+
+For cross-document linking support, the URL strings must be of the format "doc://[external doc name]:[url]" where external doc name is the name of the external document and url is the URL string associated with a given record.
+
+The URL handling data is used to find the record ID of the record which contains the correct URL string. It
+contains a series of 2 byte number pairs.
+
+
+
+| Field | Bytes | Type | Notes |
+| last url | 2 | Numeric | The ordinal number of the last URL in record |
+| id | 2 | Numeric | Record ID for record |
+
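One way to use these pairs is a binary search on the "last url" values. The sketch below assumes big-endian byte order and that the pairs are stored in ascending order of "last url"; neither is stated here, and the function names are hypothetical.

```python
import struct
from bisect import bisect_left


def parse_url_handling(data: bytes):
    """Split the URL handling data into (last_url, record_id) pairs."""
    return [struct.unpack_from(">2H", data, i) for i in range(0, len(data) - 3, 4)]


def url_data_record_for(pairs, url_number):
    """Return the ID of the URL data record that holds the given URL ordinal."""
    last_urls = [last for last, _record_id in pairs]
    i = bisect_left(last_urls, url_number)  # first pair with last_url >= url_number
    if i == len(pairs):
        raise KeyError(url_number)
    return pairs[i][1]
```

With pairs (200, 11) and (350, 12), URL 200 resolves to record 11 and URL 201 to record 12.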
+
+
+
+
+
+The URL data contains a list of the URLs. Additional records
+are created if needed and contain up to 200 URLs.
+
+
+
+| Field | Bytes | Type | Notes |
+| URLs | 1+ | String sequence | A concatenated sequence of NUL-terminated URL strings following the constraints of IETF RFC 1738. The list may contain up to 200 URLs. Only text and image records are included; other records are represented only by the presence of a NUL, that is, by an empty string. |
+
+
+
+
+These records may or may not be compressed. This is indicated
+by the type in the header. These records are used by the Details
+form to display the URL of the current record and by the External
+Reference form to display the URLs of pages that were not collected. From
+either form you can copy the URL to a Memo to remind you to pluck
+it at a later date. For inter-document links, a paragraph link
+function may be specified to contain a pseudo-record-ID in place
+of an actual record ID, and an index into the external anchor names
+record in place of the paragraph number.
+
+
+
+
+The external bookmarks data contains a list of bookmarks added by the
+parser. They work similarly to named anchors.
+
+
+
+| Field | Bytes | Type | Notes |
+| bookmarks | 2 | Numeric | Number of bookmarks |
+| offset | 2 | Numeric | Offset to the start of the bookmark data (counting from the beginning of the record) |
+| names | variable size | String sequence | A concatenated sequence of NUL-terminated strings, each a bookmark name |
+| bookmark data | 4 * bookmarks | Bookmark Data | Block of data for the location of the external bookmarks (see below) |
+
+
+
+
+
+The bookmark data is a series of uid/offset pairs.
+
+
+
+| Field | Bytes | Type | Notes |
+| uid | 2 | Numeric | Unique ID for record |
+| offset | 2 | Numeric | Paragraph offset |
+
+
+
+
+
+
+Each Plucker document can be assigned to a number of named categories. This record stores the names of default categories for the document. The data consists of a concatenated series of NUL-terminated strings that
+should be used as the default category/categories for this document.
+
+
+
+There should only be one of these per document. This record begins with a two byte numeric value, giving the number of subrecords that follow, followed by that number of subrecords. The subrecords are a sequence of tagged variable length items. Each subrecord consists of three fields:
+
+
+
+
+| Field | Bytes | Type | Description |
+| type code | 2 | Numeric | Specifies what piece of extra information is in this subrecord |
+| length | 2 | Numeric | Number of 2-byte words in the argument |
+| argument | 2 * length | (type code specific) | Data |
+
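The subrecord framing can be walked with a short loop. The following Python sketch assumes big-endian byte order and that the data passed in starts at the two-byte subrecord count; the function name is hypothetical. Note that the length field counts 2-byte words, not bytes.

```python
import struct


def parse_metadata_subrecords(record: bytes):
    """Return a list of (type_code, argument_bytes) tuples."""
    (count,) = struct.unpack_from(">H", record, 0)  # assumed big-endian
    pos = 2
    subrecords = []
    for _ in range(count):
        type_code, length_words = struct.unpack_from(">2H", record, pos)
        pos += 4
        size = 2 * length_words  # length is given in 2-byte words
        subrecords.append((type_code, record[pos:pos + size]))
        pos += size
    return subrecords
```

Because every subrecord carries its own length, a reader can step over arguments it does not understand without knowing their structure.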
+
+
+
+The following table describes the valid subrecord type codes, and describes the structure of the associated data for each subrecord type. Subrecords with unknown type codes should be ignored.
+
+
+
+| Type code | Name | Description | Argument |
+| 1 | CharSet | This is the character set and encoding used by text records in this document, unless otherwise specified for particular records. | A two-byte numeric value, specifying the IETF IANA MIBenum value for the character set. See the IANA registry of character sets for valid values. |
+| 2 | ExceptionalCharSets | This is a list of text records which use a charset other than that specified by the default CharSet. Note that if no default CharSet is specified, the default charset should be thought of as "unknown". | A sequence of (length / 2) record-ID, IANA-MIBenum pairs, where MIBenum values are as specified for CharSet. The invalid MIBenum value of 0 (zero) is used for records which have an unknown charset, if necessary. Each pair consists of a record ID (2 bytes, Numeric: unique ID for record) followed by a MIBenum (2 bytes, Numeric: IANA MIBenum for the character set used in this record). |
+| 3 | OwnerID | This is the CRC-32 of the specified owner-id for the document, if any. Note that associating an owner-id with a document also affects the calculation of zlib compression. | A four-byte numeric value giving the CRC-32 of the owner-id string. |
+| 4 | Author | The name of the author of the document. | A string value in the document's default character set, padded at the end with NUL characters to an even number of bytes. |
+| 5 | Title | The full title of the document. | A string value in the document's default character set, padded at the end with NUL characters to an even number of bytes. |
+| 6 | PublicationDate | The date and time this document was created. | A 4-byte unsigned integer giving the number of seconds from 12:00 AM on January 1, 1904, to the time when this document was created. |
+| 7 | Linked Documents | The list of external documents that this document links to. | A concatenated sequence of NUL-terminated strings representing the document names for each external document linked to within this document. The string sequence should be padded at the end with NUL characters to an even number of bytes. |
+
+
+
+
+
+
+TBD
+
+
+
+TBD
+
+
+
+The Table Record describes an HTML table. It begins with a structure with the following format.
+
+
+
+| Field | Bytes | Type | Description |
+| size | 2 | Numeric | Size of the following data |
+| columns | 2 | Numeric | Number of columns the table contains |
+| rows | 2 | Numeric | Number of rows the table contains |
+| depth | 1 | Numeric | Bits per pixel (BPP) needed to render the table |
+| border | 1 | Numeric | Draw table borders (0 = no, any other value = yes, 1 pixel wide) |
+| border color | 4 | Numeric | RGB value of border color |
+| link color | 4 | Numeric | RGB value of link color |
+
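The fixed part of the Table Record adds up to 16 bytes, which the following sketch unpacks in one call. Big-endian byte order is an assumption, the data is assumed to start at the "size" field, and the names are invented for the example.

```python
import struct
from typing import NamedTuple

# size, columns, rows (2 bytes each); depth, border (1 byte each);
# border color, link color (4 bytes each)
TABLE_HEADER = struct.Struct(">3H2B2I")


class TableHeader(NamedTuple):
    size: int
    columns: int
    rows: int
    depth: int
    border: bool
    border_color: int
    link_color: int


def parse_table_header(data: bytes) -> TableHeader:
    """Unpack the structure at the start of a Table Record."""
    size, columns, rows, depth, border, border_color, link_color = \
        TABLE_HEADER.unpack_from(data, 0)
    # Any non-zero border value means "draw a 1 pixel border".
    return TableHeader(size, columns, rows, depth, border != 0,
                       border_color, link_color)
```

The table row and table cell functions begin at offset TABLE_HEADER.size in the same data.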
+
+
+
+This is followed by table row and table cell functions (their ends are implied). Each table cell function is followed by 'text length' bytes of content (the length is taken from the function), containing text and/or formatting functions such as links, style, underline, strike through, and italic.
+
+
+
+
+There should be only one Page List Metadata record per document. This record is used to assign the name and initial record associated with each page list in the document. Page lists are used to define the default ordering of pages within the document. More than one page list can be specified, which can be useful for defining tours through a document.
+
+
+
+
+| Field | Bytes | Type | Description |
+| lists | 2 | Numeric | The number of page lists or tours in the document. |
+| first record | 2 * lists | Numeric Array | An array of uids corresponding to the first record in each page list. The zero-based index into this array is the list id (tour id). The first entry should be considered the default page ordering for the document. |
+| list name | 1+ | String sequence | A concatenated sequence of NUL-terminated strings, each representing the name of a page list. The first entry in the list corresponds to the default page ordering. For unnamed page lists, a NUL character should still be specified. |
+
+
+
+
+Page lists can be thought of as linked lists of records. The first record field in the
+Page List Metadata record is equivalent to the head pointer of the list.
+Each text record contains a previous/next record pointer within its navigation
+metadata.
+
+
+
+
+The Sorted URL handling record is used to find the record ID of the Sorted URL data record
+containing data for a given URL string. It contains a series of 2 byte number pairs.
+
+
+
+
+| Field | Bytes | Type | Description |
+| last URL | 2 | Numeric | The ordinal number of the last URL in record |
+| id | 2 | Numeric | Record ID for record |
+
+
+
+
+
+
+
+The sorted URL data record contains a list of URL/UID pointers sorted according to the
+lexicographical order of the URL strings pointed to by the url uid and
+url offset fields. This data is used in cross-document linking to
+facilitate a binary search of the URL strings in order to look up the record ID
+for an incoming URL string. Only URLs for records actually contained in the document
+should be included in the Sorted URL data records. URLs for external
+records should be omitted.
+
+
+
+| Field | Bytes | Type | Description |
+| url uid | 2 | Numeric | Unique ID for URL data record that contains the sorted URL string. |
+| url offset | 2 | Numeric | Byte offset of the first character of the sorted URL string in the URL data record. |
+| record uid | 2 | Numeric | Unique ID of the text or image record that pertains to the sorted URL string. |
+
+
+
+
+
+
+
+The External Anchor handling record is used to find the record ID of the External
+Anchor data record containing a given external anchor string. It contains a series of
+2 byte number pairs.
+
+
+
+
+| Field | Bytes | Type | Description |
+| last anchor | 2 | Numeric | The ordinal number of the last anchor in record |
+| id | 2 | Numeric | Record ID for record |
+
+
+
+
+
+
+
+The External Anchor data record is a string table containing the unique names for all
+external anchor name strings referenced in this document. These strings
+are used to query the record specific anchor name tables in a target book to
+determine a paragraph offset for cross-document linking. This information is
+conceptually stored as a sequence of strings, where the position of the anchor
+name in the sequence corresponds to its index.
+
+
+
+| Field | Bytes | Type | Description |
+| anchor name list | 1+ | String sequence | A concatenated sequence of unique NUL-terminated strings, each representing an anchor name from an external link found within this document. |
+
+
+
+
+
+These records may or may not be compressed. This is indicated by the type in the header. These
+records are used in conjunction with the Sorted URL Data records and record specific anchor
+name tables to facilitate cross-document linking.
+
+
+
+
+© Copyright 2000 Michael Nordström <micke@sslug.dk> · Copyright 2001 Bill Janssen <bill@janssen.org>
+
+$Id: DBFormat.html,v 1.27 2005/10/29 14:14:21 nordstrom Exp $
+
+
+
+
diff --git a/format_docs/pdb/pml.txt b/format_docs/pdb/pml.txt
new file mode 100644
index 0000000000..b5b357f381
--- /dev/null
+++ b/format_docs/pdb/pml.txt
@@ -0,0 +1,936 @@
+Palm Markup Language
+--------------------
+
+This page explains how to use the Palm Markup Language (PML) to specify
+formatting and other information in a text file for later reading using the
+eReader.
+
+PML commands start with a backslash, "\", and usually consist of a single
+character after that. Some PML commands are paired, such as those that specify
+italicized text. Other commands are directives, such as the "\p", which
+specifies a page break. PML is not meant to be an industrial-strength markup
+language, but it is easy to understand, easy to parse, and creates high-quality
+electronic books.
+
+Since PML and Palm DropBook are not without flaws, there is a page of Tips and
+Pitfalls.
+
+
+Let's Dive Right In
+-------------------
+
+palmsample.txt contains examples of formatting text, specifying chapters, etc.
+Use it to start from, or just as an example when making your own books.
+
+The following table specifies the Palm Markup Language commands, and what
+they do.
+
+\p                           New page
+\x                           New chapter; also causes a new page break.
+                             Enclose chapter title (and any style codes)
+                             with \x and \x
+\Xn                          New chapter, indented n levels (n between 0 and
+                             4 inclusive) in the Chapter dialog; doesn't
+                             cause a page break. Enclose chapter title (and
+                             any style codes) with \Xn and \Xn
+\Cn="Chapter title"          Insert "Chapter title" into the chapter
+                             listing, with level n (like \Xn). The text is
+                             not shown on the page and does not force a page
+                             break. This can sometimes be useful to insert a
+                             chapter mark at the beginning of an
+                             introduction to the chapter, for example.
+\c                           Center this block of text; close with \c on
+                             beginning of line
+\r                           Right justify text block; close with \r on
+                             beginning of line
+\i                           Italicize block; close with \i
+\u                           Underline block; close with \u
+\o                           Overstrike block; close with \o
+\v                           Invisible text; close with \v (can be used for
+                             comments)
+\t                           Indent block. Start at beginning of a line,
+                             close with \t at end of a line
+\T="50%"                     Indents the specified percentage of the screen
+                             width, 50% in this case. If the current drawing
+                             position is already past the specified screen
+                             location, this tag is ignored.
+\w="50%"                     Embed a horizontal rule of a given percentage
+                             width of the screen, in this case 50%. This tag
+                             causes a line break before and after it. The
+                             rule is centered. The percent sign is mandatory.
+\n                           Switch to the "normal" font, which is specified
+                             by the user
+\s                           Switch to stdFont; close with \s to revert to
+                             normal font
+\b                           Switch to boldFont; close with \b to revert to
+                             normal font (deprecated; use \B instead)
+\l                           Switch to largeFont; close with \l to revert to
+                             normal font
+\B                           Mark text as bold. Unlike the \b tag, \B
+                             doesn't change the font, so you can have large
+                             bold text. You cannot mix \b and \B in the same
+                             PML file.
+\Sp                          Mark text as superscript. Should not be mixed
+                             with other styles such as bold, italic, etc.
+                             Enclose superscripted text with \Sp.
+\Sb                          Mark text as subscript. Should not be mixed
+                             with other styles such as bold, italic, etc.
+                             Enclose subscripted text with \Sb.
+\k                           Make enclosed text into small-caps; close with
+                             \k. Any characters enclosed in \k tags
+                             (including those with accents) are made
+                             uppercase and are rendered at a smaller point
+                             size than a regular uppercase character.
+\\                           Represents a single backslash
+\aXXX                        Insert non-ASCII character whose Windows 1252
+                             code is decimal XXX. See the PML character
+                             table for details.
+\UXXXX                       Insert non-ASCII character whose Unicode code
+                             is hexadecimal XXXX. See the Extended PML
+                             character table for details.
+\m="imagename.png"           Insert the named image. See the section on
+                             Images below.
+\q="#linkanchor"Some text\q  Reference a link anchor which is at another
+                             spot in the document. The string after the
+                             anchor specification and before the trailing \q
+                             is underlined or otherwise shown to be a link
+                             when viewing the document.
+\Q="linkanchor"              Specify a link anchor in the document.
+\-                           Insert a soft hyphen. A soft hyphen shows up
+                             only if it is necessary to break a word across
+                             a line.
+\Fn="footnote1"1\Fn          Link the "1" to a footnote whose name is
+                             footnote1, tagged at the end of the PML
+                             document. See the section on Footnotes and
+                             Sidebars below.
+\Sd="sidebar1"Sidebar\Sd     Link the "Sidebar" text to a sidebar whose name
+                             is sidebar1, tagged at the end of the PML
+                             document. See the section on Footnotes and
+                             Sidebars below.
+\I                           Mark as a reference index item. Enclose index
+                             item (and any style codes) with \I and \I. See
+                             Creating Dictionaries for more information.
+
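The three character escapes in the table (\\, \aXXX and \UXXXX) can be expanded mechanically. The sketch below is only an illustration and is not taken from DropBook or eReader: the function name is hypothetical, every other PML command is left untouched, and codes that Windows-1252 leaves undefined are replaced rather than rejected, which is a choice made for the example.

```python
import re

# \\ (literal backslash), \aXXX (decimal Windows-1252), \UXXXX (hex Unicode)
_ESCAPE = re.compile(r"\\(?:\\|a(\d{3})|U([0-9A-Fa-f]{4}))")


def decode_pml_chars(text: str) -> str:
    """Expand PML character escapes, leaving other commands as they are."""
    def expand(match):
        decimal, hexadecimal = match.groups()
        if decimal is not None:
            code = int(decimal)
            if code > 255:
                return match.group(0)  # not a single-byte code; keep as-is
            return bytes([code]).decode("cp1252", errors="replace")
        if hexadecimal is not None:
            return chr(int(hexadecimal, 16))
        return "\\"  # the \\ escape
    return _ESCAPE.sub(expand, text)
```

For instance, the input `\a147Yeah.\a148` comes back with the two codes turned into opening and closing double smart quotes.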
+
+Examples
+--------
+
+\pThis is a new page
+
+\xChapter III\x
+
+\X1Chapter III, part A\X1
+
+\p\C1="Introduction"The following story is one of my favorites...
+
+\cProperty of
+Gateway Senior High School
+\c
+
+\rJustify my love
+\r
+
+This stuff is \ireally\i cool.
+
+I just read \uMoby Dick.\u
+
+This is a \obig\o mistake.
+
+Copyright 1917\v Date of magazine serialization \v
+
+\tOnce upon a time
+there was a wicked queen
+called Esmerelda.\t
+
+Mammals:\T="40%"Lions
+\T="40%"Tigers
+\T="40%"Bears
+
+He walked away.
+\w="80%"
+Later that day, he ran into an old friend.
+
+\nIn the normal ways...
+
+The \stitle page\s should be formatted...
+
+I just \bcan't\b believe that you...
+
+This \lREALLY\l is a large tiger...
+
+This \Bbold\B text can be either \l\Blarge bold\B\l or \s\Bsmall bold\B\s.
+
+e\Spx + 2\Sp = 9
+
+C\Sb2\SbH\Sb3\SbO\Sb2\Sb should be used in moderation.
+
+See also \kanteater\k.
+
+The DOS prompt said "C:\\windows\\"
+
+The man said \a147Yeah.\a148
+
+Arrows can point \U2190 left or right \U2192.
+
+A Yield sign looks like this: \m="yieldsign.png".
+
+See the \q="#detailedinstructions"Detailed Instructions\q for how to install your eBook.
+
+\Q="detailedinstructions"\bDetailed Instructions\b - This section
+describes how to install an eBook to your handheld device.
+
+Very long words like anti\-dis\-establish\-ment\-arian\-ism may benefit from
+the use of soft hyphens.
+
+The Emerson case\Fn="emerson"[1]\Fn will be very important...
+
+For more information, see the \Sd="moreinfo"sidebar\Sd.
+
+\I\Baardvark\B\I \in.\i a large burrowing nocturnal mammal that feeds especially on termites and ants
+
+
+Footnotes and Sidebars
+----------------------
+
+Footnotes and Sidebars are specified with an XML-like syntax at the end of the
+PML document. For example,
+
+<sidebar id="sidebar1">
+Text of the sidebar.
+</sidebar>
+
+would specify the sidebar to be displayed when the user taps on a sidebar link
+in the text that was specified using the \Sd tag.
+
+Any text or PML placed after the first footnote or sidebar is ignored as part
+of the book text.
+
+Sidebars and footnotes can include most PML features, but there are some PML
+tags that cannot be used inside of a sidebar or footnote.
+
+These include
+Chapters \x, \X, \C
+Links \q, \Q
+Footnotes \Fn
+Sidebars \Sd
+
+See the palmsample.txt file for examples of how to use many of the PML tags.
+
+
+Images
+------
+
+The following rules are intended to guarantee that images in your eBook will be
+viewable on all platforms that eReader runs on.
+
+On low-resolution Palm OS handhelds, an image wider than 158 pixels or taller
+than 148 pixels will be represented in the text by a thumbnail that the user
+can tap to view the entire image. Images smaller than 158 x 148 will be
+presented in-line with the text.
+
+On high-resolution Palm OS handhelds (those having screens of 320x320 pixels or
+more), images smaller than 158 by 148 pixels will be pixel-doubled. Images
+larger than 158x148 may be shown in-line with the text, if they will fit on
+the screen.
+
+On non-Palm OS platforms, small images will be scaled up appropriately. Large
+images will be scaled down to fit on the page; in this case the user can tap on
+the image to view the entire image and zoom in or out.
+
+For DropBook to find the image, it must be present in a directory whose name
+matches that of the PML text file. For example, if "pmlsample.txt" contains a
+reference to an image called "intro.png", then there must be a directory called
+"pmlsample_img" that contains intro.png. The directory's name is the name of
+the PML file (without the .txt extension) with "_img" appended.
+
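The naming rule for the image directory can be expressed in a few lines of Python. This is a sketch with a hypothetical function name; it assumes the directory sits next to the PML file, which the text above does not spell out.

```python
from pathlib import Path


def image_dir_for(pml_path: str) -> Path:
    """Return the directory DropBook is described as searching for images."""
    pml = Path(pml_path)
    # Drop the extension and append "_img": pmlsample.txt -> pmlsample_img
    return pml.with_name(pml.stem + "_img")
```

For "pmlsample.txt", the image "intro.png" would then be expected at image_dir_for("pmlsample.txt") / "intro.png".
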
+Images must be in PNG format and cannot be filtered or interlaced. Image depth
+must be 8 bits or less. Any color table may be used for color images.
+
+Image files must be less than or equal to 65505 bytes in size, since they are
+embedded into the .pdb format of the book; Palm database records are limited to
+65505 bytes in length. Since images are compressed, the actual image displayed
+by the reader may be much larger than 64K.
+
+Any or all of these restrictions may eventually be removed.
+
+
+Adding a Title, Cover Art, and Other Meta-information to Your eBook
+-------------------------------------------------------------------
+
+DropBook normally presents a dialog in which the title and other information
+for the eBook may be specified. This information may be embedded in the PML
+file instead.
+
+To specify the eBook title as it will appear in the Open dialog on the
+handheld, place a block of invisible comment text at the beginning of the file
+using \v tags. Inside this comment block, put the string TITLE="My eBook",
+where "My eBook" is replaced with the name of your eBook. It should look
+something like this:
+
+\vTITLE="Palm Sample Document"\v
+
+You can also specify the author using the AUTHOR meta-tag, the publisher with
+PUBLISHER, copyright information with COPYRIGHT, and the eBook ISBN with EISBN.
+A fully-specified set of meta-information might appear in PML as:
+
+\vTITLE="Palm Sample Document" AUTHOR="Sam Morgenstern" PUBLISHER="eReader.com"
+EISBN="X-XXXX-XXXX" COPYRIGHT="Copyright \a169 2004 by Sam Morgenstern"\v
+
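A tool that reads PML could pick these meta-tags out of the \v comment blocks roughly as follows. This is a sketch rather than DropBook's actual behaviour: the function name is hypothetical, and it assumes values are double-quoted and contain no embedded double quotes.

```python
import re

_META = re.compile(r'\b(TITLE|AUTHOR|PUBLISHER|COPYRIGHT|EISBN)="([^"]*)"')
_COMMENT = re.compile(r"\\v(.*?)\\v", re.S)  # text between paired \v tags


def read_meta(pml: str) -> dict:
    """Collect meta-information found inside \\v ... \\v comment blocks."""
    meta = {}
    for block in _COMMENT.findall(pml):
        for key, value in _META.findall(block):
            meta.setdefault(key, value)  # keep the first occurrence
    return meta
```

Values are returned as written, so an escape such as \a169 in the COPYRIGHT value still has to be expanded separately.
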
+Cover art: If an image named "cover.png" is present in the eBook, it is assumed
+to be the cover art for the eBook. See the rules for images for sizing and
+other information.
+
+Some or all of this information may appear in the book information dialog in
+eReader, and may be used for other purposes in future products.
+
+
+Creating Dictionaries
+---------------------
+
+The \I PML tag is used to delimit an index item. Example: \Iaardvark\I
+
+Each entry must start in the normal font. If DropBook shows an error beginning
+with "No styles permitted before...", there is probably a missing end style tag
+before the text shown in the error message.
+
+Links, chapters and other PML structures are not permitted in dictionaries.
+Images, however, are.
+
+A special dictionary entry, "(Front matter)" is shown before other entries in
+the list of entries, and should be used to include pronunciation symbols and
+other front matter.
+
+Note that use of dictionaries requires eReader Pro.
+
+
+Tips and Pitfalls
+-----------------
+
+This page explains some common mistakes, some bugs in DropBook and/or the
+eReader, and some techniques that will allow you to create quality electronic
+books for the eReader.
+
+ * Check out the Converting to Palm eBooks page for some pointers on
+ converting text from various formats into the Palm Markup Language.
+ * Use a return at the end of each paragraph, not each line.
+ * An extra return between paragraphs is easier to read than paragraph
+ indentation.
+ * The eReader doesn't display empty lines at the top of a page. If you need
+ to have some "empty" lines at the top of a page, put a space on each line.
+ * Don't use tables if you can possibly avoid it.
+
+ None of the fonts that the eReader supports are monospaced, so tables can
+ be difficult to represent. Break out the information in another way, or
+ use the \T tag, but beware of tables that look great on a Palm OS
+ handheld but not on a Pocket PC or vice versa.
+
+ * The Reader breaks lines on spaces, dashes or underscores. This has
+ several implications.
+
+ 1. Don't fill more than a line with spaces, dashes or underscores.
+ There's a bug (which will be fixed in a future release) which
+ causes MakeBook to hang on such a line. Note that in the large
+ font, the number of spaces, dashes or underscores will be much
+ smaller than in the small font.
+ 2. A string such as He shouted "Wait!--" may place the last quote on
+ the beginning of a line, since the line would break after the
+ second dash. Prevent this by using the PML string: He shouted
+ "Wait!\a150\a150". The non-breaking dash, code 150, will not break
+ a line. Use \a160 for a non-breaking space. Even better: use \a151,
+ a long dash, instead of two short dashes.
+
+ * The justification codes \c and \r (center and right justification) must
+ have closing codes on the beginning of the line following the justified
+ text.
+ * The indentation tag \t must have a closing tag at the end of a line of
+ the indented text.
+ * Use \s (small font) in the title page(s) of books to force the page(s) to
+ format nicely. Other than that, \n, \s and \l should rarely be necessary;
+ the font size used for most text display should be chosen by the user.
+
+
+Converting Uncommon Characters to PML
+-------------------------------------
+
+Use this chart to convert uncommon characters to their Palm Markup Language
+(PML) equivalent. Most characters are simply represented as themselves in PML
+and don't require this chart. But some uncommon characters can only be
+represented in PML by their "\aXXX" syntax. Use this chart to look up that
+"\aXXX" syntax.
+
+For example, if you wanted to write the following phrase in PML:
+
+ Copyright © 1999 by Samuel Morgenstern
+
+In PML, you would write it as:
+
+ Copyright \a169 1999 by Samuel Morgenstern
+
+Char HTML # Code HTML Char Code PML Char Code Description
+
+ - Normal space
+! ! - ! Exclamation
+" " " " Double quote
+# # - # Hash
+$ $ - $ Dollar
+% % - % Percent
+& & & & Ampersand
+' ' - ' Apostrophe
+( ( - ( Open bracket
+) ) - ) Close bracket
+* * - * Asterisk
++ + - + Plus sign
+, , - , Comma
+- - - - Minus sign
+. . - . Period
+/ / - / Forward slash
+0 0 - 0 Digit 0
+1 1 - 1 Digit 1
+2 2 - 2 Digit 2
+3 3 - 3 Digit 3
+4 4 - 4 Digit 4
+5 5 - 5 Digit 5
+6 6 - 6 Digit 6
+7 7 - 7 Digit 7
+8 8 - 8 Digit 8
+9 9 - 9 Digit 9
+: : - : Colon
+; ; - ; Semicolon
+< < < < Less than
+= = - = Equals
+> > > > Greater than
+? ? - ? Question mark
+@ @ - @ At sign
+A A - A A
+B B - B B
+C C - C C
+D D - D D
+E E - E E
+F F - F F
+G G - G G
+H H - H H
+I I - I I
+J J - J J
+K K - K K
+L L - L L
+M M - M M
+N N - N N
+O O - O O
+P P - P P
+Q Q - Q Q
+R R - R R
+S S - S S
+T T - T T
+U U - U U
+V V - V V
+W W - W W
+X X - X X
+Y Y - Y Y
+Z Z - Z Z
+[ [ - [ Open square bracket
+\ \ - \\ Backslash
+] ] - ] Close square bracket
+^ ^ - ^ Caret
+_ _ - _ Underscore
+` ` - ` Grave accent
+a a - a a
+b b - b b
+c c - c c
+d d - d d
+e e - e e
+f f - f f
+g g - g g
+h h - h h
+i i - i i
+j j - j j
+k k - k k
+l l - l l
+m m - m m
+n n - n n
+o o - o o
+p p - p p
+q q - q q
+r r - r r
+s s - s s
+t t - t t
+u u - u u
+v v - v v
+w w - w w
+x x - x x
+y y - y y
+z z - z z
+{ { - { Left brace
+| | - | Vertical bar
+} } - } Right brace
+~ ~ - ~ Tilde
+
+ \a160 Non-breaking space
+ ¡ ¡ \a161 Inverted exclamation
+ ¢ ¢ \a162 Cent sign
+ £ £ \a163 Pound sign
+ ¤ ¤ \a164 Currency sign
+ ¥ ¥ \a165 Yen sign
+ ¦ ¦ \a166 Broken bar
+ § § \a167 Section sign
+ ¨ ¨ \a168 Umlaut or diaeresis
+ © © \a169 Copyright sign
+ ª ª \a170 Feminine ordinal
+ « « \a171 Left angle quotes
+ ¬ ¬ \a172 Logical not sign
+ \a173 Soft hyphen
+ ® ® \a174 Registered trademark
+ ¯ ¯ \a175 Spacing macron
+ ° ° \a176 Degree sign
+ ± ± \a177 Plus-minus sign
+ ² ² \a178 Superscript 2
+ ³ ³ \a179 Superscript 3
+ ´ ´ \a180 Spacing acute
+ µ µ \a181 Micro sign
+ ¶ ¶ \a182 Paragraph sign
+ · · \a183 Middle dot
+ ¸ ¸ \a184 Spacing cedilla
+ ¹ ¹ \a185 Superscript 1
+ º º \a186 Masculine ordinal
+ » » \a187 Right angle quotes
+ ¼ ¼ \a188 One quarter
+ ½ ½ \a189 One half
+ ¾ ¾ \a190 Three quarters
+ ¿ ¿ \a191 Inverted question mark
+ À À \a192 A grave
+ Á Á \a193 A acute
+ Â Â \a194 A circumflex
+ Ã Ã \a195 A tilde
+ Ä Ä \a196 A diaeresis
+ Å Å \a197 A ring
+ Æ &AElig; \a198 AE ligature
+ Ç Ç \a199 C cedilla
+ È È \a200 E grave
+ É É \a201 E acute
+ Ê Ê \a202 E circumflex
+ Ë Ë \a203 E diaeresis
+ Ì Ì \a204 I grave
+ Í Í \a205 I acute
+ Î Î \a206 I circumflex
+ Ï Ï \a207 I diaeresis
+ Ð Ð \a208 Eth
+ Ñ Ñ \a209 N tilde
+ Ò Ò \a210 O grave
+ Ó Ó \a211 O acute
+ Ô Ô \a212 O circumflex
+ Õ Õ \a213 O tilde
+ Ö Ö \a214 O diaeresis
+ × × \a215 Multiplication sign
+ Ø Ø \a216 O slash
+ Ù Ù \a217 U grave
+ Ú Ú \a218 U acute
+ Û Û \a219 U circumflex
+ Ü Ü \a220 U diaeresis
+ Ý Ý \a221 Y acute
+ Þ Þ \a222 THORN
+ ß ß \a223 sharp s
+ à à \a224 a grave
+ á á \a225 a acute
+ â â \a226 a circumflex
+ ã ã \a227 a tilde
+ ä ä \a228 a diaeresis
+ å å \a229 a ring
+ æ æ \a230 ae ligature
+ ç ç \a231 c cedilla
+ è è \a232 e grave
+ é é \a233 e acute
+ ê ê \a234 e circumflex
+ ë ë \a235 e diaeresis
+ ì ì \a236 i grave
+ í í \a237 i acute
+ î î \a238 i circumflex
+ ï ï \a239 i diaeresis
+ ð ð \a240 eth
+ ñ ñ \a241 n tilde
+ ò ò \a242 o grave
+ ó ó \a243 o acute
+ ô ô \a244 o circumflex
+ õ õ \a245 o tilde
+ ö ö \a246 o diaeresis
+ ÷ ÷ \a247 division sign
+ ø ø \a248 o slash
+ ù ù \a249 u grave
+ ú ú \a250 u acute
+ û û \a251 u circumflex
+ ü ü \a252 u diaeresis
+ ý ý \a253 y acute
+ þ þ \a254 thorn
+ ÿ ÿ \a255 y diaeresis
+, ‚ ‚ \a130 single low quote
+ ƒ ƒ \a131 Scripted f
+ „ „ \a132 low quote
+ … … \a133 Ellipsis
+ † † \a134 Dagger
+ ‡ &Dagger; \a135 Double dagger
+ Š Š \a138 Large S w/inverted caret
+< ‹ ‹ \a139 single left angle quote
+ Œ Œ \a140 Large combined oe
+ ‘ ‘ \a145 Open single smart quote
+ ’ ’ \a146 Close single smart quote
+ “ “ \a147 Open double smart quote
+ ” ” \a148 Close double smart quote
+ • • \a149 Bullet
+ – – \a150 Small dash (en dash)
+ — — \a151 Large dash (em dash)
+ ™ ™ \a153 Trademark
+ š š \a154 Small S w/inverted caret
+> › › \a155 single right angle quote
+ œ œ \a156 Small combined oe
+ Ÿ Ÿ \a159 Large Y with diaeresis
+
+
+Extended Character Set
+----------------------
+
+In addition to the special characters supported by earlier versions of eReader
+(which can be accessed using the \a### tag), all versions of eReader Pro and
+eReader version 2.4 and later include support for additional special characters
+and symbols. These symbols can be accessed using the \U#### tag, where #### are
+four hexadecimal digits giving the Unicode code point of the special character.
+
+Only the limited subset of Unicode characters given in the table below is
+supported. In addition, some of the characters that are included in the table
+are not present in eReader Pro versions prior to 2.4. To ensure that the
+characters are displayed correctly, books using these tags should be read using
+eReader or eReader Pro version 2.4 or later.
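+
+As a rough illustration of how a converter might expand these two tags, here
+is a minimal Python sketch. The function name expand_pml_chars is
+hypothetical, and the sketch assumes that \a### codes are Windows-1252 byte
+values, which matches the table above (for example \a151 is the em dash) but
+is not stated by the format itself. It does not check the supported subset.
+
```python
import re

# Matches an escaped backslash, \a### (three decimal digits) or
# \U#### (four hexadecimal digits).
_PML_CHAR = re.compile(r"\\\\|\\a(\d{3})|\\U([0-9A-Fa-f]{4})")


def expand_pml_chars(text):
    """Expand PML character tags into the characters they name."""

    def _sub(match):
        decimal, hexa = match.group(1), match.group(2)
        if decimal is not None:
            code = int(decimal)
            if code > 255:
                return match.group(0)  # not a single byte: leave untouched
            # Assumption: \a codes are Windows-1252 byte values.
            return bytes([code]).decode("cp1252", errors="replace")
        if hexa is not None:
            return chr(int(hexa, 16))
        return match.group(0)  # an escaped backslash: leave untouched

    return _PML_CHAR.sub(_sub, text)
```
+
+Matching the escaped backslash first keeps a literal backslash followed by
+"a233" from being mistaken for a character tag.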
+
+On Palm OS handhelds these special symbols are only available in one size,
+matching the "Small" font. For best results on Palm OS handhelds the \U tag
+should only be used inside blocks set to the "Small" font by way of \s tags.
+On Palm OS handhelds these special characters are not affected by the font tags
+(\s, \l, \b and \n), the bold style tag (\B), or the small caps style tag (\k).
+
+If the \U characters do not display correctly in eReader on a Windows desktop
+or laptop, the eReader fonts were probably not installed properly. To fix this,
+open the directory C:\Windows\Fonts\ and double-click each font whose name
+starts with "Maynard". This opens each font and lets the system register it.
+Close the windows opened by the double-clicks and the problem should be
+resolved.
+
+Char HTML Code PML Code Description
+
+Latin Extended-A
+Ā Ā \U0100 LATIN CAPITAL LETTER A WITH MACRON
+ā ā \U0101 LATIN SMALL LETTER A WITH MACRON
+Ă Ă \U0102 LATIN CAPITAL LETTER A WITH BREVE
+ă ă \U0103 LATIN SMALL LETTER A WITH BREVE
+ą ą \U0105 LATIN SMALL LETTER A WITH OGONEK
+ć ć \U0107 LATIN SMALL LETTER C WITH ACUTE
+Č Č \U010C LATIN CAPITAL LETTER C WITH CARON
+č č \U010D LATIN SMALL LETTER C WITH CARON
+Ē Ē \U0112 LATIN CAPITAL LETTER E WITH MACRON
+ē ē \U0113 LATIN SMALL LETTER E WITH MACRON
+ĕ ĕ \U0115 LATIN SMALL LETTER E WITH BREVE
+ė ė \U0117 LATIN SMALL LETTER E WITH DOT ABOVE
+ę ę \U0119 LATIN SMALL LETTER E WITH OGONEK
+ě ě \U011B LATIN SMALL LETTER E WITH CARON
+ĝ ĝ \U011D LATIN SMALL LETTER G WITH CIRCUMFLEX
+ğ ğ \U011F LATIN SMALL LETTER G WITH BREVE
+Ī Ī \U012A LATIN CAPITAL LETTER I WITH MACRON
+ī ī \U012B LATIN SMALL LETTER I WITH MACRON
+ĭ ĭ \U012D LATIN SMALL LETTER I WITH BREVE
+į į \U012F LATIN SMALL LETTER I WITH OGONEK
+ı ı \U0131 LATIN SMALL LETTER DOTLESS I
+Ł Ł \U0141 LATIN CAPITAL LETTER L WITH STROKE
+ł ł \U0142 LATIN SMALL LETTER L WITH STROKE
+ń ń \U0144 LATIN SMALL LETTER N WITH ACUTE
+ň ň \U0148 LATIN SMALL LETTER N WITH CARON
+ŋ ŋ \U014B LATIN SMALL LETTER ENG
+Ō Ō \U014C LATIN CAPITAL LETTER O WITH MACRON
+ō ō \U014D LATIN SMALL LETTER O WITH MACRON
+ŏ ŏ \U014F LATIN SMALL LETTER O WITH BREVE
+ő ő \U0151 LATIN SMALL LETTER O WITH DOUBLE ACUTE
+ŕ ŕ \U0155 LATIN SMALL LETTER R WITH ACUTE
+ř ř \U0159 LATIN SMALL LETTER R WITH CARON
+Ś Ś \U015A LATIN CAPITAL LETTER S WITH ACUTE
+ś ś \U015B LATIN SMALL LETTER S WITH ACUTE
+ş ş \U015F LATIN SMALL LETTER S WITH CEDILLA
+ţ ţ \U0163 LATIN SMALL LETTER T WITH CEDILLA
+ũ ũ \U0169 LATIN SMALL LETTER U WITH TILDE
+ū ū \U016B LATIN SMALL LETTER U WITH MACRON
+ŭ ŭ \U016D LATIN SMALL LETTER U WITH BREVE
+ŷ ŷ \U0177 LATIN SMALL LETTER Y WITH CIRCUMFLEX
+ź ź \U017A LATIN SMALL LETTER Z WITH ACUTE
+Ž Ž \U017D LATIN CAPITAL LETTER Z WITH CARON
+ž ž \U017E LATIN SMALL LETTER Z WITH CARON
+Latin Extended-B
+ ƿ \U01BF LATIN LETTER WYNN
+ ǎ \U01CE LATIN SMALL LETTER A WITH CARON
+ ǐ \U01D0 LATIN SMALL LETTER I WITH CARON
+ ǒ \U01D2 LATIN SMALL LETTER O WITH CARON
+ ǔ \U01D4 LATIN SMALL LETTER U WITH CARON
+ ǡ \U01E1 LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
+ ǣ \U01E3 LATIN SMALL LETTER AE WITH MACRON
+ ǧ \U01E7 LATIN SMALL LETTER G WITH CARON
+ ǫ \U01EB LATIN SMALL LETTER O WITH OGONEK
+ ǰ \U01F0 LATIN SMALL LETTER J WITH CARON
+ ȇ \U0207 LATIN SMALL LETTER E WITH INVERTED BREVE
+ ȝ \U021D LATIN SMALL LETTER YOGH
+ ȧ \U0227 LATIN SMALL LETTER A WITH DOT ABOVE
+ ȯ \U022F LATIN SMALL LETTER O WITH DOT ABOVE
+ ȳ \U0233 LATIN SMALL LETTER Y WITH MACRON
+IPA Extensions
+ ɑ \U0251 LATIN SMALL LETTER SCRIPT A
+ ɒ \U0252 LATIN SMALL LETTER TURNED SCRIPT A
+ ɔ \U0254 LATIN SMALL LETTER OPEN O
+ ə \U0259 LATIN SMALL LETTER SCHWA
+ ɜ \U025C LATIN SMALL LETTER REVERSED OPEN E
+ ɥ \U0265 LATIN SMALL LETTER TURNED H
+ ɪ \U026A LATIN LETTER SMALL CAPITAL I
+ ɲ \U0272 LATIN SMALL LETTER N WITH LEFT HOOK
+ ʃ \U0283 LATIN SMALL LETTER ESH
+ ʉ \U0289 LATIN SMALL LETTER U BAR
+ ʊ \U028A LATIN SMALL LETTER UPSILON
+ ʌ \U028C LATIN SMALL LETTER TURNED V
+ ʏ \U028F LATIN LETTER SMALL CAPITAL Y
+ ʒ \U0292 LATIN SMALL LETTER EZH
+ ʔ \U0294 LATIN LETTER GLOTTAL STOP
+ ʜ \U029C LATIN LETTER SMALL CAPITAL H
+Spacing Modifier Letters
+ ʾ \U02BE MODIFIER LETTER RIGHT HALF RING
+ ʿ \U02BF MODIFIER LETTER LEFT HALF RING
+ˇ ˇ \U02C7 CARON
+ ˈ \U02C8 MODIFIER LETTER VERTICAL LINE
+ ˌ \U02CC MODIFIER LETTER LOW VERTICAL LINE
+ ː \U02D0 MODIFIER LETTER TRIANGULAR COLON
+˘ ˘ \U02D8 BREVE
+˙ ˙ \U02D9 DOT ABOVE
+Greek and Coptic
+Α Α \U0391 GREEK CAPITAL LETTER ALPHA
+Β Β \U0392 GREEK CAPITAL LETTER BETA
+Γ Γ \U0393 GREEK CAPITAL LETTER GAMMA
+Δ Δ \U0394 GREEK CAPITAL LETTER DELTA
+Ε Ε \U0395 GREEK CAPITAL LETTER EPSILON
+Ζ Ζ \U0396 GREEK CAPITAL LETTER ZETA
+Η Η \U0397 GREEK CAPITAL LETTER ETA
+Θ Θ \U0398 GREEK CAPITAL LETTER THETA
+Ι Ι \U0399 GREEK CAPITAL LETTER IOTA
+Κ Κ \U039A GREEK CAPITAL LETTER KAPPA
+Λ Λ \U039B GREEK CAPITAL LETTER LAMBDA
+Μ Μ \U039C GREEK CAPITAL LETTER MU
+Ν Ν \U039D GREEK CAPITAL LETTER NU
+Ξ Ξ \U039E GREEK CAPITAL LETTER XI
+Ο Ο \U039F GREEK CAPITAL LETTER OMICRON
+Π Π \U03A0 GREEK CAPITAL LETTER PI
+Ρ Ρ \U03A1 GREEK CAPITAL LETTER RHO
+Σ Σ \U03A3 GREEK CAPITAL LETTER SIGMA
+Τ Τ \U03A4 GREEK CAPITAL LETTER TAU
+Υ Υ \U03A5 GREEK CAPITAL LETTER UPSILON
+Φ Φ \U03A6 GREEK CAPITAL LETTER PHI
+Χ Χ \U03A7 GREEK CAPITAL LETTER CHI
+Ψ Ψ \U03A8 GREEK CAPITAL LETTER PSI
+Ω Ω \U03A9 GREEK CAPITAL LETTER OMEGA
+α α \U03B1 GREEK SMALL LETTER ALPHA
+β β \U03B2 GREEK SMALL LETTER BETA
+γ γ \U03B3 GREEK SMALL LETTER GAMMA
+δ δ \U03B4 GREEK SMALL LETTER DELTA
+ε ε \U03B5 GREEK SMALL LETTER EPSILON
+ζ ζ \U03B6 GREEK SMALL LETTER ZETA
+η η \U03B7 GREEK SMALL LETTER ETA
+θ θ \U03B8 GREEK SMALL LETTER THETA
+ι ι \U03B9 GREEK SMALL LETTER IOTA
+κ κ \U03BA GREEK SMALL LETTER KAPPA
+λ λ \U03BB GREEK SMALL LETTER LAMBDA
+μ μ \U03BC GREEK SMALL LETTER MU
+ν ν \U03BD GREEK SMALL LETTER NU
+ξ ξ \U03BE GREEK SMALL LETTER XI
+ο ο \U03BF GREEK SMALL LETTER OMICRON
+π π \U03C0 GREEK SMALL LETTER PI
+ρ ρ \U03C1 GREEK SMALL LETTER RHO
+ς ς \U03C2 GREEK SMALL LETTER FINAL SIGMA
+σ σ \U03C3 GREEK SMALL LETTER SIGMA
+τ τ \U03C4 GREEK SMALL LETTER TAU
+υ υ \U03C5 GREEK SMALL LETTER UPSILON
+φ φ \U03C6 GREEK SMALL LETTER PHI
+χ χ \U03C7 GREEK SMALL LETTER CHI
+ψ ψ \U03C8 GREEK SMALL LETTER PSI
+ω ω \U03C9 GREEK SMALL LETTER OMEGA
+ ϑ \U03D1 GREEK THETA SYMBOL
+ ϝ \U03DD GREEK SMALL LETTER DIGAMMA
+Hebrew
+א א \U05D0 HEBREW LETTER ALEPH
+ב ב \U05D1 HEBREW LETTER BET
+ג ג \U05D2 HEBREW LETTER GIMEL
+ד ד \U05D3 HEBREW LETTER DALET
+ה ה \U05D4 HEBREW LETTER HE
+ו ו \U05D5 HEBREW LETTER VAV
+ז ז \U05D6 HEBREW LETTER ZAYIN
+ח ח \U05D7 HEBREW LETTER HET
+ט ט \U05D8 HEBREW LETTER TET
+י י \U05D9 HEBREW LETTER YOD
+ך ך \U05DA HEBREW LETTER FINAL KAF
+כ כ \U05DB HEBREW LETTER KAF
+ל ל \U05DC HEBREW LETTER LAMED
+ם ם \U05DD HEBREW LETTER FINAL MEM
+מ מ \U05DE HEBREW LETTER MEM
+ן ן \U05DF HEBREW LETTER FINAL NUN
+נ נ \U05E0 HEBREW LETTER NUN
+ס ס \U05E1 HEBREW LETTER SAMEKH
+ע ע \U05E2 HEBREW LETTER AYIN
+ף ף \U05E3 HEBREW LETTER FINAL PE
+פ פ \U05E4 HEBREW LETTER PE
+ץ ץ \U05E5 HEBREW LETTER FINAL TSADI
+צ צ \U05E6 HEBREW LETTER TSADI
+ק ק \U05E7 HEBREW LETTER QOF
+ר ר \U05E8 HEBREW LETTER RESH
+ת ת \U05EA HEBREW LETTER TAV
+Latin Extended Additional
+ ḋ \U1E0B LATIN SMALL LETTER D WITH DOT ABOVE
+ ḍ \U1E0D LATIN SMALL LETTER D WITH DOT BELOW
+ ḗ \U1E17 LATIN SMALL LETTER E WITH MACRON AND ACUTE
+ Ḣ \U1E22 LATIN CAPITAL LETTER H WITH DOT ABOVE
+ Ḥ \U1E24 LATIN CAPITAL LETTER H WITH DOT BELOW
+ ḥ \U1E25 LATIN SMALL LETTER H WITH DOT BELOW
+ ḫ \U1E2B LATIN SMALL LETTER H WITH BREVE BELOW
+ ḳ \U1E33 LATIN SMALL LETTER K WITH DOT BELOW
+ ḷ \U1E37 LATIN SMALL LETTER L WITH DOT BELOW
+ ṁ \U1E41 LATIN SMALL LETTER M WITH DOT ABOVE
+ ṃ \U1E43 LATIN SMALL LETTER M WITH DOT BELOW
+ ṅ \U1E45 LATIN SMALL LETTER N WITH DOT ABOVE
+ ṇ \U1E47 LATIN SMALL LETTER N WITH DOT BELOW
+ ṓ \U1E53 LATIN SMALL LETTER O WITH MACRON AND ACUTE
+ ṙ \U1E59 LATIN SMALL LETTER R WITH DOT ABOVE
+ Ṛ \U1E5A LATIN CAPITAL LETTER R WITH DOT BELOW
+ ṛ \U1E5B LATIN SMALL LETTER R WITH DOT BELOW
+ ṡ \U1E61 LATIN SMALL LETTER S WITH DOT ABOVE
+ ṣ \U1E63 LATIN SMALL LETTER S WITH DOT BELOW
+ ṫ \U1E6B LATIN SMALL LETTER T WITH DOT ABOVE
+ ṭ \U1E6D LATIN SMALL LETTER T WITH DOT BELOW
+ ṯ \U1E6F LATIN SMALL LETTER T WITH LINE BELOW
+ ẑ \U1E91 LATIN SMALL LETTER Z WITH CIRCUMFLEX
+ ẓ \U1E93 LATIN SMALL LETTER Z WITH DOT BELOW
+ ẖ \U1E96 LATIN SMALL LETTER H WITH LINE BELOW
+ ạ \U1EA1 LATIN SMALL LETTER A WITH DOT BELOW
+ ọ \U1ECD LATIN SMALL LETTER O WITH DOT BELOW
+ ỹ \U1EF9 LATIN SMALL LETTER Y WITH TILDE
+General Punctuation
+- ‑ \U2011 NON-BREAKING HYPHEN
+ ‸ \U2038 CARET
+ ‽ \U203D INTERROBANG
+ ⁂ \U2042 ASTERISM
+Arrows
+← ← \U2190 LEFTWARDS ARROW
+→ → \U2192 RIGHTWARDS ARROW
+Mathematical Operators
+∂ ∂ \U2202 PARTIAL DIFFERENTIAL
+√ √ \U221A SQUARE ROOT
+∞ ∞ \U221E INFINITY
+∥ ∥ \U2225 PARALLEL TO
+∫ ∫ \U222B INTEGRAL
+≠ ≠ \U2260 NOT EQUAL TO
+ ⊔ \U2294 SQUARE CUP
+ ⊕ \U2295 CIRCLED PLUS
+ ⋮ \U22EE VERTICAL ELLIPSIS
+Enclosed Alphanumerics
+ Ⓤ \U24CA CIRCLED LATIN CAPITAL LETTER U
+Miscellaneous Symbols
+☜ ☜ \U261C WHITE LEFT POINTING INDEX
+☞ ☞ \U261E WHITE RIGHT POINTING INDEX
+ ☿ \U263F MERCURY
+ ♀ \U2640 FEMALE SIGN
+ ♂ \U2642 MALE SIGN
+ ♃ \U2643 JUPITER
+ ♄ \U2644 SATURN
+ ♅ \U2645 URANUS
+ ♆ \U2646 NEPTUNE
+ ♇ \U2647 PLUTO
+ ♠ \U2660 BLACK SPADE SUIT
+ ♡ \U2661 WHITE HEART SUIT
+ ♢ \U2662 WHITE DIAMOND SUIT
+ ♣ \U2663 BLACK CLUB SUIT
+ ♭ \U266D MUSIC FLAT SIGN
+ ♮ \U266E MUSIC NATURAL SIGN
+ ♯ \U266F MUSIC SHARP SIGN
+Dingbats
+ ✓ \U2713 CHECK MARK
+ ✠ \U2720 MALTESE CROSS
+Private Use Area
+ - \UE000 LATIN SMALL LETTER A WITH MACRON AND ACUTE
+ - \UE001 LATIN SMALL LETTER A WITH MACRON AND TILDE
+ - \UE002 LATIN SMALL LETTER A WITH VERTICAL LINE ABOVE
+ - \UE003 LATIN CAPITAL LETTER C WITH MACRON
+ - \UE004 LATIN SMALL LETTER C WITH MACRON
+ - \UE005 LATIN SMALL LETTER C WITH BREVE
+ - \UE006 LATIN SMALL LETTER C WITH DOT BELOW
+ - \UE007 LATIN SMALL LIGATURE CH
+ - \UE008 LATIN CAPITAL LETTER D WITH MACRON
+ - \UE009 LATIN SMALL LETTER E WITH BAR BELOW
+ - \UE00A LATIN SMALL LETTER E WITH TILDE
+ - \UE00B LATIN SMALL LETTER E WITH MACRON AND BREVE
+ - \UE00C LATIN SMALL LETTER E WITH TILDE AND DOT ABOVE
+ - \UE00D LATIN SMALL LETTER E WITH HOOK RIGHT BELOW
+ - \UE00E LATIN SMALL LETTER G WITH INVERTED BREVE
+ - \UE00F LATIN SMALL LETTER I WITH INVERTED BREVE BELOW
+ - \UE010 LATIN SMALL LETTER I WITH MACRON AND ACUTE
+ - \UE011 LATIN SMALL LETTER K WITH CIRCUMFLEX
+ - \UE012 LATIN SMALL LETTER K WITH BREVE
+ - \UE013 LATIN SMALL LETTER K WITH INVERTED BREVE
+ - \UE014 LATIN SMALL LIGATURE KH
+ - \UE015 LATIN CAPITAL LETTER L WITH MACRON
+ - \UE016 LATIN SMALL LETTER L WITH TILDE
+ - \UE017 LATIN SMALL LETTER L WITH INVERTED BREVE
+ - \UE018 LATIN CAPITAL LETTER M WITH MACRON
+ - \UE019 LATIN SMALL LETTER M WITH MACRON
+ - \UE01A LATIN SMALL LETTER M WITH TILDE
+ - \UE01B LATIN SMALL LETTER O WITH CEDILLA
+ - \UE01C LATIN SMALL LETTER O WITH MACRON AND CIRCUMFLEX
+ - \UE01E LATIN SMALL LIGATURE OI
+ - \UE01F LATIN SMALL LIGATURE OO
+ - \UE020 LATIN SMALL LIGATURE OO WITH MACRON
+ - \UE021 LATIN SMALL LIGATURE OU
+ - \UE022 LATIN SMALL LETTER OPEN O WITH ACUTE
+ - \UE023 LATIN SMALL LETTER R WITH DIAERESIS
+ - \UE024 LATIN SMALL LETTER R WITH CIRCUMFLEX
+ - \UE025 LATIN SMALL LETTER R WITH RING BELOW
+ - \UE026 LATIN SMALL LETTER S WITH VERTICAL LINE ABOVE
+ - \UE027 LATIN SMALL LETTER S WITH OGONEK
+ - \UE028 LATIN SMALL LETTER S WITH COMMA
+ - \UE02A LATIN SMALL LETTER S WITH BREVE
+ - \UE02B LATIN SMALL LIGATURE SH
+ - \UE02C LATIN SMALL LIGATURE TH
+ - \UE02D LATIN SMALL LETTER U WITH MACRON AND ACUTE
+ - \UE02E LATIN CAPITAL LETTER V WITH MACRON
+ - \UE02F LATIN CAPITAL LETTER X WITH MACRON
+ - \UE030 LATIN SMALL LETTER X WITH CIRCUMFLEX
+ - \UE031 LATIN SMALL LETTER Y WITH BREVE
+ - \UE032 LATIN SMALL LIGATURE ZH
+ - \UE033 LATIN SMALL LETTER TURNED E WITH ACUTE
+ - \UE034 LATIN SMALL LETTER TURNED E WITH CIRCUMFLEX
+ - \UE035 GREEK SMALL LETTER ALPHA WITH GRAVE
+ - \UE036 MUSICAL SYMBOL SEGNO
+ - \UE037 MUSICAL SYMBOL FERMATA
+ - \UE038 MUSICAL SYMBOL CRESCENDO
+ - \UE039 MUSICAL SYMBOL DECRESCENDO
+ - \UE03A MUSICAL SYMBOL DOUBLE SHARP
+ - \UE03B MUSICAL SYMBOL BREVE
+ - \UE03C MUSICAL SYMBOL DOWN BOW
+ - \UE03D MUSICAL SYMBOL UP BOW
+ - \UE03E MUSICAL SYMBOL BREVE ALTERNATE
+ - \UE03F PRINTING SYMBOL DELE
+ - \UE040 PRINTING SYMBOL FRACTIONAL EM
+ - \UE041 INVERTED ASTERISM
+ - \UE042 LATIN SMALL LETTER SCHWA SUPERSCRIPT
+ - \UE043 LATIN SMALL LETTER TURNED Y
+ - \UE044 LATIN SMALL LIGATURE OE WITH MACRON
+ - \UE045 SQUARE ROOT WITH BAR
+ - \UE046 LATIN SMALL LETTER U WITH DOT ABOVE
+ - \UE047 LATIN SMALL LIGATURE UE
+ - \UE048 LATIN SMALL LIGATURE UE WITH MACRON
+ - \UE049 LATIN SMALL LETTER OPEN O WITH TILDE
+ - \UE04A LATIN SMALL LETTER T WITH CARON BELOW
+ - \UE04B LATIN SMALL LETTER SCRIPT A WITH TILDE
+ - \UE04C GREEK SMALL LETTER EPSILON WITH TILDE
+ - \UE04D LATIN SMALL LIGATURE OE WITH TILDE
+ - \UE04E MODIFIER LETTER DOUBLE VERTICAL LINE
+ - \UE04F DOUBLE HYPHEN
+ - \UE050 LATIN SMALL LETTER SCHWA WITH DOT ABOVE
+ - \UE051 LATIN SMALL LETTER SCHWA WITH MACRON
+Alphabetic Presentation Forms
+fl fl \UFB02 LATIN SMALL LIGATURE FL
+שׁ שׁ \UFB2A HEBREW LETTER SHIN WITH SHIN DOT
+שׂ שׂ \UFB2B HEBREW LETTER SHIN WITH SIN DOT
+
diff --git a/format_docs/pdb/ztxt.txt b/format_docs/pdb/ztxt.txt
new file mode 100644
index 0000000000..98fb6bae3e
--- /dev/null
+++ b/format_docs/pdb/ztxt.txt
@@ -0,0 +1,226 @@
+The zTXT Format
+---------------
+
+The zTXT format is relatively straightforward. The simplest zTXT contains a
+Palm database header, followed by zTXT record #0, followed by the compressed
+data. The compressed data can be in one of two formats: one long data stream,
+or split into chunks for random access. If there are any bookmarks, they occupy
+the record immediately after the compressed data. If there are any annotations,
+the annotation index occupies the record immediately after the bookmarks with
+each annotation in the index having a record immediately after the annotation
+index. Here are diagrams of a simple zTXT and a full-featured zTXT:
+
+ DB Header
+0 Record 0
+1
+2
+3
+... Compressed Data
+36
+37
+38
+
+ DB Header
+0 Record 0
+1
+2
+3
+... Compressed Data
+36
+37
+38
+39 Bookmarks
+40 Annotation Index
+41 Annotation 1
+42 Annotation 2
+43 Annotation 3
+
+
+Compression Modes
+-----------------
+
+zTXT version 1.40 and later supports two modes of compression. Mode 1 is a
+random access mode, and mode 2 consists of one long data stream. Both modes
+work on 8K (the default record size) blocks of text.
+
+Please note, however, that as of Weasel Reader version 1.60 the old-style
+(mode 2) zTXT format is no longer supported. makeztxt and libztxt can still
+create these documents for backwards compatibility, but you should avoid
+mode 2 where possible.
+
+
+Mode 1
+------
+
+In mode 1, 8K blocks of text are compressed into an equal number of blocks of
+compressed data. Using the Z_FULL_FLUSH flush mode with zLib allows for random
+access among the blocks of data. In order for this to function, the first block
+must be decompressed first, and after that any block in the file may be
+decompressed in any order. In mode 1, the blocks of compressed data will likely
+not all have the same size.
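+
+The mode 1 scheme can be sketched with Python's zlib module. This is only an
+illustration of the Z_FULL_FLUSH technique described above, not the makeztxt
+or Weasel implementation; the function names compress_mode1 and read_record
+are hypothetical.
+
```python
import zlib


def compress_mode1(text, record_size=8192):
    """Compress text into one record per record_size block (mode 1 style)."""
    compressor = zlib.compressobj()
    records = []
    for start in range(0, len(text), record_size):
        block = text[start:start + record_size]
        # Z_FULL_FLUSH ends the output on a byte boundary and resets the
        # compressor's history, so later records never refer back to it.
        records.append(
            compressor.compress(block) + compressor.flush(zlib.Z_FULL_FLUSH)
        )
    return records


def read_record(records, index):
    """Decompress one record, priming the stream with record 0 first."""
    decompressor = zlib.decompressobj()
    first = decompressor.decompress(records[0])  # consumes the zlib header
    if index == 0:
        return first
    return decompressor.decompress(records[index])
```
+
+Only the first record carries the zlib stream header, which is why it has to
+be fed to the decompressor before any other record can be read.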
+
+
+Mode 2
+------
+
+In zTXT versions before 1.40, this was the only method of compression. This
+mode involves compressing the entire input buffer into a single output buffer
+and then splitting the resulting buffer into 8K segments. This mode requires
+that all of the compressed data be decompressed in one pass. Since there are no
+real 'blocks' of data, the resulting output can be of any blocksize, though
+typically the default of 8K should be fine. The advantage to mode 2 is that it
+will give about 10% - 15% more compression.
+
+
+zTXT Record #0 Definition (version 1.44)
+----------------------------------------
+
+Record 0 provides all of the information about the zTXT contents. Be sure it is
+correct, lest fiery death rain down upon your program.
+
+typedef struct zTXT_record0Type {
+ UInt16 version;
+ UInt16 numRecords;
+ UInt32 size;
+ UInt16 recordSize;
+ UInt16 numBookmarks;
+ UInt16 bookmarkRecord;
+ UInt16 numAnnotations;
+ UInt16 annotationRecord;
+ UInt8 flags;
+ UInt8 reserved;
+ UInt32 crc32;
+ UInt8 padding[0x20 - 24];
+} zTXT_record0;
+
+
+Structure Elements
+------------------
+
+UInt16 version;
+
+This is mostly just informational. Your program can figure out what features
+might be available from the version. However, the remaining parts of the
+structure are designed such that their value will be 0 if that particular
+feature is not present, so that is the correct way to test. The version is
+stored as two 8 bit integers. For example, version 1.42 is 0x012A.
+
+UInt16 numRecords;
+
+This is the number of DATA records only and does not include record 0,
+bookmarks, or annotations. With compression mode 1, this is also the number of
+uncompressed text records. With mode 2, you must decompress the file to figure
+out how many text records there will be.
+
+UInt32 size;
+
+The size in bytes of the uncompressed data in the zTXT. Check this value against
+the amount of free storage memory on the Palm to make sure there's enough room
+to decompress the data in full or in part.
+
+UInt16 recordSize;
+
+recordSize is the size in bytes of a text record. This field is important, as
+the size of text and decompression buffers is based on this value. It is used
+by Weasel to navigate through the text so it can map absolute offsets to record
+numbers. 8192 is the default. With compression mode 1, this is the amount of
+data inside each compressed record (except maybe the last one), but the actual
+compressed records will likely have varying sizes. In mode 2, both compressed
+records and the resulting text records are all of this size (except, again, the
+last record).
+
+UInt16 numBookmarks;
+
+The definitive count of how many bookmarks are stored in the bookmark index
+record. See the section on bookmarks below.
+
+UInt16 bookmarkRecord;
+
+If there are any bookmarks, this is set to the record index number that
+contains the bookmark listing, otherwise it is 0.
+
+UInt16 numAnnotations;
+
+Like the bookmark count, this is the definitive count of how many annotations
+are in the annotation index and how many annotation records follow it. See the
+section on annotations below.
+
+UInt16 annotationRecord;
+
+If there are any annotations, this is set to the record index number that
+contains the annotation index, otherwise it is 0.
+
+UInt8 flags;
+
+These flags indicate various features of the zTXT database. flags is a bitmask
+and at present the only two defined bits are:
+
+ZTXT_RANDOMACCESS (0x01)
+ If the zTXT was compressed according to the method in mode 1, then it
+ supports random access and this should be set.
+ZTXT_NONUNIFORM (0x02)
+ Setting this bit indicates that the text records within the zTXT database
+ are not of uniform length. That is, when the blocks of text are
+ decompressed they will not have identical block sizes. If this is not set,
+ the compressed blocks are assumed to all have the same size when
+ decompressed (typically 8K) except for the last block which can be smaller.
+
+UInt32 crc32;
+
+A CRC32 value for checking data integrity. This value is computed over the text
+data records only and does not include record 0 or any bookmark/annotation
+records. The current implementation in makeztxt/Weasel computes this value
+using the crc32 function in zLib which should be the standard CRC32 definition.
+
+UInt8 padding[0x20 - 24];
+
+zTXT record zero is 32 bytes in length, so the unused portion is padded.
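+
+A minimal Python sketch of reading this structure follows. It assumes the
+fields are stored big-endian, as is usual for Palm OS databases; the text
+above does not state the byte order. The names RECORD0, Record0,
+parse_record0 and version_string are hypothetical.
+
```python
import struct
from collections import namedtuple

# Assumption: big-endian fields. 24 bytes of fields plus 8 bytes of padding
# give the 32-byte record described above.
RECORD0 = struct.Struct(">HHIHHHHHBBI8x")

ZTXT_RANDOMACCESS = 0x01
ZTXT_NONUNIFORM = 0x02

Record0 = namedtuple(
    "Record0",
    "version num_records size record_size num_bookmarks bookmark_record "
    "num_annotations annotation_record flags reserved crc32",
)


def parse_record0(data):
    """Unpack the first 32 bytes of zTXT record 0."""
    return Record0._make(RECORD0.unpack(data[:RECORD0.size]))


def version_string(version):
    """Format the two version bytes, e.g. 0x012A gives '1.42'."""
    return "%d.%02d" % (version >> 8, version & 0xFF)
```
+
+A reader would then test rec.flags & ZTXT_RANDOMACCESS, or compare
+rec.num_bookmarks with 0, rather than branching on the version number.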
+
+
+zTXT Bookmarks
+--------------
+
+zTXT bookmarks are stored in a simple array in a record at the end of a zTXT.
+The format is as follows:
+
+#define MAX_BMRK_LENGTH 20
+
+typedef struct GPlmMarkType {
+ UInt32 offset;
+ Char title[MAX_BMRK_LENGTH];
+} GPlmMark;
+
+In the structure, offset is counted as an absolute offset into the text. The
+bookmarks must be sorted in ascending order.
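+
+The bookmark array could be read along these lines. This is a sketch under
+two assumptions not stated above: the offset is big-endian, and titles are
+NUL-padded single-byte text (decoded here as Windows-1252). The names
+parse_bookmarks and record_for_offset are hypothetical.
+
```python
import struct

MAX_BMRK_LENGTH = 20
# Assumption: big-endian UInt32 offset followed by a fixed 20-byte title.
BOOKMARK = struct.Struct(">I%ds" % MAX_BMRK_LENGTH)


def parse_bookmarks(record, count):
    """Return (offset, title) pairs from a bookmark index record."""
    marks = []
    for i in range(count):
        offset, raw_title = BOOKMARK.unpack_from(record, i * BOOKMARK.size)
        # Assumption: the title is NUL-padded single-byte text.
        title = raw_title.split(b"\0", 1)[0].decode("cp1252", errors="replace")
        marks.append((offset, title))
    return marks


def record_for_offset(offset, record_size=8192):
    """Map an absolute text offset to (text record number, offset inside it).

    The record number is 0-based among the text records and is only valid
    when all text records have the same uncompressed size.
    """
    return divmod(offset, record_size)
```
+
+The count passed to parse_bookmarks should come from numBookmarks in record 0,
+which the text above calls the definitive count.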
+
+If there are no bookmarks, then the bookmark index does not exist. When the
+user creates the first bookmark, the record containing the index will then be
+created. If there are annotations, when the bookmark record is created it must
+go before the annotation index. This will require incrementing annotationRecord
+in record 0 to point to the new record index.
+
+Similarly, when all bookmarks are deleted the bookmark index record is also
+deleted. If there are annotations, annotationRecord in record 0 must be
+decremented to point to the new index.
+
+
+zTXT Annotations
+----------------
+
+zTXT annotations have a format almost identical to that of the bookmark index:
+
+typedef struct GPlmAnnotationType {
+ UInt32 offset;
+ Char title[MAX_BMRK_LENGTH];
+} GPlmAnnotation;
+
+Like the bookmarks, offset is an absolute offset into the text. The annotation
+index is organized just as the bookmarks are, as a single array in a record.
+Note that this structure does NOT store the actual annotation text.
+
+The text of each annotation is stored in its own record immediately following
+the index. So, the first annotation in the index will occupy the first record
+following the index, and the second annotation will be in the second record
+following the index, and so on. The text of each annotation is limited to
+4096 bytes.
+
diff --git a/format_docs/rb.txt b/format_docs/rb.txt
new file mode 100644
index 0000000000..2eb1992afb
--- /dev/null
+++ b/format_docs/rb.txt
@@ -0,0 +1,303 @@
+Rocket eBook File Format
+------------------------
+
+from http://rbmake.sourceforge.net/rb_format.html
+
+
+Overview
+--------
+
+This document attempts to describe the format of a .rb file -- the book
+format that is downloaded into NuvoMedia's hand-held wonder, the Rocket eBook.
+
+*Note:* All multi-byte integers are stored in Vax/Intel order (the
+opposite of network byte order). Most integers are 4 bytes (an int32),
+but there are some minor exceptions (as detailed below).
+
+Also, the following document refers to the .rb file sections as "pages".
+
+
+Details
+-------
+
+The first 4 bytes of the file seem to be a magic number (in hex): B0 0C
+B0 0C. I like to think of this as a hexadecimal pun on the word "book"
+(repeated). [Matt Greenwood has reported seeing a magic number of "B0 0C
+F0 0D" in another type of ReB-related file -- i.e. "book food".]
+
+The next two bytes appear to be a version number, currently "02 00". I
+assume this means major version 2, minor version 0.
+
+The next 4 bytes are the string "NUVO", followed by 4 bytes of 00h. (I
+have also seen an old title that had 0s in place of the "NUVO".)
+
+This brings us up to offset 0Eh, at which point we have a 4-byte
+representation of the date the book was created (Matt Greenwood pointed
+this out to me -- thanks!). The year is encoded as an int16. An older
+version of the RocketLibrary encoded the year's full value (e.g.
+1999 was "CF 07" and 2000 was "D0 07"), but a more recent version is now
+using the tm_year value verbatim -- i.e. it's storing 100 for the year
+2000 ("64 00"). The year is followed by an int8 for the 1-relative month
+number, and an int8 for the day of the month.
+
+After that is 6 bytes of 00h. These may be reserved for setting the time
+of creation (at a guess).
+
+Then, at offset 18h, we have an int32 that contains the absolute offset
+of the "Table of Contents" (the directory of the pages contained within
+this .rb file). In all of the .rb files I've seen, this remains
+constant with a value of 128h. However, I have tested an atypical .rb
+file where I placed the ToC at the end of the file (after all the file
+contents), and it worked fine. (I've chosen not to build any books in
+such a non-standard format, however.)
+
+Immediately following this is an int32 with the length of the .rb file
+(so we can check if the file is complete or not).
+
+All the bytes from here (offset 20h) up to offset 128h appear to only be
+used by an encrypted title. In a non-encrypted title, they are always 0.
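+
+The fixed fields described so far fit in the first 32 bytes, which could be
+read as in the sketch below. The names RB_HEADER, RbHeader and
+parse_rb_header are hypothetical, and treating any year below 1900 as a
+tm_year value is an assumption based on the two encodings mentioned above.
+
```python
import struct
from collections import namedtuple

# Little-endian layout, following the offsets above: magic, version, "NUVO",
# 4 zero bytes, year, month, day, 6 zero bytes, ToC offset, file length.
RB_HEADER = struct.Struct("<4s2s4s4xHBB6xII")
RB_MAGIC = b"\xb0\x0c\xb0\x0c"

RbHeader = namedtuple("RbHeader", "version year month day toc_offset file_length")


def parse_rb_header(data):
    """Parse the fixed fields at the start of a .rb file."""
    (magic, version, _vendor, year, month, day,
     toc_offset, file_length) = RB_HEADER.unpack_from(data, 0)
    if magic != RB_MAGIC:
        raise ValueError("not a Rocket eBook file")
    if year < 1900:
        # Assumption: a small value is tm_year, i.e. years since 1900.
        year += 1900
    return RbHeader((version[0], version[1]), year, month, day,
                    toc_offset, file_length)
```
+
+Comparing file_length with the actual size of the file gives the completeness
+check mentioned above.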
+
+The table of contents typically comes next (at offset 128h). It starts
+with an int32 count of the number of "page" entries (.rb-file sections)
+in the ToC. Each entry consists of a name (zero-padded to 32 bytes),
+followed by 3 int32s: the length of this entry's data segment, the
+absolute offset of the data in the .rb file, and a flag. The known flag
+values are: 1 (encrypted), 2 (info page), and 8 (deflated). The names
+are tweaked as needed to ensure that they are all unique. The current
+RocketWriter software uses a unique 6-digit number, a dash, up to 8
+characters from the filename, and then the re-mapped suffix for the data
+(.html, .hidx, .png, .info, etc.). My rbmake library simply ensures that
+the names are no longer than 15 characters (not counting the suffix) and
+are all unique.
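+
+Reading the table of contents might look like the following sketch. The
+names TOC_ENTRY, TocEntry and parse_toc are hypothetical, and decoding the
+page names as ASCII is an assumption.
+
```python
import struct
from collections import namedtuple

FLAG_ENCRYPTED = 1
FLAG_INFO = 2
FLAG_DEFLATED = 8

# Each entry: a 32-byte zero-padded name, then length, offset and flag.
TOC_ENTRY = struct.Struct("<32sIII")

TocEntry = namedtuple("TocEntry", "name length offset flags")


def parse_toc(data, toc_offset):
    """Read the page directory that starts at toc_offset."""
    (count,) = struct.unpack_from("<I", data, toc_offset)
    entries = []
    pos = toc_offset + 4
    for _ in range(count):
        raw_name, length, offset, flags = TOC_ENTRY.unpack_from(data, pos)
        # Assumption: names are plain ASCII padded with zero bytes.
        name = raw_name.rstrip(b"\0").decode("ascii", errors="replace")
        entries.append(TocEntry(name, length, offset, flags))
        pos += TOC_ENTRY.size
    return entries
```
+
+Each page's bytes are then data[entry.offset:entry.offset + entry.length].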
+
+Often the first item in the ToC is the info page, but it doesn't have to
+be. This page of information contains NAME=VALUE pairs that note the
+author, title, what the root-page's name is, etc. (See appendix A). This
+data is never encrypted nor compressed, so this entry's flag value is
+always "2".
+
+An image page is always stored as a B&W image in PNG format. Since it
+has its own compression, it is stored without any additional attempt at
+deflation. I have also never seen an encrypted image, so its flag value
+is always 0.
+
+An HTML page contains the tags and text that were re-written into a
+consistent syntax (this presumably makes the HTML renderer in the ReB
+itself simpler). HTML pages are typically compressed (See appendix B).
+Every HTML page appears to use the suffix .html no matter what the file
+name was on import (but I have seen older files with .htm used as the
+suffix, so the rocket appears to support both).
+
+For every HTML page there is a corresponding .hidx page that contains a
+summary of the paragraph formatting and the position of the anchor names
+in the associated .html page (See appendix C). This page is sometimes
+compressed, depending on length (See appendix B).
+
+There are also reference titles that have a .hkey page that contains a
+list of words that can be looked up in the associated .html page (See
+appendix D).
+
+Immediately following the ToC is the data for each piece mentioned in
+the ToC, in the same order as it appeared in the ToC.
+
+Finally, the end of the file appears to be padded with 20 bytes of 01h.
+
+
+Appendix A: Info Page Format
+----------------------------
+
+The info page consists of a series of lines that contain "NAME=VALUE"
+strings. Each line is terminated by a single newline. Here are the
+values that the RocketWriter generates:
+
+ COMMENT=Info file for
+ TYPE=2
+ TITLE=
+ AUTHOR=
+ URL=ebook:
+ GENERATOR=
+ PARSE=1
+ OUTPUT=1
+ BODY=
+ MENUMARK=menumark.html
+ SuggestedRetailPrice=
+
+Encrypted titles have a few more entries (including those listed above):
+
+ ISBN=
+ REVISION=
+ TITLE_LANGUAGE=
+ PUB_NAME=
+ PUBSERVER_ID=
+ GENERATOR=
+ VERSION=
+ USERNAME=
+ COPY_ID=
+ COPYRIGHT=
+ COPYTITLE=
+
+A reference title also has an indication that there is a .hkey page
+present, and may also have a GENRE of "Reference":
+
+ HKEY=1
+ GENRE=Reference
+
+
+Appendix B: The format of compressed data
+-----------------------------------------
+
+Compressed pages have a data section in the .rb file with the following
+format:
+
+The first int32 is a count of the number of 4096-byte chunks of data we
+broke the uncompressed page into (the last chunk can be shorter than
+4096 bytes, of course).
+
+This is immediately followed by an int32 with the length of the entire
+uncompressed data.
+
+After this there is one int32 per chunk, indicating the size of that
+chunk's compressed data.
+
+Following these length int32s is the output from a deflation (the
+algorithm used in gzip) for each 4096-byte chunk of the original data.
+It appears that you must use a window-bit size of 13 and a compression
+level of "best" to be compatible with the Rocket eBook's system software.
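+
+The layout can be sketched in Python as below. It assumes each chunk is a
+standalone zlib-wrapped deflate stream, which the text above does not
+confirm, and the names deflate_page and inflate_page are hypothetical.
+
```python
import struct
import zlib

CHUNK_SIZE = 4096


def deflate_page(data):
    """Build a compressed page: chunk count, total length, sizes, chunks."""
    chunks = []
    for start in range(0, len(data), CHUNK_SIZE):
        # Window bits 13 and level 9 ("best"), as the text suggests.
        compressor = zlib.compressobj(9, zlib.DEFLATED, 13)
        chunks.append(
            compressor.compress(data[start:start + CHUNK_SIZE])
            + compressor.flush()
        )
    header = struct.pack("<II", len(chunks), len(data))
    sizes = struct.pack("<%dI" % len(chunks), *(len(c) for c in chunks))
    return header + sizes + b"".join(chunks)


def inflate_page(page):
    """Reverse deflate_page, checking the stored uncompressed length."""
    count, total = struct.unpack_from("<II", page, 0)
    sizes = struct.unpack_from("<%dI" % count, page, 8)
    pos = 8 + 4 * count
    out = []
    for size in sizes:
        # Assumption: every chunk is its own zlib stream.
        out.append(zlib.decompress(page[pos:pos + size]))
        pos += size
    data = b"".join(out)
    if len(data) != total:
        raise ValueError("uncompressed length mismatch")
    return data
```
+
+Because every chunk is compressed on its own, a reader can inflate any one
+chunk by summing the sizes before it, without touching the rest of the page.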
+
+
+Appendix C: HTML-index Page Format
+----------------------------------
+
+The .hidx page's purpose is to allow the renderer to quickly look up the
+format of each paragraph (useful for random access to the data), and the
+position of the anchor names.
+
+The first section lists the various paragraph-producing tags. It is
+headed by a line of "[tags N]", where N is the number of tags that follow
+this header. The tags are listed one per line, and have an implied
+enumeration from 0 to N-1 (which the other tags and the upcoming paragraph
+sections reference).
+
+The first tag is typically (always?) " -1". The number trailing the tag
+indicates what other tag (or sequence of tags, one per line) in which we
+are nested. So a tag nested inside some other element is listed separately
+from the same tag nested inside a normal paragraph, and each one has a
+different trailing index number.
+
+Following the tag section is the paragraph section. The heading is
+"[paragraphs N]", and is followed by a line for each of the N
+paragraphs. Each line consists of a character offset into the .html
+page for the start of the paragraph, followed by a 0-relative index
+into the tag section (indicating what kind of formatting to use for
+the indicated paragraph).
+
+The paragraph-section character offsets point to the first bit of text
+after the associated tag.
+
+The last section details the anchor names. The heading is
+"[names N]", and each item that follows is a quoted string of the
+anchor name, followed by a character offset into the .html page where
+we'll find that name. If there are no names in the associated HTML
+section, the heading is included with a 0 count (i.e. "[names 0]").
+
+The name-section character offsets point to the start of the anchor tag
+(not after the tag, like the offsets in the "paragraphs" section).
+
+The lines are terminated by newlines (in standard unix fashion).
+
+For example:
+
+ [tags 10]
+ -1
+ 0
+ 1
+ 1
+ 1
+ 1
+ 1
+ 6
+ 1
+ 1
+
+ [paragraphs 42]
+ 160 9
+ 164 9
+ 184 8
+ 220 8
+ 261 6
+ 316 5
+ 359 1
+ 379 6
+ 410 6
+ 460 7
+ 511 7
+ 564 7
+ 616 7
+ 668 7
+ 720 7
+ 773 7
+ 827 7
+ 880 7
+ 933 7
+ 988 7
+ 1043 7
+ 1100 7
+ 1157 7
+ 1214 7
+ 1270 7
+ 1328 7
+ 1385 7
+ 1442 7
+ 1497 7
+ 1556 7
+ 1561 7
+ 1635 1
+ 1656 5
+ 1690 6
+ 1737 7
+ 1773 5
+ 1798 4
+ 1826 3
+ 2663 1
+ 2668 4
+ 2689 2
+ 2730 8
+
+ [names 1]
+ "ch1" 2689
+
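A reader for the three sections described in this appendix might look like the sketch below. The function name `parse_hidx` and the tag names in `SAMPLE` are hypothetical, chosen only for illustration; blank lines between sections are skipped on the assumption that they carry no meaning.

```python
import re

HEADER = re.compile(r'\[(tags|paragraphs|names) (\d+)\]')
NAME = re.compile(r'"(.*)"\s+(\d+)')


def parse_hidx(text):
    """Return the tags, paragraphs and names sections as lists of tuples."""
    lines = [ln.strip() for ln in text.split('\n') if ln.strip()]
    result = {'tags': [], 'paragraphs': [], 'names': []}
    i = 0
    while i < len(lines):
        m = HEADER.fullmatch(lines[i])
        if m is None:
            raise ValueError('expected a section header: %r' % lines[i])
        section, count = m.group(1), int(m.group(2))
        body = lines[i + 1:i + 1 + count]
        if len(body) != count:
            raise ValueError('section %s is truncated' % section)
        i += 1 + count
        for ln in body:
            if section == 'tags':
                # The trailing number is the index of the enclosing tag.
                parts = ln.rsplit(None, 1)
                tag = parts[0] if len(parts) == 2 else ''
                result['tags'].append((tag, int(parts[-1])))
            elif section == 'paragraphs':
                offset, tag_index = ln.split()
                result['paragraphs'].append((int(offset), int(tag_index)))
            else:
                nm = NAME.fullmatch(ln)
                if nm is None:
                    raise ValueError('bad name line: %r' % ln)
                result['names'].append((nm.group(1), int(nm.group(2))))
    return result


# Hypothetical input; the tag names are illustrative only.
SAMPLE = '''[tags 3]
<HTML> -1
<BODY> 0
<P> 1

[paragraphs 2]
160 2
184 2

[names 1]
"ch1" 184
'''
```

With the parsed result, the formatting of a paragraph can be resolved by following the parent index of its tag entry until -1 is reached.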
+
+Appendix D: HTML-key Page Format
+--------------------------------
+
+The .hkey page contains a list of words, one per line, sorted in a
+strict ASCII sequence, each one followed by a tab and the offset in the
+.html page of the word's data. I presume that the .hkey page must share
+the same name prefix as its related .html page.
+
+If the names contain high-bit characters, they are translated into
+regular ASCII in the .hkey file, since this allows the user to search
+for the words using unaccented characters.
+
+The lines are terminated with a newline (in standard unix fashion).
+
+An example:
+
+ a 5
+ apple 38
+ b 84
+ book 104
+
+Each of these offsets points to a paragraph tag in the associated .html
+page. I have only seen this sequence of tags used so far:
+
+
+ word other stuff
+
+I have seen multiple inner tag pairs in the middle of a single set of
+outer tags, but this is the basic tag format.
+
+The offset in the .hkey page points to the start of the tag.
+
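Since the .hkey entries are sorted in strict ASCII order, a lookup can use a binary search. The following is a minimal sketch; `parse_hkey` and `lookup` are hypothetical names, and the sample reuses the example entries from this appendix.

```python
import bisect


def parse_hkey(text):
    """Parse 'word<TAB>offset' lines into a list of (word, offset)."""
    entries = []
    for line in text.split('\n'):
        if not line:
            continue
        word, offset = line.split('\t')
        entries.append((word, int(offset)))
    return entries


def lookup(entries, word):
    """Binary-search the sorted word list; return the offset or None."""
    words = [w for w, _ in entries]
    i = bisect.bisect_left(words, word)
    if i < len(words) and words[i] == word:
        return entries[i][1]
    return None


SAMPLE = 'a\t5\napple\t38\nb\t84\nbook\t104\n'
```

Python compares ASCII strings by code point, which matches the strict ASCII ordering the file is said to use.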
diff --git a/format_docs/tcr.txt b/format_docs/tcr.txt
new file mode 100644
index 0000000000..dbbbbaa869
--- /dev/null
+++ b/format_docs/tcr.txt
@@ -0,0 +1,56 @@
+About
+-----
+
+Text compression format that can be decompressed starting at any point.
+Little-endian byte ordering is used.
+
+
+Header
+------
+
+TCR files always start with:
+
+!!8-Bit!!
+
+
+Layout
+------
+
+Header
+256 key dictionary
+compressed text
+
+
+Dictionary
+----------
+
+A dictionary mapping each key to a replacement string. There are 256
+keys in total, numbered 0 - 255. Each string is preceded by one byte
+that gives the length of the string.
+
+
+Compressed text
+---------------
+
+The compressed text is a series of values 0-255, each of which
+corresponds to a key and thus a string. The text is reassembled by
+replacing each key in the compressed text with its corresponding string.
+
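The layout described so far (header, 256 length-prefixed strings, then codes) is enough to sketch a decompressor. The names `tcr_decompress` and `build_sample` are hypothetical, and the sample file is constructed by hand purely for illustration.

```python
MAGIC = b'!!8-Bit!!'


def tcr_decompress(data: bytes) -> bytes:
    """Expand a TCR file: read the 256-entry dictionary, then map codes."""
    if not data.startswith(MAGIC):
        raise ValueError('not a TCR file')
    pos = len(MAGIC)
    table = []
    for _ in range(256):
        length = data[pos]  # one length byte precedes each string
        table.append(data[pos + 1:pos + 1 + length])
        pos += 1 + length
    # Everything after the dictionary is a stream of one-byte codes.
    return b''.join(table[code] for code in data[pos:])


def build_sample() -> bytes:
    """Hand-built file: every code maps to itself except code 1."""
    table = [bytes([i]) for i in range(256)]
    table[1] = b'hello '
    body = bytes([1, 1]) + b'world'
    dictionary = b''.join(bytes([len(s)]) + s for s in table)
    return MAGIC + dictionary + body
```

Because every code expands independently of its neighbours, decoding can begin at any byte of the compressed text, which is the property noted in the About section.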
+
+Compressor
+----------
+
+From Andrew Giddings' TCR.c (http://www.cix.co.uk/~gidds/Software/TCR.html):
+
+The TCR compression format is easy to describe: after the fixed header is a
+dictionary of 256 strings, each preceded by a length byte. The rest of the
+file is a list of codes from this dictionary.
+
+The compressor works by starting with each code defined as itself. While
+there's an unused code, it finds the most common two-code combination, and
+creates a new code for it, replacing all occurrences in the text with the
+new code.
+
+It also searches for codes that are always followed by another, which it can
+merge, possibly freeing up some.
+
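The pair-merging loop quoted above can be sketched as follows. This is not the TCR.c implementation: it omits the second step (merging codes that are always followed by another), stops when no pair occurs at least twice, and `tcr_compress_sketch` is a hypothetical name.

```python
from collections import Counter


def tcr_compress_sketch(text: bytes):
    """Greedy pair merging; returns (table of 256 strings, code bytes)."""
    table = [bytes([i]) for i in range(256)]  # each code starts as itself
    codes = list(text)
    present = set(codes)
    unused = [c for c in range(256) if c not in present]
    while unused:
        pairs = Counter(zip(codes, codes[1:]))
        best = None
        for (a, b), n in pairs.most_common():
            if n < 2:
                break  # remaining pairs are not worth a code
            if len(table[a]) + len(table[b]) <= 255:
                best = (a, b)  # result must fit a one-byte length prefix
                break
        if best is None:
            break
        a, b = best
        new = unused.pop()
        table[new] = table[a] + table[b]
        # Replace every non-overlapping occurrence of the pair.
        merged, i = [], 0
        while i < len(codes):
            if codes[i:i + 2] == [a, b]:
                merged.append(new)
                i += 2
            else:
                merged.append(codes[i])
                i += 1
        codes = merged
    return table, bytes(codes)
```

Expanding each output code through the returned table reproduces the input, so the result can be written out as the dictionary followed by the code stream.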
From cbb668b6ddfb362e58b505a475bc41893c8f01ac Mon Sep 17 00:00:00 2001
From: Hiroshi Miura
Date: Thu, 3 Feb 2011 13:21:44 +0900
Subject: [PATCH 3/7] recipe: msn sankei news changes its charcode.
---
resources/recipes/msnsankei.recipe | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/resources/recipes/msnsankei.recipe b/resources/recipes/msnsankei.recipe
index ae195559d5..59664d055f 100644
--- a/resources/recipes/msnsankei.recipe
+++ b/resources/recipes/msnsankei.recipe
@@ -13,15 +13,12 @@ class MSNSankeiNewsProduct(BasicNewsRecipe):
description = 'Products release from Japan'
oldest_article = 7
max_articles_per_feed = 100
- encoding = 'Shift_JIS'
+ encoding = 'utf-8'
language = 'ja'
cover_url = 'http://sankei.jp.msn.com/images/common/sankeShinbunLogo.jpg'
masthead_url = 'http://sankei.jp.msn.com/images/common/sankeiNewsLogo.gif'
feeds = [(u'\u65b0\u5546\u54c1', u'http://sankei.jp.msn.com/rss/news/release.xml')]
- remove_tags_before = dict(id="__r_article_title__")
- remove_tags_after = dict(id="ajax_release_news")
- remove_tags = [{'class':"parent chromeCustom6G"},
- dict(id="RelatedImg")
- ]
+ remove_tags_before = dict(id="NewsTitle")
+ remove_tags_after = dict(id="RelatedTitle")
From c7da3c8c1f35b7fa07ab4ff21e15106330ce62a3 Mon Sep 17 00:00:00 2001
From: GRiker
Date: Thu, 3 Feb 2011 06:41:40 -0700
Subject: [PATCH 4/7] GwR updates to catalog css
---
resources/catalog/stylesheet.css | 37 ++++++++++++++++++----------
src/calibre/library/catalog.py | 41 +++++++++++++++-----------------
2 files changed, 43 insertions(+), 35 deletions(-)
diff --git a/resources/catalog/stylesheet.css b/resources/catalog/stylesheet.css
index 336d015e44..4b32056400 100644
--- a/resources/catalog/stylesheet.css
+++ b/resources/catalog/stylesheet.css
@@ -52,6 +52,17 @@ p.formats {
text-indent: 0.0in;
}
+/*
+* Minimize widows and orphans by logically grouping chunks
+* Some reports of problems with Sony (ADE) ereaders
+* ADE: page-break-inside:avoid;
+* iBooks: display:inline-block;
+* width:100%;
+*/
+div.author_logical_group {
+ page-break-inside:avoid;
+ }
+
div.description > p:first-child {
margin: 0 0 0 0;
text-indent: 0em;
@@ -62,27 +73,19 @@ div.description {
text-indent: 1em;
}
-/*
-* Attempt to minimize widows and orphans by logically grouping chunks
-* Recommend enabling for iPad
-* Some reports of problems with Sony ereaders, presumably ADE engines
-*/
-/*
-div.logical_group {
- display:inline-block;
- width:100%;
+div.initial_letter {
+ page-break-before:always;
}
-*/
-p.date_index {
+p.author_title_letter_index {
font-size:x-large;
text-align:center;
font-weight:bold;
- margin-top:1em;
+ margin-top:0px;
margin-bottom:0px;
}
-p.letter_index {
+p.date_index {
font-size:x-large;
text-align:center;
font-weight:bold;
@@ -99,6 +102,14 @@ p.series {
text-indent:-2em;
}
+p.series_letter_index {
+ font-size:x-large;
+ text-align:center;
+ font-weight:bold;
+ margin-top:1em;
+ margin-bottom:0px;
+ }
+
p.read_book {
text-align:left;
margin-top:0px;
diff --git a/src/calibre/library/catalog.py b/src/calibre/library/catalog.py
index 8ad64c8cdd..092cc66ff9 100644
--- a/src/calibre/library/catalog.py
+++ b/src/calibre/library/catalog.py
@@ -1832,8 +1832,6 @@ def generateHTMLByTitle(self):
body.insert(btc,pTag)
btc += 1
- #
- #
divTag = Tag(soup, "div")
dtc = 0
current_letter = ""
@@ -1861,11 +1859,12 @@ def generateHTMLByTitle(self):
divTag.insert(dtc, divRunningTag)
dtc += 1
divRunningTag = Tag(soup, 'div')
- divRunningTag['class'] = "logical_group"
+ if dtc > 0:
+ divRunningTag['class'] = "initial_letter"
drtc = 0
current_letter = self.letter_or_symbol(book['title_sort'][0])
pIndexTag = Tag(soup, "p")
- pIndexTag['class'] = "letter_index"
+ pIndexTag['class'] = "author_title_letter_index"
aTag = Tag(soup, "a")
aTag['name'] = "%s" % self.letter_or_symbol(book['title_sort'][0])
pIndexTag.insert(0,aTag)
@@ -1973,8 +1972,6 @@ def generateHTMLByAuthor(self):
body.insert(btc, aTag)
btc += 1
- #
- #
divTag = Tag(soup, "div")
dtc = 0
divOpeningTag = None
@@ -2008,10 +2005,11 @@ def generateHTMLByAuthor(self):
current_letter = self.letter_or_symbol(book['author_sort'][0].upper())
author_count = 0
divOpeningTag = Tag(soup, 'div')
- divOpeningTag['class'] = "logical_group"
+ if dtc > 0:
+ divOpeningTag['class'] = "initial_letter"
dotc = 0
pIndexTag = Tag(soup, "p")
- pIndexTag['class'] = "letter_index"
+ pIndexTag['class'] = "author_title_letter_index"
aTag = Tag(soup, "a")
aTag['name'] = "%sauthors" % self.letter_or_symbol(current_letter)
pIndexTag.insert(0,aTag)
@@ -2023,16 +2021,21 @@ def generateHTMLByAuthor(self):
# Start a new author
current_author = book['author']
author_count += 1
- if author_count == 2:
+ if author_count >= 2:
# Add divOpeningTag to divTag, kill divOpeningTag
- divTag.insert(dtc, divOpeningTag)
- dtc += 1
- divOpeningTag = None
- dotc = 0
+ if divOpeningTag:
+ divTag.insert(dtc, divOpeningTag)
+ dtc += 1
+ divOpeningTag = None
+ dotc = 0
+
+ # Create a divRunningTag for the next author
+ if author_count > 2:
+ divTag.insert(dtc, divRunningTag)
+ dtc += 1
- # Create a divRunningTag for the rest of the authors in this letter
divRunningTag = Tag(soup, 'div')
- divRunningTag['class'] = "logical_group"
+ divRunningTag['class'] = "author_logical_group"
drtc = 0
non_series_books = 0
@@ -2364,8 +2367,6 @@ def add_books_to_HTML_by_date_range(date_range_list, date_range, dtc):
body.insert(btc,pTag)
btc += 1
- #
- #
divTag = Tag(soup, "div")
dtc = 0
@@ -2549,8 +2550,6 @@ def add_books_to_HTML_by_date_range(date_range_list, date_range, dtc):
body.insert(btc, aTag)
btc += 1
- #
- #
divTag = Tag(soup, "div")
dtc = 0
@@ -2652,8 +2651,6 @@ def generateHTMLBySeries(self):
body.insert(btc, aTag)
btc += 1
- #
- #
divTag = Tag(soup, "div")
dtc = 0
current_letter = ""
@@ -2668,7 +2665,7 @@ def generateHTMLBySeries(self):
# Start a new letter with Index letter
current_letter = self.letter_or_symbol(sort_title[0].upper())
pIndexTag = Tag(soup, "p")
- pIndexTag['class'] = "letter_index"
+ pIndexTag['class'] = "series_letter_index"
aTag = Tag(soup, "a")
aTag['name'] = "%s_series" % self.letter_or_symbol(current_letter)
pIndexTag.insert(0,aTag)
From be76d5b41e2fb298be7e2f63cf11274d2345e66d Mon Sep 17 00:00:00 2001
From: Kovid Goyal
Date: Thu, 3 Feb 2011 07:57:53 -0700
Subject: [PATCH 5/7] Fix #8735 (Updated recipe for The Onion)
---
resources/recipes/theonion.recipe | 78 ++++++++++++++++++++++---------
1 file changed, 57 insertions(+), 21 deletions(-)
diff --git a/resources/recipes/theonion.recipe b/resources/recipes/theonion.recipe
index 3be4ae4e04..b0eacbb5e0 100644
--- a/resources/recipes/theonion.recipe
+++ b/resources/recipes/theonion.recipe
@@ -1,7 +1,5 @@
-#!/usr/bin/env python
-
__license__ = 'GPL v3'
-__copyright__ = '2009, Darko Miletic '
+__copyright__ = '2009-2011, Darko Miletic '
'''
theonion.com
@@ -12,35 +10,73 @@
class TheOnion(BasicNewsRecipe):
title = 'The Onion'
__author__ = 'Darko Miletic'
- description = "America's finest news source"
- oldest_article = 2
+ description = "America's finest news source"
+ oldest_article = 2
max_articles_per_feed = 100
- publisher = u'Onion, Inc.'
- category = u'humor, news, USA'
- language = 'en'
-
+ publisher = 'Onion, Inc.'
+ category = 'humor, news, USA'
+ language = 'en'
no_stylesheets = True
use_embedded_content = False
encoding = 'utf-8'
- remove_javascript = True
- html2epub_options = 'publisher="' + publisher + '"\ncomments="' + description + '"\ntags="' + category + '"'
-
- html2lrf_options = [
- '--comment' , description
- , '--category' , category
- , '--publisher' , publisher
- ]
+ publication_type = 'newsportal'
+ masthead_url = 'http://o.onionstatic.com/img/headers/onion_190.png'
+ extra_css = """
+ body{font-family: Helvetica,Arial,sans-serif}
+ .section_title{color: gray; text-transform: uppercase}
+ .title{font-family: Georgia,serif}
+ .meta{color: gray; display: inline}
+ .has_caption{display: block}
+ .caption{font-size: x-small; color: gray; margin-bottom: 0.8em}
+ """
- keep_only_tags = [dict(name='div', attrs={'id':'main'})]
-
+ conversion_options = {
+ 'comment' : description
+ , 'tags' : category
+ , 'publisher': publisher
+ , 'language' : language
+ }
+
+ keep_only_tags = [
+ dict(name='h2', attrs={'class':['section_title','title']})
+ ,dict(attrs={'class':['main_image','meta','article_photo_lead','article_body']})
+ ,dict(attrs={'id':['entries']})
+ ]
+ remove_attributes=['lang','rel']
+ remove_tags_after = dict(attrs={'class':['article_body','feature_content']})
remove_tags = [
- dict(name=['object','link','iframe','base'])
+ dict(name=['object','link','iframe','base','meta'])
,dict(name='div', attrs={'class':['toolbar_side','graphical_feature','toolbar_bottom']})
,dict(name='div', attrs={'id':['recent_slider','sidebar','pagination','related_media']})
]
-
+
feeds = [
(u'Daily' , u'http://feeds.theonion.com/theonion/daily' )
,(u'Sports' , u'http://feeds.theonion.com/theonion/sports' )
]
+
+ def get_article_url(self, article):
+ artl = BasicNewsRecipe.get_article_url(self, article)
+ if artl.startswith('http://www.theonion.com/audio/'):
+ artl = None
+ return artl
+
+ def preprocess_html(self, soup):
+ for item in soup.findAll(style=True):
+ del item['style']
+ for item in soup.findAll('a'):
+ limg = item.find('img')
+ if item.string is not None:
+ str = item.string
+ item.replaceWith(str)
+ else:
+ if limg:
+ item.name = 'div'
+ item.attrs = []
+ if not limg.has_key('alt'):
+ limg['alt'] = 'image'
+ else:
+ str = self.tag_to_string(item)
+ item.replaceWith(str)
+ return soup
From 57efe4fb061c92a00db48cadc169fd05f6d8d833 Mon Sep 17 00:00:00 2001
From: Kovid Goyal
Date: Thu, 3 Feb 2011 10:18:00 -0700
Subject: [PATCH 6/7] Fix #8739 (get_matches() got multiple values for keyword
argument 'allow_recursion')
---
src/calibre/library/caches.py | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/src/calibre/library/caches.py b/src/calibre/library/caches.py
index dd4509acea..e818e6a3c0 100644
--- a/src/calibre/library/caches.py
+++ b/src/calibre/library/caches.py
@@ -420,7 +420,8 @@ def get_user_category_matches(self, location, query, candidates):
return candidates - res
return res
- def get_matches(self, location, query, allow_recursion=True, candidates=None):
+ def get_matches(self, location, query, candidates=None,
+ allow_recursion=True):
matches = set([])
if candidates is None:
candidates = self.universal_set()
@@ -434,8 +435,8 @@ def get_matches(self, location, query, allow_recursion=True, candidates=None):
if isinstance(location, list):
if allow_recursion:
for loc in location:
- matches |= self.get_matches(loc, query, candidates,
- allow_recursion=False)
+ matches |= self.get_matches(loc, query,
+ candidates=candidates, allow_recursion=False)
return matches
raise ParseException(query, len(query), 'Recursive query group detected', self)
From 8749611440861d79f53a3a43d19d1b276fbf13f6 Mon Sep 17 00:00:00 2001
From: Kovid Goyal
Date: Thu, 3 Feb 2011 11:33:19 -0700
Subject: [PATCH 7/7] Nook Color driver: Send downloaded news to the My
Files/Magazines folder on the Nook Color. Also when getting the list of books
on the device look at all folders in My Files, not just My Files/Books.
---
src/calibre/devices/nook/driver.py | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/src/calibre/devices/nook/driver.py b/src/calibre/devices/nook/driver.py
index ca05885645..39d0763735 100644
--- a/src/calibre/devices/nook/driver.py
+++ b/src/calibre/devices/nook/driver.py
@@ -89,21 +89,21 @@ class NOOK_COLOR(NOOK):
BCD = [0x216]
WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = 'EBOOK_DISK'
- EBOOK_DIR_MAIN = 'My Files/Books'
+ EBOOK_DIR_MAIN = 'My Files'
- '''
def create_upload_path(self, path, mdata, fname, create_dirs=True):
filepath = NOOK.create_upload_path(self, path, mdata, fname,
- create_dirs=create_dirs)
- edm = self.EBOOK_DIR_MAIN.replace('/', os.sep)
- npath = os.path.join(edm, _('News')) + os.sep
- if npath in filepath:
- filepath = filepath.replace(npath, os.sep.join('My Files',
- 'Magazines')+os.sep)
- filedir = os.path.dirname(filepath)
- if create_dirs and not os.path.exists(filedir):
- os.makedirs(filedir)
+ create_dirs=False)
+ edm = self.EBOOK_DIR_MAIN
+ subdir = 'Books'
+ if mdata.tags:
+ if _('News') in mdata.tags:
+ subdir = 'Magazines'
+ filepath = filepath.replace(os.sep+edm+os.sep,
+ os.sep+edm+os.sep+subdir+os.sep)
+ filedir = os.path.dirname(filepath)
+ if create_dirs and not os.path.exists(filedir):
+ os.makedirs(filedir)
return filepath
- '''