DIFF Format

Note: This document uses little-endian (Intel) byte and bit order throughout.


Each DIFF-format file consists of one chunk, the DIFF chunk, which usually contains a "pencil-compatible" list of chunks inside.

A chunk has the following general structure:

4B     4B         [P. Size]B
[Tag]  [P. Size]  [Payload]
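
As an illustration of the layout above, here is a minimal Python sketch that reads one chunk from a byte buffer. The function name read_chunk is hypothetical, and the sketch assumes that the tag and P. Size are unsigned 32-bit little-endian integers and that P. Size counts only the payload bytes.

```python
import struct

def read_chunk(buf: bytes, offset: int = 0):
    """Return (tag, payload, next_offset) for the chunk starting at offset.

    Assumption: [Tag] and [P. Size] are unsigned 32-bit little-endian
    values, and P. Size is the length of the payload alone.
    """
    if len(buf) - offset < 8:
        raise ValueError("truncated chunk: need 8 bytes for tag and size")
    tag, p_size = struct.unpack_from("<II", buf, offset)
    start = offset + 8
    end = start + p_size
    if end > len(buf):
        raise ValueError("payload runs past the end of the buffer")
    return tag, buf[start:end], end
```

The returned next_offset lets a caller step through consecutive chunks without re-computing sizes.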


The first bit of the tag indicates whether the chunk has a pencil-compatible header; the second bit indicates whether the chunk has a pencil-compatible chunk list. The remaining 30 bits make up five 6-bit characters, using DIFF Tag Encoding (described later).
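
The bit layout of the tag can be sketched in Python as follows. The function name split_tag is hypothetical. The sketch assumes that, under little-endian bit order, the "first bit" is the least significant bit of the 32-bit tag and that the five characters follow from low bits to high bits. It returns the raw 6-bit codes rather than characters, since mapping them to characters requires the DIFF Tag Encoding table.

```python
def split_tag(tag: int):
    """Split a 32-bit tag into (has_header, has_chunk_list, codes).

    Assumptions: bit 0 (least significant) is the header flag, bit 1 is
    the chunk-list flag, and bits 2..31 hold five 6-bit character codes,
    lowest bits first.
    """
    has_header = bool(tag & 0x1)
    has_chunk_list = bool(tag & 0x2)
    # Five 6-bit fields starting at bit 2: 2 + 5 * 6 = 32 bits in total.
    codes = [(tag >> (2 + 6 * i)) & 0x3F for i in range(5)]
    return has_header, has_chunk_list, codes
```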


The payload consists of an optional pencil-compatible header, an optional pencil-compatible chunk list, and zero or more bytes of tag-specific data.
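
A rough Python sketch of dividing a payload into these three parts is given below. The function name split_payload is hypothetical. It assumes the parts appear in the order listed, that the two flags are the lowest two bits of the tag, and that the leading 4-byte size field of a header or chunk list counts the bytes that follow it (the entry count plus the entries) but not itself. None of these points is stated outright in this document.

```python
import struct

def split_payload(tag: int, payload: bytes):
    """Return (header_bytes, chunk_list_bytes, tag_specific_bytes).

    Assumptions: bit 0 of the tag flags a header, bit 1 flags a chunk
    list, and each of those structures starts with a 32-bit size that
    counts everything after the size field itself.
    """
    pos = 0
    header = None
    chunk_list = None
    if tag & 0x1:
        (h_size,) = struct.unpack_from("<I", payload, pos)
        header = payload[pos:pos + 4 + h_size]
        pos += 4 + h_size
    if tag & 0x2:
        (l_size,) = struct.unpack_from("<I", payload, pos)
        chunk_list = payload[pos:pos + 4 + l_size]
        pos += 4 + l_size
    if pos > len(payload):
        raise ValueError("header or chunk list runs past the payload")
    # Whatever remains is tag-specific data (possibly empty).
    return header, chunk_list, payload[pos:]
```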

Pencil-Compatible Header

A pencil-compatible header is conceptually similar to an .INI file section: it contains name=data entries in no particular order. Sections within it can be achieved through some application-defined interpretation of the names, such as demarcation ("Engine.Actor.Lifespan") or count and length prefixes ([04][07]"Program"[04]"Dare"[08]"TextTree"[0A]"TextFormat"). It has the following structure:

4B         4B          [H. Size - 4]B
[H. Size]  [E. Count]  [[E. Count] Header Entries]

Each header entry is as follows:

4B         [N. Size]B  4B         [D. Size]B
[N. Size]  [Name]      [D. Size]  [Data]
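
The header and entry layouts above can be read with a short Python sketch like the one below. The function name parse_header is hypothetical. The sketch assumes all size and count fields are unsigned 32-bit little-endian integers and that H. Size counts the E. Count field plus the entries, which is one reading of the "[H. Size - 4]B" column. Names and data are returned as raw bytes, since no text encoding is specified here.

```python
import struct

def parse_header(buf: bytes, offset: int = 0):
    """Return (entries, next_offset); entries is a list of (name, data).

    Assumption: H. Size covers the 4-byte E. Count plus all entries,
    but not the H. Size field itself.
    """
    h_size, e_count = struct.unpack_from("<II", buf, offset)
    end = offset + 4 + h_size
    pos = offset + 8
    entries = []
    for _ in range(e_count):
        (n_size,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        name = buf[pos:pos + n_size]
        pos += n_size
        (d_size,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        data = buf[pos:pos + d_size]
        pos += d_size
        entries.append((name, data))
    if pos != end or end > len(buf):
        raise ValueError("entries do not match the declared H. Size")
    return entries, end
```

A list of pairs is used instead of a dictionary because the text says only that entries come in no particular order; whether names may repeat is not stated.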

Pencil-Compatible Chunk List

A pencil-compatible chunk list is as follows:

4B         4B          [L. Size - 4]B
[L. Size]  [E. Count]  [[E. Count] Chunks]

The chunks within a pencil-compatible chunk list are full and normal chunks.
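
Because the listed chunks are ordinary chunks, a chunk list can be walked with the same tag and size logic as a top-level chunk. The following Python sketch shows one way to do this. The function name parse_chunk_list is hypothetical, and the sketch assumes unsigned 32-bit little-endian fields and that L. Size counts the E. Count field plus the chunks, but not itself.

```python
import struct

def parse_chunk_list(buf: bytes, offset: int = 0):
    """Return (chunks, next_offset); chunks is a list of (tag, payload).

    Assumption: L. Size covers the 4-byte E. Count plus all chunks,
    but not the L. Size field itself.
    """
    l_size, e_count = struct.unpack_from("<II", buf, offset)
    end = offset + 4 + l_size
    if end > len(buf):
        raise ValueError("chunk list runs past the end of the buffer")
    pos = offset + 8
    chunks = []
    for _ in range(e_count):
        # Each element is a full chunk: 4-byte tag, 4-byte size, payload.
        tag, p_size = struct.unpack_from("<II", buf, pos)
        start = pos + 8
        pos = start + p_size
        if pos > end:
            raise ValueError("chunk runs past the declared L. Size")
        chunks.append((tag, buf[start:pos]))
    if pos != end:
        raise ValueError("chunks do not fill the declared L. Size")
    return chunks, end
```

Each returned payload may itself begin with a header or a nested chunk list, depending on the flag bits of its tag.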

Tag-Specific Data

Tag-specific data is usually the ultimate payload.


Generally, an application reading a DIFF file should be able to ignore information specific to other applications, and an application should be able to add information specific to itself with relative freedom. Specifically:

Unexpected Chunks

Unexpected Pencil-Compatible Headers Or Chunk Lists

Unexpected Header Entries

Propagation Of Unexpected Information

DIFF Tag Encoding

Note that while this table resembles the tables used for OnTwelve encoding, it is *not* OnTwelve-compatible. Also, the and - characters are often treated as a single character, hyphen/minus, rather than separate hyphen and minus characters.

30  W  X  Y  Z  +  -  =  <undefined; use "~" or "#8" through "#F" if needed>