The Acorn Computer User WWW Server

Font File Formats

In all the formats described below, 2-byte quantities are little-endian:
that is, the least significant byte comes first, followed by the
most-significant.  The values are unsigned unless otherwise stated.


IntMetrics / IntMet:
        
        40      Name of font, padded with 
        4       16
        4       16
        1       nlo = low-byte of number of defined characters
        1       version number of file format:
                        0       flags and nhi must be 0
                        1       **** not supported ****
                        2       you can set the flags as described
                                and nchars can be > 255
        1       flags:
                bit 0 set => there is no bbox data (use Outlines)
                bit 1 set => there is no x-offset data
                bit 2 set => there is no y-offset data
                bit 3 set => there is more data after the metrics
                bit 4 reserved (must be 0)
                bit 5 set => character map size precedes map
                bit 6 set => kern characters are 16-bit, else 8-bit
                bit 7 reserved (must be 0)
        1       nhi = high byte of number of defined characters

                n = nlo + 256*nhi

                If flags bit 5 set:
        2       m = character map size (lo,hi) (0 => no map)
                m = 256 if flags bit 5 clear

        m       character mapping (ie. indices into following arrays)
                if not present, index = character code

                Note that since the mapping table is 8-bit, there cannot be
                one if n > 256.

                Tables giving bounding boxes (1/1000th em) of characters:
                If flags bit 0 clear:
        2n      signed x0 2-byte values
        2n      signed y0 2-byte values
        2n      signed x1 2-byte values
        2n      signed y1 2-byte values

                Tables of character widths:
                If flags bit 1 clear:
        2n      signed 2-byte x-offsets after printing each character
                If flags bit 2 clear:
        2n      signed 2-byte y-offsets after printing each character


        To calculate offset to here:

                  nlo = byte at offset 48 in file
                  nhi = byte at offset 51 in file
                flags = byte at offset 49 in file
                    n = nlo + 256*nhi  ^^->Acorn, you mean 50 me thinks....

                offset = 52
                if (flags bit 5 clear) then offset += 256
                                       else offset += 2 + byte(52) + 256*byte(53)
                if (flags bit 0 clear) then offset += 8n
                if (flags bit 1 clear) then offset += 2n
                if (flags bit 2 clear) then offset += 2n

                If flags bit 3 set:

        8       table of 4 2-byte unsigned offsets from start of table:
                    entry 0 = offset to 'miscellaneous' data
                    entry 1 = offset to kern pair data
                    entry 2 = offset to reserved area #1
                    entry 3 = offset to reserved area #2

                The entries must be consecutive in the file, so the end of
                one area coincides with the beginning of the next.  The
                areas are not necessarily word-aligned, and the space at the
                end of each area is reserved (ie. there must not be any
                'dead' space at the end of an area).

        Area 0: Miscellaneous data:

        8       font bounding box in 1/1000th em (signed x0,y0,x1,y1)
        2       default x-offset per char (if flags bit 1 set)
        2       default y-offset per char (if flags bit 2 set)
        2       italic h-offset per em (-1000 * TAN(italic angle))
        1       underline position (signed, in 1/256th of an em)
        1       underline thickness (unsigned, in 1/256th of an em)
        2       CapHeight (1/1000th em, 2-byte signed)
        2       XHeight (1/1000th em, 2-byte signed)
        2       Descender (1/1000th em, 2-byte signed)
        2       Ascender (1/1000th em, 2-byte signed)
        4       reserved field (must be 0)

        Area 1: Kern pair data:

                flags bit 6 clr => character codes are 8-bit
                flags bit 6 set => character codes are 16-bit (lo,hi)

        1 or 2  left-hand letter code                          ]
        1 or 2  right-hand letter code                ]        ]
        2       x-kern amount (if flags bit 1 clear)  ] repeat ] repeat
        2       y-kern amount (if flags bit 2 clear)  ]        ]
        1 or 2  0 => end of list for this letter               ]
        1 or 2  0 => end of kerns

                Or, in pseudo-Backus-Naur notation:

                   ::= * 
         ::=  * 
                ::=   
                  ::= <8-bit character code> if flags bit 6 clear
                           or <16-bit character code> if flags bit 6 set
                ::= 
                           or  if flags bit 1 set
                ::= 
                           or  if flags bit 2 set
                ::= < with code 0>

        Area 2: Reserved (must be null)

        Area 3: Reserved (must be null)


x90y45:

        Index:
                1       point size (not multiplied by 16)
                1       bits per pixel (4)
                1       pixels per inch (x-direction)
                1       pixels per inch (y-direction)
                4       reserved (was checksum) - must be 0
                4       offset of pixel data in file
                4       size of pixel data
                ...     more of the same
                1       0

        Pixel data:
                4       x-size in 1/16ths point * pixels per inch (x)
                4       y-size in 1/16ths point * pixels per inch (y)
                4       pixels per inch (x-direction)
                4       pixels per inch (y-direction)
                1       x0    - maximum bounding box for any character
                1       y0    - bottom-left is inclusive
                1       x1    - top-right is exclusive
                1       y1    - all coordinates are in pixels
                512     2-byte offsets from table start of character data
                        0 => character is not defined
                        (pixel data is limited to 64K per block)

        Character data:

                1       x0      - bounding box
                1       y0
                1       x1-x0
                1       y1-y0
                4*n     4-bpp, consecutive rows bottom->top
                              not aligned until the end


Outlines / Outlines / f9999x9999 / b9999x9999:

        b9999x9999      1-bpp definition
        a9999x9999      4-bpp (anti-aliased) definition
        Outlines        the outline file

              '9999' = pixel size (ie. point size * 16 * dpi / 72)
                       zero-suppressed decimal number

        File header:
                4       "FONT"  - identification word
                1       Bits per pixel: 0 = outlines
                                        1 = 1 bpp
                                        4 = 4 bpp
                1       Version number of file format
                               4: no "don't draw skeleton lines unless smaller than this" byte present
                               5: byte at [table+512] = max pixel size for skeleton lines (0 => always do it)
                               6: byte at [chunk+indexsize] = dependency mask (see below)
                               7: flag word precedes index in chunk (offsets are rel. to index, not chunk)
                               8: file offset array is in a different place
                2       if bpp = 0: design size of font
                        if bpp > 0: flags:
                              bit 0 set => horizontal subpixel placement
                              bit 1 set => vertical subpixel placement
                              bit 6 set => flag word precedes index in chunk (version 7 onwards)
                              bits 2..5, 7 must be 0
                              NB: for outline files, bit 6 is derived from the version number
                                  but if present, its value must suit the version number
                2       x0      - font bounding box (16-bit signed)
                2       y0      - units are pixels or design units
                2       x1-x0   - x0,y0 inclusive, x1,y1 exclusive
                2       y1-y0

                If Version < 8: nchunks = 8

                4       file offset of 0..31 chunk (word-aligned)
                4       file offset of 32..63 chunk
                ...
                4       file offset of 224..255 chunk
                4       file offset of end (ie. size of file)
                        if offset(n+1)=offset(n), then chunk n is null.

                Else (Version >= 8): nchunks = as defined below

                4       file offset of area containing file offsets of chunks
                4       number of defined chunks
                4       ns = number of scaffold index entries (including entry[0] = size)
                4       scaffold flags:
                                bit 0 set => ALL scaffold base chars are 16-bit
                                bit 1 set => these outlines should not be anti-aliased (eg. System.Fixed)
                                bit 2 set => fill using non-zero winding rule, else odd-even.
                                bits 3..31 reserved (must be 0)
                4*5     all reserved (must be 0)

     Table start:
                2       n = size of table/scaffold data

         Bitmaps: (n=10 normally - other values are reserved)
                2       x-size (1/16th point)
                2       x-res (dpi)
                2       y-size (1/16th point)
                2       y-res (dpi)

         Outlines:
           ns*2-2       2-byte offsets to scaffold data:

                        If scaffold flags bit 0 clear:
                          bits 0..14 = offset of scaffold data from table start
                          bit 15 set => base char code is 2 bytes, else 1

                        If scaffold flags bit 0 set:
                          bits 0..15 = offset of scaffold data from table start
                          base char code is always 2 bytes

                        0 => no scaffold data for char

                1       Skeleton threshold pixel size
                        (if file format version >= 5)
                        When rastering the outlines, skeleton lines will
                        only be drawn if either the x-or y- pixel size is
                        less than this value (except if value=0, which means
                        'always draw skeleton lines').
                ?       ... sets of scaffold data (see below)

     Table end: ?       description of contents of file:
                        , 0, "Outlines", 0
                                        "999x999 point at 999x999 dpi", 0
                        padded with zeros until word-aligned

                If Version >= 8:

                4       file offset of chunk 0
                4       file offset of chunk 1
                ...
                4       file offset of chunk (nchunks-1)
                4       file offset of end of file

                All versions:

                        ... word-aligned chunks follow

     Scaffold data:
              1 or 2    char code of 'base' scaffold entry (0 ==> none)
                1       bit n set ==> x-scaffold line n defined in base char
                1       bit n set ==> y-scaffold line n defined in base char
                1       bit n set ==> x-scaffold line n defined locally
                1       bit n set ==> y-scaffold line n defined locally
                        ... local scaffold lines follow
     Scaffold lines:
                2       bits 0..11 = coordinate (signed)
                        bits 12..14 = scaffold link index (0 => none)
                        bit 15 set => 'linear' scaffold link
                1       width (254 ==> L-tangent, 255 ==> R-tangent)

     Chunk data:
                4       flag word (not present before file format version 7):
                              bit 0 set => horizontal subpixel placement
                              bit 1 set => vertical subpixel placement
                              bit 7 set => dependency byte(s) present (see below)
                              bits 2..6 and 8..30 must be 0
                              bit 31 must be 1

                4 * 32  offset from index start to character
                        0 => character is not defined
                           * 4 for vertical placement
                           * 4 for horizontal placement
                        Character index is more tightly bound than vertical
                        placement which is more tightly bound than
                        horizontal placement.

                n      Dependency byte(s) (if outline file, version >= 6).
                        One bit required for each chunk in file.
                        Bit n set => chunk n must be loaded in order to
                        rasterise this chunk.  This is required for
                        composite characters which include characters from
                        other chunks (see below).

                        ... character data follows (word-aligned at end of chunk)

                        Note: All character definitions must follow the
                              index in the order in which they are specified
                              in the index.  This is to allow the font
                              editor to easily determine the size of each
                              character.

     Char data:
                1       flags:
                           bit 0 set => coords are 12-bit, else 8-bit
                           bit 1 set => data is 1-bpp, else 4-bpp
                           bit 2 set => initial pixel is black, else white
                           bit 3 set => data is outline, else bitmap
                           if bit 3 clr: bits 4..7 = 'f' value for char (0 ==> not encoded)
                           if bit 3 set: bit 4 set => composite character
                                         bit 5 set => with an accent as well
                                         bit 6 set => ascii codes within this char are 16-bit, else 8-bit
                                         bit 7 reserved (must be 0)

         if flag bits 3 and 4 set:
                1 or 2     character code of base character
           if flag bit 5 set:
                1 or 2     character code of accent
                2 or 3     x,y offset of accent character

         if flag bits 3 or 4 clear:
                2 or 3     x0, y0  sign-extended 8- or 12- bit coordinates
                2 or 3     xs, ys  width, height (bbox = x0,y0,x0+xs,y0+ys)
                n          data: (depends on type of file)
                             1-bpp uncrunched: rows from bottom->top
                             4-bpp uncrunched: rows from bottom->top
                             1-bpp crunched: list of (packed) run-lengths
                             outlines: list of move/line/curve segments
                           word-aligned at the end of the character data

Outline char format
-------------------

Here the 'pixel bounding box' is actually the bounding box of the outline in
terms of the design size of the font (in the file header).  The data
following the bounding box consists of a series of move/line/curve segments
followed by a terminator and an optional extra set of line segments followed
by another terminator.  When constructing the bitmap from the outlines, the
font manager will fill the first set of line segments to half-way through
the boundary using an even-odd fill, and will thin-stroke the second set of
line segments (if present).  See the documentation on the Draw module for
further details.

If the font is version 8 or greater then it can specifiy that the winding
rule be either non-zero or odd-even, this is to provide compatibility with
other type face formats (see above in scaffold flags).

Each line segment consists of:

        1       bits 0..1 = segment type:
                            0 => terminator (see below)
                            1 => move to x,y
                            2 => line to x,y
                            3 => curve to x1,y1,x2,y2,x3,y3
                bits 2..4 = x-scaffold link
                bits 5..7 = y-scaffold link
                ... coordinates (design units) follow

Terminator:
                bit 2 set => stroke paths follow
                             (same format, but paths are not closed)
                bit 3 set => composite character inclusions follow:

Composite character inclusions:

        1 or 2  character code of character to include (0 => finished)
        2 or 3  x,y offset of this inclusion (design units)

The coordinates are either 8- or 12-bit sign-extended, depending on bit 0 of
the character flags (see above), including the composite character
inclusions.

The character codes in the composite character sections are either 8- or
16-bit (low-byte first), depending on bit 6 of the character flags.

The scaffold links associated with each line segment relate to the last
point specified in the definition of that move/line/curve, and the control
points of a bezier curve have the same links as their nearest endpoint.

Note that if a character includes another, the appropriate bit in the parent
character's chunk dependency flags must be set.  This byte tells the Font
Manager which extra chunk(s) must be loaded in order to rasterise the parent
character's chunk.


1-bpp uncompacted format
------------------------

1 bit per pixel, bit set => paint in foreground colour, in rows from
bottom-left to top-right, not aligned until word-aligned at the end of the
character.


1-bpp compacted data format
---------------------------

The whole character is initially treated as a stream of bits, as for the
uncompacted form.  The bit stream is then scanned row by row, with
consecutive duplicate rows being replaced by a 'repeat count', and alternate
runs of black and white pixels are noted.  The repeat counts and run counts
are then themselves encoded in a set of 4-bit entries.

Bit 2 of the character flags determine whether the initial pixel is black or
white (black = foreground), and bits 4..7 are the value of 'f' (see below).
The character is then represented as a series of packed numbers, which
represent the length of the next run of pixels.  These runs can span more
than one row, and after each run the pixel colour is changed over.  Special
values are used to denote row repeats.

 ==
          0         followed by n-1 zeroes, followed by n+1 nibbles
                    = resulting number + (13-f)*16 + f+1 - 16
   i =    1 .. f    i
   i =  f+1 .. 13   (i-f-1)*16 + next nibble + f + 1
         14         followed by n= = repeat count of n
         15         repeat count of 1 (ie. 1 extra copy of this row)

The optimal value of f lies between 1 and 12, and must be computed
individually for each character, by scanning the data and calculating the
length of the output for each possible value.  The value yielding the
shortest result is then used, unless that is larger than the bitmap itself,
in which case the bitmap is used.

Repeat counts operate on the current row, as understood by the unpacking
algorithm, ie. at the end of the row the repeat count is used to duplicate
the row as many times as necessary.  This effectively means that the repeat
count applies to the row containing the first pixel of the next run to start
up.

Note that rows consisting of entirely white or entirely black pixels cannot
always be represented by using repeat counts, since the run may span more
than one row, so a long run count is used instead.


Encoding files
--------------

An encoding file is a text file which contains a set of identifiers which
indicate which characters appear in which positions in a font.

Each identifier is preceded by a "/", and the characters are numbered from
0, increasing by 1 with each identifier found.

Comments are introduced by "%", and continue until the next control
character.

The following special comment lines are understood by the font manager:

        %%RISCOS_BasedOn 
        %%RISCOS_Alphabet 

where  and  denote positive decimal integers.

Both lines are optional, and they indicate respectively the number of the
base encoding and the alphabet number of this encoding.

If the %%RISCOS_BasedOn line is not present, then the Font Manager assumes
that the target encoding describes the actual positions of the glyphs in an
existing file, the data for which is in:

        .IntMetrics
        .Outlines

where  is null if the %%RISCOS_Alphabet line is omitted.

(In fact the leafnames are shortened to fit in 10 characters, by removing
characters from just before the  suffix).

In this case the IntMetrics and Outlines files are used directly, since
there is a one-to-one correspondence between the positions of the characters
in the datafiles and the positions required in the font.

If the %%RISCOS_BasedOn line IS present, then the Font Manager accesses the
following datafiles:

        .IntMetrics
        .Outlines

and assumes that the positions of the glyphs in the datafiles are as given
by the contents of the base encoding file.

The base encoding is called "/Base", and lives in the Encodings directory
under Font$Path, along with all the other encodings.  Because it is preceded
by a "/", the Font Manager does not return it in the list of encodings
returned by Font_ListFonts.

Note that only one encoding file with a given name can apply to all the
fonts known to the system.  Because of this, a given encoding can only be
either a direct encoding, where the alphabet number is used to reference the
datafiles, or an indirect encoding, where the base encoding number specifies
the datafile names.

Here is the start of a sample base encoding ("/Base0"):

        % /Base0 encoding

        %%RISCOS_Alphabet 0

        /.notdef /.NotDef /.NotDef /.NotDef
        /zero /one /two /three /four /five /six /seven /eight

Here is the start of a sample encoding file ("Latin1"):

        % Latin 1 encoding

        %%RISCOS_BasedOn 0
        %%RISCOS_Alphabet 101

        /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
        /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
        /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
        /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
        /space /exclam /quotedbl /numbersign
        /dollar /percent /ampersand /quotesingle

(Note that the sample /Base0 file is not the same as the released one).

These illustrate several points:

    1.  The %% lines must appear before the first identifier.

    2.  Character 0 in any encoding must be called ".notdef", and represent
        a null character.

    3.  Other null characters in the base encoding must be called ".NotDef",
        to distinguish them from character 0.

    4.  Non-base encoding files wanting to refer to the null character
        should use ".notdef" in all cases.

    5.  The other character names should follow the Adobe PostScript names
        wherever possible.  This is to enable the encoding to refer to Adobe
        character names when included as part of a print job by the
        PostScript printer driver.

    6.  The number of characters desribed by the base encoding can be
        anything from 0 to 768, and should refer to distinct characters
        (apart from the ".NotDef"s).  Other encodings, however, must
        contain exactly 256 characters, which need not be distinct.
poppy@poppyfields.net