Glyph Naming
For various reasons (historical, technical, convenience) most font development tools and formats provide mechanisms to name glyphs with human-readable strings (as opposed to just index numbers, for example). This article documents conventions, restrictions, and best practices for glyph naming.
We recommend that all glyphs in released fonts have names that are AGL-compliant. Because of their original association with Adobe PostScript, glyph names are often referred to as PostScript names or sometimes, more simply, psnames.
Glyph names used in development can differ from those used in released fonts and do not necessarily need to follow Adobe standards completely. See Working names vs production names.
Why use glyph names?
Given the ubiquitous support for TrueType/OpenType fonts, one might ask why glyph names exist at all. Nothing inside an OpenType font requires glyphs to have names: complex layout tables such as GSUB and GPOS, for example, do all their work in terms of glyph indexes. Even the post table itself (where glyph names are stored) has an allowed form, called Version 3, that contains no names. So if functional fonts can be produced without glyph names, why have them?
The answer starts in the era before TrueType came along, in which Adobe’s sophisticated page layout engine accessed glyphs by their names: there was no way to locate a glyph other than by its name. In fact, glyph names were simply another identifier in the PostScript programming language, which is why even today glyph names have a very strict allowed form.
Going even further, some software packages can infer certain properties of a glyph from its name alone. The most commonly cited case is that some PDF files do not include any representation of the character stream that the document represents, yet you can still copy characters from the document to the clipboard. This works at all (and it does not always work perfectly) because Acrobat can deduce the character stream, or something close to it, by looking only at the names of the glyphs in the file.
All of this can work, however, only if the glyphs have been correctly named—meaning named according to Adobe standards.
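To make the idea of deducing characters from glyph names concrete, here is a minimal Python sketch of a simplified name-to-text mapping in the spirit of the AGL specification. It is not Acrobat's implementation. The AGL_EXCERPT dictionary is a three-entry stand-in for the full Adobe Glyph List, and the function names are hypothetical.

```python
import re

# Tiny stand-in for the full Adobe Glyph List (assumption: a real
# implementation would load the complete published list).
AGL_EXCERPT = {"A": "\u0041", "space": "\u0020", "afii57416": "\u0628"}

# "uni" followed by one or more groups of four uppercase hex digits.
_UNI = re.compile(r"uni((?:[0-9A-F]{4})+)")
# "u" followed by four to six uppercase hex digits.
_U = re.compile(r"u([0-9A-F]{4,6})")


def _is_scalar(cp):
    """True if cp is a Unicode scalar value (in range, not a surrogate)."""
    return cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def component_to_text(component):
    """Map one underscore-separated name component to text, or '' if unknown."""
    if component in AGL_EXCERPT:
        return AGL_EXCERPT[component]
    m = _UNI.fullmatch(component)
    if m:
        digits = m.group(1)
        cps = [int(digits[i:i + 4], 16) for i in range(0, len(digits), 4)]
        if all(_is_scalar(cp) for cp in cps):
            return "".join(chr(cp) for cp in cps)
        return ""
    m = _U.fullmatch(component)
    if m:
        cp = int(m.group(1), 16)
        if _is_scalar(cp):
            return chr(cp)
    return ""


def glyph_name_to_text(name):
    """Drop any suffix after the first period, then map each component."""
    base = name.split(".", 1)[0]
    return "".join(component_to_text(c) for c in base.split("_"))
```

With this sketch, uni0628, u0628 and afii57416.init all map to U+0628, while a working name such as beh maps to nothing, which is why text extraction breaks down when glyphs are not named according to the standard.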
Glyph name limitations
According to the Adobe Glyph List Specification (AGL), glyph names, whether working names or production names, should:
- be no longer than 63 characters,
- be composed entirely of characters from the following set: A–Z, a–z, 0–9, . (period, U+002E FULL STOP) and _ (underscore, U+005F LOW LINE),
- and, with the exception of .notdef, not start with a period or digit.
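The constraints above can be sketched as a small Python check. This is an illustration of the three rules as stated here, not an official validator; the function name glyph_name_problems is hypothetical.

```python
import re

# Characters permitted by the rules above: A-Z, a-z, 0-9, period, underscore.
_ALLOWED = re.compile(r"[A-Za-z0-9._]+")
MAX_LENGTH = 63


def glyph_name_problems(name):
    """Return a list of human-readable rule violations (empty if none)."""
    if not name:
        return ["name is empty"]
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("longer than 63 characters")
    if not _ALLOWED.fullmatch(name):
        problems.append("contains a character outside A-Z, a-z, 0-9, '.', '_'")
    # .notdef is the only name allowed to start with a period.
    if name != ".notdef" and name[0] in ".0123456789":
        problems.append("starts with a period or digit")
    return problems
```

For example, uni0628, .notdef and _dot pass this check, while beh-ar (hyphen), 1abc (leading digit) and .null (leading period) each produce one problem.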
AGL also specifies how those names should be chosen and constructed. For more details see the article on the Adobe Glyph List.
The Adobe Font Development Kit for OpenType (AFDKO) Feature File Specification reiterates these same constraints, though actual implementations accept more relaxed naming.
The ISO/IEC 14496-22 Open Font Format (Fourth edition 2019-01) allows an additional exception of .null but does not permit glyph names starting with underscore. This is counter to current common practice wherein glyphs that are used only as components of composite glyphs typically have names that start with an underscore, as in _dot.
Working names vs production names
The glyph names that should be in a font when it is shipped (i.e. released) are not necessarily the most friendly and memorable names. Therefore many developers use a different set of glyph names for development purposes and, as a last step of production, change the glyph names to the “official” names. We will call these two sets of glyph names the working names and the production names, respectively.
As an example, the acceptable production name for a glyph representing U+0628 ARABIC LETTER BEH would be one of:
uni0628
u0628
afii57416
However, as none of these is particularly memorable, designers might, for example, choose a working name of beh.
The use of working and production glyph names is common enough that at least one modern font format (the UFO format) provides a mechanism to manage both sets of glyph names, and font editing applications can automatically convert names from working to production during export. The Glyphs font editor even provides an internal working-names system that it calls nice names.
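As a rough illustration of the working-to-production conversion, the following Python sketch derives a uniXXXX or uXXXXX production name from a working name and a table of code points. It is a simplified sketch, not how any particular editor or format does it: the function name and the codepoints table are hypothetical, ligatures are not handled, and glyphs whose preferred production name is an AGLFN name (such as A) would need a separate lookup.

```python
def production_name(working_name, codepoints):
    """Derive a production name from a working name.

    codepoints maps working base names (the part before the first period)
    to Unicode code points. Names without an entry are returned unchanged,
    which suits component-only glyphs such as _dot.
    """
    base, dot, suffix = working_name.partition(".")
    cp = codepoints.get(base)
    if cp is None:
        return working_name
    # BMP code points use the uniXXXX form; others use u plus 5 or 6 digits.
    prod = f"uni{cp:04X}" if cp <= 0xFFFF else f"u{cp:X}"
    # Keep any suffix (for example .init) so variant glyphs stay distinct.
    return prod + dot + suffix


# Hypothetical working names mapped to their code points.
table = {"beh": 0x0628, "gothicAhsa": 0x10330}
renamed = [production_name(n, table) for n in ["beh", "beh.init", "_dot"]]
```

Here beh becomes uni0628, beh.init becomes uni0628.init, and _dot is left as it is because it has no entry in the table.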
Because working names are not present in shipping fonts, some of the Adobe AGL constraints can be relaxed. In particular:
- The basenames of glyphs need not be restricted to the Adobe Glyph List for New Fonts (AGLFN).
- The list of characters that are legal in a glyph name can be expanded. Glyphs, for example, uses a hyphen (-, U+002D) to append a script code to its nice names, as in the Arabic beh-ar.
Not all glyphs need production names
Note that even in a shipping font, there may be glyphs that do not need production names based on the AGLFN. Any glyphs used only as components of some other composite glyph need not have separate production names because the component glyphs can never end up in the output glyph stream of rendered text.
For example, many letters in the Arabic script are distinguished from each other only by the pattern of dots drawn above or below the main shape. A font might use component glyphs for each pattern of dots (one dot below, one dot above, two dots below, etc.) and also for each main shape, and then construct the desired glyph repertoire using composites that reference the appropriate base and dot glyphs. The glyphs for various patterns of dots can never appear in the output stream, so they do not need production names.