Last time, we got our hands dirty and built a simple cmap table parser. Today, let’s keep the momentum going and talk about another essential table—don’t worry, the journey is half the fun!
Unified Table Format
Remember how we wrote the cmap table with a structure like this?
pub const Table = struct {
    pub fn parse()
    pub fn deinit()
    // ... etc
};

As we add more and more tables, it’s clear we want all of them to share a stable API: at least parse and deinit (for those that need an allocator). Wouldn’t it be nice if we could just define an interface for all tables? Well, Zig isn’t Rust or Go, so we don’t get interfaces or traits out of the box. But hey, that’s not going to stop us!
/// Zig doesn't have interfaces or traits like Rust or Go, so let's mock one up with a vtable.
const std = @import("std");
const reader = @import("./byte_read.zig");
const Allocator = std.mem.Allocator;
const Table = @This();

ptr: *anyopaque,
vtable: *const VTable,

pub const VTable = struct {
    /// Parse table data.
    parse: *const fn (*anyopaque) anyerror!void,
    /// Free the table and its resources.
    deinit: *const fn (*anyopaque) void,
};

pub fn parse(self: Table) anyerror!void {
    return self.vtable.parse(self.ptr);
}

pub fn deinit(self: Table) void {
    self.vtable.deinit(self.ptr);
}

/// Cast back to the concrete table type. This is unchecked: the caller must
/// pass the same T that created this Table.
pub fn cast(self: Table, comptime T: type) *T {
    return @ptrCast(@alignCast(self.ptr));
}

And that’s it! Now every table can plug into this “interface” and we can treat them all the same way, no matter how weird or wonderful their internals are.
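To see the pattern end to end, here is a minimal sketch that plugs a made-up table into the interface. `DummyTable`, `parseImpl`, and `deinitImpl` are hypothetical names used only for illustration, and the `Table` struct is a trimmed copy of the one above so the snippet compiles on its own.

```zig
const std = @import("std");

/// Trimmed copy of the interface above so this sketch is self-contained.
const Table = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    const VTable = struct {
        parse: *const fn (*anyopaque) anyerror!void,
        deinit: *const fn (*anyopaque) void,
    };

    fn parse(self: Table) anyerror!void {
        return self.vtable.parse(self.ptr);
    }

    fn deinit(self: Table) void {
        self.vtable.deinit(self.ptr);
    }

    fn cast(self: Table, comptime T: type) *T {
        return @ptrCast(@alignCast(self.ptr));
    }
};

/// Hypothetical table, not a real font table. It only records that parse ran.
const DummyTable = struct {
    allocator: std.mem.Allocator,
    parsed: bool,

    fn init(allocator: std.mem.Allocator) !Table {
        const self = try allocator.create(DummyTable);
        self.* = .{ .allocator = allocator, .parsed = false };
        return Table{
            .ptr = self,
            .vtable = &.{ .parse = parseImpl, .deinit = deinitImpl },
        };
    }

    fn parseImpl(ptr: *anyopaque) anyerror!void {
        const self: *DummyTable = @ptrCast(@alignCast(ptr));
        self.parsed = true;
    }

    fn deinitImpl(ptr: *anyopaque) void {
        const self: *DummyTable = @ptrCast(@alignCast(ptr));
        self.allocator.destroy(self);
    }
};

test "a table can be driven through the vtable" {
    const table = try DummyTable.init(std.testing.allocator);
    defer table.deinit();

    try table.parse();
    try std.testing.expect(table.cast(DummyTable).parsed);
}
```

The caller only ever holds a `Table`, so a list of differently typed tables can be parsed and freed in one loop. The price is that `cast` trusts the caller to name the right type.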
It’s time to refactor cmap now
/// No need to define a CmapTable struct anymore. Just like std.mem.Allocator, the fields are declared at the top level of the file.
const std = @import("std");
const reader = @import("../byte_read.zig");
const Table = @import("../table.zig");
const Error = @import("./errors.zig").Error;
const Allocator = std.mem.Allocator;
const Self = @This();

allocator: Allocator,
byte_reader: *reader.ByteReader,
version: u16,
num_tables: u16,
encoding_records: []EncodingRecord,
subtables: []CmapFormat,

pub const CmapFormat = union(enum) {
    format4: Format4,
    format12: Format12,
};

fn deinit(ptr: *anyopaque) void {
    const self: *Self = @ptrCast(@alignCast(ptr));
    if (self.encoding_records.len > 0) {
        self.allocator.free(self.encoding_records);
    }
    for (0..self.subtables.len) |i| {
        self.subtables[i].deinit(self.allocator);
    }
    if (self.subtables.len > 0) {
        self.allocator.free(self.subtables);
    }
    self.allocator.destroy(self);
}

pub fn init(allocator: Allocator, byte_reader: *reader.ByteReader) !Table {
    const self = try allocator.create(Self);
    errdefer allocator.destroy(self);
    self.allocator = allocator;
    self.byte_reader = byte_reader;
    self.version = 0;
    self.num_tables = 0;
    self.encoding_records = &.{};
    self.subtables = &.{};
    return Table{
        .ptr = self,
        // `parse` (not shown here) must match the vtable signature fn (*anyopaque) anyerror!void.
        .vtable = &.{ .parse = parse, .deinit = deinit },
    };
}

Glyf Table: Where Glyphs Come Alive
Alright, now for the fun part—let’s dive into the glyf table! If the cmap table tells us “which glyph to use for a character,” then the glyf table is where the actual shape of each glyph lives. This is where the magic happens and letters get their curves.
What’s Inside the Glyf Table?
Every glyph in a TrueType font is described here, and there are two main types:
- Simple Glyphs: These are your basic glyphs, made up of contours (think: a list of points that form the outline).
- Composite Glyphs: These are built by combining other glyphs, with optional transformations like scaling or shifting—super handy for accented characters or ligatures.
Anatomy of a Glyph
Every glyph starts with a header:
pub const GlyphHeader = struct {
    number_of_contours: i16,
    x_min: i16,
    y_min: i16,
    x_max: i16,
    y_max: i16,

    pub fn is_simple(self: GlyphHeader) bool {
        return self.number_of_contours >= 0;
    }

    pub fn is_composite(self: GlyphHeader) bool {
        return self.number_of_contours < 0;
    }
};

This tells us if we’re dealing with a simple or composite glyph, and gives us the bounding box.
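On disk the header is five big-endian 16-bit signed integers, ten bytes in total. As a rough sketch of what the byte reader does for us, the header can be decoded from a raw slice like this. `readI16Be` and `decodeHeader` are hypothetical helpers for illustration, not part of the parser in this series.

```zig
const std = @import("std");

const GlyphHeader = struct {
    number_of_contours: i16,
    x_min: i16,
    y_min: i16,
    x_max: i16,
    y_max: i16,
};

/// Reads a big-endian i16 from the first two bytes of `bytes`.
fn readI16Be(bytes: []const u8) i16 {
    const raw: u16 = (@as(u16, bytes[0]) << 8) | @as(u16, bytes[1]);
    return @bitCast(raw);
}

/// Decodes the 10-byte glyph header from a raw slice (hypothetical helper).
fn decodeHeader(bytes: []const u8) error{Truncated}!GlyphHeader {
    if (bytes.len < 10) return error.Truncated;
    return GlyphHeader{
        .number_of_contours = readI16Be(bytes[0..2]),
        .x_min = readI16Be(bytes[2..4]),
        .y_min = readI16Be(bytes[4..6]),
        .x_max = readI16Be(bytes[6..8]),
        .y_max = readI16Be(bytes[8..10]),
    };
}

test "decoding a composite glyph header" {
    const bytes = [_]u8{ 0xFF, 0xFF, 0x00, 0x32, 0xFF, 0x38, 0x02, 0x58, 0x02, 0xBC };
    const header = try decodeHeader(&bytes);
    // 0xFFFF read as a signed value is -1, which marks a composite glyph.
    try std.testing.expect(header.number_of_contours == -1);
    try std.testing.expect(header.x_min == 50);
    try std.testing.expect(header.y_min == -200);
}
```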
Parsing a Simple Glyph
Simple glyphs are all about contours and points. Here’s the general flow:
- Read the contour endpoints: These tell us where each contour ends in the point list.
- Read instructions: These are for hinting (making the glyph look good at small sizes).
- Read flags: Each point has a flag describing its properties (on-curve, how its coordinates are stored, etc).
- Read X and Y coordinates: Decoded according to the flags.
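The flag byte in the steps above packs several properties into single bits. The masks below use the names from the OpenType glyf specification. The `SimpleFlag` container and the `xDeltaSize` helper are made up for this sketch, which shows how two of the bits together decide how many bytes an x delta occupies.

```zig
const std = @import("std");

/// Bit masks for simple-glyph flags, named after the OpenType glyf spec.
const SimpleFlag = struct {
    const on_curve_point: u8 = 0x01;
    const x_short_vector: u8 = 0x02;
    const y_short_vector: u8 = 0x04;
    const repeat_flag: u8 = 0x08;
    const x_is_same_or_positive_x_short_vector: u8 = 0x10;
    const y_is_same_or_positive_y_short_vector: u8 = 0x20;
};

/// Number of bytes the x delta of a point occupies in the glyph data
/// (hypothetical helper for illustration).
fn xDeltaSize(flag: u8) usize {
    // Short vector: one unsigned byte, with the sign carried by bit 0x10.
    if ((flag & SimpleFlag.x_short_vector) != 0) return 1;
    // Not short, but "same" bit set: x repeats the previous value, nothing stored.
    if ((flag & SimpleFlag.x_is_same_or_positive_x_short_vector) != 0) return 0;
    // Neither bit set: a signed 16-bit delta.
    return 2;
}

test "x delta size depends on two flag bits" {
    try std.testing.expect(xDeltaSize(0x02) == 1);
    try std.testing.expect(xDeltaSize(0x10) == 0);
    try std.testing.expect(xDeltaSize(0x00) == 2);
}
```

The y coordinate follows the same scheme with bits 0x04 and 0x20.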
The trickiest part? Flags can be repeated, and coordinates are often stored as deltas, not absolute values. Here’s a taste of the code:
fn parse_simple_glyph(self: *Self, glyph_header: GlyphHeader) !SimpleGlyph {
    // number_of_contours is an i16; the caller has already checked that it is >= 0.
    const num_contours: usize = @intCast(glyph_header.number_of_contours);

    const end_pts_of_contours = try self.allocator.alloc(u16, num_contours);
    errdefer self.allocator.free(end_pts_of_contours);
    for (0..num_contours) |i| {
        end_pts_of_contours[i] = try self.byte_reader.read_u16_be();
    }

    const instruction_length = try self.byte_reader.read_u16_be();
    const instructions = try self.allocator.alloc(u8, instruction_length);
    errdefer self.allocator.free(instructions);
    for (0..instruction_length) |i| {
        instructions[i] = try self.byte_reader.read_u8();
    }

    // The last contour endpoint is the index of the last point, so the point count is that index plus one.
    const num_points: usize = if (num_contours > 0) @as(usize, end_pts_of_contours[num_contours - 1]) + 1 else 0;

    const flags = try self.allocator.alloc(u8, num_points);
    errdefer self.allocator.free(flags);
    var flag_index: usize = 0;
    while (flag_index < num_points) {
        const flag = try self.byte_reader.read_u8();
        flags[flag_index] = flag;
        flag_index += 1;
        // 0x08 = REPEAT_FLAG: the next byte says how many more times this flag repeats.
        // That byte is always present when the bit is set, so always consume it.
        if ((flag & 0x08) != 0) {
            const repeat_count = try self.byte_reader.read_u8();
            for (0..repeat_count) |_| {
                if (flag_index >= num_points) break;
                flags[flag_index] = flag;
                flag_index += 1;
            }
        }
    }

    const x_coordinates = try self.allocator.alloc(i16, num_points);
    errdefer self.allocator.free(x_coordinates);
    var current_x: i16 = 0;
    for (0..num_points) |i| {
        const flag = flags[i];
        if ((flag & 0x02) != 0) {
            // 0x02 = X_SHORT_VECTOR: one unsigned byte; 0x10 carries the sign (set = positive).
            const delta = try self.byte_reader.read_u8();
            if ((flag & 0x10) != 0) {
                current_x += @as(i16, delta);
            } else {
                current_x -= @as(i16, delta);
            }
        } else if ((flag & 0x10) == 0) {
            // Neither bit set: a signed 16-bit delta follows.
            const delta = try self.byte_reader.read_i16_be();
            current_x += delta;
        }
        // Otherwise (0x02 clear, 0x10 set) x is the same as the previous point.
        x_coordinates[i] = current_x;
    }

    const y_coordinates = try self.allocator.alloc(i16, num_points);
    errdefer self.allocator.free(y_coordinates);
    var current_y: i16 = 0;
    for (0..num_points) |i| {
        const flag = flags[i];
        if ((flag & 0x04) != 0) {
            // 0x04 = Y_SHORT_VECTOR: one unsigned byte; 0x20 carries the sign (set = positive).
            const delta = try self.byte_reader.read_u8();
            if ((flag & 0x20) != 0) {
                current_y += @as(i16, delta);
            } else {
                current_y -= @as(i16, delta);
            }
        } else if ((flag & 0x20) == 0) {
            // Neither bit set: a signed 16-bit delta follows.
            const delta = try self.byte_reader.read_i16_be();
            current_y += delta;
        }
        // Otherwise (0x04 clear, 0x20 set) y is the same as the previous point.
        y_coordinates[i] = current_y;
    }

    return SimpleGlyph{
        .header = glyph_header,
        .end_pts_of_contours = end_pts_of_contours,
        .instructions = instructions,
        .flags = flags,
        .x_coordinates = x_coordinates,
        .y_coordinates = y_coordinates,
    };
}

Parsing a Composite Glyph
Composite glyphs are like LEGO sets: they’re built from other glyphs, with optional transformations (scale, matrix, etc). The parser loops through each component, reading its flags, glyph index, arguments, and any transformation.
If the last component says there are instructions, we read those too.
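Two details are easier to follow with names attached. The component flag bits below use the names from the OpenType glyf specification, and the scale values are stored as F2DOT14: a signed 16-bit number with 2 integer bits and 14 fraction bits, so dividing by 16384 (2^14) gives the real value. The `ComponentFlag` container and `f2dot14ToFloat` are hypothetical names for this sketch.

```zig
const std = @import("std");

/// Bit masks for composite component flags, named after the OpenType glyf spec.
const ComponentFlag = struct {
    const arg_1_and_2_are_words: u16 = 0x0001;
    const args_are_xy_values: u16 = 0x0002;
    const we_have_a_scale: u16 = 0x0008;
    const more_components: u16 = 0x0020;
    const we_have_an_x_and_y_scale: u16 = 0x0040;
    const we_have_a_two_by_two: u16 = 0x0080;
    const we_have_instructions: u16 = 0x0100;
};

/// Converts a raw F2DOT14 value to f32 (hypothetical helper).
fn f2dot14ToFloat(raw: i16) f32 {
    return @as(f32, @floatFromInt(raw)) / 16384.0;
}

test "F2DOT14 conversion" {
    try std.testing.expect(f2dot14ToFloat(0x4000) == 1.0);
    try std.testing.expect(f2dot14ToFloat(0x2000) == 0.5);
    try std.testing.expect(f2dot14ToFloat(-16384) == -1.0);
}
```

Because the divisor is a power of two, these conversions are exact in f32, which is why the test can compare with `==`.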
fn parse_composite_glyph(self: *Self, glyph_header: GlyphHeader) !CompositeGlyph {
    var components = std.ArrayList(CompositeComponent).init(self.allocator);
    errdefer components.deinit();

    var has_more_components = true;
    while (has_more_components) {
        const flags = try self.byte_reader.read_u16_be();
        const glyph_index = try self.byte_reader.read_u16_be();

        var arg1: i32 = 0;
        var arg2: i32 = 0;
        // 0x0001 = ARG_1_AND_2_ARE_WORDS: the arguments are 16-bit, otherwise 8-bit.
        // Both are read as signed here. When 0x0002 (ARGS_ARE_XY_VALUES) is clear,
        // the spec treats them as unsigned point numbers instead.
        if ((flags & 0x0001) != 0) {
            arg1 = try self.byte_reader.read_i16_be();
            arg2 = try self.byte_reader.read_i16_be();
        } else {
            arg1 = @as(i8, @bitCast(try self.byte_reader.read_u8()));
            arg2 = @as(i8, @bitCast(try self.byte_reader.read_u8()));
        }

        var transform: ComponentTransform = .none;
        // https://learn.microsoft.com/en-us/typography/opentype/spec/glyf#composite-glyph-description
        // Scale values are F2DOT14, so divide the raw i16 by 16384 (2^14).
        if ((flags & 0x0008) != 0) {
            // WE_HAVE_A_SCALE: one uniform scale.
            const scale_raw = try self.byte_reader.read_i16_be();
            const scale = @as(f32, @floatFromInt(scale_raw)) / 16384.0;
            transform = ComponentTransform{ .scale = .{ .scale = scale } };
        } else if ((flags & 0x0040) != 0) {
            // WE_HAVE_AN_X_AND_Y_SCALE: separate x and y scales.
            const x_scale_raw = try self.byte_reader.read_i16_be();
            const y_scale_raw = try self.byte_reader.read_i16_be();
            const x_scale = @as(f32, @floatFromInt(x_scale_raw)) / 16384.0;
            const y_scale = @as(f32, @floatFromInt(y_scale_raw)) / 16384.0;
            transform = ComponentTransform{ .xy_scale = .{ .x_scale = x_scale, .y_scale = y_scale } };
        } else if ((flags & 0x0080) != 0) {
            // WE_HAVE_A_TWO_BY_TWO: a full 2x2 matrix.
            const xx_raw = try self.byte_reader.read_i16_be();
            const xy_raw = try self.byte_reader.read_i16_be();
            const yx_raw = try self.byte_reader.read_i16_be();
            const yy_raw = try self.byte_reader.read_i16_be();
            const xx = @as(f32, @floatFromInt(xx_raw)) / 16384.0;
            const xy = @as(f32, @floatFromInt(xy_raw)) / 16384.0;
            const yx = @as(f32, @floatFromInt(yx_raw)) / 16384.0;
            const yy = @as(f32, @floatFromInt(yy_raw)) / 16384.0;
            transform = ComponentTransform{ .matrix = .{ .xx = xx, .xy = xy, .yx = yx, .yy = yy } };
        }

        const component = CompositeComponent{
            .flags = flags,
            .glyph_index = glyph_index,
            .arg1 = arg1,
            .arg2 = arg2,
            .transform = transform,
        };
        try components.append(component);
        // 0x0020 = MORE_COMPONENTS
        has_more_components = (flags & 0x0020) != 0;
    }

    var instructions: []u8 = &.{};
    // Registered at function scope so the buffer is also freed if toOwnedSlice() fails below.
    // Freeing the empty slice is a no-op.
    errdefer self.allocator.free(instructions);
    if (components.items.len > 0) {
        const last_component = components.items[components.items.len - 1];
        // 0x0100 = WE_HAVE_INSTRUCTIONS
        if ((last_component.flags & 0x0100) != 0) {
            const instruction_length = try self.byte_reader.read_u16_be();
            instructions = try self.allocator.alloc(u8, instruction_length);
            for (0..instruction_length) |i| {
                instructions[i] = try self.byte_reader.read_u8();
            }
        }
    }

    return CompositeGlyph{
        .header = glyph_header,
        .components = try components.toOwnedSlice(),
        .instructions = instructions,
    };
}

One Entry Point to Rule Them All
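The entry point hands back a tagged union over the two glyph kinds. The type definitions are not shown in this post, so what follows is only a guess at their shape, inferred from the field names the two parsers fill in. `pointCount` is a hypothetical consumer added to show how a caller would switch on the result.

```zig
const std = @import("std");

const GlyphHeader = struct {
    number_of_contours: i16,
    x_min: i16,
    y_min: i16,
    x_max: i16,
    y_max: i16,
};

// Assumed shapes: inferred from how the parsers build these values.
const SimpleGlyph = struct {
    header: GlyphHeader,
    end_pts_of_contours: []u16,
    instructions: []u8,
    flags: []u8,
    x_coordinates: []i16,
    y_coordinates: []i16,
};

const ComponentTransform = union(enum) {
    none,
    scale: struct { scale: f32 },
    xy_scale: struct { x_scale: f32, y_scale: f32 },
    matrix: struct { xx: f32, xy: f32, yx: f32, yy: f32 },
};

const CompositeComponent = struct {
    flags: u16,
    glyph_index: u16,
    arg1: i32,
    arg2: i32,
    transform: ComponentTransform,
};

const CompositeGlyph = struct {
    header: GlyphHeader,
    components: []CompositeComponent,
    instructions: []u8,
};

const ParsedGlyph = union(enum) {
    simple: SimpleGlyph,
    composite: CompositeGlyph,
};

/// Hypothetical consumer: number of outline points stored directly in the glyph.
fn pointCount(glyph: ParsedGlyph) usize {
    return switch (glyph) {
        .simple => |s| s.x_coordinates.len,
        // A composite stores no points itself; they live in the referenced glyphs.
        .composite => 0,
    };
}

test "switching on the parsed glyph kind" {
    var ends = [_]u16{2};
    var flags = [_]u8{ 0x01, 0x01, 0x01 };
    var xs = [_]i16{ 0, 10, 10 };
    var ys = [_]i16{ 0, 0, 20 };
    const glyph = ParsedGlyph{ .simple = .{
        .header = .{ .number_of_contours = 1, .x_min = 0, .y_min = 0, .x_max = 10, .y_max = 20 },
        .end_pts_of_contours = &ends,
        .instructions = &.{},
        .flags = &flags,
        .x_coordinates = &xs,
        .y_coordinates = &ys,
    } };
    try std.testing.expect(pointCount(glyph) == 3);
}
```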
Whether it’s simple or composite, we use the same entry point:
pub fn parse_glyph(self: *Self, glyph_offset: u32) !ParsedGlyph {
    try self.byte_reader.seek_to(self.table_offset + glyph_offset);
    const glyph_header = GlyphHeader{
        .number_of_contours = try self.byte_reader.read_i16_be(),
        .x_min = try self.byte_reader.read_i16_be(),
        .y_min = try self.byte_reader.read_i16_be(),
        .x_max = try self.byte_reader.read_i16_be(),
        .y_max = try self.byte_reader.read_i16_be(),
    };
    if (glyph_header.is_simple()) {
        const simple = try self.parse_simple_glyph(glyph_header);
        return ParsedGlyph{ .simple = simple };
    } else if (glyph_header.is_composite()) {
        const composite = try self.parse_composite_glyph(glyph_header);
        return ParsedGlyph{ .composite = composite };
    } else {
        // Not reachable in practice: is_simple and is_composite cover every value.
        return Error.InvalidGlyfTable;
    }
}

Wrapping Up
The glyf table is the heart of any TrueType font—it’s where the outlines live, and where all the clever tricks for building complex glyphs happen. Parsing it is a bit of a puzzle, but once you get the hang of the flags and the delta encoding, it’s actually pretty fun.
Next up: let’s see if we can render some glyphs and actually draw something on the screen. Stay tuned—this is where fonts really come to life!
If you made it this far, congrats! You now know how to crack open the most important part of a font file. Got questions or want to see more? Drop a comment or open an issue—let’s geek out about fonts together!