
Before APEX can tell us anything about the shapes of letters, we have to tell it what those letters are—where they came from, when, how confidently we know that, and why we think so. Each traced form in the database is embedded in metadata, and each metadata field is a choice. Or, more accurately, a battleground between competing demands: standardization vs. specificity, precision vs. honesty, completeness vs. clarity.
This post is about how those choices get made.
Metadata is Never Neutral
In paleographic projects, metadata is often treated as scaffolding: a tidy column of site names, dates, scripts, and references that supports the real work of analysis. But metadata is also where the most important theoretical decisions live. Is this letter from “Attica” or from “Athens”? Is it dated to “c. 750 BCE,” to “the mid-8th century,” or to “LG I”? Is the script “Euboean” or “Eubo-Cypriot” or just “Greek”? And what happens when those terms carry more ambiguity than certainty?
Each of these choices folds a story into the dataset. A story about geography, chronology, authority, and classification. And once the data is encoded, those stories begin to function as fixed facts—however uncertain or contingent they might actually be.
The Problem of Uncertainty
Uncertainty is endemic to early alphabetic evidence. Sites are excavated unevenly. Inscriptions are fragmentary. Provenance can be speculative. Paleographic dating relies on a mix of typological comparison and stratigraphy, both of which are prone to revision.
But digital systems, including APEX, are uncomfortable with uncertainty. The data model wants a number, not a range; a site name, not a narrative; a script label, not a conditional. This means encoding decisions often involve collapsing complexity into legibility. The trick is to do this transparently, and to leave enough of a trail that the uncertainties can still be seen.
To that end, I’m building out APEX metadata with fields for:
- Date (range + confidence + reason for): e.g., “750–725 BCE (medium confidence) (archaeological strata)”
- Provenance (text + category): allowing freeform notes alongside structured site-region-place data
- Script classification (source + variant): to distinguish between modern typologies and how scripts were likely conceived by ancient users
- Notes: a catch-all for uncertainty flags, alternate readings, or disputed attributions
This structure still imposes a frame, but it tries to make the cracks visible.
Metadata as Narrative
Every field in APEX is a decision, and every decision reflects a point of view—whether archaeological, linguistic, technological, or historiographic. That makes the metadata not just a scaffold but a narrative: a set of assumptions about what kind of thing the alphabet is, how it moves through space and time, and how we can know what we know. As the project scales, that narrative will shape what kinds of conclusions are possible. So I want to make it visible now, while it’s still under construction, as part of the work.
In the next update, I’ll cover the baseline nature of the pipeline and what makes it so difficult—but also so precise.
Leave a comment