Tag: paleography

  • APEX Updates, 10: Glyph to System



    Complexity trends for three letters over 700 years on Euboea,
    from my forthcoming diachronic study.

    When I began APEX seven months ago, I wrote that before theory comes tracing—the act of turning old strokes into structured data. Half a year later, that small act has grown into something larger: a functioning, extensible research environment capable of analyzing thousands of letterforms across hundreds of inscriptions.

    If the last few months have been about proving the analytical potential of APEX, this one has been about deepening its usability—turning it from a powerful engine into a genuine workspace. The latest version, 1.9.1, focuses on the graphical user interface, which I only dreamed of last April. The idea was to give form to the human side of paleographemics: how scholars see, record, and reason through inscriptions.

    The platform now balances two goals that usually pull in opposite directions. It is rigorous enough to handle multilingual, multi-directional corpora across millennia, yet flexible enough to capture interpretive uncertainty and scholarly disagreement.

    1. The Corpus Grows

    APEX now contains its first completed regional corpus: 209 lead curse tablets from 5th-century BCE Styra (Euboea), encompassing 1,857 individual glyphs, each manually traced, annotated, and analyzed through the full APEX pipeline. Alongside this is a parallel dataset of another 99 Euboean inscriptions, spanning roughly eight centuries—from the Archaic through the Hellenistic period—processed through an exploratory workflow still in development.

    Together, these two datasets represent 4,990 glyphs from the island of Euboea alone, making this one of the largest and most detailed regional paleographic corpora currently in existence. This body of material allows APEX not only to test technical scalability but to examine a single region’s graphical traditions across a complete chronological arc.

    Selected gallery of “Most Typical Glyphs by Letter” from Styra lead tablet report.

    Within Styra alone, clear structural tendencies emerge. Across the 1,857 analyzed glyphs, symmetry and complexity show a strong inverse relationship: expected, but now quantified. Some letters, especially theta and, unexpectedly, many iotas, deviate from this pattern, showing that simplicity and circularity were not universal ideals but locally negotiated habits.

    Classical intuitions (decreasing complexity through the Archaic and Early Classical periods, a plateau, then a late stylistic uptick) are confirmed here, but more importantly, they’re now quantified.

    Across the broader eight-century span, early tendencies toward angularity give way to smoother, more balanced forms, though not universally. Delta stands out, evolving from a 2-stroke rounded D-shape to the familiar 3-stroke angular Δ, a shift that mirrors the broader transition from ductus-driven to design-driven writing. Nonetheless, the overall trend broadly confirms long-standing epigraphic intuitions while making them, for the first time, concrete and measurable.

    Taken together, the data suggest that what epigraphers once described qualitatively as a “balanced hand” or “tidy style” can now be measured as a structural principle—evidence that writers (whoever they may be, trained scribes or so-called ordinary people) in 5th-century Styra pursued an underlying visual economy that blurred the boundary between mechanical habit and aesthetic intention.

    2. The Interface Takes Shape

    Version 1.9.1 hinges on a comprehensive Inscription Metadata panel, a modular framework for recording everything an inscription can tell us: provenance, language, writing direction, translation, confidence, and context.

    The (very granular) metadata panel, designed for maximum precision. This will later
    allow high-dimensional, unsupervised machine learning (ML) to be performed.

    Furthermore, there’s an extensive rights and permissions panel just below that. This enables future rights-safe integration with public databases and protects sensitive and restricted information from accidental reproduction, which is critical in heritage preservation and in preventing looting and destruction, especially in conflict zones. Now that I’m pivoting to working with data outside of the public domain, this is a non-negotiable feature, and I hope this is a practice that others replicate when fusing rights-diverse corpora. Below is the model of that.

    Each record can now be broken into sublines, allowing users to specify separate languages and writing directions within a single inscription. This makes it possible to manually encode boustrophedon layouts, alternating left-to-right and right-to-left lines without losing reading order. The same applies to multilingual inscriptions: the user isolates each language’s portion as its own subline and analyzes it individually. True Schlangenschrift, the serpentine style of continuous directional change, remains a technical frontier, but the architecture for handling it is now in place.
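
    The subline idea could be modeled roughly as follows. This is a minimal sketch, not APEX’s actual schema: the Subline class, its field names, and the boustrophedon helper are all hypothetical, and since directions in APEX are entered manually, the helper only illustrates the alternation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subline:
    """One stretch of text with a single language and direction (hypothetical schema)."""
    text: str
    language: str   # e.g. "grc" for Ancient Greek
    direction: str  # "ltr" or "rtl"


def boustrophedon(lines: List[str], language: str, first: str = "rtl") -> List[Subline]:
    """Assign alternating directions to consecutive lines; list order is reading order."""
    flip = {"ltr": "rtl", "rtl": "ltr"}
    if first not in flip:
        raise ValueError(f"unknown direction: {first!r}")
    sublines = []
    direction = first
    for text in lines:
        sublines.append(Subline(text, language, direction))
        direction = flip[direction]
    return sublines


record = boustrophedon(["first line", "second line", "third line"], language="grc")
print([s.direction for s in record])
```

    Because each subline carries its own language and direction, a bilingual inscription would simply be a list of sublines with different language values.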

    Bounding boxes are direction-aware, indexed according to reading orientation, ensuring that extracted visual features align correctly with the direction of writing. Metadata imports from museum APIs are now supported, and flexible fields allow users to enter additional descriptors such as inscription purpose, formula, or archaeological context.
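
    One way direction-aware indexing might work is sketched below, assuming boxes are stored with a left edge and a width in image coordinates. The Box type and the index_by_reading_order function are illustrative names, not APEX’s own.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    label: str
    x: float      # left edge in image coordinates
    width: float


def index_by_reading_order(boxes: List[Box], direction: str) -> List[Box]:
    """Return boxes in the order a reader would meet them on one line."""
    if direction == "ltr":
        # Left-to-right: the leftmost box is read first.
        return sorted(boxes, key=lambda b: b.x)
    if direction == "rtl":
        # Right-to-left: the box with the rightmost edge is read first.
        return sorted(boxes, key=lambda b: b.x + b.width, reverse=True)
    raise ValueError(f"unsupported direction: {direction!r}")


boxes = [Box("B", 40.0, 30.0), Box("A", 0.0, 30.0), Box("C", 80.0, 30.0)]
print([b.label for b in index_by_reading_order(boxes, "rtl")])
```

    The same three boxes yield opposite indices depending on the declared direction, which is what keeps extracted features aligned with the writing.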

    3. Encoding the Human Element

    Each glyph now carries its own metadata through a compact per-glyph panel. Users can record completeness, stroke count, and intersection data, and—critically—can flag alternative readings where forms are contested. The new Scholarship Mode attributes alternate identifications to specific scholars or corpora, creating a visible interpretive genealogy and turning disagreement into structured data.
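
    A per-glyph record with attributed alternative readings might look something like this. It is a sketch under assumptions: the class names, field names, and the 1 to 5 numbering of the completeness tiers are hypothetical, and the scholar named in the example is invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlternativeReading:
    letter: str
    attributed_to: str   # scholar or corpus proposing the reading
    reasoning: str = ""


@dataclass
class GlyphRecord:
    letter: str
    completeness: int    # assumed scale: 1 (trace only) to 5 (fully preserved)
    stroke_count: int
    intersection_count: int
    alternatives: List[AlternativeReading] = field(default_factory=list)

    def __post_init__(self):
        if not 1 <= self.completeness <= 5:
            raise ValueError("completeness must be between 1 and 5")

    def is_contested(self) -> bool:
        """A glyph counts as contested once any alternative reading is recorded."""
        return bool(self.alternatives)


glyph = GlyphRecord("lambda", completeness=4, stroke_count=2, intersection_count=1)
glyph.alternatives.append(
    AlternativeReading("gamma", attributed_to="Editor A (invented)", reasoning="angle of the second stroke")
)
print(glyph.is_contested())
```

    Keeping the alternatives as a list on the glyph itself means a disagreement is stored alongside the primary reading rather than overwriting it.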

    5-tier completeness flags now present.

    What results is a layered model of knowledge. APEX no longer treats the epigrapher’s uncertainty as noise; it considers it data. Each recorded disagreement becomes part of the historical record of how these inscriptions have been read.

    Users can now cite alternative readings and the reasoning for them.

    4. Intelligent Defaults

    Five editable, language-aware dictionaries now exist for the GCELL script cluster, i.e., Greek, Coptic, Etruscan, Latin, and Lydian. There are another five dictionaries on the way for the PASHA branch: Phoenician, Aramaic, Semitic, Hebrew, and Arabian. This capability autofills letter names with their expected stroke and intersection counts, cutting per-glyph processing time by ~70%. These default expectations provide baselines for feature extraction and make visible the subtle divergences that define local or experimental hands.
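
    The autofill behavior can be pictured as a dictionary lookup with overrides. In this sketch the GREEK_DEFAULTS table, its counts, and the autofill function are placeholders for illustration, not the values or names in APEX’s dictionaries.

```python
# Placeholder expectations; the real APEX dictionaries may use different counts.
GREEK_DEFAULTS = {
    "alpha": {"strokes": 3, "intersections": 3},
    "delta": {"strokes": 3, "intersections": 3},
    "iota": {"strokes": 1, "intersections": 0},
}


def autofill(letter, strokes=None, intersections=None, defaults=GREEK_DEFAULTS):
    """Fill in expected counts for a letter and flag any divergence from them."""
    expected = defaults[letter]
    s = expected["strokes"] if strokes is None else strokes
    i = expected["intersections"] if intersections is None else intersections
    return {
        "letter": letter,
        "strokes": s,
        "intersections": i,
        "diverges": s != expected["strokes"] or i != expected["intersections"],
    }


print(autofill("delta"))
print(autofill("delta", strokes=2))
```

    Under these assumed defaults, a 2-stroke delta is flagged as diverging from the expected form, which is the kind of local habit the baselines are meant to expose.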

    The dynamic Greek dictionary.

    5. From Interface to Insight

    The combination of robust metadata, per-glyph fields, and structured dictionaries has turned APEX into a living research environment. A researcher can now import an object from a museum API, record multilingual metadata, define directionality, tag individual glyphs, and export a ready-to-analyze JSON file—all within a single interface.
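
    The final export step might be as simple as the sketch below. The export_inscription function and the field names in the example are assumptions; APEX’s real JSON layout is not documented here.

```python
import json


def export_inscription(metadata: dict, glyphs: list) -> str:
    """Bundle inscription metadata and glyph records into one JSON document."""
    payload = {"metadata": metadata, "glyphs": glyphs}
    # ensure_ascii=False keeps Greek and other non-Latin characters readable.
    return json.dumps(payload, ensure_ascii=False, indent=2, sort_keys=True)


exported = export_inscription(
    {"language": "grc", "direction": "rtl", "provenance": "Styra"},
    [{"letter": "Δ", "strokes": 3}],
)
print(exported)
```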

    Early exploratory notebooks using the full eight-century dataset are already visualizing regional drift and stylistic convergence over time. Though not yet publishable, these models provide a first view of how letterforms move within and between centuries, forming clusters of continuity and outliers of innovation.

    Critically, they also provide examples contrary to certain received wisdom. See the following chart, and note that the p-value of this correlation is 0.37, well above the conventional 0.05 threshold for statistical significance in the social sciences.

    In the eight-century dataset, letter frequency shows only a weak and statistically non-significant relationship to graphical stability. The trend line slopes slightly downward (more common letters like alpha, sigma, and omicron are somewhat more stable), but the effect is far from reliable. This suggests that the conventional linguistic expectation, that frequently used units remain more conservative, does not translate cleanly to letterforms. Here, stability may follow style and medium more than frequency.
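
    For readers who want to see how a correlation and its p-value can be obtained, here is a minimal sketch using a permutation test. This is not necessarily the test behind the figure reported above, and the numbers are placeholders, not APEX data; the function names are my own.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided p-value: how often shuffled data correlates at least as strongly."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= abs(observed):
            extreme += 1
    # The +1 terms count the observed arrangement itself and avoid p = 0.
    return observed, (extreme + 1) / (n_perm + 1)


# Placeholder values only: relative letter frequency vs. a variability score.
frequency = [8.1, 6.9, 6.0, 4.2, 3.3, 2.1, 1.0, 0.4]
variability = [0.30, 0.42, 0.28, 0.35, 0.40, 0.33, 0.45, 0.31]
r, p = permutation_p_value(frequency, variability)
print(f"r = {r:.2f}, p = {p:.3f}")
```

    A p-value above 0.05 from a test like this means the observed slope could plausibly arise by chance, which is the reading given to the chart above.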

    6. Reflection

    The major achievement of this phase is not simply scale—it’s integration. APEX has reached a point where drawing, data entry, and interpretation form a continuous loop. Each inscription is both a record of ancient writing and a record of modern reading.

    With nearly five thousand glyphs from one major region already processed, APEX is beginning to reveal what paleographemics promises: the ability to study writing as a cultural system that can be seen, measured, and compared without losing its human texture.

    Download a PDF of the abridged report (13 pages): A Synchronic Analysis of 5th-Century BCE Lead Tablet Inscriptions from Styra on Euboea

  • APEX Updates, 2: What is the Transmission Problem? A Brief History of My Research Question

    If the first APEX post was about tracing letters, this one is about why those traces matter. Underneath every variant alpha or eccentric epsilon is a deeper question: when, how, and under what conditions did the Greek alphabet emerge from its West Semitic predecessor? This question, which is known in the scholarship as the transmission problem, lies at the core of alphabetic studies, and despite over a century of scholarship, it remains fiercely contested. To map alphabetic transmission is not just to track graphical similarity, but to reckon with how cultures borrow, adapt, forget, and reimagine the systems by which they make language visible.

    At its simplest, the transmission problem asks: When did the Greeks adopt the Phoenician script? But the real terrain is messier. Did the transfer happen once or multiple times? Was it sudden or gradual? Coordinated or ad hoc? Which region of the Greek-speaking world was first? Exactly which Semitic script was the donor—or was there a confluence of models? And what kind of evidence—linguistic, paleographic, archaeological—should we privilege when our sources conflict?

    Historically, the debate has followed disciplinary lines. Scholars trained in Semitic philology and Near Eastern studies tend to favor a high date for the transmission: sometime in the 11th or 10th century BCE, before the traditional Greek Geometric period (in older scholarship, referred to as the “Greek Dark Age”). This camp emphasizes the strong formal similarities between early Greek and Phoenician letterforms, arguing that Greek epichoric scripts most closely resemble Phoenician forms from around 1050 BCE, not the later shapes one would expect if transmission occurred in the 8th century. Joseph Naveh, for instance, in his landmark Early History of the Alphabet (1982), argued that the Greek system must have branched off before major innovations appear in the Phoenician script, such as the angular mem or evolved forms of shin. Naveh saw the Greek alphabet as a snapshot of an earlier Semitic system—evidence, in his view, of early contact and early borrowing.

    On the other side of the debate, Classicists and archaeologists tend to argue for a low date, favoring the 8th century BCE. Their reasoning draws primarily from stratified archaeological contexts: the earliest securely datable Greek inscriptions, such as the Dipylon oinochoe and Nestor’s Cup from Pithekoussai, belong to the mid-to-late 8th century. Rhys Carpenter was among the earliest and most forceful voices in this camp. In a 1933 article, he wrote that “the argumentum a silentio grows every year more formidable and more conclusive,” referring to the continued absence of any Greek alphabetic inscriptions predating the eighth century (“The Antiquity of the Greek Alphabet,” AJA 37 [1933]: 8–29, at p. 27). For Carpenter, the lack of material evidence was not a gap to be explained away, but itself a powerful datum: if earlier use had existed, we would likely have found traces by now.

    This school is generally skeptical of typological comparison, pointing out that letterforms evolve unevenly and can be conservative in certain contexts. Archaeological absence, while never conclusive, is taken seriously—especially when paired with the sudden, near-simultaneous appearance of inscriptions across disparate sites in the 8th century, suggesting a relatively rapid uptake of a recently acquired script. Later scholars, such as Barry B. Powell, built on this foundation. In Homer and the Origin of the Greek Alphabet (1991), Powell controversially argued that the Greek alphabet was deliberately invented for the purpose of recording Homeric verse, dating the invention to around 750 BCE. Though widely criticized for its teleology and lack of evidence for such a top-down design, Powell’s theory exemplifies the kind of interdisciplinary crossfire that defines this problem: where linguistic function, archaeological data, and cultural ideology all collide.

    Roger D. Woodard, in Greek Writing from Knossos to Homer (1997), pushed back against Powell while still supporting a relatively late date. Woodard views the alphabet’s adaptation as a process shaped not only by contact with Phoenician traders but also by internal Greek developments—especially the memory of Linear B and broader shifts in literacy practices. He emphasizes the complex interplay between tradition and innovation, seeing the Greek vowel system as a structural solution that could only emerge in a linguistic environment receptive to phonological precision.

    The question remains open, but APEX offers a different kind of approach. Rather than anchoring the debate to a single origin point, I focus on regional trajectories and graphical evidence: how letterforms vary, travel, and settle. If the Semitic party line reads the Greek alphabet as a photograph of Phoenician forms from 1000 BCE, and the archaeological model sees it as an emergent public tool of the 8th century, then I want to understand how specific graphemes move through space and time. Which forms remain stable across centuries? Which mutate rapidly? And what can that tell us about the process of transmission, rather than the moment of origin?

    The most immediate goal of the APEX project is to evaluate whether the Greek letterforms do, in fact, most closely resemble the Semitic models from around 1000 BCE, as the high-date camp maintains, or whether their nearest parallels lie elsewhere in the Phoenician typology. The intention is to move beyond qualitative comparisons and scholarly intuition, toward a quantitative, statistically grounded assessment of letterform similarity. By measuring and modeling these visual relationships systematically, APEX aims to provide a more objective foundation for dating the moment of greatest resemblance between the Greek and Phoenician scripts.
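
    To make the idea of a “nearest parallel” concrete, here is a minimal sketch that ranks Phoenician period samples by distance from a Greek letterform. Everything in it is assumed for illustration: the three-number feature vectors, the period labels, the choice of Euclidean distance, and the function names. It does not reflect any APEX result or the metrics the project will actually use.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def rank_periods(greek, phoenician_by_period):
    """Return (period, distance) pairs, nearest parallel first."""
    distances = {
        period: euclidean(greek, vector)
        for period, vector in phoenician_by_period.items()
    }
    return sorted(distances.items(), key=lambda item: item[1])


# Invented vectors, e.g. (stroke count, symmetry score, intersections).
greek_form = [3.0, 0.5, 2.0]
phoenician_forms = {
    "period A": [5.0, 0.9, 4.0],
    "period B": [3.0, 0.6, 2.0],
    "period C": [1.0, 0.1, 0.0],
}
ranking = rank_periods(greek_form, phoenician_forms)
print(ranking[0][0])
```

    With real measurements, the ranking across many letters, rather than any single letter, would be what bears on the dating question.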

    Rather than jumping straight into letterform similarity metrics, though, the next update will take a detour—one that’s no less crucial. Before the vectors can speak, they must be named, contextualized, and organized. APEX Updates, 3: Encoding Decisions will explore how I’m structuring the metadata that surrounds each traced letter: what counts as “context,” how information is tagged, and why every dataset is also a narrative. As it turns out, deciding how to describe a letter may be just as revealing as deciding how to compare it.

  • Tools of the Trade, 1: Epigraphy: The Local Scripts of Archaic Greece by L.H. Jeffery

    Jeffery’s summary table of all epichoric scripts at the end of LSAG.
    It is foundational for any work on early regional Greek scripts.

    There are very few books I consider truly irreplaceable in my research. Lilian H. Jeffery’s The Local Scripts of Archaic Greece is one of them. First published in 1961 and revised in 1990 with A.W. Johnston, this book remains the reference for regional variations in the Greek alphabet during the archaic period. It’s where I first learned to read epichoric inscriptions with the eye of a paleographer rather than a Classicist alone.

    The book is very hard to find, and I only got my copy at an even remotely affordable price after months of scouring secondhand sellers. While copies still circulate in libraries and on the used book market, I wanted to make it more accessible to others working in this area. After some diligent hunting, I found it on the Internet Archive. You can read or download it here:
    The Local Scripts of Archaic Greece (1990 ed.) – Internet Archive

    Jeffery’s study remains foundational for any work on early Greek writing—not just in Athens or Ionia, but across the full spectrum of regional scripts: Corinthian, Euboian, Attic-Boeotian, Cretan, Cycladic, and others. It includes extensive commentary, maps, and an invaluable inscriptional catalogue organized by region, with drawings and typographic transcriptions. The 1990 revision added important corrections, expanded references, and additional illustrative material. For those of us studying alphabetic transmission, especially the Phoenician-Greek interface or the evolution of letterforms over time, this book is indispensable.

    What makes Local Scripts especially useful is that it bridges the gap between paleography, archaeology, and linguistics. Jeffery doesn’t just chart when and where a particular variant of alpha or epsilon shows up—she explains what those variations might imply for chronology, influence, and contact. And although her typology has been revised and challenged in places (especially with the discovery of new inscriptions), her system remains a critical baseline for almost every study that’s come after.

    Whether you’re interested in early Greek literacy, the transmission of the alphabet, the sociopolitical meaning of epigraphy, or just want to be able to tell the difference between Laconian and Euboian chi, this is the book to start with. I hope having it freely available will be helpful to others navigating this fragmentary and fascinating material.

    Do you have other resources you pair with Jeffery? I’d love to hear what we can supplement LSAG with.