
It contains approximately 260 characters, which works out to a rate of 1 character per 20 seconds.
After weeks of tracing letterforms by hand—squinting at jagged facsimiles and smoothing them into curves—I’ve hit a bottleneck. Manual vectorization has given me precision and intimacy with the material, but it isn’t sustainable. Each traced letter takes time, care, and a degree of interpretive judgment that can’t easily be scaled. I also have a slight hand tremor that has sometimes forced me to rely on line straightening and curve smoothing, which inevitably distorts measurements of features such as symmetry and curvature scores. As I move toward building a larger corpus, I’ve had to ask: what’s keeping me from working at the scale this project demands?
The answer, in short, is the tracing pipeline. My current workflow looks like this:
- Scan the inscription facsimile (mostly from IG, plus some drawings from my 2022 semester in Athens)
- Import into Adobe Fresco on my iPad
- Trace each letter manually, often with stroke correction (line straightening and curve smoothing) enabled to compensate for an unsteady hand
- Export as SVG
- Import the SVG into a Python script for analysis with OpenCV
- Export measured features to JSON (a rough sketch of these last two steps appears below)
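
To make those last two steps concrete, here is a minimal sketch of the measurement stage, assuming each traced letter has already been rasterized from SVG to a PNG with a dark stroke on a light ground. The function name, the feature set, and the crude symmetry proxy are illustrative placeholders, not the measurements the project actually exports.

```python
import json

import cv2
import numpy as np


def measure_letter(png_path: str) -> dict:
    """Measure basic shape features of one traced letter (illustrative only)."""
    img = cv2.imread(png_path, cv2.IMREAD_GRAYSCALE)
    # Otsu threshold with inversion so the stroke is white-on-black for contour finding.
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return {}
    letter = max(contours, key=cv2.contourArea)  # assume the largest blob is the glyph
    x, y, w, h = cv2.boundingRect(letter)
    glyph = binary[y:y + h, x:x + w]
    return {
        "area": float(cv2.contourArea(letter)),
        "perimeter": float(cv2.arcLength(letter, True)),
        "aspect_ratio": w / h if h else None,
        # Crude proxy: overlap between the glyph and its left-right mirror image.
        "lr_symmetry": float(np.sum(glyph & np.fliplr(glyph)) / max(np.sum(glyph), 1)),
    }


if __name__ == "__main__":
    features = measure_letter("alpha_01.png")  # hypothetical file name
    with open("alpha_01.json", "w") as f:
        json.dump(features, f, indent=2)
```

Otsu thresholding plus the single largest contour keeps the sketch short; a real pass would need to handle multi-stroke letters and broken outlines.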
This essentially works, but it’s fragile. It depends on my eyesight, my steadiness, and my judgment. More importantly, it doesn’t scale. To move beyond 50 or so well-documented instances, I need to automate at least part of this process.
I’ve brainstormed a few approaches:
- Edge detection + curve fitting using OpenCV and Potrace (see the sketch after this list)
- Image preprocessing to isolate ink
- Eventually: Interactive labeling that lets a human confirm or correct bounding boxes and centerlines before full vectorization
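
To give the first two ideas some shape, here is a rough sketch of what a preprocessing-plus-Potrace pass might look like: isolate the dark strokes with OpenCV, write out a bitmap, and let the Potrace command line propose a first-draft SVG. The file names and threshold parameters are placeholders, and this isn’t a pipeline I’ve built yet.

```python
import subprocess

import cv2


def propose_trace(scan_path: str, svg_path: str) -> None:
    """Isolate dark strokes with OpenCV, then let Potrace propose a first-draft SVG."""
    img = cv2.imread(scan_path, cv2.IMREAD_GRAYSCALE)
    blurred = cv2.medianBlur(img, 5)  # light denoising; the kernel size would need tuning
    # Adaptive thresholding copes with uneven facsimile lighting better than one global cutoff.
    binary = cv2.adaptiveThreshold(
        blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 51, 10
    )
    # Potrace reads plain bitmap formats (PBM/PGM/BMP), so write the mask out first.
    bitmap_path = scan_path.rsplit(".", 1)[0] + ".pgm"
    cv2.imwrite(bitmap_path, binary)
    # "-s" asks Potrace for SVG output; it traces the black (ink) regions.
    subprocess.run(["potrace", bitmap_path, "-s", "-o", svg_path], check=True)


propose_trace("facsimile_scan.png", "facsimile_draft.svg")  # hypothetical file names
```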
So far, nothing replaces the hand trace. However, I’m refining the steps—normalizing resolution, simplifying contours (point overcounting has been a major problem, surprisingly, even with hand vectorization), and reducing noise—so that a machine can at least propose a first draft; a sketch of the simplification step follows below. Once I trust the pipeline, I can begin comparing letters in bulk, but not until then.
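
For the contour-simplification piece specifically, the Ramer–Douglas–Peucker reduction that OpenCV exposes as approxPolyDP is the obvious candidate. The tolerance below is a guess I’d expect to tune per inscription, not a validated setting.

```python
import cv2


def simplify_contour(contour, tolerance_frac: float = 0.01):
    """Collapse near-redundant points so two tracings of the same letter report comparable counts."""
    # Tolerance scales with the perimeter, so it survives resolution normalization.
    epsilon = tolerance_frac * cv2.arcLength(contour, True)
    return cv2.approxPolyDP(contour, epsilon, True)
```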
For now, I’m working under this model both as a proof of concept and because I have a hard deadline: an MVP (minimum viable product) is due on April 25th for my final project in my Data Science for Archaeology class. (That’s in NYU’s Anthro department, for anyone curious.) That constraint is shaping my whole approach—what gets prioritized, what gets cut, and how I balance methodological ideals against the practical demands of execution.
In the next post, I’ll zoom out from metadata and back into morphology—not through computation just yet, but through design. This detour will help us begin to operationalize high-level concepts like complexity and similarity—ideas that seem intuitive at first glance, but quickly reveal their computational thorns. APEX Updates, 5 will explore what I’m calling the “Geometric Mindset”: the tendency toward symmetry, regularity, and visual balance that emerges in early Greek inscriptions. What kinds of shapes did Greek scribes favor? What does it mean to “correct” a letter? And how might a cultural aesthetic of order and legibility leave its mark on the alphabet itself?