APEX Updates, 4: The Pipeline Problem: Computer Vision and the Limits of Manual Tracing

Written by

Theodore Fitzgerald Avedisian Van de Walle

This inscription, the Poteidaia epigram (IG I³ 1179, CEG 1, no. 10), took me 1.5 hours to trace.
It contains approximately 260 characters, which works out to roughly one character every 21 seconds.

After weeks of tracing letterforms by hand, squinting at jagged facsimiles and smoothing them into curves, I’ve hit a bottleneck. Manual vectorization has given me precision and intimacy with the material, but it’s not sustainable. Each traced letter takes time, care, and a degree of interpretive judgment that can’t be scaled easily. I also have a bit of a hand tremor, which has sometimes made me rely on line straightening and curve smoothing, and those corrections inevitably distort the very features I want to measure, such as symmetry and curvature score. As I move toward building a larger corpus, I’ve had to ask: what’s keeping me from working at the scale this project demands?

The answer, in short, is the tracing pipeline. My current workflow looks like this:

Scan the inscription facsimile (mostly from IG, plus some drawings from my 2022 semester in Athens)
Import into Adobe Fresco on my iPad
Trace each letter manually, often with correction enabled due to an unsteady hand
Export as SVG
Import into a Python program for analysis with OpenCV
Export measured features to JSON
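The last two steps of the workflow above (SVG in, measured features out as JSON) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: it assumes each traced stroke is a `<path>` made only of absolute `M`/`L` commands (real exports usually contain curves), and the feature names (`n_points`, `width`, `height`, `stroke_length`), the sample SVG, and its stroke ids are all made up for the example.

```python
import json
import math
import re
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def polyline_from_path(d):
    """Read the coordinates of a path that uses only absolute M/L commands."""
    numbers = [float(n) for n in re.findall(r"-?\d+(?:\.\d+)?", d)]
    return list(zip(numbers[0::2], numbers[1::2]))


def measure(points):
    """Compute a few simple, hypothetical features for one stroke."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return {
        "n_points": len(points),
        "width": max(xs) - min(xs),
        "height": max(ys) - min(ys),
        "stroke_length": round(length, 3),
    }


def features_from_svg(svg_text):
    """Return one feature record per <path> element in the SVG."""
    root = ET.fromstring(svg_text)
    records = []
    for path in root.iter(SVG_NS + "path"):
        points = polyline_from_path(path.get("d", ""))
        if len(points) >= 2:
            records.append({"id": path.get("id"), **measure(points)})
    return records


# Made-up sample: one vertical stroke and one diagonal stroke.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <path id="stroke-1" d="M 0 0 L 0 4"/>
  <path id="stroke-2" d="M 0 0 L 3 4"/>
</svg>"""

print(json.dumps(features_from_svg(SAMPLE_SVG), indent=2))
```

Keeping the measurement step separate from the parsing step means the same `measure` function could later be fed points proposed by a machine instead of a hand trace.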

This essentially works, but it’s fragile. It depends on my eyesight, my steadiness, and my judgment. More importantly, it doesn’t scale. To move beyond 50 or so well-documented instances, I need to automate at least part of this process.

I’ve brainstormed a few approaches:

Edge detection + curve fitting using OpenCV and Potrace
Image preprocessing to isolate ink
Eventually: Interactive labeling that lets a human confirm or correct bounding boxes and centerlines before full vectorization
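As an illustration of the second idea, isolating ink, here is a minimal pure-Python sketch of Otsu's thresholding method. It assumes dark ink on light paper and a grayscale image stored as a list of rows; the function names and the tiny sample scan are hypothetical. In OpenCV the same criterion is available through `cv2.threshold` with the `cv2.THRESH_OTSU` flag, so in practice this step would likely be a single library call.

```python
def otsu_threshold(pixels):
    """Return the grey level t in 0..255 that best separates dark from light.

    Pixels with value <= t form one class and the rest the other; the chosen
    t maximizes the between-class variance (Otsu's criterion).
    """
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    n_dark, sum_dark = 0, 0
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_light = total - n_dark
        if n_dark == 0:
            continue  # no pixels at or below t yet
        if n_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        score = n_dark * n_light * (mean_dark - mean_light) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def ink_mask(image):
    """Map a grayscale image (rows of 0..255 values) to 1 = ink, 0 = paper."""
    flat = [value for row in image for value in row]
    t = otsu_threshold(flat)
    return [[1 if value <= t else 0 for value in row] for row in image]


# Hypothetical 4x4 scan: light paper (about 210) with a dark 2x2 blob of ink.
SCAN = [
    [210, 210, 210, 210],
    [210, 25, 30, 210],
    [210, 20, 25, 210],
    [210, 210, 210, 210],
]

for row in ink_mask(SCAN):
    print(row)
```

A binary mask like this is the kind of input a tracer such as Potrace expects, which is why ink isolation sits upstream of curve fitting.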

So far, nothing replaces the hand trace. However, I’m refining the steps (normalizing resolution, simplifying contours, and reducing noise) so that a machine can at least propose a first draft. Overcounting has, surprisingly, been a major problem even with hand vectorization. Once I trust the pipeline, I can begin comparing letters in bulk, but not until then.
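One common way to simplify a contour is the Ramer-Douglas-Peucker algorithm, which OpenCV exposes as `cv2.approxPolyDP`. The sketch below is a pure-Python version for illustration only. It assumes that "overcounting" means redundant points along a stroke, which may not be exactly the problem described here, and the shaky sample stroke and the tolerance of 0.25 are invented.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def simplify(points, epsilon):
    """Ramer-Douglas-Peucker: drop points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]

    # Find the interior point farthest from the chord first -> last.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d

    if dmax <= epsilon:
        return [first, last]  # everything in between is within tolerance

    # Otherwise keep that point and simplify both halves recursively.
    left = simplify(points[: index + 1], epsilon)
    right = simplify(points[index:], epsilon)
    return left[:-1] + right


# Invented example: a right-angled stroke drawn with a slightly unsteady hand.
SHAKY = [(0, 0), (0.1, 1), (-0.1, 2), (0, 3), (0, 4), (1, 4.1), (2, 3.9), (3, 4)]

print(simplify(SHAKY, 0.25))  # [(0, 0), (0, 4), (3, 4)]
```

The tolerance is the interpretive part: too small and the wobble survives as extra segments, too large and genuine features of the letterform are flattened away.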

For now, I’m working under this model both as a proof of concept and because I have a hard deadline: an MVP (minimum viable product) is due on April 25th for my final project in my Data Science for Archaeology class (that’s in NYU’s Anthro department, for anyone curious). That constraint is shaping my whole approach: what gets prioritized, what gets cut, and how I balance methodological ideals with the practical demands of execution.

In the next post, I’ll zoom out from metadata and back into morphology—not through computation just yet, but through design. This detour will help us begin to operationalize high-level concepts like complexity and similarity—ideas that seem intuitive at first glance, but quickly reveal their computational thorns. APEX Updates, 5 will explore what I’m calling the “Geometric Mindset”: the tendency toward symmetry, regularity, and visual balance that emerges in early Greek inscriptions. What kinds of shapes did Greek scribes favor? What does it mean to “correct” a letter? And how might a cultural aesthetic of order and legibility leave its mark on the alphabet itself?

Comments

3 responses to “APEX Updates, 4: The Pipeline Problem: Computer Vision and the Limits of Manual Tracing”

October 21, 2025

APEX Updates, 3: Encoding Decisions – To Wake the Dead

[…] the next update, I’ll cover the baseline nature of the pipeline and what makes it so difficult—but also so […]

November 2, 2025

APEX Updates, 12: Teaching the Machine to Read – To Wake the Dead

[…] beneath that ritual lies a bottleneck—the invisible labor of segmentation. Before any computer can analyze a letter, someone has to […]

November 2, 2025

APEX Updates, 13: Designing in the Shadows – To Wake the Dead

[…] thread has been there since the beginning. Encoding Decisions asked how metadata carries ideology. The Pipeline Problem wrestled with the impossibility of full automation. Teaching the Machine to Read turned that […]


To Wake the Dead
