Typography and AI


Why AI types badly by default, and the vocabulary that fixes most of it.

The gibberish problem

Type “modern logo with bold text” into any current image generator. You get something that looks competent at thumbnail size and falls apart on inspection. Glyphs that aren’t quite letters. X-heights that drift across a single word. Weights that swap mid-sentence. Fake serifs grafted onto sans skeletons. AI typography is bad by default.

It’s not the model. The model can render type when given specific instructions about what kind of type. The problem is that “modern bold typography” is not specific. The model resolves the ambiguity toward the median of what its training data labelled “modern bold typography.” When you ask for that, you get generic sans-with-attitude in fake Helvetica.

The fix is knowing the vocabulary. Designers who can name what they want (slab versus grotesque, contrast ratio, optical sizing, tracking) get usable type out of AI. Everyone else gets gibberish set in fake Helvetica.

Every design fundamental follows the same pattern: name the vocabulary, demonstrate the uplift, apply it.

The vocabulary bottleneck

Paul Bakaus’s Impeccable skill, an open-source project that gives AI assistants design vocabulary, has gained significant traction. Its premise is direct: good design prompts require design vocabulary. The model synthesises; vocabulary is the input that shapes the output.

Typography shows this pattern most clearly because its vocabulary is the most fine-grained. There are dozens of named terms (anatomy, classification, optical adjustments) that map directly to controllable visual outcomes. A designer who knows the vocabulary can specify “humanist sans, optical-sized for display, generous tracking, slight contrast” and get something specific. A designer who doesn’t writes “modern bold typography” and gets the average.

Vocabulary takes years to build but minutes to deploy. Once you have it, every prompt becomes more precise. This article gives you the entry-level vocabulary and shows what becomes available with it.

The methodology that uses this vocabulary in production is The Taste-First Method, the Workflow pillar’s anchor article. Vocabulary in this article corresponds to constraint construction in Phase 2 of that methodology: phrases you put in the prompt to produce specific output instead of average output.

Demo 1: logo / wordmark vocabulary uplift

Brief: “Wordmark for a small specialty coffee roaster, founded 2026.”

Fig-01: three-panel comparison. Same brief, different prompts and tools.

Panel A: without vocabulary. Midjourney v7. Prompt: “Modern logo with bold text.” Output: predictable centred wordmark in geometric sans (sans-serif on geometric construction; Futura, Avenir). Tracking too tight, all caps. Looks fine in passing. Falls apart in actual use because the kerning (the spacing between specific letter pairs) is uneven, the K and R have inconsistent cap heights, and the weight shifts mid-word.

Panel B: with vocabulary. Midjourney v7. Same brief. Prompt: “Minimalist wordmark in geometric grotesque, optical-sized for display, generous tracking at 50/1000, baseline-locked, single weight throughout, lowercase set.” Two terms to know: optical sizing is type designed to render correctly at a specific scale (display sizes have tighter spacing); tracking is uniform letterspacing across a run, expressed in 1/1000 em. Output: cohesive wordmark, even spacing, consistent weight, ready to refine in Illustrator.
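The tracking figure in Panel B is plain arithmetic: tracking is expressed in thousandths of an em, and one em equals the type size. A minimal sketch of that conversion follows. The helper name `tracking_to_points` and the 72pt display size are illustrative assumptions, not values from the demo.

```python
def tracking_to_points(tracking_units: float, font_size_pt: float) -> float:
    """Return the extra space added after each glyph, in points.

    Tracking is expressed in 1/1000 em, and one em equals the font size,
    so 1000 units of tracking would add one full em of space.
    """
    return tracking_units * font_size_pt / 1000


# Hypothetical example: the prompt's 50/1000 tracking at an assumed 72pt size.
extra_space = tracking_to_points(50, 72)
print(f"{extra_space:.1f}pt added per glyph")  # prints "3.6pt added per glyph"
```

Knowing the number makes it easier to check a generated wordmark against the prompt once it is open in a vector editor.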

Panel C: the Firefly comparison. Adobe Firefly. Same vocabulary-rich prompt. Output: cleaner glyph rendering than Midjourney; fewer fake-letter artefacts. Tool choice and vocabulary work together. The right model with the right vocabulary beats either alone.

The variable across the three panels is vocabulary, plus (in Panel C) tool choice. Same designer, same minute, specific output instead of average output.

Demo 2: editorial cover vocabulary uplift

Brief: “Magazine cover for an indie design quarterly.”

Panel A: without vocabulary. Midjourney v7. Prompt: “Magazine cover, indie design quarterly.” Output: stock-photo grid layout, generic serif headline, no compelling hierarchy. Could be any magazine.

Panel B: with hierarchy and tracking vocabulary. Same brief. Prompt: “Editorial cover, contrasted serif headline at 96pt with -2 tracking, sub-deck in humanist sans at one-quarter size, set 1.4 leading, dateline in monospace at base, off-axis composition with negative space upper-left, two-colour print feel.” Three terms to know: a contrasted serif is a typeface with high contrast between thick and thin strokes (Bodoni; the Didone classification); a humanist sans is a sans-serif with stroke modulation (Gill Sans, Frutiger); leading is line spacing, given here as a multiplier of type size. Output: specific, voice-bearing, defensible to a client.
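The sizes in the Panel B prompt are relative, so it helps to work out the numbers they imply. The sketch below does that arithmetic. The function name `cover_type_scale` is hypothetical, and it assumes the 1.4 leading applies to the sub-deck, which the prompt leaves open.

```python
def cover_type_scale(
    headline_pt: float, subdeck_ratio: float, leading_multiplier: float
) -> dict[str, float]:
    """Derive sub-deck size and leading from the headline size."""
    subdeck_pt = headline_pt * subdeck_ratio
    return {
        "headline_pt": headline_pt,
        "subdeck_pt": subdeck_pt,
        # Assumption: the leading multiplier is applied to the sub-deck size.
        "subdeck_leading_pt": subdeck_pt * leading_multiplier,
    }


# Values from the prompt: 96pt headline, one-quarter sub-deck, 1.4 leading.
scale = cover_type_scale(headline_pt=96, subdeck_ratio=0.25, leading_multiplier=1.4)
for name, value in scale.items():
    print(f"{name}: {value:.1f}")
```

Under those assumptions the sub-deck lands at 24pt on 33.6pt leading, which gives you concrete values to check the generated cover against.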

This demo is about more than type. It shows how typographic decisions sit inside composition (off-axis), production (two-colour print feel), and hierarchy (3:1:0.5 by visual weight). Type vocabulary becomes a gateway to specifying every other fundamental more precisely.

The companion piece on why this matters as a moat is The Design Advantage. This article covers what to specify; that one covers why specifying it works.

Anatomy as prompt-decoder

Fig-02: typographic anatomy diagram. Standard labels (cap height, x-height, ascender, descender, baseline, mean line, counter, aperture, terminal, stem, bowl, ear, spur) on a single character. Side panel: how each term works as prompt vocabulary.

Anatomy is not jargon. Every label in fig-02 is a lever you can pull to get the model to produce something specific.

A few worked examples:

Specify “elevated x-height” when you need type that holds at small sizes or reads informal. High x-height is why Inter and Lato hold up at small sizes and read informal; low x-height is why Garamond reads literary.
Reach for “open aperture” (Frutiger family) when the brief calls for friendly or wayfinding-readable; closed aperture (Bodoni) when it calls for formal or editorial. The model recognises both descriptors directly.
“Round counters” versus “squared counters” shifts the type’s mood from neutral (Helvetica) to architectural (DIN). Most prompts skip this control; naming it removes a whole category of generic outputs.
Stroke contrast tells the model where in the type-size spectrum to land. High contrast reads display; low contrast reads body. Specify “high stroke contrast” for editorial titles, “low stroke contrast” for running body text.
Optical sizing bundles several adjustments at once, including spacing and stroke contrast. Request “optical-sized for [display / body / caption]” and the model recognises it directly.

Designers who know the anatomy can write shorter, more accurate prompts than designers who don’t. Every term in fig-02 is a constraint you can add to a prompt, and every constraint shrinks the output space toward what you meant, not the average.
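One way to keep these levers within reach is a small lookup from the quality a brief asks for to the anatomy descriptor that produces it. The sketch below uses only the pairings from the worked examples above. The dictionary keys and the helper `descriptors_for` are hypothetical names chosen for illustration.

```python
# Descriptor pairs taken from the worked examples; the keys are
# hypothetical labels for qualities a brief might ask for.
ANATOMY_DESCRIPTORS = {
    "small sizes": "elevated x-height",
    "friendly": "open aperture",
    "formal": "closed aperture",
    "neutral": "round counters",
    "architectural": "squared counters",
    "display": "high stroke contrast",
    "body": "low stroke contrast",
}


def descriptors_for(qualities: list[str]) -> list[str]:
    """Translate brief qualities into anatomy descriptors, skipping unknown ones."""
    return [ANATOMY_DESCRIPTORS[q] for q in qualities if q in ANATOMY_DESCRIPTORS]


# "playful" has no entry, so it is skipped rather than guessed at.
print(", ".join(descriptors_for(["friendly", "small sizes", "playful"])))
# prints "open aperture, elevated x-height"
```

A quality with no entry is a prompt to go and learn the missing term, not a gap to fill with “modern.”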

The gallery: six prompts ranked

Fig-03: same brief, six prompts arranged as a vocabulary gradient. Outputs presented in a ranked grid; each cell labelled with its vocabulary content.

Same brief as Demo 1 (coffee-roaster wordmark) for narrative continuity. Six prompts, gradient from zero vocabulary to ceiling:

“Modern coffee logo.” (Zero vocabulary, baseline failure.)
“Coffee roaster wordmark.” (One unit, minimal uplift.)
“Wordmark, sans-serif, 2026 indie coffee brand.” (Two units, partial uplift.)
“Wordmark, geometric grotesque, all lowercase.” (Three units, meaningful uplift.)
“Wordmark, geometric grotesque, all lowercase, generous tracking, optical-sized for display.” (Five units, high uplift.)
“Wordmark, geometric grotesque, all lowercase, tracking 50/1000, optical-sized for display, baseline-locked, single weight throughout, slight terminal flares on a/g.” (Full vocabulary, ceiling.)

The relationship is monotonic: more terms produce more specific outputs, up to a ceiling. Past that ceiling (roughly seven to ten specific terms for current generators) over-constraint kicks in: the output loses flexibility and the model produces near-identical results across batches.
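The gradient can be treated as a simple assembly step: collect terms, join them, and count them against the over-constraint range. The following is a rough sketch of that idea, not a tool used for the gallery. `PromptDraft`, `build_prompt`, and the threshold constant are hypothetical, and the threshold takes the lower end of the “seven to ten” estimate as an assumption.

```python
from dataclasses import dataclass

# Lower bound of the "roughly seven to ten" range; an assumption, not a measured limit.
OVER_CONSTRAINT_FROM = 7


@dataclass
class PromptDraft:
    text: str
    term_count: int
    may_over_constrain: bool


def build_prompt(terms: list[str]) -> PromptDraft:
    """Join vocabulary terms into one prompt and flag likely over-constraint."""
    cleaned = [term.strip() for term in terms if term.strip()]
    text = ", ".join(cleaned)
    if text:
        # Capitalise the first letter and close with a full stop.
        text = text[0].upper() + text[1:] + "."
    return PromptDraft(text, len(cleaned), len(cleaned) >= OVER_CONSTRAINT_FROM)


# The sixth prompt from the gallery, split into its vocabulary terms.
draft = build_prompt([
    "wordmark", "geometric grotesque", "all lowercase", "tracking 50/1000",
    "optical-sized for display", "baseline-locked", "single weight throughout",
    "slight terminal flares on a/g",
])
print(draft.term_count, draft.may_over_constrain)  # prints "8 True"
print(draft.text)
```

The flag is a nudge to review the prompt, not a rule: the sixth prompt sits inside the range and still represents the ceiling in the gallery.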

The takeaway is the gradient itself. Vocabulary is the only variable. The output ceiling moves with it.

Which models handle type best

No single best generator. Match the tool to the deliverable.

Recraft V4 (launched February 2026) is the only major image generator that outputs native, editable SVG vector files. It comes in four versions: standard (1024px), pro (2048px), SVG, and Pro SVG. For typography specifically, Recraft V4 treats type as integrated design rather than as overlay, which is useful for posters, packaging, and editorial layouts where type and image must feel like one composition. All output is cleared for commercial use.

Adobe Firefly is strong for type with the bonus of being trained on licensed data (relevant for commercial use; see Copyright and Licensing for the IP picture). Use Firefly when type is the deliverable and the work is commercial.

The gpt-image family (gpt-image-1 / 1.5 / 2) is the successor to DALL-E 2 and 3 (which were retired in May 2026). It is currently mid-pack on type but improving rapidly. Reference the family, not a specific version: gpt-image-1 itself carries a deprecation date of 2026-10-23, and its successor 1.5 and the reasoning-augmented gpt-image-2 have already shipped.

Midjourney v7 is strongest on overall image quality and weakest on type. Use it for compositional or illustrative work; switch to Recraft, Firefly, or the gpt-image family when type is the deliverable. The demos in this article use Midjourney v7 for failure-mode comparisons (because Midjourney’s failures are the most legible) and Firefly for “what good looks like” panels (because the contrast is sharper).

There is no best generator in the abstract. There is “best for this brief, given my vocabulary.”
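The matching advice in this section can be written down as a short decision function. This is a deliberately simplified sketch of the guidance above, not a ranking. The function name `candidate_generators` and its three flags are hypothetical, and real briefs will have more dimensions than these.

```python
def candidate_generators(
    type_is_deliverable: bool,
    commercial: bool = False,
    needs_editable_vector: bool = False,
) -> list[str]:
    """Return generators worth trying first, following this section's guidance."""
    if needs_editable_vector:
        # Native editable SVG output is described as unique to Recraft V4.
        return ["Recraft V4"]
    if type_is_deliverable and commercial:
        return ["Adobe Firefly"]
    if type_is_deliverable:
        return ["Recraft V4", "Adobe Firefly", "gpt-image family"]
    # Compositional or illustrative work where type is secondary.
    return ["Midjourney v7"]


print(candidate_generators(type_is_deliverable=True, commercial=True))
# prints "['Adobe Firefly']"
```

The function only narrows the tool choice; the vocabulary in the prompt still does most of the work.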

TGDS Verdict

Typography is the entry-point fundamental. It has the most named vocabulary, the clearest visual feedback (you can see when type is wrong), and the lowest barrier to demonstration. Every other fundamental (colour, hierarchy, composition) works the same way. Vocabulary produces specificity. Once typography clicks, the others click faster.

AI doesn’t do typography. Designers who know typography do typography with AI.

Cert IV in Design starts with typography fundamentals; the vocabulary in this article is the entry-level set students learn in the first weeks. The Cert IV pathway is structured so that vocabulary builds before AI integration, which means students can name what they’re looking at long before they ask AI to produce it for them.

Next vocabulary set: colour. → Colour Theory and AI

Other useful next steps: The Design Advantage (the why behind this article’s what); Build Taste, Generate, Refine (how this vocabulary plugs into a working AI design workflow); and the future cluster articles Visual Hierarchy and AI and Composition and AI.


