Content in Translation

Accessing new markets the right way

February 11, 2026

AGI v1.0 Will Be Harness + Skills

Consider the trajectory of LLMs so far. Transformers started in Google’s translation research departments, following on from their groundbreaking neural network models, and were first and foremost “language” models. They have since grown able to convince people of their utility, much like a pre-2000s market stall owner who could hold groups of up to 50 people in awe through fast talk, confidence, and a skilled demo of whatever was on sale. The same works on shopping channels and, more recently, YouTube: speak fast and confidently, cut all the dead air, and you have the makings of a popular channel.

January 15, 2026

Why Your Translation Files Keep Breaking Mid-Project

The email lands mid-project: “The translated files won’t open properly.” Or worse: “All the formatting is gone.”

File handling failures are among the most frustrating problems in localization. The translation might be perfect, but if the final files don’t work—if they won’t open in the original application, or they’ve lost their formatting, or the content appears corrupted—the project has failed.

These failures aren’t random. They follow predictable patterns, and understanding those patterns reveals how to prevent them.

January 14, 2026

XLIFF: The Universal Translation Standard Your Team Should Know

Every translation tool has its own internal format. Phrase uses one structure, Lokalise another, memoQ another still. When you need to move content between tools, or work with external translators who use different tools, these incompatibilities create friction.

XLIFF exists to solve this problem.

What XLIFF actually is

XLIFF—XML Localization Interchange File Format—is an OASIS standard designed specifically for exchanging localization data between tools. Think of it as the PDF of translation: a format that preserves content and structure regardless of what tool created it or what tool will use it.
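To make the interchange idea concrete, here is a minimal sketch that writes and reads a tiny XLIFF document using only Python's standard library. It assumes XLIFF 1.2 (version 2.x uses a different element structure), and the function names, the `example.txt` file name, and the sample strings are illustrative assumptions rather than part of any particular tool's API.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the XLIFF 1.2 specification.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def build_xliff(units, source_lang, target_lang, original="example.txt"):
    """Serialize {unit_id: (source, target)} as a minimal XLIFF 1.2 string.

    Hypothetical helper for illustration; real exports carry far more metadata.
    """
    ET.register_namespace("", XLIFF_NS)  # emit a default xmlns, no prefix
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for unit_id, (source, target) in units.items():
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": unit_id})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = source
        ET.SubElement(unit, f"{{{XLIFF_NS}}}target").text = target
    return ET.tostring(root, encoding="unicode")


def read_xliff(xml_text):
    """Parse an XLIFF 1.2 string back into {unit_id: (source, target)}."""
    ns = {"x": XLIFF_NS}
    root = ET.fromstring(xml_text)
    units = {}
    for unit in root.iterfind(".//x:trans-unit", ns):
        source = unit.findtext("x:source", default="", namespaces=ns)
        target = unit.findtext("x:target", default="", namespaces=ns)
        units[unit.get("id")] = (source, target)
    return units


if __name__ == "__main__":
    xml_text = build_xliff({"greeting": ("Hello", "Bonjour")}, "en", "fr")
    print(xml_text)
    print(read_xliff(xml_text))
```

Because both sides agree on the same elements (`trans-unit`, `source`, `target`), the tool that reads the file does not need to know anything about the tool that wrote it.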

January 13, 2026

Stop Losing Formatting in Translation: The Placeholder Revolution

The translated document arrives. You open it. Immediately you see the problems: bold text that’s no longer bold, links that point nowhere, bullet lists that became plain paragraphs.

The translator didn’t make these mistakes. The translation itself is fine. What failed was format handling—the tags and codes that carry formatting got stripped or corrupted somewhere in the process.

This happens constantly in localization workflows, and it’s almost always preventable.

Why formatting breaks

Modern documents store formatting as codes embedded in the text. A bolded word isn’t just “stored as bold”—it’s wrapped in tags:
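In HTML, for instance, a bold word is stored as `<b>word</b>`. One common way to keep such tags safe is to swap them for numbered placeholders before translation and swap them back afterwards. The sketch below illustrates that idea under simplifying assumptions: it only recognizes HTML-style tags, the `{1}`, `{2}` placeholder syntax is an arbitrary choice, and the function names are hypothetical rather than taken from any specific localization tool.

```python
import re

# Matches simple HTML-style opening and closing tags such as <b> or </a>.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")
PLACEHOLDER_PATTERN = re.compile(r"\{(\d+)\}")


def protect_tags(text):
    """Replace each tag with a numbered placeholder; return (text, tags)."""
    tags = []

    def stash(match):
        tags.append(match.group(0))
        return f"{{{len(tags)}}}"  # produces {1}, {2}, ...

    return TAG_PATTERN.sub(stash, text), tags


def missing_placeholders(translated, tags):
    """List placeholder numbers that were dropped during translation."""
    found = {int(n) for n in PLACEHOLDER_PATTERN.findall(translated)}
    return [i for i in range(1, len(tags) + 1) if i not in found]


def restore_tags(translated, tags):
    """Put the original tags back in place of their placeholders."""

    def unstash(match):
        index = int(match.group(1))
        if not 1 <= index <= len(tags):
            raise ValueError(f"unknown placeholder {{{index}}}")
        return tags[index - 1]

    return PLACEHOLDER_PATTERN.sub(unstash, translated)


if __name__ == "__main__":
    protected, tags = protect_tags("Click <b>Save</b> to continue.")
    print(protected)  # Click {1}Save{2} to continue.
    translated = "Cliquez sur {1}Enregistrer{2} pour continuer."
    print(missing_placeholders(translated, tags))  # []
    print(restore_tags(translated, tags))
```

A sketch like this would mis-handle source text that already contains literal `{1}`-style strings, so a production workflow would need an escaping scheme or a placeholder format that cannot collide with real content.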

January 12, 2026

From Scanned PDFs to Translated Docs in Minutes

The document arrives as a scanned PDF. The text you need to translate is locked in images of pages, not extractable text. In a traditional workflow, someone would need to retype the entire document before translation could even begin.

This scenario plays out constantly. Legacy documents, signed contracts, historical records, printed materials that were never digital—all exist only as scans. The need to translate them doesn’t go away because they’re inconveniently formatted.

January 11, 2026

The Post-MT Enhancement Pipeline Your Competitors Don't Have

Machine translation output in 2026 is good. It’s not good enough.

Raw MT from DeepL, Google, or any major engine produces text that’s usually comprehensible but rarely polished. The translations are technically correct but feel off: wrong register, inconsistent terminology, awkward phrasing that no native speaker would use.

Most organizations address this through human post-editing. Translators review MT output, fix problems, and produce final text. This works, but it’s essentially using humans to compensate for systematic MT limitations—the same limitations, segment after segment, project after project.
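Because these limitations are systematic, some of them can be caught automatically before a human ever sees the text. As one illustration, the sketch below flags segments where a glossary term appears in the source but its required translation is absent from the MT output. The function name, the data shapes, and the plain substring matching are all assumptions made for this example; real terminology checks would also need to handle inflection and word boundaries.

```python
def find_term_violations(segments, glossary):
    """Flag segments whose target text lacks a required glossary term.

    segments: list of (source_text, target_text) pairs.
    glossary: dict mapping a source term to its required target term.
    Returns a list of (segment_index, source_term, expected_target_term).
    """
    violations = []
    for index, (source, target) in enumerate(segments):
        source_lower = source.lower()
        target_lower = target.lower()
        for source_term, target_term in glossary.items():
            term_in_source = source_term.lower() in source_lower
            term_in_target = target_term.lower() in target_lower
            if term_in_source and not term_in_target:
                violations.append((index, source_term, target_term))
    return violations


if __name__ == "__main__":
    segments = [
        ("Open the dashboard.", "Ouvrez le tableau de bord."),
        ("The dashboard shows usage.", "Le panneau affiche l'utilisation."),
    ]
    glossary = {"dashboard": "tableau de bord"}
    print(find_term_violations(segments, glossary))
```

Here the second segment is flagged because the engine rendered "dashboard" differently from the first segment, which is exactly the kind of inconsistency a post-editor would otherwise have to spot by hand.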

January 10, 2026

Context-Aware Translation: Why Surrounding Segments Matter

Translation tools divide content into segments—usually sentences. Each segment gets translated independently. Segment 47 is processed without awareness of segments 46 or 48.

This architecture is efficient. It parallelizes well. It maps cleanly to translation memory matching. It’s been the industry standard for decades.

It also produces translations that feel fragmented.
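The difference between isolated and context-aware processing can be sketched in a few lines. The example below splits text into sentence segments and then pairs each one with its neighbors, so that whatever translates segment 47 can also see segments 46 and 48. The naive punctuation-based splitter, the `window` parameter, and the function names are assumptions for illustration, not a description of how any particular tool segments text.

```python
import re


def split_segments(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def with_context(segments, window=1):
    """Attach up to `window` neighboring segments on each side of a segment."""
    enriched = []
    for index, segment in enumerate(segments):
        enriched.append({
            "index": index,
            "before": segments[max(0, index - window):index],
            "segment": segment,
            "after": segments[index + 1:index + 1 + window],
        })
    return enriched


if __name__ == "__main__":
    text = "The printer is offline. Restart it. Then try again."
    for item in with_context(split_segments(text)):
        print(item)
```

In this sample, the middle segment "Restart it." arrives together with "The printer is offline.", which is the information needed to resolve what "it" refers to.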

The coherence problem

Read a professionally translated document closely. Often you’ll notice:

Pronoun inconsistency. “It” refers to different things in adjacent sentences, but the translations don’t maintain the reference chain.