Migrating 90 posts from WordPress to Sanity
We had 90 posts in WordPress and needed them in Sanity as Portable Text. Here's the migration script, the gotchas, and what we'd do differently.

We had years of blog content sitting in WordPress. Some written in Gutenberg, some in the classic editor, some touched by Elementor. All of it needed to land in Sanity as clean Portable Text.
Here is how we did it and what went wrong.
The approach
The migration happened in two phases. Phase one pulled existing posts from the WordPress REST API. Phase two used a separate script to import freshly written articles from JSON files.
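For phase one, a minimal sketch of paging through the REST API might look like the following. The function name `fetchAllPosts` and the injectable `fetchImpl` parameter are illustrative assumptions, not the original script. It relies on the standard `/wp-json/wp/v2/posts` endpoint and the `X-WP-TotalPages` response header.

```javascript
// Sketch: page through the WordPress REST API until every post is fetched.
// `fetchAllPosts` and `fetchImpl` are hypothetical names used for illustration.
async function fetchAllPosts(baseUrl, fetchImpl = fetch) {
  const posts = []
  let page = 1
  let totalPages = 1
  do {
    // 100 is the largest per_page value the API accepts
    const url = `${baseUrl}/wp-json/wp/v2/posts?per_page=100&page=${page}`
    const res = await fetchImpl(url)
    if (!res.ok) throw new Error(`WordPress API returned ${res.status} for ${url}`)
    // WordPress reports the number of pages in the X-WP-TotalPages header
    totalPages = Number(res.headers.get('X-WP-TotalPages')) || 1
    posts.push(...(await res.json()))
    page += 1
  } while (page <= totalPages)
  return posts
}
```

Each returned post carries its rendered HTML in `content.rendered`, which is what the converter later has to deal with.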
Both scripts used `createOrReplace`. This is the key decision. It means you can run the migration repeatedly without creating duplicates. Every document has a deterministic `_id`. Run it once, run it ten times, same result.
```javascript
const doc = {
  _id: docId,
  _type: 'post',
  title: article.title,
  slug: { _type: 'slug', current: slug },
  publishedAt: article.date ? `${article.date}T09:00:00Z` : undefined,
  category: article.category || 'development',
  excerpt,
  body,
  author: { _type: 'reference', _ref: 'author-chris-ryan' },
}

await client.createOrReplace(doc)
```

Idempotent migrations save you when you inevitably find a bug in your transform logic and need to re-run everything.
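The snippet above does not show where `docId` comes from. One way to make it deterministic, sketched here as an assumption rather than what the original script does, is to derive it from the post slug. The helper name `toDocId` and the `post-` prefix are hypothetical.

```javascript
// Sketch: derive a stable Sanity _id from a WordPress slug.
// `toDocId` and the `post-` prefix are assumptions, not the original script.
function toDocId(slug) {
  const safe = String(slug)
    .toLowerCase()
    .replace(/[^a-z0-9_-]+/g, '-') // replace anything outside a-z, 0-9, _ and - with a hyphen
    .replace(/^-+|-+$/g, '') // trim leading and trailing hyphens
  if (!safe) throw new Error('Cannot build a document ID from an empty slug')
  return `post-${safe}`
}
```

The same slug always produces the same `_id`, so a second run replaces the document instead of creating a new one. Dots are deliberately excluded because Sanity treats them as path separators in IDs (as in `drafts.`).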
HTML to Portable Text
The hardest part. WordPress returns rendered HTML. We strip the tags first to get clean text, then convert that into Portable Text, which is a structured array of blocks.
After stripping HTML, our converter splits the clean text on double newlines, then pattern-matches each chunk:
```javascript
function textToPortableText(body) {
  const blocks = []
  // Chunks are separated by blank lines
  const lines = body.split('\n\n')
  for (const line of lines) {
    const trimmed = line.trim()
    if (!trimmed) continue

    // "## Heading" becomes an h2 block
    if (trimmed.startsWith('## ')) {
      blocks.push({
        _type: 'block',
        _key: randomKey(),
        style: 'h2',
        children: [{ _type: 'span', _key: randomKey(), text: trimmed.slice(3).trim(), marks: [] }],
        markDefs: [],
      })
      continue
    }

    // "- item" or "* item" lines become one bullet block each
    if (trimmed.startsWith('- ') || trimmed.startsWith('* ')) {
      const items = trimmed
        .split('\n')
        .map(l => l.trim()) // trim first so indented items still match the marker regex
        .filter(l => l.startsWith('- ') || l.startsWith('* '))
      for (const item of items) {
        blocks.push({
          _type: 'block',
          _key: randomKey(),
          style: 'normal',
          listItem: 'bullet',
          level: 1,
          children: [{ _type: 'span', _key: randomKey(), text: item.replace(/^[-*]\s+/, '').trim(), marks: [] }],
          markDefs: [],
        })
      }
      continue
    }

    // Everything else is a plain paragraph
    blocks.push({
      _type: 'block',
      _key: randomKey(),
      style: 'normal',
      children: [{ _type: 'span', _key: randomKey(), text: trimmed, marks: [] }],
      markDefs: [],
    })
  }
  return blocks
}
```

Every block needs a unique `_key`. Without it, Sanity rejects the document. A simple `Math.random().toString(36).slice(2, 10)` works fine.
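For completeness, here is a sketch of the `randomKey` helper built from that expression, plus a small check for collisions before sending a document. The `findDuplicateKeys` helper is hypothetical and only looks at top-level blocks.

```javascript
// Sketch of the `randomKey` helper the converter calls.
function randomKey() {
  // Up to eight base-36 characters taken from Math.random(); fine for keys, not for secrets
  return Math.random().toString(36).slice(2, 10)
}

// Hypothetical pre-flight check: return any _key that appears more than once.
function findDuplicateKeys(blocks) {
  const seen = new Set()
  const duplicates = new Set()
  for (const block of blocks) {
    if (seen.has(block._key)) duplicates.add(block._key)
    seen.add(block._key)
  }
  return [...duplicates]
}
```

Collisions are unlikely at this scale, but the check is cheap and catches a converter bug that reuses a key.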
Image handling
Featured images needed downloading from WordPress, uploading to the Sanity CDN, then referencing by asset ID.
```javascript
// Assumes `client` is a configured Sanity client with write access
async function uploadImageFromUrl(url, filename) {
  const res = await fetch(url)
  if (!res.ok) return null
  const buffer = Buffer.from(await res.arrayBuffer())
  const asset = await client.assets.upload('image', buffer, {
    filename,
    contentType: res.headers.get('content-type') || 'image/jpeg',
  })
  return asset._id
}
```

The reference format is specific. You cannot just paste a URL. It must be a proper Sanity image reference:
```javascript
function imageRef(assetId) {
  if (!assetId) return undefined
  return { _type: 'image', asset: { _type: 'reference', _ref: assetId } }
}
```

The gotchas
Elementor posts came back from the REST API wrapped in inline styles, layout divs, and data attributes. We had to strip all of that before converting. The WordPress REST API returns rendered HTML, not the raw block content.
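A rough sketch of that cleanup step, assuming the goal is the plain-text shape `textToPortableText` consumes (`## ` for headings, `- ` for bullets, blank lines between chunks). The name `htmlToPlainText` is hypothetical, and regex-based stripping like this is only reasonable for a one-off migration of content you have already inspected.

```javascript
// Sketch: reduce rendered WordPress/Elementor HTML to plain text.
// `htmlToPlainText` is a hypothetical name; this is not a general HTML parser.
function htmlToPlainText(html) {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, '') // drop scripts and styles entirely
    .replace(/<h2\b[^>]*>/gi, '\n\n## ') // mark h2 headings for the converter
    .replace(/<li\b[^>]*>/gi, '\n- ') // one bullet per line
    .replace(/<\/(p|h[1-6]|ul|ol|div)>/gi, '\n\n') // block boundaries become blank lines
    .replace(/<br\s*\/?>/gi, '\n')
    .replace(/<[^>]+>/g, '') // strip remaining tags along with inline styles and data attributes
    .replace(/[ \t]+/g, ' ') // collapse runs of spaces and tabs
    .replace(/ ?\n ?/g, '\n') // trim spaces around newlines
    .replace(/\n{3,}/g, '\n\n') // at most one blank line between chunks
    .trim()
}
```

Because attributes are removed together with their tags, Elementor's layout wrappers disappear without any special handling.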
HTML entities were another problem. `&#8211;` instead of en dashes. `&#8217;` instead of apostrophes. We added a post-processing step to decode these, but missed a few on the first pass.
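A decoder that handles every numeric entity, rather than a hand-picked list, avoids missing cases. This is a sketch under that assumption, not the original post-processing step. The `decodeEntities` name and the short named-entity table are illustrative.

```javascript
// A small, deliberately incomplete table of named entities (illustrative only).
const NAMED_ENTITIES = new Map([
  ['amp', '&'], ['lt', '<'], ['gt', '>'], ['quot', '"'], ['apos', "'"],
  ['nbsp', '\u00a0'], ['ndash', '\u2013'], ['mdash', '\u2014'],
  ['lsquo', '\u2018'], ['rsquo', '\u2019'], ['hellip', '\u2026'],
])

// Sketch: decode numeric (decimal and hex) and a few named HTML entities.
function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X'
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10)
      // Leave out-of-range values untouched instead of throwing
      return code > 0 && code <= 0x10ffff ? String.fromCodePoint(code) : match
    }
    // Unknown named entities are left as-is so they show up in a review
    return NAMED_ENTITIES.has(body) ? NAMED_ENTITIES.get(body) : match
  })
}
```

The replacement runs in a single pass, so `&amp;#8211;` decodes to the literal text `&#8211;` rather than being decoded twice.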
The biggest lesson: always inspect what the API actually returns before writing your parser. I assumed clean HTML. I got Elementor soup.
What we would do differently
I would use a proper HTML-to-Portable-Text library like `@sanity/block-tools` instead of rolling our own parser. Our approach works for simple content: headings, paragraphs, lists. But it does not handle inline formatting like bold or links within paragraphs. For our content that was acceptable. For a content-heavy site with complex formatting, it would not be.
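To show what our parser skips, here is a sketch of the Portable Text shape that inline formatting requires, assuming the default block configuration with a `strong` decorator and a `link` annotation. The `buildBlock` helper and its segment format are hypothetical. A library does this work for you.

```javascript
// Sketch: build one Portable Text block from pre-split text segments.
// `buildBlock` and the { text, bold, href } segment shape are hypothetical.
function buildBlock(segments, makeKey) {
  const markDefs = []
  const children = segments.map((seg) => {
    const marks = []
    if (seg.bold) marks.push('strong') // decorators are referenced by name
    if (seg.href) {
      // Annotations such as links live in markDefs and are referenced by _key
      const def = { _type: 'link', _key: makeKey(), href: seg.href }
      markDefs.push(def)
      marks.push(def._key)
    }
    return { _type: 'span', _key: makeKey(), text: seg.text, marks }
  })
  return { _type: 'block', _key: makeKey(), style: 'normal', children, markDefs }
}
```

One paragraph becomes several spans, and each link adds an entry to `markDefs`. That bookkeeping is why a single-span-per-paragraph converter cannot represent bold text or links.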
The migration moved 90 posts into Sanity cleanly. Re-runnable, deterministic, no duplicates. Good enough.
Next in the series: [Our Tailwind v4 design system and how we handle brand tokens](/digital-insights/our-tailwind-v4-design-system-and-how-we-handle-brand-tokens)

Chris Ryan
Managing Director
17+ years in full-stack web development, most of it leading teams agency-side across e-commerce, CMS platforms, and bespoke applications. Specialises in infrastructure, system integration, and data privacy, with hands-on experience as a Data Protection Officer. Founded Innatus Digital in 2020 to offer the kind of honest, technically-led partnership that he felt was missing from the agency world.