
Why Scratch doesn't use MCP servers

Why Scratch syncs files instead of shipping an MCP server: we timed Claude Code on 500 Airtable records both ways, and files won every task, by roughly 1.7x to 4.6x.

Ryder Ziola · ai, airtable

tl;dr

The question we get most often about Scratch: why download content to files when every platform ships an MCP server now? So I timed the two approaches head to head. Same agent (Claude Code with Opus 4.8), same 500 Airtable records, three bulk-editing tasks, run once through Airtable's first-party MCP server and once against local files synced by Scratch. Files won every task, by roughly 1.7x to 4.6x. And on the tasks that needed the whole corpus at once, the MCP agent's winning move was to rebuild a worse version of Scratch on the fly.

Why not MCP?

Scratch is a tool for using AI to safely edit your content in Shopify, WordPress, HubSpot, Notion, Airtable, and friends. It syncs your content down to plain files on your disk, where your agents (Claude, Codex, Cursor, whichever) have direct access to work on it. You review their edits as diffs, then publish back.

When we explain that, the natural follow-up is: why not just use an MCP server? It's a fair question. We experimented with both approaches, and we still think MCP is fine for plenty of things, like one-off edits in low-risk tools. But for bulk content work, MCP on its own has three shortcomings that no amount of prompting fixes.

1. Speed. An MCP server is a straw. Every record flows through a tool call into the agent's context window, one sip at a time, and the agent has to re-fetch your content every time it picks the task back up. Files are just there. Reading them costs nothing, and they're still there in the next session.

2. Standard tools beat bespoke tools. If your task maps exactly onto what the MCP server supports, it works well. The moment you try something novel, you hit a wall. Want a fuzzy search? The MCP server had better ship a search function that supports it. Once your data is on disk, your agent has grep, jq, sed, git, and a Python interpreter, and it can write its own scripts when none of those fit. That's decades of Unix tooling versus whatever endpoints the vendor shipped.

3. Scale. MCP on its own can only work with what fits in the context window. Past a few hundred records, the agent cannot hold your content in its head. This improves when the MCP runs inside a tool with disk access, like Claude Code, because the agent can stash API responses in files. Keep that workaround in mind. It comes back later.
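To make point 2 concrete, here is a minimal sketch of the kind of throwaway script an agent can write once records are on disk: a fuzzy title search built from the Python standard library. It assumes each record is a JSON file with a `fields.Title` key (the transcripts below mention `fields.Title`; the rest of the layout, the folder argument, and both function names are hypothetical).

```python
import difflib
import json
from pathlib import Path


def load_titles(folder):
    """Map each record's title to its file path.

    Assumes a layout like {"fields": {"Title": "..."}} per file (an assumption).
    If two records share a title, the later file wins.
    """
    titles = {}
    for path in sorted(Path(folder).glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        title = record.get("fields", {}).get("Title")
        if title:
            titles[title] = path
    return titles


def fuzzy_search(titles, query, limit=5, cutoff=0.6):
    """Return (title, path) pairs whose titles loosely match the query, best first."""
    lowered = {title.lower(): title for title in titles}
    hits = difflib.get_close_matches(query.lower(), list(lowered), n=limit, cutoff=cutoff)
    return [(lowered[hit], titles[lowered[hit]]) for hit in hits]
```

Nothing here depends on what the vendor's API happens to expose; a misspelled query such as "narwal" can still find the narwhal post because the matching runs locally.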

You could argue all of this in the abstract forever, so I ran the comparison instead.

The setup

The dataset is 500 synthetic blog posts (about whales) living in Airtable. On one side, Claude Code talks to Airtable's first-party MCP server. On the other, Scratch syncs the same content to local files, and Claude Code works on the files. The agent is the same on both sides: Claude Code running Opus 4.8, with the same prompts.

Two honest caveats before the numbers. This isn't a perfect bake-off. The files side starts with the content already downloaded, which is a real advantage, though a smaller one than you might guess: the initial sync of all 500 posts takes about ten seconds over Airtable's REST API. The files side also benefits from the schema hints Scratch writes alongside the records. On the other side, the MCP gets the strongest setup available: an agent with a shell and a disk. Running the same MCP from a plain chat window would do worse, for reasons that become obvious in task 2.

Task 1: standardize title casing

The blog posts in our corpus have titles that are a mix of title case and sentence case. We want them all in title case.

Prompt: "Standardize all the blog post titles to use title casing."

This is close to the best case for MCP. The task splits into independent batches, and no record needs to know about any other record, so nothing has to fit in context all at once.

Result: The MCP agent took 6:14 and modified 161 titles. The files agent took 3:41 and modified 129.

The gap in counts is the interesting part. Title case isn't one convention: AP capitalizes four-letter prepositions like "with," while Chicago lowercases them. The MCP agent picked AP rules up front and applied them universally, recasing 32 titles that already matched the corpus's dominant style. The files agent did something different: before writing its script, it scanned the existing titles and derived the house convention empirically (Chicago, as it happens), then changed only the titles that didn't conform. Given the ambiguity of the prompt, you can argue either result is correct. What struck me is that nobody told the files agent to check. Having the whole corpus a grep away made checking cheap, so it checked.
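Deriving a convention from the corpus can be sketched in a few lines. This is not the agent's actual script (the transcript doesn't show it), just one way to do it: let the titles that already look title-cased vote on which short words stay lowercase mid-title, then apply the result. The candidate list and both function names are illustrative assumptions.

```python
from collections import Counter

# Hypothetical candidate list; the corpus decides which of these stay lowercase.
CANDIDATES = {"a", "an", "the", "and", "or", "of", "to", "in", "on", "for", "with", "from", "so"}


def derive_small_words(titles, candidates=CANDIDATES):
    """Return the candidates that the corpus lowercases mid-title more often than not."""
    lower, upper = Counter(), Counter()
    for title in titles:
        words = title.split()
        # Skip sentence-case titles: they lowercase nearly everything and would skew the vote.
        if sum(w[0].isupper() for w in words) < len(words) / 2:
            continue
        for i in range(1, len(words) - 1):  # first and last words carry no signal
            if words[i - 1].endswith(":"):  # neither does the first word of a subtitle
                continue
            key = words[i].strip(",:").lower()
            if key in candidates:
                counter = lower if words[i][0].islower() else upper
                counter[key] += 1
    return {w for w in candidates if lower[w] > upper[w]}


def title_case(title, small_words):
    """Title-case one title, keeping the derived small words lowercase mid-title."""
    words = title.split()
    out = []
    for i, word in enumerate(words):
        starts_subtitle = i > 0 and words[i - 1].endswith(":")
        edge = i == 0 or i == len(words) - 1 or starts_subtitle
        if not edge and word.strip(",:").lower() in small_words:
            out.append(word.lower())
        else:
            # Capitalize each half of a hyphenated compound; leave the rest of each part alone.
            out.append("-".join(part[:1].upper() + part[1:] for part in word.split("-")))
    return " ".join(out)
```

With a corpus that writes "Day and Night with the Orca" and "Why the Narwhal Dives So Deep", the vote keeps "with" lowercase and leaves "So" capitalized, which is the same pair of decisions the files agent arrived at.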

Files transcript (3:41)

I'll start by understanding the workspace structure and how blog post records are organized.

I'm on a host Mac (Darwin), so I can use scratchmd normally. Let me explore the folder structure to find the blog posts.

I must read the schema before editing records. Let me read the schema docs and a sample record, and count the total files.

500 files. Let me find the schema file for this table and read a sample record to understand the title field.

Let me read the schema to confirm the Title field is writable, then collect all current titles.

The Title field is writable plain text. I'll edit only fields.Title, leaving slug and body untouched. Let me extract all 500 titles to see the variety before writing a title-casing script.

Many are already title-cased but inconsistently, and a good number are sentence case. I'll derive the intended small-word convention empirically from the existing titles before writing the script.

The convention is clear: lowercase small words (the, of, and, to, with, a, from, for, on, in, etc.), capitalize everything else including both halves of hyphenated compounds and proper nouns. Let me check the editing-data doc to follow the right procedure, then write the script.

I'll edit the Title field in place with a minimal targeted replacement (preserving all other formatting). First a dry-run to preview the changes.

The conversions look correct and consistent. Let me review the remaining changes to confirm nothing odd.

The conversions are correct except for one idiom: "Dives So Deep". Here "so" is an adverb (modifying "deep"), and the established titles consistently capitalize it. My small-word list is incorrectly lowercasing it. Since no title uses "so" as a conjunction, I'll remove it from the small-word set.

Now "Dives So Deep" is handled correctly, and the count dropped to 129 (the already-correct ones no longer flagged). Let me apply the changes.

Now let me verify the edits landed correctly and that only the Title line changed (body/slug untouched) by spot-checking a file and confirming no titles still need changes.

All 500 titles now conform and the change is idempotent. Let me confirm via git that only Title lines were touched.

Done. All blog post titles are now standardized to title case.

Summary

I standardized the Title field across all 500 blog post records in Airtable/Large scale testing/Blog posts/. 129 titles were updated; the other 371 were already correct.

The rules applied (verified empirically against the already-correct titles in the set):

  • Capitalize the first word, last word, and every significant word.
  • Lowercase small words mid-title: articles, conjunctions, and short prepositions (the, of, and, to, with, a, from, for, on, in, etc.).
  • Capitalize the word after a colon, e.g. ": a Deep Dive" → ": A Deep Dive".
  • Capitalize both halves of hyphenated compounds: Long-finned → Long-Finned, Melon-headed → Melon-Headed.
  • Fix proper nouns: north Atlantic → North Atlantic, risso's → Risso's.
  • Kept the established Dives So Deep idiom (here "so" is an adverb, not a conjunction, and the existing titles capitalize it).

Safety checks:

  • Only the Title line changed in each file — git diff confirms 0 non-Title lines modified across all 129 files (slugs, bodies, and formatting untouched).
  • Re-running the pass now reports 0 of 500 would change, so it's consistent and idempotent.

Next step — review and publish:

Open Blog posts folder in Scratch Desktop

A few intentional casing decisions you may want to glance at when reviewing: capitalizing the word after a colon (: A Deep Dive), and lowercasing from mid-title (Coastal Upwelling, from the Surface Down). Let me know if you'd prefer a different convention for either and I'll adjust.

Note: I only changed casing — I left source wording as-is, including a couple of pre-existing grammar quirks like "Why Phytoplankton Matter for Ocean Life." Say the word if you'd like those cleaned up too.

MCP transcript (6:14)

I'll help you standardize the blog post titles. Let me first find the relevant Airtable base and table.

I'll search for the blog post base.

No match. Let me list all bases.

There are three blog-related tables — two are backups. I'll work on the main Blog posts table (tbloGYbFMeJpEB6f8). Let me fetch its titles.

The table is large. Let me check the record count and extract titles using the saved file.

500 records, all fetched. Let me extract the titles and identify which aren't in title case.

Titles came back empty — the field may be keyed differently. Let me inspect a record's structure.

Many are already title case, but several are sentence case (e.g. "Understanding unihemispheric sleep", "The dive reflex: a deep dive"). Let me scan for special cases (colons, hyphens, acronyms) before writing the transform.

No acronyms or ampersands — just colons and hyphens to handle. Let me write a title-casing transform, apply it, and see which records actually change.

143 changes — but I noticed a problem: my list lowercases "so", turning already-good titles like "Why the Orca Dives So Deep" into "...so Deep". Here "so" is an adverb, and the dataset clearly capitalizes it. Let me align with the widely-used standard title-case minor-word set (which doesn't downcase "so"/"with"/"from") and re-check.

The transform now looks correct: "Dives So Deep" is preserved, possessives ("Beginner's", "Commerson's") intact, hyphenated compounds title-cased ("Long-Finned", "Melon-Headed"), and subtitles after colons capitalized. 161 records need updating. Let me generate the update batches (max 50/request).

Batch 0 looks correct. Now applying the updates batch by batch.

Batch 0 done (50 records). Now batches 1–3.

All three remaining batches look correct. Applying them.

Batch 1 done. Now batch 2.

Batch 2 done. Final batch 3 (11 records).

Done. All 161 blog post titles that weren't already in title case have been standardized in the Blog posts table (tbloGYbFMeJpEB6f8) of the Large scale testing base. The other 339 of the 500 records were already correctly title-cased and left untouched.

What the transform did:

  • Capitalized major words; kept minor words lowercase (articles, conjunctions, short prepositions like a, an, the, of, and, to, in, on) unless they were the first/last word or the start of a subtitle after a colon.
  • Capitalized the first word of subtitles after colons, e.g. "The dive reflex: a deep dive" → "The Dive Reflex: A Deep Dive".
  • Title-cased both halves of hyphenated compounds: "Long-finned" → "Long-Finned", "Melon-headed" → "Melon-Headed".
  • Preserved possessives and contractions — "Beginner's", "Commerson's", "Didn't".

Two judgment calls worth flagging:

  • I treated with and from as capitalized (AP-style, 4+ letters), so "Day and Night With…" and "…From the Surface Down" are now consistent across all records. If you'd prefer Chicago-style lowercase (with/from), I can flip those.
  • I kept "so" capitalized in "Dives So Deep" (it's an adverb there, and the dataset already used that convention).

A note: I only touched the main Blog posts table — I left "Blog posts v1 (backup)" and "Blog posts 500 (backup)" alone since they're backups. Let me know if you want those standardized too.

Task 2: fix broken internal links

The bodies of our blog posts are full of internal links. Buried in the 500 posts are five broken ones that reference slugs that don't exist. We run into this in the real world all the time.

Prompt: "Some of my blog posts have broken internal links. They link to other posts in /blog/ that don't actually exist. Compare the slugs to the existing blog posts and fix them."

This one is trickier because the agent can't work in isolated batches. The agent has to build the full list of valid slugs, then check every link in every post body against it. The whole corpus is in play at once.
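The shape of that check is simple once everything is in one place. A minimal sketch, assuming the posts have already been loaded into a dict of slug to markdown body (the loading step, the function names, and the 0.8 similarity cutoff are all assumptions, not something either agent is shown doing):

```python
import difflib
import re

# Markdown links of the form [text](/blog/<slug>), tolerating a trailing #anchor or ?query.
LINK_RE = re.compile(r"\]\(/blog/([^)\s#?]+)[^)\s]*\)")


def find_broken_links(posts):
    """posts maps slug -> markdown body. Returns (source_slug, missing_target) pairs."""
    valid = set(posts)
    broken = []
    for slug, body in posts.items():
        for target in LINK_RE.findall(body):
            target = target.rstrip("/")
            if target not in valid:
                broken.append((slug, target))
    return broken


def suggest_fix(target, valid_slugs, cutoff=0.8):
    """Return the closest existing slug, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(target, sorted(valid_slugs), n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

The important property is that `valid` covers every post before any link is judged, which is exactly what rules out working in isolated batches.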

Result: Both approaches succeeded, eventually. The files agent found and fixed all five links in 1:52. The MCP agent took 8:36.

What the MCP agent did to get there is the most revealing part of the whole experiment. The full set of records doesn't fit in its context, so it built an ad-hoc version of Scratch: it saved the raw API responses to local files, then processed them with jq. It stalled partway through on a bug in its improvised tooling and had to retry. Then came the write path: Airtable's API wants the full field value, so to fix one link the agent had to round-trip each entire post body, carefully checking trailing characters to avoid corrupting anything, one record at a time.

It's worth dwelling on what made the rescue possible: Claude Code's shell and disk. The "MCP approach" that completed this task was MCP for transport plus local files for the real work. An MCP connected to a plain chat interface, with no disk to stash responses on and no shell to process them with, has no escape hatch when the corpus outgrows the context window.

Files transcript (1:52)

I'll start by exploring the workspace structure to find the blog posts and understand how they're organized.

I understand the structure now. Body fields contain markdown links like /blog/<slug>. Let me extract all internal links and find which point to non-existent posts.

Only 5 broken links, all with a -2024 suffix. Let me find the correct existing slugs by matching against the link text/title.

All 5 broken links are simply the correct slug with a spurious -2024 suffix — and in every case the un-suffixed slug exists exactly. Let me locate the containing files and fix each one.

Now I'll fix each broken link by removing the spurious -2024 suffix.

I'll fix all five with a precise replacement script.

Now let me verify there are no remaining broken /blog/ links anywhere, and that the edited files are still valid JSON.

All fixed and verified. Here's what I found and did.

Summary

Across all 500 blog posts, internal /blog/<slug> links were checked against the actual set of post slugs. 5 broken links turned up — each pointed to a slug with a spurious -2024 suffix that doesn't exist, while the un-suffixed slug exists exactly. I removed the suffix in each:

| File | Broken link | Fixed |
| --- | --- | --- |
| coastal-upwelling-and-the-whales-that-depend-on-them.json | anatomy-of-the-gray-whale-2024 | anatomy-of-the-gray-whale |
| what-the-fin-whale-really-eats.json | why-the-narwhal-dives-so-deep-2024 | why-the-narwhal-dives-so-deep |
| everything-we-know-about-extreme-longevity-in-whales.json | a-beginners-guide-to-whale-skin-and-barnacles-2024 | a-beginners-guide-to-whale-skin-and-barnacles |
| everything-we-know-about-cetacean-vision-and-hearing.json | lanternwhales-and-the-long-night-2024 | lanternwhales-and-the-long-night |
| everything-we-know-about-vocal-dialects-in-pods.json | everything-we-know-about-baleen-growth-and-wear-2024 | everything-we-know-about-baleen-growth-and-wear |

A re-scan confirms 0 remaining broken internal links, and all edited files are still valid JSON.

You can review the diff in the Scratch desktop app: Open the Blog posts folder in Scratch before publishing.

MCP transcript (8:36)

I'll start by finding your Airtable base and the "Blog posts" table.

I'll find the base with your "Blog posts" table.

The "Blog posts" table is tbloGYbFMeJpEB6f8. Let me pull all records with their slugs and bodies.

This is a large dataset. Let me work with the saved file using jq rather than loading it all into context.

500 records. Let me extract all valid slugs and all /blog/ links found in bodies.

500 unique slugs, 980 links. Now let me find which links point to non-existent slugs.

Five broken targets, all with a -2024 suffix. Let me find the correct matching slugs.

All five map cleanly to a valid slug by dropping the -2024 suffix. Now let me find the affected records.

All 5 affected records found. Let me generate corrected bodies (replacing each broken /blog/ target with its valid slug) and verify the result.

The index(.id) lost record context. Let me bind the id first.

All five corrected, no -2024 blog links remaining. Now let me build the update payload and apply it.

The bodies are large, but I need to send the full field value per record. Let me handle each record individually. First, let me write each corrected body to its own file and confirm the diff is truly minimal (only the broken link changed).

Each record has exactly one changed line — the -2024 removed from one link. Now I'll apply the updates. I need to send each full corrected body. Let me read each corrected body and update record by record.

Let me check the exact trailing characters of the body values so I reproduce them precisely.

Each body ends with )\n\n. I'll update record by record, then verify each by re-fetching. Starting with the first:

Update 1 succeeded and the stored value matches intent (link now /blog/anatomy-of-the-gray-whale). Record 2:

Record 3:

Record 4:

Record 5 (the last one):

All five updated. Now let me verify by re-fetching just these five records and confirming no broken /blog/ links remain anywhere.

All five verified — every /blog/ link now points to a real post, with no -2024 suffixes remaining.

I compared every /blog/ link in your post bodies against the actual slugs of the 500 posts in the Blog posts table. Out of 980 links, exactly 5 were broken — each pointed to a slug with a spurious -2024 suffix that doesn't exist. The fix in every case was the same valid post minus that suffix:

| Post (where the broken link lived) | Broken link | Fixed link |
| --- | --- | --- |
| Coastal upwelling and the whales that depend on them | /blog/anatomy-of-the-gray-whale-2024 | /blog/anatomy-of-the-gray-whale |
| Everything we know about vocal dialects in pods | /blog/everything-we-know-about-baleen-growth-and-wear-2024 | …-wear |
| Everything we know about extreme longevity in whales | /blog/a-beginners-guide-to-whale-skin-and-barnacles-2024 | …-barnacles |
| What the fin whale really eats | /blog/why-the-narwhal-dives-so-deep-2024 | …-so-deep |
| Everything we know about cetacean vision and hearing | /blog/lanternwhales-and-the-long-night-2024 | …-long-night |

Each edit touched only the one broken link — the rest of every post body is unchanged — and I re-fetched all five records afterward to confirm the links now resolve to real posts. The other 975 links were already valid.

Task 3: find fuzzy duplicates

Our dataset has 10 pairs of duplicate posts hiding in it. Each pair shares a title, and the bodies are nearly the same, with a couple of typo fixes between copies. This happens in the real world too, usually when a CMS accidentally publishes a revision at a new slug.

It's an interesting challenge because the matching is fuzzy and the full post bodies are large. I asked for similarity scores on purpose, to force the agent to quantify its confidence instead of hand-waving.

Prompt: "I think we have some blog posts that appear more than once at different slugs, due to past publishing errors. Find the potential duplicates. Show me the titles and score the likelihood they are duplicates."

Result: Both runs found exactly the 10 planted pairs, scored them around 99%, and correctly declined to flag the trap cases: the corpus is full of same-topic posts with similar titles ("Understanding Baleen Filter Feeding" vs. "Baleen Filter Feeding, Demystified") that are genuinely different articles, and neither agent confused them for duplicates. The files agent finished in 2:10, the MCP agent in 3:46.

This was the closest race of the three, and the reason is instructive: it's a read-only task. There's no write path, and the write path is where MCP bleeds the most time, with batch limits, full-field round-trips, and re-fetching to verify. The reading half of the MCP run looked exactly like task 2: pull everything down, save it to local files, analyze with jq and scripts. By the third task the pattern was impossible to miss. Whenever a task needs the whole corpus, the agent's first move is to get the data out of the API and onto disk, where its real tools work.
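For a sense of what "analyze with scripts" means here, the core of a duplicate check fits in a short function. This is a sketch under stated assumptions, not either agent's code: posts are plain dicts with `slug`, `title`, and `body` keys (hypothetical names), candidates are limited to posts sharing a title, and the 0.9 threshold is arbitrary. Both agents also swept for duplicates with reworded titles, which this sketch leaves out.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def find_duplicates(posts, threshold=0.9):
    """Score same-title pairs by body similarity. Returns (score, slug_a, slug_b), best first."""
    by_title = defaultdict(list)
    for post in posts:
        by_title[post["title"].strip().lower()].append(post)

    pairs = []
    for group in by_title.values():
        for a, b in combinations(group, 2):
            # Strip first so trailing whitespace doesn't count as a difference.
            matcher = SequenceMatcher(None, a["body"].strip(), b["body"].strip(), autojunk=False)
            score = matcher.ratio()
            if score >= threshold:
                pairs.append((round(score, 3), a["slug"], b["slug"]))
    return sorted(pairs, reverse=True)
```

Comparing bodies only within a title group keeps the number of expensive comparisons small, and it is also why same-topic posts with different titles never get paired in this version.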

Files transcript (2:10)

I'll start by exploring the workspace structure to find the blog posts.

I'm on a host Mac, so I can use scratchmd normally. Let me explore the structure.

500 posts. I already spot one likely duplicate pair (the ...-thermoregulation-reckXU6tLlOkHXJbF.json with an ID suffix). Let me understand the record structure first.

Now I understand the structure. Let me extract metadata from all 500 records and analyze for duplicates programmatically.

All 10 pairs share identical body lengths and the classic publishing-error slug suffixes (-new, -rev, -fixed, -v2, -final, -corrected, -edit, -updated, -clean, -2). Let me verify what differs within pairs, check that stray reckXU... filename, and also hunt for near-duplicates with non-identical titles.

Key clarification: that ...-reckXU6tLlOkHXJbF.json filename just contains the clean-slug record (recRPXo7oibuO6gNH) — Scratch appended the record ID to the filename to avoid a disk collision. The real duplicate pairs are defined by the slug field, which my analysis used. Now let me also catch any content-duplicates where the title was also changed (so exact-title matching would miss them).

No hidden duplicates with reworded titles. The 10 exact-title pairs are the complete set. Let me compute a precise similarity score for each pair to rank confidence.

All 10 pairs are 100% identical in body content — the earlier hash differences were just trailing whitespace. Here are the findings.

Duplicate blog posts found

I scanned all 500 records in Airtable/Large scale testing/Blog posts. Every pair below shares an identical title and byte-identical body content, with one copy carrying a tell-tale "publishing error" slug suffix (-new, -rev, -v2, -final, etc.). I also ran a content-similarity sweep to catch duplicates where the title was also changed — none found. So these 10 pairs are the complete set.

Likelihood = 99–100% for all (identical title + identical body + collision-suffix slug is about as conclusive as it gets). I've split them by how clear-cut the cleanup decision is:

Tier A — one copy already Published, duplicate is a suffixed Draft (cleanest to resolve)

| # | Title | Keep (canonical) | Duplicate to remove | Score |
| --- | --- | --- | --- | --- |
| 1 | A Journey Through the Sargasso Sea | a-journey-through-the-sargasso-sea (Published) | …-sargasso-sea-rev (Draft) | 100% |
| 2 | Thunderwhales and the Drumming of the Deep | thunderwhales-and-the-drumming-of-the-deep (Published) | …-deep-edit (Draft) | 100% |
| 3 | Twelve Things You Didn't Know About the Narwhal | twelve-things-you-didnt-know-about-the-narwhal (Published) | …-narwhal-updated (Draft) | 100% |
| 4 | Understanding Baleen Filter Feeding | understanding-baleen-filter-feeding (Published) | …-feeding-clean (Draft) | 100% |
| 5 | Understanding the Cetacean Brain | understanding-the-cetacean-brain (Published) | …-brain-2 (Draft) | 100% |

Tier B — both copies are Draft (duplicate is the suffixed one, but neither is live yet)

| # | Title | Canonical slug | Duplicate (suffixed) slug | Score |
| --- | --- | --- | --- | --- |
| 6 | A beginner's guide to blubber and thermoregulation | a-beginners-guide-to-blubber-and-thermoregulation | …-thermoregulation-new | 100% |
| 7 | Baleen Filter Feeding, Demystified | baleen-filter-feeding-demystified | …-demystified-fixed | 100% |
| 8 | Cetacean Vision and Hearing, Demystified | cetacean-vision-and-hearing-demystified | …-demystified-v2 | 100% |
| 9 | Deep-Sea Bioluminescence, from the Surface Down | deep-sea-bioluminescence-from-the-surface-down | …-surface-down-final | 100% |
| 10 | The Paper Whale of the Origami Sea | the-paper-whale-of-the-origami-sea | …-origami-sea-corrected | 100% |

Notes

  • The body content within each pair is a 100% match (after ignoring trailing whitespace), including the same "Related reading" links — these aren't different articles on the same topic, they're literal copies.
  • Same-topic-but-genuinely-different posts (e.g. "Baleen Filter Feeding, Demystified" vs "Understanding Baleen Filter Feeding" vs "A beginner's guide to baleen filter feeding") were not flagged as duplicates of each other — their bodies are distinct. Only exact-content copies are listed above.
  • The file …-thermoregulation-reckXU6tLlOkHXJbF.json on disk is just the canonical thermoregulation record (recRPXo7oibuO6gNH); Scratch appended the record ID to the filename to avoid a disk-name collision with its duplicate. Nothing extra there.

Want me to delete the 10 suffixed duplicate copies (the right-hand column) so you can review the deletions in Scratch Desktop? I'd recommend starting with Tier A, since the canonical copies there are already published. Or you can open the folder in Scratch Desktop to inspect them first.

MCP transcript (3:46)

I'll find your "Blog posts" table and analyze it for duplicates. Let me load the Airtable tools first.

Now let me find the base containing the "Blog posts" table.

Found the "Blog posts" table (tbloGYbFMeJpEB6f8). Let me pull all the records.

The body field makes the result too large. Let me query without body first to get the full list of titles and slugs.

Both responses are saved to files. The lightweight version is JSON I can process with jq. Let me work with it directly.

500 records. Let me find titles that appear more than once.

10 titles appear exactly twice. Let me get their full details (slug, author, status), and also hunt for near-duplicate titles (slight wording/punctuation differences).

Clear pattern — each pair shares the same title and author, with one "clean" slug and one carrying a revision suffix (-rev, -new, -fixed, -v2, -final, -corrected, -edit, -updated, -clean, -2). Now let me catch near-duplicates with slightly reworded titles, and check for slug-suffix patterns more broadly.

The slug-suffix analysis confirms the same 10 pairs. Now let me check for reworded duplicates — same content republished under a different title — using title token similarity.

Sorting bug — let me fix.

The fuzzy-title approach is noisy — these are distinct articles in a "title family" (Sperm Whale vs Dwarf Sperm Whale vs Pygmy Sperm Whale), different authors, different species. Title similarity alone won't separate true republishes from these. The decisive signal is body content. Let me pull bodies from the large file and compare.

Body is plain text. Let me extract all bodies and compute body-level similarity, both to confirm the 10 exact pairs and to surface any reworded-title duplicates.

Body content confirms it decisively. Here's what I found.

Summary

Your "Blog posts" table has 500 records, and I found 10 duplicate pairs (20 records). Every pair is a near-certain duplicate: identical title, identical author, near-identical body (95–98% overlap), and slugs that differ only by a revision suffix tacked onto the original — exactly the fingerprint of a re-publish error.

Importantly, I also checked ~hundreds of reworded-title candidates (e.g. "Sperm Whale" vs "Dwarf Sperm Whale"). None had similar bodies — they're genuinely separate articles in a templated title family, not duplicates. So the list below is complete and clean.

The 10 duplicate pairs (scored)

| # | Title | Author | Slugs (original → variant) | Status | Body match | Duplicate likelihood |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Deep-Sea Bioluminescence, from the Surface Down | Sofia Greco | …surface-down → …-final | Draft / Draft | 0.98 | 🔴 ~99% |
| 2 | Thunderwhales and the Drumming of the Deep | Mei Lin | …the-deep → …-edit | Published / Draft | 0.98 | 🔴 ~99% |
| 3 | Understanding Baleen Filter Feeding | Dr. Aaron Webb | …filter-feeding → …-clean | Published / Draft | 0.98 | 🔴 ~99% |
| 4 | The Paper Whale of the Origami Sea | Dr. Felix Brandt | …origami-sea → …-corrected | Draft / Draft | 0.97 | 🔴 ~99% |
| 5 | Twelve Things You Didn't Know About the Narwhal | Dr. Felix Brandt | …the-narwhal → …-updated | Published / Draft | 0.97 | 🔴 ~99% |
| 6 | Cetacean Vision and Hearing, Demystified | Hana Watanabe | …demystified → …-v2 | Draft / Draft | 0.97 | 🔴 ~99% |
| 7 | A beginner's guide to blubber and thermoregulation | Dr. Felix Brandt | …thermoregulation → …-new | Draft / Draft | 0.97 | 🔴 ~99% |
| 8 | Understanding the Cetacean Brain | Jonah Reyes | …cetacean-brain → …-2 | Published / Draft | 0.97 | 🔴 ~99% |
| 9 | Baleen Filter Feeding, Demystified | Jonah Reyes | …demystified → …-fixed | Draft / Draft | 0.97 | 🔴 ~99% |
| 10 | A Journey Through the Sargasso Sea | Jonah Reyes | …sargasso-sea → …-rev | Published / Draft | 0.95 | 🔴 ~99% |

All ten score essentially the same (~99%). The body differences are minor edits between the two copies, not different content — so these are revisions saved as new records rather than updates to the original.

Notes for cleanup

  • 4 pairs have a clean split: the original slug is Published and the suffixed copy is a Draft (#2, 3, 5, 8, 10) — likely safe to delete the Draft variant, though confirm the edits in the Draft aren't ones you want to keep.
  • 5 pairs are Draft/Draft — you'll want to pick which slug to keep before deleting.
  • I did not change anything — this is read-only analysis.

Want me to dig into a specific pair (show the actual body diffs), or prepare a delete/merge plan for the duplicates?

The agent keeps rebuilding Scratch

The headline speedups run from about 1.7x to 4.6x, but the speed gap is the least interesting finding. Three things stuck with me.

Given a choice, the agent chooses files. In every task that needed the whole corpus, the MCP agent's strategy was to turn the API into local files and then work the way the files agent worked from the start. Files on disk aren't our quirky architectural preference. They're what the agent builds for itself, under time pressure, when you hand it a straw. Scratch just does the same thing properly: one fast sync, stable filenames, schema docs the agent can read, and git history underneath.

Cheap access changes behavior, not just speed. The files agent in task 1 audited the corpus and matched our house style instead of imposing a default. Nobody prompted that. When reading everything costs nothing, agents check their assumptions more.

The MCP runs had no review step. Every write in the MCP runs went straight to live records. The agent was diligent about verifying its own work, but the only judgment between a transform and 500 live records was its own. In the files runs, nothing touched Airtable until the diffs sat in front of a human. That's the difference we built Scratch for, and why we give AI 99% of the power and keep approval as the last 1%. You can see what that review loop looks like on a real production site in Anatomy of a Scratch run.

None of this makes MCP useless. For one-off operations in low-risk tools, an MCP server is exactly the right amount of machinery. But for bulk work on content that matters, the agent wants files, and so should you.

Try it on your content

These were synthetic whales. Your content is real, which is a better reason to want the diffs. The /start/ walkthrough gets you running on your own data in about twenty minutes, or grab 30 minutes with Curtis and we'll run it on your messiest table live.
