You are not searching for a way to make a thousand changes. You already have several: an AI agent, a CSV import, a script against the API, a long afternoon of admin tabs. What you are searching for is the step none of them comes with, a way to see those thousand changes before your customers do. Every platform you check hands you some version of a preview button, and every preview button was built for one page at a time.
You can't unit test content. The diff is the test suite. No assertion catches a meta description that lost the point or a rewrite that quietly dropped a paragraph; a person reading the edit next to the original, word by word, does. The options below are how teams improvise that reading today. Most of them answer a different question than the one you asked.
The review steps teams improvise
A staging environment
A staging site shows you the real thing: the actual templates, the actual content, rendered the way visitors will see it. Some platforms can publish to a staging domain alone, and page-diff tools can compare staging against production, URL by URL. What none of it surfaces is which of your 2,000 fields moved. The diffs are rendered text, page by page, and on Webflow the Editor's publish button pushes staging and production at the same time anyway; publishing to staging alone means stepping out to the Designer's publish dialog and unchecking the production domain. Staging answers "does it look right?" You asked "what changed?"
Draft modes and revisions
Drafts are the right instinct, applied one item at a time. WordPress revisions give a genuine side-by-side diff, per post. One headless CMS highlights field differences between two versions of a document, string fields only. Webflow keeps draft changes to published items unpublished until you push them, with a preview link each. All of it is per item: a 2,000-item batch means 2,000 open-review-close cycles, and none of the mainstream CMSs above offers a batch diff or an approve-all-but-these view. Webflow goes one further. It keeps no CMS version history at all, so the moment an item publishes, the old value is gone.
A git-based CMS
The one category that takes diffs seriously. A git-backed CMS stores content as files in a repository, so a batch of edits can be a branch, reviewed line by line as a pull request before it merges. Two catches. The review lives in GitHub, so your reviewers are developers; one popular git-backed CMS opens a separate branch and pull request per entry, and the pinned feature request asking to bundle them has been open for years. The larger catch: your CMS is probably not one. Mainstream marketing CMSs are database-backed, and moving platforms to get a review step is a migration, not a workflow.
A spreadsheet side-by-side
Export the collection, put the old value in one column and the new value in the next, and read across. It is free, it is genuinely batch, and for short plain-text fields it almost works. Then rich text arrives as a wall of raw HTML, and the reading stops. There is no approve button on a row; the re-import that follows is all or nothing. And the import path is where formatting dies: Webflow's CSV import strips line breaks from rich text fields, a widely used WordPress import plugin warns in its own documentation that updating all data risks erasing existing data, and search-and-replace dry runs report how many changes would happen, not what they are.
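For reference, the side-by-side itself is a few lines of scripting. The sketch below assumes two CSV exports of the same collection, one taken before the edit and one after, sharing an `id` column; the column names and sample rows are hypothetical, not any platform's actual export format. It lists every cell that differs, which is the "read across" step. It does nothing about the parts that hurt: the raw HTML and the all-or-nothing re-import.

```python
import csv
import io

def side_by_side(old_csv: str, new_csv: str, key: str = "id"):
    """Yield (item key, field, old value, new value) for every cell that differs."""
    old_rows = {row[key]: row for row in csv.DictReader(io.StringIO(old_csv))}
    for new_row in csv.DictReader(io.StringIO(new_csv)):
        old_row = old_rows.get(new_row[key])
        if old_row is None:
            continue  # item exists only in the new export; out of scope for this sketch
        for field, new_value in new_row.items():
            if field != key and old_row.get(field) != new_value:
                yield new_row[key], field, old_row.get(field), new_value

# Hypothetical exports: only item 1's meta description was edited.
OLD = "id,meta_description,body\n1,Old offer text,<p>Hello</p>\n2,Same,<p>World</p>\n"
NEW = "id,meta_description,body\n1,New offer text,<p>Hello</p>\n2,Same,<p>World</p>\n"

changes = list(side_by_side(OLD, NEW))
for item_id, field, old, new in changes:
    print(f"{item_id}\t{field}\t{old!r}\t{new!r}")
```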
Publish and watch
Ship the batch, then read the live site. It costs nothing to set up, it is what most teams quietly do, and the guides for AI-written meta tags at scale tend to admit as much: publish, then audit a sample after the fact. The trouble is that review after publish is not review, it is incident response. The hallucinated claim is found by a customer or a crawler, whichever gets there first. And on Webflow there is no version history to fall back on; the documented undo is a full-site backup restore, which takes your pages, design, and every other collection back with it.
Scratch
Scratch makes the working copy a set of files on your laptop, which is what turns review into a step instead of a wish. Your AI edits the files; every changed field comes back as a word-level diff next to the original, with unchanged fields grayed out. Approval is per item. Publishing writes back only the fields that changed, so a field the agent never touched cannot be overwritten, by a blank, by a stale value, by anything. Validators, which are optional rules written in Python, fail the rule-breakers before you read. The honest cost: the queue is real human reading, call it an hour for a 2,000-item batch. Scratch is what makes it an hour instead of sixteen. It will not make it zero.
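The "only the fields that changed" idea can be sketched in a few lines. This is an illustration of the principle, not Scratch's implementation: `changed_fields` and the sample record are hypothetical, and a real connector would also have to handle references, rich text, and deleted fields.

```python
def changed_fields(original: dict, edited: dict) -> dict:
    """Return only the fields whose value differs from the original."""
    return {
        name: value
        for name, value in edited.items()
        if name not in original or original[name] != value
    }

# Hypothetical record: the agent rewrote one field and left the rest alone.
original = {
    "slug": "spring-sale",
    "meta_description": "Our spring sale is here.",
    "body": "<p>Details</p>",
}
edited = dict(original, meta_description="Save 20% this spring.")

patch = changed_fields(original, edited)
print(patch)
```

Because `slug` and `body` never appear in the patch, a write built from it has nothing to say about them, which is why an untouched field cannot be blanked or rolled back by the publish.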
| Option | Shows what changed | At 2,000 items | Approve per item | Undo after publish |
|---|---|---|---|---|
| Staging environment | The rendered page, not the field | Page-by-page browsing | No | No |
| Draft modes | Per item, on some platforms | 2,000 open-and-close cycles | By hand, one at a time | Varies, often none |
| Git-based CMS | Real diffs, line level | One pull request per entry | In GitHub | Git revert, developer-run |
| Spreadsheet side-by-side | Two columns of raw HTML | Rows scale, reading does not | No, the import is all or nothing | A manual export, if you saved one |
| Publish and watch | After your customers see it | Sampling | No | Full-site restore, where one exists |
| Scratch | Word-level diff, per field | One review queue | Yes | Per item, even after publish |
How the loop works on a 2,000-item update
- Scratch pulls your CMS into files. Every item lands as its own file in a folder on your laptop, every field visible: body, meta title, description, slug, references. The live site has not changed, and nothing in the next step can change it.
- Your AI makes the changes, on the files. Open the folder in Claude, Codex, Cursor, or Copilot, whichever agent you already use, and hand it the brief: "Rewrite the meta description on every landing page to front-load the offer, 150 characters max, touch nothing else." On files, the agent greps all 2,000 records in about a second and scripts its own checks: a character count across every meta field, a list of pages still carrying the old offer. An agent working over MCP usually fetches one record per tool call; at a typical 60 requests a minute, 2,000 per-record fetches take over half an hour (about 33 minutes) before the agent has read what it is about to change. The agent edits the one field the brief names, and it holds no key to your CMS.
- You review the diff and publish. Every changed field sits next to its original, word by word, with unchanged fields grayed out. If you set validators, they run before you read: a row that picked up a banned phrase or moved its slug arrives already failed. Approve per item, and Scratch publishes only what you approved, writing back only the fields that changed. The agent did 99% of the work. The 1% it cannot do, deciding what your customers see, is the step this page exists for. Approval is not the last word, either: any published item can be reverted individually, and the original value comes back.
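A validator for the brief above might look like the following. Treat it as a sketch under assumptions: the `validate(original, edited)` signature, the field names, and the banned phrase are all hypothetical, and Scratch's actual validator interface may differ. It covers the three rules the loop mentions: the 150-character limit, a banned phrase, and a slug that must not move.

```python
MAX_META_LENGTH = 150                 # limit taken from the brief
BANNED_PHRASES = ("best in class",)   # hypothetical example phrase

def validate(original: dict, edited: dict) -> list[str]:
    """Return the rule failures for one item; an empty list means it passes."""
    failures = []
    meta = edited.get("meta_description", "")
    if len(meta) > MAX_META_LENGTH:
        failures.append(
            f"meta_description is {len(meta)} characters (max {MAX_META_LENGTH})"
        )
    for phrase in BANNED_PHRASES:
        if phrase in meta.lower():
            failures.append(f"meta_description contains banned phrase: {phrase!r}")
    if edited.get("slug") != original.get("slug"):
        failures.append("slug changed")
    return failures

# Hypothetical items: one clean, one too long, one with a moved slug.
original = {"slug": "spring-sale", "meta_description": "Our spring sale is here."}
too_long = dict(original, meta_description="x" * 170)
moved = dict(original, slug="spring-sale-2")

for name, item in [("clean", original), ("too_long", too_long), ("moved", moved)]:
    print(name, validate(original, item))
```

A row with a non-empty failure list is the kind that would arrive in the queue already failed, before anyone spends reading time on it.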
What reviewing 2,000 changes actually looks like
The objection to review-before-publish is always the same: nobody reads 2,000 of anything. That is true of the per-item version. Opening a draft, previewing it, and closing it at 30 seconds an item is over 16 hours of clicking before you have judged a single sentence.
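The two time figures on this page are plain arithmetic, and worth rerunning with your own numbers. The sketch below uses only the values stated in the text: 2,000 items, 30 seconds per open-preview-close cycle, and the 60 requests a minute cited for per-record fetches.

```python
ITEMS = 2_000
SECONDS_PER_ITEM = 30       # open, preview, close (figure from the text)
REQUESTS_PER_MINUTE = 60    # the "typical" rate cited for per-record fetches

review_hours = ITEMS * SECONDS_PER_ITEM / 3600
fetch_minutes = ITEMS / REQUESTS_PER_MINUTE

print(f"per-item review: {review_hours:.1f} hours")         # 16.7 hours
print(f"per-record fetches: {fetch_minutes:.1f} minutes")   # 33.3 minutes
```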
A diff queue is a different kind of reading. Every touched item is a row in one table. Unchanged fields are grayed out, so a 40-field record where one meta description moved reads as one highlighted line. The validator already failed the 170-character description before you looked. Most rows are a one-glance call; the handful of full rewrites get flagged for real attention. Approve the clean rows in a pass, reject the three that drifted, leave the rest for a second look. The effort scales with the number of judgment calls, not the number of rows.
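To make "word-level diff" concrete, here is a minimal sketch built on Python's standard `difflib`. It is one common way to produce such a diff, not a description of how Scratch renders its queue; the `[-removed-]` and `{+added+}` markers and the sample sentences are illustrative choices.

```python
import difflib

def word_diff(old: str, new: str) -> str:
    """Mark removed words as [-word-] and added words as {+word+}."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(old_words[i1:i2])
        else:
            if i1 != i2:  # words present only in the old text
                out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
            if j1 != j2:  # words present only in the new text
                out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)

print(word_diff("Shop our spring sale today", "Save 20% in our spring sale today"))
# [-Shop-] {+Save 20% in+} our spring sale today
```

The unchanged words pass through untouched, so the eye lands on the marked phrase only. That is the property that makes most rows a one-glance call.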
It is still real reading, and that is the point. The hour in the queue is the hour that used to happen on the live site, after the fact, with customers as the test environment.
Questions people ask
Can I see every change in a batch before it goes live?
Yes. Every edited item lands in one review queue, and every changed field shows as a word-level diff against the original. Nothing publishes until you approve it, item by item. The agent works on files; the live site does not move until the end.
Can a field the AI did not touch get overwritten?
No. Publishing writes back only the fields that changed. A field the agent never edited is not part of the write at all, so it cannot be replaced by a blank or a stale value. This is the failure CSV re-imports are famous for, and it is structurally absent here.
Will a bulk update break the formatting in my rich text?
Not outside the field you asked for. The writeback carries only the edited field, so everything else in the record survives byte for byte; it is never written at all. The edited field itself can still go wrong, an agent can flatten a list or drop a heading, and that is what the diff is for: the damage is highlighted on your screen before it is on your site.
Who actually reads 2,000 diffs?
You do. It is less reading than it sounds. Unchanged fields are grayed out, validators have already rejected the rows that broke a rule, and most rows are one highlighted phrase you judge in a glance. The effort tracks the judgment calls, not the row count. The reading that remains is the product, not the overhead.
What happens if a bad change publishes anyway?
It reverts. Revert any published item and the original goes back, no full-site restore, no backup archaeology, and the other 1,999 items stay exactly where you approved them.
Does the AI ever get write access to my live CMS?
No. The agent edits files on your laptop and holds no credential to your CMS. Scratch does the pulling and the publishing, and it publishes only what you approved. Nothing the agent does, including being wrong, can reach the live site on its own.
Do I need a git-based CMS to get real diffs?
No. Files as the working copy is the part git-based systems get right, and Scratch builds it for the CMS you already run, without the migration. The diffs are field level rather than line level, the review is per item rather than per pull request, and the reviewer does not need a GitHub account.
Does this work with the CMS I already run?
Probably. Webflow, WordPress, HubSpot, Notion, Shopify, and more; see /for/ for what each connector reads and writes. If yours is missing, tell Curtis what you run and he will tell you honestly whether it is close.
See it on your own content
Nobody trusts a review step they have not run on their own content. See it run on your content, or download Scratch free, connect one CMS, and pull a single collection. Your first diff is about twenty minutes away.