How to enrich HubSpot company records with AI

4,000 companies that are a name, a domain, and nothing else. The fix: pull them down as files, let your agent research each one, and approve every field yourself.

Open a company record in your portal and count what is actually in it: a name, a domain, and a column of blank properties. Now multiply by 4,000. Your routing rules key on industry, your segmentation keys on a custom property, your reporting groups by fields that are empty on most of the database, so every list you build starts with "where industry is known" and quietly drops the companies you worked hardest to get. The strange part is that the missing information is not hidden. Most of it sits on each company's own website, one tab away. What you do not have is a way to read 4,000 websites.

A web-capable AI agent can. Give it the records and it reads each company's site, drafts the description, the industry, and the custom properties your segmentation keys on, and notes a source for every field it fills, with nothing reaching your CRM until you approve the record. That approval step is where the options below split: coverage and cost vary, but the sharpest difference is whether anyone shows you the filled-in value before it becomes the thing your routing believes.

Your options

An enrichment vendor

The vendor category exists because firmographics at scale are hard, and a good database is genuinely good at them. For headcount, revenue, and funding, a maintained database beats anything an agent reads off the open web, and the better vendors offer per-field rules (fill-if-blank or overwrite) plus scheduled re-enrichment. The trades are structural. You pay in credits whether or not a match was useful; coverage thins exactly where your blanks live, among the small and long-tail companies; the data is a snapshot of the vendor's database rather than the company's live website; and the write happens automatically on a domain match. Nobody puts the new value in front of you first, and the vendor's industry taxonomy is not your Industry dropdown.

HubSpot's built-in enrichment

Worth turning on before you spend anything. On paid tiers, HubSpot fills a fixed list of standard company fields (industry, employee range, revenue, and description among them), and automatic mode fills blanks only, never overwriting what a person typed. The limits are the fixed list and the matching. Companies are matched by domain, so a record without one stays empty. Bulk enrichment from an index page runs 100 records at a time, which turns 4,000 records into 40 manual batches. HubSpot does not show you a source for a standard field. Custom properties move you to smart properties, the credit-metered AI feature, which can research the web and does cite sources but consumes credits on every run, even when it fills nothing, and the credits expire monthly.

Research by hand

On quality, this is the bar. A person reads the website, notices that the company calling itself a platform is actually an agency, and writes a description that means something. The arithmetic is the wall. Even at 5 minutes per company, 4,000 records is more than 330 hours; unique values go into the HubSpot UI one record at a time; and nobody who starts this in January is still doing it in March. So the project gets scoped down to the open pipeline, and the database stays blank exactly where the next campaign needed it.
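The arithmetic is short enough to check yourself. The sketch below uses the 4,000 records and 5 minutes from the paragraph above; the 40-hour week is an assumption added only to turn hours into calendar time.

```python
# Back-of-envelope cost of researching every record by hand.
RECORDS = 4_000
MINUTES_PER_RECORD = 5  # the "even at 5 minutes" figure from the text

total_hours = RECORDS * MINUTES_PER_RECORD / 60

# Assumption for illustration: one person doing nothing else, 40 hours a week.
HOURS_PER_WEEK = 40
total_weeks = total_hours / HOURS_PER_WEEK

print(f"{total_hours:.0f} hours, about {total_weeks:.1f} full-time weeks")
```

That is roughly two months of one person's full attention, before a single company changes its website.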

A script on the CRM API

The CRM API will take everything you can generate: batch updates accept 100 companies per call, so writing 4,000 records is about 40 calls, comfortably inside the rate limits, and a script can wire a model in for the research. Two things bite. The write is live the moment the script runs, with no preview and no per-record undo. And Industry is a dropdown: a value that does not exactly match an option, down to whitespace and case, fails validation, and in a batch call one bad value can fail the whole 100-record request, so the script needs partition-and-retry logic before it needs anything else. You can build validation, review, retries, and an audit trail around all of this. That is a project, not an afternoon.
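What partition-and-retry might look like, as a minimal sketch rather than a finished client. It assumes a sender function you supply (`send_batch` here is hypothetical, as is the `BatchRejected` exception) that raises when the API rejects a whole request. The sketch chunks records into batches of 100 and, when a batch is rejected, splits it in half and retries each half until the bad records are isolated.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
BATCH_SIZE = 100  # per-call limit cited in the text


class BatchRejected(Exception):
    """Raised by the hypothetical sender when the API rejects a whole batch."""


def chunks(records: List[Record], size: int = BATCH_SIZE) -> List[List[Record]]:
    """Split records into consecutive batches of at most `size`."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def write_with_bisect(
    batch: List[Record],
    send_batch: Callable[[List[Record]], None],
    failed: List[Record],
) -> None:
    """Send a batch; on rejection, halve it and retry until bad records are isolated."""
    if not batch:
        return
    try:
        send_batch(batch)
    except BatchRejected:
        if len(batch) == 1:
            failed.append(batch[0])  # a single record that fails on its own
            return
        mid = len(batch) // 2
        write_with_bisect(batch[:mid], send_batch, failed)
        write_with_bisect(batch[mid:], send_batch, failed)


def write_all(
    records: List[Record],
    send_batch: Callable[[List[Record]], None],
) -> List[Record]:
    """Write every record it can and return the ones that could not be written."""
    failed: List[Record] = []
    for batch in chunks(records):
        write_with_bisect(batch, send_batch, failed)
    return failed
```

Bisecting only makes sense for validation failures. A real script would also need to tell those apart from rate-limit and network errors, which call for waiting and resending the same batch, and none of this adds the preview or undo the paragraph above says is missing.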

Scratch plus a researching agent

Scratch puts the same 4,000 companies on your laptop as files and lets the agent you already use, one with web access, do the research. The agent greps for the blanks, builds its own coverage index, reads each company's website, and writes the description, the industry as one of your exact dropdown values, and the custom properties your segmentation needs, noting the source for each. Nothing it writes goes anywhere on its own: every filled field comes back as a diff against the blank, source alongside, and only approved records publish through the CRM API. Two honest costs. The review is real reading, on purpose, because web research can be wrong or stale. And for headcount and revenue, a vendor database is still the better source.

| Option | Custom properties | Reads the live website | Review before it writes | Undo after |
| --- | --- | --- | --- | --- |
| Enrichment vendor | Mapping work, often extra cost | No, database snapshot | No, writes on match | Varies |
| Built-in enrichment | Smart properties, credits per run | Smart properties only | No, fills automatically | No |
| Research by hand | Yes | Yes | You are the review | No |
| Script on the API | Yes | If you build it | No | No |
| Scratch | Yes, your agent | Yes | Every field, as a diff | Per record |

How the loop works on your database

  1. Scratch pulls your companies into files. Each of the 4,000 companies lands as its own file in a folder on your laptop: name, domain, and every blank property visible as blank. Custom objects come along if your segmentation lives there. Workflows, lists, and reporting stay where they are, and nothing in the CRM has changed.
  2. Your AI researches and fills. Point any agent with web access (Claude, Codex, Cursor, or Copilot) at the folder and give it the brief: "For every company missing an industry or description, read its website, write a 2-sentence description in our format, set industry to an exact value from this list, fill company_type, and note the source URL for each field you fill." The bookkeeping is the part the agent scripts: a grep across the folder finds every blank industry in about a second, where asking the API the same question takes 40 paginated calls every time. Over an MCP server, the agent pages through records call by call, and 4,000 full records do not fit in a context window. With files, the agent works in batches, tracks its own coverage, and resumes where it stopped. It holds no API token, so it cannot touch the CRM even if it tries.
  3. You review the diffs and publish. Every filled field shows as a diff, the blank next to the new value, the source the agent noted alongside. Validators run first if you set them: industry must match your dropdown options exactly, descriptions under a length cap. Approve per record, and Scratch writes only approved records back through the CRM API, at the API's own pace. The agent did 99% of the work; the last 1%, deciding what becomes CRM truth, stays with you. A record that publishes wrong reverts per record.
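The coverage bookkeeping in step 2 can be sketched in a few lines. The file layout here is an assumption for illustration, not Scratch's documented format: one JSON file per company with flat property names such as `industry` and `description`. The function name `find_blanks` is hypothetical too.

```python
import json
from pathlib import Path
from typing import Dict, List


def find_blanks(folder: Path, fields: List[str]) -> Dict[str, List[str]]:
    """Map each field name to the files where that field is missing or empty."""
    blanks: Dict[str, List[str]] = {field: [] for field in fields}
    # Assumed layout: one flat JSON object per company file.
    for path in sorted(folder.glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        for field in fields:
            value = record.get(field)
            # Treat absent keys, nulls, and whitespace-only strings as blank.
            if value is None or str(value).strip() == "":
                blanks[field].append(path.name)
    return blanks
```

Calling `find_blanks(Path("companies"), ["industry", "description"])` on a folder like that returns the work list for the next batch, and running it again after a batch shows what is left.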

Start with the 200 companies your next campaign actually targets. When those diffs read right, send the agent through the rest.

Where the open web wins, and where it loses

An agent reading a company's own website beats a database at exactly what databases are bad at: what the company does, in words a human chose this quarter, plus the judgment fields no vendor sells, such as your company_type, your ICP tier, or agency versus product company. It also reaches records the matchers cannot. A company with no domain on file cannot be domain-matched at all, but an agent can search the name, find the website, and fill the domain first.

It loses at numbers companies do not publish. Headcount and revenue on the open web are guesses, and a maintained vendor database is the better source for both. The two are not exclusive: turn on HubSpot's built-in enrichment for the standard fields it covers, and point the agent at the descriptions, the industries it missed, and the custom properties nothing else fills.

Questions people ask

Can AI fill missing company data without buying enrichment credits?

Yes. Descriptions, industries, and your judgment-based custom properties are mostly readable off each company's own website, and a web-capable agent reads them there. Scratch is free to try, the research runs on the agent you already pay for, and there are no per-record credits. The honest exception is firmographics: for headcount and revenue, a vendor database remains the better source.

Does HubSpot's built-in enrichment already cover this?

Partly. On paid tiers it fills a fixed list of standard fields for companies it can match by domain, blanks only, and that is worth turning on first. It does not show a source for standard fields, it cannot fill a record with no domain, and custom properties require the credit-consuming smart properties feature, which spends credits per run, even when it fills nothing.

Can the AI fill custom properties, not just the standard fields?

Yes. The brief is yours: company_type, ICP tier, tech notes, whatever your segmentation keys on. Companies and custom objects are both editable in Scratch. The one rule that matters: dropdown properties need exact option values, so put the allowed list in the brief and let a validator check it.

Can I see the source for every filled field?

Yes. The brief tells the agent to note the source URL for everything it fills, so the value and where it came from arrive in the same diff. That is the visibility contrast with the matching route, where a value lands in the CRM and the best you can do is wonder.

Will the AI overwrite values my team entered by hand?

Not if you write the brief to fill blanks only. If the agent touches an existing value anyway, the diff shows the old value next to the new one. Reject it and nothing changes.
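If you want a mechanical check on top of reading the diffs, the idea can be sketched as follows. This is an illustration, not Scratch's diff format: it assumes each record is available as a plain dictionary before and after the agent's edits, and the names `Change`, `diff_record`, and `overwrites` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Change:
    field: str
    old: Optional[object]
    new: Optional[object]
    kind: str  # "filled" if the old value was blank, "overwrote" otherwise


def is_blank(value: Optional[object]) -> bool:
    """Absent, null, and whitespace-only values all count as blank."""
    return value is None or str(value).strip() == ""


def diff_record(before: Dict[str, object], after: Dict[str, object]) -> List[Change]:
    """List every field whose value differs between the two versions."""
    changes: List[Change] = []
    for field, new in after.items():
        old = before.get(field)
        if new == old:
            continue
        kind = "filled" if is_blank(old) else "overwrote"
        changes.append(Change(field, old, new, kind))
    return changes


def overwrites(changes: List[Change]) -> List[Change]:
    """The changes that replaced a value someone had already entered."""
    return [c for c in changes if c.kind == "overwrote"]
```

Anything `overwrites` returns is a place where the agent went beyond a fill-blanks-only brief and deserves a closer look before approval. The sketch ignores fields the agent deleted outright, which a fuller check would also flag.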

What happens when the research is wrong?

It will be. The web is wrong in predictable ways: an acquired company's old site, a pivoted company's old positioning, two companies sharing a name. That is what the review step is for. The wrong industry sits next to its source on your screen before it becomes a routing decision, and a record that publishes wrong reverts per record.

Will the values match my Industry dropdown?

Yes. Put the exact option list in the brief, the agent writes only those values, and a validator can enforce the match before you read a single diff. This matters more than it sounds: HubSpot rejects a near-miss dropdown value with a validation error, so without a validator you find out at publish time, error by error.
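A validator of this kind is small. The sketch below is an assumption about what such a check could look like, not Scratch's validator syntax: the option list is a placeholder to replace with the internal values of your own Industry dropdown, and the 300-character cap is an arbitrary example.

```python
from typing import Dict, List

# Placeholder options; substitute the exact values from your own dropdown.
ALLOWED_INDUSTRIES = {
    "COMPUTER_SOFTWARE",
    "MARKETING_AND_ADVERTISING",
    "FINANCIAL_SERVICES",
}
MAX_DESCRIPTION_CHARS = 300  # assumed length cap, for illustration only


def validate(record: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems: List[str] = []

    industry = record.get("industry") or ""
    # Exact match only: no trimming, no case folding, so near-misses are caught.
    if industry and industry not in ALLOWED_INDUSTRIES:
        problems.append(f"industry {industry!r} is not an exact dropdown option")

    description = str(record.get("description") or "")
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append(
            f"description is {len(description)} characters, cap is {MAX_DESCRIPTION_CHARS}"
        )
    return problems
```

The check deliberately does not normalize whitespace or case, because a value that only matches after cleanup is still one the CRM would reject.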

Do I have to review 4,000 records one by one?

Yes. That is the trade, stated plainly. Reviewing a filled field with its source takes seconds, not the minutes the research took, and validators clear the mechanical checks first so your reading goes to judgment. Work in batches: approve the first 200, tighten the brief where the diffs read wrong, and the later batches go faster.

Does the AI get write access to my CRM?

No. The agent edits files on your laptop and holds no API token. Scratch holds the connection and writes only the records you approved, through the CRM API, while workflows, lists, and emails stay untouched.

See it on your own database

The fastest way to judge it is to watch the agent research companies you actually know. See it run on your HubSpot companies, or download Scratch free, pull your companies this afternoon, and ask your agent how many have a blank industry. The answer takes one grep.

See it run on your own content.

Curtis runs these calls himself. Thirty minutes, no pitch, no slides. He connects your platforms live and shows you your content as an editable, reviewable diff. Bring anything sticky: a refresh, a migration, or a rebrand.
