Website crawl

The website crawl reads your business's website and imports its text content into the knowledge base. It's the fastest way to give Costello a broad base of knowledge about your business without manually entering everything.

What the crawl imports

The crawler reads the visible text on each page of your website — services, treatment descriptions, pricing, FAQs, about pages, and so on. It ignores:

Images (captions are included if present)
Navigation menus
Footer boilerplate (address, phone number, copyright notice)
Forms
Cookie banners and popups

If your key content lives in images (for example, a treatment menu designed in Canva and embedded as an image), the crawl won't capture it — upload the file as a PDF instead.
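
For readers who want a picture of what "visible text only" means, here is a minimal Python sketch. It is not Costello's actual crawler: the class name, the function name, and the set of skipped tags are assumptions made for illustration. It uses only the standard-library html.parser module, keeps page text, and drops anything inside navigation, footer, and form elements. An image contributes nothing because it contains no text.

```python
from html.parser import HTMLParser

# Assumption: these tags stand in for the "ignored" items listed above.
SKIPPED_TAGS = {"nav", "footer", "form", "script", "style"}


class VisibleTextExtractor(HTMLParser):
    """Collects text that is not inside a skipped element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []     # text fragments kept so far

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_visible_text(html):
    """Return the list of kept text fragments for one page."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


sample = (
    '<nav><a href="/">Home</a></nav>'
    "<h1>Facials</h1><p>Deep cleanse, 60 minutes</p>"
    '<img src="treatment-menu.png">'
    "<footer>Copyright 2024 Your Salon</footer>"
)
print(extract_visible_text(sample))  # ['Facials', 'Deep cleanse, 60 minutes']
```

Note that the treatment menu image adds no text at all, which is why image-only content needs to be uploaded separately.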

How to run a crawl

In Costello, open Knowledge Base → Website crawl.
Enter your business's website URL (for example, https://www.yoursalonlondon.com).
Click Start crawl.

Costello will crawl your entire site — every page it can reach from your homepage. For most business sites (5–30 pages), this takes under two minutes. You'll see a "Crawling…" status with a page count as it progresses.

Once complete, the crawled content is added to the knowledge base and a retrain is queued automatically.

Re-crawling after a website update

The crawl is not automatic — Costello doesn't monitor your website for changes. If you update your website (new prices, new services, changed hours), you need to manually re-crawl:

Open Knowledge Base → Website crawl.
Click Re-crawl now.

The new crawl replaces the previous one. Retraining completes within a few minutes.

Set a reminder to re-crawl whenever you update your site's content. Many businesses run a re-crawl monthly alongside their pricing reviews.

If the crawl misses content

The crawler finds pages by following links, starting from your homepage. A page that can't be reached this way won't be found, and some content is never crawled at all. Common scenarios:

A hidden menu page not linked from the navigation
A PDF embedded on a page (PDFs are not crawled — upload them separately)
A separate booking platform you link out to (Fresha, Square, Calendly — these are external and won't be crawled)

If there are specific pages you want included, make sure they're reachable from your homepage via at least one link.
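
To check reachability yourself, the idea can be sketched as a breadth-first walk over your site's links. This is an illustration under stated assumptions, not Costello's implementation: the reachable_pages function, the site_links map, and the example paths are all hypothetical, and links that don't start with "/" are treated as external and never followed.

```python
from collections import deque


def reachable_pages(links, start):
    """Return every internal page reachable from `start` by following links.

    `links` maps a page path to the list of link targets found on that page.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if not target.startswith("/"):
                continue  # external link (e.g. a booking platform): not crawled
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: "/secret-menu" links to the homepage, but nothing links to it.
site_links = {
    "/": ["/services", "/about", "https://booking.example.com"],
    "/services": ["/pricing"],
    "/secret-menu": ["/"],
}
all_pages = {"/", "/services", "/about", "/pricing", "/secret-menu"}

missed = all_pages - reachable_pages(site_links, "/")
print(missed)  # {'/secret-menu'}
```

Adding a single link to "/secret-menu" from any reachable page, such as "/services", would be enough for the walk to find it.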

Using crawl alongside other sources

The website crawl is a starting point, not the whole picture. Most businesses combine it with:

FAQ pairs → for common questions the website doesn't answer well
Uploaded files → for detailed treatment menus with prices
Opening hours & services → for structured availability data

When the same information appears in multiple sources, Costello weighs them all. For precise facts (prices, hours), structured data and FAQ pairs are more reliable than crawled website text.
