Website crawl
The website crawl reads your clinic's website and imports its text content into the knowledge base. It's the fastest way to give Costello a broad base of knowledge about your clinic without manually entering everything.
What the crawl imports
The crawler reads the visible text on each page of your website — services, treatment descriptions, pricing, FAQs, about pages, and so on. It ignores:
- Images (caption text is still imported if present)
- Navigation menus
- Footer boilerplate (address, phone number, copyright notice)
- Forms
- Cookie banners and popups
If your key content lives in images (e.g. a treatment menu designed in Canva and embedded as an image), the crawl won't capture it — upload the file as a PDF instead.
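For readers curious why image-based content is lost, here is a minimal sketch of this kind of text extraction. It is not Costello's actual crawler: the `VisibleText` class, the `extract_text` helper, and the list of skipped tags are assumptions made for illustration, using only Python's standard library.

```python
from html.parser import HTMLParser

# Assumption: tags treated as navigation, footer, form, or non-visible content.
SKIPPED_TAGS = {"nav", "footer", "form", "script", "style"}


class VisibleText(HTMLParser):
    """Collects text that sits outside the skipped tags."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the text fragments a text-only reader would keep."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical page: the price list exists only inside menu.png.
sample = """
<nav><a href="/">Home</a></nav>
<h1>Facials</h1>
<figure><img src="menu.png"><figcaption>Our treatment room</figcaption></figure>
<p>Hydrating facial, 60 minutes.</p>
<footer>Example Clinic, all rights reserved</footer>
"""

print(extract_text(sample))
```

In this sketch the heading, the caption, and the paragraph survive, while anything written inside `menu.png` never reaches the output, which is why image-only menus need to be uploaded separately.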
How to run a crawl
- In Costello, open Knowledge Base → Website crawl.
- Enter your clinic's website URL (e.g. https://www.yourcliniclondon.com).
- Click Start crawl.
Costello will crawl your entire site — every page it can reach from your homepage. For most clinic sites (5–30 pages), this takes under two minutes. You'll see a "Crawling…" status with a page count as it progresses.
Once complete, the crawled content is added to the knowledge base and a retrain is queued automatically.
Re-crawling after a website update
The crawl is not automatic — Costello doesn't monitor your website for changes. If you update your website (new prices, new services, changed hours), you need to manually re-crawl:
- Open Knowledge Base → Website crawl.
- Click Re-crawl now.
The new crawl replaces the previous one. Retraining completes within a few minutes.
Set a reminder to re-crawl whenever you update your site's content. Many clinics run a re-crawl monthly alongside their pricing reviews.
If the crawl misses content
The crawler starts at your homepage and follows links from page to page. A page that no other page links to won't be found, and some kinds of linked content are not crawled at all. Common scenarios:
- A hidden menu page not linked from the navigation
- A PDF embedded on a page (PDFs are not crawled — upload them separately)
- A separate booking platform you link out to (Fresha, Square, Calendly — these are external and won't be crawled)
If there are specific pages you want included, make sure they're reachable from your homepage via at least one link.
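The reachability rule can be illustrated with a short sketch. This is an assumption about how link-following works in general, not Costello's implementation; `reachable_pages` and the `site_links` map are hypothetical, and you would need to fill in the map by hand for your own site.

```python
from collections import deque


def reachable_pages(links, homepage):
    """Return every internal page reachable by following links from homepage."""
    seen = {homepage}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # Internal links start with "/"; external links (booking
            # platforms and so on) are not followed.
            if target.startswith("/") and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each page mapped to the links it contains.
site_links = {
    "/": ["/treatments", "/about", "https://www.fresha.com/book"],
    "/treatments": ["/treatments/facials", "/"],
    "/treatments/facials": [],
    "/about": ["/"],
    "/spring-offers": ["/treatments"],  # no page links to this one
}

found = reachable_pages(site_links, "/")
missed = sorted(set(site_links) - found)
print(missed)  # → ['/spring-offers']
```

Here `/spring-offers` links out to other pages, but nothing links to it, so a crawl starting at the homepage never arrives there. Adding a single link to it from any reachable page would fix that.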
Using crawl alongside other sources
The website crawl is a starting point, not the whole picture. Most clinics combine it with:
- FAQ pairs → for common questions the website doesn't answer well
- Uploaded files → for detailed treatment menus with prices
- Opening hours & services → for structured availability data
When the same information appears in multiple sources, Costello weighs them all. For precise facts (prices, hours), structured data and FAQ pairs are more reliable than crawled website text.