Key takeaways
- AI assistants quote facts they can verify in structured text, not taglines buried in images.
- Technical crawlability (sitemap, robots, schema) still gates everything else.
- Pair sitemap hygiene with an AI Website Profile (llms.txt) so agents know which facts are canonical.
If you only optimize for human visitors, you are ignoring a fast-growing channel. ChatGPT-style assistants, AI Overviews, and vertical agents do not scroll your homepage like a person. They look for machine-verifiable facts: who you are, what you sell, where you operate, and what proof exists. This checklist helps you audit your site the way an impatient agent would, in under an hour.
Why this audit exists
Traditional SEO teaches keywords, internal links, and authority. AI discovery adds another layer: can a model extract trustworthy entities from your pages without guessing? When extraction fails, assistants skip you, cite competitors, or hallucinate details. This guide is intentionally practical. For each item, mark pass or fail, capture evidence, and schedule fixes starting with anything that blocks crawling or creates conflicting facts.
Section 1: Crawl and index foundations
1. sitemap.xml is live and honest
Open /sitemap.xml in your browser. You should see canonical URLs with lastmod values that reflect real updates. If you see a 404, enable sitemap generation in your CMS or SEO plugin. Large sites should use a sitemap index that points to section sitemaps. Stale lastmod dates signal neglect to both Google and any crawler that trusts freshness signals.
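For a quick freshness check beyond eyeballing the file, here is a minimal Python sketch. It assumes you have already saved the sitemap XML as a string; the function name `stale_urls`, the one-year threshold, and the example.com URLs are illustrative choices, not part of any standard. It does not follow sitemap index files.

```python
from datetime import date, timedelta
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def stale_urls(sitemap_xml: str, today: date, max_age_days: int = 365) -> list[str]:
    """Return URLs whose lastmod is missing, unparseable, or older than the cutoff."""
    root = ET.fromstring(sitemap_xml)
    cutoff = today - timedelta(days=max_age_days)
    flagged = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        try:
            # Full dates and timestamps both start with YYYY-MM-DD.
            # Year-only or year-month values fail here and get flagged for review.
            modified = date.fromisoformat(lastmod[:10])
        except ValueError:
            flagged.append(loc)
            continue
        if modified < cutoff:
            flagged.append(loc)
    return flagged


# Hypothetical sitemap content for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/old</loc><lastmod>2019-01-15T10:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/no-date</loc></url>
</urlset>"""

print(stale_urls(SAMPLE, today=date(2024, 6, 1)))
```

A flagged URL is not automatically a problem. It is a prompt to confirm whether the page really has not changed or whether your CMS is failing to update lastmod.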
2. robots.txt matches your intent
Fetch /robots.txt. Confirm you are not blocking important paths for general crawlers unless you have a documented reason. If you block common AI user agents, understand the tradeoff: you limit what those systems can use for training or retrieve at answer time. Misconfigured wildcards are a frequent silent failure mode. When in doubt, ask your developer to diff the production robots.txt against the copy in your staging environment.
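One way to test intent against reality is to replay a few important paths through a parser. The sketch below uses Python's standard `urllib.robotparser`; the helper name `blocked_paths`, the sample rules, and the choice of user agents are assumptions for illustration. Note that the standard-library parser does simple prefix matching and does not interpret `*` or `$` inside paths, so wildcard rules still need a manual check.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt: str, user_agents: list[str], paths: list[str]) -> dict[str, list[str]]:
    """For each user agent, list the paths that robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    report = {}
    for agent in user_agents:
        blocked = [path for path in paths if not parser.can_fetch(agent, path)]
        if blocked:
            report[agent] = blocked
    return report


# Hypothetical robots.txt: admin is closed to everyone, one AI crawler is blocked entirely.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /
"""

report = blocked_paths(
    SAMPLE_ROBOTS,
    user_agents=["Googlebot", "GPTBot"],
    paths=["/", "/services/", "/admin/"],
)
for agent, paths in report.items():
    print(agent, "is blocked from", paths)
```

If the report shows a block you cannot explain, treat it as a fail until someone documents the reason.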
3. Structured data passes a real entity test
Run your homepage through Google's Rich Results Test. You want entity-level types such as LocalBusiness, Restaurant, or ProfessionalService with complete name, address, phone, and hours where applicable. A bare WebSite node is better than nothing, but it does not answer local or service questions. If schema is missing, fix it before you invest in more blog volume.
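If you want a local pre-check before using an online validator, the following Python sketch pulls JSON-LD out of a page and reports which core fields an entity-level node is missing. It is a rough approximation, not a substitute for a real validator: the names `JsonLdCollector` and `audit_entities` are made up for this example, the required-field list is this checklist's, and it ignores `@graph` wrappers and list-valued `@type`.

```python
import json
from html.parser import HTMLParser

# Entity types and fields named in this checklist item (an assumption, not a schema.org rule).
ENTITY_TYPES = {"LocalBusiness", "Restaurant", "ProfessionalService"}
REQUIRED_FIELDS = ("name", "address", "telephone")


class JsonLdCollector(HTMLParser):
    """Collect parsed JSON-LD nodes from <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # Malformed JSON-LD is skipped in this sketch.
            self.nodes.extend(parsed if isinstance(parsed, list) else [parsed])


def audit_entities(html: str) -> dict[str, list[str]]:
    """Map each entity-level type found on the page to the fields it lacks."""
    collector = JsonLdCollector()
    collector.feed(html)
    report = {}
    for node in collector.nodes:
        if not isinstance(node, dict):
            continue
        node_type = node.get("@type")
        if not isinstance(node_type, str) or node_type not in ENTITY_TYPES:
            continue
        missing = [field for field in REQUIRED_FIELDS if not node.get(field)]
        if not (node.get("openingHours") or node.get("openingHoursSpecification")):
            missing.append("openingHours")
        report[node_type] = missing
    return report


# Hypothetical page: the business node has a name and phone but no address or hours.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "LocalBusiness",
 "name": "Acme Plumbing", "telephone": "+1-212-555-0100"}
</script>
</head><body></body></html>
"""

print(audit_entities(SAMPLE_HTML))
```

An empty result means no entity-level node was found at all, which is the "bare WebSite node" case described above.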
4. Performance and mobile are baseline hygiene
Agents inherit practical constraints from search infrastructure. Slow LCP, intrusive interstitials, and broken mobile layouts correlate with incomplete crawling. Compress hero media, defer non-critical scripts, and keep critical business text in HTML above the fold. You are not chasing a perfect PageSpeed score; you are removing friction from fact extraction.
Section 2: Content machines can actually read
5. Copy and paste test
On your contact page, try to highlight your phone number and address. If you cannot select the text, it is probably rasterized inside an image. That is a fail for accessibility and for AI. Move those facts into HTML. Repeat the test on pricing, hours, and policy pages.
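The same test can be approximated in code. This Python sketch checks whether a phone number appears in a page's text nodes rather than only inside an image. The names `TextCollector` and `phone_in_text` and the sample markup are hypothetical, and comparing digits only is a deliberate simplification that can produce false positives on pages full of numbers.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def digits_only(value: str) -> str:
    return re.sub(r"\D", "", value)


def phone_in_text(html: str, phone: str) -> bool:
    """True if the phone number's digits appear in selectable text."""
    collector = TextCollector()
    collector.feed(html)
    return digits_only(phone) in digits_only(" ".join(collector.chunks))


# Hypothetical contact pages: one with real text, one with the number baked into an image.
TEXT_PAGE = '<p>Call us at <a href="tel:+12125550100">(212) 555-0100</a></p>'
IMAGE_PAGE = '<p>Call us:</p><img src="contact-banner.png" alt="">'

print(phone_in_text(TEXT_PAGE, "(212) 555-0100"))   # True
print(phone_in_text(IMAGE_PAGE, "(212) 555-0100"))  # False
```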
6. Menus, catalogs, and price lists in HTML
PDFs preserve print layout. They do not expose reliable dish-level or SKU-level structure to assistants. Publish HTML versions of your menu, catalog, or service list with headings and lists. PDF can remain as a download, but it should not be the only source of truth.
7. Google Business Profile matches your site
Search your brand and open your GBP panel. Categories, hours, phone, and service area should align with your website. Conflicts force models to reconcile sources, which increases error rates. Fill optional attributes that matter: accessibility, delivery, booking links, and photos with real captions.
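To make the comparison less error-prone, you can normalize both sets of facts before comparing them. The sketch below assumes you have copied the values by hand into two dictionaries; the field names, the `find_conflicts` helper, and the sample values are all hypothetical. Keeping only the last ten digits of a phone number assumes North American numbering, so adjust that for other regions.

```python
import re


def normalize(field: str, value: str) -> str:
    """Reduce cosmetic differences so only real conflicts remain."""
    if field == "phone":
        # Assumption: 10-digit national numbers; drops a leading country code.
        return re.sub(r"\D", "", value)[-10:]
    # Case-insensitive, whitespace-insensitive comparison for everything else.
    return " ".join(value.casefold().split())


def find_conflicts(site: dict[str, str], profile: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return fields whose normalized values differ between the site and the profile."""
    conflicts = {}
    for field in sorted(site.keys() & profile.keys()):
        if normalize(field, site[field]) != normalize(field, profile[field]):
            conflicts[field] = (site[field], profile[field])
    return conflicts


# Hypothetical values copied from a website and its Google Business Profile.
SITE = {"name": "Acme Plumbing", "phone": "(212) 555-0100", "hours": "Mon-Fri 9:00-17:00"}
PROFILE = {"name": "ACME  Plumbing", "phone": "+1 212-555-0100", "hours": "Mon-Fri 9:00-18:00"}

print(find_conflicts(SITE, PROFILE))
```

Here only the hours conflict survives normalization, which is the kind of mismatch worth fixing first.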
8. One URL per core offer
If every service is crammed into one long page, assistants struggle to extract boundaries. Give each primary offer a dedicated URL with clear H1, proof, and a CTA. Interlink related pages so crawlers understand context.
Section 3: Advanced readiness
9. FAQ grounded in real questions
Build FAQs from support tickets, sales calls, and chat logs. Generic FAQs do not help. When you publish, add FAQPage schema where appropriate so question and answer pairs map cleanly into assistant responses.
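As an illustration, FAQPage markup can be generated from a plain list of question and answer pairs. This Python sketch builds the JSON-LD structure using the schema.org FAQPage, Question, and Answer types; the helper name `faq_jsonld` and the sample question are invented for the example.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage node from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical FAQ entry sourced from a support ticket.
FAQ = faq_jsonld([
    ("Do you offer weekend appointments?", "Yes, Saturdays from 9 am to 1 pm."),
])

# Paste the output inside a <script type="application/ld+json"> tag on the FAQ page.
print(json.dumps(FAQ, indent=2))
```

The visible FAQ text on the page should match the markup word for word, so generate both from the same source list where you can.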
10. llms.txt as your AI Website Profile
Check /llms.txt. This file is your concise, machine-oriented business brief: what you offer, who it is for, policies, and pointers to canonical pages. It is not a replacement for a sitemap, but it answers the questions assistants ask after URLs are discovered. If you do not have one yet, Platinum.ai can generate a production-ready profile aligned to your industry after scanning your site.
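To make the idea concrete, here is a small Python sketch that renders a starter llms.txt. The layout (a title line, a one-line summary, then sections of annotated links in Markdown) loosely follows the public llms.txt proposal as we understand it; treat that layout, the `render_llms_txt` helper, and the sample business as assumptions rather than a specification.

```python
def render_llms_txt(name: str, summary: str, sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render a Markdown-style llms.txt from a name, a summary, and link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical business profile; every value here is illustrative.
PROFILE_TEXT = render_llms_txt(
    name="Acme Plumbing",
    summary="Licensed residential plumber serving Brooklyn, open Mon-Fri 9:00-17:00.",
    sections={
        "Services": [
            ("Emergency repairs", "https://example.com/emergency", "Same-day callouts and pricing"),
            ("Water heaters", "https://example.com/water-heaters", "Installation and maintenance"),
        ],
        "Policies": [
            ("Warranty", "https://example.com/warranty", "Coverage terms for completed work"),
        ],
    },
)

print(PROFILE_TEXT)
```

Whatever tool produces the file, the facts in it should be the same ones your schema, contact page, and Google Business Profile already state.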
Turn results into a roadmap
Tackle failures in order: unblock crawling, align facts across surfaces, remove PDF-only traps, then publish structured truth in llms.txt. If you want an independent verification pass, run our Site Scan on the homepage. It surfaces the same signals agents use when deciding whether to trust your business, from token bloat to missing profiles.
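If you track audit results in a script or spreadsheet export, the ordering above can be applied mechanically. In this Python sketch the stage labels and the mapping of each failure to a stage are assumptions you assign during the audit; only the stage order comes from the paragraph above.

```python
# Stage order from the roadmap: crawling, then facts, then PDF traps, then llms.txt.
# The short stage keys are hypothetical labels.
STAGE_ORDER = ["crawl", "facts", "pdf", "llms_txt"]


def build_roadmap(failures: list[tuple[str, str]]) -> list[str]:
    """Sort (item, stage) failures by stage, keeping audit order within a stage."""
    ranked = sorted(failures, key=lambda failure: STAGE_ORDER.index(failure[1]))
    return [item for item, _stage in ranked]


# Hypothetical audit results, in the order they were written down.
FAILURES = [
    ("10. llms.txt missing", "llms_txt"),
    ("6. Menu is PDF-only", "pdf"),
    ("2. robots.txt blocks /services/", "crawl"),
    ("7. GBP hours differ from site", "facts"),
    ("1. sitemap.xml returns 404", "crawl"),
]

for step, item in enumerate(build_roadmap(FAILURES), start=1):
    print(step, item)
```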
How this ties to broader SEO strategy
AI readiness does not replace technical SEO, content strategy, or local link building. It intersects them. Fast pages help crawlers and humans. Clear service pages help intent matching in search and in assistants. Reviews help trust everywhere. Use this audit alongside your existing roadmap rather than as a one-time box-checking exercise.
If you are planning a site migration, rerun the audit before and after launch. Migrations are a top cause of schema regressions, broken internal links, and accidental noindex tags. A second pass prevents silent AI visibility drops while rankings look temporarily stable.
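One of those regressions, the accidental noindex tag, is easy to check for in saved page HTML. The Python sketch below looks only at the robots meta tag; the names `RobotsMetaFinder` and `has_noindex` are hypothetical, and it does not inspect the X-Robots-Tag HTTP header or crawler-specific meta tags, which can also carry noindex.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        for token in (attributes.get("content") or "").split(","):
            if token.strip():
                self.directives.add(token.strip().lower())


def has_noindex(html: str) -> bool:
    """True if the page carries a robots meta tag with a noindex directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical before and after snapshots of the same page.
BEFORE = '<head><meta name="robots" content="index, follow"></head>'
AFTER = '<head><meta name="ROBOTS" content="NOINDEX, nofollow" /></head>'

print(has_noindex(BEFORE))  # False
print(has_noindex(AFTER))   # True
```

Running it over the same list of key URLs before and after launch gives you a simple diff of pages that silently dropped out of the index.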
