When someone asks an AI to recommend a business, the AI needs information. It needs to know who you are, what you do, where you operate, what you specialise in, and what makes you different. Most businesses give it nothing. They have a website, maybe a Google Business Profile, and that is it. The AI is left to guess, or worse, to recommend someone else entirely.
Some businesses give AI systems everything. They provide their identity, their brand guidelines, their verified facts, their frequently asked questions, their usage permissions. Those businesses get recommended. Not because they gamed a system, but because they made it easy for the AI to represent them accurately.
That is what AI discovery files do. And if you have not heard of them yet, this is the guide that explains exactly what they are, who is using them, and what your business gains by implementing them.
What AI Discovery Files Actually Are
AI discovery files are a collection of structured files placed on your website that help AI systems understand your business. Think of them as your business's formal introduction to every AI assistant, chatbot, and search engine that might recommend you.
They are not a new type of sitemap. They are not a list of your pages or products. They answer different questions entirely:
- Who is your business?
- What do you stand for?
- How should AI systems represent you?
- What are verified facts about your services?
- What permissions do AI systems have to use your content?
Your website tells customers what you do. AI discovery files tell AI systems who you are. If you're on WordPress, you can control exactly what AI says about your business with a free plugin that generates all 10 files from your dashboard.
The 365i AI Visibility Definition specification defines 10 file types, each serving a specific purpose. We will cover all of them later in this article, but the core idea is simple: give AI systems the structured information they need to recommend you accurately, and they will.
What They Are Not
There is a growing amount of confusion about what these files should contain, and it is leading businesses in the wrong direction.
AI discovery files are NOT another sitemap. They are not a list of your pages, products, services, or blog posts. If your llms.txt file is just a list of URLs, you have created a sitemap in markdown format, and that is a waste of a file.
They are not a replacement for good SEO or a quality website. A business with poor content and no online authority will not suddenly appear in AI recommendations because it created an llms.txt file. AI discovery files give context to AI systems, but that context needs to be backed by real expertise and a website that delivers on its promises.
They are not keyword stuffing for AI systems. Stuffing your llms.txt with sales copy or SEO keywords defeats the purpose. These files work best when they contain honest, verified, structured information about your business.
The distinction between "another sitemap" and "an AI identity" is critical, and it is the reason why some people wrongly believe these files do not work. More on that in a moment.
Who Is Already Using Them
The companies building AI infrastructure are the same companies implementing AI discovery files on their own websites. That should tell you something.
- Anthropic (the company behind Claude) maintains llms.txt for its own documentation
- Cloudflare has 3.7 million tokens in its llms-full.txt file
- Stripe structures its AI-readable content by product category
- Vercel has roughly 400,000 words of AI-consumable documentation
- Shopify, Supabase, Hugging Face, Zapier, and Perplexity all maintain their own implementations
Community directories now track over 13,000 websites with llms.txt files. The adoption curve is steepening, not flattening. Our Q1 2026 research crawling 1,460 top websites found 6.5% already have at least one AI Discovery File, outpacing humans.txt and closing on security.txt adoption rates.
These are not fringe early adopters hoping something catches on. These are the companies building the AI systems that recommend businesses to users. They are telling AI systems who they are. The question is whether your business is doing the same.
The Data So Far
Let us address the elephant in the room. An SE Ranking study of 300,000 domains found no measurable correlation between having an llms.txt file and being cited in AI responses.
That sounds damning. It is not, and here is why.
The study treated llms.txt as binary: either a site had one, or it did not. It did not assess quality. It did not check whether the file contained actual business information or was just an auto-generated list of URLs.
The vast majority of llms.txt files in the wild are auto-generated by SEO plugins. Yoast SEO generates a 5-URL markdown list. Rank Math generates up to 100 URLs. Both are literally sitemaps in markdown format. They tell AI systems nothing about the business, its expertise, its brand, or its verified facts.
There are roughly 800 manually crafted llms.txt implementations across the web. There are millions of auto-generated URL lists. When you measure all of them together and find "no correlation," you have not proven that AI discovery files do not work. You have proven that auto-generated URL lists do not work. Which is exactly what you would expect.
Measuring auto-generated URL lists and concluding that AI discovery files are useless is like measuring spam emails and concluding that email marketing does not work.
Meanwhile, CDN data from Profound and Mintlify confirms that ChatGPT's crawlers are actively fetching llms-full.txt files. AI systems are looking for this information. The question is whether what they find is worth consuming.
The 10 Files and What Each Does
The 365i AI Visibility Definition specifies 10 file types. Each serves a distinct purpose in helping AI systems understand and represent your business.
| File | What It Does | Why It Matters |
|---|---|---|
| llms.txt | Business identity in markdown format | Core identity document that AI systems can consume directly |
| llms.html | Human-readable HTML version with Schema.org data | Bridges AI consumption and human accessibility in one file |
| ai.txt | AI usage permissions and restrictions | Controls how AI systems can use your content and data |
| ai.json | Machine-parseable interaction guidance | Structured data AI systems can process programmatically |
| identity.json | Schema.org-aligned canonical identity data | Consistent identity representation across all AI platforms |
| brand.txt | Brand naming conventions and representation rules | Prevents AI from misrepresenting your brand name or positioning |
| faq-ai.txt | Verified Q&A pairs for AI consumption | Gives AI systems accurate, pre-verified answers to common questions |
| developer-ai.txt | Technical context for developers | Helps AI coding tools assist developers working with your platform |
| robots-ai.txt | AI crawler-specific access directives | Controls which AI crawlers can access which parts of your site |
Not every business needs all 10 files on day one. But the businesses that implement even three or four of them are giving AI systems dramatically more context than the ones relying on a plugin-generated URL list.
The full specification, including file formats, field definitions, and implementation guidance, is available at the 365i AI Visibility Definition specifications page.
What You Gain
The benefits of AI discovery files are practical and measurable. This is not about hoping for future advantage. It is about controlling how AI represents your business today.
Control. You decide how AI represents your business. Without AI discovery files, AI systems guess based on whatever they can scrape from your website, third-party directories, and reviews. With them, you provide the definitive version. Your name, your positioning, your services, your verified facts.
Accuracy. AI hallucination is a real problem. AI systems sometimes invent services you do not offer, attribute reviews to the wrong business, or mix up your details with a competitor's. Verified information in structured files reduces that risk. When an AI system has a verified FAQ from your faq-ai.txt file, it uses that instead of guessing.
Consistency. Without AI discovery files, ChatGPT might describe your business one way, Claude another, and Gemini a third. With identity.json and brand.txt, every AI platform has the same canonical source of truth. Same name format. Same brand positioning. Same verified facts.
Discoverability. AI systems need structured data they can parse efficiently. A 3,000-word "About Us" page buried three clicks deep on your website is harder for AI to process than a clean, structured identity.json file at your root. Making information easy to consume makes your business easier to recommend. Want proof? This free tool shows what AI actually sees when it visits your website, and the difference between the human view and the AI view is striking.
First-mover advantage. While SEO professionals debate whether AI discovery files are "standardised" enough to implement, the businesses that have already done it are building AI visibility that compounds over time. As we reported, 94% of UK SME websites are currently invisible to ChatGPT. That is not a crowded market. That is an open field.
Getting Started
You do not need to implement all 10 files at once. Start with the files that deliver the most immediate value and build from there. If you're on WordPress, our free AI Discovery Files plugin handles all of this automatically.
Step 1: llms.txt and identity.json. These are the foundational pair. llms.txt gives AI systems your business identity in plain markdown. identity.json provides the same information in a Schema.org-aligned JSON format that machines can process directly. Together, they cover the "who are you?" question from both human-readable and machine-readable angles.
Step 2: ai.json. This file provides structured interaction guidance. It tells AI systems what your business does, what services you offer, and how to interact with your content. Unlike free-form text, JSON format means AI can parse it programmatically with zero ambiguity.
Step 3: faq-ai.txt. Write out the 10 to 20 questions your customers ask most often, with accurate, verified answers. When an AI system encounters a question about your services, it can draw from this file instead of guessing. This is one of the highest-impact files because it directly addresses the information AI systems need for recommendations.
Step 4: brand.txt and ai.txt. Brand.txt ensures AI systems use the correct name format and brand positioning. ai.txt sets permissions for how AI can use your content. Both are quick to implement and prevent misrepresentation.
The full specification for all 10 file types is documented on the 365i AI Visibility Definition specifications page. And if you want to see what Google Gemini itself learns from these files, we ran a live test with real code examples from our own implementation.
If you want help implementing AI discovery files for your business, our AI visibility service covers the full specification, from initial audit to deployment and verification across AI platforms. Once your files are live, you can also register in the AI Discovery Files Directory for a verified listing with dofollow backlinks and an AI visibility score.
Frequently Asked Questions
What are AI discovery files?
AI discovery files are a set of structured files placed on your website that help AI systems understand who your business is, what you do, and how you should be represented. They include files like llms.txt (business identity in markdown), ai.json (structured interaction guidance), identity.json (canonical identity data), brand.txt (brand naming rules), and faq-ai.txt (verified Q&A pairs). Unlike sitemaps that list pages, AI discovery files describe your business identity.
Are AI discovery files the same as a sitemap?
No. This is a common misconception, partly because SEO plugins like Yoast and Rank Math auto-generate llms.txt files that are literally just URL lists, which is a sitemap in markdown format. Proper AI discovery files describe your business identity, brand guidelines, verified facts, and usage permissions. They answer who you are and how AI should represent you, not just what pages exist on your website.
Do AI systems actually read these files?
Yes. CDN data from Profound and Mintlify shows that ChatGPT crawlers are actively fetching llms-full.txt files. Anthropic (Claude), Perplexity, and other AI systems have documented their use of llms.txt for context. Companies like Cloudflare, Stripe, Vercel, and Shopify maintain full AI discovery files specifically because AI systems consume them.
Is llms.txt an official standard?
The llms.txt specification was proposed by Jeremy Howard of Answer.AI in September 2024 and is maintained as a community specification at llmstxt.org. It is not ratified by a standards body like the W3C or IETF. However, robots.txt operated for 28 years without formal standardisation (1994 to 2022) and was universally adopted. The absence of formal standardisation has not prevented adoption by major companies including Anthropic, Cloudflare, Stripe, and Vercel. Google's position is nuanced: John Mueller called markdown page mirrors for AI "a stupid idea", but Google's own Search Central team briefly added llms.txt documentation to their developer docs in December 2025.
Which AI discovery files should I create first?
Start with llms.txt and identity.json as a foundational pair. llms.txt provides your business identity in a format AI systems can read immediately, while identity.json gives structured canonical data aligned with Schema.org. Next, add ai.json for interaction guidance and faq-ai.txt with verified answers to your most common customer questions. The full specification at 365i.co.uk covers all 10 file types.
Do AI discovery files replace SEO?
No. AI discovery files complement SEO, they do not replace it. SEO ensures your website appears in traditional search results. AI discovery files ensure AI systems like ChatGPT, Claude, and Gemini can accurately represent your business when people ask for recommendations. Both are needed for full online visibility in 2026.
Why did the SE Ranking study say llms.txt has no effect?
The SE Ranking study of 300,000 domains treated llms.txt as binary: either a site had one or it did not. It did not assess quality. The vast majority of llms.txt files in the wild are auto-generated URL lists from SEO plugins like Yoast (5 URLs) and Rank Math (up to 100 URLs). These are sitemaps in markdown format, not AI identity files. Measuring auto-generated URL lists and concluding that AI discovery files do not work is like measuring spam emails and concluding that email marketing does not work.
Will Google penalise me for having AI discovery files?
No. AI discovery files are static text and JSON files placed on your server. They do not interfere with Google crawling, do not use cloaking or deceptive techniques, and do not violate any Google webmaster guidelines. They are comparable to robots.txt, humans.txt, or security.txt: standard files that provide information to automated systems. Google itself indexes and caches llms.txt files from major websites.
Get Your Business AI-Ready
AI systems are recommending businesses right now. Most UK businesses are invisible to them. We implement the full AI Visibility Definition specification, from llms.txt through to identity.json, giving your business a verified identity across ChatGPT, Claude, Gemini, and Perplexity.
AI Visibility Service Get in Touch