What The Fitness Industry Actually Looks Like To AI Agents

Ian Mullane, CEO and Founder of Keepme, goes under the hood of our global audit of 901 fitness operators to explain why most gym websites are "speaking the wrong language" to AI. From the hidden pitfalls of JavaScript-injected schema to the outdated robots.txt files that ignore modern crawlers like ClaudeBot and OAI-SearchBot, Ian breaks down the specific technical gaps that keep even the largest fitness brands invisible to the answer engines of 2026.
Ian Mullane
May 11th, 2026

In my last article, I shared the headline findings from our audit of 901 fitness operators across 27 countries: a global average AEO score of 21 out of 100, 84% of operators at the level we call Pre-AEO, and 49 sites actively blocking the AI crawlers they should be welcoming. I promised to follow up with what can be done. The response to the first article - the questions, the messages, and the conversations it started - convinced me that jumping straight to the fix would skip over something important. There is enough interest in the detail of what we found, and enough misunderstanding about why these specific gaps matter, that a deeper look at the findings is worth its own piece. The practical guidance on what to do about it follows later this week.

Most operators have a robots.txt file. Almost none have updated it for AI.

The robots.txt file is the document that tells automated visitors which parts of a site they are allowed to access. Every operator of any scale has one. The problem is that the major AI crawlers arrived after most of these files were last touched. OAI-SearchBot, which powers ChatGPT Search, was not a consideration when most operators last thought about their robots.txt. Neither was PerplexityBot, Google-Extended for Gemini, or ClaudeBot.

The result is a document written in 2019 or 2021 that says nothing deliberate about any of the systems that now account for a growing share of how prospective members find businesses. There are 13 AI crawlers worth explicitly addressing, split across three categories: search crawlers that directly affect your visibility in AI answers, user-triggered agents that browse on behalf of live users, and training crawlers that feed the model pipelines. The operators who are getting this right name all 13 explicitly. The majority say nothing, relying on defaults that pre-date the question entirely.
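As a sketch only (not the full list of 13), a robots.txt that addresses AI crawlers deliberately names each one and states a policy per category. The user-agent strings below are the publicly documented ones; whether each stanza reads `Allow` or `Disallow` is a business decision, not a technical one:

```
# Search crawlers: directly affect visibility in AI answers
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# User-triggered agents: browse on behalf of a live user
User-agent: ChatGPT-User
Allow: /

# Training crawlers: feed the model pipelines
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /
```

The point is not the specific policy but the explicitness: a file that says nothing leaves the decision to each crawler's defaults.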

Before the schema findings: what schema actually is

Schema is a vocabulary of structured data that you embed in your website's code. It is not visible to human visitors. It exists entirely for machines. When an AI agent visits your website, it is not reading your page the way a person does, scanning the headline, looking at the photos, reading the about section. It is parsing the code for specific data fields it recognises: the type of business, the address, the phone number, the opening hours, the price of a membership. Schema is how you put that data into a format machines can reliably extract.

The analogy that lands most cleanly: think of schema as the label on the tin. The tin contains everything a human visitor can see and read. But an AI agent is not opening the tin. It is reading the label on the outside. If the label says nothing useful, or says the wrong thing, the contents are irrelevant. The agent moves on.

The schema problem is not absence. It is the wrong kind of presence.

I expected to find most operators with no schema at all. What I found was more nuanced, and in some ways more concerning. A large proportion of operators have schema. They have it because their WordPress plugin generates it automatically. The problem is that the schema a standard Yoast or RankMath install produces is generic: it tells an AI agent that this is a website, that it belongs to an organisation, and that the organisation has a name and a logo. What it does not say is that the organisation is a gym. It does not say where the gym is. It does not list the clubs, their addresses, their opening hours, or what a membership costs.

ExerciseGym and LocalBusiness are the schema types that do that work. Neither appears in the default output of any major SEO plugin. Across the 901 operators in the audit, the presence of fitness-specific schema was the exception rather than the rule, even among the largest multisite groups.
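For illustration, here is a minimal sketch of the kind of JSON-LD block that does this work, using the schema.org `ExerciseGym` type. Every name, address, and URL below is hypothetical; a real deployment would carry one such block per location, with values matching the live site:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "ExerciseGym",
  "name": "Example Fitness Club - Downtown",
  "url": "https://www.example-fitness.com/clubs/downtown",
  "telephone": "+1-555-010-0000",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main Street",
    "addressLocality": "Springfield",
    "addressRegion": "IL",
    "postalCode": "62701",
    "addressCountry": "US"
  },
  "openingHoursSpecification": [{
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "opens": "06:00",
    "closes": "22:00"
  }],
  "priceRange": "$$"
}
</script>
```

Compare this with the generic plugin output, which typically stops at `Organization`, a name, and a logo: the difference is precisely the fields an AI agent needs to answer "is there a gym near me, and what does it cost?"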

We also found a failure mode I had not anticipated. One operator, a well-run, professionally marketed multi-club group, had schema pointing to a staging server URL rather than their live website. Every AI crawler that read their schema was directed to a development environment. The brand entity could not be resolved with any confidence. In most cases this is not negligence: it is a configuration that was never checked, by people who did not know to look.

Schema that loads through JavaScript does not exist for most AI crawlers.

This is the most technically significant finding and the hardest to explain without sounding more complicated than it is. The mechanism is straightforward. When an AI crawler visits a page, it sends a request, receives the raw HTML, and leaves. It does not open a browser. It does not wait for JavaScript to execute. This means that any schema injected through Google Tag Manager, which fires after page load, inside a browser, is invisible to GPTBot, ClaudeBot, PerplexityBot, and OAI-SearchBot. Those crawlers departed before GTM had a chance to run.

This is not a marginal issue. GTM is how many operators have attempted to deploy schema, guided by advice that made sense for traditional SEO but does not translate to AI crawlers. The schema exists in the sense that a human visitor in a browser would find it. For the crawlers that matter for AEO, it never arrived.
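The mechanism is easy to demonstrate. The sketch below mimics what a non-rendering crawler does: it parses raw HTML for `application/ld+json` blocks without executing any scripts. The HTML is a contrived example; note that the GTM loader tag is present in the source, but the schema GTM would inject after page load simply is not there to be found:

```python
from html.parser import HTMLParser
import json


class JsonLdExtractor(HTMLParser):
    """Collects <script type="application/ld+json"> contents from raw HTML,
    the way a crawler that never executes JavaScript would."""

    def __init__(self):
        super().__init__()
        self._in_ld_json = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self._in_ld_json = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_ld_json = False

    def handle_data(self, data):
        if self._in_ld_json:
            self.blocks.append(json.loads(data))


# Raw HTML exactly as a crawler receives it: the GTM snippet is in the
# source, but any schema GTM injects at runtime never appears here.
raw_html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "ExerciseGym", "name": "Example Gym"}
</script>
<script src="https://www.googletagmanager.com/gtm.js?id=GTM-XXXX"></script>
</head><body>...</body></html>
"""

parser = JsonLdExtractor()
parser.feed(raw_html)
print(parser.blocks)  # only the server-rendered schema block is found
```

Run the same extraction against a page whose only schema arrives via GTM and `parser.blocks` comes back empty, which is the experience GPTBot, ClaudeBot, PerplexityBot, and OAI-SearchBot have of that page.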

The llms.txt file: most operators have never heard of it, and the ones who have are doing it wrong.

An llms.txt file is a structured, machine-readable document at the domain root that gives AI agents a direct and accurate account of what the business is, where it operates, and what it offers. Think of it as a briefing document written specifically for machines. It is not a substitute for schema, but it is the mechanism through which an AI agent can read your location addresses, your membership tiers, your class types, and your pricing in a single, retrievable document without needing to crawl every page.

Of the 901 operators, almost none had one. Among the handful who did, most had an auto-generated version produced by an SEO plugin as a page index: a list of blog posts and site pages with no location data, no pricing, and no operational detail. One operator, a premium multi-club group with eight locations across North America, had an llms.txt file listing articles about Pilates and tennis. Their eight club addresses were absent entirely.
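To make the contrast concrete, here is a sketch of what a useful llms.txt might contain, following the conventions of the llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of annotated links). Every name, figure, and URL is hypothetical:

```markdown
# Example Fitness Group

> Multi-club fitness operator with 8 locations. Memberships from $49/month.
> Day passes, group classes, and personal training available at all clubs.

## Locations

- [Downtown Club](https://www.example-fitness.com/clubs/downtown): 123 Main
  Street, Springfield. Open Mon-Fri 06:00-22:00, Sat-Sun 08:00-20:00.
- [Riverside Club](https://www.example-fitness.com/clubs/riverside): 45 River
  Road, Springfield. Open daily 06:00-22:00.

## Memberships

- [Pricing](https://www.example-fitness.com/pricing): Standard $49/month,
  Premium (all clubs) $79/month. No joining fee.

## Classes

- [Class timetable](https://www.example-fitness.com/classes): Yoga, Pilates,
  HIIT, and spin across all locations.
```

A blog-post index answers none of the questions an AI agent is asked about a gym; the version above answers most of them in a single retrievable document.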

The finding underneath all the findings

The common thread across all five components is the same. The fitness industry built its web infrastructure for two types of visitor: human beings using browsers, and Google's search crawler. Both get what they came for. AI agents get a different experience. They do not render JavaScript, so they miss anything injected after page load. They look for structured data in specific formats, so generic schema adds noise rather than signal. They benefit from a structured content document, which most operators do not provide. And they check permissions in a file that most operators have not updated since AI search was a concept rather than a commercial reality.

The operators who address these gaps do not need to rebuild their websites. The changes are structural and relatively contained. In my next article, I will set out exactly what those changes are, and how quickly the gap between where the industry is today and where it needs to be can be closed.