by Wayne Smith
It may be useful to consider how we got here:
- Schema began as fuel for snippets, knowledge panels, and local results.
- SEOs soon discovered it increased visibility in ways not spelled out in Google’s documentation — perhaps an early sign of AI-based refinement.
- These observations gave rise to entity-based SEO.
- AI Overviews shifted the focus toward entity-based understanding.
- Conversational AI is changing how people search — queries are longer, more natural, and more specific, with users expecting direct answers rather than a list of links.
- Today, AI results sit above traditional SERPs, with entities aiding visibility in both AI-driven features and traditional search results.
Google: AI Features & Your Website
Keywords still get you on the field. Beyond keyword-based SEO, observational evidence shows that content tightly focused around a specific entity improves your chances of appearing in AI-powered results.
Part One: LLMs do interact with Schema
"LLMs and Schema, how LLMs read Schema" studies how LLMs read both JSON-LD and Microdata schema. An LLM does not need to programmatically parse schema to interpret it; in fact, LLMs can semantically read broken schema.
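The gap between parsing and reading can be pictured with a small check: a strict JSON parser rejects a block over a single trailing comma, while the same text still plainly states its key/value pairs. This is only a sketch. The `broken_jsonld` sample and the helper name `strict_parse_ok` are invented for illustration, and the substring test merely stands in for "readable as text"; it does not model how any LLM works.

```python
import json

# Hypothetical JSON-LD with a trailing comma: invalid JSON, still readable as text.
broken_jsonld = """{
  "@context": "https://schema.org",
  "@type": "Thing",
  "name": "knobbly monster",
  "description": "crocodile",
}"""


def strict_parse_ok(text: str) -> bool:
    """Return True only if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# A programmatic consumer gives up on the whole block...
parser_accepts = strict_parse_ok(broken_jsonld)

# ...while the words themselves are still there for anything reading the raw text.
still_mentions_crocodile = "crocodile" in broken_jsonld
```

Here `parser_accepts` is `False` and `still_mentions_crocodile` is `True`, which is the distinction the paragraph above draws.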
The Schism Between Natural Discovery and SEO
Here’s the irony: almost every page online is “search engine friendly” to some degree. Even without deliberate optimization, pages naturally include elements that help search engines interpret them — a homepage, for example, almost always names the entity (the brand or site) in the title.
The real schism isn’t technical — it’s cultural. Content creators often see SEO as manipulative, while in practice, optimization is closer to a natural aid for discovery.
AI optimization sits firmly in this ‘friendly’ zone. The task isn’t to game algorithms but to make sure LLMs can actually parse and connect your entities. Google and other search providers may not reveal the full playbook (partly for IP reasons, partly because the rules shift as AI evolves), but the principle is stable: if machines can’t understand your entities, you lose visibility.
JSON vs Microdata Schema
Both JSON-LD and Microdata can be used to clarify entities and synonyms, giving AI systems a better understanding of your content.
- JSON-LD is generally regarded as cleaner, easier to maintain, and keeps markup separate from the visible page content.
- Microdata, however, can be more explicit and tightly bound to context — useful in tricky cases where ambiguity could confuse an AI system (see the knobbly monster example below).
An often-overlooked benefit of JSON-LD is the clear division of labor: writers create content, coders handle the schema. However, frequent updates can break this advantage, as schema must stay in sync with content. Mistakes may hurt AI-driven visibility or even appear manipulative. Schema should always remain in the ‘friendly’ zone.
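As a rough illustration of both points (synonyms in JSON-LD, and schema drifting out of sync with the copy), the sketch below builds a schema.org `Thing` with an `alternateName` and then flags schema terms the visible text no longer mentions. The function names, the sample copy, and the "evening gown" synonym are assumptions made for this example; a flagged term is a prompt for human review, not proof of an error.

```python
import json


def entity_jsonld(name, alternate_names, description):
    """Wrap a schema.org Thing, including synonyms, in a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Thing",
        "name": name,
        "alternateName": alternate_names,
        "description": description,
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


def terms_missing_from_copy(terms, page_text):
    """List schema terms that the visible copy does not mention (case-insensitive)."""
    lowered = page_text.lower()
    return [term for term in terms if term.lower() not in lowered]


# Hypothetical page copy after an edit that dropped the synonym.
page_text = "A formal dress is expected at most black-tie events."
schema_terms = ["formal dress", "evening gown"]

snippet = entity_jsonld(
    "formal dress", ["evening gown"], "Clothing worn at formal events."
)
stale = terms_missing_from_copy(schema_terms, page_text)  # ["evening gown"]
```

A check like this could run whenever writers update a page, so the coder maintaining the schema hears about the change.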
What do LLMs consider as an entity?
Google’s Knowledge Panel was seeded from Wikipedia and other curated datasets. LLMs work differently: they read natural language as-is, treating nearly every noun and many noun phrases (noun+) as entities.
For example:
- Noun: “dress”
- Noun+: “formal dress” → which has its own Wikipedia page
Linguistically, “formal” is an adjective in English, but entities are multilingual. In practice, “formal dress” stands on its own as an entity in an LLM, schema, or Knowledge Panel.
And this isn’t just theory:
- Different entities = different SERPs
- Noun+ entities (like “formal dress”) don’t cannibalize their root (“dress”)
- Each can support its own page, markup, and content strategy
No formal list of LLM entities
Here’s the kicker: LLMs have no official list of entities. They generate them on the fly. Drop a new brand, product, or local business into their context, and it will be treated as an entity, even if it has never been seen before (see the EntityNLM research). Promoting your entity off-site helps reinforce recognition.
Here’s a handy trick to test if an LLM is interpreting entities correctly:
- Consider how software translators can be tested: translate content to another language, then back. Mismatches stand out.
- Ask the LLM to rephrase or change the tone of your content, then see where entity understanding breaks down.
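The rephrasing test above can be partly automated once the LLM output is in hand. The following is a minimal sketch under stated assumptions: the rewrite is pasted in by hand rather than fetched from any API, the entity list is chosen by the author, and `entities_lost` is a hypothetical helper that does plain substring matching.

```python
import re


def entities_lost(entities, rewritten_text):
    """Return the entity phrases that no longer appear in the rewritten text."""
    normalized = re.sub(r"\s+", " ", rewritten_text.lower())
    return [entity for entity in entities if entity.lower() not in normalized]


# Entities the original copy was written around (chosen by hand for this sketch).
original_entities = ["formal dress", "wedding"]

# Hypothetical LLM rewrite of the original copy.
rewritten = "Guests should wear elegant gowns to the wedding."

lost = entities_lost(original_entities, rewritten)  # ["formal dress"]
```

A phrase that disappears in the rewrite, as "formal dress" does here, is worth a closer look: the model may have treated it as a loose description and not as an entity in its own right.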
The Knobbly Monster: Connecting entities with microdata schema
Think about the “knobbly monster” riddle. People enjoy riddles because solving them gives a little dopamine kick—it feels rewarding, makes us smile, and sticks in memory. That’s why playful language is so common in writing and why quirky YouTube videos can compete with cat videos. Sometimes this is deliberate: content that people remember is more effective for marketing than content Google labels as merely “helpful.”
Here’s an example:
"The crocodile walked onto the sidewalk, and people called the police because the knobbly monster was scaring people."
Humans solve this quickly. We understand that “knobbly monster” really means “crocodile.” Writers often use this trick to keep content engaging and fresh.
For machines, it’s more complicated. Without guidance, an AI might see “crocodile” and “knobbly monster” as two separate things. That’s where Microdata schema (schema.org guide) helps. It links the terms so the AI knows they refer to the same entity.
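A minimal sketch of what such Microdata could look like, wrapped in a small Python check that lists the properties a machine would find. The markup is an assumption for illustration (a schema.org `Thing` named "knobbly monster" whose `description` is carried in a `meta` tag), and `ItempropCollector` is a hypothetical helper, not a full Microdata parser.

```python
from html.parser import HTMLParser

# Hypothetical markup: "knobbly monster" is the name, "crocodile" the description.
MICRODATA_HTML = """
<p itemscope itemtype="https://schema.org/Thing">
  The crocodile walked onto the sidewalk, and people called the police because the
  <span itemprop="name">knobbly monster</span>
  <meta itemprop="description" content="crocodile">
  was scaring people.
</p>
"""


class ItempropCollector(HTMLParser):
    """Collect itemprop values from content attributes or from element text."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._open_prop = None  # itemprop whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        prop = attr.get("itemprop")
        if prop is None:
            return
        if "content" in attr:
            # Value supplied in the markup itself, invisible to human readers.
            self.props[prop] = attr["content"]
        else:
            # Value is the visible text inside the element.
            self._open_prop = prop
            self.props[prop] = ""

    def handle_data(self, data):
        if self._open_prop is not None:
            self.props[self._open_prop] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the property being read.
        self._open_prop = None


collector = ItempropCollector()
collector.feed(MICRODATA_HTML)
props = collector.props  # {"name": "knobbly monster", "description": "crocodile"}
```

Both phrases end up attached to the same item, which is the link a reader makes without help.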
If you use this HTML as a prompt for an AI image generator, it won’t produce two creatures. You’ll get one crocodile. The "description" property makes the connection clear. That’s the power of schema: it reduces ambiguity and helps machines interpret content the way humans naturally do.
Gain of knowledge or word density
When markup defines or describes a knobbly monster as a crocodile, it’s tricky to pinpoint why visibility improves. Is it because the meta content subtly boosts keyword density, or because it provides a gain of knowledge that the AI or search engine recognizes as new or reinforced information?
- Keyword density alone affects the terms a page can rank for.
- Gain of knowledge (or fresh content) — information not present on older pages but validated elsewhere — can also boost visibility.
Without an explicit disclosure from LLMs or search engines, the exact reason a page benefits remains uncertain. But both mechanisms influence what the AI or search engine perceives as relevant or authoritative.
AI-Generated Images: Testing Entity Interpretation
AI image generators can be a surprisingly useful tool for checking whether LLMs interpret entity relationships correctly. The prompt in the example below introduces Sam as an entity, but it is ambiguous who he is: the policeman, or the crocodile? (The crocodile appears only in the Microdata "description" property.) This illustrates how AI-generated images can help test whether content makes its entities clear.
Example:
"Sam walked onto the sidewalk, and people called the police because the knobbly monster was scaring people."
Using the following HTML as the prompt — where “crocodile” exists only in the microdata — the AI may “hallucinate” that Sam is a policeman who is also a crocodile:
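The sketch below shows one possible form of that markup, together with a check for terms that exist only in the markup and never in the visible sentence. The HTML is an illustrative assumption (the same hypothetical `Thing` item, with "Sam" replacing "The crocodile" in the text), and `VisibleVsMarkup` is a made-up helper name.

```python
from html.parser import HTMLParser

# Hypothetical markup: "crocodile" appears only in the meta content attribute.
AMBIGUOUS_HTML = """
<p itemscope itemtype="https://schema.org/Thing">
  Sam walked onto the sidewalk, and people called the police because the
  <span itemprop="name">knobbly monster</span>
  <meta itemprop="description" content="crocodile">
  was scaring people.
</p>
"""


class VisibleVsMarkup(HTMLParser):
    """Separate human-visible text from values carried only in itemprop markup."""

    def __init__(self):
        super().__init__()
        self.visible = []      # text nodes a reader would see
        self.markup_only = []  # values of itemprop elements with a content attribute

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if "itemprop" in attr and "content" in attr:
            self.markup_only.append(attr["content"])

    def handle_data(self, data):
        self.visible.append(data)


scan = VisibleVsMarkup()
scan.feed(AMBIGUOUS_HTML)

# Collapse whitespace so the sentence reads as a reader would see it.
visible_text = " ".join("".join(scan.visible).split())

# Terms present in the markup but absent from the visible sentence.
hidden_terms = [
    term for term in scan.markup_only if term.lower() not in visible_text.lower()
]  # ["crocodile"]
```

Nothing in the visible sentence ties Sam to the crocodile or to the police, so a term that lives only in the markup leaves room for exactly this kind of misreading.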
Even minor markup ambiguity can mislead AI, which highlights the importance of explicit entity connections in your schema. If Sam were a key person in a search context, the entities in the query and the entities in the content would need to align.
How AI entities change the future of SEO
Zero-click searches for informational queries are rising. Search engines now surface entity-driven answers directly in AI overviews, knowledge panels, or featured snippets.
For informational intent, users want breadth, not just a single keyword. Entity-based, AI-aware SEO helps AI systems understand concepts and relationships, often expanding a query into related entities or LSI-like terms to capture full context.
To appear in AI overviews, your page must cover this fanned-out query space, aligning content with the network of related entities AI expects. Even if a page ranks poorly for the exact term, it can still surface via related entities.
Another approach is brand awareness. Here, marketing value doesn’t require clicks. Appearing in AI overviews or other answer-based results increases visibility and familiarity.
Brand awareness for informational search is like reputation management, but focused on the brand as an entity. Schema data—what AI reads about the brand—helps define how your brand is represented, while also aligning with what other sites say. If the schema lists one address and the web another, which does AI trust? Brand awareness becomes a tool to correct such inconsistencies.
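Spotting such inconsistencies can start with a simple comparison. The sketch below normalizes the address from a page's schema and compares it against addresses gathered from other sites; the addresses, the source labels, and both function names are hypothetical, and the normalization is deliberately crude (it will not reconcile "St" with "Street", for example).

```python
import re


def normalize_address(address: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace for a rough comparison."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    return " ".join(cleaned.split())


def conflicting_sources(schema_address, external_addresses):
    """Return the sources whose address differs from the one in the schema."""
    target = normalize_address(schema_address)
    return [
        source
        for source, address in external_addresses.items()
        if normalize_address(address) != target
    ]


# Hypothetical data: the address in the page's schema vs. what other sites list.
schema_address = "123 Main St., Springfield"
external = {
    "directory-a": "123 Main St, Springfield",
    "directory-b": "456 Oak Ave, Springfield",
}

conflicts = conflicting_sources(schema_address, external)  # ["directory-b"]
```

Each source in `conflicts` is a listing to correct, or a sign that the schema itself is out of date.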
Many questions remain, but AI is advancing rapidly.
Category pages: real-life examples of entities (content breadth) or keywords
A category page defines the main entity — the category itself — but depends on multiple supporting pages to build topical authority. This interlinked breadth signals to AI that the category is more than a keyword cluster: it’s a recognized entity with context and relationships. A well-structured category page acts as a hybrid informational intent page, fanning out across related entities while anchoring them under a single, authoritative topic.
For a deeper dive into schema for a category or "CollectionPage" schema, see: Schema for AI-Aware SEO: Topical Pages and Hubs
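As a rough sketch of how a category page might declare its supporting pages, the snippet below builds `CollectionPage` JSON-LD that names the category as its subject via `about` and lists child pages under `hasPart`. The function name, the example.com URLs, and the choice of `about` and `hasPart` are assumptions for illustration; the linked article may recommend a different structure.

```python
import json


def collection_page_jsonld(category_name, category_url, supporting_pages):
    """Build CollectionPage JSON-LD listing (title, url) supporting pages."""
    data = {
        "@context": "https://schema.org",
        "@type": "CollectionPage",
        "name": category_name,
        "url": category_url,
        # The category itself is the entity the page is about.
        "about": {"@type": "Thing", "name": category_name},
        # Supporting pages that give the category its breadth.
        "hasPart": [
            {"@type": "WebPage", "name": title, "url": url}
            for title, url in supporting_pages
        ],
    }
    return json.dumps(data, indent=2)


# Hypothetical category and supporting pages.
markup = collection_page_jsonld(
    "Formal dress",
    "https://example.com/formal-dress/",
    [
        ("Evening gowns", "https://example.com/formal-dress/evening-gowns/"),
        ("Black tie dress codes", "https://example.com/formal-dress/black-tie/"),
    ],
)
```

The resulting string would go inside a `<script type="application/ld+json">` tag on the category page.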
Solution Smith Testing Protocols
Solution Smith approaches SEO and AI-aware entity testing with the same rigor as software testing — methodically and with evidence. Features related to entities, schema, and AI interpretation are observed, logged, and verified through repeated experiments to understand how content is surfaced in AI-driven search and overviews.
This process allows us to evaluate whether schema markup, entity relationships, and content structure genuinely influence AI visibility and performance — without relying on public disclosures from Google or other search engines.