by Wayne Smith
Keyword research identifies the terms that connect your content with the right audience, focusing on relevance and intent rather than raw search volume. It aligns what customers seek with what the brand offers, linking marketing goals to search visibility.
Although often treated separately, keyword research and marketing research overlap heavily. Both must support the same market position and value propositions. When they operate in isolation, opportunities are easily missed. Query and entity research surface real-time demand, unmet needs, and emerging topics that traditional marketing research may overlook, which makes collaboration essential.
Search data often reveals critical insights into customer intent, behavior, competitor positioning, unmet needs, problem–solution gaps, and signals of market size. Digital research also exposes market shifts as they occur.
The shift from strings to things reflects how search engines now evaluate which pages best represent topics or entities rather than relying on keyword matching. Modern entity-focused SEO builds on this by mapping relationships between entities (the “things” that define topical relevance) and using related entities as long-tail keywords or supporting topics within a hub-and-spoke structure, where the hub anchors the core entity and spokes explore connected subtopics in depth.
Long-Tail to Fan-Out Queries
Visibility is not a zero-sum ranking game. Experienced SEO practitioners traditionally used long-tail keyword strategies to capture more search opportunities than those targeting a single high-volume term. Today, search engines extend this approach through fan-out query techniques, which identify entities that demonstrate experience, expertise, and topical depth. While these signals are not direct ranking factors, they expand visibility by increasing a site's presence across a broader set of related queries.
Early long-tail theories suggested Google relied on Latent Semantic Indexing (LSI) to improve relevance. Google later clarified that it uses vector-based semantic models—such as Word2Vec, BERT, and MUM—to interpret relationships between words, entities, and concepts. This evolution supports fan-out query analysis and RAG-style approaches, enhancing entity integration and strengthening Answer Engine Optimization (AEO) by helping systems identify authoritative answers (E-E-A-T).
A strong fan-out strategy is increasingly critical as organic results continue to move lower on search results pages. Even a first-place ranking can receive zero clicks if the link is pushed below the fold by SERP features, AI Overviews, or answer boxes. Broader query coverage mitigates this impact by increasing visibility across more entry points.
SEO is Query-Based
Although modern SEO emphasizes entities—the “things” behind the words—the search results are still triggered by queries, and every query carries user intent. Search behavior remains the initiating signal; entities and their relationships merely refine how that intent is interpreted and matched to content.
Each search results page (SERP) presents a mix of features—such as ads, maps, shopping units, videos, and AI-generated summaries—distributed across different visual layers or channels. In many cases, organic listings now appear below the fold, meaning they may receive few or even zero clicks.
When organic results are pushed down the page, SEO strategy must adapt. One effective approach is to position the target query as a category or hub page that performs well for navigation intent, while supporting content focuses on query fan-out—the related searches and subtopics surfaced by AI Overviews or semantic expansion. This alignment helps maintain visibility across both human and AI-driven query interpretations.
AI Query Modifications
“Observing the AI-Based Search Query Transformation” examines how search engines increasingly reinterpret and correct user queries toward canonical entities. As a result, exact keyword matching keeps losing value as a precision signal. Understanding these AI-driven behaviors shows that effective keyword research now depends less on exact phrasing and more on identifying the entities, context, and intent behind searches.
In practice, this means optimizing for the full semantic breadth of a topic—creating interconnected content ecosystems that align with how AI systems recognize and relate meaning, rather than relying solely on isolated keyword targets.
The brand as a canonical entity – a critical keyword
When people search for a brand name or seek information about its products (a core market research subject), they generate signals around the brand’s keywords. Algorithms like Navboost interpret this user behavior, which helps search engines recognize these keywords and improves their visibility.
The primary entity data for a brand consists of the products, services, or solutions it offers—these are the main topical keywords for the site. Additionally, NAP (name, address, and phone number) is foundational for establishing the brand as an entity and supporting visibility. Customer perceptions and reviews are an important factor in search visibility, but are not formally part of the brand’s entity graph or the site’s keyword graph.
Terms like “best” are not entities. When search engines process a query, they identify the entity within it and rank pages based on how well they are optimized for that entity. For example, searches for “good attorney in Los Angeles” or “best attorney in Los Angeles” typically return similar results. However, if visitors are likely to use modifiers like “best,” those words should be incorporated in titles and content to align with user intent and improve clarity.
In short, the products, services, or solutions your brand provides define the entities and keywords that should be clearly presented through site navigation and content. Supplemental words like “best” are useful only when they enhance clarity or meet searcher expectations.
Search intent keywords
In the broadest sense, search intent can be classified as informational, navigational, commercial, and transactional—a high-level framework for understanding user goals. However, matching content or keywords to search intent is often deeper and more nuanced, requiring consideration of context, phrasing, and the entity or topic the user is seeking.
For example, if the goal is to create a bottom-of-the-funnel transactional canonical entity for a product, all of the product’s features become critical keywords. By contrast, if the search query is a question about the product, then keywords related to point-of-view, FAQs, and user reviews become critical for capturing informational or consideration intent.
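As a rough illustration of the four broad intent classes, here is a minimal Python sketch of a modifier-based first pass. The modifier lists, the `classify_intent` function, and the `brand_terms` parameter are assumptions made for this example, not a description of how any search engine classifies queries. As noted above, real intent matching is more nuanced than a word list.

```python
# Hypothetical modifier lists; tune these to the vocabulary of your own market.
INTENT_MODIFIERS = {
    "transactional": ("buy", "order", "coupon", "price", "cheap"),
    "commercial": ("best", "top", "review", "vs", "compare"),
    "informational": ("how", "what", "why", "guide", "tutorial"),
}


def classify_intent(query, brand_terms=()):
    """Return a coarse intent label for a query using simple word matching."""
    tokens = query.lower().split()
    # A query containing a brand term is treated as navigational (an assumption).
    if any(term.lower() in tokens for term in brand_terms):
        return "navigational"
    # Intents are checked in dictionary order, so transactional modifiers win ties.
    for intent, modifiers in INTENT_MODIFIERS.items():
        if any(modifier in tokens for modifier in modifiers):
            return intent
    # No modifier matched: leave the query for manual review.
    return "unclassified"


for q in ("buy blue widget", "best blue widget", "how to install a widget", "blue widget"):
    print(q, "->", classify_intent(q))
```

Queries that come back as "unclassified" are usually bare entity searches, and those need a look at the actual results page to judge intent.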
Individual user intent:
While often overlooked, user-group or audience-level intent is important for keyword research. Different audiences—such as investors, medical professionals, or developers—use their own domain-specific lexicons or keywords (for example, “black swan” or “grey rhino” among investors). These specialized vocabularies can be detected by LLM systems and may influence visibility in future algorithm updates.
Currently, individual or audience intent is not recognized as a direct ranking factor. What can be observed, however, is that AI Overviews often guide users toward more specific or unambiguous searches that line up with different audiences. These systems tend to favor pages that use precise, experience-based language, effectively using linguistic specificity as a proxy for expertise or firsthand understanding. The lexicons used by experts become critical words to gain visibility.
This doesn’t mean a bricklayer must personally write the content—but the writer should incorporate the bricklayer’s knowledge, perhaps by interviewing them or using their terminology or technical wording directly. The goal is to reflect genuine subject-matter insight in the language itself, signaling depth and credibility to both users and AI systems.
AI Overviews as a keyword research tool
AI Overviews guide users toward more specific and unambiguous search terms. This refinement tends to favor sites and pages that demonstrate strong topical expertise and linguistic precision. The process operates through pattern- and rule-based matching—content that mirrors the vocabulary and phrasing used by recognized experts in a field is more likely to surface within LLM-driven results.
These linguistic refinements depend on the clarity and quality of content—not on superficial signals like author photos, résumés, or backlinks. While backlinks act as proxies for trust and authority, they only support topical relevance; they do not create it.
Here’s the exciting part:
AI Overviews reveal what large language models have already inferred about how topics, terms, and expertise relate. They act as a window into how AI systems interpret semantic precision and topical authority. In practice, using AI Overviews as a research tool allows you to observe which phrasing, terminology, and contextual relationships Google considers most aligned with user understanding of a topic.
Even when users ask simple or factual questions and never click through, those impressions still strengthen brand awareness. Appearing in AI Overviews functions as zero-click exposure—similar to non-converting visits—that reinforces a brand’s presence and credibility.
Limitations:
AI Overviews reflect the current state (today's, not tomorrow's) of an LLM’s knowledge and the relationships between entities and related terms. By examining how these entities connect, it is possible to identify gaps where additional content or context can extend the AI’s understanding, highlighting opportunities that go beyond what the model currently captures.
Knowledge, however, is dynamic—like rankings and search patterns, it evolves continually. The objective isn’t just to mirror what AI already knows, but to expand upon it with original insight and real-world experience—in short, to create gain of knowledge.
Keyword/entity cannibalization
Keyword or entity cannibalization occurs when multiple pages on a site target—or appear for—the same search query. When this overlap happens, search engines must determine which page offers the most relevant or authoritative response, often causing ranking fluctuations or reduced visibility for both.
For example, for a query like “best blue widget,” Google may test different result types by showing both informational and transactional pages. Because search behavior and context evolve, rankings for such blended queries can shift seasonally or as algorithms refine their understanding of the topic.
To prevent this, map keywords and entities to distinct pages with clear topical boundaries. When multiple pages compete for closely related queries, search engines try to identify a single, most representative version. If signals are divided between pages, both may lose visibility and authority—especially when each performs better for a slightly different variation of the same topic.
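One way to spot overlap before remapping pages is to group query data by query and look for cases where more than one page takes a meaningful share of impressions. The sketch below assumes a hypothetical export of (query, page, impressions) rows. The `find_cannibalized_queries` function and the 20% `min_share` threshold are illustrative choices, not an established standard.

```python
from collections import defaultdict


def find_cannibalized_queries(rows, min_share=0.2):
    """Return {query: [pages]} for queries where several pages compete.

    rows: iterable of (query, page_url, impressions) tuples (assumed format).
    min_share: a page counts as competing only if it holds at least this
               fraction of the query's total impressions (assumed threshold).
    """
    by_query = defaultdict(lambda: defaultdict(int))
    for query, page, impressions in rows:
        by_query[query.strip().lower()][page] += impressions

    flagged = {}
    for query, pages in by_query.items():
        total = sum(pages.values())
        if total == 0:
            continue
        # Keep pages above the share threshold, strongest first.
        competing = sorted(
            (page for page, count in pages.items() if count / total >= min_share),
            key=lambda page: (-pages[page], page),
        )
        if len(competing) > 1:
            flagged[query] = competing
    return flagged


sample_rows = [
    ("best blue widget", "/blog/blue-widget-guide", 600),
    ("best blue widget", "/products/blue-widget", 400),
    ("blue widget price", "/products/blue-widget", 900),
    ("blue widget price", "/blog/blue-widget-guide", 50),
]

print(find_cannibalized_queries(sample_rows))
```

In this sample only "best blue widget" is flagged, because the second page for "blue widget price" holds too small a share to count as competing.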
Practical Solutions to Cannibalization
One approach is to consolidate competing pages into a single, comprehensive resource—especially if the content serves overlapping purposes. However, merging pages can sometimes blur focus or weaken the clarity of the query match. In those cases, a different strategy is better.
A balanced alternative is a hub-and-spoke structure, where a primary “hub” page provides the main overview or transactional focus, while supporting pages (the “spokes”) explore related entities or features in greater depth and link back to the hub. This structure clarifies topical hierarchy and reduces internal competition.
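The linking pattern described here can be checked mechanically. The following sketch assumes you already have a map of internal links, for example from a site crawl. The `check_hub_and_spoke` function and the sample URLs are hypothetical. The check that the hub links out to every spoke is an added assumption, since the text above only requires spokes to link back to the hub.

```python
def check_hub_and_spoke(hub, spokes, links):
    """Report spokes that break the hub-and-spoke linking pattern.

    hub: URL of the hub page.
    spokes: list of spoke page URLs.
    links: dict mapping a page URL to the set of internal URLs it links to.
    """
    hub_links = links.get(hub, set())
    return {
        # Spokes that do not link back to the hub.
        "missing_link_to_hub": [s for s in spokes if hub not in links.get(s, set())],
        # Spokes the hub does not link to (an extra check assumed here).
        "not_linked_from_hub": [s for s in spokes if s not in hub_links],
    }


site_links = {
    "/widgets": {"/widgets/blue", "/widgets/red"},
    "/widgets/blue": {"/widgets"},
    "/widgets/red": {"/widgets/blue"},
    "/widgets/green": {"/widgets"},
}

report = check_hub_and_spoke(
    "/widgets",
    ["/widgets/blue", "/widgets/red", "/widgets/green"],
    site_links,
)
print(report)
```

In the sample, "/widgets/red" never links back to the hub, and "/widgets/green" is not reachable from it, so each shows up in one of the two lists.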
Consider a multi-channel strategy of publishing informational or exploratory content on a secondary platform, such as YouTube or community sites. This can help distribute overlapping topics without diluting the main site’s visibility.
The canonical loophole
Cannibalization may not be an issue when the query targets a site or brand name. In these cases, multiple pages on the site may include the same keyword, and this can also occur within a category or hub where several pages share a common keyword. Relying solely on search engines for keyword research can be misleading, as canonical pages often appear to have a “free pass.” The key distinction is that these queries reflect navigational intent rather than informational or transactional intent.
It’s also incorrect to judge cannibalization by comparing content on other sites. Canonical entities are dynamic: as new information emerges, the page considered most authoritative for a topic can change. In short, the loophole works until it doesn't. Large sites like Amazon or news sites can be seen as canonical within search and appear to get a "free pass." This is not because these sites are treated differently; it is because they often represent the canonical source for the information.
By carefully identifying long-tail keywords and related entities, it is possible to structure content around specific entities, often eliminating the risk of perceived cannibalization entirely.
Long-tail keywords or related entities
Incorporating long-tail keywords or related entities on a page demonstrates subject depth and contextual understanding—often providing unique or original insights that set the content apart from competing sources.
AI-driven search systems analyze these relationships through query “fan-out,” expanding a single question into semantically related concepts. Pages that effectively address these related entities are considered more comprehensive and therefore more relevant.
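A simple way to approximate this on your own pages is to compare a list of related entities against the page text. The sketch below is only a rough audit aid: the `fan_out_coverage` function is hypothetical, the entity list is something you would collect yourself from related searches, and plain substring matching is a deliberate simplification that says nothing about how search systems actually score coverage.

```python
def fan_out_coverage(page_text, related_entities):
    """Return (covered, missing, ratio) for related entities found in page_text.

    Matching is a case-insensitive substring test, so it will miss synonyms
    and can match inside longer words.
    """
    # Normalize whitespace and case before matching.
    text = " ".join(page_text.lower().split())
    covered = [e for e in related_entities if e.lower() in text]
    missing = [e for e in related_entities if e.lower() not in text]
    ratio = len(covered) / len(related_entities) if related_entities else 0.0
    return covered, missing, ratio


page = "Our blue widget ships with a mounting bracket and a two-year warranty."
entities = ["mounting bracket", "warranty", "installation guide", "replacement parts"]

covered, missing, ratio = fan_out_coverage(page, entities)
print("covered:", covered)
print("missing:", missing)
print("coverage ratio:", ratio)
```

The missing list is the useful output: each entry is a candidate subtopic for the page itself or for a supporting spoke page.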
This broader coverage also reinforces perceived experience and expertise. While E-E-A-T remains a subjective framework, search systems use measurable proxies—such as topical completeness, entity relationships, and consistency across sources—to estimate whether content likely reflects genuine expertise.
What is a canonical entity?
Beyond the technical meaning of a canonical URL, a “canonical entity” is the primary or most authoritative version of an entity within semantic search—the one that other related entities derive from or connect back to.
The term “canonical” appears across disciplines: in religion, it denotes the official or accepted texts; in biology, it describes the most complete or representative form of a protein from which variants are derived. Similarly, in SEO, a canonical entity represents the definitive version of a topic or object within the knowledge graph.
For AI-aware SEO, the objective is to create or establish the canonical entity—ensuring your content is recognized as the authoritative representation of that topic or entity. Long-tail keywords and related entities are essential in this process, helping search systems understand the breadth, context, and depth of the canonical entity.