In simple terms, an entity is a noun or anything that can follow the word "this," or anything that can be the subject of a sentence.
An entity index or entity dataset is built around the properties and attributes of an entity, which has both hierarchical and related entity structures. An entity can be a type of another entity, with different unique properties.
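As a rough illustration of the structure described above, here is a minimal sketch of one entry in an entity index. The `Entity` class, its field names, and the rule that a property is inherited from the parent type are all assumptions made for this example, not a description of how any search engine stores its data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    """One node in a hypothetical entity index."""
    name: str
    type_of: Optional["Entity"] = None            # hierarchical link to the parent type
    properties: dict = field(default_factory=dict)
    related: list = field(default_factory=list)   # names of related entities

    def lineage(self):
        """Return the chain of type names from this entity up to the root."""
        chain, node = [], self
        while node is not None:
            chain.append(node.name)
            node = node.type_of
        return chain

    def lookup(self, key):
        """Find a property here or on the nearest ancestor type.

        Inheriting from the parent type is an assumption of this sketch.
        """
        node = self
        while node is not None:
            if key in node.properties:
                return node.properties[key]
            node = node.type_of
        return None


# Illustrative data only.
wine = Entity("Wine", properties={"category": "beverage"})
chardonnay = Entity("Chardonnay", type_of=wine)
oaked = Entity("Oaked Chardonnay", type_of=chardonnay,
               properties={"taste": "honey"}, related=["Oak barrel"])

print(oaked.lineage())           # ['Oaked Chardonnay', 'Chardonnay', 'Wine']
print(oaked.lookup("taste"))     # honey
print(oaked.lookup("category"))  # beverage (found on the Wine ancestor)
```

The child entity carries its own unique properties, while anything it does not define is resolved by walking up the type hierarchy.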
The AI optimization factors are similar to those used for on-page search engine optimization (SEO). A key difference is that instead of displaying a list of websites, AI presents the information as properties or attributes in natural language. For optimization to work, the information must be available at the URL in a format from which AI can parse and extract these attributes or properties.
With organic results, the related properties or entities may be spread across the full page ... with Google's overview and OpenAI using the related properties or entities that are in the same text block.
For Google's AI overview, the attributes and properties are taken from a block of text, where the attributes and properties are related to the main entity of the text block. The text block is rewritten using AI and may include additional information from unknown sources.
Content Qualification Factors / SEO Ranking
A relationship between ranking in organic search and appearing in AI search has been observed. Pages that rank in SEO are seed data ... A page banned from organic listings can remain helpful in AI Overview.
One might say OpenAI is a distillation of Microsoft Bing search, and AI Overviews is a distillation of Google search. Both extract data from the top search results. AI Overviews can take information from a page that ranks for a non-competitive term and use it in the overview for a competitive term.
Pages must have extractable entities to be eligible for a citation. Many pages that work for normal keyword-based searches are excluded.
There is an apparent preference for statements of fact using general TLDR terminology.
Source: An entity is “a single person, place, or thing about which data can be stored”. In an SEO context, these entities are stored in something called a relational database, meaning that the entities stored a unique ID then fleshed out with information specific to that entity.
Output: Entities are considered unique objects with specific attributes like name, type, and relationships.
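To make the "unique ID plus entity-specific information" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`entity`, `attribute`), the `add_entity` helper, and the sample row are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the schema is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (
        id   INTEGER PRIMARY KEY,      -- the entity's unique ID
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE attribute (
        entity_id INTEGER NOT NULL REFERENCES entity(id),
        key       TEXT NOT NULL,
        value     TEXT NOT NULL
    );
""")


def add_entity(name, **attrs):
    """Insert an entity, then flesh it out with its own attributes."""
    cur = conn.execute("INSERT INTO entity (name) VALUES (?)", (name,))
    entity_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO attribute (entity_id, key, value) VALUES (?, ?, ?)",
        [(entity_id, k, v) for k, v in attrs.items()],
    )
    return entity_id


honey_id = add_entity("Honey", type="Food", taste="Sweet")

rows = conn.execute(
    "SELECT key, value FROM attribute WHERE entity_id = ? ORDER BY key",
    (honey_id,),
).fetchall()
print(rows)  # [('taste', 'Sweet'), ('type', 'Food')]
```

Storing attributes as key/value rows keeps the sketch short; a production schema would likely look different.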
Entity Optimization Factors
Page-wide gain of knowledge content is a factor for traditional SEO but is not used by AI. Good content presents the answer to the question. Content considered too thin in conventional search may still provide a good answer for AI. AI needs entity-specific information.
Content selected for the entity by AI natural language search normally has a positive or neutral sentiment.
The usage of terms like recommended or best increases the positive sentiment of the content.
Other sentiments may exist or be used ... If a search query has a negative sentiment, the AI overview or rewrite could carry a negative sentiment taken from, or cite, a source with a negative sentiment.
India had a problem: one of its branches of government had so many pages that a traditional keyword-based search did not provide satisfactory results for its clients. The labor needed to help these people navigate the bureaucracy was considerable.
The documents were written using legal terms ... the search needed to translate a layman's query into legal terms (or entities with the same meaning), extract the answer from several pages, and provide an answer translated back into layman's entities.
The size of the document pool was small enough that AI and Natural Language Processing could be a solution. They worked with the open-source AI community to build one. It worked, and the rest is history.
Smart people have the superpower to make the simple complicated
AI is extracting the data and creating an entity-relationship model. It looks at a sentence and finds the subject, finds the verb, finds the modifiers, and finds the attribution. The entire language model is beyond the scope of this post.
AI reads, "Sweet is a type of taste," and extracts:
Entity: Sweet
TypeOf: Taste
AI reads, "Honey is sweet," and extracts the property:
Entity: Honey
Property Taste: Sweet
It reads "Oaked Chardonnay is a type of Chardonnay" and extracts the relationship:
Entity: Oaked Chardonnay
TypeOf: Chardonnay
It reads "Oaked Chardonnay has a taste profile which includes honey," and extracts the property:
Entity: Oaked Chardonnay
Property Taste: Honey
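The extractions listed above can be imitated with a toy pattern matcher. This is a sketch only: real systems use a full language model rather than regular expressions, and the three patterns, the `extract` function, and the fallback predicate `Is` are assumptions made for this example. The one idea it tries to show is that "Honey is sweet" becomes a Taste property because Sweet was already recorded as a type of Taste.

```python
import re

# Hand-written patterns for the sentence shapes in the wine example.
TYPE_OF = re.compile(r"^(.+?) is a type of (.+?)\.?$", re.IGNORECASE)
HAS_TASTE = re.compile(r"^(.+?) has a taste profile which includes (.+?)\.?$", re.IGNORECASE)
IS_VALUE = re.compile(r"^(.+?) is (.+?)\.?$", re.IGNORECASE)


def norm(text):
    """Normalize capitalization so 'sweet' and 'Sweet' are the same entity."""
    return text.strip().title()


def extract(sentences):
    """Turn simple sentences into (entity, predicate, value) triples."""
    triples = []
    type_of = {}  # remembers "X is a type of Y" facts seen so far
    for sentence in sentences:
        s = sentence.strip()
        if m := TYPE_OF.match(s):
            subj, obj = norm(m.group(1)), norm(m.group(2))
            type_of[subj] = obj
            triples.append((subj, "TypeOf", obj))
        elif m := HAS_TASTE.match(s):
            triples.append((norm(m.group(1)), "Taste", norm(m.group(2))))
        elif m := IS_VALUE.match(s):
            subj, value = norm(m.group(1)), norm(m.group(2))
            # Name the property after the value's known type, if any.
            triples.append((subj, type_of.get(value, "Is"), value))
    return triples


facts = extract([
    "Sweet is a type of taste",
    "Honey is sweet",
    "Oaked Chardonnay is a type of Chardonnay",
    "Oaked Chardonnay has a taste profile which includes honey",
])
for fact in facts:
    print(fact)
```

Run on the four sentences, this produces the same four entity records shown above, with `TypeOf` and `Taste` as the predicates.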
To answer the question, "Which Chardonnay is sweet?", it can look through its data for the types of Chardonnay and find the one that tastes sweet, or the one with a taste that includes honey, which it already knows is sweet.
Even though AI did not read, "Oaked Chardonnay is generally sweeter than unoaked Chardonnay," it can artificially look intelligent and provide the answer.
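The lookup described above can be sketched as a small query over stored facts. The `TRIPLES` list, the helper names, and the rule of following Taste links from one entity to the next are assumptions made for this example. The "Unoaked Chardonnay" row is added here only for contrast and has no taste recorded.

```python
# Facts from the wine example, plus one contrast row (an assumption).
TRIPLES = [
    ("Sweet", "TypeOf", "Taste"),
    ("Honey", "Taste", "Sweet"),
    ("Oaked Chardonnay", "TypeOf", "Chardonnay"),
    ("Oaked Chardonnay", "Taste", "Honey"),
    ("Unoaked Chardonnay", "TypeOf", "Chardonnay"),
]


def values(subject, predicate):
    """All values stored for one entity and one predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def has_taste(entity, taste, seen=None):
    """True if the entity tastes of `taste`, directly or through another
    entity in its taste profile (Oaked Chardonnay -> Honey -> Sweet)."""
    if seen is None:
        seen = set()
    if entity in seen:          # guard against circular data
        return False
    seen.add(entity)
    for t in values(entity, "Taste"):
        if t == taste or has_taste(t, taste, seen):
            return True
    return False


def which(kind, taste):
    """Answer 'which <kind> is <taste>?' from the stored triples."""
    subtypes = sorted(s for s, p, o in TRIPLES if p == "TypeOf" and o == kind)
    return [s for s in subtypes if has_taste(s, taste)]


print(which("Chardonnay", "Sweet"))  # ['Oaked Chardonnay']
```

No stored fact says that Oaked Chardonnay is sweet; the answer comes from chaining two facts together.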
The entity-relationship model, a by-product of AI, can be used independently of the computer that created the model.