Thin Content vs Gain of Information

by Wayne Smith

The TL;DR: "thin content" refers to a page that fails to provide new information beyond what is already available online. Gain of information comes from sharing information that has not previously been published on the internet.

The longer explanation is that if similar content exists on the same website, it can also lead to keyword cannibalization, because content is evaluated primarily on the keywords or entities present on the pages.

In AI search or AI-guided web search, the gain of information is delivered within a text block. In contrast, traditional search engines evaluate the gain of information across the entire page, which can also include links to additional resources.

For example:

If a website published a gift list for birthdays, weddings, etc., and listed the same gifts with the only difference being the word "birthday" on one page and "weddings" on the other ... it is thin, cannibalizing content.

If a train schedule were put online with a page for each train leaving the station -- the only difference between the pages being the time and the train number -- the content would be thin and cannibalizing.

However, if the differences include features like first-class accommodations, sleeping compartments, and a dining car, the content would add practical information. Nonetheless, there may still be some overlap in keywords, leading to potential cannibalization.
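The overlap in the examples above can be made concrete with a rough similarity check. This is only a sketch of the idea -- not any search engine's actual algorithm -- using a simple Jaccard ratio over word sets; the sample page texts are invented:

```python
def jaccard(a: str, b: str) -> float:
    """Ratio of shared words to total distinct words across two pages."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# Two hypothetical gift pages that differ by a single word.
birthday = "gift ideas candles photo frame gift card for a birthday"
wedding = "gift ideas candles photo frame gift card for a wedding"

print(jaccard(birthday, wedding))  # → 0.8
```

A score near 1.0 means the pages say essentially the same thing: thin, cannibalizing content. Adding genuinely different details (sleeping compartments, a dining car) pushes the score down.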

Scraped or AI Content

Automated systems that scrape or syndicate content from other sites are, by their nature, not original and contribute no new information.

Thin content is not determined by a word count.

Add breadth, depth, and originality to fix thin content

Generally, just adding breadth and depth to the content will effectively make it not thin. One can also add a point of view to the information -- for persuasive marketing, the point of view should be one the reader can identify with.

A POV lines up with Google's Search Quality Rater Guidelines for EEAT. Although EEAT is not an algorithm, it is part of what Google is attempting to achieve.

Schema markup can be used to help specify the point of view.
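For instance, schema.org's `audience` and `about` properties can signal whom a page is written for and what it covers. This is only an illustrative fragment, not a required pattern; the headline and audience values here are hypothetical:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Wedding Gift Ideas",
  "audience": {
    "@type": "Audience",
    "audienceType": "wedding guests"
  },
  "about": {
    "@type": "Thing",
    "name": "wedding gifts"
  }
}
```

A sibling birthday-gifts page would declare a different `audience` and `about`, reinforcing that the two pages serve different queries.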

A wedding gifts page can expand on why certain gifts are ideal for weddings, which prevents thin content. And making the page more relevant for a search containing the word "weddings" than for a search like "sweet 16 birthday gifts" prevents cannibalization.

So the pages are likely to show up for different search queries -- and they differ by far more than a single word.

Beyond POV, consider delving deeper into key points (or entities). AI-guided search essentially looks at blocks of text that explain an entity or key point. Adding, say, the menu of a train's dining car adds gain of information for both traditional search and AI-guided search.
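Block-level gain of information can be sketched the same way -- again purely as an illustration, not how any engine actually works: a block earns credit only for details not already covered by what is published. The corpus and block text below are invented:

```python
def new_details(block: str, already_published: set) -> set:
    """Words in this text block that the existing corpus does not cover."""
    return set(block.lower().split()) - already_published

# What the internet already says about this train (hypothetical).
corpus = set("the night train leaves at 22:10 from platform 4".split())

# A new block adding dining-car details.
block = "the night train has a dining car serving a full breakfast menu"

print(new_details(block, corpus))
```

The schedule facts contribute nothing new, but the dining-car details survive the subtraction -- that remainder is the gain of information.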

The expansion into AI-guided search provides more visibility, and the user can normally follow the citation from the AI-guided answer back to the site that provided the details.


Consequences of thin or cannibalizing content

The pain point for search engines is that these pages do not add value to the search results but consume resources and degrade the search engine's performance.

Generally, these pages are candidates to become "Crawled - currently not indexed."

The test for thin and cannibalizing content

Algorithmically, the content is identified because both pages are relevant for, and show up in, the same search results.

A search engine does not need to run virtual tests to determine when this happens. When people use the search engine, it filters out additional pages from the same site that would have appeared below the best, or most optimized, page. By recording which pages were filtered out, the search engine can later reduce the relevancy of those filtered-out pages.
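That feedback loop can be sketched in a few lines. Everything here -- the data structures, site names, and the demotion counter -- is hypothetical, meant only to illustrate the mechanism of keeping one page per site and recording the rest:

```python
from collections import defaultdict

def filter_same_site(results):
    """Keep only the best page per site; record the rest as filtered out."""
    seen_sites = set()
    kept, filtered = [], []
    for site, page, score in results:  # results are ordered best-first
        if site in seen_sites:
            filtered.append(page)  # a weaker page from an already-shown site
        else:
            seen_sites.add(site)
            kept.append(page)
    return kept, filtered

demotions = defaultdict(int)

# Hypothetical ranked results for one query.
results = [
    ("example.com", "/wedding-gifts", 0.92),
    ("example.com", "/birthday-gifts", 0.91),  # cannibalizing its sibling
    ("other.com", "/gift-guide", 0.80),
]
kept, filtered = filter_same_site(results)
for page in filtered:
    demotions[page] += 1  # repeated filtering lowers future relevancy
```

Here `/birthday-gifts` never reaches the user; each time it is filtered, its demotion count grows, which is the sense in which the filter "tests" cannibalization for free.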

If a page has no search terms it is actively relevant for ... it is a candidate for "Crawled - currently not indexed."


The September helpful content update hit websites with cannibalization issues hard.

During this period, desktop search results on Google used infinite scroll, so the filter had to work harder to remove pages that were cannibalizing, or competing with, other pages on their own site.

Sites with good user engagement were unaffected, as were most branded websites.

Google has since turned off infinite scroll on desktop search, and many of the sites that were hit have seen some recovery.

According to statements by Google, the filter penalty applies both during updates and continuously. It has been observed to have both page-level and site-level effects.


Old Thin vs New Thin

As per the TL;DR definition: a page that does not offer any new information beyond what already exists on the internet. Large sites with that content or keyword combination often already exist.

Generally speaking, the site that first publishes content is ranked ahead of sites that produce similar or even syndicated content. The burden of gain of information is placed on new content.

The algorithm that determines whether pages on different sites are nearly the same is not a filter applied during the search. Rather, it is a review of the results based on RankBrain or other ranking algorithms. Having a brand website may be of considerable help, as a brand is a different entity; and, algorithmically, that is a gain of information ... a technicality to be certain, but algorithms are all about technicalities. The reason why brands were less affected has not been scientifically tested.



... Solution Smith tests SEO tactics so you don't have to ...

Full-stack SEO has a lot of moving parts, and it takes time and effort to understand the nuances involved. Solution Smith takes care of the overhead costs associated with digital marketing resulting in real savings in terms of time and effort.