Published:
by Wayne Smith
[Crawled - currently not indexed] is not always a problem. It can be caused by:
- Thin content
- Orphaned or under-linked content
- Cannibalization
Crawling and indexing are two separate and independent functions of search engines. Many pages on YouTube, Twitter, and other social media sites are crawled but not indexed. In some cases, social media pages are in the index but have not been re-indexed recently.
It should not be seen as a critical error. In some cases it is good, and in others it is something that should be addressed.
Google has removed the show cache link from the SERP, but the URI scheme cache:http://example.com, entered in the Chrome browser, pulls the currently indexed content from Google.
As an equivalent to the robots meta value "noindex, follow"
Crawling and ranking are independent. The bot finds pages, reads and stores them, and looks for URLs on each page. Ranking is done on the stored page. A page that is not indexed may, however, not pass topical authority or juice.
Generally speaking, Google knows about many trillions of pages but indexes only half a trillion. Saying that Google generally does not follow links is scientifically correct, but it ignores that Google follows links based on factors beyond simply knowing about a URL.
A link from a page that Google knows about but does not index should be considered less important than a link from a page that is indexed with a noindex attribute.
However, neither condition guarantees that the linked page will not be crawled or appear in the index.
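To make the "noindex, follow" comparison concrete, here is a minimal sketch of reading robots directives from a page's HTML using only the Python standard library. The names RobotsMetaParser, robots_directives, and is_noindexed are hypothetical helpers invented for this illustration. The sketch looks only at meta tags named "robots"; it does not handle crawler-specific meta names or the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "robots":
            for directive in attr_map.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in the HTML (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


def is_noindexed(html: str) -> bool:
    """True when a robots meta tag on the page carries the noindex directive."""
    return "noindex" in robots_directives(html)
```

Run against a page whose head contains a robots meta tag with the content "noindex, follow", robots_directives would return both directives, and is_noindexed would report the page as excluded from the index while its links remain followable.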
Thin Content
The relationship between Google and SEOs can seem bipolar. Thin content is another term for unhelpful content, but without the negative bias attached to the word "unhelpful."
Not enough words or undetected soft 404s
One simple reason for a page to be crawled but not indexed is a lack of material on the page.
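As a rough way to spot pages with little material, the sketch below counts the words in a page's text while skipping script, style, noscript, and title elements. The helper names and the min_words default of 300 are assumptions made for illustration only; Google does not publish a word-count threshold, so treat the number as a starting point for your own review rather than a rule.

```python
from html.parser import HTMLParser


class BodyTextParser(HTMLParser):
    """Collects words that appear outside script, style, noscript, and title."""

    SKIPPED_TAGS = {"script", "style", "noscript", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self.words.extend(data.split())


def visible_word_count(html: str) -> int:
    """Count whitespace-separated words in the page text (hypothetical helper)."""
    parser = BodyTextParser()
    parser.feed(html)
    parser.close()
    return len(parser.words)


def looks_thin(html: str, min_words: int = 300) -> bool:
    # min_words is an arbitrary illustrative threshold, not a Google rule.
    return visible_word_count(html) < min_words
```

A page flagged by looks_thin is only a candidate for review. A short page can still be the right answer for a visitor, which is the point of the sections that follow.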
SEO strategist POV
If the page in question has value for visitors to the website but no value to the SEO strategy, or is supplemental content for visitors, check that the page:
- Is not needed for topical authority.
- Is not considered a hub critical to other pages on the site.
If both are true, no action needs to be taken. A noindex meta could be added to the page to move it out of the [Crawled - currently not indexed] box.
If the page is important for the SEO strategy, the content on the page needs to be improved with new knowledge (information gain), and links to the page need to be created.
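To find pages that need more links, one option is to count inbound internal links from a crawl of your own site. The sketch below assumes you already have a link graph as a dictionary mapping each page URL to the internal URLs it links to; how that graph is collected is outside the sketch. The function names and the min_inbound default are hypothetical.

```python
from collections import Counter


def inbound_link_counts(link_graph: dict) -> Counter:
    """Count inbound internal links per URL.

    link_graph maps each page URL to the internal URLs it links to.
    """
    # Start every known page at zero so orphaned pages still show up.
    counts = Counter({url: 0 for url in link_graph})
    for source, targets in link_graph.items():
        for target in set(targets):  # count each source page once per target
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


def under_linked(link_graph: dict, min_inbound: int = 1) -> list:
    """Return URLs with fewer than min_inbound inbound links, sorted."""
    counts = inbound_link_counts(link_graph)
    return sorted(url for url, n in counts.items() if n < min_inbound)


# Hypothetical four-page site used only for illustration.
site = {
    "/": ["/services", "/about"],
    "/services": ["/", "/about"],
    "/about": ["/"],
    "/tools/laser-cutter": ["/"],
}
```

With this example graph, under_linked(site) returns only "/tools/laser-cutter", because no other page links to it. Raising min_inbound to 2 also surfaces "/services", which has a single inbound link.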
SEO Content Creator POV
Thin Content Example
Title: CNC - Techtronics Laser Cutter
Our company uses the Techtronics Laser Cutter, CNC, for high-precision fabrication of components.
[image]
How our company uses quality tools to ensure quality components ...
This would normally be supplemental content for visitors to the site, but it is too thin to appear in the SERPs for the title term "CNC - Techtronics Laser Cutter."
While it is not useful for SEO, it may be very relevant for marketing to have the information on the site for visitors considering purchasing from the company.
It would not be useful to visitors to have all the specification details of the "Techtronics Laser Cutter" on the site, yet those details would be required for the page to appear in the SERPs.
Leaving supplemental content as [Crawled - currently not indexed]
If the content benefits visitors, it needs to be on the site and can be left as a [Crawled - currently not indexed] page.
If the content needs to be indexed?
The problem with the example is the subject entity of the page, "CNC - Techtronics Laser Cutter." Google and other entity-based search engines have built a relationship model around the term and expect content that agrees with and adds to the existing knowledge about the term.
If the site is branded, then the entity is the brand, and the brand is a blue ocean entity with no existing knowledge of its relationship to other entities. The brand can be the subject of the page, which has a relationship to the tools it uses.
Blue Ocean Example
Title: Our Company's High-Precision Fabrication
Our company uses the Techtronics Laser Cutter, CNC, for high-precision fabrication of components.
[image]
How our company uses quality tools to ensure quality components ...
For blue ocean content creation it is important that the domain is branded ... see Why a brand entity should be used as domain names
Technical SEO POV
For technical SEO, the important questions include:
- Is the URL a structural part of the site?
- Is the number of [Crawled - currently not indexed] pages excessive?
If the answer to both of these questions is no ... nothing needs to be done.
If the URL is structurally important, the content on the page needs to be improved, and internal links to the page are important.
If the URL does not need to be in the index but the number of [Crawled - currently not indexed] pages is excessive, adding a noindex meta to the page will move it out of the [Crawled - currently not indexed] box.
If the URL is a soft 404 or something else that should not be indexed, it needs a noindex meta or a robots.txt rule to keep it out of the index.
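The technical SEO rules above can be sketched as a small triage function. PageStatus and triage are hypothetical names, and the order in which the conditions are checked (soft 404 first, then structural importance, then excessive counts) is an assumption of this sketch, since the text does not rank them.

```python
from dataclasses import dataclass


@dataclass
class PageStatus:
    """Answers to the technical SEO questions for one URL (hypothetical type)."""

    is_structural: bool
    is_soft_404: bool = False


def triage(page: PageStatus, excessive_count: bool) -> str:
    """Map a [Crawled - currently not indexed] URL to a suggested action.

    excessive_count is True when the site-wide number of
    [Crawled - currently not indexed] pages is considered excessive.
    """
    # Assumed priority: pages that should never be indexed come first.
    if page.is_soft_404:
        return "keep out of index (noindex meta or robots.txt)"
    if page.is_structural:
        return "improve content and add internal links"
    if excessive_count:
        return "add noindex meta"
    return "no action needed"
```

For example, triage(PageStatus(is_structural=False), excessive_count=False) returns "no action needed", which matches the case where the answer to both questions is no.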