
How to spot orphan pages using OnCrawl?

May 10, 2016 - 3  min reading time - by Emma Labrador

Orphan pages, meaning web pages without any incoming links or parent page, deserve attention: you may be wasting valuable organic traffic and SEO value.
Knowing where your orphan pages are and re-attaching them to your website structure can be highly beneficial for your SEO.

What are orphan pages?

The word ‘orphan’ refers to the lack of parent pages, in other words pages that link to their child pages. An orphan page creates crawling problems for search engines because bots follow links to discover a website. If no link leads to a page, that page is nearly impossible to index unless it is linked to externally. Orphan pages are pages that cannot be reached from anywhere on the website and that users can’t find.

Even though orphan pages can be created intentionally to avoid being crawled, they are more often the result of a mistake: a webmaster forgot to interlink the pages.

Why do we get orphan pages?

An orphan page can be expected and set up on purpose, or not. Here are a few reasons for expected orphan pages:

  • Pages linked from external websites, such as redirects. Redirected pages are all orphans, as internal links should always point directly to the correct page.
  • Expired pages on a website with many short-lived pages. They expire during the crawl, so it can become dangerous if they remain orphans for too long.
  • Pages that returned errors which have since been corrected, but that Google still crawls for a while.

On the other hand, orphan pages can also occur unintentionally and become an issue:

  • Pages that are only linked in the structure depending on navigation criteria (like category pages or internal search result pages). Those pages should always be linked in the structure if they generate organic traffic.
  • Expired pages still returning content: some websites stop linking to old, expired content but do not return the right status code (such as a 404 or a redirect to a newer version). The expired page is thus still available.
  • Pages that have not been migrated correctly: there is no redirect and the old content is still available.
  • Syntax errors when creating canonical tags. They produce wrong URLs (returning HTTP 200 or errors).
  • Syntax errors when creating sitemaps. They produce wrong URLs that can deliver content and duplicates or return HTTP errors.

How to easily detect orphan pages?

Since bots struggle to find those pages, your SEO crawler can’t find them either. That’s why you need to combine your crawl data with a log analysis and cross-reference the two.
At OnCrawl, we offer cross-analysis that lets you go beyond simple crawl data by combining it with log analysis.
You can thus know all your pages: those in the structure and those not crawled by Google.
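
At its core, crossing crawl data with log data is a set comparison. Here is a minimal sketch in Python, assuming you already have two URL lists: one exported from a crawl of your structure and one extracted from Googlebot hits in your server logs. The function name and the sample URLs are hypothetical, and this is an illustration of the idea rather than OnCrawl's actual implementation.

```python
def find_orphan_pages(crawled_urls, log_urls):
    """Return URLs that appear in the logs but were never reached by the crawl."""
    # Strip trailing slashes so "/page" and "/page/" count as the same URL
    # (an assumption; adapt it to how your site treats trailing slashes).
    in_structure = {url.rstrip("/") for url in crawled_urls}
    in_logs = {url.rstrip("/") for url in log_urls}
    # Orphan candidates: known to the bot, absent from the internal link structure.
    return sorted(in_logs - in_structure)


crawled = ["https://example.com/", "https://example.com/blog/post-1"]
logged = [
    "https://example.com/blog/post-1/",
    "https://example.com/old-landing-page",
]
orphans = find_orphan_pages(crawled, logged)
print(orphans)
```

In this sample, only the old landing page is reported, since the blog post is reachable through the structure.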

 

From that data, you can see which pages are not counted by Google yet still generate SEO visits, and thus organic traffic. Those pages are still valuable.
OnCrawl also clearly shows how many orphan pages and active orphan pages you have. The latter are interesting because they have generated at least one SEO visit.

OnCrawl also displays the distribution of your orphan pages by page group, which lets you determine where they are located. You can click on any part of the graph to access the URLs.

You can also analyze the proportion of orphan pages and pages in the structure among all pages known by Google and OnCrawl by page group.
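
To make the idea of a proportion by page group concrete, here is a rough Python sketch. It assumes a page group is simply the first segment of the URL path, which is a hypothetical rule chosen for illustration; the function names and URLs are made up as well.

```python
from collections import Counter
from urllib.parse import urlparse


def page_group(url):
    # Hypothetical grouping rule: first path segment, or "home" for the root.
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[0] if segments else "home"


def orphan_share_by_group(structure_urls, orphan_urls):
    """Return, for each page group, the share of its known pages that are orphans."""
    in_structure = Counter(page_group(u) for u in structure_urls)
    orphans = Counter(page_group(u) for u in orphan_urls)
    shares = {}
    for group in set(in_structure) | set(orphans):
        total = in_structure[group] + orphans[group]
        shares[group] = orphans[group] / total
    return shares


structure = [
    "https://example.com/blog/a",
    "https://example.com/blog/b",
    "https://example.com/shop/x",
]
orphan_list = ["https://example.com/blog/old", "https://example.com/shop/y"]
print(orphan_share_by_group(structure, orphan_list))
```

A group with a high share is a good place to start re-linking pages.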

It is also interesting to know whether those orphan pages generate SEO visits. If they do, they could be optimized by linking them in the structure.

Regarding your crawl budget, you can also see whether Google wastes too much of it on your orphan pages.
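
One simple way to put a number on this is the share of bot hits that land on orphan pages. The Python sketch below assumes you have one URL per Googlebot hit from your logs plus a list of orphan URLs; the function name and sample paths are hypothetical.

```python
def orphan_crawl_share(bot_hits, orphan_urls):
    """Return the fraction of bot hits spent on orphan pages (0.0 to 1.0)."""
    if not bot_hits:
        return 0.0
    orphans = set(orphan_urls)
    # Count every hit, so a page crawled three times weighs three times.
    hits_on_orphans = sum(1 for url in bot_hits if url in orphans)
    return hits_on_orphans / len(bot_hits)


hits = ["/blog/a", "/blog/a", "/old-page", "/shop/x"]
print(orphan_crawl_share(hits, ["/old-page"]))
```

What counts as "too much" depends on your site; the ratio is only a starting point for comparison over time.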

Finally, you can also compare your active and inactive orphan pages and see which ones don’t generate any SEO traffic.
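
The split between active and inactive orphan pages can be sketched in a few lines of Python, taking "active" to mean at least one SEO visit. The visit counts are assumed to come from your own analytics or log data, and the names below are hypothetical.

```python
def split_orphans(orphan_urls, seo_visits):
    """Split orphan URLs into (active, inactive) based on SEO visit counts."""
    # URLs missing from seo_visits are treated as having zero visits.
    active = [u for u in orphan_urls if seo_visits.get(u, 0) >= 1]
    inactive = [u for u in orphan_urls if seo_visits.get(u, 0) < 1]
    return active, inactive


visits = {"/old-guide": 12, "/expired-offer": 0}
active, inactive = split_orphans(["/old-guide", "/expired-offer", "/typo-url"], visits)
print(active)
print(inactive)
```

Active orphans are candidates for re-linking, while inactive ones may be better redirected or removed.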

It has never been this easy to spot your orphan pages. If you want to give it a try, don’t hesitate to request a demo! We will be more than happy to show you all the SEO opportunities you are surely missing.

Emma was the Head of Communication & Marketing at Oncrawl for over seven years. She contributed articles about SEO and search engine updates.