How To Fix Inner Pages that Aren’t Being Indexed

How to Make Google Crawl Inner Pages

When Google crawls the web, it discovers pages through links and adds them to its index. During indexing, each page is evaluated on factors such as its link profile, its content, and its value to searchers, and that evaluation is one of the steps that determines the page’s PageRank. But not every page on the web gets indexed, and that can create a problem for your business.

How Does Google Index Pages?

Google uses a program called Googlebot to examine pages and determine what value they offer and what order they should appear in for each search. Googlebot moves from one page to the next by following links, and it also uses the Sitemap data each site provides. New links, dead links, and updated links are reflected in Google’s index each time Googlebot re-crawls a page and records its current content.

Every result in a Google search is a page that has been indexed.

What Happens to Pages that Aren’t Indexed?

A page that isn’t indexed doesn’t show up in search results. It still exists on the web, but it loses one of the best ways to get organic visitors.

Why Are Some Pages Not Indexed?

There are a number of reasons a page won’t be indexed. It could be a technical issue, a quality issue, or a problem with the link structure of your site. The only way to find out is to check each possibility one by one and fix the problems as they appear.

Which Pages Are Most Likely to be Indexed?

Pages near the top of your domain, like the homepage and pages linked from the homepage, are most likely to be indexed quickly. One reason for this is that Google’s crawlers use links as gateways to each page, and a crawler typically follows only a limited number of links from each page it visits, so pages buried deeper in your site structure are reached less often.

How Can I Tell If My Pages Are Indexed?

Type a specific search query into the Google search bar to see whether your site is indexed. First, use “site:domain.com”. If you were checking for indexed pages on Facebook, for example, you’d type “site:facebook.com” and run the search. It will show some—but not necessarily all—pages indexed from your site. You can also use “site:domain inurl:<slug>” to show whether a specific page is indexed, or “site:domain filetype:<filetype>” to see whether a specific file type has been indexed.

If you use Google Search Console, check the “Index Status” report to see how many pages are indexed and which URLs are blocked or removed, and check its graph to see how your site has fared over time. Google Search Console also has a “Sitemaps” report that shows how many pages from your XML sitemap were submitted and how many were indexed.

How to Make Google Index My Pages

There are a number of reasons why your pages may not be getting indexed. Check each possibility against your site. Some changes are quick and painless, while others take a little more effort. Either way, the work is worth it so those pages can start earning organic visits from Google searches.

Length of Time

Sometimes the issue is simply that your website hasn’t been up long enough. New pages aren’t indexed the second they appear on the web, so expect it to take a little while for Google to pick them up. The more link building you do, the sooner this is likely to happen, which is one reason a good link profile is so important early on.

Check Your Response Codes

If your site is producing anything other than a 200 (OK) server response code, you need to make adjustments. Errors, redirections, and dead links won’t be indexed by Google. If you’re not sure, use an HTTP status checker to see what codes your pages are returning, and resolve any problems you find.

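If you’d rather script the check, here is a minimal sketch using Python’s requests library; the URL list is hypothetical and you would swap in your own pages:

```python
import requests

# Hypothetical list of pages to check; replace with your own URLs.
PAGES = [
    "https://example.com/",
    "https://example.com/blog/some-post/",
    "https://example.com/old-page/",
]

for url in PAGES:
    try:
        # allow_redirects=False so 301/302 responses are reported as-is
        resp = requests.get(url, allow_redirects=False, timeout=10)
        print(f"{resp.status_code}  {url}")
    except requests.RequestException as exc:
        print(f"ERROR  {url}  ({exc})")
```

Anything that isn’t a 200 is worth a closer look.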

Duplicate Content

Duplicate content is bad for your SEO. Google may choose not to index pages with duplicate content, and even if such a page is indexed, the duplication can drag down its PageRank. There are tools you can use to see whether you have too much duplicate content. If your pages are too similar and repetitive, adjust them and you may see an increase in the number of pages that get indexed, along with a broader boost to your SEO.

Internal duplicate content isn’t the only issue, however. If you have content that’s too similar to that of other sites, your indexing may suffer. Check sections of your text to see how many other sites have the exact same phrasing. One issue may be that you’re pulling too many quotes from other sites. Another may be that your wording is just too similar. When these problems occur, work to make your content stand out and vary more from content on the same topics that other sites publish.
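Dedicated duplicate-content tools exist, but as a rough illustration of the idea, a sketch like the one below pulls the text of two pages and reports how similar they are. The URLs and the 80% threshold are placeholders, and the tag stripping is deliberately crude:

```python
import re
from difflib import SequenceMatcher

import requests

def page_text(url: str) -> str:
    """Fetch a page and reduce it to a rough text-only form."""
    html = requests.get(url, timeout=10).text
    return re.sub(r"<[^>]+>", " ", html)  # crude tag stripping; a real tool would parse the HTML

# Placeholder URLs: compare two of your own pages, or your page against another site's.
a = page_text("https://example.com/page-one/")
b = page_text("https://example.com/page-two/")

ratio = SequenceMatcher(None, a, b).ratio()
print(f"Similarity: {ratio:.0%}")
if ratio > 0.8:  # arbitrary threshold; tune it for your own content
    print("These pages look very similar -- consider rewriting one of them.")
```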

Page Quality

Sometimes a page is of such poor quality that Google simply won’t index it. Consider this: Google is building a list of search results that offer value to its users. If the algorithm it uses to assess page quality decides a page doesn’t offer enough value, it may choose not to show that page in search results at all. If you suspect quality is your issue, improve your content, link up with reputable sites, and consider getting expert advice on improving your content and SEO.

Another page quality issue to consider is how long your site takes to load. Sites that load too slowly cause visitors to navigate away, which can negatively affect your PageRank and may eventually lead to pages being dropped from Google entirely. Google’s PageSpeed Insights tool can help you determine whether your site has loading issues and offers guidance on how to resolve them.
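PageSpeed Insights also has an HTTP API, so you can check pages in bulk. The sketch below assumes the v5 endpoint and the response layout it returned at the time of writing; the URL is a placeholder, and an API key is only needed for heavier usage:

```python
import requests

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def performance_score(url: str) -> float:
    """Return the Lighthouse performance score (0-1) for a page."""
    resp = requests.get(PSI_ENDPOINT, params={"url": url, "strategy": "mobile"}, timeout=60)
    resp.raise_for_status()
    data = resp.json()
    return data["lighthouseResult"]["categories"]["performance"]["score"]

print(performance_score("https://example.com/"))  # placeholder URL
```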

Internal Linking Issues

Sometimes you create pages on your site, but don’t link to them from other pages. These orphaned pages can negatively impact your rankings and may not be indexed. Crawlers follow links to find new pages. It’s impossible for them to find pages with no links pointing to them. If you aren’t sure whether your pages are linked correctly, use a site crawler to figure out whether your internal linking is done well.
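A quick way to spot orphaned pages is to compare the URLs listed in your XML sitemap with the URLs a simple crawl of your internal links can actually reach. The sketch below does that with only the Python standard library; the domain, sitemap location, and crawl limit are all assumptions to adjust for your site, and it ignores sitemap index files for brevity:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

SITE = "https://example.com"        # placeholder domain
SITEMAP = SITE + "/sitemap.xml"     # assumed sitemap location
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = set()
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(urljoin(SITE, value).split("#")[0])

def sitemap_urls(url):
    root = ET.fromstring(urlopen(url).read())
    return {el.text.strip() for el in root.iter(NS + "loc")}

def crawl(start, limit=200):
    """Follow internal links breadth-first and return every URL discovered."""
    seen, queue, found = set(), [start], {start}
    while queue and len(seen) < limit:
        page = queue.pop(0)
        if page in seen or urlparse(page).netloc != urlparse(SITE).netloc:
            continue
        seen.add(page)
        try:
            parser = LinkCollector()
            parser.feed(urlopen(page).read().decode("utf-8", "ignore"))
        except Exception:
            continue
        found |= parser.links
        queue.extend(parser.links - seen)
    return found

orphans = sitemap_urls(SITEMAP) - crawl(SITE)
print("Pages in the sitemap that no internal link points to:")
for url in sorted(orphans):
    print(" ", url)
```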

Updating your site so that Google indexes more pages will have a positive impact on how you are represented on the search engine results page. As you produce new content, keep these suggestions in mind. That way you can produce content that is SEO-friendly, useful to readers, and likely to be indexed as soon as possible, so you can start getting organic visits from search results.

What is Your Crawl Budget and Why You Need To Know This
If you are running a large website with many pages, it is essential that you know exactly what a crawl budget is, how it affects your site, what your budget is, and what to do if it isn’t enough.

What exactly is a crawl budget?

Your crawl budget is how many pages on your website the Google bots, or spiders, crawl within a given period of time. If you only have a small or medium-sized site, this is most likely not going to be a problem. If you have a large site with hundreds, thousands, or even tens of thousands of pages (such as an e-commerce site or a news/media site), you need to know that Google is crawling as many of those pages as possible, and that when pages change they are re-crawled soon after.

The main factor that affects the crawl budget is PageRank, so for large and established sites it’s often not a problem. However, if your site is relatively new and you are adding many new pages to it, the lack of PageRank could be a problem.

Matt Cutts summarized crawl budget perfectly in this interview published at Stone Temple some years back:

“The first thing is that there isn’t really such thing as an indexation cap. A lot of people were thinking that a domain would only get a certain number of pages indexed, and that’s not really the way that it works. There is also not a hard limit on our crawl. The best way to think about it is that the number of pages that we crawl is roughly proportional to your PageRank. So if you have a lot of incoming links on your root page, we’ll definitely crawl that. Then your root page may link to other pages, and those will get PageRank and we’ll crawl those as well. As you get deeper and deeper in your site, however, PageRank tends to decline.”

That interview was way back in 2010, and there have been many changes to how Google crawls sites since then, such as the Caffeine update in June of that year; Google can now crawl more pages, and much faster. But what Matt said back then about Google focusing on pages with more authority still holds true; those pages are simply crawled with greater frequency now.

How does Google crawl your pages?

First, the Google spider looks at your robots.txt file to see what it should and shouldn’t crawl and index. The budget part is how many of those URLs Google decides to crawl per day, which is determined by the health of your site and the number of links pointing to it.

How to check the health of your crawl budget

First, check the total number of pages your site has in its XML sitemap; usually this will be at the root of your site, e.g. yourdomain.com/sitemap.xml. Quick tip: if you don’t have a sitemap set up and you are running a WordPress site, we strongly recommend the Yoast SEO plugin, which will do all of this for you with just a few clicks 🙂


Within your sitemap XML file there will be other sitemaps for different parts of your site, e.g. a sitemap for blog posts, one for authors or users, and so on. Go into each of these and get the total number of pages it lists.
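If you would rather not count by hand, here is a minimal sketch using Python’s standard library that walks a sitemap index and totals the URLs in each child sitemap; the sitemap location is a placeholder:

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
SITEMAP = "https://yourdomain.com/sitemap.xml"  # placeholder location

def locs(url):
    """Return every <loc> value in a sitemap, plus the root tag name."""
    root = ET.fromstring(urlopen(url).read())
    return [el.text.strip() for el in root.iter(NS + "loc")], root.tag

entries, tag = locs(SITEMAP)
if tag == NS + "sitemapindex":
    # A sitemap index: each <loc> is a child sitemap, so count the URLs in each one.
    total = sum(len(locs(child)[0]) for child in entries)
else:
    total = len(entries)

print(f"Total pages listed across your sitemaps: {total}")
```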

Once you have your total number of pages, go into your Google Search Console (formerly Webmaster Tools) account and then to Crawl > Crawl Stats in the left-hand menu to see the pages crawled per day, like the image below.

[Screenshot: Crawl Stats report showing pages crawled per day]

Then, to find out your crawl budget, simply divide the total number of pages your site has by the average number of pages crawled per day.

If your final number is less than 10, you are fine. If it’s more, you have a problem: Google has not allocated you a large enough crawl budget, so not all of your pages are being crawled, and this needs to be fixed.
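As a quick worked example (the figures are made up):

```python
total_pages = 12_000      # URLs counted across your sitemaps (hypothetical)
crawled_per_day = 800     # average from the Crawl Stats report (hypothetical)

ratio = total_pages / crawled_per_day
print(f"Crawl budget ratio: {ratio:.1f}")  # 15.0 -> above 10, so this site has a problem
```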

My crawl budget is bad, now what?

First off, you need to find out whether Google is running into any crawl errors on your site. Your server logs are a good place to start: look for any 404s and either redirect those URLs where possible or fix the pages. 301s and 302s are OK as long as they redirect to the correct places.
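As a rough sketch, assuming a standard combined-format access log (the log path and the Googlebot filter are assumptions to adjust for your setup), something like this will surface the URLs Google’s crawler is hitting that return a 404:

```python
import re
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"   # hypothetical path; adjust to your server

# Combined log format: ... "GET /some/page HTTP/1.1" 404 ...
LINE_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

not_found = Counter()
with open(LOG_PATH, encoding="utf-8", errors="ignore") as log:
    for line in log:
        if "Googlebot" not in line:      # only count hits from Google's crawler
            continue
        m = LINE_RE.search(line)
        if m and m.group("status") == "404":
            not_found[m.group("path")] += 1

for path, hits in not_found.most_common(20):
    print(f"{hits:5d}  {path}")
```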

Once you have cleaned up crawl errors, your next step should be to look at how Google is crawling your site.

How to sculpt where Google bots go

Remember, Google will only crawl a finite number of pages on your site; however, Google’s bots will parse anything put in front of them, so you need to make sure they aren’t spending that budget on pages that aren’t important to your site. There are three main controls for this:

Robots.txt file – use this at the top level to disallow whole sections of your site from being crawled by bots

Noindex meta tag – use this at a finer level, on individual pages, so that those pages will not be indexed

Nofollow tags – use these at an even more granular level, on individual links; note that if you don’t add the tag to every link pointing to a page, Google will still be able to find that page (see the sketch below for a quick way to audit a page against these controls)
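To audit how these controls apply to a specific URL, a small sketch like this one checks the robots.txt rules and looks for a robots meta tag using only the Python standard library; the URL is a placeholder and the meta-tag check is deliberately simplistic:

```python
import re
from urllib import robotparser
from urllib.parse import urlparse
from urllib.request import urlopen

URL = "https://example.com/some-page/"  # placeholder page to audit

# 1. Is the URL blocked by robots.txt?
parts = urlparse(URL)
rp = robotparser.RobotFileParser()
rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
rp.read()
print("Allowed by robots.txt:", rp.can_fetch("Googlebot", URL))

# 2. Does the page carry a noindex robots meta tag?
html = urlopen(URL).read().decode("utf-8", "ignore")
meta = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*>', html, re.I)
print("Has noindex meta tag:", bool(meta and "noindex" in meta.group(0).lower()))
```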

Knowing how and when Google’s bots are crawling your site is crucial for mid-range to large sites, especially ones that don’t yet have much authority and are competing against more established sites. Webmasters have to ensure that these bots are seeing and crawling the most important parts of their site.

For more in-depth information on how Google crawls websites, see this Google hangout with John Mueller and Andrey Lipattsev.

 
