If you have added a page to your site or made changes to an existing page, you can ask Google to re-index it using the methods described below; you cannot request indexing for a URL that you do not manage.

General instructions

  • Crawling can take anywhere from a few days to a few weeks; be patient until it happens.
  • All of the methods described have roughly the same response time.
  • There is a quota for submitting URLs.

Requesting a crawl multiple times for the same URL or sitemap does not make it crawl any faster.

Crawl request methods

To request indexing for an individual URL:

Inspect the URL using the URL Inspection tool.

Select Request indexing.
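If you prefer to check a page's index status programmatically, the Search Console API exposes a URL Inspection method. The sketch below uses the Python client library and assumes a service account key file named service-account.json with read access to the property; the URLs are placeholders. Note that the API only returns the inspection result; requesting indexing itself is still done in the Search Console interface.

    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    # Assumed: a service account key with read access to the Search Console property.
    credentials = service_account.Credentials.from_service_account_file(
        "service-account.json",
        scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
    )
    service = build("searchconsole", "v1", credentials=credentials)

    # Inspect one URL of the verified property (placeholder values).
    response = service.urlInspection().index().inspect(
        body={
            "inspectionUrl": "https://www.example.com/some-page",
            "siteUrl": "https://www.example.com/",
        }
    ).execute()

    # The coverage state reports whether the page is currently indexed.
    print(response["inspectionResult"]["indexStatusResult"]["coverageState"])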

Submit a sitemap (many URLs at once)

A sitemap is an important way for Google to discover the URLs on your site; it can also include additional metadata about alternate language versions, videos, images, or news pages.
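To illustrate what a sitemap contains, the sketch below builds a minimal XML sitemap with Python's standard library; the URLs and dates are placeholders, and the language, video, image, or news metadata mentioned above would be added as extra child elements under each url entry.

    import xml.etree.ElementTree as ET

    # Placeholder pages and their last-modified dates.
    PAGES = [
        ("https://www.example.com/", "2024-01-15"),
        ("https://www.example.com/products", "2024-01-10"),
    ]

    # The standard namespace defined by sitemaps.org.
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for loc, lastmod in PAGES:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod

    # Write the file so it can be referenced from robots.txt or submitted in Search Console.
    ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)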

Reduce the Googlebot crawl rate

Google has sophisticated algorithms to determine the optimal crawl rate for a site; the goal is to crawl as many pages from your site as possible on each visit without overwhelming your server. However, in some cases Google's crawling of your site can place a critical load on your infrastructure. If that happens, you may need to reduce the number of requests made by Googlebot.

If you choose to reduce Googlebot crawling, here are some options:

  • Reduce the crawl rate with Search Console.
  • Let Google reduce the crawl rate automatically.

Reduce the crawl rate with Search Console (recommended)

You can change the crawl rate in Search Console; the change takes effect within a few days. To use this setting, verify ownership of your site, and avoid setting the crawl rate to a value that is too low for your site's needs. You can request the crawl rate that you want Googlebot to use.

Let Google automatically reduce the crawl rate

If you need to urgently reduce the crawl rate for a short period of time (for example, a couple of days), return 500, 503, or 429 HTTP response status codes instead of content for crawl requests. Googlebot reduces your site's crawl rate when it encounters a significant number of URLs with 500, 503, or 429 HTTP response status codes.
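As a rough sketch of that emergency approach, the handler below answers every request with a 503 and a Retry-After header instead of the page content. The port and message are arbitrary, and in practice you would usually enable this at the web server or CDN layer rather than in application code; avoid serving errors for more than a couple of days, since persistently failing URLs are eventually dropped from the index.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class TemporarilyUnavailableHandler(BaseHTTPRequestHandler):
        """Answer every request with 503 so crawlers back off temporarily."""

        def do_GET(self):
            self.send_response(503)                   # service unavailable
            self.send_header("Retry-After", "86400")  # suggest retrying in one day
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Site temporarily unavailable, please retry later.\n")

    if __name__ == "__main__":
        HTTPServer(("", 8080), TemporarilyUnavailableHandler).serve_forever()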

Verifying Googlebot and other Google crawlers

You can verify whether a web crawler accessing your server is really a Google crawler such as Googlebot. This is useful if you are concerned that spammers or other troublemakers are accessing your site while claiming to be Googlebot. There are two ways to verify it:

  • Manually
  • Automatically

Use command-line tools

Run a reverse DNS lookup on the accessing IP address from your logs, using the host command.

Verify that the domain name returned is in googlebot.com or google.com.

Run a forward DNS lookup on the domain name obtained in the first step, again using the host command.

Verify that it resolves to the same IP address as the one in your logs.
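The same reverse-then-forward DNS check can be scripted instead of run by hand with the host command. The sketch below uses Python's standard socket module; the sample IP address is only an illustration, and the accepted domains (googlebot.com and google.com) reflect the hostnames Google documents for its crawlers.

    import socket

    def is_verified_google_crawler(ip: str) -> bool:
        """Apply the reverse-then-forward DNS check described in the steps above."""
        try:
            # Step 1: reverse DNS lookup on the IP address taken from your access logs.
            host, _, _ = socket.gethostbyaddr(ip)
            # Step 2: the returned name must belong to googlebot.com or google.com.
            if not host.endswith((".googlebot.com", ".google.com")):
                return False
            # Step 3: forward DNS lookup on that name.
            forward_ips = {info[4][0] for info in socket.getaddrinfo(host, None)}
            # Step 4: it must resolve back to the original IP address.
            return ip in forward_ips
        except (socket.herror, socket.gaierror):
            # No reverse entry or failed forward lookup: not a verified Google crawler.
            return False

    print(is_verified_google_crawler("66.249.66.1"))  # illustrative IP only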

Use automatic solutions

You can identify Googlebot by matching the crawler's IP address against the published list of Googlebot IP addresses. For all other Google crawlers, match the crawler's IP address against the complete list of Google IP addresses.
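A minimal sketch of this automated check is shown below. It assumes the JSON file of Googlebot IP ranges that Google publishes alongside its crawler documentation (treat the exact URL as an assumption that may change) and tests whether a logged crawler IP falls inside one of those ranges.

    import ipaddress
    import json
    import urllib.request

    # Published Googlebot ranges; Google also publishes separate files for its other crawlers.
    GOOGLEBOT_RANGES_URL = (
        "https://developers.google.com/static/search/apis/ipranges/googlebot.json"
    )

    def load_googlebot_networks():
        """Download and parse Googlebot's published IPv4/IPv6 ranges."""
        with urllib.request.urlopen(GOOGLEBOT_RANGES_URL) as response:
            data = json.load(response)
        return [
            ipaddress.ip_network(prefix.get("ipv4Prefix") or prefix.get("ipv6Prefix"))
            for prefix in data["prefixes"]
        ]

    def ip_is_googlebot(ip: str, networks) -> bool:
        """Check whether a crawler IP falls inside any published Googlebot range."""
        address = ipaddress.ip_address(ip)
        return any(address in network for network in networks)

    networks = load_googlebot_networks()
    print(ip_is_googlebot("66.249.66.1", networks))  # illustrative IP only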

Large site owner’s guide to managing your crawl budget

This guide describes how to optimize Google’s crawling of large and frequently updated sites. If your site does not have a large number of pages that change rapidly, or if your pages seem to be crawled the same day they are published, you do not need to read this guide. If you have content that has been available for a while but has never been indexed, that is a different problem; use the URL Inspection tool to find out why your page is not being indexed.

This guide is for:

Large sites with content that changes moderately often (about once a week).

Medium or larger sites with very rapidly changing content (daily).

General theory of crawling

The web is a nearly infinite space, exceeding Google’s ability to explore and index every available URL. As a result, there are limits to how much time Googlebot can spend crawling any single site. The amount of time and resources that Google devotes to crawling a site is commonly called the site’s crawl budget. Note that not everything crawled on your site will necessarily be indexed.

Crawl capacity limit

The crawl capacity limit can go up or down depending on several factors:

  • Crawl health
  • The limit set by the site owner in Search Console
  • Google’s crawling limits

Crawl demand

The factors that play a significant role in determining crawl demand are:

  • Popularity
  • Staleness
  • Perceived inventory

Best practices

  • Manage your URL inventory.
  • Consolidate duplicate content.
  • Block crawling of URLs that you don’t want to be indexed.
  • Return 404/410 for permanently removed pages (see the sketch after this list).
  • Eliminate soft 404 errors.
  • Keep your sitemaps up to date.
  • Avoid long redirect chains.
  • Make your pages efficient to load.
  • Monitor your site’s crawling.
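As a sketch of the 404/410 and soft-404 items above, the handler below returns a real 410 for paths that were permanently removed and a real 404 for anything unknown, instead of a 200 page that merely says "not found" (which Search Console would report as a soft 404). The paths and port are placeholders.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical paths that were permanently removed from the site.
    REMOVED_PATHS = {"/old-campaign", "/discontinued-product"}

    class CrawlFriendlyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path in REMOVED_PATHS:
                # 410 tells crawlers the page is gone for good.
                self.send_error(410, "Gone")
            elif self.path == "/":
                self.send_response(200)
                self.send_header("Content-Type", "text/plain")
                self.end_headers()
                self.wfile.write(b"Home page\n")
            else:
                # A genuine 404 status, not a 200 "not found" page.
                self.send_error(404, "Not Found")

    if __name__ == "__main__":
        HTTPServer(("", 8080), CrawlFriendlyHandler).serve_forever()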

Monitor your site’s crawling and indexing

Follow these steps to monitor your site’s crawl profile:

  1.     See if Googlebot is encountering availability issues on your site.
  2.     See whether you have pages that aren’t crawled but should be.
  3.     See whether any parts of your site need to be crawled more quickly than they already are.
  4.     Improve your site’s crawl efficiency.
  5.     Handle overcrawling of your site.

How HTTP status codes, network, and DNS errors influence Google Search

HTTP status codes are generated by the server hosting the site when it responds to a request from a client, such as a browser or a crawler. Each HTTP status code has a different meaning, but the outcome of the request is often the same. Search Console generates error messages for status codes in the 4xx and 5xx ranges and for failed redirections (3xx); if the server responds with a 2xx status code, the content received may be considered for indexing.

HTTP status codes

2xx (success): Google considers the content for indexing. If the content suggests an error, Search Console will show a soft 404 error.

200 (success): Googlebot passes the content on to the indexing pipeline. The indexing systems may index the content, but that is not guaranteed.

201 (created), 202 (accepted): Googlebot waits for the content for a limited time, then passes whatever it received on to the indexing pipeline.

204 (no content): Googlebot signals to the indexing pipeline that it received no content; Search Console’s index coverage report may show a soft 404 error.

3xx (redirection): Googlebot follows up to about 10 redirect hops; if the crawler does not receive content within that many hops, Search Console shows a redirect error in the index coverage report.

301 (moved permanently): Googlebot follows the redirect, and the indexing pipeline treats it as a strong signal that the redirect target should be canonical.

302 (found), 303 (see other): Googlebot follows the redirect, and the indexing pipeline treats it as a weak signal that the redirect target should be canonical.

304 (not modified): Googlebot signals to the indexing pipeline that the content is the same as the last time it was crawled; the pipeline may recalculate signals for the URL, but otherwise the status code has no effect on indexing.

307 (temporary redirect): equivalent to 302. 308 (moved permanently): equivalent to 301. Although Google Search handles each pair the same way, keep in mind that they are semantically different.
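As a minimal sketch of the difference in practice, the handler below issues a 301 for a permanently moved path (a strong hint that the target should become canonical) and a 302 for a temporary move (a weak hint, so the original URL tends to remain canonical). The paths and port are placeholders.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class RedirectHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/old-page":
                # Permanent move: strong signal that /new-page should be canonical.
                self.send_response(301)
                self.send_header("Location", "/new-page")
                self.end_headers()
            elif self.path == "/sale":
                # Temporary move: weak canonical signal, the original URL is kept.
                self.send_response(302)
                self.send_header("Location", "/sale-landing")
                self.end_headers()
            else:
                self.send_response(200)
                self.send_header("Content-Type", "text/plain")
                self.end_headers()
                self.wfile.write(b"OK\n")

    if __name__ == "__main__":
        HTTPServer(("", 8080), RedirectHandler).serve_forever()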

4xx (client errors)

All 4xx errors except 429 are treated the same way: Googlebot signals to the indexing pipeline that the content does not exist.

If the URL was previously indexed, the indexing pipeline removes it from the index. Newly encountered 404 pages are not processed, and the crawl frequency of such URLs gradually decreases.

Don’t use the 401 and 403 status codes to limit the crawl rate; the 4xx status codes, except 429, have no effect on the crawl rate. If you need to slow Googlebot down, use the methods described earlier for reducing the crawl rate.

The individual codes in this range include 401, 403, 404, 410, 411, and 429.

Googlebot treats the 429 status code as a signal that the server is overloaded, so it is handled as a server error.

5xx (server errors): 5xx and 429 errors prompt Google’s crawlers to temporarily slow down crawling. Already indexed URLs are preserved in the index but are eventually dropped. If the robots.txt file returns a server error status code for around a month, Google uses the last cached copy of robots.txt; if no cached copy is available, Google assumes there are no crawl restrictions.

500 (internal server error): Googlebot reduces the crawl rate for the site. The decrease in crawl rate is proportional to the number of individual URLs returning a server error. Google’s indexing pipeline removes URLs that persistently return a server error from the index.

502 (bad gateway) and 503 (service unavailable) are handled in the same way as 500.

Network and DNS errors

Network and DNS errors have quick, negative effects on a URL’s presence in Google Search. Googlebot treats network timeouts, connection resets, and DNS errors similarly to 5xx server errors. When network errors occur, crawling immediately starts slowing down, because a network error is a sign that the server may not be able to handle the load.

Debugging network errors

These errors occur before Google starts crawling a URL or while it is crawling it. Because they can occur before the server responds, there is no status code to hint at the problem, which makes these issues harder to diagnose. To debug timeout and connection-reset errors:

  • Look at your network traffic.
  • If you find anything unusual, contact the company that manages the affected network or hosting infrastructure.
  • Look at your firewall settings and logs.

Debugging DNS errors

DNS errors are most commonly caused by misconfiguration, but they can also be caused by a firewall rule that blocks Googlebot’s DNS queries. To debug DNS errors, do the following (see the sketch after this list):

  • Inspect your firewall rules.
  • Look at your DNS records.
  • Check that all your name servers return the same IP addresses for your site.
  • If you have made changes to your DNS configuration, allow up to 72 hours for them to propagate.
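The sketch below is a small diagnostic in the same spirit: it checks whether the hostname resolves at all and whether a TCP connection succeeds, which helps separate DNS problems from the timeouts and connection resets discussed above. The hostname and port are placeholders.

    import socket

    def check_dns_and_connectivity(hostname: str, port: int = 443) -> None:
        """Rough health check: does the name resolve, and does the server accept connections?"""
        try:
            # DNS resolution: a failure here points at DNS configuration or a blocked resolver.
            infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
            addresses = sorted({info[4][0] for info in infos})
            print(f"{hostname} resolves to: {', '.join(addresses)}")
        except socket.gaierror as exc:
            print(f"DNS lookup failed for {hostname}: {exc}")
            return
        try:
            # TCP connection: a timeout or reset here is what Googlebot treats like a 5xx error.
            with socket.create_connection((hostname, port), timeout=5):
                print(f"Connected to {hostname}:{port} successfully")
        except OSError as exc:
            print(f"Could not connect to {hostname}:{port}: {exc}")

    check_dns_and_connectivity("example.com")  # replace with your own hostname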

If you are running your own DNS server, make sure it is healthy and not overloaded.

About the Author