{"id":892,"date":"2022-03-29T08:58:04","date_gmt":"2022-03-29T08:58:04","guid":{"rendered":"https:\/\/easyschema.com\/blog\/?p=892"},"modified":"2022-03-29T08:58:04","modified_gmt":"2022-03-29T08:58:04","slug":"ask-google-to-recrawl-your-urls","status":"publish","type":"post","link":"https:\/\/easyschema.com\/blog\/ask-google-to-recrawl-your-urls\/","title":{"rendered":"Ask Google to recrawl your URLs"},"content":{"rendered":"<p>If you have recently added or changed a page on your site, you can ask Google to re-index it using any of the methods below. You cannot request indexing for URLs that you do not manage.<\/p>\n<p><span style=\"font-weight: 400\">General guidelines<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Crawling can take anywhere from a few days to a few weeks, so be patient.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">All the methods described here have about the same response time.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">There is a quota for submitting individual URLs.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Requesting a recrawl multiple times for the same URL or sitemap will not get it crawled any faster.<\/span><\/p>\n<p><strong>Crawl request methods<\/strong><\/p>\n<p><span style=\"font-weight: 400\">To submit a URL to the index:<\/span><\/p>\n<p><span style=\"font-weight: 400\">Data report<\/span><\/p>\n<p><span style=\"font-weight: 400\">Inspect the URL using the URL Inspection tool.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Select Request indexing.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Submit a sitemap (many URLs at once)<\/span><\/p>\n<p><span style=\"font-weight: 400\">A sitemap is an important way for Google to discover URLs on your site. It can also include additional metadata about alternate language versions, videos, images, or news pages.<\/span><\/p>\n<p><span style=\"font-weight: 
400\">Reduce the Googlebot crawl rate<\/span><\/p>\n<p><span style=\"font-weight: 400\">Google has sophisticated algorithms to determine the optimal crawl rate for a site; the goal is to crawl as many pages as possible on each visit without overwhelming your server. However, in some cases, Google&#8217;s crawling can put a critical load on your infrastructure. When that happens, you may want to reduce the number of requests made by Googlebot.<\/span><\/p>\n<p><span style=\"font-weight: 400\">If you decide to reduce the Googlebot crawl rate, here are some options:<\/span><\/p>\n<p><span style=\"font-weight: 400\">Reduce the crawl rate with Search Console<\/span><\/p>\n<p><span style=\"font-weight: 400\">Let Google reduce the crawl rate automatically<\/span><\/p>\n<p><span style=\"font-weight: 400\">Reduce the crawl rate with Search Console (recommended)<\/span><\/p>\n<p><span style=\"font-weight: 400\">You can change the Googlebot crawl rate in Search Console; changes are generally reflected within a few days. To use this setting, first verify ownership of your site, and avoid setting the crawl rate to a value that is too low for your site&#8217;s needs. You can also file a request to reduce the crawl rate.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Let Google reduce the crawl rate automatically<\/span><\/p>\n<p><span style=\"font-weight: 400\">If you urgently need to reduce the crawl rate for a short period (for example, a day or two), return an informational error page with a 500, 503, or 429 HTTP status code instead of your normal content. Googlebot reduces your site&#8217;s crawl rate when it encounters a significant number of URLs returning 500, 503, or 429 HTTP status codes.<\/span><\/p>\n<h1><span style=\"font-weight: 400\">Verifying Googlebot and other Google crawlers<\/span><\/h1>\n<p><span style=\"font-weight: 400\">You can verify whether a web crawler accessing your server really is a Google crawler such as Googlebot. This is useful if you are concerned that spammers or other troublemakers are accessing your site while claiming to be Googlebot. 
There are two ways to verify it:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manually<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Automatically<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Use command-line tools<\/span><\/p>\n<p><span style=\"font-weight: 400\">Run a reverse DNS lookup on the accessing IP address from your logs, using the host command.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Verify that the domain name is within googlebot.com (or google.com).<\/span><\/p>\n<p><span style=\"font-weight: 400\">Run a forward DNS lookup on the domain name retrieved in the first step, again using the host command.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Verify that it resolves to the same IP address as the original accessing IP address from your logs.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Use automatic solutions<\/span><\/p>\n<p><span style=\"font-weight: 400\">Alternatively, you can identify Googlebot by matching the crawler&#8217;s IP address against the published list of Googlebot IP addresses. For all other Google crawlers, match the crawler&#8217;s IP address against the complete list of Google IP addresses.<\/span><\/p>\n<h1><span style=\"font-weight: 400\">Large site owner&#8217;s guide to managing your crawl budget<\/span><\/h1>\n<p><span style=\"font-weight: 400\">This guide describes how to optimize Google&#8217;s crawling of very large and frequently updated sites. If your site does not have a large number of pages that change quickly, or if your pages seem to be crawled the same day they are published, you do not need to read this guide. If you have content that has been available for some time but has never been indexed, that is a different problem. 
Use the URL Inspection tool to find out why your page is not indexed.<\/span><\/p>\n<p><strong>This guide is for:<\/strong><\/p>\n<p><span style=\"font-weight: 400\">Large sites with content that changes moderately often (about once a week).<\/span><\/p>\n<p><span style=\"font-weight: 400\">Medium or larger sites with content that changes very rapidly (daily).<\/span><\/p>\n<p><span style=\"font-weight: 400\">General theory of crawling<\/span><\/p>\n<p><span style=\"font-weight: 400\">The web is a nearly infinite space, exceeding Google&#8217;s ability to explore and index every available URL. As a result, there are limits to how much time Googlebot can spend crawling any single site. The total time and resources that Google devotes to crawling a site is commonly called the site&#8217;s crawl budget. Note that not everything that is crawled will necessarily be indexed.<\/span><\/p>\n<h3><strong>Crawl capacity limit<\/strong><\/h3>\n<p><strong>The crawl capacity limit can go up or down depending on several factors:<\/strong><\/p>\n<ul>\n<li><span style=\"font-weight: 400\">Crawl health<\/span><\/li>\n<li><span style=\"font-weight: 400\">Limit set by the site owner in Search Console<\/span><\/li>\n<li><span style=\"font-weight: 400\">Google&#8217;s crawling limits<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Crawl demand<\/span><\/p>\n<p><strong>The factors that play a significant role in determining crawl demand are:<\/strong><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Popularity<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Staleness<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Perceived inventory<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400\">Best practices<\/span><\/h2>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Manage your URL inventory:<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Consolidate duplicate content.<\/span><\/li>\n<li style=\"font-weight: 400\"><span 
style=\"font-weight: 400\">Block crawling of URLs that you don&#8217;t want to be indexed.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Return 404\/410 for permanently removed pages.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Eliminate soft 404s.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Keep your sitemaps up to date<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Avoid long redirect chains<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Make your pages efficient to load.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Monitor your site crawling.<\/span><\/li>\n<\/ul>\n<h2><span style=\"font-weight: 400\">Monitor your site&#8217;s crawling and indexing<\/span><\/h2>\n<blockquote><p>Read some articles from Google<\/p><\/blockquote>\n<ol>\n<li><span style=\"font-weight: 400\"> \u00a0 \u00a0 <\/span><a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/crawling\/large-site-managing-crawl-budget#availability_issues\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">See if Googlebot is encountering availability issues on your site<\/span><\/a><span style=\"font-weight: 400\">.<\/span><\/li>\n<li><span style=\"font-weight: 400\"> \u00a0 \u00a0 <\/span><a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/crawling\/large-site-managing-crawl-budget#not_crawled_should_be\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">See whether you have pages that aren&#8217;t crawled but should be<\/span><\/a><span style=\"font-weight: 400\">.<\/span><\/li>\n<li><span style=\"font-weight: 400\"> \u00a0 \u00a0 <\/span><a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/crawling\/large-site-managing-crawl-budget#updates\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">See 
whether any parts of your site need to be crawled more quickly than they already are.<\/span><\/a><\/li>\n<li><span style=\"font-weight: 400\"> \u00a0 \u00a0 <\/span><a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/crawling\/large-site-managing-crawl-budget#improve_crawl_efficiency\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">Improve your site&#8217;s crawl efficiency.<\/span><\/a><\/li>\n<li><span style=\"font-weight: 400\"> \u00a0 \u00a0 <\/span><a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/crawling\/large-site-managing-crawl-budget#emergencies\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">Handle overcrawling of your site<\/span><\/a><span style=\"font-weight: 400\">.<\/span><\/li>\n<\/ol>\n<h1><span style=\"font-weight: 400\">How HTTP status codes, network, and DNS errors influence Google Search<\/span><\/h1>\n<p><span style=\"font-weight: 400\">HTTP status codes are generated by the server that hosts the site when it responds to a request from a client. Each HTTP status code has a different meaning, but the outcome of the request is often the same. Search Console generates error messages for status codes in the 4xx to 5xx range and for failed redirects (3xx). If the server responds with a 2xx status code, the content received may be considered for indexing.<\/span><\/p>\n<p><strong>HTTP status codes<\/strong><\/p>\n<p><span style=\"font-weight: 400\">2xx (success): Google considers the content for indexing. If the content suggests an error (for example, an empty page or an error message), Search Console will show a soft 404 error.<\/span><\/p>\n<p><span style=\"font-weight: 400\">200 (success): Googlebot passes the content on to the indexing pipeline. 
The indexing systems may index the content, but that is not guaranteed.<\/span><\/p>\n<p><span style=\"font-weight: 400\">201 (created), 202 (accepted): Googlebot waits for the content for a limited time, then passes whatever it received on to the indexing pipeline.<\/span><\/p>\n<p><span style=\"font-weight: 400\">204 (no content): Googlebot signals the indexing pipeline that it received no content, and the Index Coverage report in Search Console may show a soft 404 error.<\/span><\/p>\n<p><span style=\"font-weight: 400\">3xx (redirects): Googlebot follows up to about ten redirect hops. If the crawler does not receive content within that limit, Search Console will show a redirect error in the Index Coverage report.<\/span><\/p>\n<p><span style=\"font-weight: 400\">301 (moved permanently): Googlebot follows the redirect, and the indexing pipeline uses it as a strong signal that the redirect target should be canonical.<\/span><\/p>\n<p><span style=\"font-weight: 400\">302 (found): Googlebot follows the redirect, and the indexing pipeline uses it as a weak signal that the redirect target should be canonical.<\/span><\/p>\n<p><span style=\"font-weight: 400\">303 (see other): handled the same way as 302.<\/span><\/p>\n<p><span style=\"font-weight: 400\">304 (not modified): Googlebot signals the indexing pipeline that the content is the same as the last time it was crawled. The pipeline may recalculate signals for the URL, but otherwise the status code has no effect on indexing.<\/span><\/p>\n<p><span style=\"font-weight: 400\">307 (temporary redirect) is equivalent to 302, and 308 (moved permanently) is equivalent to 301. While Google Search handles these status codes the same way, keep in mind that they are semantically different.<\/span><\/p>\n<p><span style=\"font-weight: 400\">4xx (client errors): Every 4xx error, such as 400, is treated the same way, with the exception of 429: Googlebot signals the indexing pipeline that the content does not exist.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The indexing pipeline removes the URL from the index if it was previously indexed. Newly encountered 404 pages are not processed, and the crawling frequency for the URL gradually decreases.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Do not use 401 and 403 status codes to limit the crawl rate. 
The 4xx status codes, except 429, have no effect on the crawl rate. Instead, use the methods for reducing the crawl rate described earlier.<\/span><\/p>\n<p><span style=\"font-weight: 400\">401 (unauthorized)<\/span><\/p>\n<p><span style=\"font-weight: 400\">403 (forbidden)<\/span><\/p>\n<p><span style=\"font-weight: 400\">404 (not found)<\/span><\/p>\n<p><span style=\"font-weight: 400\">410 (gone)<\/span><\/p>\n<p><span style=\"font-weight: 400\">411 (length required)<\/span><\/p>\n<p><span style=\"font-weight: 400\">429 (too many requests)<\/span><\/p>\n<p><strong>Googlebot treats the 429 status code as a signal that the server is overloaded, and it is considered a server error.<\/strong><\/p>\n<p><span style=\"font-weight: 400\">5xx (server errors): 5xx and 429 server errors prompt Google&#8217;s crawlers to temporarily slow down crawling. Already indexed URLs are preserved in the index, but are eventually dropped. If the robots.txt file returns a server error status code for more than 30 days, Google will use the last cached copy of the robots.txt file; if no cached copy is available, Google assumes that there are no crawl restrictions.<\/span><\/p>\n<p><span style=\"font-weight: 400\">500 (internal server error): Googlebot decreases the crawl rate for the site. The decrease in crawl rate is proportionate to the number of individual URLs returning a server error. Google&#8217;s indexing pipeline removes from the index URLs that persistently return a server error.<\/span><\/p>\n<p><span style=\"font-weight: 400\">502 (bad gateway)<\/span><\/p>\n<p><span style=\"font-weight: 400\">503 (service unavailable)<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Network and DNS errors<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Network and DNS errors have a quick, negative effect on a URL&#8217;s presence in Google Search. Googlebot treats network timeouts, connection resets, and DNS errors similarly to 5xx server errors. 
When network errors occur, crawling immediately starts slowing down, because a network error is a sign that the server may not be able to handle the load.<\/span><\/p>\n<p><strong>Debugging network errors<\/strong><\/p>\n<p><span style=\"font-weight: 400\">These errors occur before Google starts crawling a URL or while it is crawling it. Because they can occur before the server responds, there is no status code to hint at the cause, which makes them harder to diagnose. To debug timeout and connection reset errors:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Review your firewall settings and logs.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Inspect the network traffic.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">If you cannot find anything suspicious, contact your hosting company.<\/span><\/li>\n<\/ul>\n<p><strong>Debugging DNS errors<\/strong><\/p>\n<p><span style=\"font-weight: 400\">DNS errors are most commonly caused by misconfiguration, but they can also be caused by a firewall rule that blocks Googlebot&#8217;s DNS queries. 
To debug DNS errors, do the following:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Inspect your firewall rules.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Check your DNS records.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Check that all your name servers point to the correct IP addresses for your site.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">If you have changed your DNS configuration within the last 72 hours, you may need to wait for the changes to propagate.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">If you are running your own DNS server, make sure it is healthy and not overloaded.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you have recently added or changed a page on your site, you can ask Google to re-index it using any of the methods&#8230;<\/p>\n","protected":false},"author":2,"featured_media":896,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[45,19,1,14],"tags":[116,117],"class_list":["post-892","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-google-search-central","category-advanced-seo","category-easyschema-blog","category-seo","tag-ask-google","tag-submit-site-to-google"],"_links":{"self":[{"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/posts\/892","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/comments?post=892"}],"version-history":[{"count":3,"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/posts\/892\/revisions"}],"predecessor-version":[{"id":895,"href":"https:\/\/easyschema.com\/bl
og\/wp-json\/wp\/v2\/posts\/892\/revisions\/895"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/media\/896"}],"wp:attachment":[{"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/media?parent=892"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/categories?post=892"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/easyschema.com\/blog\/wp-json\/wp\/v2\/tags?post=892"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}