
Ep201 - ‘How Google Search Crawls Pages’ #TWIMshow - This Week in Marketing


Episode 201 contains the Digital Marketing News and Updates from the week of Feb 26 - Mar 1, 2024.
1. ‘How Google Search Crawls Pages’ - In a comprehensive video from Google, engineer Gary Illyes sheds light on how Google's search engine discovers and fetches web pages through a process known as crawling.
Crawling is the first step in making a webpage searchable. Google uses automated programs, known as crawlers, to find new or updated pages. The cornerstone of this process is URL discovery, where Google identifies new pages by following links from known pages. This method highlights the importance of having a well-structured website with effective internal linking, ensuring that Google can discover and index new content efficiently.
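To make the idea of link-based URL discovery concrete, here is a minimal sketch using only Python's standard library. The start URL is a hypothetical placeholder, and a real crawler like Googlebot is of course far more sophisticated.

```python
# Minimal sketch of URL discovery by following internal links (illustrative only).
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags on a fetched page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover_urls(start_url, limit=20):
    """Follow internal links from a known page to find new URLs, breadth-first."""
    domain = urlparse(start_url).netloc
    seen, queue = {start_url}, [start_url]
    while queue and len(seen) < limit:
        url = queue.pop(0)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
        except Exception:
            continue  # skip pages that fail to load
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            # Stay on the same site and avoid revisiting known URLs
            if urlparse(absolute).netloc == domain and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen


if __name__ == "__main__":
    for found in discover_urls("https://www.example.com/"):  # placeholder site
        print(found)
```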
A key tool in enhancing your website's discoverability is the use of sitemaps. These are XML files that list your site's URLs along with additional metadata. While not mandatory, sitemaps are highly recommended as they significantly aid Google and other search engines in finding your content. For business owners, this means working with your website provider or developer to ensure your site automatically generates sitemap files, saving you time and reducing the risk of errors.
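As a rough illustration, a basic sitemap file can be produced with a few lines of Python; the page URLs and dates below are placeholders, and in practice most CMS platforms, site builders, or plugins generate this file for you automatically.

```python
# Sketch of generating a simple sitemap.xml with Python's standard library.
from xml.etree.ElementTree import Element, SubElement, ElementTree

# Hypothetical pages; a real site would list its actual URLs.
PAGES = [
    {"loc": "https://www.example.com/", "lastmod": "2024-02-26"},
    {"loc": "https://www.example.com/services", "lastmod": "2024-02-20"},
]

urlset = Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for page in PAGES:
    url = SubElement(urlset, "url")
    SubElement(url, "loc").text = page["loc"]
    SubElement(url, "lastmod").text = page["lastmod"]  # optional metadata

# Writes the file that would be submitted to search engines.
ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```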
Googlebot, Google's main crawler, uses algorithms to decide which sites to crawl, how often, and how many pages to fetch. This process is delicately balanced to avoid overloading your website, with the speed of crawling adjusted based on your site's response times, content quality, and server health. It's crucial for businesses to maintain a responsive and high-quality website to facilitate efficient crawling.
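Google's exact scheduling algorithm is not public, but the general idea of adapting crawl speed to server health can be sketched as follows; the back-off logic here is purely illustrative and is not how Googlebot actually works.

```python
# Illustrative sketch: a polite fetcher that slows down when a server is slow.
import time
from urllib.request import urlopen


def fetch_politely(urls, base_delay=1.0):
    """Fetch URLs one by one, waiting longer when responses are slow or failing."""
    delay = base_delay
    for url in urls:
        time.sleep(delay)  # wait before each request
        start = time.time()
        try:
            urlopen(url, timeout=10).read()
        except Exception:
            delay = delay * 2  # back off further after an error
            continue
        elapsed = time.time() - start
        # A slow response suggests a strained server, so increase the wait;
        # fast, healthy responses let the wait shrink back toward the base.
        delay = max(base_delay, elapsed * 2)
```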
Moreover, Googlebot only indexes publicly accessible URLs, emphasizing the need for businesses to ensure their most important content is not hidden behind login pages. The crawling process concludes with downloading and rendering the pages, allowing Google to see and index dynamic content loaded via JavaScript.
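As a quick sanity check, you can request your key pages without being logged in and confirm they load normally; the URL list below is hypothetical, and a 200 status is only a rough signal that the page is publicly reachable.

```python
# Spot-check that important pages respond without an error or access block.
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

KEY_PAGES = [
    "https://www.example.com/",          # placeholder URLs
    "https://www.example.com/services",
]

for url in KEY_PAGES:
    try:
        status = urlopen(url, timeout=10).getcode()
        print(f"{url} -> {status}")  # 200 means the request succeeded
    except HTTPError as err:
        print(f"{url} -> {err.code} (blocked, missing, or removed)")
    except URLError as err:
        print(f"{url} -> unreachable ({err.reason})")
```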
2. Is Google Happy with 301+410 Responses? - In a recent discussion on Reddit, a user expressed concerns about their site's "crawl budget" being impacted by a combination of 301 redirects and 410 error responses. This situation involved redirecting non-secure, outdated URLs to their secure counterparts, only to serve a 410 error indicating the page is permanently removed. The user wondered if this approach was hindering Googlebot's efficiency and contributing to crawl budget issues.
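In concrete terms, the setup the user described might look roughly like the sketch below, which uses Python's standard library purely for illustration. The hostnames and paths are placeholders; on a real site these redirect and 410 rules would normally live in the web server (e.g. Apache or Nginx) or CMS configuration.

```python
# Sketch of the 301-then-410 pattern: HTTP redirects to HTTPS, and the HTTPS
# site answers 410 Gone for pages that were permanently removed.
from http.server import BaseHTTPRequestHandler, HTTPServer

REMOVED_PATHS = {"/old-page", "/retired-service"}  # hypothetical removed URLs


class HttpToHttpsRedirect(BaseHTTPRequestHandler):
    """The plain-HTTP listener: permanently redirect every URL to HTTPS."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", "https://www.example.com" + self.path)
        self.end_headers()


class HttpsSite(BaseHTTPRequestHandler):
    """The HTTPS listener: answer 410 Gone for permanently removed pages."""

    def do_GET(self):
        if self.path in REMOVED_PATHS:
            self.send_response(410)  # tells crawlers the page is gone for good
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(b"<html><body>Current page</body></html>")


if __name__ == "__main__":
    # Serve only the redirect side locally, just to illustrate the idea.
    HTTPServer(("localhost", 8080), HttpToHttpsRedirect).serve_forever()
```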
Google's John Mueller provided clarity, stating that using a mix of 301 redirects (which guide users from HTTP to HTTPS versions of a site) followed by 410 errors is acceptable. Mueller emphasized that crawl budget concerns primarily affect very large sites, as detailed in Google's documentation. If a smaller site experiences crawl issues, the cause is more likely Google's assessment of the site's value than a technical problem, which points toward evaluating and improving the content rather than the redirect setup.
Mueller's insights highlight a critical aspect of SEO: the creation of valuable content. He criticizes common SEO strategies that merely replicate existing content, an approach that adds neither value nor originality. Likening it to producing more "Zeros" rather than unique "Ones," he implies that duplicating what's already available does not improve a site's worth in Google's eyes.
For business owners, this discussion underlines the importance of focusing on original, high-quality content over technical SEO manipulations. While ensuring your site is technically sound is necessary, the real competitive edge lies in offering something unique and valuable to your audience. This not only aids in standing out in search results but also aligns with Google's preference for indexing content that provides new information or perspectives.
In summary, while understanding the technicalities of SEO, such as crawl budgets and redirects, is important, the emphasis should be on content quality. Businesses should strive to create original, valuable content that sets their site apart.
