Case Study: Automated SEO Indexing Pipeline Architecture | whoisalfaz.me

Alfaz Mahmud Rizve
@whoisalfaz
April 12, 2024
How I Engineered a Zero-Touch SEO Indexing Pipeline That Submits 41 URLs Across 3 Search Engines on Every Deploy

This technical breakdown contains affiliate links. If you deploy this stack using my links, I earn a commission at no extra cost to you.

By Alfaz Mahmud Rizve | RevOps & Full Stack Automation Architect


The Problem: A WordPress Origin Story and the Pain of Manual SEO

This case study is about my own platform — whoisalfaz.me — and the journey from a sluggish WordPress site to a high-performance Next.js content engine with fully automated search engine indexing.

The WordPress Era

The original version of whoisalfaz.me was built on WordPress. It served its purpose as a starting point, but the fundamental limitations became impossible to ignore as the site grew:

  • Performance: WordPress's PHP rendering pipeline, combined with the plugins the site depended on (Yoast SEO, WPForms, WP Rocket), resulted in page load times of 3-4 seconds or more. The Google PageSpeed Insights score hovered around 45-60 on mobile. For a site whose primary purpose was to attract technical clients, this was a credibility-destroying first impression.
  • Plugin Bloat: Every new feature required a new plugin, each adding its own CSS, JavaScript, and database queries to every single page load. The site's <head> tag alone contained over 15 external stylesheet references.
  • Security Surface: Each plugin represented an independent software supply chain. The constant stream of "Update Available" notifications was not just annoying — each one was a potential vulnerability vector.

The Migration Decision

I made the strategic decision to rebuild the entire platform from scratch using Next.js (App Router) deployed on DigitalOcean App Platform. The migration immediately solved the performance problem — the PageSpeed score jumped from ~55 to 98+ on both mobile and desktop.

But SEO visibility did not follow automatically.

The Manual Submission Bottleneck

After migrating, I discovered a new, time-consuming bottleneck. Every time I published a new blog post or updated an existing page, I had to manually:

1. Log into Google Search Console and request indexing for the new URL.
2. Log into Bing Webmaster Tools and submit the URL batch.
3. Wait and hope that the search engines would crawl the updated sitemap on their own schedule.

This process took approximately 10-15 minutes per deployment. For a content engine publishing multiple posts per week as part of a 30-day blog series, this friction was unacceptable. I needed to eliminate the human from the indexing loop entirely.


The Architecture Decision: A Triple-Threat Indexing Engine

The core design principle was simple: the moment code hits the main branch, every active URL on the site must be submitted to every major search engine — automatically, concurrently, and with zero human intervention.

I designed a three-pronged indexing architecture that I call the "Triple-Threat Engine":

1. Bing Webmaster API (Batch POST)

A direct POST request to the Bing URL Submission API, sending the full array of URLs in a single payload. This is the most powerful of the three because it explicitly tells Bing "these URLs exist and have been updated."
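
As an illustration, a batch submission of this kind could be sketched as follows. This is not the site's actual code: the function names are hypothetical, and the endpoint path and the `siteUrl` / `urlList` field names reflect my reading of Bing's JSON URL Submission API, so they should be checked against the current Bing Webmaster documentation (including the daily submission quota) before use.

```typescript
// Assumed endpoint for Bing's JSON batch submission method (verify against the docs).
const BING_ENDPOINT = "https://ssl.bing.com/webmaster/api.svc/json/SubmitUrlbatch";

interface BingBatchRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the request without sending it, so it can be inspected or logged first.
function buildBingBatchRequest(siteUrl: string, urls: string[], apiKey: string): BingBatchRequest {
  return {
    url: `${BING_ENDPOINT}?apikey=${encodeURIComponent(apiKey)}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json; charset=utf-8" },
      // The whole URL array travels in a single JSON payload.
      body: JSON.stringify({ siteUrl, urlList: urls }),
    },
  };
}

// Send the batch in one POST; resolves to true on any 2xx response.
async function submitToBing(siteUrl: string, urls: string[], apiKey: string): Promise<boolean> {
  const { url, init } = buildBingBatchRequest(siteUrl, urls, apiKey);
  const response = await fetch(url, init);
  return response.ok;
}
```

Separating the request builder from the network call keeps the payload easy to test and log without touching the live API.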

2. IndexNow Protocol (Bing, Yandex, Seznam, DuckDuckGo)

The IndexNow protocol is an open-source initiative that allows websites to notify participating search engines of URL changes. A single POST to the shared IndexNow endpoint is passed on to every participating engine, including Bing, Yandex, and Seznam. DuckDuckGo draws heavily on Bing's index, so it benefits indirectly. This provides coverage across search engines that the direct Bing API alone does not reach.
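
A minimal sketch of an IndexNow submission, assuming the shared `api.indexnow.org` endpoint and a key file hosted at the site root as `<key>.txt`. The helper names are hypothetical. The payload fields (`host`, `key`, `keyLocation`, `urlList`) are the ones the IndexNow protocol defines.

```typescript
const INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow";

interface IndexNowPayload {
  host: string;
  key: string;
  keyLocation: string;
  urlList: string[];
}

// Build the JSON payload. URLs on other hosts are dropped because the
// key file only proves ownership of the host it is served from.
function buildIndexNowPayload(siteUrl: string, key: string, urls: string[]): IndexNowPayload {
  const site = new URL(siteUrl);
  return {
    host: site.host,
    key,
    // Assumption: the key file lives at the site root as "<key>.txt".
    keyLocation: `${site.origin}/${key}.txt`,
    urlList: urls.filter((u) => new URL(u).host === site.host),
  };
}

// POST the payload and return the HTTP status (200 or 202 indicate acceptance).
async function submitToIndexNow(payload: IndexNowPayload): Promise<number> {
  const response = await fetch(INDEXNOW_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json; charset=utf-8" },
    body: JSON.stringify(payload),
  });
  return response.status;
}
```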

3. Google Sitemap Ping (HTTP GET)

Google's legacy sitemap ping endpoint (https://www.google.com/ping?sitemap=...) provides a lightweight notification mechanism. Google announced the deprecation of this endpoint in June 2023, so it should be treated as best-effort rather than a guaranteed channel. In my own testing it still appeared to trigger crawl activity when combined with a valid sitemap.xml.
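
The ping itself is a single GET with the sitemap URL passed as a query parameter. The sketch below uses hypothetical helper names and treats any failure as non-fatal, since a deprecated endpoint may return a non-2xx status at any time.

```typescript
// Build the ping URL; the sitemap URL must be percent-encoded as a query value.
function buildGooglePingUrl(sitemapUrl: string): string {
  return `https://www.google.com/ping?sitemap=${encodeURIComponent(sitemapUrl)}`;
}

// Best-effort ping: resolves to false instead of throwing, so a failure
// here does not abort the other submissions.
async function pingGoogle(sitemapUrl: string): Promise<boolean> {
  try {
    const response = await fetch(buildGooglePingUrl(sitemapUrl));
    return response.ok;
  } catch {
    return false;
  }
}
```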

Architect's Note: I specifically chose NOT to put this logic inside a Next.js build-time hook. Build-time hooks execute on the CI server before the deployment is live. If a search engine crawler attempts to visit the submitted URLs while the build is still deploying, it would encounter 404 errors — potentially damaging the site's crawl budget and trustworthiness. By using a post-deployment API route, we guarantee the content is live before any crawler is invited.
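
To make the post-deployment idea concrete, here is one way such a handler could look. It is a sketch under assumptions, not the production route: `handleIndexingRequest`, `IndexingDeps`, and the bearer-token check are hypothetical names and choices, and the submitters are injected so the fan-out logic stays independent of any particular search engine client. It uses only the standard `Request` and `Response` objects that App Router route handlers receive and return.

```typescript
// Each submitter takes the URL list and resolves once its engine has been notified.
type Submitter = (urls: string[]) => Promise<unknown>;

interface IndexingDeps {
  secret: string; // hypothetical shared secret sent by the deploy hook
  listUrls: () => Promise<string[]>;
  submitters: Record<string, Submitter>;
}

function json(body: unknown, status: number): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

async function handleIndexingRequest(request: Request, deps: IndexingDeps): Promise<Response> {
  // Reject callers that do not present the shared secret.
  if (request.headers.get("authorization") !== `Bearer ${deps.secret}`) {
    return json({ error: "unauthorized" }, 401);
  }

  const urls = await deps.listUrls();
  const names = Object.keys(deps.submitters);

  // Start every submission at once. allSettled waits for all of them and
  // never rejects, so one failing engine cannot block the others.
  const settled = await Promise.allSettled(names.map((name) => deps.submitters[name](urls)));

  const results: Record<string, string> = {};
  settled.forEach((outcome, i) => {
    results[names[i]] = outcome.status; // "fulfilled" or "rejected"
  });
  return json({ submitted: urls.length, results }, 200);
}
```

In a Next.js project this could be exposed from a route file (for example a hypothetical `app/api/indexing/route.ts`) as `export const POST = (req: Request) => handleIndexingRequest(req, deps)`, and called by the deploy pipeline only after the new version reports healthy.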


Technical Blueprint: How It Works

The implementation consists of three interconnected layers:

Layer 1: The Discovery Engine (Dynamic URL Collection)

Instead of maintaining a hardcoded list of URLs, the system dynamically reads the file system at runtime to compile the complete list of active pages. This means that when a new blog post is added and deployed, it is automatically included in the next indexing batch without any configuration changes.
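
A discovery step along these lines might look like the sketch below. It assumes posts are stored as `.md` or `.mdx` files in a single content directory and are served under a `/blog` prefix. The directory layout, the prefix, and the function names are illustrative assumptions, not details taken from the actual codebase.

```typescript
import { readdirSync } from "node:fs";

// Map content file names to public URLs. Kept pure so it is easy to test.
function filesToUrls(fileNames: string[], baseUrl: string, routePrefix: string): string[] {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return fileNames
    .filter((name) => /\.mdx?$/.test(name)) // keep only .md and .mdx files
    .map((name) => `${base}${routePrefix}/${name.replace(/\.mdx?$/, "")}`)
    .sort();
}

// Read the content directory at runtime, so a newly deployed post is
// picked up without any configuration change.
function discoverUrls(contentDir: string, baseUrl: string, routePrefix = "/blog"): string[] {
  return filesToUrls(readdirSync(contentDir), baseUrl, routePrefix);
}
```

For example, `discoverUrls("content/blog", "https://example.com")` would return one URL per post file in that (hypothetical) directory, which can then be combined with any static page URLs before submission.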

