Alfaz Mahmud Rizve
@whoisalfaz
April 12, 2024
9 min read
How I Engineered a Zero-Touch SEO Indexing Pipeline That Submits 41 URLs Across 3 Search Engines on Every Deploy

This technical breakdown contains affiliate links. If you deploy this stack using my links, I earn a commission at no extra cost to you.

By Alfaz Mahmud Rizve | RevOps & Full Stack Automation Architect at whoisalfaz.me

The Problem: A WordPress Origin Story and the Pain of Manual SEO

This case study is about my own platform — whoisalfaz.me — and the journey from a sluggish WordPress site to a high-performance Next.js content engine with fully automated search engine indexing.

The WordPress Era

The original version of whoisalfaz.me was built on WordPress. It served its purpose as a starting point, but the fundamental limitations became impossible to ignore as the site grew:

  • Performance: WordPress's PHP rendering pipeline, combined with the mandatory plugin ecosystem (Yoast SEO, WPForms, WP Rocket), resulted in page load times consistently exceeding 3-4 seconds. The Google PageSpeed Insights score hovered around 45-60 on mobile. For a site whose primary purpose was to attract technical clients, this was a credibility-destroying first impression.
  • Plugin Bloat: Every new feature required a new plugin, each adding its own CSS, JavaScript, and database queries to every single page load. The site's <head> tag alone contained over 15 external stylesheet references.
  • Security Surface: Each plugin represented an independent software supply chain. The constant stream of "Update Available" notifications was not just annoying — each one was a potential vulnerability vector.

The Migration Decision

I made the strategic decision to rebuild the entire platform from scratch using Next.js (App Router) deployed on DigitalOcean App Platform. The migration immediately solved the performance problem — the PageSpeed score jumped from ~55 to 98+ on both mobile and desktop.

But SEO visibility did not follow automatically.

The Manual Submission Bottleneck

After migrating, I discovered a new, time-consuming bottleneck. Every time I published a new blog post or updated an existing page, I had to manually:

1. Log into Google Search Console and request indexing for the new URL.
2. Log into Bing Webmaster Tools and submit the URL batch.
3. Wait and hope that the search engines would crawl the updated sitemap on their own schedule.

This process took approximately 10-15 minutes per deployment. For a content engine publishing multiple posts per week as part of a 30-day blog series, this friction was unacceptable. I needed to eliminate the human from the indexing loop entirely.

The Architecture Decision: A Triple-Threat Indexing Engine

The core design principle was simple: the moment code hits the main branch, every active URL on the site must be submitted to every major search engine — automatically, concurrently, and with zero human intervention.

I designed a three-pronged indexing architecture that I call the "Triple-Threat Engine":

1. Bing Webmaster API (Batch POST)

A direct POST request to the Bing URL Submission API, sending the full array of URLs in a single payload. This is the most powerful of the three because it explicitly tells Bing "these URLs exist and have been updated."
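As a sketch of what this batch POST looks like (the endpoint and body shape follow Bing's published URL Submission API; `BING_API_KEY` is an assumed environment variable name, and `submitToBing` here is illustrative rather than the exact production code):

```typescript
// Bing URL Submission API: one POST carries the entire URL batch.
const BING_ENDPOINT = 'https://ssl.bing.com/webmaster/api.svc/json/SubmitUrlBatch';

// Pure payload builder, kept separate so it can be tested without a network call.
function buildBingPayload(siteUrl: string, urls: string[]) {
  return { siteUrl, urlList: urls };
}

async function submitToBing(siteUrl: string, urls: string[]) {
  const res = await fetch(`${BING_ENDPOINT}?apikey=${process.env.BING_API_KEY}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildBingPayload(siteUrl, urls)),
  });
  if (!res.ok) throw new Error(`Bing submission failed: ${res.status}`);
  return res.json();
}
```

Keeping the payload construction pure makes the quota math easy to check before firing: Bing enforces a daily URL quota per site, so the length of `urlList` is worth logging on every run.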

2. IndexNow Protocol (Bing, Yandex, Seznam, DuckDuckGo)

The IndexNow protocol is an open, publicly documented protocol that allows websites to notify participating search engines of URL changes. A single POST to the IndexNow endpoint simultaneously notifies Bing, Yandex, Seznam, and DuckDuckGo. This provides coverage across search engines that the direct Bing API alone does not reach.
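A minimal sketch of that single POST, following the public IndexNow specification (the `INDEXNOW_KEY` variable name and the convention of serving the key file from the site root are assumptions; the spec requires the key to be verifiable at `keyLocation`):

```typescript
// IndexNow: one request notifies all participating engines.
function buildIndexNowPayload(host: string, key: string, urls: string[]) {
  return {
    host,
    key,
    keyLocation: `https://${host}/${key}.txt`, // key file must be publicly served
    urlList: urls,
  };
}

async function submitToIndexNow(host: string, urls: string[]) {
  const key = process.env.INDEXNOW_KEY ?? '';
  const res = await fetch('https://api.indexnow.org/indexnow', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json; charset=utf-8' },
    body: JSON.stringify(buildIndexNowPayload(host, key, urls)),
  });
  return res.status; // 200 or 202 means the batch was accepted
}
```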

3. Google Sitemap Ping (HTTP GET)

Google's legacy sitemap ping endpoint (https://www.google.com/ping?sitemap=...) provides a lightweight notification mechanism. While Google has deprecated formal support for this endpoint, empirical testing shows it still triggers crawl activity when combined with a valid sitemap.xml.
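The ping itself reduces to a URL-encoded GET, something like this sketch (`pingGoogle` is illustrative; only the endpoint shape comes from the text above):

```typescript
// Build the legacy sitemap ping URL; the endpoint is deprecated by Google
// but still accepted requests at the time of writing.
function buildPingUrl(sitemapUrl: string): string {
  return `https://www.google.com/ping?sitemap=${encodeURIComponent(sitemapUrl)}`;
}

async function pingGoogle(sitemapUrl: string) {
  const res = await fetch(buildPingUrl(sitemapUrl), { method: 'GET' });
  return res.status;
}
```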

Architect's Note

I specifically chose NOT to put this logic inside a Next.js build-time hook. Build-time hooks execute on the CI server before the deployment is live. If a search engine crawler attempts to visit the submitted URLs while the build is still deploying, it would encounter 404 errors — potentially damaging the site's crawl budget and trustworthiness. By using a post-deployment API route, we guarantee the content is live before any crawler is invited.

Technical Blueprint: How It Works

The implementation consists of three interconnected layers:

Layer 1: The Discovery Engine (Dynamic URL Collection)

Instead of maintaining a hardcoded list of URLs, the system dynamically reads the file system at runtime to compile the complete list of active pages. This means that when a new blog post is added and deployed, it is automatically included in the next indexing batch without any configuration changes.

```typescript
// Inside app/api/bing/submit/route.ts
// 1. Collect Static Routes
const staticRoutes = ['', '/portfolio', '/blog', '/contact', '/services', '/labs', '/labs/roi'];
const staticUrls = staticRoutes.map(r => `${siteUrl}${r}`);

// 2. Dynamically Discover All Blog Posts from the File System
const posts = getAllPosts(); // Recursively reads content/blog/**/*.mdx
const blogUrls = posts.map(p => `${siteUrl}/blog/${p.slug}`);

// 3. Merge into a Single Batch Array
const allUrls = [...staticUrls, ...blogUrls];
// Result: 41+ URLs compiled automatically
```

The getAllPosts() function uses a recursive fs.readdirSync to walk the content/blog/ directory tree, parsing the frontmatter of every .mdx file to extract the slug. This approach is inherently self-healing — if a post is deleted, it naturally drops out of the array on the next deployment.
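The recursive walk can be sketched as follows (a simplified stand-in for `getAllPosts()`: it assumes slugs derive from file names, whereas the real function also parses each file's frontmatter):

```typescript
import fs from 'node:fs';
import path from 'node:path';

// Recursively collect every .mdx file under a content directory.
function walkMdx(dir: string): string[] {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) return walkMdx(full);  // descend into subfolders
    if (entry.name.endsWith('.mdx')) return [full]; // keep only MDX posts
    return [];                                      // ignore everything else
  });
}

// Derive slugs from file names (simplification of frontmatter parsing).
function getAllSlugs(contentDir: string): string[] {
  return walkMdx(contentDir).map((f) => path.basename(f, '.mdx'));
}
```

Because the walk happens at request time rather than being baked into a config file, deleting a post's `.mdx` file is all it takes for its URL to vanish from the next batch.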

Layer 2: The Transmission API (Concurrent Fire)

The API route (app/api/bing/submit/route.ts) is the core execution engine. It is protected by a CRON_SECRET environment variable to prevent unauthorized triggers.
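The guard itself can be as small as a query-parameter comparison. The `CRON_SECRET` name comes from the article; the comparison below is a common pattern, not the author's exact code:

```typescript
// Reject any request that does not carry the shared secret.
function isAuthorized(requestUrl: string, secret: string | undefined): boolean {
  if (!secret) return false; // fail closed if the env var is unset
  const provided = new URL(requestUrl).searchParams.get('secret');
  return provided === secret;
}
```

Failing closed when the environment variable is missing matters here: an unset secret should disable the route, not open it to anyone.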

When invoked, it fires all three indexing mechanisms concurrently using Promise.allSettled():

```typescript
// 3. Triple-Threat Fire: All three run concurrently
const [bingResult, indexNowResult, googleResult] = await Promise.allSettled([
  submitToBing(allUrls),     // Batch POST to Bing Webmaster API
  submitToIndexNow(allUrls), // POST to IndexNow endpoint
  pingGoogle()               // HTTP GET to Google's Sitemap Ping
]);
```

The use of Promise.allSettled() over Promise.all() is a deliberate architectural choice. If the Bing API returns a 429 (rate limit) error, the IndexNow and Google submissions still complete successfully. The system is fault-tolerant by design — a failure in one channel does not cascade to the others.
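A toy demonstration of that fault tolerance, with the three channels replaced by stand-in promises (the channel names and failure are hypothetical):

```typescript
// One rejected channel does not abort the others under allSettled.
async function fireAll(): Promise<string[]> {
  const results = await Promise.allSettled([
    Promise.resolve('bing: 200'),
    Promise.reject(new Error('indexnow: 429')), // simulated rate limit
    Promise.resolve('google: 200'),
  ]);
  return results.map((r) =>
    r.status === 'fulfilled' ? r.value : `failed (${(r.reason as Error).message})`
  );
}
```

With `Promise.all()`, the simulated 429 would reject the whole expression and discard the two successful results; `allSettled` instead hands back all three outcomes for logging.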

Layer 3: The Deployment Trigger (GitHub Actions)

The final piece of the automation is a GitHub Actions workflow that fires the API route after every push to the main branch. It includes a critical 7-minute delay to allow DigitalOcean's App Platform to fully deploy the new build before the URLs are submitted.

```yaml
# .github/workflows/index-urls.yml
on:
  push:
    branches: [main]

jobs:
  index:
    runs-on: ubuntu-latest
    steps:
      - name: Wait for DigitalOcean deployment
        run: sleep 420 # 7 minutes

      - name: Trigger Triple-Threat Indexing
        run: |
          curl -s -X GET "${{ secrets.SITE_URL }}/api/bing/submit?secret=${{ secrets.CRON_SECRET }}"
```

This creates a fully autonomous pipeline: git push → DigitalOcean builds and deploys → GitHub Actions waits → API route fires → All search engines are notified. The developer (me) never touches a search console dashboard again.

The Site Audit Tool: A Client-Facing Trust Engine

Beyond the internal indexing pipeline, I also built a public-facing SEO Audit Tool directly into the site at /labs. This tool allows any visitor to input their website URL and receive a real-time sample audit powered by the Google PageSpeed Insights API.

Why Build This?

The audit tool serves a dual purpose:

1. Lead Qualification: A visitor who uses the audit tool has self-identified as someone who cares about their website's performance. This is a high-intent signal.
2. Credibility Demonstration: By providing a free, functional tool — not just a landing page with testimonials — the site demonstrates technical competence directly. The visitor experiences the quality of my engineering before ever making contact.

The tool runs a live PageSpeed API call, normalizes the URL input (handling missing https:// prefixes, double slashes, and other common formatting issues), and displays the results in a clean, two-column dashboard layout.
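The normalization step might look like this sketch (the exact rules used on /labs are an assumption; this covers the cases named above):

```typescript
// Normalize user-supplied URLs before calling the PageSpeed API.
function normalizeUrl(input: string): string {
  let url = input.trim();
  if (!/^https?:\/\//i.test(url)) url = `https://${url}`; // add missing scheme
  const u = new URL(url); // throws on hopelessly malformed input
  u.pathname = u.pathname.replace(/\/{2,}/g, '/'); // collapse duplicate slashes
  return u.toString().replace(/\/$/, '');          // drop trailing slash
}
```

Letting the `URL` constructor throw on garbage input is a deliberate simplification; the real tool would catch that and show a validation message instead.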

The Content Engine: A Blog Architecture Built for Scale

The blog itself is a custom MDX-powered engine with several enterprise-grade features that go beyond standard static site generators:

  • Dynamic Visual Components: Custom React components for callout cards, image lightboxes with click-to-expand, and styled step indicators are automatically detected and rendered from standard Markdown syntax.
  • Dual-Tier Affiliate Engine: The DeployingTheStacks component dynamically reads the affiliates frontmatter array from each post and renders relevant technology partner cards, split into "Primary Stacks" and "Supporting Stacks."
  • Category System: A getPostsByCategory() function powers the dedicated category listing pages, allowing content to be organized into distinct verticals like "Learn Automation in 30 Days" and "Architecture Teardowns."
  • Triple-Push Deployment: A custom npm run push-all script executes git push to both GitHub (for CI/CD and version control) and DigitalOcean (for production deployment) simultaneously.

Results & Metrics

The transformation from a WordPress blog to a high-performance, self-indexing content engine eliminated every manual bottleneck in the publishing workflow. The site now operates as a fully autonomous system — content is written, committed, and the rest is handled by code.

Key Takeaways for Builders

1
Treat SEO as an engineering problem, not a marketing chore. Indexing is a deterministic process that can and should be automated.
2
Use Promise.allSettled() for concurrent, fault-tolerant network operations. Never let one failing API call bring down an entire pipeline.
3
Delay post-deployment triggers. Always give your hosting platform enough time to fully serve the new build before inviting crawlers.
4
Build public tools, not just landing pages. A functional audit tool is worth more than a hundred testimonials for demonstrating competence.

Complementary RevOps Toolchain

  • Pinecone (Vector DB): The vector database for building AI applications. Essential for RAG architectures.
  • Apollo.io (Lead Gen): The ultimate B2B database and sales engagement platform for lead generation.
  • Databox (Analytics): Business analytics platform to build and share custom dashboards.
  • Monday.com (Work OS): The Work OS that lets you shape workflows, your way. Perfect for team scale.
  • Turbotic (Orchestration): Enterprise automation optimization and orchestration tracking system.
  • CometChat (Comms API): Developer-first in-app messaging and voice/video calling APIs.
  • AdCreative.ai (AI Design): Generate conversion-focused ad creatives and social media post designs in seconds.
  • ElevenLabs (Voice AI): The most realistic text-to-speech and voice cloning software.
  • Emergent (RevOps AI): AI-powered revenue operations platform for scaling B2B growth.
  • Tapstitch (Integration): Data integration and workflow stitching platform for modern teams.
  • AiSDR (AI Sales): AI-powered sales development representative for automated outbound.
  • Accelerated Growth Studio (Growth): Growth engineering and product-led acquisition acceleration platform.

Ready to automate your agency?

Skip the manual grunt work. Let's build a custom system that runs your business on autopilot 24/7.