How I Built an Automated Content Pipeline with OpenClaw + WordPress

Affiliate Disclosure: This post contains affiliate links. If you purchase through my link, I earn a commission at no extra cost to you. I only recommend tools I actually use.

Six months ago I was writing every article by hand. Research in one tab, outline in another, draft in Google Docs, then the tedious copy-paste into WordPress — formatting, featured image, categories, schema markup, internal links. Every single time. For a single piece of content, I was burning three to four hours. At that rate, scaling to daily output wasn't a goal. It was a fantasy.

Today I wake up to published articles. The pipeline runs at midnight, and by 6 AM I have a new post live on the site — researched, written, formatted, deployed, and cross-linked. I want to show you exactly how I built it, because the architecture took me weeks to get right and I've never seen anyone document it this honestly.

This is an OpenClaw + WordPress content pipeline running in production. Not a proof of concept. Not a tutorial with fake screenshots. This is what I actually run.


Why I Needed This (The Real Reason)

Manual content creation has a ceiling, and that ceiling is your time. You can get faster at writing, faster at research, faster at formatting — but you're still bounded by hours in the day. When I modeled what it would take to dominate a niche with 200+ articles, I realized I either needed a team or I needed a machine.

A team means payroll, management overhead, and inconsistent quality. A machine means a one-time build cost and then repeatable, predictable output. I chose the machine.

But the thing about machines is they have to actually work. The first version of this pipeline crashed silently, published half-written drafts, and once uploaded the wrong featured image to three articles before I caught it. Getting from “it kind of works” to “I trust it to run unattended at midnight” took a lot of iteration.

Here's what I eventually landed on.


The 3-Cron Architecture

The pipeline is split into three completely separate cron jobs. Each one is independent — it reads state, does its job, writes state, and exits. They don't talk to each other in real time. They communicate through a shared state file on disk.

Here's how the full flow looks:

OpenClaw WordPress Content Pipeline — 3-Cron Architecture

CRON 1 — 00:00 UTC (Research + Write)
  • SERP fetch & gap analysis — top 10 results scraped, keyword clusters extracted
  • Outline generation — H2/H3 structure scored for coverage gaps
  • Full draft + readability pass — ~1,500–2,500 words written, Flesch scored
  • Image plan written to state file — alt text, prompts, placement anchors saved
  State file: article-{slug}.json — checkpointed at every phase. Cron 2 reads it; Cron 3 verifies against it.

CRON 2 — 03:00 UTC (Deploy)
  • Create application password (per-session, scoped) — via WP REST API, stored in state file
  • Post to WordPress REST API — draft created, ID saved immediately
  • Featured image upload + attach — media library upload, set as thumbnail
  • Categories, tags, schema, cross-links — all metadata applied, post published
  • Application password deleted post-deploy — zero standing credentials after the run

CRON 3 — 06:00 UTC (Watchdog)
  • Fetch the live page (not the DB — the actual URL) — curl the published URL, check for HTTP 200
  • Verify content integrity — check title, word count, image presence
  • Alert on failure / log on success — Telegram alert if anything is wrong

The separation matters. If Cron 1 fails halfway through writing, Cron 2 never runs. If Cron 2 deploys but something goes wrong with the image upload, the state file knows exactly what succeeded. Cron 3 always checks the live URL — not the database, not the API response. The actual rendered page.
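The handoff between crons is easy to enforce in code. Here's a minimal sketch of Cron 2's entry gate (Python, stdlib only; the phase names match the state-file structure shown in the next section, but the function name and required-phase list are illustrative, not my exact code):

```python
import json
from pathlib import Path

# Phases Cron 1 must finish before Cron 2 is allowed to deploy.
REQUIRED_WRITE_PHASES = ["serp_fetch", "gap_analysis", "outline", "draft", "image_plan"]

def ready_for_deploy(state_path):
    """Cron 2's entry gate: run only if Cron 1 marked every write phase complete."""
    path = Path(state_path)
    if not path.exists():
        return False  # Cron 1 never started — nothing to deploy
    phases = json.loads(path.read_text()).get("phases", {})
    return all(phases.get(p, {}).get("status") == "complete"
               for p in REQUIRED_WRITE_PHASES)
```

If the gate returns False, Cron 2 simply exits; the half-finished run waits for review instead of deploying a partial article.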


The State File Pattern: How Crash Recovery Works

This was the hardest part to get right, and also the most important. Early versions of the pipeline would crash partway through and I'd have no idea what had completed. Did the draft get saved? Did the post get created? Was the image uploaded? I was hunting through WordPress admin to figure out what was half-done.

The fix was treating the state file as the single source of truth for everything.

Every phase writes to the state file before it starts the next phase. The structure looks like this:

{
  "slug": "best-affiliate-plugins-wordpress",
  "phases": {
    "serp_fetch": { "status": "complete", "completed_at": "2026-03-18T00:04:22Z" },
    "gap_analysis": { "status": "complete", "completed_at": "2026-03-18T00:07:11Z" },
    "outline": { "status": "complete", "completed_at": "2026-03-18T00:09:44Z" },
    "draft": { "status": "complete", "word_count": 2187, "completed_at": "2026-03-18T00:22:08Z" },
    "image_plan": { "status": "complete", "completed_at": "2026-03-18T00:23:01Z" },
    "wp_post_create": { "status": "complete", "post_id": 4821, "completed_at": "2026-03-18T03:02:14Z" },
    "featured_image": { "status": "complete", "media_id": 4822, "completed_at": "2026-03-18T03:04:55Z" },
    "publish": { "status": "complete", "url": "https://mysite.com/best-affiliate-plugins-wordpress/", "completed_at": "2026-03-18T03:06:30Z" },
    "watchdog": { "status": "complete", "http_status": 200, "verified_at": "2026-03-18T06:01:12Z" }
  }
}

When a cron starts, the first thing it does is read the state file and find the last completed phase. It skips everything that's already done and picks up from where it left off. This means a crash anywhere — network timeout, API rate limit, model error — costs me at most one phase of re-work, not the whole run.

One critical rule I learned: state files track phase completion, not file existence. Early on I was checking whether a draft file existed on disk as a proxy for “draft phase complete.” The problem is that a file can exist but be corrupt, empty, or half-written. Now the state file is the only authority. If the phase isn't marked complete in the state file, it gets re-run. Period.
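The resume logic itself is small. Here's a sketch of the checkpoint pattern (Python; the phase list mirrors the JSON example above, but the helper names are mine for illustration):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Cron 1's phases, in execution order (matches the state-file example above).
PHASE_ORDER = ["serp_fetch", "gap_analysis", "outline", "draft", "image_plan"]

def load_state(path):
    """State file is the single source of truth; missing file means a fresh run."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {"phases": {}}

def next_phase(state):
    """First phase not explicitly marked complete — where a restarted run resumes."""
    for name in PHASE_ORDER:
        if state["phases"].get(name, {}).get("status") != "complete":
            return name
    return None  # everything done

def mark_complete(state, path, name, **extra):
    """Record completion (plus artifacts like word_count) before starting the next phase."""
    state["phases"][name] = {
        "status": "complete",
        "completed_at": datetime.now(timezone.utc).isoformat(),
        **extra,
    }
    Path(path).write_text(json.dumps(state, indent=2))
```

A crash between `mark_complete` calls costs at most one phase of re-work: the restarted cron calls `next_phase` and picks up exactly where the state file says it left off.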


WordPress Integration: REST API + Application Passwords

The WordPress REST API is genuinely good once you understand its quirks. I authenticate using application passwords — a WordPress core feature that lets you create scoped credentials without touching your main account password.

My security pattern is create-use-delete per session:

# 1. Create a new application password via the REST API
POST /wp-json/wp/v2/users/me/application-passwords
Body: { "name": "pipeline-run-2026-03-18" }
→ Response: { "password": "xxxx xxxx xxxx xxxx xxxx xxxx", "uuid": "abc-123..." }

# 2. Use it for all API calls in this run
Authorization: Basic base64(username:app_password)

# 3. After deploy completes (success OR failure), delete it
DELETE /wp-json/wp/v2/users/me/application-passwords/{uuid}

The result: there are no standing credentials sitting around. If the pipeline host were ever compromised, there's nothing for an attacker to grab. The password that authorized last night's run was deleted at 3:07 AM.
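In Python, the create-use-delete pattern is naturally expressed with try/finally, which guarantees the delete runs whether the deploy succeeds or blows up. A stdlib-only sketch (the site URL, helper names, and wrapper are hypothetical — only the two application-passwords endpoints are real WordPress REST API routes):

```python
import base64
import json
import urllib.request

SITE = "https://example.com"  # hypothetical — substitute your WordPress URL

def basic_auth_header(username, app_password):
    """WordPress application passwords ride on ordinary HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

def wp_json(method, path, auth, payload=None):
    """Minimal stdlib wrapper for WP REST API calls (no third-party deps)."""
    req = urllib.request.Request(
        f"{SITE}/wp-json/wp/v2{path}",
        data=json.dumps(payload).encode() if payload is not None else None,
        headers={**auth, "Content-Type": "application/json"},
        method=method)
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = resp.read()
        return json.loads(body) if body else {}

def run_with_ephemeral_password(username, bootstrap_auth, deploy_fn, run_name):
    """Create → use → delete; deletion is guaranteed by the finally block."""
    created = wp_json("POST", "/users/me/application-passwords",
                      bootstrap_auth, {"name": run_name})
    run_auth = basic_auth_header(username, created["password"])
    try:
        deploy_fn(run_auth)                   # all API calls for this run
    finally:                                  # success OR failure
        wp_json("DELETE",
                f"/users/me/application-passwords/{created['uuid']}",
                bootstrap_auth)
```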

The actual deployment sequence via REST API:

  1. Create post as draft — save the post ID immediately to state file. If anything fails after this, we can update or delete the orphaned draft.
  2. Upload featured image to media library — separate API call, returns a media ID.
  3. Update post — attach featured image, set categories, add tags, inject schema JSON-LD into the content, add cross-links.
  4. Set status to publish — final step, only after everything else is confirmed.

That order matters. Never publish first and then try to attach metadata: if you flip the order and the metadata step fails after the post goes live, you're left with a published article missing its schema and featured image. Always publish last.
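The four-step sequence above looks like this in code. A sketch, not my exact implementation — `api(method, path, payload)` stands in for a thin WP REST API wrapper that returns parsed JSON, and `save_state` persists the state file:

```python
def deploy(api, state, save_state):
    """Deploy in the order described above: draft → image → metadata → publish."""
    phases = state.setdefault("phases", {})

    # 1. Create the post as a draft and save its ID immediately,
    #    so any later failure leaves a findable (and fixable) draft.
    post = api("POST", "/posts", {"title": state["title"],
                                  "content": state["html"],
                                  "status": "draft"})
    phases["wp_post_create"] = {"status": "complete", "post_id": post["id"]}
    save_state(state)

    # 2. Upload the featured image — a separate call that returns a media ID.
    media = api("POST", "/media", {"file": state["image_path"]})
    phases["featured_image"] = {"status": "complete", "media_id": media["id"]}
    save_state(state)

    # 3. Attach the image and metadata while the post is still a draft.
    api("POST", f"/posts/{post['id']}", {"featured_media": media["id"],
                                         "categories": state["categories"],
                                         "tags": state["tags"]})

    # 4. Publish last, only after everything else is confirmed.
    api("POST", f"/posts/{post['id']}", {"status": "publish"})
    phases["publish"] = {"status": "complete"}
    save_state(state)
```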


Content Quality Controls

Automated content without quality gates produces slop. Here's what I gate on before anything gets deployed:

SERP analysis before writing. The pipeline fetches the top 10 results for the target keyword, extracts their H2/H3 headings, and builds a topic coverage map. The outline is then scored against that map — if a topic appears in 7 of 10 competitors but not in my outline, it gets added. This isn't keyword stuffing; it's making sure I'm not publishing a piece that obviously misses what readers expect to find.
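The coverage scoring reduces to a frequency count. A simplified sketch (it assumes headings have already been normalized into comparable topic strings, which is the hard part my real pipeline handles separately):

```python
from collections import Counter

def coverage_gaps(competitor_headings, my_outline, threshold=0.7):
    """Topics covered by >= `threshold` share of competitor pages but
    missing from my outline. `competitor_headings` is one list of
    normalized topic strings per competitor page."""
    n = len(competitor_headings)
    # Count each topic once per page, not once per occurrence.
    counts = Counter(t for page in competitor_headings for t in set(page))
    mine = set(my_outline)
    return sorted(t for t, c in counts.items()
                  if c / n >= threshold and t not in mine)
```

With the 7-of-10 rule from above, a topic present on 7 competitor pages but absent from the outline comes back as a gap and gets added before drafting starts.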

Word count floor. If the draft comes in under 1,400 words, the phase is marked failed and the run stops. I'd rather get an alert and review it manually than publish a thin piece.

Readability pass. After the draft, a secondary pass checks for sentence length variance, passive voice overuse, and transition density. This doesn't rewrite the article; it flags sections that read like they were written by a machine running on autopilot.

Image planning tied to content structure. The image plan is generated from the outline — not separately. Each planned image has a placement anchor (the H2 it goes below), alt text derived from the surrounding content, and a generation prompt tied to the section topic. This keeps images contextually relevant instead of generic stock-photo-style illustrations dropped in at random.
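Deriving the plan from the outline keeps it mechanical. A toy version (the field names are illustrative, not my actual schema):

```python
def plan_images(outline_h2s, topic):
    """One planned image per H2 section (simplified): the H2 is the
    placement anchor, with alt text and a generation prompt derived
    from the section rather than chosen at random."""
    return [{"anchor": h2,
             "alt": f"{h2} — {topic}",
             "prompt": (f"Editorial illustration for a section titled "
                        f"'{h2}' in an article about {topic}")}
            for h2 in outline_h2s]
```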


What It Actually Produces

Here are the real numbers after running this for roughly 90 days:

  • Output rate: 6–7 articles per week (one per night, with a rest day built in for topic queue management)
  • Average word count: 1,950 words per article
  • Pipeline success rate: 91% (the other 9% generate an alert, I review and re-queue)
  • Time I spend per article: ~15 minutes reviewing the watchdog alert and spot-checking the live post
  • Time I was spending before: 3–4 hours per article
  • Net time saved per week: ~18–22 hours

The content consistency is the underrated win. When you write manually, your quality varies with your energy level, your mood, whether you had coffee. The pipeline produces the same structure, the same quality gates, the same metadata completeness — every time, at midnight, whether I'm awake or not.



The Cost Breakdown

This is where most people handwave. Here's what it actually costs per article:

  • SERP fetch + parsing: ~$0.15 (Brave Search API calls)
  • Gap analysis + outline: ~$0.40 (Claude Haiku for fast analysis)
  • Full draft: ~$1.20–$2.80 (Claude Sonnet — varies with length)
  • Readability pass: ~$0.20
  • Image generation: ~$0.40–$0.80 (2–3 images per article)
  • Watchdog verification: ~$0.05

Total per article: $2.40–$4.40. Call it $3.50 average.

At 6 articles per week, that's $21/week or about $84/month. For 24 published articles per month that I didn't have to write. If even one of those articles earns $100 in affiliate commissions, the pipeline has paid for itself multiple times over.



Mistakes I Made Building This (And How to Avoid Them)

I'm going to be honest here because these cost me days of debugging.

Mistake 1: Storing structured data inside WordPress meta

Early on I was serializing my state data into WordPress post meta — storing the phase completion status as custom fields on the post itself. This seemed clever: everything in one place, WordPress as the database. The problem is that WordPress meta is not a reliable data store for structured pipeline state. Meta values get filtered, escaped, and occasionally dropped by caching plugins and REST API quirks. Your state file should live on your pipeline host's filesystem. Full stop.

Mistake 2: Verifying against the database instead of the live page

The WordPress REST API will tell you a post is published with the correct content. It can be lying. Not because WordPress is broken — because caching layers exist between the database and the page a visitor actually sees. I had three articles that the API confirmed as published with featured images, but when a user hit the URL they got a cached version of a draft with no image. The watchdog must fetch the live URL with a cache-busting parameter and inspect the actual HTML. Trust nothing except the rendered page.
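The watchdog's fetch-and-verify step is short. A stdlib sketch (the query parameter name and check set are illustrative — my real checks are stricter):

```python
import time
import urllib.request

def cache_busted(url):
    """Append a unique query parameter so caches can't serve a stale copy."""
    sep = "&" if "?" in url else "?"
    return f"{url}{sep}nocache={int(time.time())}"

def fetch_live(url, timeout=30):
    """Fetch what a real visitor sees — the rendered page, not the API's view."""
    req = urllib.request.Request(cache_busted(url),
                                 headers={"Cache-Control": "no-cache"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", "replace")

def verify_html(html, expected_title, min_words=1400):
    """Integrity checks against the raw HTML (splitting on whitespace is a
    crude word-count proxy, but good enough to catch a cached empty draft)."""
    checks = {"title": expected_title in html,
              "word_count": len(html.split()) >= min_words,
              "image": "<img" in html}
    return all(checks.values()), checks
```

On failure, the per-check dict goes straight into the Telegram alert so I can see which check tripped without opening the site.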

Mistake 3: Using file existence as a checkpoint signal

Covered above, but worth repeating: checking whether draft.txt exists is not a reliable way to know if the draft phase succeeded. Files can be half-written. They can be zero bytes. They can be left over from a previous failed run with a different slug. The state file is the checkpoint. The state file records success explicitly. Use it.

Mistake 4: Not separating research cron from deployment cron

My first version was a single cron that did everything. When it failed at step 8 of 15, I had no idea what had run. Splitting into three separate crons with explicit handoffs transformed debugging from archaeology into reading a log file.


Core Lessons (The Short Version)

1. State files on your filesystem, not in WordPress meta

2. Verify the live rendered page, never trust the API response

3. File existence ≠ phase completion — use explicit status tracking

4. Split into separate crons with state handoffs — never one monolith

Could You Build This Yourself?

Yes. Technically, everything I've described is buildable with open tools — the WordPress REST API is documented, cron scheduling is basic Linux infrastructure, and the AI APIs are accessible to anyone. The knowledge required isn't secret.

But here's the honest answer: it took me about three weeks of iteration to get this pipeline stable enough to trust overnight. Not because any individual piece is hard, but because the integration points are where everything breaks. The state file pattern took me five failed iterations to design correctly. The application password create-use-delete pattern came from a security audit I did after I realized I had standing credentials that didn't need to be standing. The watchdog logic grew from real failures I caught — or didn't catch in time.

If you're starting from zero with AI automation, you're not starting at this pipeline. You're starting with understanding how to give an AI agent a task and get reliable, structured output back. You're learning how state management works, how to design for failure, how to separate phases so that any one of them can fail without corrupting the whole.

The foundational skills take time to build. But they do compound.


The Bridge: Where OpenClaw Cracked Fits

This pipeline took weeks to build and debug. I run OpenClaw self-hosted from the command line — but OpenClaw Cracked is for people who want the same capability without that technical setup burden.

OpenClaw Cracked is a hosted deployment platform with a dashboard called Claw Launcher: click deploy, paste your API key, and your agent is live in about 30 seconds. Zero terminal, zero command line, zero infrastructure wrestling. On top of deployment, it includes 4 business-building skills, a 3-day live workshop, and an installation guarantee.

Pricing is $27 one-time plus $15/month hosting after a 14-day free trial. The value is instant deployment plus business execution — not a course about learning terminal commands.


Final Thoughts

An automated content pipeline is not magic. It's engineering — careful, boring, iterative engineering. State files and error handlers and watchdog crons are not exciting. They're the difference between a demo and a production system.

What I have now is a production system. Six articles a week, 15 minutes of my time per article, consistent quality, fully auditable through state files and watchdog logs. I know exactly what ran, when it ran, and whether it succeeded.

If you're building something similar — or thinking about starting — feel free to adapt anything here. The architecture isn't proprietary. The only secret ingredient was the iteration time.

→ Deploy OpenClaw in 30 Seconds — See OpenClaw Cracked

Affiliate link — I earn a commission if you purchase. No extra cost to you.

