📋 Portfolio Walkthrough · AI Defense Strategy

From Problem
to Architecture

A guided tour through the complete AI scraper defense portfolio: six interconnected documents that take you from understanding the threat to a fully specified, implementation-ready technical design.

How to Read This Portfolio
6 Documents, One Strategic Arc

Each document builds on the previous. Start at the problem narrative and end at the detailed technical design for the defense solution.

1
Problem Framing
Understanding the Threat
The narrative problem statement establishes the business context using a Prohibition-era bootlegger analogy. It answers: what is happening, why does it matter commercially, and who is being hurt? Uses Australian industry case studies (anonymised) to make the threat tangible.
2
Solution Strategy
The Defense Playbook
Maps the Prohibition-era "Untouchables" doctrine to a 5-layer enterprise defense stack. Presents the market landscape, identifies three investable whitespace opportunities, and outlines a 4-phase, 12-month enterprise roadmap. This is the strategy brief for decision-makers.
3
Current State · Architecture
The Undefended Platform
An SVG architecture diagram of a typical enterprise eCommerce platform with animated red attack-flow lines showing exactly how scrapers traverse each layer: CDN → Load Balancer → Web App → API → Database. Six attack vector panels detail each vulnerability with severity ratings.
4
Target State · Architecture
The Defended Platform
The same architecture transformed with 5 color-coded defense layers. Green flows show legitimate user traffic passing normally; orange paths show sophisticated bots being rerouted to the Data Poison Engine; teal lines show the IP watermarking applied to all content. Includes a Bot Fate Decision Matrix showing what percentage of bots each layer stops.
5
Current State · Detailed Design
The Attacker's Playbook
A forensic 6-phase attack pipeline showing exactly how an AI scraper operates: Target Discovery → Browser Impersonation → JS Execution → LLM Semantic Extraction → Batch Crawl → Monetisation. Each phase includes real API code samples (Firecrawl, Playwright) and a business impact panel. It ends with a swimlane actor interaction map across all six phases and a key insight: the attacker's total cost for your full 40,000-SKU catalog is approximately $8.
6
Target State · Detailed Design
The Defense Stack Blueprint
The complete technical specification: a bot decision tree flowchart showing every possible traffic path; four component specification cards with real Terraform, Next.js, and Express.js code; a swimlane showing how each of the four actor types (real user, basic scraper, headless Chrome, sophisticated AI bot) experiences the defended platform; and an L5 IP watermarking spec covering provenance for both text (NLP signatures) and images (pixel-level perturbations).
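To make the idea of a bot decision tree concrete, the layered routing described in documents 4 and 6 can be sketched roughly as follows. This is a minimal illustration and not the decision tree from the portfolio itself: the signal names (`tls_fingerprint_flagged`, `human_likeness`, `touched_honeypot`), the 0.5 threshold, and the assumption that a honeypot hit is what sends a request to the Data Poison Engine are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Fate(Enum):
    BLOCKED_L1 = "blocked at L1 (TLS fingerprinting)"
    BLOCKED_L2 = "blocked at L2 (behavioral AI)"
    POISONED_L4 = "rerouted to the Data Poison Engine (L4)"
    SERVED_L5 = "served genuine, watermarked content (L5)"


@dataclass(frozen=True)
class RequestSignals:
    # All three fields are hypothetical signals invented for this sketch.
    tls_fingerprint_flagged: bool  # L1: handshake resembles a known scraper client
    human_likeness: float          # L2: 0.0 (clearly automated) to 1.0 (clearly human)
    touched_honeypot: bool         # L3: requested a trap URL hidden from real users


HUMAN_LIKENESS_THRESHOLD = 0.5  # assumed cut-off, not a value from the portfolio


def decide(signals: RequestSignals) -> Fate:
    """Walk the layers in order and return the first outcome that applies."""
    if signals.tls_fingerprint_flagged:
        return Fate.BLOCKED_L1
    if signals.human_likeness < HUMAN_LIKENESS_THRESHOLD:
        return Fate.BLOCKED_L2
    if signals.touched_honeypot:
        return Fate.POISONED_L4
    # Everything that passes still receives watermarked content.
    return Fate.SERVED_L5


# One illustrative request per actor type named in document 6.
examples = {
    "real user": RequestSignals(False, 0.9, False),
    "basic scraper": RequestSignals(True, 0.1, False),
    "headless Chrome": RequestSignals(False, 0.2, False),
    "sophisticated AI bot": RequestSignals(False, 0.8, True),
}

for actor, signals in examples.items():
    print(f"{actor}: {decide(signals).value}")
```

Checking the cheapest signal first means most automated traffic never reaches the more expensive layers, which is the general shape the layered design implies.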
Portfolio Summary
What This Delivers
📄
6
HTML Documents
Problem narrative, solution strategy, 2 architecture diagrams, and 2 detailed designs, all interlinked via shared navigation.
🛡️
5
Defense Layers
L1 TLS fingerprinting, L2 behavioral AI, L3 app obfuscation + honeypots, L4 data poisoning engine, L5 IP watermarking.
🎯
95%
Bot Block Rate
L1 blocks ~60% of bot traffic (basic scrapers). L2 blocks a further ~30% (headless Chrome bots). The remaining ~9% receive poisoned data, and the ~1% residual carries a legal watermark trail.
๐Ÿฆ
3
Market Whitespaces
Text IP Watermarking SaaS, Enterprise RAG Poisoning, and Mid-Market Bot Shield, all identified as unoccupied product opportunities.
📅
12mo
Implementation Roadmap
4-phase deployment plan from edge filtering through full IP legal provenance capability, suitable for enterprise adoption.
⚖️
∞
Legal Recourse
L5 watermarking provides court-grade cryptographic proof for any content scraped and used in LLM training or competitor products.
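The per-layer figures on the Bot Block Rate card can be tallied with a few lines of arithmetic. The sketch below assumes each percentage is a share of all bot traffic, a reading the card does not state explicitly. It only adds up the quoted shares and does not derive the 95% headline figure.

```python
# Shares of bot traffic by outcome, in whole percentage points, as quoted on the card.
# Treating them as shares of *all* bot traffic is an assumption of this sketch.
shares = {
    "blocked at L1": 60,
    "blocked at L2": 30,
    "served poisoned data": 9,
    "residual with watermark trail": 1,
}

blocked_outright = shares["blocked at L1"] + shares["blocked at L2"]
blocked_or_poisoned = blocked_outright + shares["served poisoned data"]

print(f"Blocked outright (L1 + L2): {blocked_outright}%")
print(f"Blocked or poisoned:        {blocked_or_poisoned}%")
print(f"All outcomes accounted for: {sum(shares.values())}%")
```

Under that reading the four outcomes cover all bot traffic, with the final share relying on watermark-based legal recourse rather than technical blocking.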