AI SIM Card Chatbot — AWS Cloud Architecture

Enterprise WhatsApp-to-E-commerce Platform with AI Intelligence, powered by Amazon Bedrock & Claude

Region: ap-southeast-5 (Malaysia) Services: 25+ Date: 2026-03-31 Architecture: Multi-AZ HA
View AI Chatbot User Flow (Interactive Diagram)
FLOW 1 — PRODUCT INQUIRY FLOW 4 — ADMIN KB FLOW 3 — ESCALATION FLOW 2 & 5 — PURCHASE & POST-PURCHASE FLOW 6 — PROACTIVE MESSAGES & FAQ START 📱 Customer Sends WhatsApp message 📡 Twilio API Receives → routes to AWS Auto Reply ON? YES NO → Admin handles in portal 🧠 Claude Sonnet 4.6 Language (EN / BM / CN) 🔍 RAG Retrieval Search knowledge base 💬 AI Bot Response Natural reply in language 📡 Twilio Send Response via WhatsApp Customer action? BUY ANGRY MORE Qs ✅ AI Bot Confirms Product + price + link 🔗 Send Order Link via WhatsApp 👆 Customer Clicks Opens in mobile browser New user? Y 📝 Register N 🔐 Login 🛒 Checkout Page Product + shipping address 💳 Stripe Payment Card / FPX Payment OK? YES RETRY 📦 Order Created Status: Paid → DB ✅ Confirmation Page Order # + tracking link 📤 WA: Order Confirmed Order #, amount, est. date 🚚 SIM Card Shipped Tracking # generated 📤 WA: Shipped Tracking # + courier link ⚠ No refund 📤 WA: Out for Delivery "SIM is on the way!" ✅ Delivery Confirmed Courier confirms 📤 WA: Activation Guide Step-by-step SIM setup SIM activated? YES ✅ DONE N 🤖 Bot helps 😤 Bot Detects Sentiment → escalation 🙏 Bot Acknowledges "Connecting to human..." 🔴 Auto Reply → OFF Bot stops for this chat 🔔 Lark Notification Alert admin + chat summary 👤 Admin Reviews Opens portal, reads history 💬 Admin Resolves Chats via portal → WA 🟢 Toggle Auto Reply Admin → ON in portal 🤖 Bot Resumes AI handles again ↑ back to main 🔐 Admin Login Backend portal / Lark SSO 📂 Navigate to KB "AI Brain" section 📝 Upload / Edit / Remove PDFs, FAQs, pricing, guides promos, T&Cs ⚙️ Process & Index Chunk → embed → vector DB ✅ KB Updated Bot uses new info live 🧪 Test Bot (Optional) Verify responses correct feeds into RAG ──› 📤 System-Triggered (Automatic) 1. Order Confirmed — Stripe webhook 2. Shipped — Logistics: shipped 3. Out for Delivery — Logistics: OFD 4. Delivered + Activation Guide 5. Activation Guide — with delivery msg 💬 Customer-Initiated (Bot Handles) "Where is my order?" → e-commerce tracking "How to activate?" → resends guide "Can I refund?" → no refund if shipped "Help activating" → AI troubleshoots
External Actors & Services
💬
WhatsApp Users
Customers sending messages
📞
Twilio WhatsApp API
Message broker
💳
Stripe
Payment processing
🚚
Logistics Provider
Fulfillment & tracking
AWS Cloud
Edge & Security Services
CloudFront
CloudFront
CDN & DDoS
AWS Region (ap-southeast-5)
Amazon VPC (10.0.0.0/16)
Availability Zone 1 (ap-southeast-5a)
Public Subnet (NAT, ALB)
ALB
ALB
Load Balancer
Private Subnet (ECS, Databases)
ECS Fargate
ECS Fargate
Chatbot & API
ECS (E-commerce)
ECS (E-commerce)
Web App
Availability Zone 2 (ap-southeast-5b)
Public Subnet (NAT, ALB replica)
ALB (Standby)
ALB (Standby)
HA Replica
Private Subnet (ECS standby)
ECS (Standby)
ECS (Standby)
HA Replica
AI & Core Processing
Amazon Bedrock
Amazon Bedrock
Claude AI
API Gateway
API Gateway
Webhooks
SQS
SQS
Message Queue
EventBridge
EventBridge
Event Router
Data & Knowledge Base
DynamoDB
DynamoDB
Chat Sessions
Aurora PostgreSQL
Aurora PostgreSQL
E-Commerce DB
OpenSearch Serverless
OpenSearch Serverless
Vector DB
Amazon S3
Amazon S3
Knowledge Base
Serverless Functions & Caching
Lambda
Lambda
Webhook Processor
Lambda
Lambda
Order Processor
Lambda
Lambda
Notifier
ElastiCache Redis
ElastiCache Redis
Session Cache
Operations & Admin
SNS
SNS
Notifications
Lark Admin Bot
Lark Admin Bot
Admin Portal
Security, Encryption & Monitoring
AWS KMS
AWS KMS
Encryption Keys
Secrets Manager
Secrets Manager
Credentials
CloudWatch
CloudWatch
Monitoring
X-Ray
X-Ray
Tracing
Architecture Data Flow (12 Core Steps)
StepDescription
1 Send WhatsApp message
2 POST webhook (message received)
3 Invoke webhook processor
4 Put message in queue
5 Consume message from queue
6 Invoke Claude with RAG context (OpenSearch + S3)
7 Send reply message to customer
8 Browse e-commerce site
9 Route web traffic to ALB
10 Forward to ECS web app & query Aurora
11 POST payment webhook & create order
12 Order event triggers notifications & logistics
Architecture Deep-Dive

Click any section below to expand detailed architecture documentation.

1. AI & NLP Pipeline (Bedrock + Claude)

Claude Sonnet 4.6 Integration

The chatbot uses Amazon Bedrock to invoke Claude Sonnet 4.6, leveraging a 200K token context window for rich conversational understanding. Requests include RAG-retrieved knowledge base context, conversation history, and customer metadata.

  • Model: Claude Sonnet 4.6 via Bedrock (no container/GPU management)
  • Prompt Engineering: System prompt emphasizes product expertise, escalation detection, and compliance
  • Streaming: Response streamed back to ECS, forwarded to Twilio in real-time
  • Latency: P95 < 3 sec, P99 < 8 sec (includes RAG retrieval + AI inference)
  • Cost: Pay-per-token (input + output), ~$0.003 per message on average

RAG (Retrieval-Augmented Generation)

Before invoking Bedrock, ECS queries OpenSearch Serverless to retrieve relevant knowledge base documents. This grounds Claude's responses in actual product data and company policies.

Customer Query Vector Embedding OpenSearch Semantic Search Top-5 Results Claude Prompt
  • Indexing: Lambda watches S3 for new docs, embeds them, updates OpenSearch
  • Vector Model: text-embedding-3-large (1536 dimensions) via Bedrock
  • Search Latency: < 100 ms for 10K docs
  • Knowledge Base: Product specs, FAQ, policies, pricing, warranty info

Conversation Memory

DynamoDB stores conversation history (last 10 messages) per session. On each new message, ECS retrieves history and includes it in Claude's context.

  • Session Key: user_id + timestamp, TTL: 30 days
  • History Retrieval: DynamoDB query on GSI, < 10 ms
  • Escalation Detection: Claude identifies frustration → ECS triggers SNS alert to Lark for chat takeover
2. E-Commerce & Payment Flow

Checkout Journey

Customer browses e-commerce site (served by CloudFront + ECS), adds items to cart, clicks "Checkout".

Browse Products Add to Cart Checkout Page Stripe Payment Order Created
  • Cart Storage: ElastiCache Redis (session-based, 5-min TTL)
  • Product Data: Aurora PostgreSQL (replicates across AZs)
  • Inventory: Atomic updates with DynamoDB for high-concurrency SKUs

Stripe Payment Processing

ECS redirects customer to Stripe Checkout for PCI compliance. Stripe calls API Gateway webhook → Lambda creates order → EventBridge routes to handlers.

Stripe transaction fee 2.9% + $0.30
Avg order value $50 MYR
Daily orders 100-500 peak
Monthly Stripe fees (100 orders/day) $1,500-2,000

Order Fulfillment

EventBridge routes order.created event to Lambda handlers. One creates shipment with logistics provider; another sends proactive WhatsApp notification.

  • Logistics Integration: REST API call to courier, stores tracking ID in Aurora
  • Proactive Notifications: "Your order #12345 is on the way. Track here: [link]"
  • SLA: Order confirmation sent within 5 minutes of payment
3. Security & Compliance

Network Security

All traffic encrypted in-transit (TLS 1.3). ECS tasks run in private subnets, accessible only via ALB. NAT gateways mask outbound IPs.

  • CloudFront + AWS WAF: Blocks SQL injection, XSS, DDoS (rate limiting)
  • Security Groups: ALB → ECS (internal), ECS → RDS/DynamoDB (internal)
  • NACLs: Inbound: 443/80, Outbound: all (for API calls)
  • VPC Flow Logs: Captures rejected traffic, analyzed for anomalies

Data Protection at Rest

Encryption enabled on all storage services using AWS KMS customer-managed keys.

  • DynamoDB: KMS encryption, point-in-time recovery (35-day history)
  • Aurora: KMS encryption, automated backups (30-day retention)
  • S3: KMS encryption, versioning enabled, MFA delete protection
  • Secrets Manager: Stripe, Twilio, Lark API keys rotated every 30 days

Identity & Access Management

Least-privilege IAM roles per service. ECS pulls secrets at runtime; Lambda has granular DynamoDB/S3 permissions.

  • ECS Task Role: DynamoDB (read/write sessions), SQS (consume), S3 (read knowledge base), Bedrock (invoke)
  • Lambda Role: Aurora (read/write), SQS (delete), EventBridge (put), SNS (publish)
  • CloudTrail: Logs all API calls; retained 90 days in S3
  • MFA: Required for AWS console access (ops team only)

Compliance & Audit

Architecture supports common compliance frameworks for E-commerce & FinTech.

  • PCI DSS: Stripe handles payment data (we never see card numbers). WAF + Security Groups limit attack surface.
  • GDPR/CCPA: Customer data (email, phone) encrypted; automatic deletion after 90 days; audit logs available.
  • Data Residency: ap-southeast-5 (Malaysia) — all data stays in-region; no cross-region replication by default.
4. Scalability & High Availability

Auto-Scaling Strategy

All compute layers scale automatically based on demand. ECS tracks CPU/memory; Lambda scales on concurrency; DynamoDB scales on consumed capacity.

  • ECS Fargate: Target CPU 70%, memory 80%. Scales from 2 to 10 tasks in 60 sec.
  • SQS: Scales ECS based on queue depth. 1 task per 100 messages in queue.
  • DynamoDB: On-demand mode (recommended). Scales to 40K RCU/WCU within seconds.
  • Lambda: Concurrent execution limit: 1000 (default). Provisioned concurrency for spike handling.
  • OpenSearch: Serverless (4-40 OCUs). Auto-scales based on indexing/query load.

Multi-AZ High Availability

Critical services replicated across ap-southeast-5a & ap-southeast-5b. RPO/RTO targets met by synchronous replication.

  • ECS: 2 tasks in AZ1, 2 in AZ2. Auto-fails over if AZ1 Availability Zone goes down.
  • Aurora: 1 primary (AZ1) + 1 replica (AZ2). Synchronous replication, < 1 sec RPO.
  • ALB: Cross-AZ configured. Health checks trigger target de-registration in 30 sec.
  • RDS Auto-Failover: Database instance failure → failover to replica within 30 sec (RTO). **RPO = 0 (no data loss).**

Disaster Recovery Plan

If entire ap-southeast-5 goes down, recovery involves CloudFormation-based redeploy to ap-southeast-1 or ap-south-1.

  • Backup Regions: Aurora snapshots replicated to ap-southeast-1 (hourly). S3 cross-region replication for knowledge base.
  • RTO: 2-4 hours (CloudFormation stack creation, container pulls, database restore from snapshot)
  • RPO: 1 hour (last backup snapshot)
  • Testing: Quarterly DR drills (failed-over to standby region, validated data consistency)

Performance Optimization

P99 latencies maintained below 500 ms for 99% of requests via caching and async processing.

CloudFront cache hit ratio 85%+
ElastiCache hit ratio (session) 95%+
DynamoDB avg latency 10-20 ms
Bedrock + RAG latency (P95) 3-4 sec
AWS Cloud
AWS Region
Amazon VPC
Public Subnet
Private Subnet
Availability Zone
Data Flow
1 Flow Step Number