Securing 15 Services at Once: The Security Architecture of Composable Commerce
By a Senior AWS Solutions Architect | #ComposableCommerce #Security #AWS #PCIDSS #ZeroTrust
Composable commerce creates a security surface area that most teams underestimate when they start the migration.
A monolith has one application to secure. One set of dependencies to patch. One deployment pipeline to harden. One network boundary to defend. One set of credentials to rotate.
A composable platform with 15 PBCs has 15 applications, 15 dependency trees, 15 deployment pipelines, 15 sets of API endpoints, and 15 potential entry points for an attacker — all integrated through APIs that cross network boundaries constantly. Mishandle any one of them and the blast radius can reach the others.
This isn't an argument against composable commerce. It's an argument for building security as infrastructure rather than as application code — so that the security posture scales horizontally with the architecture, rather than having to be reimplemented in each PBC.
The Shared Responsibility Model: Precise Boundaries Matter
For composable commerce on AWS, the shared responsibility model has three layers (not two), because many composable platforms use third-party SaaS PBCs alongside custom-built ones.
flowchart TD
subgraph A["Layer A — AWS Infrastructure (AWS certifies this)"]
A1["Physical datacentre security"]
A2["Hypervisor & host isolation"]
A3["Network hardware & backbone"]
A4["Managed service internals (RDS OS · Lambda runtime)"]
end
subgraph B["Layer B — Platform Configuration (You)"]
B1["VPC design · Security Groups · NACLs"]
B2["IAM roles · policies · permission boundaries"]
B3["KMS key management · Secrets Manager"]
B4["OS patching via Systems Manager"]
B5["S3 bucket policies · Config rules · GuardDuty"]
end
subgraph C["Layer C — Application Code (You + PBC Vendors)"]
C1["Custom PBC application logic"]
C2["Input validation · output encoding"]
C3["Third-party PBC vendor application code"]
C4["API auth between PBCs · data handling"]
end
A -->|"AWS PCI DSS AOC\ncovers this layer"| A
B -->|"Your team configures\nand maintains"| B
C -->|"Vendor contracts\n+ security assessments"| C
AWS's PCI DSS Attestation of Compliance covers Layer A only. When your QSA reviews your PCI assessment, Layers B and C require your own evidence. The AWS documentation makes this clear; the mistake is assuming that AWS's Layer A certification also covers Layers B and C.
Defence in Depth: Six Independent Layers
Defence in depth means no single control is the last line of defence. The goal is that breaching one layer doesn't breach the platform: each layer must independently limit the blast radius.
flowchart TD
INET["🌐 Internet"] --> L1
subgraph L1["Layer 1: AWS Shield + Route 53"]
S1["DDoS absorption L3/L4\n(free · automatic for all customers)\nRoute 53 health checks pre-route"]
end
L1 --> L2
subgraph L2["Layer 2: CloudFront + AWS WAF"]
S2["L7 attack blocking: SQLi · XSS · bad bots\nRate limiting per IP per endpoint\nGeographic restrictions"]
end
L2 --> L3
subgraph L3["Layer 3: VPC + NACLs + Security Groups"]
S3["Payment subnet: NACL firewall-isolated\nSecurity Groups: PBC-to-PBC via SG references only\nNo SSH on any production instance"]
end
L3 --> L4
subgraph L4["Layer 4: IAM + Secrets Manager"]
S4["One role per PBC · least privilege\nNo credentials in code\nAuto-rotating secrets · IMDSv2 enforced"]
end
L4 --> L5
subgraph L5["Layer 5: Data Encryption"]
S5["EBS: AES-256 KMS\nS3: SSE-KMS on customer data buckets\nRDS: KMS encryption at rest · DynamoDB: default encryption\nAll inter-service traffic: TLS 1.2+"]
end
L5 --> L6
subgraph L6["Layer 6: Audit + Detection"]
S6["CloudTrail: every API call · all regions\nVPC Flow Logs: all network traffic\nAWS Config: continuous compliance\nGuardDuty: ML threat detection"]
end
A compromised Cart PBC cannot reach the Payment database (Layer 3 blocks it). Even if it somehow did, its IAM role cannot read the Payment database credentials from Secrets Manager (Layer 4). Even if it somehow had access, payment data is encrypted with a KMS key the Cart PBC role cannot use to decrypt (Layer 5). Every attempt is logged in CloudTrail, and GuardDuty would flag the anomalous access pattern (Layer 6).
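To make the Layer 4 claim concrete, here is a minimal sketch of a per-PBC policy that only allows a role to read its own secrets. It assumes the `commerce/<pbc>/<secret>` naming convention used later in this article; the `policy_allows` helper is a deliberately simplified stand-in for real IAM evaluation (which also considers explicit denies, permission boundaries and SCPs), and the account ID and region are placeholders.

```python
import fnmatch


def secrets_policy_for_pbc(pbc_name: str) -> dict:
    """Build an IAM policy document letting one PBC read only its own secrets."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOwnSecretsOnly",
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                # Assumed naming convention: commerce/<pbc>/<secret-name>
                "Resource": f"arn:aws:secretsmanager:*:*:secret:commerce/{pbc_name}/*",
            }
        ],
    }


def policy_allows(policy: dict, action: str, resource_arn: str) -> bool:
    """Simplified illustration only: matches Allow statements by action and ARN glob."""
    for statement in policy["Statement"]:
        if (
            statement["Effect"] == "Allow"
            and statement["Action"] == action
            and fnmatch.fnmatchcase(resource_arn, statement["Resource"])
        ):
            return True
    return False  # IAM is deny-by-default


cart_policy = secrets_policy_for_pbc("cart")
payment_secret = (
    "arn:aws:secretsmanager:eu-west-1:111122223333:"
    "secret:commerce/payment/rds-credentials-AbCdEf"
)
print(policy_allows(cart_policy, "secretsmanager:GetSecretValue", payment_secret))
```

Because nothing in the Cart policy matches the Payment secret's ARN, the request falls through to the default deny.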
AWS WAF Rules for Composable Commerce
WAF sits at CloudFront and ALB, evaluating every request before it reaches your PBCs. For composable commerce, I configure three categories of rules:
AWS Managed Rules (maintained by AWS; add them to the web ACL, no rule authoring required):
AWSManagedRulesCommonRuleSet:
Broad protection against common OWASP-class attack patterns
XSS, path traversal, local and remote file inclusion
Critical for any PBC that accepts user input
AWSManagedRulesSQLiRuleSet:
Additional SQL injection patterns
Important for any PBC with a relational database backend
AWSManagedRulesKnownBadInputsRuleSet:
Blocks requests with Java deserialization exploits
Log4j exploit patterns
Spring framework attacks
Custom Rate Limiting Rules (tuned per PBC endpoint):
Rule: CheckoutRateLimit
IF path matches /api/checkout/*
AND request rate from IP > 20 per 5 minutes
THEN Block
(Human checkout flows never exceed this; bots testing cards do)
Rule: SearchRateLimit
IF path matches /api/search/*
AND request rate from IP > 100 per minute
THEN Block with 429
(Legitimate search users stay well under this; prevents catalogue scraping)
Rule: LoginBruteForce
IF path matches /api/auth/login
AND request rate from IP > 10 per 5 minutes
THEN Block until the rate falls back under the limit
(Credential stuffing protection)
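The rate-limiting rules above can be sketched as WAFv2 rule definitions. This is a minimal sketch, assuming the rule shape accepted by the WAFv2 `create_web_acl` / `update_web_acl` APIs as I understand it; the priorities and the `rate_limit_rule` helper are my own illustration, so check the field names against the current API reference before using it. Rate-based rules block while the rate stays above the limit, so a fixed block duration is not modelled here.

```python
ALLOWED_WINDOWS = (60, 120, 300, 600)  # evaluation windows WAF accepts, in seconds


def rate_limit_rule(name, priority, path_prefix, limit, window_seconds, response_code=None):
    """Build one rate-based rule dict scoped to a URI path prefix (hypothetical helper)."""
    if window_seconds not in ALLOWED_WINDOWS:
        raise ValueError(f"window_seconds must be one of {ALLOWED_WINDOWS}")
    block = {}
    if response_code is not None:
        # Optional custom status code, e.g. 429 for the search endpoint
        block = {"CustomResponse": {"ResponseCode": response_code}}
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": limit,
                "EvaluationWindowSec": window_seconds,
                "AggregateKeyType": "IP",  # count requests per source IP
                "ScopeDownStatement": {
                    "ByteMatchStatement": {
                        "SearchString": path_prefix.encode("utf-8"),
                        "FieldToMatch": {"UriPath": {}},
                        "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                        "PositionalConstraint": "STARTS_WITH",
                    }
                },
            }
        },
        "Action": {"Block": block},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


RULES = [
    rate_limit_rule("CheckoutRateLimit", 1, "/api/checkout/", limit=20, window_seconds=300),
    rate_limit_rule("SearchRateLimit", 2, "/api/search/", limit=100, window_seconds=60,
                    response_code=429),
    rate_limit_rule("LoginBruteForce", 3, "/api/auth/login", limit=10, window_seconds=300),
]
```

The resulting list would be passed as the `Rules` argument when creating or updating the web ACL.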
Geographic restrictions (if your business requires them):
Rule: GeoBlock
IF source country NOT IN [your operating countries]
THEN Block
Use case: if you only sell in EU + US + AU,
block all other geographies to reduce attack surface
WAF logging to S3 via Kinesis Firehose enables analysis of blocked requests. Regularly reviewing WAF logs reveals attack patterns and allows tuning rules to avoid false positives (legitimate customers blocked) and false negatives (attacks passing through).
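As a starting point for that review, the sketch below tallies blocked requests by terminating rule and by client IP. It assumes newline-delimited JSON log records with the `action`, `terminatingRuleId` and `httpRequest.clientIp` fields that WAF logs carry; how you download the files from S3 is left out, and `summarise_blocks` is a hypothetical helper name.

```python
import json
from collections import Counter


def summarise_blocks(log_lines):
    """Count BLOCK decisions per terminating rule and per client IP."""
    by_rule = Counter()
    by_ip = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get("action") != "BLOCK":
            continue
        by_rule[record.get("terminatingRuleId", "unknown")] += 1
        by_ip[record.get("httpRequest", {}).get("clientIp", "unknown")] += 1
    return by_rule, by_ip
```

A rule that dominates the block count for IPs that also complete purchases is a candidate false positive; a noisy IP that never trips any rule is a candidate false negative.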
The Payment PBC: PCI Isolation Pattern
If your composable platform processes card data directly (rather than tokenising via Stripe/Braintree and taking card data out of your PCI scope), the Payment PBC requires the highest level of network isolation.
Network topology for PCI-isolated Payment PBC:
flowchart TB
subgraph PCI["🔒 PCI Isolated — Payment Subnet 10.0.10.0/24"]
PAY["💳 Payment PBC"]
KMS["🔑 KMS CMK\n(payment-specific key)"]
SM["🔐 Secrets Manager\npayment-gateway-credentials\nauto-rotated every 90 days"]
end
subgraph RULES["Strict Network Controls"]
NACL["NACL:\nInbound: TCP 8443 from Checkout Subnet ONLY\nInbound: TCP 443 from Payment Processor IP\nAll other inbound: DENY\nAll non-response outbound: DENY"]
RT["Route Table:\nNo 0.0.0.0/0 default route\n(no internet access at all)\nPatches via SSM PrivateLink only"]
end
CHK["🛒 Checkout PBC\n(App Subnet 10.0.3.0/24)"] -->|"TCP 8443 only\nvia CheckoutPBC-SG"| PAY
PAY -->|"TCP 443 only"| PROC["💰 Payment Processor\n(external)"]
PAY --> KMS & SM
PAY --> RULES
BREACH["❌ Compromised Cart/Catalogue PBC\nattempts to reach Payment subnet"]
BREACH -. "BLOCKED by NACL\nbefore packet reaches PBC" .-> PAY
style PCI fill:#2a0a0a,color:#fff
style BREACH fill:#3a0a0a,color:#fff
Even if an attacker compromises every other PBC in the platform, the Payment subnet is unreachable. The NACL blocks all inbound traffic except from the Checkout subnet on the specific port and from the payment processor's address. The Security Group provides a second independent layer of the same control. The KMS key ensures payment data at rest is readable only by the Payment PBC. This is PCI defence-in-depth.
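The NACL behaviour in the diagram can be modelled in a few lines: rules are evaluated in ascending rule-number order, the first match wins, and anything unmatched hits the implicit deny. This is a simplified sketch for reasoning about the design, not a replacement for testing the real NACL; the rule numbers, the processor address (203.0.113.10) and the Cart address (10.0.4.17) are assumptions for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class NaclRule:
    number: int   # lower numbers are evaluated first
    cidr: str     # source CIDR the rule applies to
    port: int     # destination TCP port
    allow: bool


INBOUND_RULES = [
    NaclRule(100, "10.0.3.0/24", 8443, True),     # Checkout subnet, from the diagram
    NaclRule(110, "203.0.113.10/32", 443, True),  # hypothetical payment processor IP
]


def inbound_allowed(rules, source_ip: str, dest_port: int) -> bool:
    """Return the decision of the first matching rule; default is deny."""
    source = ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if source in ip_network(rule.cidr) and dest_port == rule.port:
            return rule.allow
    return False  # the implicit final rule denies everything else


print(inbound_allowed(INBOUND_RULES, "10.0.3.25", 8443))  # Checkout PBC
print(inbound_allowed(INBOUND_RULES, "10.0.4.17", 8443))  # compromised Cart PBC
```

Because NACLs are stateless, a real configuration also needs matching outbound rules for response traffic, which this model leaves out.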
The better approach: tokenisation. If you integrate with Stripe, Braintree, or Adyen, card data never touches your infrastructure. The customer's browser sends card data directly to the processor via the processor's SDK, and your PBC receives a token and stores only that. Your PCI scope drops dramatically because you're handling tokens, not card numbers. The processor handles PCI compliance for card data; you handle compliance for everything else.
EC2 Instance Hardening: IMDSv2 and No SSH
Two EC2 security controls that should be non-negotiable in any composable platform:
IMDSv2 Enforcement: The Instance Metadata Service (IMDS) provides temporary credentials for IAM roles to EC2 instances. IMDSv1 is vulnerable to Server-Side Request Forgery (SSRF) attacks: if a web application has an SSRF vulnerability, an attacker can use it to call the IMDS endpoint and steal the instance's IAM role credentials.
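IMDSv2 requires a session token for all IMDS requests. The token must be obtained with a PUT request carrying a custom header, which most SSRF vulnerabilities cannot issue because they only let the attacker control the URL of a GET request. Enforcing IMDSv2 closes off this class of attack in the common case: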
// In every Launch Template — non-negotiable production requirement
{
"MetadataOptions": {
"HttpTokens": "required", // IMDSv2 only — reject v1 requests
"HttpPutResponseHopLimit": 1 // Prevent metadata access from containers
}
}
Back this up with the AWS Config managed rule ec2-imdsv2-check, which flags any instance that is not enforcing IMDSv2.
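For readers who have not seen the IMDSv2 exchange, here is a minimal sketch of the two-step flow using only the standard library. The endpoint address and header names are the documented IMDS ones; the helper function names are mine, and `fetch_metadata` only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT with a TTL header. A URL-only SSRF typically cannot send this."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: a GET that must present the session token from step 1."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Run both steps against the live endpoint (EC2 instances only)."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as response:
        token = response.read().decode()
    with urllib.request.urlopen(build_metadata_request(path, token), timeout=timeout) as response:
        return response.read().decode()
```

With `HttpTokens` set to `required`, a request that skips step 1 is rejected, which is what defeats the classic "fetch this URL for me" SSRF.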
No SSH in Production: Direct SSH access to production instances creates audit complexity (who accessed what, when, what did they do?), requires managing SSH keys, and opens port 22 as an attack surface. The modern alternative is AWS Systems Manager Session Manager:
- No port 22 open in any security group
- No SSH keys to manage or rotate
- Session activity recorded in CloudTrail automatically
- Access controlled by IAM (the engineer's role must have ssm:StartSession permission)
- Session content logged to CloudWatch Logs and S3 for audit purposes, once logging is enabled in the Session Manager preferences
For a composable platform with 15 PBC teams, eliminating SSH means eliminating 15 sets of SSH key management, 15 open port-22 rules, and 15 potential jump points into your production environment.
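A quick way to verify the "no port 22" rule across all 15 PBCs is to scan security group definitions. The sketch below works on the `SecurityGroups` list returned by the EC2 `describe_security_groups` call; fetching that list (for example via boto3) is left out so the logic stays testable offline, and `groups_exposing_ssh` is a hypothetical helper name.

```python
def groups_exposing_ssh(security_groups):
    """Return the IDs of security groups with any inbound rule covering TCP 22."""
    exposed = []
    for group in security_groups:
        for permission in group.get("IpPermissions", []):
            protocol = permission.get("IpProtocol")
            if protocol == "-1":
                covers_ssh = True  # "-1" means all protocols and all ports
            elif protocol == "tcp":
                covers_ssh = permission.get("FromPort", 0) <= 22 <= permission.get("ToPort", 65535)
            else:
                covers_ssh = False
            if covers_ssh:
                exposed.append(group["GroupId"])
                break  # one offending rule is enough to flag the group
    return exposed
```

The check flags a group regardless of the rule's source, on the assumption that the policy here is no SSH at all rather than SSH from trusted ranges.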
Secrets Management at Platform Scale
A composable platform with 15 PBCs might have 60–100 secrets: database passwords, API keys, OAuth tokens, webhook secrets, encryption passwords. Managing these manually is operationally unsustainable and insecure.
AWS Secrets Manager handles the full lifecycle:
# Fetch secrets at runtime with a short-lived cache; never cache indefinitely
import json
import time

import boto3
import psycopg2

_secret_cache = {}
_CACHE_TTL = 300  # Refresh secrets every 5 minutes


def get_secret(secret_name: str) -> dict:
    """Return the parsed secret, refetching once the cached copy exceeds the TTL."""
    now = time.time()
    cached = _secret_cache.get(secret_name)
    if cached and (now - cached['fetched_at']) < _CACHE_TTL:
        return cached['value']
    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_name)
    value = json.loads(response['SecretString'])
    _secret_cache[secret_name] = {'value': value, 'fetched_at': now}
    return value


# Usage: credentials are at most 5 minutes stale after a rotation
db_creds = get_secret('commerce/checkout/rds-credentials')
connection = psycopg2.connect(
    host=db_creds['host'],
    user=db_creds['username'],
    password=db_creds['password'],  # Auto-rotated every 30 days by Secrets Manager
    database=db_creds['dbname']
)
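When Secrets Manager rotates the RDS password (automated, every 30 days), it sets the new password in RDS and then promotes the new secret version in Secrets Manager. The two updates are sequential rather than atomic, so a PBC holding a cached value can hit an authentication failure on new connections for up to the cache TTL (5 minutes). Handle that by refetching the secret on an authentication error, or use the alternating-users rotation strategy so the previous credentials stay valid during the changeover. With either in place: no downtime, no manual rotation, and a full audit trail in CloudTrail of every access.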
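One way to cover the window between a rotation and the next cache refresh is to drop the cached secret and retry once when the database rejects the credentials. The sketch below keeps the fetch and connect steps injectable so it runs without AWS; `SecretCache` and `connect_with_refresh` are hypothetical names, and with psycopg2 the `auth_error` argument would, I believe, be `psycopg2.OperationalError`.

```python
import time


class SecretCache:
    """TTL cache around any fetch(name) -> dict callable, e.g. a Secrets Manager lookup."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.time):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # name -> (value, fetched_at)

    def get(self, name):
        entry = self._entries.get(name)
        if entry and self._clock() - entry[1] < self._ttl:
            return entry[0]
        value = self._fetch(name)
        self._entries[name] = (value, self._clock())
        return value

    def invalidate(self, name):
        self._entries.pop(name, None)


def connect_with_refresh(cache, secret_name, connect, auth_error):
    """Connect with cached credentials; on an auth failure, refetch and retry once."""
    try:
        return connect(cache.get(secret_name))
    except auth_error:
        cache.invalidate(secret_name)  # the cached password was probably rotated away
        return connect(cache.get(secret_name))
```

A single retry is deliberate: if the freshly fetched credentials also fail, the problem is not a stale cache and the error should surface.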
CloudTrail: The Audit Log Every Composable Platform Needs
In a composable platform where 15 teams make changes across hundreds of AWS resources, CloudTrail is the indispensable record of "who did what to which resource at what time."
Configure CloudTrail for maximum coverage:
- All regions (not just your primary region — attackers don't limit themselves to your primary)
- Management events (IAM changes, VPC changes, security group modifications)
- Data events for sensitive resources (S3 GetObject on invoice buckets, DynamoDB GetItem on payment tables)
- Log file integrity validation enabled (detect if logs are tampered with post-incident)
- Delivery to S3 with Object Lock (WORM policy — logs cannot be deleted or modified)
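The checklist above maps onto two CloudTrail API calls. This is a minimal sketch, assuming the parameter names of `create_trail` and `put_event_selectors` as I know them; the builder functions are my own illustration and only assemble the arguments, so the same dicts can be reviewed or tested before being passed to boto3. Object Lock is a property of the destination bucket and is configured there, not on the trail.

```python
def trail_settings(trail_name: str, log_bucket: str) -> dict:
    """Keyword arguments for cloudtrail.create_trail(**settings)."""
    return {
        "Name": trail_name,
        "S3BucketName": log_bucket,          # bucket should have Object Lock enabled
        "IsMultiRegionTrail": True,          # all regions, not just the primary
        "IncludeGlobalServiceEvents": True,  # IAM and other global services
        "EnableLogFileValidation": True,     # digest files to detect tampering
    }


def event_selectors(trail_name: str, s3_bucket_arns=(), dynamodb_table_arns=()) -> dict:
    """Keyword arguments for cloudtrail.put_event_selectors(**selectors)."""
    data_resources = []
    if s3_bucket_arns:
        # A trailing slash selects every object in the bucket
        data_resources.append(
            {"Type": "AWS::S3::Object", "Values": [f"{arn}/" for arn in s3_bucket_arns]}
        )
    if dynamodb_table_arns:
        data_resources.append(
            {"Type": "AWS::DynamoDB::Table", "Values": list(dynamodb_table_arns)}
        )
    return {
        "TrailName": trail_name,
        "EventSelectors": [
            {
                "ReadWriteType": "All",
                "IncludeManagementEvents": True,
                "DataResources": data_resources,
            }
        ],
    }
```

Data events are billed per event, so limiting them to the invoice buckets and payment tables keeps cost proportional to the sensitive surface.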
For incident response after a suspected breach:
-- Query CloudTrail logs in Athena to investigate
-- "What did this compromised IAM role do in the last 24 hours?"
SELECT
eventtime,
eventsource,
eventname,
awsregion,
sourceipaddress,
json_extract_scalar(requestparameters, '$.bucketName') as s3_bucket,
json_extract_scalar(requestparameters, '$.key') as s3_key
FROM cloudtrail_logs
WHERE useridentity.arn LIKE '%CheckoutPBCRole%'
AND from_iso8601_timestamp(eventtime) > date_add('hour', -24, now())
AND eventname NOT IN ('AssumeRole', 'GetCallerIdentity')
ORDER BY eventtime DESC;
This query runs against the S3-stored CloudTrail logs via Athena, typically in seconds when the table is partitioned by date. In a real incident, knowing exactly what a compromised role accessed in the 24 hours before discovery is the difference between a contained incident report and a full breach notification.
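During an incident you will want to rerun this query for different roles and time windows without hand-editing SQL. The sketch below builds the query text with basic input validation; it assumes the `cloudtrail_logs` table name from the query above and an `eventtime` column stored as an ISO 8601 string, and `role_activity_query` is a hypothetical helper. Submitting the text to Athena is left out.

```python
import re

_ROLE_NAME = re.compile(r"[A-Za-z0-9_+=,.@-]{1,64}")  # characters IAM permits in role names
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def role_activity_query(role_name: str, hours: int = 24, table: str = "cloudtrail_logs") -> str:
    """Return Athena SQL listing what a role did in the last `hours` hours."""
    if not _ROLE_NAME.fullmatch(role_name):
        raise ValueError("role_name contains characters not valid in an IAM role name")
    if not isinstance(hours, int) or hours <= 0:
        raise ValueError("hours must be a positive integer")
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError("table must be a plain identifier")
    return (
        "SELECT eventtime, eventsource, eventname, awsregion, sourceipaddress\n"
        f"FROM {table}\n"
        f"WHERE useridentity.arn LIKE '%{role_name}%'\n"
        f"  AND from_iso8601_timestamp(eventtime) > date_add('hour', -{hours}, now())\n"
        "  AND eventname NOT IN ('AssumeRole', 'GetCallerIdentity')\n"
        "ORDER BY eventtime DESC"
    )
```

Validating the inputs matters even for an internal tool: incident-response scripts tend to get pasted role names from tickets and chat, and the query string is interpolated rather than parameterised.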
The Security Posture Scales With the Architecture
The key insight for composable commerce security: implement controls at the infrastructure layer, not the application layer. IAM permissions enforced by AWS regardless of application code. Network controls enforced by the hypervisor. Encryption enforced at storage. Audit logging enforced by CloudTrail.
When security lives in infrastructure, it scales with your composable architecture. Add a 16th PBC and it automatically inherits the VPC controls, the CloudTrail audit, the Config compliance rules, and the WAF protections — before a single line of application-layer security code is written.
That's the security dividend of treating AWS as a platform, not just a cloud provider.
Next: Compliance — what AWS certifications cover, what they don't, and how to build the compliance posture your enterprise composable commerce platform requires.
💬 What's the hardest security boundary to maintain in your composable platform — the network layer, the IAM layer, or the application layer? In my experience the answer is usually IAM.
#Security #AWS #ComposableCommerce #PCIDSS #WAF #ZeroTrust #CloudSecurity #MACH #SolutionsArchitect #IAM