Architecture Lab
My lab notebook for AI diagrams, RAG flows, gateways, local models, and architecture research.
lab.index
Primary Use
Learn how the architecture is built
Core Lab
Self-hosted AI home lab on Apple Silicon
Research Focus
Agents, RAG, gateways, local models
For
Architects, engineers, developers, clients
Diagrams, field notes, resources, and interactive tools will keep landing here.
Home Lab Blueprint
Secure ingress, API gateway, agent runtime, local inference, data stores, and frontend surfaces.

Secure ingress: Cloudflare Tunnel, DNS, custom rules, and zero-open-port access into the local lab.
API gateway: FastAPI gateway with OpenAI-compatible contracts, auth, rate limits, logging, and model routing.
Agent runtime: LangGraphJS orchestration for agent state, tool calls, MCP-style extensions, and approval paths.
Local inference: Ollama and MLX for local model experiments, with fallback planning for cloud model lanes.
Data stores: PostgreSQL, Redis, and Qdrant for metadata, cache, queues, and vector search.
Frontend surfaces: Next.js interfaces that turn the lab into a usable architecture playground and learning surface.
Graph Playground
Preset inputs, node transitions, checkpoints, approval gates, retries, and state inspection.
LangGraph Run
Risk
High
Mode
Async queue recommended
Checkpoints
0
Retries
0
Execution Path
01 / 08
active
input node
state
state.intent, state.policy_scope, state.risk_level
checkpoint
pending
gate
Not reached
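A minimal sketch of the run state shown above, written in plain Python rather than the LangGraphJS API. The field names `intent`, `policy_scope`, and `risk_level` come from the state readout; the `RunState` class, the `checkpoint` and `approval_gate` helpers, and the gate values other than "not reached" are hypothetical and only illustrate how a checkpoint counter and an approval gate could move together.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunState:
    # Fields named in the playground's state readout.
    intent: str
    policy_scope: str
    risk_level: str
    # Counters and gate status mirror the run panel; value names are assumptions.
    checkpoints: int = 0
    retries: int = 0
    gate: str = "not reached"


def checkpoint(state: RunState) -> RunState:
    """Return a copy of the state with one more checkpoint recorded."""
    return replace(state, checkpoints=state.checkpoints + 1)


def approval_gate(state: RunState, approved: bool) -> RunState:
    """Assumed policy: only high-risk runs need a human decision."""
    if state.risk_level != "high":
        return replace(state, gate="skipped")
    return replace(state, gate="approved" if approved else "rejected")


start = RunState(intent="delete_records", policy_scope="admin", risk_level="high")
after_input = checkpoint(start)
gated = approval_gate(after_input, approved=False)
print(gated)
```

Keeping the state immutable means every node returns a new snapshot, which is one simple way to make checkpoints inspectable after the fact.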
Interactive Workbench
Select a system and inspect inputs, outputs, metrics, and failure modes.
Private Apple Silicon lab: agents, local models, gateway, and vector stores.
Evaluation Metrics
$0 / month
All inference, databases, and platform layers run fully on local hardware.
32GB RAM
Enables co-location of local models and transactional databases.
Zero ports open
Incoming traffic reaches the lab only through Cloudflare Tunnel, which the local daemon opens as an outbound connection.
Encrypted tunnel ingress with no exposed local ports.
Public DNS request / HTTPS request
Routed traffic to local daemon (zero open ports)
WAF + custom firewall rules enforced
Failure mode: the tunnel drops, and auto-restart restores the daemon.
Empirical Sandbox
Live-ready local model inventory, throughput, memory, and tunnel health.
Local Sandbox
Live-ready metrics for models, memory, tunnel health, and local AI infrastructure.
Source
Snapshot
Health
Local lab telemetry hook ready
Tunnel
Cloudflare
Stores
Postgres / Redis / Qdrant
Ollama / Apple Silicon
Generation
18-28 tok/s
Memory
9-12GB unified memory
Ollama local reasoning lane
Generation
10-18 tok/s
Memory
12-16GB unified memory
MLX / local router
Generation
35-55 tok/s
Memory
3-5GB unified memory
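As a rough check on the memory figures above, the sketch below sums the three lane ranges against the 32GB host listed under Evaluation Metrics. It assumes all three lanes are loaded at the same time, which the page does not state, and it ignores memory used by the operating system and the data stores. The lane names and ranges are taken from the sandbox cards; everything else is illustrative.

```python
HOST_MEMORY_GB = 32

# (low, high) unified-memory range per lane, in GB, as listed in the sandbox.
LANES = {
    "Ollama / Apple Silicon": (9, 12),
    "Ollama local reasoning lane": (12, 16),
    "MLX / local router": (3, 5),
}


def combined_range(lanes):
    """Sum the lower and upper bounds across all lanes."""
    low = sum(bounds[0] for bounds in lanes.values())
    high = sum(bounds[1] for bounds in lanes.values())
    return low, high


low, high = combined_range(LANES)
print(f"combined: {low}-{high}GB")                      # combined: 24-33GB
print(f"headroom at low end: {HOST_MEMORY_GB - low}GB")   # 8GB
print(f"headroom at high end: {HOST_MEMORY_GB - high}GB") # -1GB
```

Under that assumption the upper bounds add up to slightly more than the host has, so a setup like this would likely load lanes on demand rather than keep all three resident.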
Host
MacBook M1 Pro
32GB unified memory, local inference and data stores.
Gateway
FastAPI
OpenAI-compatible routing, auth, logging, and fallback policy.
Vector Store
Qdrant
Local semantic retrieval with cache-aware query paths.
Ingress
Cloudflare Tunnel
Egress-only tunnel path with zero open local ports.
Tunnel Log
T-00:04
Tunnel route healthy; gateway reachable through private ingress.
T-00:11
Local model lane selected for low-risk router classification.
T-00:18
Cloud fallback reserved for high-context reasoning requests.
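The last two log entries describe a routing decision, which can be sketched as a small function. This is an illustration under stated assumptions, not the gateway's actual policy: the `Request` type, the lane names, and the `HIGH_CONTEXT_TOKENS` threshold are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff for what counts as a "high-context" request.
HIGH_CONTEXT_TOKENS = 16_000


@dataclass
class Request:
    task: str            # e.g. "router_classification" or "reasoning"
    risk_level: str      # "low" or "high"
    context_tokens: int  # size of the assembled prompt


def choose_lane(req: Request) -> str:
    """Pick a model lane following the behavior described in the tunnel log."""
    # Cloud fallback is reserved for high-context reasoning requests.
    if req.context_tokens > HIGH_CONTEXT_TOKENS:
        return "cloud-fallback"
    # Low-risk router classification stays on the small local model.
    if req.task == "router_classification" and req.risk_level == "low":
        return "local-router"
    # Everything else goes to the local reasoning lane (assumed default).
    return "local-reasoning"


print(choose_lane(Request("router_classification", "low", 500)))
print(choose_lane(Request("reasoning", "low", 40_000)))
```

Checking context size first keeps the cloud lane as an explicit exception, so a request never leaves the machine unless it trips that one rule.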
Audit Estimator
Estimate token budget, latency shape, and cache/compaction savings.
Model cost, latency, cache savings, and review signal in one pass.
Primary Model
Monthly Token Budget
577.5M
75,000 monthly queries
Estimated Savings
$819
Semantic cache + context compaction
Sync P95
3.2s
Direct request-response path
Async Queue
5.4s
Queued worker path with retry control
Monthly Model API Budget
Baseline
$1,950
Optimized
$1,131
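The estimator figures above can be cross-checked with a few lines of arithmetic. The inputs are the numbers shown in the panel; the derived values (tokens per query, savings percentage, blended rate) are my own calculations and assume the baseline budget covers the full monthly token volume.

```python
MONTHLY_TOKENS = 577_500_000   # 577.5M monthly token budget
MONTHLY_QUERIES = 75_000
BASELINE_USD = 1_950
OPTIMIZED_USD = 1_131

# Average tokens consumed per query.
tokens_per_query = MONTHLY_TOKENS / MONTHLY_QUERIES

# Savings attributed to semantic cache + context compaction.
savings_usd = BASELINE_USD - OPTIMIZED_USD
savings_pct = savings_usd / BASELINE_USD * 100

# Blended baseline price per million tokens (assumes one flat rate).
usd_per_million_tokens = BASELINE_USD / (MONTHLY_TOKENS / 1_000_000)

print(f"tokens per query: {tokens_per_query:.0f}")          # 7700
print(f"savings: ${savings_usd} ({savings_pct:.1f}%)")      # $819 (42.0%)
print(f"blended rate: ${usd_per_million_tokens:.2f} / 1M")  # $3.38 / 1M
```

The $819 savings figure matches the difference between the baseline and optimized budgets, which works out to a 42% reduction.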
Architecture Readout
Review focus: cache, compaction, queues, reranking, and fallback routing.
Research Queue
Future notes, experiments, proof gaps, and publishable architecture decisions.
LAB-01
Publishing: Private AI platform on MacBook M1 Pro with Cloudflare, FastAPI, LangGraphJS, Ollama, MLX, PostgreSQL, Redis, and Qdrant.
Next detail to publish
Topology notes, constraints, and failure-mode checks.
LAB-02
Measuring: Source routing, context assembly, reranking, local reasoning, and grounded answer checks.
Next detail to publish
Eval cases for weak grounding, source gaps, and freshness.
LAB-03
Hardening: One API contract for local models, cloud models, tools, scripts, and future MCP clients.
Next detail to publish
Validation, auth, fallback routing, traces, and cost/privacy tradeoffs.
LAB-04
Designing: Interactive diagrams for layers, failure modes, metrics, and decisions.
Next detail to publish
Topology, sequence, data-flow, and reliability views.
LAB-05
Planned: First-party architecture notes with diagrams, code walkthroughs, and file-aware references.
Next detail to publish
MDX model, Mermaid rendering, code highlighting, SEO, and newsletter capture.
Lab Resources
Diagrams, data resources, tool reviews, playgrounds, and reusable patterns.
Topology maps, data-flow diagrams, sequence flows, failure-mode maps, and infra stack references.
Short field notes on what worked, what failed, what changed, and which tradeoffs still need proof.
Evaluation fixtures, source inventories, prompt contracts, retrieval examples, and grounding checklists.
Practical reviews of agent frameworks, vector databases, model runtimes, gateways, and observability tools.
Interactive inspectors for agent routing, retrieval assembly, gateway fallback, and architecture decisions.
Reference patterns that teams can adapt for private AI, RAG reliability, agent orchestration, and platform handoff.
Work With Me
Bring one constraint: grounding, routing, gateway design, observability, data ownership, deployment, or developer adoption.