#ai-agents

10 posts

Privacy-Routed LLM Inference: Keeping Sensitive Data Out of the Cloud

How to build a routing layer for AI agents that ensures sensitive data stays on local hardware while leveraging cloud LLMs for non-private tasks.

Cognitive Memory for Agents: Vector Search vs Activation-Based Recall

Comparing vector databases and activation-based memory for AI agents. Trade-offs in latency, scale, and interpretability.

Three-Layer Safety for Autonomous Agents: Stopping the Infinite Loop

Moving beyond prompt engineering to implement token-level schema enforcement, pre-execution gates, and shell-safe execution pipelines for AI agents.

Self-Improving AI Infrastructure: How Your Homelab Wiki Updates Itself

How to automate your homelab wiki with self-improving AI infrastructure

The 6-Layer Memory Architecture I Run for Claude Code

Open-sourcing the memory system behind my Claude Code setup: CLAUDE.md, path-scoped rules, wiki, vector search, cognitive memory. With the mistakes.

Building Karpathy's LLM Wiki: A Production Homelab Implementation

Implementing Karpathy's LLM Wiki in a homelab with real-world lessons and gotchas

Agent Credential Management: Two-Tier Service Accounts for Secure AI Agent Workflows

Managing agent credentials with two-tier service accounts: a secure approach for AI agent orchestration

NVIDIA Container Toolkit: Why the Default Runtime Matters

Fixing default runtime misconfigurations in NVIDIA Container Toolkit for GPU workloads

Building MCP Servers with FastMCP: Stop Writing Boilerplate, Start Writing Tools

FastMCP makes building Model Context Protocol servers feel like FastAPI. Here's how to go from zero to a working MCP server in under an hour.

Multi-Agent AI Systems: Architecture Patterns That Actually Work

A practical guide to designing multi-agent AI systems — orchestrator patterns, trust boundaries, and the tradeoffs I learned running agents in production.
