#ollama

3 posts

Privacy-Routed LLM Inference: Keeping Sensitive Data Out of the Cloud

How to build a routing layer for AI agents that ensures sensitive data stays on local hardware while leveraging cloud LLMs for non-private tasks.

Three-Layer Safety for Autonomous Agents: Stopping the Infinite Loop

Moving beyond prompt engineering to implement token-level schema enforcement, pre-execution gates, and shell-safe execution pipelines for AI agents.

Ollama on Kubernetes: Recreate Strategy and Single-GPU Deadlock

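On a single-GPU node, updating an Ollama Deployment on Kubernetes can deadlock, with the new pod waiting for a GPU the old pod still holds. Here's how the Recreate strategy avoids it.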
