Monitor your security cameras with locally processed AI

Aug 5, 2025 • zakki

TL;DR Highlight

How to analyze CCTV in real-time with AI on edge devices — no cloud required.

Who Should Read

Developers building self-hosted surveillance systems on home servers or Raspberry Pis, or self-hosting enthusiasts who avoid cloud-based security camera services for privacy and cost reasons.

Core Mechanics

Camera footage is analyzed by AI inference on the local machine, so nothing is uploaded to the cloud.
A local LLM runtime (Ollama) is paired with a vision model (LLaVA, moondream) to analyze captured frames.
A motion detection event triggers a frame capture, and the frame is sent to the model with a natural-language query ('what do you see?').
Alert conditions are customized through prompts, e.g., 'alert if person detected' or 'alert if car in driveway'.
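
The prompt-driven alert conditions above can be sketched as a small rule table. This is a minimal illustration, not code from the original project: the rule names, the `ALERT_RULES` dictionary, and the helpers `build_prompt` and `is_positive` are all hypothetical.

```python
# Hypothetical rule table: alert name -> yes/no question for the vision model
ALERT_RULES = {
    "person": "Is there a person in this image?",
    "car_in_driveway": "Is there a car in the driveway?",
}


def build_prompt(rule_name: str) -> str:
    """Combine a rule's question with an instruction that forces a short answer."""
    question = ALERT_RULES[rule_name]
    return f"{question}\nAnswer with only YES or NO."


def is_positive(reply: str) -> bool:
    """Treat the reply as a match only if it begins with YES (case-insensitive)."""
    return reply.strip().upper().startswith("YES")


print(build_prompt("car_in_driveway"))
print(is_positive("Yes."))
print(is_positive("No, the driveway is empty."))
```

Keeping the questions in one table means a new alert condition is a one-line change, and checking only the start of the reply tolerates models that add punctuation or a short explanation after the answer.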

Evidence

The original text was not available, so there is no quantitative data; the points below are inferred from the title.
Local inference avoids network latency. Expect roughly 1-3 sec/frame on CPU only (no GPU); this is an estimate, not a measurement.
$0/month in cloud fees compared with cloud API services (hardware costs excluded).
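
Since the figures above are estimates rather than measurements, it is worth timing inference on your own hardware. The following is a minimal sketch using only the standard library; `seconds_per_call` and `fake_inference` are hypothetical names, and the stand-in function should be replaced with your real model call.

```python
import time
from typing import Callable


def seconds_per_call(fn: Callable[[], object], runs: int = 3) -> float:
    """Return the average wall-clock seconds per call of fn over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def fake_inference() -> str:
    """Stand-in for a real vision-model call; it only sleeps briefly."""
    time.sleep(0.01)
    return "NO"


avg = seconds_per_call(fake_inference, runs=3)
print(f"{avg:.3f} sec/frame")
```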

How to Apply

Set up Frigate NVR + Ollama (moondream model): motion detection → MQTT event → frame capture → Ollama API call → send results as Home Assistant notifications.
Branch prompts by scenario: 'If there is a person in this image, say YES. Otherwise, say NO.' — forcing short answers simplifies parsing.
For the cheapest setup, use a Raspberry Pi 5 or a low-power x86 mini PC with a USB camera. A Coral TPU can speed up Frigate's object detection, but it does not accelerate the Ollama vision model.
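
The event-handling step of the Frigate pipeline above might look like the sketch below. It assumes Frigate publishes JSON on the `frigate/events` MQTT topic with `type` and `after` fields, and serves snapshots at `/api/events/<id>/snapshot.jpg`; check both against the Frigate documentation for your version. `FRIGATE_URL` and `snapshot_url_for_event` are hypothetical names, and the MQTT subscription itself is left out so the example runs with the standard library alone.

```python
import json
from typing import Optional

# Assumed Frigate address; adjust host and port for your installation
FRIGATE_URL = "http://localhost:5000"


def snapshot_url_for_event(payload: bytes, wanted_label: str = "person") -> Optional[str]:
    """Return the snapshot URL for a new event with the wanted label, else None."""
    event = json.loads(payload)
    after = event.get("after") or {}
    # Only react to the first message of an event, and only for the wanted label
    if event.get("type") != "new" or after.get("label") != wanted_label:
        return None
    return f"{FRIGATE_URL}/api/events/{after['id']}/snapshot.jpg"


# Example payload trimmed to the fields this sketch reads
sample = json.dumps({
    "type": "new",
    "after": {"id": "1718000000.123456-abc123", "camera": "front_door", "label": "person"},
}).encode()

print(snapshot_url_for_event(sample))
```

In a real deployment this function would be called from the message callback of an MQTT client subscribed to `frigate/events`. The snapshot it points to would then be downloaded and passed to the vision model.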

Code Example

# Python example: Analyze camera frames with a local Ollama vision model
import ollama
import base64
from pathlib import Path

def analyze_frame(image_path: str, alert_condition: str) -> str:
    image_data = base64.b64encode(Path(image_path).read_bytes()).decode()
    
    response = ollama.chat(
        model='moondream',  # or llava:7b
        messages=[
            {
                'role': 'user',
                'content': f'{alert_condition}\nLook at the image and answer with only YES or NO.',
                'images': [image_data]
            }
        ]
    )
    return response['message']['content'].strip()

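def send_notification(message: str) -> None:
    """Placeholder: swap in your own alert channel (e.g. Home Assistant)."""
    print(message)

# Usage example
result = analyze_frame(
    '/tmp/motion_capture.jpg',
    'Is there a person in this image?'
)

# Check the start of the reply so words that merely contain "yes" don't match
if result.upper().startswith('YES'):
    send_notification('Person detected!')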

Terminology

NVR: Network Video Recorder. Software or a device that records and manages footage from multiple IP cameras in one place. The network version of a DVR.

Ollama: A tool for easily running LLMs locally. Like Docker but for AI models: one command (`ollama run llava`) and you're running AI on your machine.

Edge inference: Running AI processing on the local device near the data source rather than sending data to the cloud. Faster response, better privacy, no internet dependency.