# VM Audit — 2026-05-16

Deep audit of running services across the three VMs, cross-referenced against [master.md](master.md) and per-service setup docs. This file lists **gaps**: services that run but are undocumented, doc entries that no longer match reality, and notable cross-VM wiring not captured in the index.

## Audit method

For each VM:

- `docker ps -a` (running + exited containers)
- `systemctl list-units --type=service --state=running` (both system and `--user`)
- `ss -tlnp` (every listening TCP socket → owning PID/cmd)
- `crontab -l` (user) + `/etc/cron.d`
- Caddy / Traefik config (admin API + on-disk)

---

## pop-os (`192.168.0.178` / `100.78.69.20`)

### Missing from docs

| # | Item | What it is | Where it lives |
|---|------|------------|----------------|
| 1 | `sub2api-tunnel.service` | systemd **user** unit running `~/sub2api-tunnel.py` — a TCP forwarder `127.0.0.1:8090 → 100.108.123.60:8090`. Lets local tools hit OCI-hosted sub2api as if it were local. | `~/.config/systemd/user/sub2api-tunnel.service` + `~/sub2api-tunnel.py` |
| 2 | Hermes WhatsApp bridge | Node bridge on **`127.0.0.1:3100`** (`bridge.js --mode bot --session ~/.hermes/whatsapp/session`). Started by `hermes-gateway.service` — current docs say Hermes is "systemd" but don't list its WhatsApp bridge port. | `~/.hermes/hermes-agent/scripts/whatsapp-bridge/` |
| 3 | OpenClaw node-host port `18791` | Second listener owned by the OpenClaw gateway process (`openclaw gateway --port 18789` also opens 18791 for the node host). Docs only mention 18789. | `openclaw-gateway.service` (user) |
| 4 | `openclaw-node.service` | Separate user unit for the OpenClaw node host (v2026.4.2) — currently only `openclaw-gateway` is mentioned. | `~/.config/systemd/user/openclaw-node.service` |
| 5 | `tg-filter` container | Running container with no setup doc. Pulls a Telegram group (`-1002225681039`), filters, writes to **Neon PG** (`ep-falling-moon-a1r4bgch-pooler.ap-southeast-1.aws.neon.tech`), queues via **Upstash Redis** (`faithful-ewe-77854.upstash.io`), forwards to channel `-1003704234947`. Runs a scheduled ingest every 6 h (`INGEST_INTERVAL_SECONDS=21600`). | container env only; no compose file checked in to the repo |
| 6 | Fundamental-analysis cron | 4 weekday cron entries (UTC) in the regular user crontab — **not** under `openclaw cron`. Missing from master.md's Scheduled Jobs table. | `crontab -l` for `sze` |
| 7 | `grok-register-ui.service` | User systemd unit running `web_ui.py` on `0.0.0.0:5000`. Doc lists port 5000 but not that it's managed by systemd-user. | `~/.config/systemd/user/grok-register-ui.service` |
| 8 | `nezha-agent.service` | **systemd-user** unit, not a "process" as `master.md` says. | `~/.config/systemd/user/nezha-agent.service` |
| 9 | `next-server` on `*:3001` | `v2ray-proxy-manager` is listed, but the actual process is a Next.js server; no setup doc. | `~/apps/...` (location not chased — confirm) |

### Stale / incorrect in docs

| # | Claim in master.md | Reality |
|---|--------------------|---------|
| A | `poeReg2api` runs on host port **9000** (public) | Container exposes `8100/tcp` internally with **no host port published** (`docker port poeReg2api` is empty). Port 9000 is **not listening** on the host. The service is reachable only inside the docker network or via `docker exec`. Either re-publish to a host port or remove the row from the public port table. |
| B | "Hermes Agent / systemd" with no port | Hermes also opens `127.0.0.1:3100` for the WhatsApp bridge — add to the port map. |
| C | OpenClaw "18789 localhost" only | Add the secondary `18791` port. |
| D | Nezha agent "process" | It is a systemd-user service, same lifecycle as the others. |

### Fundamental-analysis cron entries to add

All weekdays (Mon–Fri), times in UTC:

| Schedule (UTC) | Local HKT | Script | Notes |
|----------------|-----------|--------|-------|
| `35 12 * * 1-5` | 20:35 HKT | `update_reports.sh --wa-send` | Post-NFP / CPI / PPI window (08:35 ET) |
| `5 14 * * 1-5` | 22:05 HKT | `update_reports.sh --wa-send` | ISM PMI / JOLTS (10:05 ET) |
| `5 18 * * 1-5` | 02:05 HKT (next day) | `update_reports.sh --wa-send` | FOMC window (14:05 ET) |
| `10 20 * * 1-5` | 04:10 HKT (next day) | `update_reports.sh` (no `--wa-send`) | Market close recap (16:10 ET) |

Script path: `~/.openclaw/agents/fundamental-analysis/workspace/skills/update-reports/scripts/update_reports.sh`
Log: `~/.openclaw/agents/fundamental-analysis/workspace/reports/cron.log`

---

## coolify-master (`129.146.218.222` / `100.82.177.59`)

### Missing from docs

The Coolify control plane is documented, but the **applications it has deployed** are not. Pulled from the `applications` table in `coolify-db`:

| Coolify App | FQDN(s) | Status | Target server (per `standalone_dockers`) |
|-------------|---------|--------|------------------------------------------|
| `lung-wai/1000-saas:main` | `1000saas.xyz`, `www.1000saas.xyz` (http+https) | **exited:unhealthy** | `naughty-narwhal` = instance VM (137.131.41.18) |
| `lung-wai/-cloud-flare--img-bed` | _(no fqdn)_ | **running:unhealthy** | instance VM |
| `lung-wai/-githubuilder` | `githubuilder.com`, `www.githubuilder.com` (http+https) | **exited:unhealthy** | instance VM |
| `lung-wai/new-api:main` | _(no fqdn)_ | **exited:unhealthy** | instance VM |
| `lung-wai/gemini-balance:main` | _(no fqdn)_ | **exited:unhealthy** | instance VM |

**Key fact not captured anywhere:** the Coolify *worker* lives on the instance VM, not on coolify-master. Coolify-master runs only the control plane (UI + DB + Redis + realtime + its own Traefik for the admin UI). All deployed apps land on the instance VM's `coolify-proxy` (Traefik) at 137.131.41.18.
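The inventory above came from the `applications` table; a small helper like the following could re-pull it on the next audit pass. This is a sketch only: the `coolify-db` container name comes from this audit, but the `coolify` postgres user/database names and the exact column names are assumptions — verify them against the live schema before relying on it.

```python
import subprocess

# Columns mirror the table above; confirm the names against Coolify's live schema.
QUERY = "SELECT name, fqdn, status FROM applications ORDER BY name;"


def parse_rows(psql_output: str) -> list[list[str]]:
    """Split `psql -At` output (one '|'-separated row per line) into fields."""
    return [line.split("|") for line in psql_output.splitlines() if line]


def coolify_apps(container: str = "coolify-db") -> list[list[str]]:
    """Run the query inside the coolify-db container and parse the result.

    Assumes the postgres user and database are both named "coolify".
    """
    out = subprocess.run(
        ["docker", "exec", container,
         "psql", "-U", "coolify", "-d", "coolify", "-Atc", QUERY],
        capture_output=True, text=True, check=True,
    )
    return parse_rows(out.stdout)
```

`-A` (unaligned) plus `-t` (tuples only) keeps the output machine-readable, so the same helper can diff the live inventory against the table in this doc.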
### Action items (deployment debt)

- 4 of 5 Coolify apps are `exited:unhealthy`. The two public domains (`1000saas.xyz` and `githubuilder.com`) are currently down.
- The `cloud-flare-img-bed` Coolify app is `running:unhealthy` — it overlaps with the standalone `imgbed-...` container on the instance (port 7658). Confirm which one is authoritative.

---

## instance-20250520-1933 (`137.131.41.18` / `100.108.123.60`)

### Missing from docs

| # | Item | Detail |
|---|------|--------|
| 1 | `sub2api-postgres` (postgres:18-alpine) | sub2api dependency, port 5432 internal |
| 2 | `sub2api-redis` (redis:8-alpine) | sub2api dependency, port 6379 internal |
| 3 | `coolify-proxy` role | Listed in the port map but not explained — this Traefik is the **Coolify worker proxy** that fronts deployed apps (1000saas, githubuilder, imgbed). Same image as the one on coolify-master but a different role. |
| 4 | `imgbed-w0kcco0ckog8ww0w8go0k4ko-...` is the Coolify-managed deployment | master.md treats `imgbed` as a standalone service. It is actually the `cloud-flare-img-bed` Coolify app. Document the link. |
| 5 | Staged but **not running** compose dirs | `/home/ubuntu/poeReg2api/docker-compose.yml` and `/home/ubuntu/grok2api/docker-compose.yml` exist — they appear to be migration prep from pop-os to the instance. No containers are currently running for either. Decide: finish the migration or remove them. |
| 6 | `/home/ubuntu/new-api/` exists with no compose file | Bare dir — partial setup or cleanup leftover. |

### Stale / incorrect in docs

- `master.md` Service Map lists `imgbed` with port 7658 as standalone. It is Coolify-managed and the container name is auto-generated (`imgbed-w0kcco0ckog8ww0w8go0k4ko-123824117747`).

---

## Cross-VM observations

1. **sub2api access path on pop-os goes through a tunnel** — `sub2api-tunnel.py` listens locally and forwards to the instance. Any client config on pop-os pointing at `http://localhost:8090` is actually hitting `100.108.123.60:8090`.
   Worth noting in the setup docs to avoid confusion when debugging.
2. **Two Traefik proxies, two roles** — both VMs run `traefik:v3.1` named `coolify-proxy` on `:80/:443/:8080`. The one on coolify-master fronts the Coolify UI; the one on the instance fronts deployed apps. They are independent.
3. **Nezha monitor coverage gaps** — based on the `master.md` monitor list, there is no probe for:
   - `grok-register` (port 5000)
   - `v2ray-proxy-manager` (port 3001)
   - `tg-filter` (no HTTP endpoint — needs a synthetic check)
   - `hermes-gateway` / WhatsApp bridge (`:3100`)
   - `openclaw-gateway` (`:18789`)
   - public Coolify-deployed domains (`1000saas.xyz`, `githubuilder.com`) — probes here would catch the current "exited" state immediately.
4. **Tailscale Caddy HTTPS table is correct** (verified live against the Caddy admin API at `127.0.0.1:2019`). All four reverse-proxy entries match.

---

## Proposed doc updates

1. Apply the fixes inline to `master.md`:
   - Correct the `poeReg2api` row (remove it from the public port table or fix the port).
   - Add WhatsApp bridge port `3100` and OpenClaw `18791` to the pop-os port map.
   - Add a `sub2api-tunnel` row under pop-os services.
   - Add a `tg-filter` setup-doc link (create the doc).
   - Add a "Coolify Deployed Apps" subsection under `coolify-master` listing the 5 apps and noting their target is the instance VM.
   - Move the `imgbed` row under coolify-master deployments and cross-link it from the instance section.
   - Add the 4 fundamental-analysis cron entries to the Scheduled Jobs table (or split them into a "User crontab" subsection so it's clear they are **not** managed by `openclaw cron`).
2. Create `tg-filter/tg-filter-setup.md` documenting the container's env, ingest schedule, and external dependencies (Neon, Upstash).
3. Add a Nezha probe for each gap listed in #3 above.

Once those are done, `master.md` will once again be the source of truth.
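On the `tg-filter` gap specifically: it exposes no HTTP endpoint, so the probe has to be synthetic. One option is a small script that Nezha (or cron) runs on pop-os. This is a hedged sketch, not the implementation — the container name, the use of the last log line as a freshness signal, and the one-hour slack are all assumptions to validate against how the container actually behaves:

```python
import subprocess
from datetime import datetime, timezone

INGEST_INTERVAL = 21600  # INGEST_INTERVAL_SECONDS from the container env
SLACK = 3600             # assumed tolerance before flagging a missed ingest cycle


def is_stale(last_activity_iso: str, now: datetime) -> bool:
    """True if the last observed activity is older than one ingest cycle + slack.

    Accepts docker's RFC3339 timestamps, e.g. 2026-05-16T04:00:00.123456789Z
    (only the first 19 chars are parsed, so nanosecond precision is fine).
    """
    last = datetime.fromisoformat(last_activity_iso[:19]).replace(tzinfo=timezone.utc)
    return (now - last).total_seconds() > INGEST_INTERVAL + SLACK


def check_tg_filter(name: str = "tg-filter") -> str:
    """Synthetic health check: container must be running and recently active."""
    state = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Running}}", name],
        capture_output=True, text=True,
    )
    if state.returncode != 0 or state.stdout.strip() != "true":
        return "CRIT: container not running"
    # Freshness signal (assumption): timestamp of the most recent log line,
    # which presumes the job logs something every ingest cycle.
    log = subprocess.run(
        ["docker", "logs", "--timestamps", "--tail", "1", name],
        capture_output=True, text=True,
    )
    ts = (log.stdout or log.stderr).split(" ", 1)[0]
    if not ts:
        return "WARN: no log output to judge freshness"
    if is_stale(ts, datetime.now(timezone.utc)):
        return "WARN: no activity in over one ingest cycle"
    return "OK"
```

If the container turns out to write a heartbeat file or a row timestamp in Neon, either of those would be a more direct freshness signal than log output.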