# VM Audit — 2026-05-16

Deep audit of running services across the three VMs, cross-referenced against [master.md](master.md) and per-service setup docs. This file lists **gaps**: services that run but are undocumented, doc entries that no longer match reality, and notable cross-VM wiring not captured in the index.

## Audit method

For each VM:

- `docker ps -a` (running + exited containers)
- `systemctl list-units --type=service --state=running` (both system and `--user`)
- `ss -tlnp` (every listening TCP socket → owning PID/cmd)
- `crontab -l` (user) + `/etc/cron.d`
- Caddy / Traefik config (admin API + on-disk)

---

## pop-os (`192.168.0.178` / `100.78.69.20`)

### Missing from docs

| # | Item | What it is | Where it lives |
|---|------|------------|----------------|
| 1 | `sub2api-tunnel.service` | systemd **user** unit running `~/sub2api-tunnel.py` — a TCP forwarder `127.0.0.1:8090 → 100.108.123.60:8090`. Lets local tools hit OCI-hosted sub2api as if it were local. | `~/.config/systemd/user/sub2api-tunnel.service` + `~/sub2api-tunnel.py` |
| 2 | Hermes WhatsApp bridge | Node bridge on **`127.0.0.1:3100`** (`bridge.js --mode bot --session ~/.hermes/whatsapp/session`). Started by `hermes-gateway.service` — current docs say Hermes is "systemd" but don't list its WhatsApp bridge port. | `~/.hermes/hermes-agent/scripts/whatsapp-bridge/` |
| 3 | OpenClaw node-host port `18791` | Second listener owned by the OpenClaw gateway process (`openclaw gateway --port 18789` also opens 18791 for the node host). Docs only mention 18789. | `openclaw-gateway.service` (user) |
| 4 | `openclaw-node.service` | Separate user unit for the OpenClaw node host (v2026.4.2) — currently only `openclaw-gateway` is mentioned. | `~/.config/systemd/user/openclaw-node.service` |
| 5 | `tg-filter` container | Running container with no setup doc. Pulls a Telegram group (`-1002225681039`), filters, writes to **Neon PG** (`ep-falling-moon-a1r4bgch-pooler.ap-southeast-1.aws.neon.tech`), queues via **Upstash Redis** (`faithful-ewe-77854.upstash.io`), forwards to channel `-1003704234947`. Runs a scheduled ingest every 6 h (`INGEST_INTERVAL_SECONDS=21600`). | container env only; no compose file checked in to the repo |
| 6 | Fundamental-analysis cron | 4 weekday cron entries (UTC) in the regular user crontab — **not** under `openclaw cron`. Missing from master.md's Scheduled Jobs table. | `crontab -l` for `sze` |
| 7 | `grok-register-ui.service` | User systemd unit running `web_ui.py` on `0.0.0.0:5000`. Doc lists port 5000 but not that it's managed by systemd-user. | `~/.config/systemd/user/grok-register-ui.service` |
| 8 | `nezha-agent.service` | **systemd-user** unit, not a "process" as `master.md` says. | `~/.config/systemd/user/nezha-agent.service` |
| 9 | `next-server` on `*:3001` | `v2ray-proxy-manager` is listed, but the actual process is a Next.js server; no setup doc. | `~/apps/...` (location not chased — confirm) |

### Stale / incorrect in docs

| # | Claim in master.md | Reality |
|---|--------------------|---------|
| A | `poeReg2api` runs on host port **9000** (public) | Container exposes `8100/tcp` internally with **no host port published** (`docker port poeReg2api` is empty). Port 9000 is **not listening** on the host. The service is reachable only inside the docker network or via `docker exec`. Either re-publish to a host port or remove the row from the public port table. |
| B | "Hermes Agent / systemd" with no port | Hermes also opens `127.0.0.1:3100` for the WhatsApp bridge — add to the port map. |
| C | OpenClaw "18789 localhost" only | Add the secondary `18791` port. |
| D | Nezha agent "process" | It is a systemd-user service, same lifecycle as the others. |

### Fundamental-analysis cron entries to add

All weekdays (Mon–Fri), times in UTC:

| Schedule (UTC) | Local HKT | Script | Notes |
|----------------|-----------|--------|-------|
| `35 12 * * 1-5` | 20:35 HKT | `update_reports.sh --wa-send` | Post-NFP / CPI / PPI window (08:35 ET) |
| `5 14 * * 1-5` | 22:05 HKT | `update_reports.sh --wa-send` | ISM PMI / JOLTS (10:05 ET) |
| `5 18 * * 1-5` | 02:05 HKT (next day) | `update_reports.sh --wa-send` | FOMC window (14:05 ET) |
| `10 20 * * 1-5` | 04:10 HKT (next day) | `update_reports.sh` (no `--wa-send`) | Market close recap (16:10 ET) |

Script path: `~/.openclaw/agents/fundamental-analysis/workspace/skills/update-reports/scripts/update_reports.sh`
Log: `~/.openclaw/agents/fundamental-analysis/workspace/reports/cron.log`

---

## coolify-master (`129.146.218.222` / `100.82.177.59`)

### Missing from docs

The Coolify control plane is documented, but the **applications it has deployed** are not. Pulled from the `applications` table in `coolify-db`:

| Coolify App | FQDN(s) | Status | Target server (per `standalone_dockers`) |
|-------------|---------|--------|------------------------------------------|
| `lung-wai/1000-saas:main` | `1000saas.xyz`, `www.1000saas.xyz` (http+https) | **exited:unhealthy** | `naughty-narwhal` = instance VM (137.131.41.18) |
| `lung-wai/-cloud-flare--img-bed` | _(no fqdn)_ | **running:unhealthy** | instance VM |
| `lung-wai/-githubuilder` | `githubuilder.com`, `www.githubuilder.com` (http+https) | **exited:unhealthy** | instance VM |
| `lung-wai/new-api:main` | _(no fqdn)_ | **exited:unhealthy** | instance VM |
| `lung-wai/gemini-balance:main` | _(no fqdn)_ | **exited:unhealthy** | instance VM |

**Key fact not captured anywhere:** the Coolify *worker* lives on the instance VM, not on coolify-master. Coolify-master runs only the control plane (UI + DB + Redis + realtime + its own Traefik for the admin UI). All deployed apps land on the instance VM's `coolify-proxy` (Traefik) at 137.131.41.18.
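The inventory above came from the `applications` table; a small helper like the following could re-pull it on the next audit pass. This is a sketch only: the `coolify-db` container name comes from this audit, but the `coolify` postgres user/database names and the exact column names are assumptions — verify them against the live schema before relying on it.

```python
import subprocess

# Columns mirror the table above; confirm the names against Coolify's live schema.
QUERY = "SELECT name, fqdn, status FROM applications ORDER BY name;"


def parse_rows(psql_output: str) -> list[list[str]]:
    """Split `psql -At` output (one '|'-separated row per line) into fields."""
    return [line.split("|") for line in psql_output.splitlines() if line]


def coolify_apps(container: str = "coolify-db") -> list[list[str]]:
    """Run the query inside the coolify-db container and parse the result.

    Assumes the postgres user and database are both named "coolify".
    """
    out = subprocess.run(
        ["docker", "exec", container,
         "psql", "-U", "coolify", "-d", "coolify", "-Atc", QUERY],
        capture_output=True, text=True, check=True,
    )
    return parse_rows(out.stdout)
```

`-A` (unaligned) plus `-t` (tuples only) keeps the output machine-readable, so the same helper can diff the live inventory against the table in this doc.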
### Action items (deployment debt)

- 4 of 5 Coolify apps are `exited:unhealthy`. The two public domains (`1000saas.xyz` and `githubuilder.com`) are currently down.
- The `cloud-flare-img-bed` Coolify app is `running:unhealthy` — it overlaps with the standalone `imgbed-...` container on the instance (port 7658). Confirm which one is authoritative.

---

## instance-20250520-1933 (`137.131.41.18` / `100.108.123.60`)

### Missing from docs

| # | Item | Detail |
|---|------|--------|
| 1 | `sub2api-postgres` (postgres:18-alpine) | sub2api dependency, port 5432 internal |
| 2 | `sub2api-redis` (redis:8-alpine) | sub2api dependency, port 6379 internal |
| 3 | `coolify-proxy` role | Listed in the port map but not explained — this Traefik is the **Coolify worker proxy** that fronts deployed apps (1000saas, githubuilder, imgbed). Same image as the one on coolify-master but a different role. |
| 4 | `imgbed-w0kcco0ckog8ww0w8go0k4ko-...` is the Coolify-managed deployment | master.md treats `imgbed` as a standalone service. It is actually the `cloud-flare-img-bed` Coolify app. Document the link. |
| 5 | Staged but **not running** compose dirs | `/home/ubuntu/poeReg2api/docker-compose.yml` and `/home/ubuntu/grok2api/docker-compose.yml` exist — they appear to be migration prep from pop-os to the instance. No containers are currently running for either. Decide: finish the migration or remove them. |
| 6 | `/home/ubuntu/new-api/` exists with no compose file | Bare dir — partial setup or cleanup leftover. |

### Stale / incorrect in docs

- `master.md` Service Map lists `imgbed` with port 7658 as standalone. It is Coolify-managed and the container name is auto-generated (`imgbed-w0kcco0ckog8ww0w8go0k4ko-123824117747`).

---

## Cross-VM observations

1. **sub2api access path on pop-os goes through a tunnel** — `sub2api-tunnel.py` listens locally and forwards to the instance. Any client config on pop-os pointing at `http://localhost:8090` is actually hitting `100.108.123.60:8090`.
   Worth noting in the setup docs to avoid confusion when debugging.
2. **Two Traefik proxies, two roles** — both VMs run `traefik:v3.1` named `coolify-proxy` on `:80/:443/:8080`. The one on coolify-master fronts the Coolify UI; the one on the instance fronts deployed apps. They are independent.
3. **Nezha monitor coverage gaps** — based on the `master.md` monitor list, there is no probe for:
   - `grok-register` (port 5000)
   - `v2ray-proxy-manager` (port 3001)
   - `tg-filter` (no HTTP endpoint — needs a synthetic check)
   - `hermes-gateway` / WhatsApp bridge (`:3100`)
   - `openclaw-gateway` (`:18789`)
   - public Coolify-deployed domains (`1000saas.xyz`, `githubuilder.com`) — probes here would catch the current "exited" state immediately.
4. **Tailscale Caddy HTTPS table is correct** (verified live against the Caddy admin API at `127.0.0.1:2019`). All four reverse-proxy entries match.

---

## Proposed doc updates

1. Apply the fixes inline to `master.md`:
   - Correct the `poeReg2api` row (remove it from the public port table or fix the port).
   - Add WhatsApp bridge port `3100` and OpenClaw `18791` to the pop-os port map.
   - Add a `sub2api-tunnel` row under pop-os services.
   - Add a `tg-filter` setup-doc link (create the doc).
   - Add a "Coolify Deployed Apps" subsection under `coolify-master` listing the 5 apps and noting their target is the instance VM.
   - Move the `imgbed` row under coolify-master deployments and cross-link it from the instance section.
   - Add the 4 fundamental-analysis cron entries to the Scheduled Jobs table (or split them into a "User crontab" subsection so it's clear they are **not** managed by `openclaw cron`).
2. Create `tg-filter/tg-filter-setup.md` documenting the container's env, ingest schedule, and external dependencies (Neon, Upstash).
3. Add a Nezha probe for each gap listed in #3 above.

Once those are done, `master.md` will once again be the source of truth.
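On the `tg-filter` gap specifically: it exposes no HTTP endpoint, so the probe has to be synthetic. One option is a small script that Nezha (or cron) runs on pop-os. This is a hedged sketch, not the implementation — the container name, the use of the last log line as a freshness signal, and the one-hour slack are all assumptions to validate against how the container actually behaves:

```python
import subprocess
from datetime import datetime, timezone

INGEST_INTERVAL = 21600  # INGEST_INTERVAL_SECONDS from the container env
SLACK = 3600             # assumed tolerance before flagging a missed ingest cycle


def is_stale(last_activity_iso: str, now: datetime) -> bool:
    """True if the last observed activity is older than one ingest cycle + slack.

    Accepts docker's RFC3339 timestamps, e.g. 2026-05-16T04:00:00.123456789Z
    (only the first 19 chars are parsed, so nanosecond precision is fine).
    """
    last = datetime.fromisoformat(last_activity_iso[:19]).replace(tzinfo=timezone.utc)
    return (now - last).total_seconds() > INGEST_INTERVAL + SLACK


def check_tg_filter(name: str = "tg-filter") -> str:
    """Synthetic health check: container must be running and recently active."""
    state = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Running}}", name],
        capture_output=True, text=True,
    )
    if state.returncode != 0 or state.stdout.strip() != "true":
        return "CRIT: container not running"
    # Freshness signal (assumption): timestamp of the most recent log line,
    # which presumes the job logs something every ingest cycle.
    log = subprocess.run(
        ["docker", "logs", "--timestamps", "--tail", "1", name],
        capture_output=True, text=True,
    )
    ts = (log.stdout or log.stderr).split(" ", 1)[0]
    if not ts:
        return "WARN: no log output to judge freshness"
    if is_stale(ts, datetime.now(timezone.utc)):
        return "WARN: no activity in over one ingest cycle"
    return "OK"
```

If the container turns out to write a heartbeat file or a row timestamp in Neon, either of those would be a more direct freshness signal than log output.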