GeoGlobal — Deployment & Operations
Where the service lives, how it's wired, and how to keep it healthy.
For app consumers: you don't need this. See GeoGlobal_README.md and GeoGlobal_Integration_Guide.md.
For ops, SRE, on-call: read this top to bottom.
Topology
All GeoGlobal containers run on .106 — DataAcuity Maps server (197.97.200.106, Ubuntu 24.04.2 LTS).
              ┌────────────────────────────────┐
              │ maps.dataacuity.co.za (public) │
              │ Traefik / nginx                │
              └───────────────┬────────────────┘
                              │
                              ▼
          ┌────────────────────────────────────────┐
          │ maps_api (FastAPI :5020)               │
          │ - existing geocode/route endpoints     │
          │ - NEW: /api/v2/* proxies to geo_mcp    │
          │   and valhalla                         │
          └────────────┬──────────────┬────────────┘
                       │              │
             ┌─────────┘              └──────────┐
             ▼                                   ▼
┌──────────────────────────┐   ┌────────────────────────────────────┐
│ geo_mcp (FastMCP SSE     │   │ valhalla (gis-ops 3.5.1            │
│   :8000 internal / :5026 │   │   :8002 internal / :5027 external) │
│   external)              │   │ - 38,822 Africa tiles              │
│ - 6 MCP tools            │   │ - 8.3 GB tile.tar mounted from     │
│ - reads geo_db           │   │   /home/geektrading/valhalla/      │
│ - calls valhalla for     │   │   tiles                            │
│   route / quest          │   └────────────────────────────────────┘
└────────────┬─────────────┘
             │
             ▼
┌──────────────────────────┐
│ geo_db (PostGIS 16       │
│   :5432 internal)        │
│ - geonames 13.4 M        │
│ - interesting_locations  │
│   3.5 M                  │
└──────────────────────────┘
All four containers involved (geo_db, geo_mcp, valhalla, and maps_api, which hosts the proxy endpoints) sit on the data-warehouse_data_stack Docker network. Container names resolve via Docker DNS, so geo_mcp reaches valhalla by hostname (http://valhalla:8002).
Container inventory
| Container | Image | Internal port | Host port | Network | Restart policy |
|---|---|---|---|---|---|
| geo_db | postgis/postgis:16-3.4-alpine | 5432 | (none; internal only) | data-warehouse_data_stack | unless-stopped |
| geo_mcp | geo-mcp:0.4 (locally built) | 8000 | 5026 | data-warehouse_data_stack | unless-stopped |
| valhalla | ghcr.io/gis-ops/docker-valhalla/valhalla:latest | 8002 | 5027 | data-warehouse_data_stack | unless-stopped |
Environment variables
geo_mcp
| Var | Default | Notes |
|---|---|---|
| DB_DSN | host=geo_db user=geo password=geoG10balInit2026 dbname=geoglobal | Connection string to geo_db |
| VALHALLA_URL | http://valhalla:8002 | In-cluster URL of the routing container |
| MCP_TRANSPORT | sse | Either stdio (for CLI invocation) or sse (HTTP) |
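As an illustration of how the three variables above could be consumed, here is a minimal Python sketch. It is not taken from server.py: the `GeoMcpSettings` class and `load_settings` function are hypothetical names, and the sketch assumes DB_DSN is always supplied explicitly rather than falling back to a default that embeds the password.

```python
import os
from dataclasses import dataclass

# The two transports named in the table above.
ALLOWED_TRANSPORTS = {"stdio", "sse"}


@dataclass(frozen=True)
class GeoMcpSettings:
    """Hypothetical container for the documented geo_mcp variables."""
    db_dsn: str
    valhalla_url: str
    transport: str


def load_settings(env=None) -> GeoMcpSettings:
    """Read DB_DSN, VALHALLA_URL and MCP_TRANSPORT from a mapping (default: os.environ)."""
    env = os.environ if env is None else env

    # Assumption: require DB_DSN so no password has to live in source code.
    db_dsn = env.get("DB_DSN")
    if not db_dsn:
        raise ValueError("DB_DSN is not set")

    # Documented defaults: in-cluster Valhalla URL and SSE transport.
    valhalla_url = env.get("VALHALLA_URL", "http://valhalla:8002").rstrip("/")
    transport = env.get("MCP_TRANSPORT", "sse").strip().lower()
    if transport not in ALLOWED_TRANSPORTS:
        raise ValueError(
            f"MCP_TRANSPORT must be one of {sorted(ALLOWED_TRANSPORTS)}, got {transport!r}"
        )
    return GeoMcpSettings(db_dsn=db_dsn, valhalla_url=valhalla_url, transport=transport)
```

Failing fast on an unknown transport or a missing DSN surfaces a misconfigured container in its first log lines instead of at the first query.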
valhalla
| Var | Default | Notes |
|---|---|---|
| serve_tiles | True | Must be True for the serve container |
| server_threads | 2 | Increase if CPU headroom (server has 4 cores) |
| use_tiles_ignore_pbf | True | Skip rebuilding; consume the prebuilt tile.tar |
| build_tar | False | Don't rewrite the tar on each restart |
| force_rebuild | False | Don't trigger a rebuild even if config changes |
maps_api (the existing FastAPI proxy — /api/v2/* endpoints to be added)
| Var | Default | Notes |
|---|---|---|
| GEO_MCP_URL | http://geo_mcp:8000 | NEW: for /api/v2/* proxies |
| VALHALLA_URL | http://valhalla:8002 | NEW: for /api/v2/route |
| DATABASE_URL | (existing: maps_db) | Unchanged |
| REDIS_URL | (existing: maps_redis) | Unchanged |
| ENVIRONMENT | production | |
| Rate limit | 60/min/IP via slowapi | Bump via env if needed |
Volumes / persistence
| Path on host | Mount point | Contents | Backup |
|---|---|---|---|
| /home/geektrading/valhalla/tiles/ | /custom_files on valhalla | valhalla_tiles.tar (8.3 GB) + admin_data/ (99 MB) + valhalla.json (7.5 KB) | NO (rebuildable in ~5 h) |
| /home/geektrading/geo-mcp/ | /app on geo_mcp (via docker cp) | server.py (v0.4), Dockerfile, helper scripts | Source is in this repo; backups in git |
| Postgres volume geo_db_data | /var/lib/postgresql/data on geo_db | 8.4 GB DB | YES (nightly via dataacuity_backup_job) |
Health checks
Quick green-light checklist (30 seconds)
# From .106 host
docker ps --filter name='^(geo_db|geo_mcp|valhalla)$' --format 'table {{.Names}}\t{{.Status}}'
# All three should be "Up <duration>"
# geo_db responds
docker exec geo_db psql -U geo -d geoglobal -c "SELECT COUNT(*) FROM geonames;"
# expected: 13434746 (or close)
# geo_mcp serves SSE
curl -sS --max-time 5 -o /dev/null -w "%{http_code}\n" -H "Accept: text/event-stream" http://localhost:5026/sse
# expected: 200 (the SSE stream never ends on its own, so --max-time cuts it off; curl's timeout error after the 200 is normal)
# valhalla answers /status
curl -sS http://localhost:5027/status | head -1
# expected: {"version":"3.5.1", ...}
# end-to-end Africa route
curl -sS -X POST http://localhost:5027/route -H 'Content-Type: application/json' \
-d '{"locations":[{"lat":-33.9249,"lon":18.4241},{"lat":-26.2041,"lon":28.0473}],"costing":"auto","units":"km"}' \
| python3 -c "import sys,json; d=json.load(sys.stdin); print('OK', d['trip']['summary']['length'], 'km')"
# expected: OK 1399.x km
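The end-to-end route check above can also be scripted, for instance for a monitoring probe. The following is a sketch only: the function names are hypothetical, and it assumes the host port 5027 and the trip.summary.length field used in the curl example.

```python
import json
import urllib.request

# Same coordinate pair as the curl example above.
ROUTE_REQUEST = {
    "locations": [
        {"lat": -33.9249, "lon": 18.4241},
        {"lat": -26.2041, "lon": 28.0473},
    ],
    "costing": "auto",
    "units": "km",
}


def route_length_km(response: dict) -> float:
    """Extract trip.summary.length from a Valhalla /route response."""
    try:
        return float(response["trip"]["summary"]["length"])
    except (KeyError, TypeError) as exc:
        # Error payloads (e.g. "No suitable edges") lack the trip object.
        raise ValueError(f"unexpected /route response: {response!r}") from exc


def check_route(base_url: str = "http://localhost:5027", timeout: float = 10.0) -> float:
    """POST the sample route and return its length in km."""
    request = urllib.request.Request(
        f"{base_url}/route",
        data=json.dumps(ROUTE_REQUEST).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return route_length_km(json.load(resp))
```

Calling `check_route()` on the .106 host should return a length close to the 1399 km noted above; a `ValueError` or a connection error means the check failed.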
Continuous monitoring (Grafana)
The prometheus + grafana stack on .106 already collects:
- Container up/down state via cadvisor
- HTTP response codes via nginx-exporter
- Postgres connection counts via postgres-exporter-warehouse/-markets
- Disk usage via node-exporter
TODO: add a Grafana dashboard for geo_mcp and valhalla specifically — request rates, p50/p95/p99 latencies, error rates. Currently we just have raw container metrics.
Restart procedures
geo_db — never restart casually
geo_db holds 13.4 M + 3.5 M rows. Recovery from an unclean shutdown takes ~5 s but it crash-loops if disk is full (we hit this on 2026-05-28). Always check df -h /home first.
docker restart geo_db
# Wait ~10 s
docker logs --tail 5 geo_db # look for "database system is ready to accept connections"
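The "check df -h /home first" step can be made mechanical. This is a minimal sketch, with hypothetical function names; the 20 GB floor is an assumption borrowed from the disk-free alert threshold proposed under Future ops work, not a tested limit.

```python
import shutil

GIB = 1024 ** 3


def free_gib(path: str = "/home") -> float:
    """Free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / GIB


def safe_to_restart(free: float, min_free_gib: float = 20.0) -> bool:
    """True if enough disk is free to restart geo_db without risking a crash loop.

    Assumption: 20 GiB is a conservative floor, not a measured requirement.
    """
    return free >= min_free_gib
```

For example, `safe_to_restart(free_gib("/home"))` returning False is a signal to free disk before running docker restart geo_db.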
geo_mcp — fast restart, safe anytime
docker restart geo_mcp
# ~5 s. Will reconnect to geo_db on first query.
To deploy a new server.py:
# 1. Upload new file to /home/geektrading/geo-mcp/server.py
# 2. Hot-swap into the container (don't need to rebuild the image)
docker cp /home/geektrading/geo-mcp/server.py geo_mcp:/app/server.py
docker restart geo_mcp
docker logs --tail 10 geo_mcp # confirm "v0.X starting" line
valhalla — careful, slow to load
The serve container takes ~30 s to mmap the 8.3 GB tile.tar on startup. During that window /status returns 503/connection-reset.
docker restart valhalla
# Wait 45 s, then:
curl -sS http://localhost:5027/status
Don't restart valhalla from cron or any unattended automation — coordinate with humans.
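For an attended restart, the fixed 45 s wait can be replaced by polling /status until it answers. The sketch below is one way to do that; the function names, the 90 s deadline, and the 5 s interval are assumptions, chosen to sit comfortably above the ~30 s load time.

```python
import time
import urllib.request


def status_ok(url: str = "http://localhost:5027/status", timeout: float = 3.0) -> bool:
    """Return True once /status answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # HTTP 503, connection resets and timeouts are all OSError
        # subclasses here and all mean "not ready yet".
        return False


def wait_until_ready(probe=status_ok, deadline_s: float = 90.0, interval_s: float = 5.0,
                     sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll probe() until it succeeds or deadline_s has elapsed."""
    start = clock()
    while True:
        if probe():
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(interval_s)
```

The `probe`, `sleep` and `clock` parameters exist so the loop can be exercised without a running container.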
Rebuilding Africa routing tiles
The full rebuild script is at /home/geektrading/build_africa_valhalla.sh on .106.
Phases (with timings from the 2026-05-28 build):
| Phase | Duration | Disk delta |
|---|---|---|
| 0. Stop serve container | <1 s | 0 |
| 1. Clear previous tile dir | ~5 s | -58 GB (freed) |
| 2. Download africa-latest.osm.pbf (8.4 GB) | ~15 min @ 10 MB/s | +8.4 GB |
| 3. osmium tags-filter prefilter | ~1 min | unchanged |
| 4. valhalla_build_admins | ~27 min | +99 MB (admin.sqlite) |
| 5. Parse ways → relations → nodes | ~1 h 15 min | grows then shrinks |
| 6. Build graph (33,162 tiles) | ~1 h 25 min | +40 GB |
| 7. Enhance graph (2 threads) | ~1 h 40 min | unchanged |
| 8. Validate / clean / tar | ~5 min | tile dir 40 GB → 8.4 GB |
| 9. Launch serve container | ~30 s | 0 |
| 10. Smoke tests (6 city pairs) | ~30 s | 0 |
Total wall-clock: ~4 h 50 min from script start to live serve container.
Run it as geektrading user, not root. Output goes to /tmp/build_africa_valhalla.log.
Critical: delete the prefilter PBF immediately after the parse phase completes. The script does this in Phase 5, but if you need to free disk during a build, rm /home/geektrading/valhalla/tiles/africa-latest-filt.osm.pbf is safe once "Parsing nodes..." has finished.
Capacity & limits
Server hardware (.106):
- 4 CPU cores, 23 GB RAM, 4 GB swap, 352 GB disk
- Africa Valhalla build needs ~50 GB peak disk and fits in RAM comfortably
- World Valhalla build was attempted 6 times and failed every time (RAM and disk both short). Don't try a world build on this box.
Current load (2026-05-28 idle): 38 GB / 352 GB disk used. 8 GB / 23 GB RAM used.
Burst capacity for routing: ~50 concurrent route calls before the 2-thread server saturates. For more, scale server_threads and add CPU.
Common problems and fixes
"geo_db crash loops with No space left on device"
The cause is always / filling up. Check usage with docker system df and du -sh /home/geektrading/*, reclaim space with docker image prune -a -f, and delete stale tile builds matching /home/geektrading/valhalla/tiles/*.bin. Then restart geo_db. Recovery takes a few seconds once disk is freed.
"Valhalla returns No suitable edges near location"
Two causes:
- Outside Africa. Expected. Document this in the calling app's UI.
- Coordinate is on a non-routable feature (lake, sea, private road). Snap to nearest road manually or call Valhalla's /locate endpoint to find the nearest routable edge.
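For the second cause, a /locate lookup might be scripted as follows. This is a sketch under assumptions: the function names are hypothetical, and the response fields (`edges`, `correlated_lat`, `correlated_lon`) follow the Valhalla locate API as commonly documented, so verify them against the running 3.5.1 instance before relying on them.

```python
import json
import urllib.request


def locate_request(lat, lon, base_url="http://localhost:5027", costing="auto"):
    """Build a POST request for Valhalla's /locate endpoint."""
    body = {"locations": [{"lat": lat, "lon": lon}], "costing": costing}
    return urllib.request.Request(
        f"{base_url}/locate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def nearest_routable_point(locate_response):
    """Return (lat, lon) snapped to the first candidate edge, or None.

    Assumption: the response is a list with one entry per input location,
    and each entry carries an "edges" list that may be empty or null.
    """
    if not locate_response:
        return None
    edges = locate_response[0].get("edges") or []
    if not edges:
        return None
    return edges[0]["correlated_lat"], edges[0]["correlated_lon"]
```

A caller would send `locate_request(lat, lon)` with `urllib.request.urlopen`, decode the JSON, and retry the route from the snapped point when `nearest_routable_point` returns a coordinate. A None result suggests the point is simply outside the Africa tile set.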
"geo_mcp returns {"detail":"Not Found"}"
You're hitting the wrong path. MCP traffic is SSE on /sse, not /mcp or /api. Or you're hitting geo_mcp with REST and confusing it with maps_api — the REST proxy lives in maps_api.
"Docker registry credential errors when building"
.106 Docker has a broken credential helper. Workarounds:
- Don't pull new images on .106; build elsewhere (for example on your workstation) and transfer with docker save / docker load
- For hot-fixes use docker cp into a running container instead of rebuilding
"discover_quest returns POIs but routing_error is set"
Means POIs were found but Valhalla's /optimized_route failed. Usually because the chosen POIs span outside Africa. Retry with a smaller within_km or a more selective theme.
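The retry advice above can be sketched as a loop that shrinks within_km until routing succeeds. Everything here is illustrative: `call_discover_quest` is a hypothetical wrapper around the MCP tool, and the halving factor and 5 km floor are assumptions. Only the `theme` and `within_km` parameters and the `routing_error` field come from this document.

```python
def quest_with_shrinking_radius(call_discover_quest, theme, within_km,
                                min_km=5.0, shrink=0.5, **fixed_args):
    """Call discover_quest, shrinking within_km while routing_error is set.

    Returns (result, radius_used). radius_used is None if even the
    smallest radius tried still produced a routing_error.
    """
    radius = float(within_km)
    while True:
        result = call_discover_quest(theme=theme, within_km=radius, **fixed_args)
        if not result.get("routing_error"):
            return result, radius
        next_radius = radius * shrink
        if next_radius < min_km:
            # Give up; the caller can fall back to a more selective theme.
            return result, None
        radius = next_radius
```

Passing the tool call in as a function keeps the loop independent of how the MCP client is wired, and makes it easy to test with a stub.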
Disaster recovery
If .106 dies completely:
- Spin up a new Ubuntu 24.04 box with at least 4 vCPU / 24 GB RAM / 100 GB disk (200 GB recommended for headroom)
- Install Docker + docker-compose
- Restore geo_db from the latest nightly dump at /home/geektrading/backups/geo_db/. The dump is ~2 GB compressed.
- Re-clone /home/geektrading/geo-mcp/ from this repo (the source is mirrored in code/Apps/Tools/geo-mcp/)
- Run build_africa_valhalla.sh to rebuild Valhalla tiles (~5 h)
- Re-attach DNS for maps.dataacuity.co.za
Total RTO if disk image survived: ~30 min. If we need to rebuild Valhalla from scratch: ~5 h.
Runbook quick links
- Build script: /home/geektrading/build_africa_valhalla.sh (on .106)
- MCP source: /home/geektrading/geo-mcp/server.py (on .106), mirrored in this repo
- Postgres data: /var/lib/docker/volumes/geo_db_data/ (on .106)
- Tile.tar: /home/geektrading/valhalla/tiles/valhalla_tiles.tar (on .106)
- Last build log: /tmp/build_africa_valhalla.log (on .106, ephemeral; copy elsewhere if you need to retain it)
Future ops work
- Add Grafana dashboards for GeoGlobal-specific request rates, latencies, error rates
- Set up alerts on geo_db connection count > 80, valhalla 5xx rate > 1%, disk free < 20 GB
- Build the /api/v2/* proxy endpoints in maps_api (not yet implemented; see GeoGlobal_API_Reference.md for the spec)
- Wire geo_mcp into Butler as a registered MCP source (config snippet in GeoGlobal_Integration_Guide.md section 2.7)
- Schedule monthly OSM-Africa rebuild so routing stays current with road changes
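The three alert thresholds proposed in the list above can be expressed as a small evaluation function, which may help when translating them into Grafana or Prometheus rules. This is a sketch: the `Snapshot` fields are hypothetical names, not existing metric names on .106.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical point-in-time readings; field names are illustrative."""
    geo_db_connections: int
    valhalla_requests: int
    valhalla_5xx: int
    disk_free_gb: float


def firing_alerts(s: Snapshot) -> list:
    """Return the names of the proposed alerts that would fire for this snapshot."""
    alerts = []
    if s.geo_db_connections > 80:
        alerts.append("geo_db connection count > 80")
    # Guard against division by zero when there was no traffic in the window.
    if s.valhalla_requests > 0 and s.valhalla_5xx / s.valhalla_requests > 0.01:
        alerts.append("valhalla 5xx rate > 1%")
    if s.disk_free_gb < 20:
        alerts.append("disk free < 20 GB")
    return alerts
```

All three comparisons are strict, so a reading exactly at a threshold (80 connections, a 1% error rate, 20 GB free) does not fire.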