DataAcuity by The Geek Network

DataAcuity — Architecture Overview

System-level picture of everything on .106, how it connects to the rest of The Geek Network, and the architectural patterns it follows. Read this before going deeper into any of the individual service docs.


1. The four servers (where things live)

The Geek Network production runs on four servers; this doc focuses on .106 but everything connects:

┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│  .104 PRIMARY   │  │  .105 STANDBY   │  │  .106 DATAACUITY│  │  .118 LB / EDGE │
│  ─────────────  │  │  ─────────────  │  │  MAPS / BI      │  │  ─────────────  │
│  Windows Server │  │  Windows Server │  │  Ubuntu 24.04   │  │  Windows Server │
│  IIS + 36 APIs  │  │  IIS + 36 APIs  │  │  Docker (54 c.) │  │  IIS + ARR      │
│  PostgreSQL R/W │  │  PostgreSQL R/O │  │  PostgreSQL,    │  │  Let's Encrypt  │
│   (primary)     │  │   (replica)     │  │   Valhalla,     │  │  ButlerAPI      │
│                 │  │                 │  │   GeoGlobal,    │  │  CircleVpnAPI   │
│  All TGN app    │  │  Read replica   │  │   dbt, Superset │  │                 │
│  data lives     │  │  for ETL +      │  │   Grafana, etc. │  │  Traffic split  │
│  here           │  │  failover       │  │                 │  │  + SSL term     │
└─────────────────┘  └─────────────────┘  └─────────────────┘  └─────────────────┘
   197.97.200.104      197.97.200.105      197.97.200.106      197.97.200.118

.106 is a different beast than the other three. The other three run the TGN ecosystem (apps + APIs + LB). .106 runs a data/intelligence stack that consumes TGN data, enriches it, and surfaces it back as BI + map services.

2. .106 — at a glance

54 Docker containers across 8 logical groups:

┌────────────────────────────────────────────────────────────────────────┐
│  .106 — DataAcuity / Maps / BI                                         │
│  ──────────────────────────                                            │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │ 1. MAPS STACK (DataAcuity Maps product)                         │  │
│  │    maps_api, maps_db, maps_nominatim, maps_osrm, maps_tiles,    │  │
│  │    maps_prerender, maps_redis, dataacuity_portal                │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │ 2. GEO-GLOBAL STACK (new — geocode/POI/routing)                 │  │
│  │    geo_mcp (MCP server :5026), valhalla (:5027),                │  │
│  │    geo_db (PostGIS, 13.4M places + 3.5M POIs)                   │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │ 3. APP-SUPPORT APIS (subset of TGN APIs that live here)         │  │
│  │    tagme_api (:5023), transit_api (:5030)                       │  │
│  │    (most TGN APIs live on .104/.105; these two are exceptions)  │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │ 4. DATA WAREHOUSE + BI (the heart of the BI strategy)           │  │
│  │    data_warehouse (postgres :5001), dbt_transform               │  │
│  │    superset (:5003) + superset_db                               │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │ 5. MARKETS / FINANCIAL DATA                                     │  │
│  │    markets_api (:8000), markets_db, markets_dashboard,          │  │
│  │    markets_openbb_backend, markets_redis                        │  │
│  │    ⚠️ STATUS: scaffolding exists, live data ETL is broken      │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │ 6. WORKFLOW / AUTOMATION                                        │  │
│  │    n8n (:5008) — actively used (55 MB sqlite, content unknown) │  │
│  │    automatisch (:5004) — empty, 0 flows                         │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │ 7. AUTH / IDENTITY / GATEWAY                                    │  │
│  │    keycloak (:8180) + keycloak_db                               │  │
│  │    api-gateway-external (:8084), api-gateway-internal (:8083),  │  │
│  │    gateway-db, gateway-redis, api-docs (:8082)                  │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │ 8. AI / TOOLING / OBSERVABILITY                                 │  │
│  │    ai_brain_webui (Open WebUI :5000), morph_convertx (:5011),   │  │
│  │    sandbox_webstudio (:5012), sandbox_developer_ide (:5013),    │  │
│  │    twenty_crm (:5005), bio_onelink (:5009),                     │  │
│  │    dashboard-backend (:5007), markets_dashboard (:5010),        │  │
│  │    grafana (:5015), prometheus (:9090), loki (:3100),           │  │
│  │    promtail, alertmanager (:9093), cadvisor (:8081),            │  │
│  │    node-exporter (:9100), nginx-exporter, redis-exporters       │  │
│  └─────────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────────┘

3. Networks (Docker)

The 54 containers sit on 13 Docker networks. The most important ones:

| Network | What lives there | Notes |
|---|---|---|
| data-warehouse_data_stack | Most of the .106 stack (geo_db, geo_mcp, valhalla, maps_api, dbt_transform, markets_*, superset, grafana, prometheus, loki, n8n, automatisch, twenty, bio, ...) | Service-name DNS works, so geo_mcp reaches valhalla by hostname |
| maps_maps_network | Maps-specific subset (maps_api, maps_db, etc.) | Older / legacy partition; some maps services dual-homed |
| dataacuity_network | dataacuity_portal only | Public portal isolation |
| ai-brain_data_stack | ai_brain_webui only | LLM workload isolation |
| keycloak_network | keycloak + keycloak_db | Auth isolation |
| monitoring | grafana + prometheus + loki + alertmanager + exporters | Observability isolation |

Container-to-container traffic is unencrypted but stays on Docker bridge networks, so it never leaves the host over the public NIC. Anything published with -p HOST:CONTAINER binds to all host interfaces by default (0.0.0.0), which includes the public NIC; see §5.
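One way to spot containers whose ports are published on every interface is to scan the PORTS column that `docker ps` prints. The sketch below is a minimal parser under stated assumptions: the function name `public_bindings` is hypothetical, it expects single-port entries such as `0.0.0.0:5001->5432/tcp`, and it skips port ranges rather than expanding them.

```python
import re

# One published binding as printed by `docker ps`, for example
# "0.0.0.0:5001->5432/tcp" or ":::5001->5432/tcp".
_BINDING = re.compile(
    r"^(?P<host>.+):(?P<host_port>\d+)->(?P<container_port>\d+)/(?P<proto>\w+)$"
)

# Host addresses that mean "listen on every interface".
_WILDCARDS = {"0.0.0.0", "::", "[::]"}


def public_bindings(ports_column: str) -> list[int]:
    """Return the host ports bound to all interfaces in a PORTS value."""
    found = set()
    for entry in ports_column.split(","):
        match = _BINDING.match(entry.strip())
        if match is None:
            # Exposed-only entries such as "5432/tcp" have no host binding.
            continue
        if match.group("host") in _WILDCARDS:
            found.add(int(match.group("host_port")))
    return sorted(found)
```

A binding to `127.0.0.1` is deliberately not reported, since it is reachable only from the host itself.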

4. Data flow — three patterns

4.1 App-driven (user-initiated, real-time)

User (mobile/web)
    → .118 (IIS + ARR: SSL termination, routing)
        → .104 or .105 (TGN APIs)
            → app's postgres DB on .104
            → on map-related calls: out to .106:5020 maps_api → geo_mcp → geo_db / valhalla
            → on B!/Butler calls: → maps_api or geo_mcp via MCP

4.2 ETL (scheduled, batch)

.104 postgres (primary, R/W) — write traffic from apps
    ⇣ streaming replication ⇣
.105 postgres (standby, R/O)
    ⇣ extract job pulls (15 min cadence for events, hourly for dims) ⇣
.106 data_warehouse (raw.*) — landing zone
    ⇣ dbt transformations (staging → intermediate → marts → analytics) ⇣
.106 data_warehouse (marts.*) — business-ready
    ⇣ push-back via postgres_fdw or logical replication ⇣
.104 analytics_db — per-app aggregates, app reads locally for low-latency in-app analytics

4.3 Enrichment loop (cross-service, real-time during ETL)

.106 dbt (during a transformation)
    → .106 geo_mcp (reverse_geocode, interesting_nearby — enrich location events with city/POI context)
    → .106 valhalla (route distance/duration — enrich trip events with travel context)
    → external FX provider (when markets_api is repaired or via fallback)

5. Public exposure — what's reachable from the internet

.106 currently publishes 35+ ports directly to 0.0.0.0 (raw port mapping, no reverse proxy). This is documented in detail in DataAcuity_Security_Posture.md. The summary:

  • Intended public (with their own auth): keycloak (8180), grafana (5015), superset (5003), api-gateway-external (8084), dataacuity_portal (5006)
  • ⚠️ Public but should be gateway-only (currently bypassable): geo_mcp (5026), valhalla (5027), maps_api (5020), markets_api (8000), n8n (5008), automatisch (5004), twenty_crm (5005), morph_convertx (5011), ai_brain_webui (5000), bio_onelink (5009), dashboard-backend (5007)
  • 🚨 Critical leaks (must be internal-only): data_warehouse postgres (5001), maps_db postgres (5433), loki logs (3100), cadvisor (8081), all the prometheus exporters (9100, 9113, 9121-3, 9188)
  • 🟢 Already internal-only (good): geo_db, gateway-db, gateway-redis, keycloak_db, markets_db (internal port), maps_osrm, maps_prerender, maps_redis, superset_db, twenty_db, twenty_redis, automatisch_db, automatisch_redis, bio_db

See DataAcuity_Security_Posture.md for the hardening plan.
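The summary above can be turned into a small lookup for audit scripts. This is a sketch, not the hardening tooling itself: the port numbers are copied from the list above, "9121-3" is read as 9121, 9122 and 9123, and the tier labels and function names are hypothetical.

```python
# Port numbers come from the summary above; tier names are illustrative.
INTENDED_PUBLIC = {8180, 5015, 5003, 8084, 5006}
GATEWAY_ONLY = {5026, 5027, 5020, 8000, 5008, 5004, 5005, 5011, 5000, 5009, 5007}
# "9121-3" in the summary is interpreted as the three ports 9121, 9122, 9123.
MUST_BE_INTERNAL = {5001, 5433, 3100, 8081, 9100, 9113, 9121, 9122, 9123, 9188}


def exposure_tier(port: int) -> str:
    """Classify a published host port by how it should be exposed."""
    if port in MUST_BE_INTERNAL:
        return "critical"
    if port in GATEWAY_ONLY:
        return "gateway_only"
    if port in INTENDED_PUBLIC:
        return "intended_public"
    return "unlisted"


def audit(published_ports) -> dict[str, list[int]]:
    """Group published ports by tier, each list sorted ascending."""
    report = {"critical": [], "gateway_only": [], "intended_public": [], "unlisted": []}
    for port in sorted(set(published_ports)):
        report[exposure_tier(port)].append(port)
    return report
```

Anything that lands in "unlisted" is a port this document does not mention and is worth checking against DataAcuity_Security_Posture.md.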

6. Storage layout

| Path on .106 | What | Approximate size |
|---|---|---|
| /var/lib/docker/volumes/ | All Docker volumes (DBs, caches) | ~50 GB |
| /home/geektrading/valhalla/tiles/ | Africa Valhalla tiles + tile.tar | 8.4 GB |
| /home/geektrading/geo-mcp/ | geo_mcp source code | ~50 KB |
| /home/geektrading/dbt/ | dbt project (DataAcuity DBT Project) | ~10 MB |
| /home/geektrading/api-gateway/ | API gateway config | small |
| /home/geektrading/suite/traefik/ | Traefik config + ACME certs | small |
| /home/geektrading/suite/keycloak/ | Keycloak config | small |
| /home/geektrading/monitoring/ | Grafana/Prometheus/Loki config | small |
| /home/geektrading/backups/ | Nightly DB dumps | ~10 GB rolling |
| /home/geektrading/airbyte/ | Airbyte (status: empty placeholder?) | TBD |

Disk: 352 GB total, ~270 GB used as of 2026-05-28, ~80 GB free.

7. Service catalog — at a glance

What every container does in one line:

| Container | Purpose | Status |
|---|---|---|
| geo_db | PostGIS, 13.4M places + 3.5M POIs | 🟢 LIVE |
| geo_mcp | MCP server v0.4, 6 geo tools | 🟢 LIVE (⚠️ no auth) |
| valhalla | Africa turn-by-turn routing | 🟢 LIVE (⚠️ no auth) |
| maps_api | DataAcuity Historical Maps API | 🟢 LIVE (⚠️ no auth) |
| maps_db | PostGIS for maps_api | 🟢 LIVE (🚨 public port) |
| maps_nominatim | OSM Nominatim geocoder | 🟢 LIVE |
| maps_osrm | OSRM router, but no data loaded | 🟡 EMPTY |
| maps_tiles | Vector/raster tile server | 🟢 LIVE |
| maps_prerender | SEO prerender for maps pages | 🟢 LIVE |
| maps_redis | Cache for maps_api | 🟢 LIVE |
| tagme_api | TagMe backend | 🟢 LIVE (but should be on .104/.105 per pattern) |
| transit_api | Transit routing | 🟢 LIVE |
| data_warehouse | Postgres, BI warehouse | 🟢 LIVE (🚨 public port 5001) |
| dbt_transform | dbt CLI for transformations | 🟢 LIVE (cron-driven) |
| superset | BI dashboards | 🟢 LIVE |
| superset_db | Postgres for Superset | 🟢 LIVE |
| markets_api | Markets/financial REST API | 🟡 SCAFFOLDING (no data) |
| markets_db | Postgres for markets | 🟡 PARTIAL |
| markets_dashboard | Markets UI | 🟡 PARTIAL |
| markets_openbb_backend | OpenBB backend | 🔴 DEAD (port 8080 timeout) |
| markets_redis | Cache for markets | 🟢 LIVE |
| n8n | Workflow automation | 🟢 LIVE (content audit pending) |
| automatisch | Workflow automation (alternate) | 🟡 EMPTY |
| automatisch_db / automatisch_redis | Backing for automatisch | 🟡 |
| keycloak | Identity / SSO | 🟢 LIVE |
| keycloak_db | Postgres for keycloak | 🟢 LIVE |
| api-gateway-external | External API gateway | 🟢 LIVE |
| api-gateway-internal | Internal API gateway | 🟢 LIVE |
| gateway-db / gateway-redis | Backing for gateway | 🟢 LIVE |
| api-docs | API documentation portal | 🟢 LIVE |
| dataacuity_portal | Public DataAcuity portal | 🟢 LIVE |
| dashboard-backend | DataAcuity dashboard backend | 🟢 LIVE |
| ai_brain_webui | Open WebUI (local LLM frontend) | 🟢 LIVE |
| twenty_crm | Open-source CRM | 🟢 LIVE |
| twenty_db / twenty_redis | Backing for twenty | 🟢 LIVE |
| bio_onelink | Link-in-bio service | 🟢 LIVE |
| bio_db | Postgres for bio | 🟢 LIVE |
| sandbox_webstudio | Web dev sandbox | 🟢 LIVE |
| sandbox_developer_ide | IDE sandbox | 🟢 LIVE |
| morph_convertx | File conversion (ConvertX) | 🟢 LIVE |
| grafana | Dashboards (ops) | 🟢 LIVE |
| prometheus | Metrics scraping | 🟢 LIVE |
| loki | Log aggregation | 🟢 LIVE (🚨 public port 3100) |
| promtail | Log shipper | 🟢 LIVE |
| alertmanager | Alert routing | 🟢 LIVE |
| cadvisor | Container metrics | 🟢 LIVE (🚨 public port 8081) |
| node-exporter | Host metrics | 🟢 LIVE (🚨 public port 9100) |
| nginx-exporter | nginx metrics | 🟢 LIVE (🚨 public port 9113) |
| redis-exporter-* | Redis metrics (3 instances) | 🟢 LIVE (🚨 public ports) |
| postgres-exporter-* | Postgres metrics (2 instances) | 🟢 LIVE (🚨 public ports) |
| frosty_cray (gone) | Previous valhalla build container | |

8. Architectural patterns we follow (or should)

8.1 Layered data

raw → staging → intermediate → marts → analytics (see DataAcuity_BI_Pipeline.md §8). Never skip a layer.
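The "never skip a layer" rule can be checked mechanically over a model dependency map. The sketch below assumes models are named `layer.model` and that a model may read from its own layer or the layer directly beneath it; both assumptions, the model names in the example, and the function name `layer_violations` are illustrative rather than taken from the dbt project.

```python
# Layer order as given in the pattern above.
LAYERS = ["raw", "staging", "intermediate", "marts", "analytics"]


def layer_violations(dependencies: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (model, upstream) pairs that skip a layer or point the wrong way.

    `dependencies` maps "layer.model" to the "layer.model" names it reads from.
    """
    rank = {name: index for index, name in enumerate(LAYERS)}
    violations = []
    for model, upstreams in dependencies.items():
        model_rank = rank[model.split(".", 1)[0]]
        for upstream in upstreams:
            upstream_rank = rank[upstream.split(".", 1)[0]]
            # Allowed: same layer, or exactly one layer below.
            if upstream_rank not in (model_rank, model_rank - 1):
                violations.append((model, upstream))
    return violations
```

For example, a hypothetical `marts.trips` that reads straight from `raw.events` would be reported, while `staging.events` reading `raw.events` would not.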

8.2 Read from replicas, write to primary

ETL reads only from .105. Pushes back to .104 go through analytics_db (a separate DB on .104 with its own write user). Never read from or write to .104 user DBs directly from .106.
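As a sketch of how a job on .106 might guard its own connections, the rule can be reduced to a single predicate. The host addresses come from the server diagram in §1; the function name and the strict reading of the rule (reads only on .105, writes only to analytics_db on .104) are assumptions.

```python
PRIMARY = "197.97.200.104"     # .104, PostgreSQL R/W primary
STANDBY = "197.97.200.105"     # .105, PostgreSQL read replica
PUSH_BACK_DB = "analytics_db"  # the only .104 database that .106 may write to


def is_allowed_from_106(host: str, database: str, mode: str) -> bool:
    """Return True if a connection from .106 follows the replica/primary rule."""
    if mode not in ("read", "write"):
        raise ValueError(f"mode must be 'read' or 'write', got {mode!r}")
    if host == STANDBY:
        # The replica is read-only; ETL extracts come from here.
        return mode == "read"
    if host == PRIMARY:
        # Only push-back writes into analytics_db are permitted on the primary.
        return mode == "write" and database == PUSH_BACK_DB
    return False
```

Calling this before opening a connection lets a misconfigured extract fail fast instead of silently reading from the primary.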

8.3 Service-name DNS inside Docker

Containers reach each other by name (geo_mcp → http://valhalla:8002). The host network IP is for external traffic only. Anything internal that uses localhost:5026 instead of geo_mcp:8000 is a smell.
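A quick lint for this smell is to scan service settings for URLs that point at the loopback interface. The sketch below uses only the standard library; the setting names in the usage note and the function name `loopback_urls` are hypothetical.

```python
from urllib.parse import urlsplit

# Hostnames that resolve to the container's own loopback interface.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def loopback_urls(settings: dict[str, str]) -> list[str]:
    """Return the names of settings whose URL targets a loopback host."""
    flagged = []
    for name, url in settings.items():
        host = urlsplit(url).hostname
        if host in LOOPBACK_HOSTS:
            flagged.append(name)
    return flagged
```

Given `{"GEO_MCP_URL": "http://localhost:5026", "VALHALLA_URL": "http://valhalla:8002"}`, only `GEO_MCP_URL` is flagged; the fix is to point it at `http://geo_mcp:8000`.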

8.4 Gateway > raw ports (target state)

Public traffic should hit Traefik and be routed to the appropriate container internally. The current state, with 35+ raw ports published (see §5), is a transient dev convenience and a security gap (see DataAcuity_Security_Posture.md).

8.5 SSO via Keycloak (target state)

Anything that an internal user touches (Superset, Grafana, n8n, twenty_crm, ai_brain_webui, dashboard-backend) should authenticate against Keycloak. Today most of these use their own user/password.

8.6 PII gate at intermediate.* (BI pipeline)

The intermediate.* layer in the warehouse is the only place where PII is allowed to go from "present" to "absent". Every model in this layer has compliance review. Every downstream model (marts.*, analytics.*, ml.*) has a PII-absence dbt test that fails the build if anything leaks.
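The real check is a dbt test, which is not shown here. As a language-neutral illustration of the rule, the sketch below scans column names of downstream models against a denylist. The denylist entries, the `schema.model` naming and the function name `pii_leaks` are all assumptions; a production test would need a list agreed with compliance.

```python
# Hypothetical denylist; the real one would come from compliance review.
PII_COLUMNS = {"email", "phone", "full_name", "id_number"}

# Schemas downstream of the PII gate, as named in the pattern above.
GATED_SCHEMAS = {"marts", "analytics", "ml"}


def pii_leaks(columns_by_model: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (model, column) pairs where a gated model exposes a PII column."""
    leaks = []
    for model, columns in columns_by_model.items():
        schema = model.split(".", 1)[0]
        if schema not in GATED_SCHEMAS:
            # intermediate.* and earlier layers may still carry PII.
            continue
        for column in columns:
            if column.lower() in PII_COLUMNS:
                leaks.append((model, column))
    return leaks
```

A non-empty result is the condition that should fail the build.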

8.7 K-anonymity ≥ 5 on user aggregates

Any aggregated user-level mart (per-city, per-app, per-day) is checked for k ≥ 5 by dbt test. Suppress or generalise below that.
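The suppression half of this rule is simple to express in code. The sketch below is a minimal Python version, assuming each aggregate row is a dict with a distinct-user count under a hypothetical `users` key; the dbt test itself is not shown in this document.

```python
K_MIN = 5  # minimum number of distinct users per published group


def split_by_k(rows: list[dict], count_key: str = "users", k: int = K_MIN):
    """Split aggregate rows into (publishable, suppressed) by group size."""
    publishable = [row for row in rows if row[count_key] >= k]
    suppressed = [row for row in rows if row[count_key] < k]
    return publishable, suppressed
```

Suppressed rows can then either be dropped or rolled up into a coarser group (for example, from city to region) and re-checked against the same threshold.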

9. Tech stack

| Layer | Choice | Notes |
|---|---|---|
| OS | Ubuntu 24.04 LTS | Standardised |
| Containerisation | Docker + docker-compose | Per-stack compose files in /home/geektrading/<stack>/ |
| Reverse proxy | Traefik (deployed, partially wired) | Target-state owner of all public TLS |
| Identity | Keycloak | Realm: master + project-specific |
| DBs | PostgreSQL 14/15/16, PostGIS | Per-service version pinning is OK |
| Data warehouse | PostgreSQL 15 | Schema-based separation, dbt-managed |
| Transformations | dbt Core 1.7 (postgres adapter) | CLI in dbt_transform container |
| BI | Apache Superset (latest) | Wired to data_warehouse via SQLAlchemy |
| Workflow | n8n (active), Automatisch (empty) | Plus dbt cron for batch |
| Routing | Valhalla 3.5.1 (Africa tiles, gis-ops image) | OSRM is also installed but has no data loaded |
| Geocoding | Custom geo_mcp + Nominatim (legacy) | geo_mcp is the new primary |
| Tile serving | tileserver-gl (maptiler) | For DataAcuity Maps frontend |
| LLM access | Open WebUI + Ollama | For local model access from B! |
| Monitoring | Prometheus + Grafana + Loki + Alertmanager | Standard observability stack |
| Log shipping | Promtail | To Loki |
| Container introspection | cAdvisor | For Grafana dashboards |
| CRM | Twenty (open-source) | B2B pipeline |
| File conversion | ConvertX (morph_convertx) | |

10. What this folder does NOT cover

For these, look elsewhere in the repo:

| Topic | Where |
|---|---|
| TGN app code | code/Apps/*/ |
| TGN API code | code/APIs/*/ |
| TGN deployment scripts | Deployment/*.ps1 |
| Banking compliance rules | .claude-memory/banking-compliance-rules.md |
| PgBouncer pattern | .claude-memory/deploy-pattern-pgbouncer-cascade.md |
| Server credentials | Deployment/deployment-credentials.ps1 |
| TGN service registry | code/Config/TheGeekNetworkServices.json |
| App design guidelines | docs/DESIGN_GUIDELINES.md |
| Wolverine healing | docs/WOLVERINE_HEALING_PIPELINE.md |

11. Cross-references within this folder

DataAcuity_README.md              ← you read this to find docs
DataAcuity_Architecture_Overview.md  ← you are here (the picture)
DataAcuity_Security_Posture.md       ← what's exposed, how we fix it
DataAcuity_BI_Pipeline.md            ← the BI data pipeline design
GeoGlobal_README.md                  ← GeoGlobal service overview
GeoGlobal_API_Reference.md           ← every endpoint
GeoGlobal_Integration_Guide.md       ← C#/Blazor patterns
GeoGlobal_Data_Schema.md             ← geo_db schema
GeoGlobal_Deployment.md              ← GeoGlobal ops runbook