# NoDb State Management
File-backed, Redis-cached singleton controllers for WEPPcloud run state management with distributed locking and zero-downtime serialization.
See also: AGENTS.md for coding conventions and docs/dev-notes/style-guide.md for clarity expectations.
## Overview
The NoDb module replaces traditional relational databases with a constellation of file-backed singleton objects for managing WEPPcloud run state. Each NoDb controller:
- Serializes to JSON - human-readable `.nodb` files in the working directory
- Caches in Redis - 72-hour TTL in DB 13 for instant hydration
- Distributed locking - Redis-backed locks (DB 0) prevent concurrent mutations
- Singleton per run - `getInstance(wd)` guarantees the same object across workers
- Structured telemetry - per-controller log files (`<wd>/<controller>.log`) and Redis pub/sub (DB 2)
Instead of SQL queries, developers interact with rich Python objects that expose domain-specific methods and properties. Redis provides coarse-grained locking and caching so these objects can be quickly deserialized and shared across workers and RQ tasks without conflicts.
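The singleton-plus-JSON lifecycle described above can be sketched with a minimal, stdlib-only stand-in. This is not the real `NoDbBase` (which also hydrates from Redis and handles locking); the `MiniNoDb` class, `mini.nodb` filename, and `data` attribute are hypothetical illustrations of the pattern:

```python
import json
import os
import threading

class MiniNoDb:
    """Toy sketch of a file-backed, per-working-directory singleton."""
    _instances = {}
    _guard = threading.Lock()

    def __init__(self, wd):
        self.wd = wd
        self.data = {}

    @classmethod
    def getInstance(cls, wd):
        # One object per working directory, shared by every caller in this process.
        with cls._guard:
            if wd not in cls._instances:
                inst = cls(wd)
                path = os.path.join(wd, "mini.nodb")
                if os.path.exists(path):
                    with open(path) as fp:  # hydrate from the JSON payload on disk
                        inst.data = json.load(fp)
                cls._instances[wd] = inst
            return cls._instances[wd]

    def dump(self):
        # Persist human-readable JSON alongside the run's other artifacts.
        with open(os.path.join(self.wd, "mini.nodb"), "w") as fp:
            json.dump(self.data, fp, indent=2)
```

Because state round-trips through plain JSON in the run directory, zipping the directory captures everything needed to rehydrate the controller elsewhere.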
## Why NoDb?
- Portability - Zip a run directory and move it anywhere
- Schema flexibility - Add attributes without migrations
- Developer ergonomics - Python methods instead of SQL queries
- Crash safety - Redis caching with disk fallback
- Distributed coordination - Multi-worker safe via Redis locks
Tradeoffs:
- No relational queries or foreign keys
- Lock discipline required for all mutations
- JSON payloads can grow large
- Learning curve for bespoke patterns
## NoDbBase Core Responsibilities
`wepppy/nodb/base.py` provides the `NoDbBase` superclass that every controller inherits from. Important behaviors:
- Singleton lifecycle – `NoDbBase.getInstance(wd)` guarantees a single controller per working directory, hydrating from Redis (DB 13) before touching disk.
- Distributed locking – `with controller.locked():` acquires a Redis-backed lock (DB 0), mirrors legacy hash flags, and raises `NoDbAlreadyLockedError` when re-entrancy is unsafe.
- Persistence helpers – `dump_and_unlock()` fsyncs the JSON payload, refreshes Redis cache entries, and validates the round-trip before releasing locks.
- Telemetry wiring – `_init_logging()` attaches a QueueListener fan-out to StatusMessenger, controller-scoped log files, and a console error stream; `try_redis_set_log_level()` dynamically adjusts levels via DB 15.
- Status channels – `_status_channel` resolves to `<runid>:<controller>` (pup runs routed to `runid:omni`).
- Trigger events – the `TriggerEvents` enum documents lifecycle hooks (e.g., `LANDUSE_BUILD_COMPLETE`) that mods and UI components listen for when orchestrating runs.
When extending NoDb, prefer these utilities over bespoke implementations; custom locking or logging code frequently regresses cross-worker behavior. See the module docstring in `wepppy/nodb/base.py` for deeper context and example usage.
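The fsync-then-validate behavior attributed to `dump_and_unlock()` above can be illustrated with a stdlib-only sketch. The `safe_dump` name and payload shape are hypothetical, and the real helper additionally refreshes the Redis cache and releases the distributed lock:

```python
import json
import os

def safe_dump(path, payload):
    """Write JSON durably, then re-read and compare before declaring success."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fp:
        json.dump(payload, fp, indent=2)
        fp.flush()
        os.fsync(fp.fileno())   # force bytes to stable storage before the swap
    os.replace(tmp, path)       # atomic rename on POSIX filesystems
    with open(path) as fp:      # validate the round-trip before trusting the write
        if json.load(fp) != payload:
            raise IOError("round-trip validation failed for %s" % path)
    return True
```

Writing to a temp file and atomically renaming means a crash mid-write never leaves a truncated `.nodb` file behind, which is what makes the disk fallback safe to hydrate from.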
## Lock Contention and Retry Pattern
Multi-worker tasks (RQ jobs, API workers) can collide on the same NoDb lock. When `with controller.locked():` cannot acquire the Redis lock, `NoDbAlreadyLockedError` is raised. Use a short retry loop with backoff for lightweight metadata writes, and avoid holding locks across long-running operations.
Example pattern:

```python
from wepppy.nodb.base import NoDbAlreadyLockedError
import time

max_tries = 5
for attempt in range(max_tries):
    try:
        controller = Controller.getInstance(wd)
        with controller.locked():
            controller.some_field = value
    except NoDbAlreadyLockedError:
        if attempt + 1 == max_tries:
            logger.warning("NoDb lock busy after %d retries", max_tries)
            break
        time.sleep(1.0)
    else:
        break
```
Guidelines:
- Keep lock scope minimal; do I/O outside the lock when possible.
- For optional metadata (job IDs, timestamps), log and skip after retries.
- For critical state changes, bubble the exception so the caller can fail fast.
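These guidelines can be packaged as a small helper. The `retry_locked` name, the `critical` flag, and the backoff constants are illustrative only, not part of the NoDb API; the stand-in exception keeps the sketch self-contained:

```python
import time

class AlreadyLockedError(Exception):
    """Stand-in for NoDbAlreadyLockedError so the sketch is self-contained."""

def retry_locked(mutate, max_tries=5, delay=0.1, critical=False):
    """Run mutate() under lock contention: retry with backoff, then skip or raise."""
    for attempt in range(max_tries):
        try:
            return mutate()
        except AlreadyLockedError:
            if attempt + 1 == max_tries:
                if critical:
                    raise          # critical writes fail fast for the caller
                return None        # optional metadata: log-and-skip territory
            time.sleep(delay * (attempt + 1))  # linear backoff between tries
```

Callers pass `critical=True` for state the run cannot proceed without, and leave it off for job IDs, timestamps, and other best-effort metadata.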
## Path Placeholders in Configs
NoDb configs reference large, location-specific datasets through placeholders that `config_get_path()` resolves at runtime:

- `MODS_DIR` expands to `wepppy/nodb/mods`, keeping legacy bundles inside the repo.
- `EXTENDED_MODS_DATA` points to heavy datasets that now live outside the repo. The resolver honors the `EXTENDED_MODS_DATA` environment variable, falling back to the default bind mounts (`/wc1/geodata/extended_mods_data`, `/geodata/extended_mods_data`) or the legacy `mods/locations` folder when the external volumes are unavailable.
Run the helper script `python wepppy/nodb/scripts/update_extended_mods_data.py --apply` whenever locations (Portland, Seattle, Lake Tahoe) need to be relinked to the external bundle; the script rewrites the `.cfg` files to use the placeholder consistently.