WEPPcloud

← Back to usersum index

NoDb State Management

File-backed, Redis-cached singleton controllers for WEPPcloud run state management with distributed locking and zero-downtime serialization.

See also: AGENTS.md for coding conventions and docs/dev-notes/style-guide.md for clarity expectations.

Overview

The NoDb module replaces traditional relational databases with a constellation of file-backed singleton objects for managing WEPPcloud run state. Each NoDb controller:

  • Serializes to JSON - Human-readable .nodb files in the working directory
  • Caches in Redis - 72-hour TTL in DB 13 for instant hydration
  • Distributed locking - Redis-backed locks (DB 0) prevent concurrent mutations
  • Singleton per run - getInstance(wd) guarantees same object across workers
  • Structured telemetry - Per-controller log files (<wd>/<controller>.log) and Redis pub/sub (DB 2)

Instead of SQL queries, developers interact with rich Python objects that expose domain-specific methods and properties. Redis provides coarse-grained locking and caching so these objects can be quickly deserialized and shared across workers and RQ tasks without conflicts.

Why NoDb?

  • Portability - Zip a run directory and move it anywhere
  • Schema flexibility - Add attributes without migrations
  • Developer ergonomics - Python methods instead of SQL queries
  • Crash safety - Redis caching with disk fallback
  • Distributed coordination - Multi-worker safe via Redis locks

Tradeoffs:

  • No relational queries or foreign keys
  • Lock discipline required for all mutations
  • JSON payloads can grow large
  • Learning curve for bespoke patterns

NoDbBase Core Responsibilities

wepppy/nodb/base.py provides the NoDbBase superclass that every controller inherits from. Important behaviors:

  • Singleton lifecycleNoDbBase.getInstance(wd) guarantees a single controller per working directory, hydrating from Redis (DB 13) before touching disk.
  • Distributed lockingwith controller.locked(): acquires a Redis-backed lock (DB 0), mirrors legacy hash flags, and raises NoDbAlreadyLockedError when re-entrancy is unsafe.
  • Persistence helpersdump_and_unlock() fsyncs the JSON payload, refreshes Redis cache entries, and validates the round-trip before releasing locks.
  • Telemetry wiring_init_logging() attaches a QueueListener fan-out to StatusMessenger, controller-scoped log files, and a console error stream; try_redis_set_log_level() dynamically adjusts levels via DB 15.
  • Status channels_status_channel resolves to <runid>:<controller> (pup runs routed to runid:omni).
  • Trigger eventsTriggerEvents enum documents lifecycle hooks (e.g., LANDUSE_BUILD_COMPLETE) that mods and UI components listen for when orchestrating runs.

When extending NoDb, prefer these utilities over bespoke implementations—custom locking or logging code frequently regresses cross-worker behavior. See the module docstring in wepppy/nodb/base.py for deeper context and example usage.

Lock Contention and Retry Pattern

Multi-worker tasks (RQ, API workers) can collide on the same NoDb lock. When with controller.locked(): cannot acquire the Redis lock, NoDbAlreadyLockedError is raised. Use a short retry loop with backoff for lightweight metadata writes. Avoid keeping locks open across long-running operations.

Example pattern:

from wepppy.nodb.base import NoDbAlreadyLockedError
import time

max_tries = 5
for attempt in range(max_tries):
    try:
        controller = Controller.getInstance(wd)
        with controller.locked():
            controller.some_field = value
    except NoDbAlreadyLockedError:
        if attempt + 1 == max_tries:
            logger.warning("NoDb lock busy after %d retries", max_tries)
            break
        time.sleep(1.0)
    else:
        break

Guidelines:

  • Keep lock scope minimal; do I/O outside the lock when possible.
  • For optional metadata (job IDs, timestamps), log and skip after retries.
  • For critical state changes, bubble the exception so the caller can fail fast.

Path Placeholders in Configs

NoDb configs reference large, location-specific datasets through placeholders that config_get_path() resolves at runtime:

  • MODS_DIR expands to wepppy/nodb/mods, keeping legacy bundles inside the repo.
  • EXTENDED_MODS_DATA points to heavy datasets that now live outside the repo. The resolver honors the EXTENDED_MODS_DATA environment variable, falling back to the default bind mounts (/wc1/geodata/extended_mods_data, /geodata/extended_mods_data) or the legacy mods/locations folder when the external volumes are unavailable.

Use the helper script python wepppy/nodb/scripts/update_extended_mods_data.py --apply whenever locations (Portland, Seattle, Lake Tahoe) need to be relinked to the external bundle; the script rewrites the .cfg files to use the placeholder consistently.