# NoDb State Management
File-backed, Redis-cached singleton controllers for WEPPcloud run state management with distributed locking and zero-downtime serialization.
See also: AGENTS.md for coding conventions and docs/dev-notes/style-guide.md for clarity expectations.
## Overview
The NoDb module replaces traditional relational databases with a constellation of file-backed singleton objects for managing WEPPcloud run state. Each NoDb controller:
- Serializes to JSON - human-readable `.nodb` files in the working directory
- Caches in Redis - 72-hour TTL in DB 13 for instant hydration
- Distributed locking - Redis-backed locks (DB 0) prevent concurrent mutations
- Singleton per run - `getInstance(wd)` guarantees the same object across workers
- Structured telemetry - per-controller log files (`<wd>/<controller>.log`) and Redis pub/sub (DB 2)
Instead of SQL queries, developers interact with rich Python objects that expose domain-specific methods and properties. Redis provides coarse-grained locking and caching so these objects can be quickly deserialized and shared across workers and RQ tasks without conflicts.
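The singleton-plus-JSON lifecycle described above can be sketched with a minimal, stdlib-only stand-in. This is not the real `NoDbBase` (which also hydrates from Redis and handles locking); the `MiniNoDb` class, `mini.nodb` filename, and `data` attribute are hypothetical illustrations of the pattern:

```python
import json
import os
import threading

class MiniNoDb:
    """Toy sketch of a file-backed, per-working-directory singleton."""
    _instances = {}
    _guard = threading.Lock()

    def __init__(self, wd):
        self.wd = wd
        self.data = {}

    @classmethod
    def getInstance(cls, wd):
        # One object per working directory, shared by every caller in this process.
        with cls._guard:
            if wd not in cls._instances:
                inst = cls(wd)
                path = os.path.join(wd, "mini.nodb")
                if os.path.exists(path):
                    with open(path) as fp:  # hydrate from the JSON payload on disk
                        inst.data = json.load(fp)
                cls._instances[wd] = inst
            return cls._instances[wd]

    def dump(self):
        # Persist human-readable JSON alongside the run's other artifacts.
        with open(os.path.join(self.wd, "mini.nodb"), "w") as fp:
            json.dump(self.data, fp, indent=2)
```

Because state round-trips through plain JSON in the run directory, zipping the directory captures everything needed to rehydrate the controller elsewhere.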
## Why NoDb?
- Portability - Zip a run directory and move it anywhere
- Schema flexibility - Add attributes without migrations
- Developer ergonomics - Python methods instead of SQL queries
- Crash safety - Redis caching with disk fallback
- Distributed coordination - Multi-worker safe via Redis locks
Tradeoffs:
- No relational queries or foreign keys
- Lock discipline required for all mutations
- JSON payloads can grow large
- Learning curve for bespoke patterns
## NoDbBase Core Responsibilities
`wepppy/nodb/base.py` provides the `NoDbBase` superclass that every controller inherits from. Important behaviors:
- Singleton lifecycle – `NoDbBase.getInstance(wd)` guarantees a single controller per working directory, hydrating from Redis (DB 13) before touching disk.
- Distributed locking – `with controller.locked():` acquires a Redis-backed lock (DB 0), mirrors legacy hash flags, and raises `NoDbAlreadyLockedError` when re-entrancy is unsafe.
- Persistence helpers – `dump_and_unlock()` fsyncs the JSON payload, refreshes Redis cache entries, and validates the round-trip before releasing locks.
- Telemetry wiring – `_init_logging()` attaches a QueueListener fan-out to StatusMessenger, controller-scoped log files, and a console error stream; `try_redis_set_log_level()` dynamically adjusts levels via DB 15.
- Status channels – `_status_channel` resolves to `<runid>:<controller>` (pup runs routed to `runid:omni`).
- Trigger events – the `TriggerEvents` enum documents lifecycle hooks (e.g., `LANDUSE_BUILD_COMPLETE`) that mods and UI components listen for when orchestrating runs.
When extending NoDb, prefer these utilities over bespoke implementations; custom locking or logging code frequently regresses cross-worker behavior. See the module docstring in `wepppy/nodb/base.py` for deeper context and example usage.
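The fsync-then-validate behavior attributed to `dump_and_unlock()` above can be illustrated with a stdlib-only sketch. The `safe_dump` name and payload shape are hypothetical, and the real helper additionally refreshes the Redis cache and releases the distributed lock:

```python
import json
import os

def safe_dump(path, payload):
    """Write JSON durably, then re-read and compare before declaring success."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fp:
        json.dump(payload, fp, indent=2)
        fp.flush()
        os.fsync(fp.fileno())   # force bytes to stable storage before the swap
    os.replace(tmp, path)       # atomic rename on POSIX filesystems
    with open(path) as fp:      # validate the round-trip before trusting the write
        if json.load(fp) != payload:
            raise IOError("round-trip validation failed for %s" % path)
    return True
```

Writing to a temp file and atomically renaming means a crash mid-write never leaves a truncated `.nodb` file behind, which is what makes the disk fallback safe to hydrate from.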
## Lock Contention and Retry Pattern
Multi-worker tasks (RQ jobs, API workers) can collide on the same NoDb lock. When `with controller.locked():` cannot acquire the Redis lock, `NoDbAlreadyLockedError` is raised. Use a short retry loop with backoff for lightweight metadata writes, and avoid holding locks across long-running operations.
Example pattern:

```python
from wepppy.nodb.base import NoDbAlreadyLockedError
import time

max_tries = 5
for attempt in range(max_tries):
    try:
        controller = Controller.getInstance(wd)
        with controller.locked():
            controller.some_field = value
    except NoDbAlreadyLockedError:
        if attempt + 1 == max_tries:
            logger.warning("NoDb lock busy after %d retries", max_tries)
            break
        time.sleep(1.0)
    else:
        break
```
Guidelines:
- Keep lock scope minimal; do I/O outside the lock when possible.
- For optional metadata (job IDs, timestamps), log and skip after retries.
- For critical state changes, bubble the exception so the caller can fail fast.
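These guidelines can be packaged as a small helper. The `retry_locked` name, the `critical` flag, and the backoff constants are illustrative only, not part of the NoDb API; the stand-in exception keeps the sketch self-contained:

```python
import time

class AlreadyLockedError(Exception):
    """Stand-in for NoDbAlreadyLockedError so the sketch is self-contained."""

def retry_locked(mutate, max_tries=5, delay=0.1, critical=False):
    """Run mutate() under lock contention: retry with backoff, then skip or raise."""
    for attempt in range(max_tries):
        try:
            return mutate()
        except AlreadyLockedError:
            if attempt + 1 == max_tries:
                if critical:
                    raise          # critical writes fail fast for the caller
                return None        # optional metadata: log-and-skip territory
            time.sleep(delay * (attempt + 1))  # linear backoff between tries
```

Callers pass `critical=True` for state the run cannot proceed without, and leave it off for job IDs, timestamps, and other best-effort metadata.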
## Path Placeholders in Configs
NoDb configs reference large, location-specific datasets through placeholders that `config_get_path()` resolves at runtime:

- `MODS_DIR` expands to `wepppy/nodb/mods`, keeping legacy bundles inside the repo.
- `EXTENDED_MODS_DATA` points to heavy datasets that now live outside the repo. The resolver honors the `EXTENDED_MODS_DATA` environment variable, falling back to the default bind mounts (`/wc1/geodata/extended_mods_data`, `/geodata/extended_mods_data`) or the legacy `mods/locations` folder when the external volumes are unavailable.
Run the helper script `python wepppy/nodb/scripts/update_extended_mods_data.py --apply` whenever locations (Portland, Seattle, Lake Tahoe) need to be relinked to the external bundle; the script rewrites the `.cfg` files to use the placeholder consistently.