StorePolicy And Persistence¶
This document defines the policy model used by register and put and
explains how persistence and placement are executed.
Related docs:
- Public surface and contracts: API Design
- Registration lifecycle and where policy is resolved: Registration Flow
- Persistence status and on-call signals: Error, Retry, Observability
Why a unified policy?¶
TensorCast uses a single StorePolicy object to describe durability and
placement intent so that:
- applications declare outcomes (“must be on shared disk”) instead of implementation details
- the daemon can choose the correct mechanism (sync local stable admission vs async persistence)
- the system can report degraded vs failed outcomes in a consistent way
StorePolicy is not “just configuration”: it is a contract that influences
correctness (data loss avoidance) and operational behavior (admission/eviction).
StorePolicy Model¶
StorePolicy is the single durability and placement declaration. It supports:
- Profile presets: cache, durable, ha, cold, warm, pinned.
- Explicit tiers via must, should, may lists.
- overflow_policy for local stable DRAM admission.
- layout to control shard layout.
Architectural boundary:
- profile-specific runtimes may derive internal lowering hints from StorePolicy,
- but they must not introduce a second declarative policy surface beside StorePolicy,
- this applies to retained byte-body staging as well: any body-specific backing hint must stay an internal execution derivation rather than a user-facing policy contract.
The SDK validates policy shape and forwards it to the daemon. The daemon is the authoritative resolver.
What “must/should/may” mean¶
- must: required for correctness from the caller’s point of view. Failure is a hard error.
- should: best-effort. Failure downgrades the result to degraded but does not fail the primary operation.
- may: opportunistic. Failure is silent and does not degrade.
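The three requirement levels can be read as a fold over per-tier results. The following is a minimal sketch of that reading, not the daemon's implementation; Outcome and classify are hypothetical names introduced only for illustration.

```python
from enum import Enum
from typing import Sequence


class Outcome(Enum):
    OK = "ok"
    DEGRADED = "degraded"
    FAILED = "failed"


def classify(must_ok: Sequence[bool],
             should_ok: Sequence[bool],
             may_ok: Sequence[bool]) -> Outcome:
    """Fold per-tier success flags into a single operation outcome."""
    if not all(must_ok):
        return Outcome.FAILED    # a failed `must` tier is a hard error
    if not all(should_ok):
        return Outcome.DEGRADED  # a failed `should` tier degrades, but does not fail
    # `may_ok` is accepted for symmetry but deliberately ignored:
    # opportunistic failures are silent.
    return Outcome.OK


# A failed `should` tier degrades the result; a failed `may` tier does not.
assert classify([True], [False], [True]) is Outcome.DEGRADED
assert classify([True], [True], [False]) is Outcome.OK
```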
Why this matters:
- It allows a “fast local success” while still requesting background durability.
- It allows operators to distinguish “everything is durable” from “we returned something usable but not fully placed”.
TierSpec (policy tier parameters)¶
Policy tiers are expressed as TierSpec.
- Python model: tensorcast/api/_config.py (StorePolicy, TierSpec)
- Proto model: proto/tensorcast/daemon/v2/store_daemon.proto (StorePolicy, TierSpec)
Fields:
| Field | What it does | Examples |
|---|---|---|
| tier | Which tier: stable_dram or shared_disk. | TierSpec(tier="stable_dram", scope="local") |
| scope | Where the tier should exist: local, remote, any. | stable_dram(scope="remote") for remote stable replicas. |
| min_replicas | Minimum replicas for the tier. | Currently only 1 is supported (see validation). |
| retention_policy | Local stable retention: best_effort, ttl, pinned. | pinned for “must stay resident”. |
| retention_ttl_ms | TTL for retention_policy=ttl. | e.g. 60_000 for 60s local caching. |
Profiles (what they expand to)¶
Profiles are convenience presets. Expansion happens in both the SDK and daemon:
- SDK: tensorcast/api/_config.py (StorePolicy.from_profile)
- Daemon: daemon/state/store_policy_resolver.cc (profile_defaults)
At a high level:
- cache: local stable (may, best-effort), overflow evict
- durable: shared disk (must) + local stable (should, best-effort)
- ha: shared disk (must) + remote stable (should) + local stable (should)
- cold: shared disk (must) + local stable (should, ttl) with a default TTL
- warm: local stable (should, best-effort), overflow reject
- pinned: local stable (must, pinned), overflow reject
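To make the expansion concrete, the summary above can be written out as a lookup table. This is only a sketch of the high-level description: TierSketch, ExpandedSketch, and PROFILE_SKETCH are hypothetical stand-ins, not the real TierSpec or profile_defaults. Values the summary does not state (the default TTL for cold, the overflow policy for durable, ha, and cold) are left as None rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TierSketch:
    tier: str                               # "stable_dram" or "shared_disk"
    scope: str                              # "local", "remote", or "any"
    retention_policy: Optional[str] = None  # None: not stated in the summary


@dataclass(frozen=True)
class ExpandedSketch:
    must: Tuple[TierSketch, ...] = ()
    should: Tuple[TierSketch, ...] = ()
    may: Tuple[TierSketch, ...] = ()
    overflow_policy: Optional[str] = None   # None: not stated in the summary


SHARED = TierSketch("shared_disk", "any")
REMOTE = TierSketch("stable_dram", "remote")
LOCAL = TierSketch("stable_dram", "local")
LOCAL_BEST_EFFORT = TierSketch("stable_dram", "local", "best_effort")
LOCAL_TTL = TierSketch("stable_dram", "local", "ttl")  # default TTL value not given
LOCAL_PINNED = TierSketch("stable_dram", "local", "pinned")

PROFILE_SKETCH = {
    "cache": ExpandedSketch(may=(LOCAL_BEST_EFFORT,), overflow_policy="evict"),
    "durable": ExpandedSketch(must=(SHARED,), should=(LOCAL_BEST_EFFORT,)),
    "ha": ExpandedSketch(must=(SHARED,), should=(REMOTE, LOCAL)),
    "cold": ExpandedSketch(must=(SHARED,), should=(LOCAL_TTL,)),
    "warm": ExpandedSketch(should=(LOCAL_BEST_EFFORT,), overflow_policy="reject"),
    "pinned": ExpandedSketch(must=(LOCAL_PINNED,), overflow_policy="reject"),
}

# Profiles with a hard durability requirement are the ones with shared disk in `must`.
durable_profiles = sorted(
    name for name, p in PROFILE_SKETCH.items() if SHARED in p.must
)
```

Reading the table this way shows that cache and warm never fail on placement, because neither has anything in must.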
Policy Examples (recommended starting points)¶
```python
import tensorcast

# Default cache semantics (fast, not durable).
cache = tensorcast.StorePolicy(profile="cache")

# Durable: must land on shared disk, but also try to keep a local stable copy.
durable = tensorcast.StorePolicy(profile="durable")

# HA: durable + try to create a remote stable replica for fast cross-node reads.
ha = tensorcast.StorePolicy(profile="ha")

# Pinned local: fail if we cannot keep this resident in local stable memory.
pinned = tensorcast.StorePolicy(profile="pinned")
```
Policy Validation And Resolution¶
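Validation rules are enforced in two places: the SDK, which validates policy shape before forwarding it, and the daemon, which is the authoritative resolver.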
Key constraints:
- shared_disk requires scope any and min_replicas=1 and forbids retention fields.
- stable_dram supports only min_replicas=1 and retention only for local scopes.
- must local stable requires retention_policy=pinned.
- overflow_policy=spill requires shared disk in must or should.
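The constraints above translate fairly directly into checks. The sketch below is one possible encoding, assuming that "local scopes" means scope == "local"; TierSpecSketch and validate_policy are hypothetical names, and the real validators in the SDK and daemon may differ in structure and error wording.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class TierSpecSketch:
    tier: str
    scope: str
    min_replicas: int = 1
    retention_policy: Optional[str] = None
    retention_ttl_ms: Optional[int] = None


def validate_policy(must: Sequence[TierSpecSketch],
                    should: Sequence[TierSpecSketch],
                    may: Sequence[TierSpecSketch],
                    overflow_policy: Optional[str] = None) -> List[str]:
    """Return a list of human-readable violations (empty means valid)."""
    errors: List[str] = []
    for level, tiers in (("must", must), ("should", should), ("may", may)):
        for t in tiers:
            has_retention = (t.retention_policy is not None
                             or t.retention_ttl_ms is not None)
            if t.min_replicas != 1:
                errors.append(f"{level}/{t.tier}: only min_replicas=1 is supported")
            if t.tier == "shared_disk":
                if t.scope != "any":
                    errors.append(f"{level}/shared_disk: scope must be 'any'")
                if has_retention:
                    errors.append(f"{level}/shared_disk: retention fields are forbidden")
            elif t.tier == "stable_dram":
                if has_retention and t.scope != "local":
                    errors.append(f"{level}/stable_dram: retention requires local scope")
                if (level == "must" and t.scope == "local"
                        and t.retention_policy != "pinned"):
                    errors.append("must/stable_dram: local must tier requires pinned")
            else:
                errors.append(f"{level}: unknown tier {t.tier!r}")
    # Spill needs a shared-disk tier to spill towards.
    if overflow_policy == "spill" and not any(
            t.tier == "shared_disk" for t in [*must, *should]):
        errors.append("overflow_policy=spill requires shared_disk in must or should")
    return errors


shared = TierSpecSketch("shared_disk", "any")
local = TierSpecSketch("stable_dram", "local", retention_policy="best_effort")
ok_errors = validate_policy(must=[shared], should=[local], may=[], overflow_policy="spill")
bad_errors = validate_policy(must=[local], should=[], may=[], overflow_policy="spill")
```

Here ok_errors is empty, while bad_errors reports both the non-pinned must tier and the missing shared disk for spill.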
overflow_policy (local stable admission behavior)¶
overflow_policy controls what happens when local stable DRAM is under pressure:
- evict: allow best-effort eviction of non-pinned entries to make space.
- reject: refuse admission when capacity is insufficient (caller sees failure/degraded depending on requirement).
- spill: allow eviction only when durability requirements are satisfied (gated by the durability index).
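The difference between the three policies comes down to which bytes count as reclaimable when a new entry needs space. The following sketch illustrates that idea under simplifying assumptions (byte counters only, no ordering of eviction candidates); can_admit and its parameters are hypothetical and do not correspond to a documented daemon function.

```python
def can_admit(overflow_policy: str,
              needed: int,
              free: int,
              evictable: int,
              durable_evictable: int) -> bool:
    """Decide whether `needed` bytes can be admitted to local stable DRAM.

    evictable:         bytes held by non-pinned entries
    durable_evictable: the subset of `evictable` whose durability
                       requirements are already satisfied elsewhere
    """
    if overflow_policy == "reject":
        return free >= needed                      # never evict to make room
    if overflow_policy == "evict":
        return free + evictable >= needed          # any non-pinned entry may go
    if overflow_policy == "spill":
        return free + durable_evictable >= needed  # only durably persisted entries may go
    raise ValueError(f"unknown overflow_policy: {overflow_policy!r}")


# 10 units needed, 4 free, 8 evictable of which only 3 are already durable.
decisions = {p: can_admit(p, needed=10, free=4, evictable=8, durable_evictable=3)
             for p in ("evict", "reject", "spill")}
```

In this example only evict admits the entry: reject has too little free space, and spill cannot reclaim enough without evicting entries that are not yet durable.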
Why spill is special:
- It is intended to prevent “evict the only durable copy” scenarios.
- It couples local admission/eviction decisions to persistence completion; see Spill Gating And Durability Index.
layout (sharding intent)¶
layout declares how the artifact should be treated for persistence planning:
- auto: daemon chooses based on size and tier requirements.
- unsharded: prefer a single logical unit (fewer parts, simpler placement).
- sharded: prefer shard planning (better parallelism and partial retries).
Why this exists:
- Sharding can improve throughput and failure isolation for very large artifacts.
- Unsharded can reduce overhead for smaller artifacts and simplify operator reasoning.
Local Stable Tier Versus Persistence¶
Local stable DRAM can be satisfied synchronously at commit time. Remote stable and shared disk are satisfied asynchronously through persistence tasks.
- Local stable must failures fail commit.
- Local stable should failures return degraded status.
StartPersistence And QueryPersistenceStatus¶
The SDK triggers StartPersistence after registration when the resolved policy
includes shared disk or remote stable tiers. The daemon runs a background task
and exposes status via QueryPersistenceStatus.
Task results are attached to the SDK surface as persistence_task_id and can be
queried by task id or artifact id.
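A caller that needs to wait for background persistence would typically poll the status RPC. The sketch below shows one way to do that against an injected query function; wait_for_persistence, the dict-shaped response, and the choice to treat only success and failed as terminal are all assumptions made for illustration, since this document does not specify the SDK call signature or whether degraded can be a final state.

```python
import time
from typing import Callable, Dict, Iterator

TERMINAL_STATES = {"success", "failed"}  # assumption: degraded may still progress


def wait_for_persistence(query: Callable[[str], Dict],
                         task_id: str,
                         max_polls: int = 10,
                         interval_s: float = 0.0) -> Dict:
    """Poll `query(task_id)` until a terminal state or `max_polls` is reached."""
    status: Dict = {}
    for _ in range(max_polls):
        status = query(task_id)
        if status["state"] in TERMINAL_STATES:
            break
        time.sleep(interval_s)
    return status  # the caller inspects state / degraded_reason / last_error


# Stand-in for the real RPC: replays a canned sequence of responses.
_canned: Iterator[Dict] = iter([
    {"state": "pending", "progress": 0.0},
    {"state": "degraded", "progress": 0.5, "degraded_reason": "remote stable unavailable"},
    {"state": "success", "progress": 1.0},
])


def fake_query(task_id: str) -> Dict:
    return next(_canned)


final = wait_for_persistence(fake_query, "task-1")
```

Injecting the query function keeps the loop testable without a running daemon; in real use it would wrap whatever the SDK exposes for QueryPersistenceStatus.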
StartPersistenceRequest / QueryPersistenceStatusResponse¶
Proto: proto/tensorcast/daemon/v2/store_daemon.proto
StartPersistence:
- Inputs: artifact_id, policy
- Output: task_id, plan_id, state, progress, degraded_reason
QueryPersistenceStatus:
- Query by task_id or artifact_id
- Returns:
  - task-level: state, progress, degraded_reason, last_error
  - shard-level: shards[] with state/progress, plus target_nodes and lease_ids
How to interpret shard fields:
- target_nodes[i] is the planned target for shard i (index-aligned).
- lease_ids[i] is non-empty once the daemon has acquired/acknowledged the lease for that shard/target.
- A task can be degraded while still progressing if optional tiers are failing.
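Because the shard fields are index-aligned, a small helper can answer on-call questions such as "which shards are still waiting for a lease?". This sketch assumes the response has been converted to a plain dict with the field names listed above and that a missing lease is represented by an empty string; shards_missing_lease is a hypothetical helper, not part of the SDK.

```python
from typing import Dict, List, Tuple


def shards_missing_lease(response: Dict) -> List[Tuple[int, str]]:
    """Return (shard index, planned target node) for shards without a lease yet."""
    pairs = zip(response["target_nodes"], response["lease_ids"])
    return [(i, node) for i, (node, lease) in enumerate(pairs) if not lease]


example = {
    "state": "running",
    "shards": [{"state": "running"}, {"state": "pending"}],
    "target_nodes": ["node-a", "node-b"],
    "lease_ids": ["lease-17", ""],
}

waiting = shards_missing_lease(example)  # [(1, 'node-b')]
```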
Placement Planning And Shards¶
The daemon requests a placement plan from the Global Store:
- Placement policies: local_only, replicated, sharded.
- Shard planning uses UMA chunk layout with a 128MB sharding threshold and 64MB to 256MB shard caps.
- When remote stable capacity is insufficient, placement degrades to local only and reports a degraded reason.
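The threshold and caps can be illustrated with a rough size calculation. The sketch below is only one plausible reading: it assumes MB means MiB, that artifacts below the threshold stay unsharded, and that the planner aims for a target shard count (target_shards is an invented parameter). It does not model the UMA chunk layout, and the last shard may come out smaller than the lower cap.

```python
from typing import List

MIB = 1024 * 1024
SHARD_THRESHOLD = 128 * MIB  # below this, keep the artifact as a single unit
MIN_SHARD = 64 * MIB
MAX_SHARD = 256 * MIB


def plan_shard_sizes(size_bytes: int, target_shards: int = 8) -> List[int]:
    """Split `size_bytes` into shard sizes clamped to [MIN_SHARD, MAX_SHARD]."""
    if size_bytes < SHARD_THRESHOLD:
        return [size_bytes]
    # Ceiling division, then clamp the per-shard size into the allowed range.
    ideal = -(-size_bytes // target_shards)
    shard = min(max(ideal, MIN_SHARD), MAX_SHARD)
    count = -(-size_bytes // shard)
    # All shards are full-sized except possibly the last one.
    return [shard] * (count - 1) + [size_bytes - shard * (count - 1)]


small = plan_shard_sizes(100 * MIB)     # one unsharded unit
medium = plan_shard_sizes(1000 * MIB)   # 8 shards of 125 MiB
large = plan_shard_sizes(10240 * MIB)   # capped at 256 MiB per shard
```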
Task States And Degradation¶
Persistence tasks report:
- pending, running, degraded, success, failed
- Degraded states when optional tiers fail or placement downgrades.
- Failed states when required tiers fail.
Spill Gating And Durability Index¶
Stable DRAM admission with overflow_policy=spill uses a durability index
maintained by persistence. Spill eviction is allowed only when required
non-local tiers are satisfied.
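The gating rule can be pictured as a set-containment check against what persistence has recorded so far. The class below is a minimal in-memory sketch of that idea; DurabilityIndexSketch and its method names are hypothetical, and the real index in the persistence manager may track more state (for example per-shard completion).

```python
from typing import Dict, Set


class DurabilityIndexSketch:
    """Tracks which non-local tiers have been satisfied for each artifact."""

    def __init__(self) -> None:
        self._satisfied: Dict[str, Set[str]] = {}

    def mark_satisfied(self, artifact_id: str, tier: str) -> None:
        # Called by persistence when a tier finishes for this artifact.
        self._satisfied.setdefault(artifact_id, set()).add(tier)

    def can_spill_evict(self, artifact_id: str, required_non_local: Set[str]) -> bool:
        # Eviction is safe only if every required non-local tier is recorded.
        return required_non_local <= self._satisfied.get(artifact_id, set())


index = DurabilityIndexSketch()
required = {"shared_disk"}

before = index.can_spill_evict("artifact-1", required)  # persistence not finished yet
index.mark_satisfied("artifact-1", "shared_disk")
after = index.can_spill_evict("artifact-1", required)   # shared disk copy now exists
```

Until persistence records the shared disk copy, the local entry is not an eviction candidate, which is what prevents evicting the only durable copy.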
Code Map¶
- Policy model: tensorcast/api/_config.py
- Policy resolver: daemon/state/store_policy_resolver.cc
- Persistence manager: daemon/state/persistence_manager.cc
- Global Store placement service: tensorcast/global_store/services/placement_service.py
- Daemon RPCs: proto/tensorcast/daemon/v2/store_daemon.proto