Storage
Who stores what in the Gozzip network. Storage is not centralized — users are each other's infrastructure through reciprocal storage pacts.
Storage Model
Three tiers of storage, tried in priority order (the storage-peer tier splits into full-node and light-node peers):
| Tier | Who (Persona) | What they store | Availability | Expected Uptime |
|---|---|---|---|---|
| Device | User's own devices | All own events, full history | Only when online | — |
| Storage peers (Full) | ~20 WoT peers running full nodes (Keepers) | Complete event history | Distributed — always-on | 95% |
| Storage peers (Light) | ~20 WoT peers running light nodes (Witnesses) | Events since last checkpoint (~monthly window) | Distributed — when online | 30% |
| Relays | Third-party infrastructure (Heralds) | Whatever their retention policy allows | Optional fallback | — |
Storage Pacts
Every user commits to storing recent data for ~20 volume-matched peers in their web of trust. The commitment is reciprocal. See ADR 005.
Reciprocal pacts require WoT membership. Two exceptions: bootstrap pacts (triggered by follow) and guardian pacts (volunteered by an established user). See Glossary.
How pacts form
- User broadcasts kind 10055 (storage pact request) with their data volume
- WoT peers with similar volume respond with kind 10056 offers
- User selects partners — both exchange private kind 10053 pact events
- Both begin storing each other's events from the current checkpoint forward
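The negotiation above can be sketched as event constructors. The field layout and tag names here are illustrative only, not normative; the real event schemas are defined in ADR 005:

```python
def pact_request(volume_mb: int) -> dict:
    """kind 10055: broadcast your data volume to find volume-matched peers."""
    return {"kind": 10055, "tags": [["volume_mb", str(volume_mb)]]}

def pact_offer(request_id: str, volume_mb: int) -> dict:
    """kind 10056: a WoT peer with similar volume responds to the request."""
    return {"kind": 10056, "tags": [["e", request_id], ["volume_mb", str(volume_mb)]]}

def pact(partner_pubkey: str, checkpoint_id: str) -> dict:
    """kind 10053: private pact event each side sends the other;
    storage obligations start from the current checkpoint forward."""
    return {"kind": 10053, "tags": [["p", partner_pubkey], ["checkpoint", checkpoint_id]]}
```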
Volume balancing
Peers are matched by data volume (±30% tolerance) so the storage commitment is symmetric. If a partner's volume drifts beyond tolerance:
- Protocol flags the pairing as unbalanced
- Client waits `random(0, 48h)` before broadcasting (jittered delay prevents renegotiation storms when many users detect the same peer failure simultaneously). During the delay, standby pacts provide immediate failover. See ADR 008.
- User broadcasts kind 10055 for a replacement
- Negotiates new pact, migrates data
- Closes old pact only after new one is confirmed
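The tolerance check and jittered delay can be sketched as follows; function names and units are illustrative:

```python
import random

VOLUME_TOLERANCE = 0.30             # ±30% per the pact matching rule
REBALANCE_JITTER_MAX_S = 48 * 3600  # random(0, 48h)

def is_balanced(own_volume_mb: float, peer_volume_mb: float) -> bool:
    """A pairing is balanced while the peer's volume stays within ±30% of ours."""
    lo = own_volume_mb * (1 - VOLUME_TOLERANCE)
    hi = own_volume_mb * (1 + VOLUME_TOLERANCE)
    return lo <= peer_volume_mb <= hi

def renegotiation_delay_s() -> float:
    """Jittered delay before broadcasting a replacement kind 10055,
    spreading simultaneous renegotiations across a 48h window."""
    return random.uniform(0, REBALANCE_JITTER_MAX_S)
```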
Data scope
For Light node pact partners, each pact covers events from the latest checkpoint onward — roughly a monthly window. Old data ages out of their pact obligations. Full node pact partners store complete event history for their peers. In both cases, the user's own devices hold full history, and archivists can opt into deeper storage via archival pacts.
Keepers (Full node pact partners) maintain 95% uptime. Witnesses (Light node pact partners) maintain 30% uptime. See Glossary.
Bootstrap pacts
New users have no WoT from which to form reciprocal pacts, so the first person they follow becomes a temporary storage peer. See ADR 006.
- One-sided — the followed user stores the new user's data, no reciprocal obligation
- Auto-expires after 90 days or when the new user reaches 10 reciprocal pacts
- The followed user's client auto-accepts if capacity allows
- Transition: as the user builds WoT, bootstrap pacts phase out and reciprocal pacts take over
Guardian pacts
An established user (Guardian) can volunteer to store data for one untrusted newcomer (Seedling) outside their Web of Trust. Guardian pacts complement bootstrap pacts — together they give newcomers two independent storage peers before forming any reciprocal pacts.
- One slot per Guardian — voluntary opt-in, no WoT required
- Kind 10053 with `type: guardian` tag
- Expiry: 90 days or Seedling reaches Hybrid phase (5+ reciprocal pacts)
- Challenge-response applies the same as any pact
| Phase | Storage Peers | Source |
|---|---|---|
| Seedling joins (0 pacts) | 0 | — |
| Guardian pact forms | 1 | Guardian volunteer |
| First follow → bootstrap pact | 2 | Followed user |
| WoT builds → reciprocal pacts | 3+ | WoT peers |
| 5+ pacts (Hybrid) | 5+ | Guardian expires |
| 10+ pacts | 10+ | Bootstrap expires |
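The two expiry rules reduce to a pair of predicates. This is a sketch; the authoritative conditions are in ADR 006:

```python
def bootstrap_pact_expired(age_days: int, reciprocal_pacts: int) -> bool:
    """Bootstrap pacts lapse at 90 days or once 10 reciprocal pacts exist."""
    return age_days >= 90 or reciprocal_pacts >= 10

def guardian_pact_expired(age_days: int, reciprocal_pacts: int) -> bool:
    """Guardian pacts lapse at 90 days or when the Seedling reaches the
    Hybrid phase (5+ reciprocal pacts)."""
    return age_days >= 90 or reciprocal_pacts >= 5
```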
See ADR 006 and the protocol paper §2.4, §4.7.
Archival pacts
Standard pacts cover ~monthly windows. For long-term persistence, users can form archival pacts:
- Cover full history or a specified deep range
- Lower challenge frequency (weekly instead of daily)
- For power users, archivists, and users running always-on nodes
- Not mandatory — users without archival pacts are advised to run a persistent node
Standby pacts
Maintain 3 extra pacts in standby mode to eliminate rebalancing delays:
- Standby peers receive events but aren't challenged or expected to serve
- When an active pact drops, promote a standby immediately — no discovery delay
- Backfill standby pool in the background
Proof of Storage
Challenge-response protocol via kind 10054. Two challenge modes. See ADR 006.
Hash challenge: Alice sends Bob "hash events [47..53] with this nonce." Bob computes hash from local copy. Proves possession.
Serve challenge: Alice sends Bob "give me the full event at position 47." Measures response latency. Consistently slow responses suggest the peer is fetching remotely.
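A minimal sketch of the hash-challenge round, assuming SHA-256 and simple concatenation of the nonce and events in window order; the actual digest construction is defined in ADR 006:

```python
import hashlib

def hash_challenge_response(events: list[bytes], nonce: bytes) -> str:
    """Respond to a hash challenge: digest the requested event range
    together with the challenger's nonce. Only a node holding the
    events locally can produce this on demand."""
    h = hashlib.sha256()
    h.update(nonce)
    for ev in events:  # e.g. events [47..53], in window order
        h.update(ev)
    return h.hexdigest()

def verify_hash_challenge(local_events: list[bytes], nonce: bytes,
                          claimed: str) -> bool:
    """Challenger recomputes the digest from its own copy and compares."""
    return hash_challenge_response(local_events, nonce) == claimed
```

The nonce prevents precomputation: the responder can't cache a digest, because each challenge salts the range differently.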
Completeness verification: The checkpoint (kind 10051) includes a Merkle root of all events in the current window. Requesters compute the Merkle root from received events and compare against the checkpoint. Mismatch = events are missing. Light nodes additionally cross-verify the per-event hash chain for the most recent M events (default: 20) from each device — this catches checkpoint delegate censorship where a delegate omits sibling device events. See ADR 008.
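Completeness checking might look like the following sketch; the pairwise SHA-256 tree with last-node duplication is an assumption for illustration, not the checkpoint spec:

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(event_ids: list[bytes]) -> bytes:
    """Compute a Merkle root over the window's event IDs. Odd levels
    duplicate the last node (a common convention; ADR 008 may differ)."""
    if not event_ids:
        return _h(b"")
    level = [_h(e) for e in event_ids]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def is_complete(received_ids: list[bytes], checkpoint_root: bytes) -> bool:
    """Mismatch means events are missing from (or foreign to) the window."""
    return merkle_root(received_ids) == checkpoint_root
```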
Reliability scoring: Clients track a rolling 30-day reliability score per peer:
| Score | Status | Action |
|---|---|---|
| 90%+ | Healthy | No action |
| 70–90% | Degraded | Increase challenge frequency |
| 50–70% | Unreliable | Begin replacement |
| < 50% | Failed | Drop immediately |
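The table maps directly to a scoring function; making the thresholds inclusive at the lower bound is an arbitrary choice here, since the table's bands share endpoints:

```python
def reliability_action(score: float) -> str:
    """Map a rolling 30-day reliability score (0..1) to the table's actions."""
    if score >= 0.90:
        return "healthy"      # no action
    if score >= 0.70:
        return "degraded"     # increase challenge frequency
    if score >= 0.50:
        return "unreliable"   # begin replacement negotiation
    return "failed"           # drop immediately
```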
Failure handling:
- First failure → retry (network issues)
- Second failure → ask other storage peers for same data
- If others have it → failing peer's reliability score drops
- Score below threshold → begin replacement negotiation
- Natural consequence: failing peer loses reciprocal storage
Bounded timestamps: Clients, relays, and storage peers reject events with created_at more than 15 minutes in the future. Events backdated more than 1 hour from the last known event from the same device are flagged. For replaceable event merge tiebreakers, timestamps within 60 seconds use lexicographic ordering of event ID (deterministic, non-gameable) instead of later-timestamp-wins.
Event Retrieval
See Data Flow for the full flow diagrams.
Delivery paths (priority order):
- Tier 0 — BLE mesh — nearby devices serve events via Bluetooth. No internet required. Interoperable with bitchat. See ADR 010.
- Tier 1 — Cached endpoints — follower has storage peer endpoints cached from kind 10059. Direct connection, zero broadcast overhead.
- Tier 2 — Gossip — send kind 10057 to directly connected peers. Each peer forwards if it can't respond (TTL=3, reaches ~8,000 nodes in a 20-peer network).
- Tier 3 — Storage peers via DVM — traditional kind 10057 broadcast through a relay. Relay-dependent fallback.
- Tier 4 — Relays — traditional relay query as last resort.
All paths produce self-authenticating events (signed by author's keys). Source doesn't matter — signatures prove authenticity.
Gossip Hardening
Gossip forwarding (kind 10055, 10057) is hardened against amplification attacks. See ADR 008.
Per-hop rate limiting: Each node enforces a maximum request rate per source pubkey:
- Kind 10055 (pact request): 10 req/s per source
- Kind 10057 (data request): 50 req/s per source
- Excess requests are dropped silently
Request deduplication: Each request carries a request_id tag. Nodes track seen request_ids in an LRU cache (10,000 entries). Duplicate requests are dropped.
WoT-only forwarding: Nodes only forward gossip from pubkeys within their 2-hop WoT. Requests from unknown pubkeys are served locally (if possible) but never forwarded. This bounds the gossip blast radius to the WoT graph.
Gossip topology exposure: Gossip forwarding reveals the network graph to observers at multiple nodes. The WoT-only forwarding rule limits exposure to WoT members. Relays can offer onion routing as an optional service — gossip requests wrapped in NIP-44 encrypted layers per hop, hiding the request path from intermediate nodes. Users subscribe to onion routing via Lightning zaps. See relay Lightning services.
Privacy
- Pacts are private — kind 10053 exchanged directly, never published
- Topology is hidden — no public list of who stores whose data
- Retrieval is per-request — storage peers reveal themselves only to individual requesters via kind 10058
- Peers can filter — respond only to WoT members, or not at all
- Blinded data requests — kind 10057 uses `H(target_pubkey || daily_salt)` instead of the raw pubkey. Observers can't identify whose data is being requested or link requests across days. Storage peers match against both today's and yesterday's date to handle clock skew at day boundaries (dual-day blind matching). See ADR 008.
- DM integrity — NIP-44 uses AEAD (authenticated encryption). Storage peers hold encrypted DM blobs but cannot serve corrupted ciphertext — tampered ciphertext fails decryption with an authentication error. The existing challenge-response proves possession; AEAD proves integrity.
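A sketch of the blinding and dual-day matching; deriving the salt from the UTC date string is a placeholder for the actual scheme in ADR 008:

```python
import hashlib
from datetime import datetime, timedelta, timezone

def daily_salt(day: datetime) -> bytes:
    """Placeholder salt derivation: the UTC date string."""
    return day.strftime("%Y-%m-%d").encode()

def blinded_id(target_pubkey: str, day: datetime) -> str:
    """H(target_pubkey || daily_salt): observers can't recover the
    target pubkey or link requests across days."""
    return hashlib.sha256(bytes.fromhex(target_pubkey) + daily_salt(day)).hexdigest()

def matches(request_blind: str, stored_pubkey: str, now: datetime) -> bool:
    """Dual-day matching: storage peers check today's and yesterday's
    blind to tolerate clock skew at day boundaries."""
    return request_blind in (
        blinded_id(stored_pubkey, now),
        blinded_id(stored_pubkey, now - timedelta(days=1)),
    )
```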
Peer Selection
Client-side rules for choosing storage peers. See ADR 006.
WoT cluster diversity: Maximum 3 peers from any single social cluster. At least 4 distinct clusters across 20 peers. Prevents eclipse attacks where an attacker becomes a majority of your storage peers.
Geographic diversity: Target 3+ timezone bands. Never more than 50% of peers in the same ±3 hour band. Protects against correlated regional failures.
Peer reputation: Weight offers by identity age, challenge success rate, and active pact count. Identities < 30 days old are limited to bootstrap pacts.
Popularity scaling: Scale pact count with follower count (< 100 followers → 10 pacts, 1,000+ → 30 pacts, 10,000+ → 40+). More pacts = more peers sharing serving load.
Offer filtering: Drop offers from non-WoT pubkeys, identities < 30 days old, or volume mismatch > 50%.
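The filtering and scaling rules can be sketched as follows; the 20-pact value for the 100–999 follower band is an assumption (the standard pact count), since the text only gives the other thresholds:

```python
def filter_offers(offers: list[tuple[str, int, float]], own_volume_mb: float,
                  wot: set[str]) -> list[str]:
    """Apply the offer-filtering rules: WoT membership, identity age,
    and volume mismatch <= 50%. Offers are (pubkey, age_days, volume_mb)."""
    kept = []
    for pubkey, age_days, volume_mb in offers:
        if pubkey not in wot:
            continue  # non-WoT pubkeys are dropped
        if age_days < 30:
            continue  # young identities are limited to bootstrap pacts
        if abs(volume_mb - own_volume_mb) > 0.5 * own_volume_mb:
            continue  # volume mismatch > 50%
        kept.append(pubkey)
    return kept

def target_pact_count(followers: int) -> int:
    """Popularity scaling: more followers, more peers sharing serving load."""
    if followers >= 10_000:
        return 40
    if followers >= 1_000:
        return 30
    if followers >= 100:
        return 20  # assumed standard count for the unstated middle band
    return 10
```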
Mobile Considerations
Mobile devices participate when possible but aren't expected to be always-on storage servers. A user's desktop, home server, or cloud VPS handles primary storage obligations. If all a user's devices are offline, storage peers serve their data.
Relay Role
Relays continue to exist but shift from required infrastructure to optional accelerators:
- Clients try storage peers and relays opportunistically
- Relays provide faster delivery when available (always-on, indexed)
- No single relay failure can make a user's data unavailable
- Gradual migration — relays work alongside storage peers
Three-Phase Adoption Model
Client behavior adapts automatically based on the user's pact count:
| Phase | Pact count | Behavior |
|---|---|---|
| Bootstrap | 0–5 pacts | Publish to relays primarily. Form pacts as available. |
| Hybrid | 5–15 pacts | Publish to both relays and storage peers. Fetch from peers first, relay fallback. |
| Sovereign | 15+ pacts | Storage peers primary. Relays optional accelerator. |
Transition is automatic and per-user — no network-wide flag. Early adopters stay in bootstrap phase longer. As the network grows, users transition naturally.
No protocol changes — this is client-side logic controlling which delivery path to prioritize.
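Phase selection reduces to a threshold function. Assigning the boundary values 5 and 15 to the higher phase is an arbitrary choice here, since the table's ranges overlap at those points:

```python
def adoption_phase(pact_count: int) -> str:
    """Client-side phase from the user's active pact count."""
    if pact_count >= 15:
        return "sovereign"  # storage peers primary, relays optional
    if pact_count >= 5:
        return "hybrid"     # publish to both, fetch peers-first
    return "bootstrap"      # relays primarily
```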
Cascading Read-Caches
Popular accounts would overwhelm their storage peers (10–40+, depending on follower count) with serving load. Read-caches solve this by turning followers into voluntary data mirrors.
- Bob fetches Alice's events from storage peer S1
- Bob now has a local copy
- Carol broadcasts kind 10057 for Alice's data
- Bob's client sees the request and responds (he has Alice's events)
- Carol verifies Alice's signatures — source doesn't matter
Properties:
- No pact needed — Bob serves cached data he already has
- Self-authenticating events — signatures prove integrity regardless of source
- Popular data naturally replicates across the follower base
- Storage peers handle first wave; followers' caches handle the tail
- Serving capacity scales as O(followers), not O(storage_peers)
Client configuration:
- `read_cache_enabled` — whether to serve cached data for followed users (default: true)
- `read_cache_max_mb` — storage limit for cached data (default: 100 MB)
- `read_cache_respond_to` — who to serve: `wot_only` (default) or `anyone`
No new event kinds — uses existing kind 10057/10058 flow.
Incentive Model
Storage contribution translates directly to content reach through pact-aware gossip routing. No explicit scores, tokens, or subscriptions — the network topology IS the incentive. See ADR 009.
Pact-Aware Gossip Priority
When a node forwards gossip or decides what content to propagate, it applies priority ordering:
- Active pact partners — highest priority. You store their data, you forward their content eagerly. Self-interested: if their content is discoverable, your storage pact has value (others can find the data you're holding).
- WoT contacts (1-hop follows) — standard priority.
- Extended WoT (2-hop) — lower priority, forwarded if capacity allows.
- Unknown pubkeys — served locally if available, never forwarded (existing gossip hardening rule).
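The ordering above, as a priority function (lower values are forwarded first; names are illustrative):

```python
def forward_priority(pubkey: str, pact_partners: set[str],
                     wot_1hop: set[str], wot_2hop: set[str]) -> int:
    """Pact-aware gossip priority. Unknown pubkeys get no forwarding
    at all, per the existing hardening rule."""
    if pubkey in pact_partners:
        return 0  # active pact partner: forward eagerly
    if pubkey in wot_1hop:
        return 1  # direct follow: standard priority
    if pubkey in wot_2hop:
        return 2  # extended WoT: forward if capacity allows
    return 3      # unknown: serve locally only, never forward
```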
A user with 20 reliable pact partners has 20 nodes that eagerly forward their content. A user with 5 flaky pacts has fewer advocates.
Natural Consequences
- Dropped pact — the dropped peer loses a forwarding advocate. Their content distribution naturally shrinks.
- Reliable storage — maintained pacts = maintained reach. The incentive to store reliably is self-interested.
- No cliff — there's no minimum contribution to participate. Less contribution means less reach, not exclusion. The network degrades gracefully.