Storage

Who stores what in the Gozzip network. Storage is not centralized — users are each other's infrastructure through reciprocal storage pacts.

Storage Model

Three tiers of storage, tried in priority order (the storage-peer tier splits into full-node and light-node peers):

| Tier | Who (Persona) | What they store | Availability | Expected uptime |
|------|---------------|-----------------|--------------|-----------------|
| Device | User's own devices | All own events, full history | Only when online | - |
| Storage peers (Full) | ~20 WoT peers running full nodes (Keepers) | Complete event history | Distributed — always-on | 95% |
| Storage peers (Light) | ~20 WoT peers running light nodes (Witnesses) | Events since last checkpoint (~monthly window) | Distributed — when online | 30% |
| Relays | Third-party infrastructure (Heralds) | Whatever their retention policy allows | Optional fallback | - |

Storage Pacts

Every user commits to storing recent data for ~20 volume-matched peers in their web of trust. The commitment is reciprocal. See ADR 005.

Reciprocal pacts require WoT membership. Two exceptions: bootstrap pacts (triggered by follow) and guardian pacts (volunteered by an established user). See Glossary.

How pacts form

  1. User broadcasts kind 10055 (storage pact request) with their data volume
  2. WoT peers with similar volume respond with kind 10056 offers
  3. User selects partners — both exchange private kind 10053 pact events
  4. Both begin storing each other's events from the current checkpoint forward
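The four-step flow above can be sketched in Python. The kind numbers (10055/10056/10053) come from the text; the event shape, the `make_event` helper, and the 30% volume filter inside the offer step are illustrative assumptions (real events would be signed):

```python
import hashlib
import json
import time

def make_event(kind, pubkey, content):
    """Minimal unsigned event (illustrative; real events carry signatures)."""
    body = {"kind": kind, "pubkey": pubkey,
            "created_at": int(time.time()), "content": content}
    body["id"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return body

def pact_request(pubkey, volume_mb):
    # Step 1: broadcast kind 10055 with our data volume
    return make_event(10055, pubkey, {"volume_mb": volume_mb})

def pact_offer(peer_pubkey, request, peer_volume_mb, tolerance=0.30):
    # Step 2: a WoT peer answers with kind 10056 only if volumes roughly match
    v = request["content"]["volume_mb"]
    if abs(peer_volume_mb - v) > tolerance * v:
        return None
    return make_event(10056, peer_pubkey,
                      {"volume_mb": peer_volume_mb, "re": request["id"]})

def form_pact(user_pubkey, offer):
    # Step 3: both sides exchange private kind 10053 pact events.
    # Step 4 (storing from the current checkpoint forward) starts after this.
    mine = make_event(10053, user_pubkey, {"partner": offer["pubkey"]})
    theirs = make_event(10053, offer["pubkey"], {"partner": user_pubkey})
    return mine, theirs
```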

Volume balancing

Peers are matched by data volume (+/- 30% tolerance) so the storage commitment is symmetric. If a partner's volume drifts beyond tolerance:

  1. Protocol flags the pairing as unbalanced
  2. Client waits random(0, 48h) before broadcasting (jittered delay prevents renegotiation storms when many users detect the same peer failure simultaneously). During the delay, standby pacts provide immediate failover. See ADR 008.
  3. User broadcasts kind 10055 for a replacement
  4. Negotiates new pact, migrates data
  5. Closes old pact only after new one is confirmed
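Under the stated numbers (30% tolerance, random(0, 48h) jitter), steps 1 and 2 reduce to two small checks. A sketch with hypothetical helper names:

```python
import random

TOLERANCE = 0.30          # +/- 30% volume tolerance
MAX_JITTER_S = 48 * 3600  # random(0, 48h) renegotiation delay

def is_unbalanced(my_volume_mb, partner_volume_mb, tolerance=TOLERANCE):
    """Step 1: flag the pairing once the partner's volume drifts beyond tolerance."""
    return abs(partner_volume_mb - my_volume_mb) > tolerance * my_volume_mb

def renegotiation_delay_s(rng=None):
    """Step 2: jittered delay before broadcasting the replacement kind 10055,
    so mass detection of one failed peer doesn't cause a renegotiation storm."""
    rng = rng or random.Random()
    return rng.uniform(0, MAX_JITTER_S)
```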

Data scope

For Light node pact partners, each pact covers events from the latest checkpoint onward — roughly a monthly window. Old data ages out of their pact obligations. Full node pact partners store complete event history for their peers. In both cases, the user's own devices hold full history, and archivists can opt into deeper storage via archival pacts.

Keepers (Full node pact partners) maintain 95% uptime. Witnesses (Light node pact partners) maintain 30% uptime. See Glossary.

Bootstrap pacts

New users have no WoT yet, so they cannot form reciprocal pacts. The first person they follow becomes a temporary storage peer. See ADR 006.

  • One-sided — the followed user stores the new user's data, no reciprocal obligation
  • Auto-expires after 90 days or when the new user reaches 10 reciprocal pacts
  • The followed user's client auto-accepts if capacity allows
  • Transition: as the user builds WoT, bootstrap pacts phase out and reciprocal pacts take over

Guardian pacts

An established user (Guardian) can volunteer to store data for one untrusted newcomer (Seedling) outside their Web of Trust. Guardian pacts complement bootstrap pacts — together they give newcomers two independent storage peers before forming any reciprocal pacts.

  • One slot per Guardian — voluntary opt-in, no WoT required
  • Kind 10053 with type: guardian tag
  • Expiry: 90 days or Seedling reaches Hybrid phase (5+ reciprocal pacts)
  • Challenge-response applies the same as any pact

| Phase | Storage peers | Source |
|-------|---------------|--------|
| Seedling joins (0 pacts) | 0 | - |
| Guardian pact forms | 1 | Guardian volunteer |
| First follow → bootstrap pact | 2 | Followed user |
| WoT builds → reciprocal pacts | 3+ | WoT peers |
| 5+ pacts (Hybrid) | 5+ | Guardian expires |
| 10+ pacts | 10+ | Bootstrap expires |

See ADR 006 and the protocol paper §2.4, §4.7.

Archival pacts

Standard pacts cover ~monthly windows. For long-term persistence, users can form archival pacts:

  • Cover full history or a specified deep range
  • Lower challenge frequency (weekly instead of daily)
  • For power users, archivists, and users running always-on nodes
  • Not mandatory — users without archival pacts are advised to run a persistent node

Standby pacts

Maintain 3 extra pacts in standby mode to eliminate rebalancing delays:

  • Standby peers receive events but aren't challenged or expected to serve
  • When an active pact drops, promote a standby immediately — no discovery delay
  • Backfill standby pool in the background

Proof of Storage

Challenge-response protocol via kind 10054. Two challenge modes. See ADR 006.

Hash challenge: Alice sends Bob "hash events [47..53] with this nonce." Bob computes hash from local copy. Proves possession.

Serve challenge: Alice sends Bob "give me the full event at position 47." Measures response latency. Consistently slow responses suggest the peer is fetching remotely.
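Both modes can be sketched in a few lines. Assumptions beyond the text: events are modeled as strings, the hash mode is SHA-256 over the nonce plus the event range, and the serve mode uses a simple median-latency heuristic:

```python
import hashlib

def hash_challenge_response(events, lo, hi, nonce):
    """Hash mode: hash events [lo..hi] together with the challenger's nonce.
    The nonce prevents precomputed answers, so this proves possession."""
    h = hashlib.sha256(nonce.encode())
    for ev in events[lo:hi + 1]:
        h.update(ev.encode())
    return h.hexdigest()

def verify_hash_challenge(expected_events, lo, hi, nonce, response):
    return hash_challenge_response(expected_events, lo, hi, nonce) == response

def serve_challenge_suspicious(latencies_ms, threshold_ms=500):
    """Serve mode: a consistently high (median) latency suggests the peer is
    fetching remotely rather than holding a local copy."""
    return sorted(latencies_ms)[len(latencies_ms) // 2] > threshold_ms
```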

Completeness verification: The checkpoint (kind 10051) includes a Merkle root of all events in the current window. Requesters compute the Merkle root from received events and compare against the checkpoint. Mismatch = events are missing. Light nodes additionally cross-verify the per-event hash chain for the most recent M events (default: 20) from each device — this catches checkpoint delegate censorship where a delegate omits sibling device events. See ADR 008.
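A sketch of the Merkle completeness check. Only "the checkpoint carries a Merkle root, the requester recomputes and compares" comes from the text; the exact tree layout (SHA-256, last node duplicated on odd levels) is an assumption:

```python
import hashlib

def _h(b):
    return hashlib.sha256(b).digest()

def merkle_root(event_ids):
    """Merkle root over the window's event IDs, as carried in the checkpoint."""
    level = [_h(eid.encode()) for eid in event_ids]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def window_complete(received_ids, checkpoint_root):
    """Requester recomputes the root from received events; mismatch = missing events."""
    return merkle_root(received_ids) == checkpoint_root
```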

Reliability scoring: Clients track a rolling 30-day reliability score per peer:

| Score | Status | Action |
|-------|--------|--------|
| 90%+ | Healthy | No action |
| 70–90% | Degraded | Increase challenge frequency |
| 50–70% | Unreliable | Begin replacement |
| < 50% | Failed | Drop immediately |

Failure handling:

  1. First failure → retry (network issues)
  2. Second failure → ask other storage peers for same data
  3. If others have it → failing peer's reliability score drops
  4. Score below threshold → begin replacement negotiation
  5. Natural consequence: failing peer loses reciprocal storage
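The score table maps to a small classifier (whether exactly 90% counts as Healthy is an assumption, since the table's ranges share boundaries):

```python
def reliability_status(score):
    """Map a rolling 30-day challenge success rate (0.0-1.0) to the score table."""
    if score >= 0.90:
        return "healthy"      # no action
    if score >= 0.70:
        return "degraded"     # increase challenge frequency
    if score >= 0.50:
        return "unreliable"   # begin replacement
    return "failed"           # drop immediately
```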

Bounded timestamps: Clients, relays, and storage peers reject events with created_at more than 15 minutes in the future. Events backdated more than 1 hour from the last known event from the same device are flagged. For replaceable event merge tiebreakers, timestamps within 60 seconds use lexicographic ordering of event ID (deterministic, non-gameable) instead of later-timestamp-wins.
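A sketch of the bounded-timestamp rules, with hypothetical helper names. Which end of the lexicographic ID ordering wins the tiebreak is an assumption; the text only requires that the ordering be deterministic:

```python
MAX_FUTURE_S = 15 * 60      # reject events > 15 min in the future
MAX_BACKDATE_S = 60 * 60    # flag events backdated > 1 h vs. last known from device
TIEBREAK_WINDOW_S = 60      # within 60 s, use event-ID ordering, not timestamps

def accept_timestamp(created_at, now, last_seen_from_device=None):
    """Classify an incoming event's created_at as reject / flag / accept."""
    if created_at > now + MAX_FUTURE_S:
        return "reject"
    if (last_seen_from_device is not None
            and created_at < last_seen_from_device - MAX_BACKDATE_S):
        return "flag"
    return "accept"

def merge_winner(ev_a, ev_b):
    """Replaceable-event merge: within the window, the lexicographically smaller
    event ID wins (an assumption; any fixed ID order works); otherwise
    later-timestamp-wins."""
    if abs(ev_a["created_at"] - ev_b["created_at"]) <= TIEBREAK_WINDOW_S:
        return ev_a if ev_a["id"] < ev_b["id"] else ev_b
    return ev_a if ev_a["created_at"] > ev_b["created_at"] else ev_b
```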

Event Retrieval

See Data Flow for the full flow diagrams.

Delivery paths (priority order):

  0. Tier 0 — BLE mesh — nearby devices serve events via Bluetooth. No internet required. Interoperable with bitchat. See ADR 010.
  1. Tier 1 — Cached endpoints — follower has storage peer endpoints cached from kind 10059. Direct connection, zero broadcast overhead.
  2. Tier 2 — Gossip — send kind 10057 to directly connected peers. Each peer forwards if they can't respond (TTL=3, reaches ~8,000 nodes in a 20-peer network).
  3. Tier 3 — Storage peers via DVM — traditional kind 10057 broadcast through relay. Relay-dependent fallback.
  4. Tier 4 — Relays — traditional relay query as last resort.

All paths produce self-authenticating events (signed by author's keys). Source doesn't matter — signatures prove authenticity.
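The tier cascade plus signature check can be sketched as follows (the tier names and the fetcher-callback shape are assumptions; the point is that the first tier returning a verifiable event wins):

```python
TIERS = ["ble_mesh", "cached_endpoints", "gossip", "storage_peers_dvm", "relays"]

def fetch(event_filter, fetchers, verify_sig):
    """Try each delivery tier in priority order; accept the first result whose
    signature verifies. The source doesn't matter, the signature does."""
    for tier in TIERS:
        fn = fetchers.get(tier)
        if fn is None:
            continue  # tier unavailable (e.g. no BLE peers nearby)
        ev = fn(event_filter)
        if ev is not None and verify_sig(ev):
            return tier, ev
    return None, None
```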

Gossip Hardening

Gossip forwarding (kind 10055, 10057) is hardened against amplification attacks. See ADR 008.

Per-hop rate limiting: Each node enforces a maximum request rate per source pubkey:

  • Kind 10055 (pact request): 10 req/s per source
  • Kind 10057 (data request): 50 req/s per source
  • Excess requests are dropped silently

Request deduplication: Each request carries a request_id tag. Nodes track seen request_ids in an LRU cache (10,000 entries). Duplicate requests are dropped.
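The rate-limiting and deduplication rules combine into one admission gate. A sketch; the sliding one-second window and the FIFO-style LRU eviction are implementation assumptions:

```python
import time
from collections import OrderedDict, deque

RATE_LIMITS = {10055: 10, 10057: 50}   # requests/second per source pubkey

class GossipGate:
    """Per-hop rate limiting plus request_id dedup (LRU of 10,000 entries)."""

    def __init__(self, lru_size=10_000):
        self.seen = OrderedDict()   # request_id -> None, in insertion order
        self.lru_size = lru_size
        self.windows = {}           # (pubkey, kind) -> deque of timestamps

    def allow(self, pubkey, kind, request_id, now=None):
        now = time.monotonic() if now is None else now
        # Dedup: drop requests whose request_id we've already seen
        if request_id in self.seen:
            return False
        self.seen[request_id] = None
        if len(self.seen) > self.lru_size:
            self.seen.popitem(last=False)   # evict oldest entry
        # Sliding 1-second window per (source pubkey, kind)
        win = self.windows.setdefault((pubkey, kind), deque())
        while win and now - win[0] >= 1.0:
            win.popleft()
        if len(win) >= RATE_LIMITS.get(kind, 10):
            return False                    # excess dropped silently
        win.append(now)
        return True
```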

WoT-only forwarding: Nodes only forward gossip from pubkeys within their 2-hop WoT. Requests from unknown pubkeys are served locally (if possible) but never forwarded. This bounds the gossip blast radius to the WoT graph.

Gossip topology exposure: Gossip forwarding reveals the network graph to observers at multiple nodes. The WoT-only forwarding rule limits exposure to WoT members. Relays can offer onion routing as an optional service — gossip requests wrapped in NIP-44 encrypted layers per hop, hiding the request path from intermediate nodes. Users subscribe to onion routing via Lightning zaps. See relay Lightning services.

Privacy

  • Pacts are private — kind 10053 exchanged directly, never published
  • Topology is hidden — no public list of who stores whose data
  • Retrieval is per-request — storage peers reveal themselves only to individual requesters via kind 10058
  • Peers can filter — respond only to WoT members, or not at all
  • Blinded data requests — kind 10057 uses H(target_pubkey || daily_salt) instead of raw pubkey. Observers can't identify whose data is being requested or link requests across days. Storage peers match against both today and yesterday's date to handle clock skew at day boundaries (dual-day blind matching). See ADR 008.
  • DM integrity — NIP-44 uses AEAD (authenticated encryption). Storage peers hold encrypted DM blobs but cannot serve corrupted ciphertext — tampered ciphertext fails decryption with an authentication error. The existing challenge-response proves possession; AEAD proves integrity.
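The blinded-request scheme with dual-day matching can be sketched as follows. Using the ISO date string as the daily salt is an assumption; the text only specifies H(target_pubkey || daily_salt):

```python
import datetime
import hashlib

def _blind(pubkey, day):
    """H(target_pubkey || daily_salt); the ISO date as salt is an assumption."""
    return hashlib.sha256((pubkey + day.isoformat()).encode()).hexdigest()

def blinded_request(target_pubkey, today):
    # Requester puts the blinded tag in kind 10057 instead of the raw pubkey
    return _blind(target_pubkey, today)

def matches(stored_pubkey, blinded, today):
    """Storage peer matches against today AND yesterday to absorb clock skew
    at day boundaries (dual-day blind matching)."""
    yesterday = today - datetime.timedelta(days=1)
    return blinded in (_blind(stored_pubkey, today),
                       _blind(stored_pubkey, yesterday))
```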

Peer Selection

Client-side rules for choosing storage peers. See ADR 006.

WoT cluster diversity: Maximum 3 peers from any single social cluster. At least 4 distinct clusters across 20 peers. Prevents eclipse attacks where an attacker becomes a majority of your storage peers.

Geographic diversity: Target 3+ timezone bands. Never more than 50% of peers in the same ±3 hour band. Protects against correlated regional failures.

Peer reputation: Weight offers by identity age, challenge success rate, and active pact count. Identities < 30 days old are limited to bootstrap pacts.

Popularity scaling: Scale pact count with follower count (< 100 followers → 10 pacts, 1,000+ → 30 pacts, 10,000+ → 40+). More pacts = more peers sharing serving load.

Offer filtering: Drop offers from non-WoT pubkeys, identities < 30 days old, or volume mismatch > 50%.
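The offer-filtering and cluster-diversity rules above, sketched in Python (the offer and peer field names are hypothetical):

```python
def filter_offers(offers, my_volume_mb, wot, today_day):
    """Drop offers from non-WoT pubkeys, identities < 30 days old,
    or with volume mismatch > 50%."""
    kept = []
    for o in offers:
        if o["pubkey"] not in wot:
            continue
        if today_day - o["identity_created_day"] < 30:
            continue
        if abs(o["volume_mb"] - my_volume_mb) > 0.5 * my_volume_mb:
            continue
        kept.append(o)
    return kept

def cluster_ok(selected_peers, candidate_cluster, max_per_cluster=3):
    """WoT cluster diversity: at most 3 peers from any single social cluster."""
    counts = {}
    for p in selected_peers:
        counts[p["cluster"]] = counts.get(p["cluster"], 0) + 1
    return counts.get(candidate_cluster, 0) < max_per_cluster
```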

Mobile Considerations

Mobile devices participate when possible but aren't expected to be always-on storage servers. A user's desktop, home server, or cloud VPS handles primary storage obligations. If all a user's devices are offline, storage peers serve their data.

Relay Role

Relays continue to exist but shift from required infrastructure to optional accelerators:

  • Clients try storage peers and relays opportunistically
  • Relays provide faster delivery when available (always-on, indexed)
  • No single relay failure can make a user's data unavailable
  • Gradual migration — relays work alongside storage peers

Three-Phase Adoption Model

Client behavior adapts automatically based on the user's pact count:

| Phase | Pact count | Behavior |
|-------|------------|----------|
| Bootstrap | 0–5 pacts | Publish to relays primarily. Form pacts as available. |
| Hybrid | 5–15 pacts | Publish to both relays and storage peers. Fetch from peers first, relay fallback. |
| Sovereign | 15+ pacts | Storage peers primary. Relays optional accelerator. |

Transition is automatic and per-user — no network-wide flag. Early adopters stay in bootstrap phase longer. As the network grows, users transition naturally.

No protocol changes — this is client-side logic controlling which delivery path to prioritize.
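That client-side logic is essentially a three-way threshold on pact count. Whether a count of exactly 5 or 15 falls into the higher phase is an assumption, since the table's ranges overlap at the boundaries:

```python
def adoption_phase(pact_count):
    """Derive the client's phase from its pact count; no network-wide flag."""
    if pact_count >= 15:
        return "sovereign"   # storage peers primary, relays optional accelerator
    if pact_count >= 5:
        return "hybrid"      # publish to both, fetch peers-first
    return "bootstrap"       # publish to relays primarily
```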

Cascading Read-Caches

Popular accounts would overwhelm their storage peers (10–40+, depending on follower count) with serving load. Read-caches solve this by turning followers into voluntary data mirrors.

  1. Bob fetches Alice's events from storage peer S1
  2. Bob now has a local copy
  3. Carol broadcasts kind 10057 for Alice's data
  4. Bob's client sees the request and responds (he has Alice's events)
  5. Carol verifies Alice's signatures — source doesn't matter

Properties:

  • No pact needed — Bob serves cached data he already has
  • Self-authenticating events — signatures prove integrity regardless of source
  • Popular data naturally replicates across the follower base
  • Storage peers handle first wave; followers' caches handle the tail
  • Load scales with O(followers), not O(storage_peers)

Client configuration:

  • read_cache_enabled — whether to serve cached data for followed users (default: true)
  • read_cache_max_mb — storage limit for cached data (default: 100MB)
  • read_cache_respond_to — who to serve: wot_only (default) or anyone

No new event kinds — uses existing kind 10057/10058 flow.
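A sketch of how a client might apply these settings when a kind 10057 request arrives. The field names and the FIFO cache eviction are assumptions; only the three config keys and their defaults come from the text:

```python
DEFAULTS = {
    "read_cache_enabled": True,
    "read_cache_max_mb": 100,
    "read_cache_respond_to": "wot_only",   # or "anyone"
}

def should_serve(config, requester, wot, cached_event):
    """Decide whether to answer a kind 10057 request from the read cache."""
    cfg = {**DEFAULTS, **config}
    if not cfg["read_cache_enabled"]:
        return False
    if cfg["read_cache_respond_to"] == "wot_only" and requester not in wot:
        return False
    return cached_event is not None   # serve only what we already hold

def cache_put(cache, cfg, event_id, size_mb):
    """Insert an entry, evicting oldest-first to stay under read_cache_max_mb
    (FIFO eviction is an assumption)."""
    cache[event_id] = size_mb
    while sum(cache.values()) > cfg.get("read_cache_max_mb", 100):
        cache.pop(next(iter(cache)))
    return cache
```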

Incentive Model

Storage contribution translates directly to content reach through pact-aware gossip routing. No explicit scores, tokens, or subscriptions — the network topology IS the incentive. See ADR 009.

Pact-Aware Gossip Priority

When a node forwards gossip or decides what content to propagate, it applies priority ordering:

  1. Active pact partners — highest priority. You store their data, you forward their content eagerly. Self-interested: if their content is discoverable, your storage pact has value (others can find the data you're holding).
  2. WoT contacts (1-hop follows) — standard priority.
  3. Extended WoT (2-hop) — lower priority, forwarded if capacity allows.
  4. Unknown pubkeys — served locally if available, never forwarded (existing gossip hardening rule).

A user with 20 reliable pact partners has 20 nodes that eagerly forward their content. A user with 5 flaky pacts has fewer advocates.
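The priority ordering reduces to a classification plus a sort. A sketch with hypothetical names:

```python
PRIORITY = {"pact_partner": 0, "wot_1hop": 1, "wot_2hop": 2, "unknown": 3}

def classify(pubkey, pact_partners, follows, follows_of_follows):
    """Bucket a content author relative to this node's pacts and WoT."""
    if pubkey in pact_partners:
        return "pact_partner"
    if pubkey in follows:
        return "wot_1hop"
    if pubkey in follows_of_follows:
        return "wot_2hop"
    return "unknown"

def forward_order(pubkeys, pact_partners, follows, follows_of_follows):
    """Order content for forwarding: pact partners first, then 1-hop, then
    2-hop. Unknown pubkeys are excluded: served locally if available, never
    forwarded (the existing gossip hardening rule)."""
    known = [p for p in pubkeys
             if classify(p, pact_partners, follows, follows_of_follows) != "unknown"]
    return sorted(known, key=lambda p: PRIORITY[
        classify(p, pact_partners, follows, follows_of_follows)])
```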

Natural Consequences

  • Dropped pact — the dropped peer loses a forwarding advocate. Their content distribution naturally shrinks.
  • Reliable storage — maintained pacts = maintained reach. The incentive to store reliably is self-interested.
  • No cliff — there's no minimum contribution to participate. Less contribution means less reach, not exclusion. The network degrades gracefully.