This is the nineteenth post in an ongoing series describing new and upcoming privacy features in iBrowe. This post highlights work by Alex Davidson, Peter Snyder, eV Quirk, Joseph Genereux, Benjamin Livshits, and Hamed Haddadi, and was written by Senior Director of Privacy Peter Snyder.
📋 Summary
iBrowe researchers have built STAR—a cryptographically secure telemetry framework that only reveals data when multiple users submit identical values. 🔒 STAR makes k-anonymity practical for organizations of any size, avoiding trusted third parties or special hardware. It enables Brave’s Web Discovery and Privacy-Preserving Product Analytics (P3A) to collect “heavy hitter” metrics while ensuring individual submissions remain private. STAR is open-source (Rust + WASM) under MPLv2, will be proposed for IETF standardization, and ships in upcoming iBrowe releases.
🔍 1. The Need for Privacy-Preserving Telemetry
1.1 Why Telemetry Matters
Collecting usage data—crash reports, feature adoption, performance metrics—helps developers identify bugs, optimize performance, and prioritize new features. 📈 Users benefit from more stable, faster, and more useful software when developers understand real-world usage patterns.
1.2 Privacy Risks of Traditional Analytics
Conventional analytics systems often record unique identifiers, timestamps, or device metadata that can deanonymize participants. 🕵️ Even pseudonymous IDs can link activities across sessions, risking user privacy. The privacy-preserving alternatives used by large vendors either rely on trusted intermediaries (e.g., third-party servers) or on special hardware enclaves (e.g., SGX), which makes them expensive and inaccessible for smaller projects.
🛡️ 2. k-Anonymity as a Privacy Foundation
2.1 k-Anonymity Principles
At its core, k-anonymity ensures that any reported value cannot be traced back to fewer than k users. 🧩 In other words, a data collector learns a value only if at least k distinct users submitted that same value. Unique or rare values (e.g., a one-off error code) remain hidden, preserving individual privacy.
2.2 The Ice-Cream Survey Analogy
- Suppose a company polls employees about their favorite ice-cream flavor. 🍦
- Common flavors (chocolate, vanilla, strawberry) appear at least k times, so they’re visible.
- A unique flavor (e.g., “olive-oil gelato”) appears once; revealing it would identify that employee.
- With k-anonymity, “olive-oil gelato” is suppressed, and only common flavors are counted.
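The survey can be sketched in a few lines of Python. This is only an illustration of the thresholding rule: the function name `k_anonymous_counts` and the sample responses are made up for this example, and the counter here sees every raw answer, so it stands in for a fully trusted party.

```python
from collections import Counter


def k_anonymous_counts(responses, k):
    """Return counts only for values reported by at least k respondents."""
    counts = Counter(responses)
    return {value: n for value, n in counts.items() if n >= k}


# Hypothetical survey: three common flavors and one unique answer.
survey = (
    ["chocolate"] * 4
    + ["vanilla"] * 3
    + ["strawberry"] * 3
    + ["olive-oil gelato"]
)

visible = k_anonymous_counts(survey, k=3)
print(visible)  # {'chocolate': 4, 'vanilla': 3, 'strawberry': 3}
```

The unique flavor never appears in the output, but only because the party doing the counting chose to drop it.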
⚙️ 3. The Challenge of Real-World k-Anonymity
3.1 Why k-Anonymity Is Hard to Deploy
- Who Counts First? Counting submissions to identify common vs. rare values requires trusting a party or hardware that sees raw inputs—defeating privacy.
- Scale Requirements: Many prior systems need millions of users before any value crosses the k threshold, making them unusable for smaller communities.
- Trusted Third Parties: Some solutions insert a neutral server that aggregates counts—this simply shifts trust.
3.2 STAR’s Innovations
STAR solves these problems by combining well-understood cryptographic building blocks without introducing new trust domains or specialized hardware. Its main components are:
- Symmetric Encryption: Users encrypt their values under a shared key.
- Shamir Secret Sharing: Decryption only succeeds if at least k users submit the same encrypted value.
- Verifiable Oblivious Pseudorandom Functions (vOPRFs): These ensure encrypted values remain unpredictable and can’t be guessed in advance.
🔒 4. How STAR Works: Step-by-Step
4.1 System Roles and Setup
- Clients: Each user runs a STAR client library (Rust or WASM) inside the browser or application.
- Collector Servers: A small cluster of servers (e.g., three nodes) holds secret-shared keys. No single node can decrypt any submission alone.
- Parameters:
  - k: The anonymity threshold (e.g., k = 10).
  - n: The number of shares the encryption key is split into via Shamir Secret Sharing (each share is stored on a different server).
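For readers unfamiliar with Shamir Secret Sharing, here is a minimal, generic sketch of the primitive in Python. It is not STAR's implementation: the field prime, the function names `split_secret` and `reconstruct_secret`, and the 2-of-3 parameters are assumptions chosen for illustration. The idea is that the secret is the constant term of a random polynomial, each share is one point on that polynomial, and any `threshold` points recover the secret by interpolation.

```python
import secrets

# Assumption for this sketch: work in the prime field of size 2**127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` points; any `threshold` recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be a field element")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def reconstruct_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = secrets.randbelow(PRIME)
shares = split_secret(key, threshold=2, num_shares=3)
print(reconstruct_secret(shares[:2]) == key)  # True
print(reconstruct_secret(shares[1:]) == key)  # True
```

With fewer than `threshold` shares, every candidate secret is equally consistent with the points held, which is what makes the threshold a hard cryptographic limit rather than a policy setting.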
4.2 Client Submission Flow
- Value Encoding: User’s data (e.g., crash code, feature flag, visited URL fingerprint) is hashed via a vOPRF to produce a pseudorandom token. 🎲
- Symmetric Encryption: That token is encrypted under the shared key—creating a ciphertext C.
- Upload: The client sends C to the collector cluster along with a zero-knowledge proof (ZKP) confirming correct vOPRF usage (preventing malicious clients from gaming the system).
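The property the value-encoding step relies on is determinism: two clients holding the same value must end up with the same token, while the token itself should be infeasible to guess without the PRF key. The sketch below illustrates only that property, using HMAC-SHA256 from Python's standard library as a stand-in. This is a loud simplification: HMAC is neither oblivious nor verifiable, the key `SERVER_PRF_KEY` and the function `derive_token` are hypothetical, and in a real vOPRF exchange the client would never hold the key and the server would never see the value.

```python
import hashlib
import hmac

# Hypothetical PRF key for demonstration only. In a real vOPRF the key stays
# on the server and the client's input is blinded before it is sent.
SERVER_PRF_KEY = b"demo-only-prf-key"


def derive_token(value: str) -> bytes:
    """Map a telemetry value to a 32-byte pseudorandom token (HMAC stand-in)."""
    return hmac.new(SERVER_PRF_KEY, value.encode("utf-8"), hashlib.sha256).digest()


token_a = derive_token("dark mode")          # client A
token_b = derive_token("dark mode")          # client B, same value
token_c = derive_token("experimental feature X")

print(token_a == token_b)  # True: identical values yield identical tokens
print(token_a == token_c)  # False: different values yield unrelated tokens
```

Because equal values map to equal tokens, the collector can match submissions without ever seeing the underlying value.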
4.3 Server-Side Shamir Reconstruction
- Shard Storage: Each collector server stores all received ciphertexts but cannot decrypt individually.
- Threshold Trigger: Once k identical ciphertexts C have been received (i.e., k clients submitted the same token), servers run Shamir reconstruction to combine their key shares and derive the symmetric key.
- Decrypt & Release: The decrypted token is revealed to the collector, signaling a “heavy hitter.” The value is recorded once, and every other matching ciphertext maps to that same value.
- Privacy Guarantee: If fewer than k users submit the same token, decryption never occurs—those values remain hidden.
4.4 Example: Detecting a Top Feature Usage
- Users A–I: Nine users enable “dark mode”; each generates token t and encrypts it to C.
- User J: A tenth user also submits C. The store now holds k = 10 identical ciphertexts.
- Servers: Combine secret shares → decrypt C → reveal token t.
- Collector: Learns that “dark mode” was enabled by at least 10 users.
- User K: Submits “experimental feature X” once. Only one C′ is in the store, so it is never decrypted. There is no risk of correlating “X” to any user.
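The bookkeeping in this example can be simulated in Python. The sketch models only the counting logic: the class `ThresholdCollector` and the byte strings standing in for C and C′ are hypothetical, and no real encryption takes place. In the protocol described above, release is gated by the cryptography itself rather than by an `if` statement the collector could skip.

```python
from collections import defaultdict


class ThresholdCollector:
    """Counts opaque submissions and reports which have reached k copies."""

    def __init__(self, k: int):
        self.k = k
        self._counts = defaultdict(int)  # opaque ciphertext -> copies seen

    def submit(self, ciphertext: bytes) -> bool:
        """Record one submission; return True if it is now at or above k."""
        self._counts[ciphertext] += 1
        return self._counts[ciphertext] >= self.k

    def releasable(self) -> dict:
        """Ciphertexts that have crossed the threshold, with their counts."""
        return {c: n for c, n in self._counts.items() if n >= self.k}


collector = ThresholdCollector(k=10)
C, C_PRIME = b"ciphertext-for-dark-mode", b"ciphertext-for-feature-x"

first_nine = [collector.submit(C) for _ in range(9)]  # users A through I
tenth = collector.submit(C)                            # user J
lone = collector.submit(C_PRIME)                       # user K

print(any(first_nine), tenth, lone)  # False True False
print(collector.releasable())        # {b'ciphertext-for-dark-mode': 10}
```

Nothing is releasable after nine submissions, the tenth crosses the threshold, and the single C′ stays below it indefinitely.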
🎯 5. STAR’s Key Advantages
5.1 Cost-Effective & Scalable
- Low Overhead: STAR’s cryptographic operations (symmetric encryption, vOPRF, Shamir reconstructions) run efficiently on standard cloud or personal servers—no SGX or HSM needed. ⚙️
- Small to Large Deployments: Works for projects with a few dozen users (e.g., internal beta tests) up to millions (public telemetry). Unlike existing approaches that require millions of users, STAR yields accurate “heavy hitter” counts even at small scales.
5.2 Strong Privacy Guarantees
- k-Anonymity by Design: Unique or rare values never appear in plaintext. Even if all but one of the servers collude, they cannot decrypt values below the threshold. 🛡️
- Post-Compromise Safety: If a collector node is compromised after decryption, it only sees aggregates of k or more. Individual contributions remain inaccessible.
5.3 Auditable & Open
- Familiar Primitives: STAR uses well-known cryptography (AES-GCM, Shamir, OPRFs), making audits and verification simpler. 🔍
- Rust + WASM Libraries: Reference implementations published under MPLv2, allowing easy integration in browsers, apps, or server ecosystems.
- Community Collaboration: Open GitHub repositories let other projects adopt, inspect, or improve STAR.
🌱 6. Real-World Applications & Commitments
6.1 iBrowe’s Use Cases
- Web Discovery Project: Users opt in to share anonymized browsing data (e.g., frequently visited domains) to improve Brave Search’s index. STAR ensures only popular domains (≥ k users) are learned—rare or unique domains remain private. 🔍
- Privacy-Preserving Product Analytics (P3A): iBrowe collects metrics like feature adoption rates and crash patterns. STAR ensures that error codes or feature flags only surface if used by multiple users, preserving user anonymity.
6.2 User Control & Transparency
- Opt-In by Default: Users who agree to share data with iBrowe can see STAR in action, with full transparency about which values are collected and how k-anonymity is enforced.
- Opt-Out Always Available: Users can disable telemetry entirely—STAR never forces data collection. ❌
- Open Source: Rust and WASM code, test vectors, and integration guides are public. Developers can inspect end-to-end flow and verify cryptographic soundness.
6.3 Standardization Efforts
- IETF Privacy Preserving Measurement (PPM) WG: STAR is being proposed as a reference design for standardized k-anonymity telemetry. 📜
- Interoperability Goals: Aim to align with RAPPOR, Prio, and other PCC schemes so organizations can mix and match vetted tools.
🔮 7. Conclusion
STAR brings robust k-anonymity to telemetry, removing the trade-off between privacy and actionable analytics. By combining symmetric encryption, Shamir secret sharing, and verifiable OPRFs, STAR ensures that only values submitted by at least k users become visible—rare submissions stay private. Open-source, cost-effective, and scalable, STAR powers iBrowe’s Web Discovery and P3A while preserving user anonymity. As STAR moves toward IETF standardization, it sets a new bar for privacy-preserving data collection in browsers and beyond.