Private Set Intersection (OPRF-based)

Two parties each hold a private set of identifiers and want to learn which elements they share without exposing the rest. This variant uses Oblivious Pseudorandom Functions built on OT extensions: each party acts as the OPRF evaluator for the other's inputs, producing pseudorandom tags that match only when the underlying elements match. Cuckoo hashing compresses communication to O(n + m) with small constants, so the protocol scales to millions of elements where the DH-based variant becomes impractical.

Components

OPRF construction (RFC 9497 or batched-OPRF) and an OT extension library (IKNP, Silent OT, or similar).
Cuckoo hash tables with an agreed number of hash functions and stash slots, used by both parties to pack their inputs.
Authenticated bilateral channel for OT base setup, OPRF evaluation, and tag exchange.
Local matcher on each side that compares received OPRF tags against locally-computed tags to surface the intersection.

Protocol

counterparty Parties A and B agree OPRF parameters, cuckoo-hash functions, and tag length.
counterparty Each party inserts its elements into a cuckoo hash table, producing fixed-size tables.
counterparty A holds OPRF key k_A; B sends its table entries through an OT-based OPRF protocol and receives PRF(k_A, y) for each y without A learning y.
counterparty Roles reverse: B holds OPRF key k_B; A sends its table entries and receives PRF(k_B, x) for each x without B learning x.
counterparty A locally computes PRF(k_A, x) for its own elements and sends these tags to B. B does the same with PRF(k_B, y) and sends its tags to A.
counterparty Each party compares the tags it computed via the OPRF step against the tags the other party sent, and maps equal tags back to its own plaintext elements.

Guarantees & threat model

Guarantees:

Input privacy: non-intersecting elements are computationally hidden under OT hardness and PRF security.
Bilateral: runs directly between two parties over an authenticated channel.
Completeness: all shared elements are found, assuming cuckoo insertion succeeds (negligible failure probability with stash slots).
Soundness: zero false positives under a collision-resistant PRF.
Scalability: O(n + m) communication with small constants; amortises well as sets grow.

Threat model:

OT and PRF security; in particular, the base OT must remain secure against the network adversary.
Honest cuckoo construction by both parties. A malicious party can skew table layout to probe specific identifiers; commit-and-prove variants mitigate.
Authenticated transport between the parties.
Set-size hygiene: singleton or near-singleton submissions turn the protocol into a yes/no oracle on the submitted element.

Trade-offs

More moving parts than the DH-based variant: an OT extension library and cuckoo-hashing machinery are required.
Larger per-element constant factor than DH-based due to OT base setup, but linear communication amortises for large sets.
Cuckoo hashing has a small insertion-failure probability. Stash slots or larger tables mitigate at minor cost.
Equality-only: aggregates over the intersection require the circuit-based variant.

Example

Two custodians hold flagged-address lists under different jurisdictions. Custodian A holds 500k addresses and Custodian B holds 400k. Both run the bilateral OPRF intersection protocol. They discover 8,200 addresses appearing on both lists; the remaining 491,800 and 391,800 non-overlapping entries are not disclosed. Matched addresses feed into a joint investigation workflow.

Private Set Intersection (OPRF-based)

Works best when

Avoid when

I2I vs I2U — context differences

Institution to institution

Institution to end user

Post-quantum exposure