Most social networks are a single database with an app in front of it: usernames, posts, ranking algorithms, moderation rules, and HTML all live inside one company. Building on top means relying on the vendor's rate-limited API, which can change or disappear without notice.

The AT Protocol (ATproto) divides that monolith into parts that different people can run, implement in different languages, and swap out without losing accounts or posts. Bluesky is its biggest app, but for developers the protocol is more interesting than the app. This article explains ATproto from a developer's perspective: roles, data formats, trade-offs, and integration points.

The closest analogy is email: an address on your own domain points to a mailbox at a provider, and you keep the address when you change providers by updating an MX record. ATproto generalizes this: a handle is a domain name, the "mailbox" is a signed repository of social actions, and various applications can access it. Federation lives in DNS and signatures, not in a specific vendor's API.

## Why ATproto exists

Earlier decentralized protocols handled federation at the server level. In ActivityPub, a server such as a Mastodon instance owns user accounts, stores posts, and federates through signed HTTP messages. The server is the unit of decentralization, and switching servers resets your identity, followers, and history.

ATproto separates the identity, data, indexing, and application layers so that each can move independently. Its design goals are concrete and developer-focused:

* Account portability without data loss when a user changes hosts.
* A global stream of network events accessible to any application.
* Multiple apps competing over the same content, so users can switch between them.
* Cheap operation for small hosts, since hosts only store their own users' data.

The team behind it published the rationale in an academic paper, [Bluesky and the AT Protocol: Usable Decentralized Social Media][atproto-paper], if you want the long form.

## A brief history

Bluesky started in 2019 inside Twitter, funded by Jack Dorsey to develop decentralized social media standards. It spun out in 2021 as an independent company led by Jay Graber. The team revealed the AT Protocol in October 2022. The Bluesky app launched in an invite-only beta in February 2023 and became publicly available in February 2024. Open federation to third-party PDSes followed. Growth sped up through 2024 and 2025, mainly as users left Twitter/X. The IETF chartered the Authenticated Transfer working group in early 2026, starting a multi-year shift from Bluesky Social's implementation to independent specs.

That arc explains why ATproto's centralization risk still traces back to one company.

**Bluesky, Inc.** is a public benefit corporation that oversees the protocol's reference implementations: the default Bluesky AppView (`bluesky.social`), a major relay carrying firehose traffic, the Ozone labeler, and the default Personal Data Server users get when they sign up. Backed by venture capital, including Jack Dorsey, and led by CEO Jay Graber, Bluesky controls most of the infrastructure users rely on, even though the AT Protocol is open and being standardized at the IETF. Anyone can run their own PDS, relay, AppView, or labeler, but switching away from Bluesky's defaults takes effort. That is the centralization risk: the architecture is decentralized, but user control isn't automatic.

## The four moving parts

ATproto divides the monolith into four roles. Hold these in your head, and the rest of the protocol makes sense.

```mermaid
graph TB
    PDS["Personal Data Server (PDS)<br/>hosts a user's repo + blobs"]
    Relay["Relay<br/>aggregates repos into the firehose"]
    AppView["AppView<br/>indexes the firehose, serves the app"]
    Client["Client<br/>web, mobile, or another service"]

    Client -->|reads / writes via PDS| PDS
    PDS -->|publishes commit events| Relay
    Relay -->|firehose websocket| AppView
    Client -->|queries| AppView

    style PDS fill:#e1f5fe
    style Relay fill:#f3e5f5
    style AppView fill:#e8f5e8
    style Client fill:#fff3e0
```

**Personal Data Server (PDS):** Holds a user's data: their signed repository of posts, follows, likes, and other records, plus blobs like images and videos. Clients write to the PDS, which broadcasts changes to relays.

**Relay:** Subscribes to many PDSes, follows their commit streams, and republishes all changes as a single global stream called the **firehose**. It's a fan-in that becomes a fan-out.

**AppView:** Subscribes to the firehose, builds indexes (timelines, follower graphs, search), and serves queries. Bluesky, Frontpage, Tangled, and Leaflet are all AppViews; they read from the same repositories but produce different products.

**Client:** A web app, mobile app, bot, or any other consumer. Writes go through the PDS, and reads go through an AppView. Users don't contact a relay directly.

Each role has a specific task: the PDS manages identity and storage, the relay handles aggregation, and the AppView owns product, moderation, and ranking. Anyone can operate any role, and any role can be swapped without disrupting the others. "Decentralized" means no part is restricted to one operator.

## The wire format

Services communicate over HTTP using [XRPC][xrpc-spec], a thin RPC convention with JSON requests and responses. Data that is signed or stored is encoded as [DAG-CBOR][data-model], a deterministic binary format with content-addressed references. Records link to each other by content hash, so anyone can verify the data independently.

That is the full protocol surface for an application developer: JSON over HTTP for live queries, signed CBOR records in repositories for canonical data, and a WebSocket stream for change events.

## Understanding the pieces

## Identity: handles and DIDs

Every account has two identifiers, each for a different audience.

A **handle** is a human-readable domain name like `alice.example.com`. Handles are mutable and can be changed at any time. You prove control by adding an `_atproto` DNS TXT record or by serving a file at a well-known HTTP path (`/.well-known/atproto-did`). If your handle is your own domain, your account persists even if services shut down, because verification relies on the internet's naming system.

To verify a handle, look up its DNS TXT record:

```bash
dig TXT _atproto.jeffbailey.us
```

The response should contain a TXT record pointing to the user's DID, like:

```
_atproto.jeffbailey.us. 300 IN TXT "did=did:plc:ugdk4xz7im2lhb6lad6xpvfe"
```

A **DID** (Decentralized Identifier) is a stable internal identity, like `did:plc:ewvi7nxzyoun6zhxrhs64oiz`, that never changes. It resolves to a DID document with the user's public keys, handle, and PDS URL. Software always references the DID, not the handle, so the social graph survives handle changes.

In practice, ATproto supports two DID methods: `did:plc` and `did:web`. The `did:web` method fetches a JSON document from a well-known path on a domain you control, while `did:plc` uses a centralized directory run by Bluesky Social. This centralization draws criticism of ATproto's decentralization claims, but the company plans to spin out the PLC directory into an independent Swiss organization to remove a single point of failure (see the [PLC directory transition announcement][plc-directory]).

Resolve a handle to a DID once, then key everything off that DID.

## Repositories: a user's signed Merkle tree

Every user has exactly one **repository**, owned and signed by their key. A repo is a collection of records organized into lexicon-defined namespaces (`app.bsky.feed.post`, `app.bsky.graph.follow`, `app.bsky.feed.like`, and so on). Each record has a unique key, usually a TID (timestamp identifier) derived from its creation time.

A Merkle Search Tree stores the records and offers two practical properties:

* The root hash summarizes the repository state, so two services can compare it to check whether they're in sync.
* You can produce a compact cryptographic proof that a record is or isn't in a repo at a specific revision, without shipping the whole repo.

A user can export their repo as a signed `.car` file (Content Addressable Archive) and import it into a different PDS. The new PDS can verify the user's signature on the repo's commit, and through the tree's hashes every record under it, without trusting the old host. In practice, "account portability" means downloading the `.car`, verifying it, uploading it elsewhere, and pointing the DID document at the new PDS.

Large binary content like images and videos lives outside the Merkle tree as **blobs**, stored on the PDS and referenced by CID.

## Personal Data Servers

A **PDS** stores a user's repository, blobs, and signing keys. When a client posts, it sends an authenticated XRPC call to the PDS, which verifies the record, updates the repo, signs a commit, and emits an event.

PDSes are small. They skip timeline computation, moderation, and network indexing; they store user data and broadcast changes. A Raspberry Pi can host a single-user PDS.

Application developers need to run a PDS only if they host users; otherwise they run just the part of the stack they need.

## Relays and the firehose

A **relay** subscribes to known PDSes, follows their commits, and republishes everything as a merged WebSocket stream called the **firehose**. Connecting to a relay gives you a near real-time stream of the signed commits from every PDS it follows.

This component is essential for ambient development. To build a Bluesky post search engine, subscribe to a relay and index. To create a bot that reacts to certain records, filter the firehose by collection NSID. Events are signed and content-addressed, so you can verify them without trusting the relay.

Relays are the most politically contested component. Running one is costly, the operational incentives are unclear, and a handful of relays serve most of the network.
While the protocol permits competition, economic factors favor consolidation. Don't assume just anyone can run one.

## AppViews

An **AppView** is what most people picture when they think of a social network. It consumes the firehose, builds indexes (timelines, follower graphs, like counts, reply trees), enforces its own moderation and ranking, and serves XRPC queries to clients.

[Bluesky][bluesky-app], [Frontpage][frontpage], [Smoke Signal][smoke-signal], [Tangled][tangled], and [Leaflet][leaflet] are AppViews. They read from the same repositories but differ in which records they show, how they combine them, and their UI.

Two consequences of this design matter when you build.

The first is no content lock-in. A new microblogging AppView can display posts created with Bluesky's lexicons, because the records belong to users, not to Bluesky. Competition shifts from "who has the data" to "who builds the best experience over shared data."

The second is no federation handshake for reads. Your AppView doesn't negotiate with another server for content. The firehose is one global stream.

Building an AppView is mainly indexing infrastructure: a database that ingests CBOR records, denormalizes them for queries, and serves XRPC endpoints. The key product decisions sit upstream: which records to select, which lexicons to define, how to rank, and how to moderate.

## Lexicons: the schema layer

A **lexicon** is a JSON Schema-like definition of records and XRPC methods, identified by a reverse-DNS NSID like `app.bsky.feed.post` or `events.smokesignal.calendar.event`. Anyone can define a lexicon under their own domain.

Lexicons are what make ATproto interoperable. Two AppViews that both use `app.bsky.feed.post` can display each other's posts without coordinating. When an AppView adds a new record type, it publishes the lexicon at a stable URL and waits for others to adopt it.

Small ecosystems share a common lexicon and differentiate through the experience layered on top.

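
To make the schema layer concrete, here is a minimal sketch of what a record lexicon can look like and how a service might check a record against it. The overall layout (`lexicon`, `id`, `defs.main` of type `record`) follows the published Lexicon format, but the NSID `com.example.recipe`, its fields, and the `check_record` helper are hypothetical. The check covers only required fields and primitive types; real Lexicon validation has many more rules (string formats, length limits, unions, references).

```python
# Hypothetical lexicon for a recipe record, written as a Python dict.
# The NSID and every field name below are made up for illustration.
RECIPE_LEXICON = {
    "lexicon": 1,
    "id": "com.example.recipe",
    "defs": {
        "main": {
            "type": "record",
            "key": "tid",
            "record": {
                "type": "object",
                "required": ["title", "createdAt"],
                "properties": {
                    "title": {"type": "string"},
                    "servings": {"type": "integer"},
                    "createdAt": {"type": "string", "format": "datetime"},
                },
            },
        }
    },
}

# Maps the lexicon primitive types used above to Python types.
PYTHON_TYPES = {"string": str, "integer": int, "boolean": bool}


def check_record(lexicon: dict, record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed
    this simplified check (required fields and primitive types only)."""
    schema = lexicon["defs"]["main"]["record"]
    problems = []
    if record.get("$type") != lexicon["id"]:
        problems.append(f"$type must be {lexicon['id']}")
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, spec in schema["properties"].items():
        if name not in record:
            continue
        expected = PYTHON_TYPES.get(spec["type"])
        value = record[name]
        # bool is a subclass of int in Python, so reject it for integers.
        wrong_bool = expected is int and isinstance(value, bool)
        if expected and (wrong_bool or not isinstance(value, expected)):
            problems.append(f"{name} should be of type {spec['type']}")
    return problems


good = {
    "$type": "com.example.recipe",
    "title": "Dal",
    "createdAt": "2024-01-01T00:00:00Z",
}
bad = {"$type": "com.example.recipe", "servings": "four"}

print(check_record(RECIPE_LEXICON, good))
print(check_record(RECIPE_LEXICON, bad))
```

The point of the sketch is the shape: the NSID names the record type, the record carries that NSID in `$type`, and any reader holding the same lexicon can decide independently whether a record is well-formed.
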
New product categories introduce new lexicons and new AppViews that compete over the same content. Designing a new social object mainly involves creating a lexicon.

## Opinionated services: labelers and feed generators

Two extension points allow smaller services to participate without a full AppView.

**Labelers** read the firehose and emit labels like spam, NSFW, "this account is in our community," "this post is in Spanish," and "this is an event in Berlin." AppViews subscribe to labelers and decide what to do with each label: hide, blur, badge, or surface. Users can add more labelers. Bluesky's open-source moderation service, [Ozone][ozone], is the reference implementation, but anyone can build their own. Moderation in ATproto is unbundled, stackable, and competitive, a key shift from the monolithic Trust-and-Safety team model.

**Feed generators** are small services that, given a Bluesky user, return a list of post URIs. They operate through the `app.bsky.feed.getFeedSkeleton` XRPC method. The AppView expands the skeleton into full posts for display. This contract powers Bluesky's custom feeds.

Both are good first projects: narrow interfaces, small surface area.

## How the pieces fit together

```mermaid
graph TB
    User["User / Client<br/>web, mobile, bot"]

    subgraph id["Identity"]
        Handle["Handle<br/>alice.example.com<br/>(DNS or HTTP proof)"]
        DID["DID document<br/>keys, handle, PDS URL"]
        PLC["PLC directory<br/>did:plc resolver"]
    end

    subgraph storage["Storage (one PDS per user)"]
        PDSA["PDS A<br/>Alice's signed repo + blobs"]
        PDSn["PDS ...<br/>other users"]
    end

    Relay["Relay<br/>merges PDS commits into the firehose"]

    subgraph apps["AppViews (build products)"]
        Bsky["Bluesky AppView<br/>microblog timeline"]
        Other["Other AppViews<br/>Frontpage, Tangled, Leaflet, ..."]
    end

    subgraph plugins["Pluggable services"]
        Labeler["Labelers<br/>spam, NSFW, locale, community"]
        FeedGen["Feed generators<br/>custom post lists"]
    end

    Lex["Lexicons<br/>shared record + XRPC schemas<br/>(app.bsky.feed.post, ...)"]

    User -. resolve handle .-> Handle
    Handle -->|resolves to| DID
    DID -. via did:plc .-> PLC
    DID -->|points at| PDSA
    User -->|writes via XRPC| PDSA
    User -->|reads via XRPC| Bsky
    PDSA -->|commit events| Relay
    PDSn -->|commit events| Relay
    Relay -->|firehose| Bsky
    Relay -->|firehose| Other
    Relay -->|firehose| Labeler
    Relay -->|firehose| FeedGen
    Labeler -. labels .-> Bsky
    FeedGen -. feed skeletons .-> Bsky
    Lex -. validates writes .-> PDSA
    Lex -. shapes indexes .-> Bsky
    Lex -. shapes indexes .-> Other

    style User fill:#fff3e0
    style Handle fill:#fce4ec
    style DID fill:#fce4ec
    style PLC fill:#fce4ec
    style PDSA fill:#e1f5fe
    style PDSn fill:#e1f5fe
    style Relay fill:#f3e5f5
    style Bsky fill:#e8f5e8
    style Other fill:#e8f5e8
    style Labeler fill:#dcedc8
    style FeedGen fill:#dcedc8
    style Lex fill:#fff9c4
```

Identity is always the entry point: a handle resolves to a DID document, which points to a PDS, and everything keys off the DID. The PDS is the only place writes happen; clients never write to a relay or an AppView. The relay is a fan-in that becomes a fan-out, turning many per-user PDS streams into one network-wide firehose. AppViews, labelers, and feed generators all consume this stream but produce different outputs. Lexicons are the agreement that lets PDS writes and AppView reads interpret records the same way.

The arrow that is missing from the diagram is significant: there is none between AppViews. They don't federate or coordinate, because each sees the entire network through the same firehose.

## Common misconceptions

A few claims circulate that are worth correcting before they shape your design.

_ATproto is a Bluesky API._ Bluesky is a single app with one lexicon namespace (`app.bsky`), built on an application-agnostic protocol. Most current code focuses on the Bluesky lexicon because of market gravity, not protocol design.

_It's the same idea as ActivityPub._ Both aim for decentralized social, but their architectures differ.
ActivityPub federates monolithic servers; ATproto separates storage, indexing, and product. Their data models, identity systems, and transports differ. Bridges like [Bridgy Fed][bridgy-fed] exist, but the protocols remain distinct.

_Federation means everyone runs everything._ Most users rely on a hosted PDS and a small number of relays and AppViews. The protocol ensures these choices remain swappable, but doesn't require everyone to run their own infrastructure.

_The firehose is the network._ The firehose is a useful, central synchronization mechanism, but it is still just one path. Services can backfill repos directly from PDSes, and AppViews can fetch records on demand.

_It's already fully standardized._ As of early 2026, the IETF working group on Authenticated Transfer is just starting. The repository format, sync protocol, and architecture overview are in [Internet Drafts][ietf-charter]. Identity, OAuth, lexicons, and private data remain in Bluesky Social's reference implementation. Expect this to change.

## Trade-offs worth considering

ATproto takes a specific position on several spectra. Knowing them helps you predict whether it suits your use case.

_Public by default._ All records in a repo are public; private records are planned but not yet available. Anyone can index what you store.

_Global, not local._ The firehose model assumes everyone wants a global view of activity, which is great for discovery but weak for small private groups. For those, ATproto may be the wrong protocol today.

_Centralized in practice, decentralized in principle._ Bluesky Social and a few independent operators run the relays, the PLC directory, and the default PDS. The protocol allows alternatives, but few exist. Consider this a risk if your product depends on protocol neutrality.

_Lexicon flexibility costs coordination._ Anyone can publish a lexicon, but adoption is social. Defining your own is easy; convincing others to read it is hard.

_Storage is cheap for hosts, expensive for indexers._ Running a PDS is light, but a relay or full-network AppView must ingest and index everything, which takes serious infrastructure.

## Where the surface area is

### See it in action: Real AppViews

Before building, spend some time on real AppViews to see how the architecture works in practice.

[**Bluesky**][bluesky-app] is the largest, most stable AppView, built on `app.bsky`. Most ATproto users start here, and it serves as the reference implementation. It demonstrates feed algorithms, moderation, search, and a full-featured AppView.

[**Frontpage**][frontpage] is a Hacker News-style link aggregator operated independently from Bluesky, Inc. It defines its own record types rather than reusing `app.bsky` posts, and those records live in the same user repositories under the same identities. Unlike Bluesky, Frontpage ranks by community voting instead of an algorithmic feed, so it is a different product built on the same accounts.

These two illustrate the core benefit: one account and one repository work across both AppViews, without a new signup or vendor lock-in. Try signing in to Frontpage with the handle you use on Bluesky. That portability is the protocol at work.

### Building

For a first-time ATproto developer, the participation surface orders roughly by effort.

The lightest entry point is reading the firehose. A WebSocket subscription to a public relay provides a live stream of every signed commit, filterable by collection NSID. An afternoon with the `@atproto/api` client offers a real feel for the data.

Publishing a feed generator is a step up: a small service behind an XRPC method that returns post URIs based on your logic. Users subscribe to the feed in the Bluesky client, and the AppView expands the URIs into full posts.

A labeler has the same shape but emits labels instead of feed entries. It is useful for category tagging, moderation, or community filtering.

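
The feed generator contract described above can be sketched in a few lines. The function below builds the response body for `app.bsky.feed.getFeedSkeleton`: a `feed` list whose items each carry a `post` AT URI, plus an optional `cursor`. The name `build_feed_skeleton`, the pre-ranked candidate list, the offset-based cursor, and the placeholder URIs are all assumptions for illustration; a real generator would pick posts from its own index and serve this JSON over HTTP.

```python
from typing import Optional


def build_feed_skeleton(
    candidate_uris: list[str], limit: int = 50, cursor: Optional[str] = None
) -> dict:
    """Return one page of a feed skeleton.

    candidate_uris is the generator's already-ranked list of post AT URIs
    (hypothetical input; a real service would query its own index).
    The cursor in this sketch is just the offset of the next page.
    """
    start = int(cursor) if cursor else 0
    page = candidate_uris[start : start + limit]
    skeleton = {"feed": [{"post": uri} for uri in page]}
    next_offset = start + len(page)
    # Only hand back a cursor when there is another page to fetch.
    if next_offset < len(candidate_uris):
        skeleton["cursor"] = str(next_offset)
    return skeleton


# Placeholder AT URIs in the at://<did>/<collection>/<rkey> form.
ranked = [
    "at://did:plc:example/app.bsky.feed.post/aaa",
    "at://did:plc:example/app.bsky.feed.post/bbb",
    "at://did:plc:example/app.bsky.feed.post/ccc",
]

first = build_feed_skeleton(ranked, limit=2)
second = build_feed_skeleton(ranked, limit=2, cursor=first["cursor"])
print(first)
print(second)
```

The skeleton deliberately contains only URIs. The generator decides what goes in the feed and in what order, and the AppView is responsible for turning those URIs into full posts.
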
Defining a new lexicon and building a small AppView shows off the protocol's design advantage. A new social object (RSVPs, recipes, code reviews, and so on) gets a record type, a minimal indexer, and a frontend. A portable user base and shared identity come included.

Running your own PDS is the highest level of participation. The [official PDS distribution][pds-distribution] runs on a small VPS and lets you test the protocol's portability end-to-end.

Working code is in the official documentation at [atproto.com][atproto-docs], under open-source licenses.

## The mental model to keep

A user owns a signed repository on a server they can move. A relay combines many repositories into one global stream. Multiple AppViews build products on this stream. Four types of software handle identity, storage, indexing, and product, with a contract small enough for one developer to understand.

That separation is what makes ATproto interesting. Who ends up operating the infrastructure will determine whether the protocol remains decentralized in practice, not just in design. Its technical foundation is solid, the developer experience is good, and the surface you need to learn to build small, useful apps is narrow.

If you build social software or anything that requires users to own their data and identity across services, ATproto deserves a serious look.

## References

* [Bluesky and the AT Protocol: Usable Decentralized Social Media][atproto-paper] is the canonical academic write-up of the design and its rationale.
* The [AT Protocol developer documentation][atproto-docs] is the source of truth for specs, lexicons, and reference implementations.
* [Bluesky app directory][bluesky-apps] lists all known AppViews, feed generators, and other services built on ATproto.
* [XRPC specification][xrpc-spec] describes the HTTP RPC convention used for all inter-service calls.
* The [Data Model reference][data-model] documents how CBOR, CIDs, and Merkle trees fit together.
* The [PLC directory transition announcement][plc-directory] explains the planned move to an independent organization.
* The [IETF Authenticated Transfer working group charter][ietf-charter] tracks current standardization.
* [Bluesky][bluesky-app], [Frontpage][frontpage], [Smoke Signal][smoke-signal], [Tangled][tangled], [Leaflet][leaflet], and other AppViews show the range of what gets built.
* [Ozone][ozone] is the open-source labeler that backs Bluesky's own moderation.
* [Bridgy Fed][bridgy-fed] is the bridge between ATproto and ActivityPub.
* The [official PDS distribution][pds-distribution] is the easiest path to running your own server.

[atproto-paper]: https://arxiv.org/abs/2402.03239
[atproto-docs]: https://atproto.com/
[bluesky-apps]: https://bsky.app/profile/apps.bsky.social
[xrpc-spec]: https://atproto.com/specs/xrpc
[data-model]: https://atproto.com/specs/data-model
[plc-directory]: https://docs.bsky.app/blog/plc-directory-org
[ietf-charter]: https://datatracker.ietf.org/wg/atproto/about/
[bluesky-app]: https://bsky.app/
[frontpage]: https://frontpage.fyi/
[smoke-signal]: https://smokesignal.events/
[tangled]: https://tangled.org/
[leaflet]: https://leaflet.pub/
[ozone]: https://github.com/bluesky-social/ozone
[bridgy-fed]: https://fed.brid.gy/
[pds-distribution]: https://github.com/bluesky-social/pds