The shared workspace where full-duplex AI sees, hears, speaks, remembers, and acts alongside humans. One multiplayer surface, one permission model, one event stream. In production today, carrying paid customer workloads.
A frontier model with no real surface to act on is an F1 driver in a rental car. The talent is not the constraint. The car is. A great car turns an average driver into a fast one — and an exceptional driver into a world champion.
Collaboration tools are everywhere; most now include AI. What they lack is a persistent, programmable surface where a human and an AI share context, see the same state, and act under the same rules. Notion holds the document; its AI is a sidebar. Zoom carries the conversation; its AI is a notetaker. The AI is always adjacent to the work, never intrinsic to it.
Spacebar is one synchronized space where every signal, every action, and every participant — human or model — flows into a single event stream. It is the F1 car.
An infinite canvas that remembers exactly where everything is. Lay out documents, applications, browsers, whiteboard sketches, and video tiles where they belong, and they stay there. Close a space, come back a week later, and every object is exactly where it was left.
Cursors, voice, video, and screen share sit on the same surface as the work itself. Video tiles are objects in the space, not a frame layered over it. Nothing here is a meeting tool bolted on.
Open web apps, desktop applications, or virtual machines directly inside a space. Server-provisioned VMs streamed over WebRTC, no install. Any number of users sharing control of one live instance.

Every object, event, and stream on the canvas is observable and addressable through one SDK. space.objects.add(…) to write, space.events.subscribe(…) to read. Any service or agent joins the same way a person does.
room.objects.add(…)
// every event observable
room.speak(stream)
// every action addressable

A server-provisioned browser tab that the human and the agent operate simultaneously: not a remote desktop, not a Live View onto an agent-controlled session. Both parties click, scroll, and type on the same live instance. The agent receives a continuous video feed of the human's interaction, not discrete screenshots. That signal (mouse trajectories, scroll pauses, abandoned inputs) is the telemetry that screenshot-based agents never see.
Spacebar shared browser vs. agent-controlled session:
Human and agent simultaneously vs. agent drives; human intervenes
Continuous video + live DOM vs. discrete screenshots or a11y tree
Native browsing vs. remote control
The architecture difference sits above the headless-browser primitive, not within it. Spacebar uses infrastructure like that under the hood; the distinction is in the observation and control layer built on top. Full comparison with Browserbase →
To work multimodally (conversation, sight, action) on a single surface
Voice, video, cursor, gesture, screen share, every embedded browser, every canvas object — all on one surface, through one coherent interface. Not a stack of SDKs glued together.
To see what humans are looking at, in real time
Live, structured access to every canvas object, cursor position, embedded browser, and video feed — all observable through one consistent interface.
To hear humans, including overlapping speech, while filtering out noise
Per-participant, server-side audio streams. Full-duplex. No “release the mic” turn-taking. Noise suppression and acoustic filtering applied per stream. STT provider configurable per session, not locked to one vendor.
To read embodied and social signals
Client-side perception streams: 478 face landmarks, 21 keypoints per hand, 20+ recognized gestures, attention score, engagement signals — all derived locally, in the browser. Only the results leave the device and are made available to the agent; the video stream follows standard WebRTC routing, as in any video call.
To read drawings and sketches as data, not just as images
Every stroke on the canvas is stored as a vector object — coordinates, shape, path — directly addressable through the same SDK as any other canvas element. Drawings are structured, addressable data from the moment of creation, not pixels that need interpretation.
To act on the same surface as humans, through the same interface
A symmetric API: every canvas action a human takes is also an API call, through the same surface and the same permission model. Same interface, same rules, for any actor.
To act with bounded authority
An agent takes on the full persona of the user it represents. The same role assignments, space access, and device controls that bind a person bind it.
To remember — minutes, weeks, or months
Persistent space state, full journaled history, lossless replay: every event captured in sequence, used by the substrate to reconstruct context for the agent's memory. Short-term and long-term memory layers, active in the voice + vision agent.
To run on any model, from any vendor
Pluggable STT, LLM, and TTS providers, selected per session. No single vendor is a hard dependency; when one degrades, new sessions route to another provider.
To respond fast enough that it feels like collaboration
50 ms event propagation, p99. The latency budget belongs to the model; the canvas adds effectively none.
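To make the drawings-as-data capability in the list above concrete, here is a minimal sketch that treats a stroke as a plain vector object and derives geometry from its coordinates, with no image interpretation involved. The `Stroke` shape and both helper functions are hypothetical illustrations, not the SDK's actual types.

```typescript
// Hypothetical shape for a canvas stroke; the real SDK's object schema may differ.
interface Point { x: number; y: number }
interface Stroke { kind: "stroke"; points: Point[] }

// Axis-aligned bounding box, computed directly from the stroke's coordinates.
function boundingBox(s: Stroke): { minX: number; minY: number; maxX: number; maxY: number } {
  if (s.points.length === 0) throw new Error("stroke has no points");
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const p of s.points) {
    minX = Math.min(minX, p.x);
    minY = Math.min(minY, p.y);
    maxX = Math.max(maxX, p.x);
    maxY = Math.max(maxY, p.y);
  }
  return { minX, minY, maxX, maxY };
}

// Total path length: the sum of distances between consecutive points.
function pathLength(s: Stroke): number {
  let total = 0;
  for (let i = 1; i < s.points.length; i++) {
    total += Math.hypot(s.points[i].x - s.points[i - 1].x, s.points[i].y - s.points[i - 1].y);
  }
  return total;
}

const demoStroke: Stroke = {
  kind: "stroke",
  points: [{ x: 0, y: 0 }, { x: 3, y: 4 }, { x: 3, y: 10 }],
};
const demoBox = boundingBox(demoStroke);
const demoLength = pathLength(demoStroke);
```

Because the stroke is already structured, questions such as "where is this sketch" or "how long is this line" become arithmetic on coordinates rather than a vision task.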
Spacebar supports two distinct agent modes — not as features, but as first-class deployment patterns. Real-time agents join live sessions and act in the moment. Always-on agents run between sessions, on triggers, and return to a Space rather than sending a notification. Both run on the same substrate. Both are in production today.
The agent joins a Spacebar space over WebRTC as a first-class participant, on the same surface as everyone in the space. It runs a streaming voice-activity detector against the live per-participant audio streams, transcribes each participant’s speech in real time through a pluggable provider, and watches the canvas as a live visual stream. Board images are sampled at the cadence of speaking activity, so the multimodal frame budget follows the conversation rather than being exhausted during silence. It speaks back through a pluggable streaming text-to-speech provider into the shared audio mix, and retains session memory in two tiers.
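One way to picture sampling board images at the cadence of speaking activity is a gate that only allows a capture while speech is active, with a minimum spacing between captures. The sketch below is an assumption-laden illustration: the class name, the 1000 ms interval, and the simulated voice-activity timeline are all hypothetical and not taken from the production agent.

```typescript
// Hypothetical sampler; names and the interval value are illustrative assumptions.
class SpeechGatedSampler {
  private lastCaptureMs = -Infinity;
  private minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns true when a board frame should be captured at time nowMs.
  shouldCapture(speechActive: boolean, nowMs: number): boolean {
    if (!speechActive) return false; // silence consumes no frame budget
    if (nowMs - this.lastCaptureMs < this.minIntervalMs) return false;
    this.lastCaptureMs = nowMs;
    return true;
  }
}

// Simulated voice-activity output, one reading every 500 ms:
// two seconds of silence followed by two seconds of speech.
const vadTimeline = [false, false, false, false, true, true, true, true];
const frameSampler = new SpeechGatedSampler(1000);
const capturedAt: number[] = [];
vadTimeline.forEach((active, i) => {
  const t = i * 500;
  if (frameSampler.shouldCapture(active, t)) capturedAt.push(t);
});
```

In this simulation no frames are taken during the silent stretch, and frames are taken at 2000 ms and 3000 ms once speech begins.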
The assistant runs as a sidebar inside the canvas itself. The board image is piped into every iteration, so the model always has full visual context. Every tool routes through the same authorization layer a human action would pass through; the assistant cannot do what the user it represents is not permitted to do. It runs multi-step loops (observe the board, call a tool, receive the result) until the requested change is complete or the loop depth is exceeded.
Mute everyone but one participant. Or unmute an entire space. One call.
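The observe, call, receive loop described above can be sketched as a bounded loop in which every tool call passes an authorization check before it runs. Everything here is hypothetical: `LoopDeps`, `runAssistant`, the `muteAll` tool name, and the stubbed model decision are stand-ins for illustration, not the assistant's actual implementation.

```typescript
// All names are hypothetical; this is a sketch, not the assistant's real code.
type ToolCall = { tool: string; args: Record<string, unknown> };
type Step = { done: true } | { done: false; call: ToolCall };

interface LoopDeps {
  observe: () => string; // e.g. a reference to the current board image
  decide: (observation: string, lastResult: string | null) => Step; // stands in for the model
  isAllowed: (userId: string, tool: string) => boolean; // same check a human action would pass
  runTool: (call: ToolCall) => string;
}

function runAssistant(
  userId: string,
  deps: LoopDeps,
  maxDepth: number,
): { status: "complete" | "depth_exceeded"; steps: number } {
  let lastResult: string | null = null;
  for (let depth = 0; depth < maxDepth; depth++) {
    const step = deps.decide(deps.observe(), lastResult);
    if (step.done) return { status: "complete", steps: depth };
    // A denied tool is reported back to the model instead of being executed.
    lastResult = deps.isAllowed(userId, step.call.tool)
      ? deps.runTool(step.call)
      : `denied: ${step.call.tool}`;
  }
  return { status: "depth_exceeded", steps: maxDepth };
}

// Demo with stubs: the "model" asks for one mute call, then declares the task done.
const executedTools: string[] = [];
const loopOutcome = runAssistant(
  "user_1",
  {
    observe: () => "board-frame",
    decide: (_obs, last) =>
      last === null
        ? { done: false, call: { tool: "muteAll", args: { except: "user_2" } } }
        : { done: true },
    isAllowed: (_user, tool) => tool === "muteAll",
    runTool: (call) => {
      executedTools.push(call.tool);
      return "ok";
    },
  },
  5,
);
```

The depth limit guarantees termination even if the model never signals completion, and the authorization check sits on the only path to `runTool`.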
Both are production systems, not demos: evidence that a real-time multimodal agent can participate reliably in a live, multi-participant space. We built the substrate, scaled it under real customer load, and proved the cost structure holds. The numbers are measured, not modeled.
Most proactive agent products are headless monitoring loops with notification surfaces. The agent finishes and fires off a Slack message. That model works for tasks with a single output. It breaks down the moment the task is ongoing, ambiguous, or needs a human to pick up and continue it.
Spacebar is different: when your agent finishes, it has been working in a Space. You walk back into the room it was working in, with the canvas already laid out, the sources already open, the draft already there. You don't read a summary of what happened. You see it.
The agent opens the inbox, reads each RFP, checks the canvas for relevant past proposals, drafts a response, and leaves it pinned to the board — ready for your first review when you arrive.
The agent opens a Space, pulls the account history from the CRM, drafts a personalized outreach sequence, and flags the three most at-risk accounts with recommended next actions. No prompt required.
You asked it to watch. It watched. When it found something real — a 7% price increase on a critical component — it logged it on the canvas, cross-referenced your contract terms, and drafted a response for your approval.
The handoff. When the agent comes back to you, it is not a notification. It is a Space with the work already laid out — sources open, canvas annotated, next steps visible. Every other proactive agent platform sends you a summary. Spacebar puts you back in the room it was working in.
Fig. 08 · where the time goes
The first two are p50 estimates, for context. The 50 ms is Spacebar’s measured p99: even the worst case clears the others’ typical case by more than an order of magnitude. Whatever a user waits on, almost none of it is the substrate.
Building a system that holds live video, an embedded browser, a shared document, and a collaborative whiteboard in one synchronized space — all of it surviving a dropped connection — took five years. It is the infrastructure a real-time agent needs to see, hear, act, and remember alongside people, without adding latency, context loss, or broken permissions.
A custom CRDT engine on the hot path, computing minimum binary deltas from a client-supplied state vector. Runs as a compiled native service with multi-threaded execution, process-isolated from socket I/O to keep the hot path fast under load.
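The general idea of computing a minimal delta from a client-supplied state vector can be sketched as follows. This is a simplified, assumed model (an operation log keyed by client and clock, sent as plain objects rather than a binary encoding); the `Op` and `StateVector` types and both functions are hypothetical and do not describe Spacebar's engine internals.

```typescript
// Simplified model: each client numbers its own operations 0, 1, 2, ...
interface Op { client: string; clock: number; payload: string }

// For each client, the next clock value the peer has NOT yet seen.
type StateVector = Record<string, number>;

// The delta is every operation the peer is missing, and nothing else.
function computeDelta(log: Op[], peer: StateVector): Op[] {
  return log.filter((op) => op.clock >= (peer[op.client] ?? 0));
}

// Summarize a log as a state vector.
function stateVectorOf(log: Op[]): StateVector {
  const sv: StateVector = {};
  for (const op of log) sv[op.client] = Math.max(sv[op.client] ?? 0, op.clock + 1);
  return sv;
}

const serverLog: Op[] = [
  { client: "a", clock: 0, payload: "add note" },
  { client: "a", clock: 1, payload: "move note" },
  { client: "b", clock: 0, payload: "draw stroke" },
];

// The peer has seen a's first op and nothing from b.
const deltaForPeer = computeDelta(serverLog, { a: 1 });

// A fully caught-up peer receives an empty delta.
const deltaForCurrentPeer = computeDelta(serverLog, stateVectorOf(serverLog));
```

The state vector is small relative to the document, so a reconnecting client can describe what it already has in a few bytes and receive only the difference.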
CRDT apply and encode operations run on a dedicated worker pool, sized to leave headroom on the main loop for socket I/O.
Hot in-memory state, backed by a shared cache, backed by versioned durable storage. Versioning invalidates the hot tier whenever a snapshot lands: warm-start by design.
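A minimal sketch of a three-tier read path with version-based invalidation might look like the following. It is an in-memory illustration under stated assumptions: the `TieredStore` class is hypothetical, plain `Map`s stand in for the shared cache and durable storage, and writing the new snapshot into the cache at landing time is a simplification chosen for the example.

```typescript
interface Versioned { version: number; state: string }

class TieredStore {
  hot = new Map<string, Versioned>(); // per-process memory
  cache = new Map<string, Versioned>(); // stands in for a shared cache
  durable = new Map<string, Versioned>(); // stands in for versioned durable storage
  latestVersion = new Map<string, number>();

  // A new snapshot lands: store it and bump the version marker.
  // Any hot entry with an older version is now implicitly stale.
  landSnapshot(spaceId: string, state: string): void {
    const version = (this.latestVersion.get(spaceId) ?? 0) + 1;
    const entry: Versioned = { version, state };
    this.durable.set(spaceId, entry);
    this.cache.set(spaceId, entry);
    this.latestVersion.set(spaceId, version);
  }

  // Read through the tiers, skipping any entry that is not the latest version.
  read(spaceId: string): { tier: "hot" | "cache" | "durable"; state: string } | null {
    const latest = this.latestVersion.get(spaceId);
    if (latest === undefined) return null;
    const tiers = [["hot", this.hot], ["cache", this.cache], ["durable", this.durable]] as const;
    for (const [tier, map] of tiers) {
      const entry = map.get(spaceId);
      if (entry && entry.version === latest) {
        this.hot.set(spaceId, entry); // warm the hot tier for the next read
        return { tier, state: entry.state };
      }
    }
    return null;
  }
}

const tieredStore = new TieredStore();
tieredStore.landSnapshot("s1", "v1");
const firstRead = tieredStore.read("s1"); // served from the cache, then warmed
const secondRead = tieredStore.read("s1"); // served from the hot tier
tieredStore.landSnapshot("s1", "v2");
const thirdRead = tieredStore.read("s1"); // hot entry is stale, so the cache serves v2
```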
Snapshot compaction runs out-of-band with cooldown to prevent cascading recompactions. The hot path never blocks on snapshot work. That is how the 50 ms p99 holds under load.
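A cooldown of this kind can be sketched as a per-space gate that refuses to start a new compaction until enough time has passed since the last one. The `CompactionScheduler` class and the 30-second value are hypothetical; the actual cooldown policy is not stated in this document.

```typescript
class CompactionScheduler {
  private lastRun = new Map<string, number>();
  private cooldownMs: number;

  constructor(cooldownMs: number) {
    this.cooldownMs = cooldownMs;
  }

  // Returns true if compaction for this space may start at nowMs, and records the start.
  tryStart(spaceId: string, nowMs: number): boolean {
    const last = this.lastRun.get(spaceId);
    if (last !== undefined && nowMs - last < this.cooldownMs) return false;
    this.lastRun.set(spaceId, nowMs);
    return true;
  }
}

const compactions = new CompactionScheduler(30_000);
const compactionAttempts = [
  compactions.tryStart("s1", 0), // first run: allowed
  compactions.tryStart("s1", 10_000), // inside the cooldown: refused
  compactions.tryStart("s2", 10_000), // different space: allowed
  compactions.tryStart("s1", 30_000), // cooldown elapsed: allowed
];
```

Because the gate only answers yes or no, a caller on the hot path can skip compaction and continue rather than wait for it.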
Per-space ownership locks ensure one server compacts a given space at a time; ownership tracking enables failover detection. The mechanism behind reliable sharded sessions.
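One common way to implement this kind of ownership is a lease that the owner must keep renewing, so that a missed renewal doubles as failover detection. The sketch below is an assumption, not a description of Spacebar's mechanism: `OwnershipTable` is hypothetical and lives in one process, whereas a real deployment would need an atomic compare-and-set in a store shared by all servers.

```typescript
interface Lease { owner: string; expiresAt: number }

class OwnershipTable {
  private leases = new Map<string, Lease>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Acquire or renew: succeeds if the space is unowned, the lease has expired,
  // or the caller already holds it.
  acquire(spaceId: string, server: string, nowMs: number): boolean {
    const current = this.leases.get(spaceId);
    if (current && current.owner !== server && current.expiresAt > nowMs) return false;
    this.leases.set(spaceId, { owner: server, expiresAt: nowMs + this.ttlMs });
    return true;
  }

  // Current live owner, or null if the lease lapsed (a signal that failover is needed).
  ownerOf(spaceId: string, nowMs: number): string | null {
    const current = this.leases.get(spaceId);
    return current && current.expiresAt > nowMs ? current.owner : null;
  }
}

const ownership = new OwnershipTable(5_000);
const aAcquires = ownership.acquire("s1", "srv-a", 0); // srv-a takes the space
const bBlocked = ownership.acquire("s1", "srv-b", 1_000); // refused while the lease is live
const ownerAfterLapse = ownership.ownerOf("s1", 6_000); // srv-a stopped renewing
const bTakesOver = ownership.acquire("s1", "srv-b", 6_000); // srv-b takes over
```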
Client and server each maintain expectations about the next updates; deviation is detected within milliseconds and the system recovers without losing a single state update. An immutable mutation log captures every create, update, and delete.
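Maintaining an expectation about the next update can be illustrated with sequence numbers: the receiver knows which number should arrive next, buffers anything that arrives early, and reports the gap so the missing update can be re-requested. The `SequenceTracker` class and `Receipt` type are hypothetical, and per-stream sequence numbers are an assumption made for the sketch.

```typescript
type Receipt =
  | { kind: "applied"; seqs: number[] } // one or more updates applied in order
  | { kind: "buffered"; missingFrom: number } // gap detected; request a resend from here
  | { kind: "duplicate" };

class SequenceTracker {
  private expected = 0;
  private pending = new Map<number, string>();
  applied: string[] = [];

  receive(seq: number, update: string): Receipt {
    if (seq < this.expected || this.pending.has(seq)) return { kind: "duplicate" };
    this.pending.set(seq, update);
    if (seq > this.expected) return { kind: "buffered", missingFrom: this.expected };
    // Drain every contiguous update that is now available.
    const seqs: number[] = [];
    while (this.pending.has(this.expected)) {
      this.applied.push(this.pending.get(this.expected)!);
      this.pending.delete(this.expected);
      seqs.push(this.expected);
      this.expected++;
    }
    return { kind: "applied", seqs };
  }
}

const tracker = new SequenceTracker();
const r0 = tracker.receive(0, "a"); // applied in order
const r2 = tracker.receive(2, "c"); // arrives early: buffered, update 1 is missing
const r1 = tracker.receive(1, "b"); // fills the gap: 1 and 2 are applied together
const rDup = tracker.receive(1, "b"); // already applied: ignored
```

The early update is held rather than dropped, which is what lets recovery complete without losing any state.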
Model capability is converging: real-time APIs from Google and OpenAI, computer-control surfaces from Anthropic, tool-using agents everywhere. The substrate beneath them is not.
A real-time agent needs a persistent, permissioned, multiplayer surface to see, hear, act, and remember alongside people. Building one is a multi-year systems project. The opportunity cost is real: every month spent building this layer is a month not spent on your actual product. Spacebar is that surface, in production. The question is whether you build it or build on it.
You will need most of this. Even with a large team working in parallel, building it is at least two years of work. Build on Spacebar, and free that time for your product.
The hard problems in applied AI have moved. The model is rarely the bottleneck now — it's everything around it: state, permissions, the surface a human and an agent actually share. Spacebar is built for exactly this.
Most “agentic” demos fall apart the moment they meet a real, multi-party session. What's notable here is that it is already running in production, with the operating data to show for it.
We evaluated building the real-time layer ourselves and stopped counting at eighteen months. One SDK, one permission model, one event stream — that is the part nobody should be rebuilding.
Putting a human and an agent on the same surface, under the same rules, sounds obvious until you try to build it. This is the first substrate I have seen that actually treats them as equals.
Every object, event, and stream on the Spacebar canvas is observable, addressable, and actionable through one coherent SDK. Build an app, an integration, a service, or an autonomous agent — each joins the canvas the same way a person would. The protocol is symmetric: human and agent reach the same surface through the same API. Any permission a human holds, an agent can hold. The system draws no distinction between what is available to a person and what is available to a program.
// Make any web component multiplayer in a Spacebar space.
// @pncl/mario: Spacebar's real-time SDK
import { MarioClient } from "@pncl/mario";
const space = await MarioClient.join("space_8KXq...");
space.bind(myComponent);
// every state change now syncs to everyone in the space.
// presence, CRDT conflict resolution, cursors, undo/redo,
// version history, and snapshot recovery: free.

// Observe everything happening in a space, in real time.
import { MarioClient } from "@pncl/mario";
const space = await MarioClient.join("space_8KXq...");
space.events.subscribe(event => {
  // event.kind: "cursor" | "draw" | "speak" | "type"
  //           | "object.add" | "object.move" | ...
  // event.actor, event.timestamp, event.payload
});

// Drive the canvas the same way a human would.
await space.objects.add({
  kind: "stickyNote",
  x: 320, y: 480,
  text: "Try this approach instead."
});
await space.objects.move("obj_a91...", { x: 600, y: 480 });
await space.speak({ stream: ttsStream });
await space.browser.type("doc_b14...", "Hello.");

An agent built against this SDK is not integrated into the canvas. It is a participant in it, with the same reach and the same limits as the person sitting next to it.
Most real-time AI platforms are rigid: the surface they ship is the surface you get. Spacebar is designed to be extended. Everything in a space — presence, state, events, audio, browser, memory — is observable and writable through the same SDK your agent uses. Adding a connector is adding a participant that speaks a specific protocol. That is all it is.
A model deployed on Spacebar reaches the systems your customers use. External agents connect through the protocol they already speak. Wherever the work goes — desktop, mobile, any browser — the substrate follows.
Any agent that speaks MCP can speak Spacebar out of the box. Spacebar exposes users, spaces, sessions, recordings, transcripts, presence, audit logs, billing, and scheduling availability (the full operational context) through standard MCP. No custom integration code required.
Every connector on this list was built using the same public SDK and REST API available to you. If a connector does not exist yet, building one takes hours, not weeks. The event model is uniform: subscribe to anything, write to anything, from any runtime.
HMAC-signed outbound events on session, recording, and presence state. Drop a URL and start receiving structured payloads immediately — no polling, no SDK required on the receiving end.
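On the receiving end, verifying an HMAC-signed payload usually means recomputing the signature over the raw request body and comparing in constant time. The sketch below uses Node's built-in `crypto` module and assumes a hex-encoded HMAC-SHA256 of the raw body; the actual algorithm, encoding, and signature header should be taken from Spacebar's documentation. The secret and the `session.started` event name are made up for the demo.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed scheme: hex-encoded HMAC-SHA256 over the raw request body.
function signBody(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(signBody(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Demo values only: the secret and the event shape are hypothetical.
const demoSecret = "whsec_demo";
const demoBody = JSON.stringify({ type: "session.started", spaceId: "space_demo" });
const demoSignature = signBody(demoBody, demoSecret);

const acceptsGenuine = verifySignature(demoBody, demoSignature, demoSecret);
const rejectsTampered = !verifySignature(demoBody + " ", demoSignature, demoSecret);
const rejectsMalformed = !verifySignature(demoBody, "zz", demoSecret);
```

Verification must run against the exact bytes received, before any JSON parsing or re-serialization, because even a whitespace change produces a different signature.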
Adapter bots join the meeting tool your team already uses — Zoom, Meet, Teams — bringing the canvas and the agent surface with them. Your AI joins the call as a participant, not a sidebar.
Surface a Spacebar sidebar on any webpage. Useful for building context-aware agents that work alongside users in their existing browser workflows, without redirecting them to a new URL.
Bi-directional session sync. Sessions appear on the user's calendar; calendar events can trigger space creation. Two lines of configuration, not two weeks of integration.
iOS and iPad with native screen capture over WebRTC. Full canvas on Android through Chrome or any modern browser. Browser-based delivery is intentional: browsers are automatable, which matters for agent integrations running on mobile surfaces.
Pencil Spaces, our own platform built entirely on Spacebar, has run paid customer workloads for over four years. The numbers are measured, not estimated: per-session cost, failure modes, cost structure, and uptime. Across that four-year window, availability has held above 99.99%. Status page →

Technical evaluators: reach out to partnerships@spacebar.ai for the full data package.

Spacebar monetizes as infrastructure: usage-based pricing for builders, enterprise licensing for deployers, and trajectory data licensing for frontier labs. Working with frontier labs on training data? See spacebar.ai/labs →
Scoped agency. An agent takes on the full persona of the user it represents and nothing more. The same role assignments, space access, and device controls that bind a person bind it. The authorization model prevents privilege escalation: an agent cannot exceed its delegated permissions.
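The scoped-agency rule can be sketched as a single authorization function that both humans and agents pass through, where an agent is resolved to the user it represents before the check. The `Actor` type, the grant table, and the action names below are hypothetical illustrations of the idea, not Spacebar's authorization API.

```typescript
type Actor =
  | { kind: "human"; userId: string }
  | { kind: "agent"; agentId: string; onBehalfOf: string };

// Hypothetical grant table: user id -> actions that user may perform.
const userGrants = new Map<string, ReadonlySet<string>>([
  ["user_1", new Set(["canvas.write", "audio.mute"])],
]);

// An agent has no grants of its own; it resolves to the user it represents.
function principalOf(actor: Actor): string {
  return actor.kind === "human" ? actor.userId : actor.onBehalfOf;
}

// One check for every actor, so there is no separate, wider path for programs.
function authorize(actor: Actor, action: string): boolean {
  return userGrants.get(principalOf(actor))?.has(action) ?? false;
}

const demoHuman: Actor = { kind: "human", userId: "user_1" };
const demoAgent: Actor = { kind: "agent", agentId: "agent_7", onBehalfOf: "user_1" };

const humanCanMute = authorize(demoHuman, "audio.mute");
const agentCanMute = authorize(demoAgent, "audio.mute");
const humanCanDelete = authorize(demoHuman, "space.delete");
const agentCanDelete = authorize(demoAgent, "space.delete");
```

Since the agent's answer is always computed from its principal's grants, it can never be allowed something the represented user is denied.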
Terraform-managed regional infrastructure: GKE, object storage, and cache. JWT-scoped real-time channels, validated at every message, not just at connection time. Perception models run in-browser: face landmarks and gesture signals are derived locally and only the results leave the device. The video stream follows standard WebRTC routing.
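Validating at every message rather than only at connection time means the expiry and scope checks run on each inbound frame. The sketch below assumes the token's signature has already been verified with a vetted JWT library and shows only the per-message check; `exp` follows the standard JWT meaning (seconds since the epoch), while the `spaces` claim and the message shape are hypothetical.

```typescript
// Claims from an already signature-verified token. `spaces` is a hypothetical custom claim.
interface ChannelClaims { sub: string; exp: number; spaces: string[] }
interface InboundMessage { spaceId: string; body: string }

type MessageVerdict = "ok" | "expired" | "out_of_scope";

// Runs for every message, so a token that expires mid-session stops working immediately.
function checkMessage(claims: ChannelClaims, msg: InboundMessage, nowSec: number): MessageVerdict {
  if (nowSec >= claims.exp) return "expired";
  if (!claims.spaces.includes(msg.spaceId)) return "out_of_scope";
  return "ok";
}

const demoClaims: ChannelClaims = { sub: "user_1", exp: 1_000, spaces: ["space_a"] };

const verdictInScope = checkMessage(demoClaims, { spaceId: "space_a", body: "hi" }, 999);
const verdictExpired = checkMessage(demoClaims, { spaceId: "space_a", body: "hi" }, 1_000);
const verdictOtherSpace = checkMessage(demoClaims, { spaceId: "space_b", body: "hi" }, 999);
```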
Not every use case fits the standard platform. When yours does not, we build to specification.
A bespoke environment for teams building where the field is still open: custom memory architecture, full session instrumentation, configurable tool sets, and a shared surface where your model works alongside your researchers in real time — observable, steerable, and debuggable at every step.
A controlled environment for showing your system to investors, customers, or conference audiences. Custom canvas layout, branded, silent by default — only what you want visible, nothing you do not. Live in front of anyone.
An AI agent participates in a real consultation alongside a professional — seeing the same documents, hearing the same conversation, assisting in real time. Role-scoped, fully auditable, HIPAA-ready on request. Built for the constraints of regulated industries.
Every custom build starts with a conversation. Tell us what you need →
Spacebar is a runtime for real-time, multimodal AI — a shared, multiplayer canvas where full-duplex AI agents and humans work together on the same surface, under the same permission model, through the same event stream. It ships with a live canvas, an embedded shared browser, real-time voice and video, persistent memory, and an SDK for building agents that participate as first-class members of any session.
Primarily AI engineers and product teams building real-time, multimodal agent experiences — anything where an AI needs to see, hear, speak, act, and remember alongside people in a shared workspace. If your use case involves live collaboration between humans and AI, Spacebar is the substrate. We also work with frontier AI labs on training data infrastructure, and with enterprises needing custom deployments for specific workflows.
An LLM API gives you inference — a prompt goes in, a response comes out. Spacebar gives the model a workspace: a live canvas, an embedded browser, shared documents, real-time audio and video, persistent memory across sessions, and a permission model that controls what the agent can and cannot do. The model is not the constraint. The surface it operates on is. Spacebar is that surface.
Both. Pencil Spaces — our own platform, built entirely on Spacebar — is a product running in production with paid customers. The Spacebar SDK, event stream, and permission model are infrastructure that other teams build their own products on. You can use Spacebar as a consumer product, build on it as a developer, or deploy a custom version through our engineering partnership.
Full-duplex means the AI and the human can speak, act, and listen simultaneously — there is no push-to-talk, no turn-taking enforced by the system. The agent hears you while you hear it. It can interrupt if something changes on the canvas; you can redirect it mid-sentence. Full-duplex is a prerequisite for real-time collaboration; without it, you have a chatbot, not a participant.
Every action in a Spacebar session — a cursor move, a voice turn, a canvas object added, a browser click, a tool call — is emitted as an event into a single ordered stream. The agent subscribes to this stream and can act on any event in real time. This is what makes the agent a true participant rather than an integration: it observes the same stream every human participant observes, and its own actions appear in the same stream.
Any permission a human holds, an agent can hold — the system draws no distinction between what is available to a person and what is available to a program. An agent takes on the full persona of the user it represents and cannot exceed the permissions assigned to that user. Role assignments, space access, and device controls all apply equally. There is no privilege escalation path through the API.
Event propagation is 50 ms at p99 — measured, not modeled. Space cold-start is under 2 seconds at p99. The latency budget belongs to the model and the network; the canvas itself contributes effectively none. Whatever a user waits for, almost none of it is the substrate.
Yes. Spacebar supports fully autonomous agent sessions — no human in the room required. The agent can open a Space on a schedule or in response to a trigger, work through a task using the full canvas, browser, and tool set, and leave the completed work on the canvas for a human to review when they return. This is the proactive agent model: the Space is the durable memory and the handoff surface.
Time-based schedules, external webhooks, threshold conditions on monitored values, and document or web page change detection. Any signal expressible as a condition can trigger a Space to open and an agent to begin working. The agent then operates with the full capabilities of the platform — canvas, browser, voice, tools, and memory — until the task is complete or the loop depth is exceeded.
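Three of the trigger kinds named above (schedules, thresholds, and change detection) can each be reduced to a condition evaluated against an incoming signal, as in this minimal sketch. The `Trigger` and `Signal` types are hypothetical, webhook triggers are omitted because they fire on arrival rather than on a condition, and the numbers in the demo are invented.

```typescript
type Trigger =
  | { kind: "schedule"; everyMs: number; lastFiredMs: number }
  | { kind: "threshold"; baseline: number; maxIncreasePct: number }
  | { kind: "change"; lastHash: string };

interface Signal { nowMs: number; value?: number; contentHash?: string }

function shouldFire(t: Trigger, s: Signal): boolean {
  switch (t.kind) {
    case "schedule":
      return s.nowMs - t.lastFiredMs >= t.everyMs;
    case "threshold":
      // Fires when the monitored value has risen by at least maxIncreasePct over baseline.
      return s.value !== undefined
        && ((s.value - t.baseline) / t.baseline) * 100 >= t.maxIncreasePct;
    case "change":
      return s.contentHash !== undefined && s.contentHash !== t.lastHash;
  }
}

// A component price watched with a 5% tolerance.
const priceWatch: Trigger = { kind: "threshold", baseline: 100, maxIncreasePct: 5 };
const firesOnLargeRise = shouldFire(priceWatch, { nowMs: 0, value: 107 });
const quietOnSmallRise = !shouldFire(priceWatch, { nowMs: 0, value: 103 });

const hourly: Trigger = { kind: "schedule", everyMs: 3_600_000, lastFiredMs: 0 };
const firesOnSchedule = shouldFire(hourly, { nowMs: 3_600_000 });

const pageWatch: Trigger = { kind: "change", lastHash: "abc" };
const firesOnChange = shouldFire(pageWatch, { nowMs: 0, contentHash: "def" });
```

When a condition evaluates to true, the platform would open a Space and start the agent; that step is outside the scope of this sketch.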
Dust is a knowledge orchestration and workflow platform — connect your Slack, Notion, and CRM, build agents that understand your company's data, and deploy them across teams. It is strong at that. The fundamental difference is the workspace. Dust's collaboration surface is chat-based and asynchronous; Spacebar's is a live, real-time canvas with video, browser, and voice. When a Dust agent finishes a proactive task, it sends a message. When a Spacebar agent finishes, it has been working in a live Space — the canvas is annotated, the browser is open to the relevant page, the next steps are visible. You return to the room, not a report.
Terraform-managed regional GKE on Google Cloud, with regional object storage and cache. The CRDT engine runs as a compiled native service with multi-threaded execution, process-isolated from socket I/O. Media is delivered over WebRTC. Availability has held above 99.99% across a four-year production window, verified on our public status page.
SOC 2 Type II, HIPAA, and GDPR — all current. Tenant isolation is enforced at every layer: JWT-scoped real-time channels validated at every message, not just at connection time, and a permission model that prevents any agent or user from accessing another tenant's data.
Install @pncl/mario — Spacebar's real-time SDK — from npm. The SDK gives you MarioClient.join() to connect to a Space, space.objects to read and write canvas state, space.events.subscribe() to receive the full event stream, and space.bind() to make any existing web component multiplayer. Reach out to engineering@spacebar.ai for access and documentation.
We partner with teams to build Spacebar-powered environments tailored to specific use cases — research substrates with custom memory architecture, demo environments for investor and customer presentations, consultation rooms for regulated industries, and bespoke multimodal agent deployments. Every custom build starts with a conversation. Contact us through the form below or at partnerships@spacebar.ai.
Tell us what you’re building and we’ll route you to the right person. Or reach out directly. Whichever is easiest for you. If your use case calls for something beyond the standard platform, we build to specification.
Spacebar was built by the same team behind Pencil Spaces — four years of production at scale, carrying real customer workloads. The substrate was not designed in theory.
Head of Product, Map Ads at Google. Head of Enterprise Products at Meta. Venture investing at Madrona. McKinsey.
Senior Staff Software Engineer at Google, leading systems handling hundreds of thousands of queries per second at low latency. IIT Bombay MTech.
McKinsey & Company, QuantumBlack. MEng with distinction, University of Cambridge. Cambridge–MIT Exchange in Computer Science.
Operations and growth across the full Pencil Learning Technologies portfolio. Four years scaling Spacebar and Pencil Spaces from zero to production.