Summary: WebRTC has a strong security reputation — and it’s earned. Every audio call, video stream, and data channel is encrypted by default, and the standard doesn’t give you a way to turn that off. But encrypted media is only part of the picture. What sits around WebRTC — signalling, authentication, infrastructure, compliance architecture — is entirely your responsibility to secure. This article covers both sides of that line: what WebRTC handles automatically and what it leaves to you.
WebRTC has an unusually good reputation for security, and for once the reputation is deserved. But “WebRTC is secure” is a statement about a protocol — not about an application. Developers who build on WebRTC sometimes treat the protocol’s security as a property of their application, which it isn’t. The confusion is understandable, and it has consequences.
This article covers what WebRTC actually does: mandatory, non-negotiable encryption of all media and data channels. And what it doesn’t do: everything that sits around that encryption, which is your responsibility to build and secure. If you’re new to WebRTC and want to understand how the protocol works before getting into the security specifics, Real-time Communication with WebRTC covers the foundational context — this article focuses on the security layer specifically.
All audio and video streams in a WebRTC session are encrypted using SRTP. All data channels are secured via DTLS. This is not a setting you configure or a compliance tier you opt into — the WebRTC specification explicitly prohibits unencrypted media transmission. Any compliant implementation, in Chrome, Firefox, Safari, Edge, or any standards-compliant mobile browser, enforces this automatically. You cannot send unencrypted media over WebRTC. The standard doesn’t allow it.
That said, encrypted media transport is not the same as a secure application. WebRTC handles its layer well. What sits around it — signalling, authentication, data storage, the application itself — is your responsibility. The rest of this article covers both sides of that line.
WebRTC uses two protocols to handle encryption, and the relationship between them is worth understanding before we get to where things can go wrong.
DTLS (Datagram Transport Layer Security) is the handshake layer. Before any media flows between two devices, DTLS establishes a secure connection and negotiates the encryption keys that will protect the session. Think of it as the equivalent of TLS for real-time media, adapted to work over the UDP-based connections that low-latency communication requires. During the DTLS handshake, each party presents a certificate, and the fingerprint of that certificate is checked against the fingerprint carried in the session description exchanged during signalling. This verification step is what prevents a man-in-the-middle from intercepting or substituting keys, and as we’ll come to, it only works if the signalling channel itself is secure.
SRTP — Secure Real-Time Transport Protocol — is what actually encrypts the media once the session is underway. Every audio packet, every video frame, travels over SRTP. The encryption keys used by SRTP are derived during the DTLS handshake — a mechanism called DTLS-SRTP — which means the keys are never transmitted in a form that could be intercepted. The WebRTC specification also prohibits SDES, the older method of exchanging keys in session descriptions, because it created exposure even over otherwise secure channels.
Data channels — WebRTC’s pathway for non-media data exchange — are secured via DTLS independently of the media streams. All three paths are encrypted. None of them are optional.
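To make the fingerprint check concrete, here is a minimal sketch of its logic. Browsers perform this verification internally during the DTLS handshake, so application code never needs to run it. The function names `extractFingerprints` and `fingerprintWasAdvertised` are hypothetical, chosen for illustration only.

```typescript
// Illustrative only: browsers run this check internally during the DTLS handshake.

/** A certificate fingerprint as advertised in an SDP `a=fingerprint` line. */
interface SdpFingerprint {
  algorithm: string; // hash function name, e.g. "sha-256"
  value: string;     // colon-separated hex digest of the certificate
}

/** Collect every `a=fingerprint` attribute from a session description. */
function extractFingerprints(sdp: string): SdpFingerprint[] {
  const fingerprints: SdpFingerprint[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=fingerprint:(\S+)\s+([0-9A-Fa-f:]+)\s*$/.exec(line);
    if (match) {
      fingerprints.push({
        algorithm: match[1].toLowerCase(),
        value: match[2].toUpperCase(),
      });
    }
  }
  return fingerprints;
}

/**
 * True when the fingerprint computed from the certificate the peer presented
 * in the DTLS handshake matches one advertised in the signalled SDP.
 */
function fingerprintWasAdvertised(sdp: string, observed: SdpFingerprint): boolean {
  const algorithm = observed.algorithm.toLowerCase();
  const value = observed.value.toUpperCase();
  return extractFingerprints(sdp).some(
    (fp) => fp.algorithm === algorithm && fp.value === value,
  );
}
```

The point of the sketch is the dependency it exposes: the comparison is only as trustworthy as the SDP string it reads from, which arrives over the signalling channel.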
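The WebRTC specification states explicitly what is mandatory and what is prohibited. SRTP for all audio and video, DTLS for all data channels, and DTLS-SRTP key negotiation are mandatory. Unencrypted media transmission and SDES key exchange are prohibited.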
These are not best practices. They are requirements baked into the standard itself.
A significant number of developers searching for WebRTC security information arrive at a specific question: does WebRTC require the encrypted-media permissions policy, and what happens if it’s blocked?
The short answer is no — and the confusion, while understandable, comes from two unrelated browser features sharing similar-sounding names.
The encrypted-media permissions policy governs Encrypted Media Extensions (EME), a separate browser API used for DRM-protected content — streaming video from services like Netflix, for example. It has nothing to do with WebRTC’s own encryption mechanisms.
If you’re seeing errors related to permissions-policy: encrypted-media in a WebRTC application, the cause is almost certainly elsewhere. The most likely culprits are a Content Security Policy blocking something WebRTC depends on, or a browser configuration restricting camera or microphone access through a separate permissions mechanism. Blocking encrypted-media in your permissions policy will not affect WebRTC media encryption, which operates through DTLS-SRTP at the protocol level regardless of permissions policy configuration.
WebRTC’s encryption is mandatory and handled at the protocol level. It does not pass through or depend on the permissions policy framework.
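When debugging, it can help to separate the directives that matter for a WebRTC application from the one that does not. The sketch below is a deliberately simplified parser for a Permissions-Policy header value. It reports only the capture-related features (`camera`, `microphone`, `display-capture`) that have an empty allowlist, and ignores `encrypted-media` entirely. The function name `blockedCaptureFeatures` is hypothetical, and the parsing assumes simple comma-separated directives rather than the full structured-header grammar.

```typescript
// Features whose blocking affects getUserMedia / getDisplayMedia.
// encrypted-media is intentionally absent: it governs EME (DRM), not WebRTC.
const CAPTURE_FEATURES = ["camera", "microphone", "display-capture"];

/**
 * Return the capture features that a Permissions-Policy header value
 * disables outright with an empty allowlist, e.g. `camera=()`.
 * Simplified parsing: assumes directives are separated by commas.
 */
function blockedCaptureFeatures(headerValue: string): string[] {
  const blocked: string[] = [];
  for (const directive of headerValue.split(",")) {
    const separator = directive.indexOf("=");
    if (separator === -1) continue; // not a name=allowlist pair
    const name = directive.slice(0, separator).trim().toLowerCase();
    const allowlist = directive.slice(separator + 1).trim();
    if (allowlist === "()" && CAPTURE_FEATURES.indexOf(name) !== -1) {
      blocked.push(name);
    }
  }
  return blocked;
}
```

A header of `encrypted-media=()` on its own yields an empty list, which reflects the point above: that directive has no bearing on whether a WebRTC call can capture or encrypt media.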
Understanding the boundary of WebRTC’s security model matters as much as understanding what it does protect.
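WebRTC’s security guidance strongly recommends that signalling run over an encrypted transport such as HTTPS or secure WebSockets, and browsers push in the same direction by restricting camera and microphone capture to pages loaded in a secure context. But signalling is not part of the WebRTC specification itself. It’s the application’s responsibility to ensure that session negotiation happens over a secure channel.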
Here’s how the failure mode typically plays out. A team builds a WebRTC application — video consultations, say, or a financial services communication tool. Media is encrypted. DTLS handshakes are completing successfully. From the application’s perspective, everything looks secure. What the team hasn’t done is secure the signaling server with TLS.
During session negotiation, the signaling server exchanges the session description between participants — including the certificate fingerprints that DTLS will use to verify the connection. On an unsecured signaling channel, that exchange is visible. An attacker positioned between the client and the signaling server can intercept the session description and substitute their certificate fingerprints before it reaches the other participant. When DTLS runs its handshake, it verifies the fingerprints — but it’s verifying the attacker’s fingerprints, not the legitimate participant’s. The handshake completes. The session is “encrypted.” The attacker is in the middle of it.
DTLS-SRTP did exactly what it was designed to do. The signalling layer is what failed, and because DTLS depends on signalling integrity to do its job, the entire security model unravelled at a point the team wasn’t watching.
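One inexpensive guard against the scenario above is to refuse to connect to a signalling endpoint that is not TLS-protected. The following is a minimal sketch of such a check. The names `urlScheme` and `assertSecureSignalingUrl` are hypothetical, and the example endpoint is a placeholder. A check like this complements, and does not replace, enforcing TLS on the server itself.

```typescript
// Schemes that imply TLS underneath: secure WebSockets and HTTPS.
const SECURE_SIGNALING_SCHEMES = ["wss", "https"];

/** Return the lowercase URL scheme, or null if the string has none. */
function urlScheme(url: string): string | null {
  const match = /^([a-z][a-z0-9+.-]*):\/\//i.exec(url.trim());
  return match ? match[1].toLowerCase() : null;
}

/**
 * Return the URL unchanged if it uses a TLS-protected scheme;
 * throw otherwise so the application fails closed.
 */
function assertSecureSignalingUrl(url: string): string {
  const scheme = urlScheme(url);
  if (scheme === null || SECURE_SIGNALING_SCHEMES.indexOf(scheme) === -1) {
    throw new Error(`Refusing insecure signalling endpoint: ${url}`);
  }
  return url;
}
```

In use, the application would pass the result to its WebSocket or HTTP client, for example `new WebSocket(assertSecureSignalingUrl("wss://signal.example.com/ws"))`, so that a misconfigured `ws://` endpoint fails loudly at startup rather than silently exposing session descriptions.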
In a peer-to-peer call between two participants, end-to-end encryption holds — only the two participants hold the keys. When a media server is involved — for group calls, recording, or transcription — encryption terminates at the server. The server decrypts incoming streams and re-encrypts outgoing ones. This is a deliberate architectural trade-off, not a vulnerability, but it means the security of the media server itself becomes part of your security model. In a healthcare or enterprise context, this is a due diligence question as much as a technical one: who operates your media server, what are their security certifications, and what does their access to unencrypted session audio mean for your compliance posture?
WebRTC uses STUN and TURN servers to establish connections across NAT and firewalls. These servers are infrastructure you operate, or choose a vendor to operate, and they carry the same security obligations as any other component in your stack. TURN servers in particular relay media when a direct peer-to-peer connection isn’t possible, which means they sit in the media path. Unsecured TURN servers are a real exposure point, and they get far less attention than they deserve.
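As a small illustration, a client configuration can be checked for TURN entries that carry no credentials before it is handed to a peer connection. This is a sketch under stated assumptions: `IceServerConfig` mirrors the `urls`, `username`, and `credential` fields of the browser’s ICE server configuration, the function name `findUnauthenticatedTurnUrls` is hypothetical, and the hostnames are placeholders. It checks only the client side. Access control on the TURN server itself still has to be configured separately.

```typescript
/** Mirrors the urls / username / credential fields of an ICE server entry. */
interface IceServerConfig {
  urls: string | string[];
  username?: string;
  credential?: string;
}

/**
 * Return every TURN or TURNS URL that is configured without a username
 * and credential. STUN entries are skipped: they do not relay media.
 */
function findUnauthenticatedTurnUrls(servers: IceServerConfig[]): string[] {
  const flagged: string[] = [];
  for (const server of servers) {
    const urls = Array.isArray(server.urls) ? server.urls : [server.urls];
    const hasCredentials = Boolean(server.username) && Boolean(server.credential);
    for (const url of urls) {
      if (/^turns?:/i.test(url) && !hasCredentials) {
        flagged.push(url);
      }
    }
  }
  return flagged;
}
```

An empty result means every relay entry at least presents credentials. It says nothing about how long those credentials live or how they are issued, which remain deployment decisions.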
The protocol verifies that the party you’re communicating with is the same one you negotiated the session with, but it doesn’t verify who that party is. User authentication, session authorization, and identity management are application responsibilities. WebRTC does not prevent an unauthorized user from initiating a session if your application permits it.
If your application notifies users of incoming calls when the app isn’t active, that channel is entirely separate from WebRTC’s encryption. In a general application, it is standard practice to include the caller’s name or other identifying information in the notification payload. In healthcare or financial services contexts, that same information may constitute protected data traveling over a channel with entirely different security properties than the call itself — something we’ll come back to in the regulated industries section.
Messages, recordings, session metadata, and file attachments stored after a session depend on whatever security architecture your application provides — or doesn’t. WebRTC does not record or store anything. That’s the application’s responsibility.
This is what WebRTC’s “zero trust” design philosophy means in practice: each component in the stack implements its security independently. The security of a WebRTC-based application is the sum of how well each layer is implemented — not something WebRTC guarantees end-to-end.
WebRTC’s mandatory encryption makes it a strong foundation for communication in regulated environments — healthcare, financial services, legal — where data protection requirements aren’t negotiable. Encrypted media in transit is one of the most fundamental compliance requirements, and WebRTC handles it by default.
However, default media encryption alone does not make an application compliant. Regulated industries don’t change anything about WebRTC itself: the compliance requirements attach to the application, not the protocol.
In healthcare, HIPAA compliance extends to every component that touches protected health information. Signaling servers, message storage, session logs, file attachments, and the infrastructure underneath the application all carry compliance obligations. A WebRTC-based telehealth application isn’t HIPAA compliant if the signaling server logs unencrypted session metadata, regardless of what WebRTC does at the media layer.
The details matter in ways that aren’t always visible at the start. Call recordings require patient consent, appropriate storage, encryption at rest, and audited access — none of which WebRTC provides. Transcription introduces further considerations: if a third-party LLM or edge model is processing call audio, that vendor relationship carries its own compliance assessment. And push notifications — the channel that alerts a user to an incoming call when the app isn’t active — need to be evaluated for PII. Including a patient’s name or appointment details in a notification payload is unremarkable in a general application; in a healthcare context, that information may be protected health information traveling over a channel with entirely different security properties than the call itself.
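One way to handle the push notification concern is to keep identifying details out of the payload and send only an opaque call identifier, letting the app fetch the details over its authenticated channel once opened. The sketch below assumes a hypothetical `IncomingCall` record and `PushPayload` shape. Real push services define their own payload formats, and whether this approach satisfies a given regulation is a compliance decision, not something the code settles.

```typescript
/** Hypothetical server-side record of a call about to be announced. */
interface IncomingCall {
  callId: string;           // opaque identifier, carries no personal data
  callerName: string;       // potentially protected information
  appointmentNote?: string; // potentially protected information
}

/** Hypothetical generic payload handed to a push service. */
interface PushPayload {
  title: string;
  body: string;
  data: { callId: string };
}

/**
 * Build a notification that reveals nothing beyond "a call is waiting".
 * The caller's name and appointment details are deliberately not copied.
 */
function buildCallNotification(call: IncomingCall): PushPayload {
  return {
    title: "Incoming call",
    body: "Open the app to answer.",
    data: { callId: call.callId },
  };
}
```

The trade-off is a less informative lock-screen notification in exchange for keeping protected details on the channel that was designed to carry them.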
For a detailed look at what HIPAA compliance requires at the communication infrastructure level, see What Is a HIPAA-Compliant Chat API? and What Is HIPAA-Compliant Video Conferencing?
WebRTC’s mandatory encryption handles the media layer. The application layer is yours to secure.
The most critical thing to get right is signaling channel security. As the scenario above illustrates, an unsecured signaling channel doesn’t just create a parallel vulnerability — it undermines DTLS-SRTP specifically, because DTLS certificate verification depends on the integrity of the session description exchanged during signaling. TLS on your signaling server isn’t one item on a security checklist. It’s the prerequisite for the rest of WebRTC’s security model to function as designed.
Authentication is the next layer. Token-based authentication tied to your existing identity system is the standard approach — establish it before any WebRTC session is initiated. The protocol will connect whoever presents themselves. Verifying that they’re authorized to be there is the application’s job.
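A minimal sketch of that gate follows. It assumes the token has already been cryptographically verified by your identity system’s library and decoded into a claims object. The `SessionClaims` shape and the `canJoinRoom` function are hypothetical, and signature verification is intentionally out of scope here.

```typescript
/** Hypothetical claims decoded from an already-verified token. */
interface SessionClaims {
  userId: string;
  roomId: string;    // the one room this token grants access to
  expiresAt: number; // expiry as Unix time in seconds
}

/**
 * Decide whether signalling for `roomId` may proceed.
 * `claims` is null when token verification failed upstream.
 */
function canJoinRoom(
  claims: SessionClaims | null,
  roomId: string,
  nowSeconds: number,
): boolean {
  if (claims === null) return false;                // no valid token
  if (claims.expiresAt <= nowSeconds) return false; // token expired
  return claims.roomId === roomId;                  // token scoped to this room
}
```

The signalling server would run a check like this before relaying any offer or answer, so that an unauthorized client never reaches the point where WebRTC would connect it.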
Beyond those two, a few areas deserve attention:
STUN/TURN infrastructure carries the same security obligations as any other component in your stack. Ensure access is controlled, connections are authenticated, and your TURN servers are kept current. An unsecured TURN server sitting in the media relay path is a real exposure point.
Media server security matters in proportion to what that server can access. If it’s handling group calls or recording, encryption terminates there — harden it, control access to it, and ensure it meets whatever regulatory requirements apply to your deployment.
Audit logging doesn’t come from WebRTC. If your deployment requires a record of who communicated with whom, when, and what was shared, that logging needs to be built into your application and infrastructure layer.
Browser updates are part of your security maintenance. WebRTC security depends on the browser implementations that enforce it. Keeping your application compatible with current browser versions isn’t optional housekeeping — it’s how the communication layer stays current with security patches.
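For the audit logging point above, the record itself can be very small. The sketch below shows one possible shape for an audit event, capturing who did what, in which session, and when. The `AuditEvent` fields, the action names, and the `makeAuditEvent` function are all hypothetical. Storage, retention, and access control for the resulting log are separate decisions that this example does not address.

```typescript
type AuditAction = "session.join" | "session.leave" | "file.share";

/** Hypothetical audit record: who did what, in which session, and when. */
interface AuditEvent {
  timestamp: string; // ISO 8601, UTC
  actorId: string;
  action: AuditAction;
  sessionId: string;
  detail?: string;   // optional context, e.g. an opaque file identifier
}

/** Build an audit event; the clock is passed in so the function is testable. */
function makeAuditEvent(
  actorId: string,
  action: AuditAction,
  sessionId: string,
  at: Date,
  detail?: string,
): AuditEvent {
  const event: AuditEvent = {
    timestamp: at.toISOString(),
    actorId,
    action,
    sessionId,
  };
  if (detail !== undefined) {
    event.detail = detail;
  }
  return event;
}

/** Serialize as one JSON object per line for an append-only log. */
function toAuditLogLine(event: AuditEvent): string {
  return JSON.stringify(event);
}
```

Keeping identifiers opaque in these records matters for the same reason as in notification payloads: the log is another place where protected information can end up if it is copied in by default.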
For a broader look at the infrastructure decisions that sit around WebRTC — signaling architecture, media servers, TURN servers, compliance considerations — see What Is Real-Time Communication Infrastructure?
The security boundary this article describes — where WebRTC’s guarantees end and application responsibility begins — is precisely where infrastructure decisions become consequential. QuickBlox provides WebRTC-based video and voice infrastructure with HIPAA-compliant hosting, encrypted media transport, and a BAA that covers the communication infrastructure specifically, not just the hosting environment. Flexible deployment options, including private cloud and on-premise, allow you to meet data residency requirements without architectural compromise.
If you’re evaluating WebRTC-based communication infrastructure for a regulated environment or want to understand where the security boundary sits in a specific deployment, explore the QuickBlox Video Calling API or browse the full SDK documentation.
Yes, WebRTC is encrypted by default. It mandates encryption for all media and data channels. Audio and video streams are encrypted using SRTP; data channels are secured via DTLS. The WebRTC specification explicitly prohibits unencrypted media: this is a requirement of the standard itself, not a configuration option. Any compliant WebRTC implementation enforces it automatically.
DTLS-SRTP is the mechanism WebRTC uses to establish and protect media encryption. DTLS handles the handshake that negotiates encryption keys between participants. SRTP uses those keys to encrypt the actual media streams. Together they ensure that keys are never transmitted in a form that could be intercepted, and that all media is encrypted before it leaves the device.
No, WebRTC does not require the encrypted-media permissions policy. That policy governs Encrypted Media Extensions, a separate browser API used for DRM-protected streaming content. It has no connection to WebRTC’s own encryption mechanisms, which operate through DTLS-SRTP at the protocol level regardless of permissions policy configuration.
In peer-to-peer calls between two participants, yes: WebRTC is end-to-end encrypted throughout the session. When a media server is involved (for group calls, recording, or transcription), encryption terminates at the media server, which decrypts incoming streams and re-encrypts outgoing ones. The media server becomes a trusted entity in the security chain. This is a deliberate architectural trade-off that most production group communication systems make.