Developers researching how to add video to their application often arrive with a specific term in mind — video API or video SDK — and a specific outcome they’re trying to reach: reliable, production-grade video embedded in their own product without building the underlying infrastructure themselves.
In short, you usually need both. A video API handles server-side session management, a video SDK handles client-side implementation, and a complete video integration requires the two layers working together.
At QuickBlox, we work with development teams across telehealth, enterprise, and digital health who arrive at this question from different directions. What follows cuts through the terminology to help you understand what each layer actually does, how they fit together, and what that means for your integration decision.
| | Video API | Video SDK |
| --- | --- | --- |
| Primary purpose | Server-side control | Client-side implementation |
| Handles | Sessions, tokens, recordings, webhooks | Video capture, rendering, UI, connection management |
| Runs on | Backend servers | User devices |
| Best for | Custom server-side workflows | Fast application development |
| Required for production video | Usually yes | Usually yes |
The confusion between video APIs and video SDKs is understandable. Both get used loosely in vendor marketing to mean “a way to add video to your application.” The technical distinction is real but narrower than the terminology suggests.
A video API is a server-side interface. It exposes endpoints your backend calls to manage the infrastructure around a video session — creating sessions, generating participant access tokens, initiating and retrieving recordings, managing webhooks, and handling usage reporting. It is the control plane. It doesn’t capture video, render streams, or manage anything on a user’s device.
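To make "generating participant access tokens" concrete, here is a minimal sketch of how a backend might issue and check a signed, time-limited token. It assumes an HMAC-SHA256 scheme with a shared secret. The token layout, the field names (`sid`, `uid`, `exp`), and the functions `issue_token` and `verify_token` are all hypothetical and are not taken from any particular provider's API.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical shared secret; a real deployment would load this from secure config.
SECRET = b"demo-secret-not-for-production"


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    """Inverse of _b64: restore the padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(body: str) -> str:
    """HMAC-SHA256 signature over the encoded payload."""
    return _b64(hmac.new(SECRET, body.encode("ascii"), hashlib.sha256).digest())


def issue_token(session_id: str, user_id: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Sign a small JSON payload granting one user access to one session."""
    issued_at = time.time() if now is None else now
    payload = {"sid": session_id, "uid": user_id, "exp": issued_at + ttl_seconds}
    body = _b64(json.dumps(payload, sort_keys=True).encode("utf-8"))
    return f"{body}.{_sign(body)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature matches and the token is unexpired."""
    try:
        body, signature = token.split(".")
    except ValueError:
        return None  # not in the expected "body.signature" shape
    if not hmac.compare_digest(signature.encode(), _sign(body).encode()):
        return None  # payload or signature was altered
    payload = json.loads(_unb64(body))
    current = time.time() if now is None else now
    return payload if payload["exp"] > current else None
```

In this sketch the backend would call `issue_token` once per participant and hand the result to the client, which presents it when joining. The expiry check is what keeps a leaked token from being reused indefinitely.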
A video SDK is a client-side implementation layer. It provides the libraries your application integrates to capture media from a device, establish connections, render participant video streams, manage in-session controls — mute, camera toggle, screen share, participant management — and handle the user-facing experience of a call. It runs on your users’ devices, not your servers.
Neither one alone gives you a complete video implementation.
A video API without client-side libraries means your team builds the entire device-level implementation — media capture, stream rendering, connection management, UI — from scratch on top of the server-side control layer. That’s a significant engineering commitment.
A video SDK without server-side session management means your application has no way to create sessions, authenticate participants, or manage recordings before the call begins. The client-side experience has nowhere to connect to.
In practice, the question is rarely “API or SDK?” It’s “how much of each layer does the provider handle, and how much does my team need to build and maintain?”
A typical video call in a production application involves both layers working in sequence:
| Step | Layer | What happens |
| --- | --- | --- |
| Session creation | Server-side (API) | Your backend calls the video infrastructure to create a session and generate access tokens for each participant |
| Authentication | Server-side (API) | Tokens are validated; participants are authorized to join |
| Connection | Client-side (SDK) | The SDK establishes the peer-to-peer or media server connection between participant devices |
| Media handling | Client-side (SDK) | The SDK captures audio and video, manages encoding, and handles adaptive bitrate based on network conditions |
| In-session controls | Client-side (SDK) | Mute, camera, screen share, participant management — all handled by the SDK on the user’s device |
| Recording | Server-side (API) | Recording initiation, storage, and retrieval managed through the server-side infrastructure |
| Session end | Server-side (API) | Session terminated, usage logged, webhooks triggered |
What this illustrates: the server-side and client-side layers handle genuinely different things. A developer who arrives looking for a “video API” typically needs the full stack — both layers — to ship a working implementation.
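The sequence in the table can be written down as data, which makes the "you need both layers" point easy to check. The sketch below is purely illustrative: `Layer`, `Step`, `CALL_FLOW`, and `uncovered_steps` are names invented for this example, and the steps simply mirror the table rows.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    SERVER = "server-side (API)"
    CLIENT = "client-side (SDK)"


@dataclass(frozen=True)
class Step:
    name: str
    layer: Layer


# One entry per row of the call-flow table, in order.
CALL_FLOW = [
    Step("session creation", Layer.SERVER),
    Step("authentication", Layer.SERVER),
    Step("connection", Layer.CLIENT),
    Step("media handling", Layer.CLIENT),
    Step("in-session controls", Layer.CLIENT),
    Step("recording", Layer.SERVER),
    Step("session end", Layer.SERVER),
]


def uncovered_steps(flow, available_layers):
    """Return the names of steps whose layer is not available to the team."""
    return [step.name for step in flow if step.layer not in available_layers]
```

Calling `uncovered_steps(CALL_FLOW, {Layer.SERVER})` lists the connection, media handling, and in-session control steps: the work a team would have to build itself if it only had the server-side layer.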
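For most development teams adding video to an application, the API vs SDK framing matters less than two more practical questions: how the provider exposes server-side control, and how much of the client-side implementation comes pre-built.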
Some providers give developers direct access to server-side video endpoints — creating sessions, managing recordings, configuring webhooks — through a separately documented API. Others bundle that server-side functionality into the SDK, abstracting it behind higher-level calls. Both approaches work. Direct API access gives more flexibility for custom server-side integrations. Bundled access through the SDK reduces the surface area your team needs to manage.
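Whichever way webhooks are exposed, the receiving backend usually needs to confirm that an incoming event really came from the video provider. A common general pattern is an HMAC signature over the raw request body. The sketch below assumes that pattern. The secret, the event shape (`type`, `session_id`), and the function names are hypothetical and do not describe any specific provider's webhook format.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_payload(secret: bytes, raw_body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()


def handle_webhook(secret: bytes, raw_body: bytes,
                   signature_header: str) -> Optional[str]:
    """Return the event type if the signature is valid, otherwise None."""
    expected = sign_payload(secret, raw_body)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature_header.encode()):
        return None
    event = json.loads(raw_body)
    return event.get("type")
```

The signature is computed over the raw bytes rather than the parsed JSON, since re-serializing the body can change whitespace or key order and break the comparison.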
QuickBlox’s video infrastructure includes server-side session management, access token generation, recording, and webhooks — all accessible through the SDK rather than through a separately marketed API. Developers get the full server-side control layer without needing to integrate and maintain a separate API alongside the client SDK.
A raw video API with minimal SDK support means your team builds most of the client-side implementation — UI components, connection handling, adaptive bitrate logic, cross-platform support — from scratch. A full-featured video SDK provides that client-side layer ready to integrate and customize. The difference in development time is significant — weeks to months depending on the complexity of the implementation.
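As a small taste of what "adaptive bitrate logic" involves, here is a deliberately simplified sketch that picks a video quality level from an estimated bandwidth figure. The ladder values, the `headroom` factor, and the names `Rung`, `LADDER`, and `pick_rung` are illustrative assumptions. Production SDKs typically lean on the bandwidth estimation built into their media stack and react continuously, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    label: str
    kbps: int


# Illustrative quality ladder; real values depend on codec and content.
LADDER = [Rung("180p", 150), Rung("360p", 500), Rung("720p", 1500), Rung("1080p", 3000)]


def pick_rung(estimated_kbps: float, ladder=LADDER, headroom: float = 0.8) -> Rung:
    """Pick the highest rung that fits within a fraction of the estimated bandwidth."""
    budget = estimated_kbps * headroom
    affordable = [rung for rung in ladder if rung.kbps <= budget]
    if not affordable:
        # Nothing fits: fall back to the lowest rung rather than dropping video.
        return min(ladder, key=lambda rung: rung.kbps)
    return max(affordable, key=lambda rung: rung.kbps)
```

Even this toy version has to make policy decisions (how much headroom to leave, what to do when nothing fits), and a real client would also have to avoid flapping between rungs when the estimate fluctuates.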
For most development teams, an SDK-first approach — where both the server-side control layer and the client-side implementation are accessed through a single integrated SDK — has practical advantages worth understanding.
Pre-built UI components, connection handling, and cross-platform support mean your team isn’t building the client-side layer from scratch. The integration surface is smaller and the path to a working implementation is shorter.
A single SDK integration means a single point of update when the underlying infrastructure evolves. Browser APIs change. WebRTC implementations get updated. Codec support shifts. When the server-side and client-side layers are maintained together by the provider, your team inherits those updates through SDK version upgrades rather than managing them independently across two separate integrations.
In regulated environments, having the server-side and client-side layers under a single BAA — rather than requiring separate compliance verification for an API layer and an SDK layer from different vendors — simplifies the compliance architecture meaningfully.
If your team needs deep server-side customization — custom session orchestration, non-standard recording workflows, integration with proprietary infrastructure — direct API access gives more control. Teams with the engineering capacity to build and maintain a custom client-side implementation, and specific requirements that a pre-built SDK doesn’t accommodate, may find a raw API approach worth the overhead.
“I need a video API, not a video SDK.” Usually what this means is: I need server-side control over video sessions — creating them, authenticating participants, managing recordings. That’s a reasonable requirement. What’s worth checking is whether those server-side capabilities are accessible through an SDK rather than a separately documented API — because in most modern video infrastructure they are. The question isn’t which term the provider uses. It’s whether the server-side control layer you need is accessible and well-documented.
“A video SDK is just for the front end.” A common assumption, and often wrong in practice. The client-side libraries run on user devices, but many modern video SDK packages also include server-side session management, access token generation, recording infrastructure, webhook handling, and usage reporting. “SDK” doesn’t have to mean client-only: it means a packaged set of tools that spans whatever layers the provider has chosen to bundle together.
The developers who get stuck on the API vs SDK question are usually asking the right underlying question in slightly the wrong way. The real question is: what does the provider’s video infrastructure actually include, how is it accessed, and how much does my team need to build and maintain on top of it?
At QuickBlox, our video infrastructure includes server-side session management, access token generation, recording, and webhooks alongside the full client-side SDK — iOS, Android, web, React Native, Flutter. Developers access the complete stack through the SDK. There’s no separately marketed video API because there doesn’t need to be — the server-side control layer is part of the same integration.
If you’re evaluating video infrastructure and want to understand exactly what’s accessible through our SDK and how it maps to your specific use case, that’s a conversation worth having before you’ve committed to an architecture.
Explore the QuickBlox Video Calling API or browse the full QuickBlox SDK documentation to see what the integration actually looks like.
A video API is a server-side interface — endpoints your backend calls to create sessions, authenticate participants, manage recordings, and handle webhooks. A video SDK is a client-side implementation layer — libraries your application integrates to capture media, render streams, and manage the participant experience on a user's device. Most production video implementations require both layers. Modern video infrastructure providers typically bundle them together.
In most cases you need both a video API and a video SDK, and most providers give you both together. If you're adding video to an application and need server-side session management alongside a client-side implementation, an SDK that includes the server-side control layer is the most practical starting point. A separately documented video API makes more sense if you need deep server-side customization that a bundled SDK doesn't accommodate.
For most use cases, a video SDK can stand in for a separate video API, provided the SDK includes the server-side control layer. Session creation, access token generation, recording management, and webhooks are all server-side operations that modern video SDKs expose through higher-level calls rather than raw API endpoints. What you lose with a purely SDK-based approach is direct server-side access for custom orchestration, which most teams don't need.
In terms of server-side customization, a video API is the more flexible option. Direct API access gives more control over session orchestration and server-side workflow integration. In terms of total implementation effort, a full-featured SDK is significantly faster, because the client-side layer comes pre-built rather than requiring custom development. The right choice depends on how much server-side customization your use case actually requires.
The difference between providers that market a video API and providers that market a video SDK is mostly terminology and positioning rather than a fundamental architectural difference. Providers that market a video API typically expose their server-side infrastructure directly and leave the client-side implementation to the developer. Providers that market a video SDK typically bundle both layers together. Both approaches provide access to server-side and client-side video infrastructure; the difference is in how much is pre-built versus how much your team builds on top.
A video call API is a server-side interface used to create and manage video sessions, authenticate participants, handle recordings, and integrate video functionality into an application. In practice, developers evaluating a video call API typically also need a client-side video SDK to handle media capture, rendering, and the user experience of the call.
Last reviewed: June 2026
Written by: Gail M.
Reviewed by: QuickBlox Product & Platform Team