What is the WebRTC protocol?

WebRTC (Web Real-Time Communication) is an open source technology and set of protocols that enables real-time communication directly between web browsers and WebRTC-enabled applications.

Using WebRTC you can do video calling, audio calling and data transfer between devices.

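This capability is exposed through a set of JavaScript APIs (MediaStream, RTCPeerConnection and RTCDataChannel) that enable video, audio and data transmission between devices. Under the hood, these APIs rely on protocols and techniques such as ICE, STUN, TURN, NAT traversal and SDP.

We are going to learn more about these protocols and APIs below.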

ICE (Interactive Connectivity Establishment)

ICE is a protocol used to find the best path between devices so that they can establish a connection.

It navigates NAT routers and firewall rules, overcoming the connectivity barriers they introduce with the help of STUN and TURN servers.

How does it work:

ICE gathers all the candidates for the media streams, that is, the potential paths between the devices trying to connect.

It first tries a direct connection, using STUN servers to discover the public IP addresses of the client devices. If that fails, usually because of NAT devices or firewall rules, it falls back to relaying the traffic through a TURN server.
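In application code, the STUN and TURN servers that ICE should use are usually supplied as a configuration object. The snippet below is a minimal sketch of such an object. The Google STUN address is a commonly used public server, while the TURN hostname, username and credential are placeholders invented for this example.

```javascript
// Build an RTCConfiguration-shaped object listing STUN and TURN servers.
// turn.example.com and the credentials are placeholders, not a real service.
function buildIceConfig({ turnUsername, turnCredential }) {
  return {
    iceServers: [
      // STUN: lets the browser discover its public IP address and port.
      { urls: "stun:stun.l.google.com:19302" },
      // TURN: relay used only when a direct path cannot be established.
      {
        urls: [
          "turn:turn.example.com:3478?transport=udp",
          "turn:turn.example.com:443?transport=tcp",
        ],
        username: turnUsername,
        credential: turnCredential,
      },
    ],
  };
}

const iceConfig = buildIceConfig({
  turnUsername: "demo-user",
  turnCredential: "demo-secret",
});
// In a browser: const pc = new RTCPeerConnection(iceConfig);
```

Listing both a UDP and a TCP TURN URL gives ICE a fallback on networks that block UDP.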

If you are looking for ICE servers and want to know more about ICE, refer to our article: Interactive Connectivity Establishment (ICE) Server: The Complete Guide

If you are looking for a TURN server for your app, you can consider Metered TURN servers, a global TURN server service provider.

STUN server (Session Traversal Utilities for NAT)

STUN servers are used by devices behind a NAT to find out their public IP address and port number.

Devices behind a NAT have private IP addresses assigned to them by the NAT router.

All the traffic from the devices behind a particular NAT is routed through a single public IP address or a small number of them.

When devices want to connect to each other directly, each one needs to know its own public IP address and port number as well as the other's.

Client devices use a STUN server to find out their own public IP address and port number, which they then send to each other to establish a direct line of communication.

A client device sends a request to the STUN server, which replies with the IP address and port number that the request came from.
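To make the request and reply a little more concrete, here is a rough sketch of two pieces of the STUN wire format described in RFC 5389: building a Binding Request and decoding the XOR-MAPPED-ADDRESS value that carries the public address in the reply. It handles IPv4 only and leaves out sending the packet, parsing the full response and all error handling. The function names are made up for this example.

```javascript
// Sketch of two pieces of the STUN wire format (RFC 5389), IPv4 only.
const MAGIC_COOKIE = 0x2112a442; // fixed value present in every STUN message

// Build a 20-byte Binding Request with no attributes.
// transactionId must be 12 bytes; real clients use random bytes.
function buildBindingRequest(transactionId) {
  if (transactionId.length !== 12) {
    throw new Error("transaction ID must be exactly 12 bytes");
  }
  const message = new Uint8Array(20);
  const view = new DataView(message.buffer);
  view.setUint16(0, 0x0001);       // message type: Binding Request
  view.setUint16(2, 0);            // message length: no attributes follow
  view.setUint32(4, MAGIC_COOKIE); // magic cookie
  message.set(transactionId, 8);   // transaction ID
  return message;
}

// Decode the 8-byte value of an IPv4 XOR-MAPPED-ADDRESS attribute, which is
// how the server reports the public address the request came from.
function decodeXorMappedAddressV4(value) {
  const view = new DataView(value.buffer, value.byteOffset, value.byteLength);
  if (view.getUint8(1) !== 0x01) {
    throw new Error("only the IPv4 address family is handled in this sketch");
  }
  // Port and address are XORed with the magic cookie on the wire.
  const port = view.getUint16(2) ^ (MAGIC_COOKIE >>> 16);
  const address = (view.getUint32(4) ^ MAGIC_COOKIE) >>> 0;
  const ip = [24, 16, 8, 0].map((shift) => (address >>> shift) & 0xff).join(".");
  return { ip, port };
}
```

In the browser you never write this yourself, because RTCPeerConnection talks to the STUN server for you. The sketch is only meant to show how little information the exchange contains.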

There are many free and paid STUN servers available. Google also provides free STUN servers for public use: Google STUN server list

NAT (Network Address Translation)

NAT, or Network Address Translation, is a method by which a NAT device uses a single public IP address, or a small number of them, to channel traffic to and from the multiple devices behind it. (These devices are given private IP addresses, and the NAT device maps each of their connections to a port on the public address.)

This process was invented to conserve the limited number of IPv4 addresses. You can learn more about how NAT works here: NAT traversal: How does it work?
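The mapping idea can be illustrated with a toy translation table. This is a simplified model written for this article, not how any particular router is implemented. The class name, the starting port and the addresses are all assumptions.

```javascript
// Toy model of a NAT translation table (illustrative only, not a real NAT).
class ToyNat {
  constructor(publicIp, firstPort = 40000) {
    this.publicIp = publicIp;
    this.nextPort = firstPort;
    this.outbound = new Map(); // "privateIp:privatePort" -> public port
    this.inbound = new Map();  // public port -> { ip, port } of the private device
  }

  // Called when a private device sends a packet out: returns the public
  // address the outside world sees, reusing an existing mapping if present.
  translateOutbound(privateIp, privatePort) {
    const key = `${privateIp}:${privatePort}`;
    if (!this.outbound.has(key)) {
      const publicPort = this.nextPort++;
      this.outbound.set(key, publicPort);
      this.inbound.set(publicPort, { ip: privateIp, port: privatePort });
    }
    return { ip: this.publicIp, port: this.outbound.get(key) };
  }

  // Called when a packet arrives from outside: returns the private device
  // it belongs to, or null when no mapping exists (the packet is dropped).
  translateInbound(publicPort) {
    return this.inbound.get(publicPort) || null;
  }
}
```

The last method shows why peers behind a NAT cannot simply be dialled from outside: until the device has sent something out, there is no mapping for incoming packets to follow.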

TURN (Traversal Using Relays around NAT)

TURN servers relay the data for WebRTC connections when a direct peer to peer connection is not possible because of NAT or firewall restrictions.

TURN is used as a last resort in the ICE process, when direct communication between devices fails.
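The "last resort" behaviour comes from how ICE ranks candidates. The sketch below uses the priority formula and the recommended type preference values from the ICE specification (RFC 8445). The function names and the default local preference of 65535 are choices made for this example, and real browsers may pick different values.

```javascript
// Candidate priority as recommended by the ICE specification (RFC 8445):
// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

function candidatePriority(type, localPreference = 65535, componentId = 1) {
  const typePreference = TYPE_PREFERENCE[type];
  if (typePreference === undefined) {
    throw new Error(`unknown candidate type: ${type}`);
  }
  return 2 ** 24 * typePreference + 2 ** 8 * localPreference + (256 - componentId);
}

// Sort candidate types from most to least preferred.
function sortByPriority(types) {
  return [...types].sort((a, b) => candidatePriority(b) - candidatePriority(a));
}
```

Because relay candidates get a type preference of 0, any working host or server-reflexive (STUN) path is preferred over a TURN relay.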

TURN servers are resource intensive and need a lot of bandwidth and CPU to function.

TURN servers need to be close to your users, so if your users are distributed around the world you need TURN servers all over the globe.

If you are looking for a global TURN server provider then you can consider Metered.ca TURN servers

SDP (Session Description Protocol)

SDP is a standard for describing multimedia communication sessions for the purposes of:

Session announcement
Session invitation
Other forms of multimedia session initiation

SDP itself does not deliver media streams or transport data. It only defines the format of the session description, which conveys information about the media streams in a multimedia session so that devices can receive a particular stream.

Purpose of SDP

SDP was designed to be extensible and to work with varied network environments and formats.

It is used to describe the multimedia communication and to control the logistics of connectivity and media exchange.

Structure of SDP

SDP describes multimedia sessions using plain text encoding with a simple syntax.

An SDP message consists of lines of text in the form type=value, where type is a single character that identifies the field and value is a structured text string.

These messages are typically transported by other protocols such as SIP (Session Initiation Protocol), or exchanged as part of the WebRTC signalling process when establishing a new connection.

Here are some of the key components of SDP

Version: Shows the version of SDP that is being used
Origin: Identifies the initiator of the session and the session identifier
Session Name: Provides a human readable name for the session
Timing: Describes the start time and the stop time for the session
Media Descriptions: Describe the media components of the session, including the media type (audio, video or text), port, protocol and the media formats that are being used.
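Because SDP is just lines of type=value text, it is easy to take apart. The parser below is a minimal sketch that separates session-level fields from media descriptions. The function name and the shape of the returned object are invented for this example, and the sample SDP is a shortened, hand-written one rather than real browser output.

```javascript
// Parse SDP text into session-level fields and media sections.
// Each line has the form <type>=<value>, where <type> is a single character.
function parseSdp(sdpText) {
  const session = { fields: [], media: [] };
  let current = session.fields; // lines before the first "m=" are session-level

  for (const line of sdpText.split(/\r?\n/)) {
    if (line.length < 2 || line[1] !== "=") continue; // skip blank or malformed lines
    const type = line[0];
    const value = line.slice(2);
    if (type === "m") {
      // An "m=" line starts a media description: <media> <port> <proto> <formats...>
      const [kind, port, protocol, ...formats] = value.split(" ");
      const section = { kind, port: Number(port), protocol, formats, fields: [] };
      session.media.push(section);
      current = section.fields;
    } else {
      current.push({ type, value });
    }
  }
  return session;
}

const exampleSdp = [
  "v=0",                                        // version
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1", // origin
  "s=-",                                        // session name
  "t=0 0",                                      // timing
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",            // media description
  "a=rtpmap:111 opus/48000/2",                  // attribute of that media section
].join("\r\n");

const parsedSdp = parseSdp(exampleSdp);
console.log(parsedSdp.fields.map((field) => field.type).join(""));
console.log(parsedSdp.media[0].kind, parsedSdp.media[0].protocol);
```

Production code would normally rely on an existing SDP library, since the full grammar has many more rules than this sketch covers.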

Role of SDP in WebRTC

SDP is central to the offer/answer model, a fundamental signalling mechanism in WebRTC that is used to establish a connection between peers.

Here is how SDP works in WebRTC

Offer/Answer: One client generates an SDP offer and sends it to the other client device with which it wants to establish a connection.

The other client then responds with an answer. This exchange describes the proposed media capabilities of both client devices, such as supported codecs, media types and encryption requirements for establishing a connection.

Negotiation:

The SDP exchange includes negotiation between the clients about which codecs and encryption requirements are supported by both and can be used to establish a connection.
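The order of calls in an offer/answer exchange can be sketched as follows. The methods used are the standard RTCPeerConnection ones, but the negotiate helper is made up for this example. It also passes the offer and answer directly between two local objects, whereas in a real application they would travel through your signalling channel.

```javascript
// Run one offer/answer exchange between two peer connection objects.
// In a real app the peers are on different machines and the offer and
// answer are sent through a signalling channel; here both are local.
async function negotiate(caller, callee) {
  const offer = await caller.createOffer();   // caller describes what it proposes
  await caller.setLocalDescription(offer);
  await callee.setRemoteDescription(offer);   // callee learns the caller's proposal

  const answer = await callee.createAnswer(); // callee replies with what it accepts
  await callee.setLocalDescription(answer);
  await caller.setRemoteDescription(answer);  // both sides now hold both descriptions
  return { offer, answer };
}

// Browser usage (both peers in one page, as in a local demo):
// const a = new RTCPeerConnection();
// const b = new RTCPeerConnection();
// a.createDataChannel("demo"); // give the offer something to negotiate
// const { offer, answer } = await negotiate(a, b);
```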

ICE candidates:

SDP also conveys the ICE candidates in WebRTC. These ICE candidates describe the potential pathways that can be used to establish the connection, including public addresses discovered through STUN servers and relay addresses allocated on TURN servers.

The SDP is dynamically updated during the ICE candidate gathering phase with these addresses from both client devices.
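An individual candidate is a single line of text. The sketch below splits such a line into its parts, which makes it easy to see whether a path is direct (host), discovered through STUN (srflx) or relayed through TURN (relay). The function name and the returned field names are assumptions for this example. The sample address used in the usage comment is a documentation address, not a real peer.

```javascript
// Parse an ICE candidate line of the form:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> [key value ...]
function parseCandidate(line) {
  const parts = line
    .replace(/^a=/, "")
    .replace(/^candidate:/, "")
    .trim()
    .split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") {
    throw new Error("not a valid ICE candidate line");
  }
  const candidate = {
    foundation: parts[0],
    component: Number(parts[1]),
    transport: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7], // host, srflx (via STUN), prflx or relay (via TURN)
    extensions: {},
  };
  // Remaining tokens come in key/value pairs, e.g. "raddr 192.168.1.5 rport 46154".
  for (let i = 8; i + 1 < parts.length; i += 2) {
    candidate.extensions[parts[i]] = parts[i + 1];
  }
  return candidate;
}

// Example:
// parseCandidate("candidate:842163049 1 udp 1694498815 203.0.113.7 46154 typ srflx raddr 192.168.1.5 rport 46154")
```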

MediaStream

The MediaStream API is an important component of the WebRTC suite of APIs.

This API manages the flow of media data such as audio and video. With the help of the MediaStream API, a broad range of applications become possible, such as video streaming, video calls and audio calls.

A MediaStream represents a stream of media made up of tracks, such as multiple audio and video tracks, that are synchronized for a seamless experience.

These tracks can come from multiple sources such as microphones, cameras, screen recorders and even pre-recorded media.

The streams are then transmitted between peer devices for real time communication.

Key features of MediaStream API

Stream Capture

The MediaStream API can capture a media stream from a user device. This is done with the help of the getUserMedia() method.

This method asks for the user's permission to access the microphone and camera inputs and returns a promise that resolves to a MediaStream containing the requested media types.

Track Manipulation:

The MediaStream returned by the getUserMedia() function contains one or more tracks, such as audio tracks and video tracks, and these tracks can be individually manipulated as required.

For example, you can easily enable and disable individual tracks, thus muting a user or turning off their video output.
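Muting is typically done by flipping the enabled flag on a track. The helper below is a small sketch of that idea. getAudioTracks(), getVideoTracks() and the enabled property are standard MediaStream and MediaStreamTrack members, while the setTracksEnabled name is made up for this example.

```javascript
// Enable or disable every track of one kind ("audio" or "video") on a stream.
// A disabled audio track sends silence and a disabled video track sends black
// frames, so this works as a mute or camera-off toggle without stopping capture.
function setTracksEnabled(stream, kind, enabled) {
  const tracks = kind === "audio" ? stream.getAudioTracks() : stream.getVideoTracks();
  for (const track of tracks) {
    track.enabled = enabled;
  }
  return tracks.length; // number of tracks affected
}

// Browser usage (requires user permission):
// const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
// setTracksEnabled(stream, "audio", false); // mute the microphone
// setTracksEnabled(stream, "video", false); // turn the camera picture off
```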

Stream Combination:

As we have seen above, a MediaStream is made up of individual tracks. Tracks can be combined into a single stream, or separated and manipulated individually as required.

Tracks can be removed from one stream and added to another, allowing dynamic reconfiguration during a video call among many participants.

Cloning:

MediaStreams can also be cloned. This is particularly useful where the same media stream needs to be used in multiple contexts simultaneously.

For example, a single media stream may need to be shown to multiple users in a video call and also be recorded for future reference.

The clone can also be encoded and manipulated as required without affecting the original stream.

Compatibility and constraints:

The API lets you apply constraints to the media stream, such as a lower video resolution or noise suppression for audio.

This allows you to tailor media capture to the needs of your application and to the capabilities and performance of the client device.
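Constraints are passed to getUserMedia() as a plain object. The sketch below builds two example constraint sets. The property names (echoCancellation, noiseSuppression, width, height, frameRate) are standard media track constraints, but the profile names and the specific numbers are illustrative assumptions rather than recommendations.

```javascript
// Build getUserMedia() constraints for a low-bandwidth or a high-quality call.
// The profile names and numbers are illustrative, not recommendations.
function buildConstraints(profile) {
  const audio = { echoCancellation: true, noiseSuppression: true };
  if (profile === "low") {
    return {
      audio,
      video: { width: { ideal: 640 }, height: { ideal: 360 }, frameRate: { max: 15 } },
    };
  }
  return {
    audio,
    video: { width: { ideal: 1280 }, height: { ideal: 720 }, frameRate: { ideal: 30 } },
  };
}

// Browser usage:
// const stream = await navigator.mediaDevices.getUserMedia(buildConstraints("low"));
```

Using ideal values instead of exact ones lets the browser fall back to the closest setting the device supports rather than failing.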

Practical use cases for MediaStream API

Video Conferencing

You can build video conferencing with the MediaStream API: capture the camera and audio streams of multiple participants and show them to the other participants.

Media Recording

You can combine the MediaStream API with the MediaRecorder API to record the stream locally in the browser or to offer features like session recording.

Real Time Media Processing

Media streams can be processed in real time to apply effects, change the resolution, perform analytics and more.

Broadcasting

MediaStreams can be broadcast to a large audience over the internet through media servers. You can also live stream events by using WebRTC to capture the camera and audio, then using media servers to broadcast the MediaStream on the internet.

RTCPeerConnection

RTCPeerConnection is one of the core components of the WebRTC suite of APIs.

The primary function of RTCPeerConnection, as the name implies, is to establish and maintain a connection between peers.

This connection allows the direct exchange of data between client devices without the need for an intermediary (apart from the initial signalling process).

RTCPeerConnection handles negotiating the connection details as well as managing the media and data transfer once the devices are connected.

Key features of RTCPeerConnection

Connection Setup:

RTCPeerConnection handles all the negotiation of the media and network details required to set up a connection between two devices.

These details include the offer/answer descriptions and the ICE candidates, and they need to be communicated between peers through a signalling server.

Signalling

RTCPeerConnection does not perform signalling itself; it generates the data that has to be sent through the signalling server.

This data includes the offer/answer and the ICE candidates. The signalling process is essential for establishing a connection between client devices.
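WebRTC does not define what signalling messages look like, so every application picks its own format. The sketch below shows one possible JSON envelope for offers, answers and ICE candidates. The field names (kind, from, to, payload) and the helper names are assumptions made for this example, and the socket in the usage comment stands for whatever channel you use, such as a WebSocket.

```javascript
// Minimal message envelope for a signalling channel.
// The shape { kind, from, to, payload } is an assumption of this sketch;
// WebRTC itself does not define a signalling format.
const SIGNAL_KINDS = ["offer", "answer", "candidate"];

function encodeSignal(kind, from, to, payload) {
  if (!SIGNAL_KINDS.includes(kind)) {
    throw new Error(`unsupported signal kind: ${kind}`);
  }
  return JSON.stringify({ kind, from, to, payload });
}

function decodeSignal(text) {
  const message = JSON.parse(text);
  if (!SIGNAL_KINDS.includes(message.kind)) {
    throw new Error(`unsupported signal kind: ${message.kind}`);
  }
  return message;
}

// Browser usage with a hypothetical `socket` and peer ids:
// pc.onicecandidate = (event) => {
//   if (event.candidate) {
//     socket.send(encodeSignal("candidate", "alice", "bob", event.candidate.toJSON()));
//   }
// };
```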

NAT Traversal:

Using ICE together with STUN and TURN servers, RTCPeerConnection finds the best possible way to establish a connection between devices.

If you are looking for STUN and TURN servers then you can consider the metered.ca TURN servers

Media Stream Management

Once the connection is established, the RTCPeerConnection manages the media streams that are provided by the MediaStream API.

The RTCPeerConnection controls the flow of streams to and from the client devices.

Data Channel Setup

The RTCPeerConnection can establish data channels using the RTCDataChannel API

Using the RTCDataChannel API, any arbitrary data can be transferred between devices, so you can build many kinds of applications on top of WebRTC.

Encryption:

All the data transmitted by the RTCPeerConnection is encrypted: DTLS secures the data channels and provides the keys for SRTP, which encrypts the media.

This ensures that all the communication is safe and secure

Bandwidth Management

RTCPeerConnection has built-in mechanisms to manage bandwidth consumption based on factors like your application requirements and network conditions.

RTCDataChannel

RTCDataChannel is an important component of the WebRTC API. It enables bidirectional transfer of data between devices using WebRTC.

Using this feature, developers can build a wide range of applications beyond the video and audio calling for which WebRTC has traditionally been used.

Developers can build apps like chat apps, collaborative whiteboard and other collaborative apps, file sharing services

The data channel is designed to be highly flexible and supports both highly reliable data delivery as well as unreliable data delivery with low overhead.

The data channel can be configured to suit all kinds of data transfer needs

Key features of RTCDataChannel

Bidirectional and Peer to Peer

The data channel allows direct peer to peer and bidirectional transfer of data between client devices.

Unlike media streams, which are used for video and audio data transfer, the RTCDataChannel can transfer pretty much anything you throw at it.

Configurable transport

There are two modes of transport available with RTCDataChannel:

1. Reliable mode, where data is guaranteed to arrive, and in the order it was sent, at the cost of more overhead.
2. Unreliable mode, which is lightweight, but the data is not guaranteed to arrive at all.
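The mode is chosen through the options passed to createDataChannel(). The sketch below maps the two modes to option objects. ordered and maxRetransmits are standard data channel options, while the dataChannelOptions helper and the channel labels in the usage comments are made up for this example.

```javascript
// Options for RTCPeerConnection.createDataChannel() in the two modes.
function dataChannelOptions(mode) {
  if (mode === "reliable") {
    // Ordered and retransmitted until delivered (this is also the default).
    return { ordered: true };
  }
  if (mode === "unreliable") {
    // Unordered and never retransmitted: lowest overhead, messages may be lost.
    return { ordered: false, maxRetransmits: 0 };
  }
  throw new Error(`unknown mode: ${mode}`);
}

// Browser usage:
// const chat = pc.createDataChannel("chat", dataChannelOptions("reliable"));
// const positions = pc.createDataChannel("positions", dataChannelOptions("unreliable"));
```

A chat or file transfer usually wants the reliable mode, while frequently refreshed state such as game positions can tolerate loss and benefits from the unreliable one.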

Integration with RTCPeerConnection

Data channels are established on the same RTCPeerConnection and share the same underlying transport as the WebRTC media APIs, so they use the same TURN servers for communication.

Security

Like the other WebRTC APIs, RTCDataChannels are encrypted using DTLS for end to end security.

Low Overhead

RTCDataChannels use SCTP (Stream Control Transmission Protocol) over DTLS and UDP.

This combination provides a balance of low latency and reliability compared with TCP-based real time communication solutions.

Practical Applications for RTCDataChannel

Chat Application
Collaborative tools
File Sharing
Gaming
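As an illustration of the file sharing case, large files are usually split into small pieces before being sent over a data channel and reassembled on the other side. The sketch below shows only the chunking and reassembly. The 16 KiB chunk size is a conservative value assumed for this example, not a limit defined by the API, and the function names are invented.

```javascript
// Split binary data into fixed-size chunks for sending over a data channel.
// 16 KiB is a conservative chunk size assumed here; it is not an API limit.
const CHUNK_SIZE = 16 * 1024;

function* chunkBytes(bytes, chunkSize = CHUNK_SIZE) {
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    // subarray() creates a view, so no data is copied here.
    yield bytes.subarray(offset, offset + chunkSize);
  }
}

// Reassemble received chunks into one Uint8Array on the other side.
function joinChunks(chunks) {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const result = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    result.set(chunk, offset);
    offset += chunk.length;
  }
  return result;
}

// Browser usage, given a Uint8Array `fileBytes` and an open RTCDataChannel `channel`:
// for (const chunk of chunkBytes(fileBytes)) channel.send(chunk);
```

A real transfer would also watch the channel's bufferedAmount so that it does not queue data faster than the network can send it.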

Metered TURN servers

API: TURN server management with a powerful API. You can do things like add / remove credentials via the API, retrieve per user / credentials and user metrics via the API, enable / disable credentials via the API, and retrieve usage data by date via the API.
Global Geo-Location targeting: Automatically directs traffic to the nearest servers, for the lowest possible latency and highest quality performance.
Servers in all the Regions of the world: Toronto, Miami, San Francisco, Amsterdam, London, Frankfurt, Bangalore, Singapore, Sydney, Seoul, Dallas, New York
Low Latency: less than 50 ms latency, anywhere across the world.
Cost-Effective: pay-as-you-go pricing with bandwidth and volume discounts available.
Easy Administration: Get usage logs, emails when accounts reach threshold limits, billing records and email and phone support.
Standards Compliant: Conforms to RFCs 5389, 5769, 5780, 5766, 6062, 6156, 5245, 5768, 6336, 6544, 5928 over UDP, TCP, TLS, and DTLS.
Multi‑Tenancy: Create multiple credentials and separate the usage by customer, or different apps. Get Usage logs, billing records and threshold alerts.
Enterprise Reliability: 99.999% Uptime with SLA.
Enterprise Scale: With no limit on concurrent traffic or total traffic. Metered TURN Servers provide Enterprise Scalability
5 GB/mo Free: Get 5 GB of free TURN server usage every month with the Free Plan
Runs on port 80 and 443
Supports TURNS + SSL to allow connections through deep packet inspection firewalls.
Supports both TCP and UDP
Free Unlimited STUN
