Build Video Chat App On AWS with Amazon API Gateway WebSocket, AWS Lambda and Amazon DynamoDB

I'm Deer - AWS

Overview of WebRTC

WebRTC is an open-source project that enables real-time communication through APIs built into the browser. It uses a direct peer-to-peer connection to transfer video and audio streams inside the browser, without installing any extensions or third-party plugins.

WebRTC is usually implemented in one of three architectures. The first is Peer-to-Peer, which establishes a connection between each pair of participants in a room. For example, if room #1 has three members, A, B and C, then the connections A-B, B-C and A-C are established, and each member's media stream (audio and video) is sent directly to the other two. This is also known as a Mesh network, where each participant is directly connected to every other participant in the room.
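To make the cost of a Mesh network concrete, here is a small sketch that counts the connections for a given room size. The function names are mine, chosen for illustration only; the arithmetic is simply "every pair of participants gets one connection".

```javascript
// Number of direct peer connections in a full mesh of n participants:
// one connection per unordered pair, i.e. n * (n - 1) / 2.
function meshConnectionCount(participants) {
  if (!Number.isInteger(participants) || participants < 0) {
    throw new RangeError("participants must be a non-negative integer");
  }
  return (participants * (participants - 1)) / 2;
}

// Each participant has to send its own stream to every other participant.
function outgoingStreamsPerParticipant(participants) {
  return Math.max(participants - 1, 0);
}

for (const n of [2, 3, 10, 100]) {
  console.log(
    `${n} participants: ${meshConnectionCount(n)} connections, ` +
      `${outgoingStreamsPerParticipant(n)} outgoing streams each`
  );
}
```

The count grows quadratically with the room size, which is why Mesh works well for two or three people but not for large meetings.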

The second and third are MCU (Multipoint Control Unit) and SFU (Selective Forwarding Unit). Both require a central server to facilitate communication among the participants in the room. MCU and SFU are designed for video conferencing or live streaming applications with a large number of members in a room, while Peer-to-Peer suits a one-to-one Video Chat App with only two or three people in a room and is not suitable for a meeting with hundreds of participants.

The Peer-to-Peer model does not require a server to relay media or manage the room centrally. However, a signalling mechanism is still required to set up the connections between participants: a channel over which they exchange the Invitation (or Offer) and the Acceptance (or Answer) for a video chat. Examples of such channels include Socket.IO for Node.js and SignalR for .NET. In this post, we will use an Amazon API Gateway WebSocket API backed by AWS Lambda as the signalling channel for our Video Chat App.

Mesh, MCU and SFU WebRTC architecture

Breaking down the application architecture

Let’s take a look at the architecture we will implement, then walk through its flow.

The overall system architecture

First, we have AWS Amplify, which hosts the Frontend Application in our architecture. Amplify is a set of tools that streamline full-stack app development, allowing us to do more with less. One of its many features is application hosting with HTTPS/TLS. This matters for two reasons: the WebSocket API that carries our real-time signalling is reached over a secure wss:// connection, and browsers only grant camera and microphone access to pages served from a secure origin (HTTPS or localhost). By hosting our application on Amplify, we keep the WebSocket connection secure and can use the full capabilities of WebRTC.

Next, the Amazon API Gateway is the core of our system. With API Gateway, we can construct a wide range of APIs, such as REST API, HTTP API or even WebSocket API. To build a real-time video chat app, we need to leverage the WebSocket API, which allows for bidirectional communication with the client. This means that a client can send a message to a service, and services can independently send messages to clients. This bidirectional behaviour enables richer client/services communication because services can push data to clients without requiring clients to make an explicit request. For example, in the chat app, a user in a room can get notified by WebSocket API when someone else joins the same room without making any requests to the Chat/Room service.

The API Gateway WebSocket receives requests from clients and routes them to Lambda functions for processing. Before routing the request, API Gateway performs several steps such as request validation, authentication and transforming the request body to a suitable format. API Gateway WebSocket uses a table that maps route keys to the corresponding backends. A route key is a pattern that is matched against the incoming request to determine the appropriate backend. A backend can be a Lambda function, an HTTP Endpoint or an AWS Service. There are three predefined routes, $connect, $disconnect and $default. The $connect route is selected when the client initializes a connection to API Gateway. The $disconnect route is selected when the client closes a connection, and the $default route is selected when the route selection expression can’t be evaluated against the incoming request or if no matching route is found.

Besides the predefined routes, you can create your own routes, such as joinRoom or sendMsgToRoom. API Gateway selects the appropriate route by evaluating the route selection expression, an expression that API Gateway evaluates against each incoming message to produce a string, which is then used as the route key (it is not arbitrary JavaScript). For example, suppose a client sends the message {"action": "joinRoom", "roomId": "123"} to API Gateway and the route selection expression is set to $request.body.action. API Gateway takes the value of the action field, joinRoom, and routes the message to the function handler for that route. A function handler is a Lambda function associated with a specific route; when a message is routed to it, the function is invoked and processes the message.
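The routing rule described above can be modelled in a few lines. This is only a simplified illustration of the behaviour for the expression $request.body.action, not how API Gateway is actually implemented; selectRoute is a name I made up for the sketch.

```javascript
// Simplified model of route selection for "$request.body.action".
// routeKeys: the custom routes defined on the API.
// rawMessage: the text frame sent by the client.
function selectRoute(routeKeys, rawMessage) {
  let action;
  try {
    action = JSON.parse(rawMessage).action;
  } catch {
    // Non-JSON payload: the expression cannot be evaluated.
    return "$default";
  }
  // Fall back to $default when no custom route matches the action value.
  return typeof action === "string" && routeKeys.includes(action)
    ? action
    : "$default";
}

const routes = ["joinRoom", "sendMsgToRoom"];
console.log(selectRoute(routes, '{"action": "joinRoom", "roomId": "123"}')); // joinRoom
console.log(selectRoute(routes, '{"action": "leaveRoom"}')); // $default
console.log(selectRoute(routes, "hello")); // $default
```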

How API Gateway WebSocket chooses the route

The Lambda functions receive the requests, process them and store data in Amazon DynamoDB. To keep things simple, our chat app only stores the connectionId and the roomId of each participant in a DynamoDB table. The connectionId is a unique identifier for each connection a client makes to API Gateway, and it stays valid until that connection is closed. For our Lambda functions to push data back to clients, we store the connectionId of each client in DynamoDB; a function can then scan or query the table for the target connections and post messages to them through the API Gateway callback URL.

The final component is Amazon DynamoDB. In our application, a single table stores two things: the connectionId used to send messages back to a client, and the roomId of the room that client has joined.
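Before touching DynamoDB, it may help to see the life cycle of one item in that table. The sketch below uses an in-memory Map as a stand-in for the table; the helper names and the connection IDs are invented for illustration, and the real handlers later in this post use DynamoDB commands instead.

```javascript
// In-memory stand-in for the DynamoDB table, keyed by connectionId.
const table = new Map();

// $connect: create the item with only the connectionId.
const onConnect = (connectionId) => {
  table.set(connectionId, { connectionId });
};

// joinRoom: add the roomId to the existing item.
const onJoinRoom = (connectionId, roomId) => {
  table.set(connectionId, {
    ...table.get(connectionId),
    connectionId,
    roomId: String(roomId),
  });
};

// $disconnect: remove the item.
const onDisconnect = (connectionId) => {
  table.delete(connectionId);
};

// Equivalent of scanning the table with a filter on roomId.
const participantsOf = (roomId) =>
  [...table.values()]
    .filter((item) => item.roomId === String(roomId))
    .map((item) => item.connectionId);

onConnect("conn-A");
onConnect("conn-B");
onJoinRoom("conn-A", "123");
onJoinRoom("conn-B", "123");
console.log(participantsOf("123")); // both connections are in room 123

onDisconnect("conn-A");
console.log(participantsOf("123")); // only conn-B remains
```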

Here is the overall flow of our application:

  1. A user first accesses the Frontend application hosted in Amplify.
  2. An input will appear and require the user to enter the roomId.
  3. After submitting the roomId, the user is redirected to a video chat room. A new WebSocket connection is made to API Gateway, and the Lambda function attached to the $connect route is invoked to read the connectionId and store it in DynamoDB.
  4. A message with the body {"action": "joinRoom", "roomId": "……."} is sent to the API Gateway WebSocket API, invoking the Lambda function attached to the joinRoom route. The function updates the DynamoDB table to store the roomId.
  5. The four steps above are repeated for a second user who joins the same room. The first user is then notified of the second user's presence, creates an invitation (an Offer) and sends it in a message with the format {"action": "sendMsgToRoom", "roomId": "…….", "offer": "……"}. The message is received by the Lambda function attached to the sendMsgToRoom route, which is responsible for receiving messages from one participant and forwarding them to the other participants in the room.
  6. When the second user receives the invitation, they generate an acceptance (an Answer) and send it to the first user in a message formatted as {"action": "sendMsgToRoom", "roomId": "…….", "answer": "……"}. The connection between the two users is then established, and they can video chat with each other.
The flow of the application
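Steps 5 and 6 can be sketched with the browser's RTCPeerConnection API. Treat this as a rough outline rather than the code from the repository: buildSignal, sendOffer and answerOffer are hypothetical helpers, and send stands for whatever function writes a string to the WebSocket. The flow above shows the offer as a top-level field, while the sendMsgToRoom handler shown later in this post forwards the message field, so this sketch nests the payload inside message. That nesting is my assumption; index.js in the repository is the reference.

```javascript
// Wrap a signalling payload in the envelope routed to sendMsgToRoom.
function buildSignal(roomId, payload) {
  return JSON.stringify({
    action: "sendMsgToRoom",
    roomId: String(roomId),
    message: JSON.stringify(payload),
  });
}

// Caller side (step 5): create an offer and send it through the channel.
// peerConnection is expected to be an RTCPeerConnection in the browser.
async function sendOffer(peerConnection, send, roomId) {
  const offer = await peerConnection.createOffer();
  await peerConnection.setLocalDescription(offer);
  send(buildSignal(roomId, { offer }));
}

// Callee side (step 6): apply the received offer, then reply with an answer.
async function answerOffer(peerConnection, send, roomId, offer) {
  await peerConnection.setRemoteDescription(offer);
  const answer = await peerConnection.createAnswer();
  await peerConnection.setLocalDescription(answer);
  send(buildSignal(roomId, { answer }));
}
```

When the caller receives the answer, it applies it with setRemoteDescription. ICE candidates also travel over the same signalling channel, which this sketch leaves out.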

Preparing the environment for hands-on lab

Setting up the Cloud9 Instance

To get an environment with everything we need to develop our application, we will create a Cloud9 instance. Cloud9 is an AWS service that provides a cloud-based development environment, allowing you to write, run and debug your application right in your browser.

1. Open the AWS Management Console, and search for Cloud9.

2. Click on Create Environment.

3. Enter “videoChat-env” for name.

4. Add a new tag, enter “App” for key and “videoChat” for value. We will use this tag in the clean-up step: we only have to filter all resources tagged App = videoChat, and then delete them.

5. Click on Create.

6. Wait for the creation to complete, then click on Open.

Cloning and running the application

1. Still in the Cloud9 environment, open the terminal and execute:

git clone https://github.com/weebNeedWeed/workshop-serverless-video-chat.git

cd workshop-serverless-video-chat && npm install

The application code is now cloned into your environment. The application was built using Vite and plain HTML, CSS and JS. Vite is the build tool recommended by AWS Amplify, and it gives you a fast development server and build step.

2. To run your app, open the terminal and execute:

npm run dev

3. Click on Preview, then click on Preview Running Application.

4. A new tab will appear. Inside it, click on Pop out into new window.

5. Here you can enter a roomId and click on Join. Our application has two boxes: one displays your camera and the other displays your partner’s camera. The application does not work yet because we have not set up the WebSocket channel for it.

Open the file workshop-serverless-video-chat/index.js to view the code. If you don’t know yet what the code is doing, please read through the comments I have written in that file.

Building the application

Creating API Gateway WebSocket

1. Open the AWS Console, then search for API Gateway.

2. Scroll down to WebSocket API, then click on Build.

3. Enter “videoChat-apiGw” for Name and “$request.body.action” for Route selection expression. Then click on Create blank API.

4. Scroll down to the bottom and click on Create.

5. Wait for the creation to complete. Navigate to your API, and click on Stages, then click on Production. A stage is a reference to a deployment, which is a snapshot of your API. You can use stages to manage a particular deployment. For example, you can have stages Dev and Prod or V1 and V2 or both in your API. By default, AWS creates a stage named Production for you.

6. Scroll down and click on Tags, then click on Manage tags.

7. Add a tag with tag key as “App” and tag value as “videoChat”. Click on Save.

Creating DynamoDB table

1. Navigate to DynamoDB on AWS Console. Click on Create table.

2. Enter “videoChat-table” for Table name and “connectionId” for Partition key.

3. Scroll down to the Tags section, then add a tag with key “App” and value “videoChat” and click on Create table.

Creating IAM Role for Lambda functions

For our Lambda functions to work, we need to create an IAM Role with enough permissions.

1. Navigate to IAM on AWS Console. Click on Roles, then click on Create role.

2. For Trusted entity type, select AWS Service, then select Lambda for Use case. Click on Next.

3. Search for the following policies, then tick the checkbox to add them to our Role. Then click on Next.

  • AWSLambdaExecute: Allows Lambda functions to put logs into CloudWatch.
  • AmazonDynamoDBFullAccess: Allows Lambda functions to access our DynamoDB table. Do not use it in a Production environment: it grants access to every table in the account, far more than these functions need. Use a policy scoped to the single table instead.
  • AmazonAPIGatewayInvokeFullAccess: Allows Lambda functions to access API Gateway and push data back to clients.
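For reference, a tighter policy for Production could look roughly like the one below. I build it as a JavaScript object so it can be printed as JSON and pasted into the IAM policy editor. The values in angle brackets are placeholders you must replace, and the action list is my reading of what the four handlers in this post call (PutItem, UpdateItem, DeleteItem, Scan and posting to connections), so review it against your own code before using it.

```javascript
// Placeholders: replace with your own region, account ID and API ID.
const region = "<region>";
const accountId = "<account-id>";
const apiId = "<api-id>";

const leastPrivilegePolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      // Only the operations the handlers perform, on a single table.
      Effect: "Allow",
      Action: [
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:DeleteItem",
        "dynamodb:Scan",
      ],
      Resource: `arn:aws:dynamodb:${region}:${accountId}:table/videoChat-table`,
    },
    {
      // Allows pushing messages back to connected WebSocket clients.
      Effect: "Allow",
      Action: "execute-api:ManageConnections",
      Resource: `arn:aws:execute-api:${region}:${accountId}:${apiId}/*/POST/@connections/*`,
    },
  ],
};

console.log(JSON.stringify(leastPrivilegePolicy, null, 2));
```

The functions still need permission to write logs to CloudWatch, so keep a logging policy attached alongside this one.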

4. Enter “videoChat-lambdaRole” for Role name. Scroll down to the Add tags step and add a new tag with key “App” and value “videoChat”, then click on Create role.

Creating Lambda function — $connect

First, let’s inspect the code before creating our lambda function for the $connect route.

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";

const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

export const handler = async (event) => {
  console.log(event);

  // Retrieve the connectionId from the event's request context
  const { connectionId } = event.requestContext;

  // Store it in the DynamoDB table
  const putCommand = new PutCommand({
    TableName: process.env.TABLE_NAME,
    Item: {
      connectionId
    }
  });

  try {
    await docClient.send(putCommand);
  } catch (error) {
    console.log("Error occurred:", error);
    return { statusCode: 500 };
  }

  return { statusCode: 200 };
};

We will use the DynamoDB Document Model to interact with our DynamoDB table via DynamoDBDocumentClient. The Document Model is an abstraction over the DynamoDB API that lets you work with plain JavaScript values instead of attaching a type descriptor to each attribute in every command. For example, with the low-level API you must write { "connectionId": { "S": connectionId } }, whereas with the Document Model it is just { "connectionId": connectionId }.
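To show what the Document Model saves us from writing, here is a deliberately simplified converter from plain values to the low-level format. It is hand-rolled for illustration (toAttributeValue and toLowLevelItem are not SDK functions) and only covers strings, numbers and booleans; the SDK handles the full set of DynamoDB types for you.

```javascript
// Map one plain JavaScript value to a DynamoDB type descriptor.
function toAttributeValue(value) {
  switch (typeof value) {
    case "string":
      return { S: value };
    case "number":
      // The low-level API transports numbers as strings.
      return { N: String(value) };
    case "boolean":
      return { BOOL: value };
    default:
      throw new TypeError(`unsupported type: ${typeof value}`);
  }
}

// Convert every attribute of an item.
function toLowLevelItem(item) {
  return Object.fromEntries(
    Object.entries(item).map(([key, value]) => [key, toAttributeValue(value)])
  );
}

console.log(JSON.stringify(toLowLevelItem({ connectionId: "abc", roomId: "123" })));
// {"connectionId":{"S":"abc"},"roomId":{"S":"123"}}
```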

The function needs an environment variable, TABLE_NAME. Later on, we will configure our function to use the DynamoDB table as TABLE_NAME.

1. Navigate to Lambda on AWS Console. Click on Create function.

2. Enter “videoChat-connect” for Function name.

3. Expand the Change default execution role section. Choose to Use an existing role. After that, select the IAM Role you have created in the previous step, then click on Create function.

4. In the Code tab, copy the code above and paste it into the Lambda editor, then click on Deploy.

5. To associate tags for our function, click on Configuration, then select Tags.

6. Click on Manage tags and add a tag with key “App” and value “videoChat”. Click on Save.

7. Still in Configuration. Click on Environment variables, then click on Edit.

8. Enter Key as “TABLE_NAME” and Value as “videoChat-table”, which is the name of the DynamoDB table you have created. Click on Save.

Associating your Lambda function with API Gateway $connect route

1. Navigate to your API Gateway.

2. Click on Create route.

3. For Route key, enter “$connect”. Turn on Lambda proxy integration, then choose the videoChat-connect function you have created. Lambda proxy integration passes the entire request to the Lambda function without any mapping steps. After that, click on Create route.

4. Click on Deploy API.

5. Select production for stage. Click on Deploy.

6. You will be redirected to the stage console. Make sure to copy the WebSocket URL.

7. To test your function, run the command below in your Cloud9 environment to install wscat. This tool helps you create connections to your WebSocket API.

npm install -g wscat

8. Execute the command below to make a connection to your WebSocket API.

wscat -c <your-websocket-url>

9. Open your DynamoDB table, then click on Explore table items.

10. You can see that a new connectionId got inserted into your table.

11. In your Cloud9 environment, press Ctrl + C to terminate the connection. The connectionId will still be in your table; you can delete it manually. In the next section, you will implement a function that handles the deletion for you when users close their connections.

Creating Lambda function — $disconnect

The function handler for the $disconnect route is responsible for removing the connectionId from the DynamoDB table when users close their connections.

Here is the code of the function:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DeleteCommand, DynamoDBDocumentClient } from "@aws-sdk/lib-dynamodb";

const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

export const handler = async (event) => {
  console.log(event);

  // Delete the item that belongs to the closed connection
  const { connectionId } = event.requestContext;
  const deleteCommand = new DeleteCommand({
    TableName: process.env.TABLE_NAME,
    Key: {
      connectionId
    }
  });

  try {
    await docClient.send(deleteCommand);
  } catch (error) {
    console.log("Error occurred:", error);
    return {
      statusCode: 500
    };
  }

  return {
    statusCode: 200
  };
};

To create the function, repeat steps 1 to 8 as demonstrated in the Creating Lambda function — $connect section. Note that the function’s name now is “videoChat-disconnect”.

Associating your Lambda function with API Gateway $disconnect route

1. Repeat steps 1 to 6 in the same manner as in the Associating your Lambda function with API Gateway $connect route section, but now the Route key should be “$disconnect”.

2. To test the function, first, make a connection to your WebSocket API using wscat, then terminate it to see that the connectionId created on initiating is automatically deleted.

Creating Lambda function — joinRoom

The function updates the table in DynamoDB to store the client’s roomId alongside the connectionId. It is also responsible for notifying the other participants in the room that a new user has joined.

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, UpdateCommand, ScanCommand } from "@aws-sdk/lib-dynamodb";
import {
  ApiGatewayManagementApiClient,
  PostToConnectionCommand,
} from "@aws-sdk/client-apigatewaymanagementapi";

const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

export const handler = async (event) => {
  console.log(event);

  const body = JSON.parse(event.body);
  const roomId = String(body.roomId);
  const { connectionId, domainName, stage } = event.requestContext;
  // The callback URL is like the WebSocket URL you copied, but over https.
  // We will use it to push data back to clients.
  const callbackUrl = `https://${domainName}/${stage}`;

  // Update DynamoDB to store the roomId alongside the connectionId
  const updateCommand = new UpdateCommand({
    TableName: process.env.TABLE_NAME,
    Key: {
      connectionId,
    },
    UpdateExpression: "SET roomId = :roomId",
    ExpressionAttributeValues: {
      ":roomId": roomId,
    },
  });

  try {
    await docClient.send(updateCommand);
  } catch (error) {
    console.log("Error occurred:", error);
    return {
      statusCode: 500,
    };
  }

  // Scan to get all participants in a room
  const scanCommand = new ScanCommand({
    TableName: process.env.TABLE_NAME,
    FilterExpression: "roomId = :roomId",
    ExpressionAttributeValues: {
      ":roomId": roomId
    },
  });

  let result;
  try {
    result = await docClient.send(scanCommand);
  } catch (error) {
    console.log("Error occurred:", error);
    return {
      statusCode: 500,
    };
  }

  // Init the API Gateway Management Client
  const apiGwClient = new ApiGatewayManagementApiClient({ endpoint: callbackUrl });

  // Notify all participants in a room except the newest participant
  const promises = result.Items.map(async (item) => {
    if (item.connectionId !== connectionId) {
      const postCommand = new PostToConnectionCommand({
        ConnectionId: item.connectionId,
        Data: JSON.stringify({
          type: "newParticipant",
          // Send the newcomer's connectionId, not the recipient's own
          participantConnectionId: connectionId
        })
      });
      await apiGwClient.send(postCommand);
    }
  });

  try {
    await Promise.all(promises);
  } catch (error) {
    console.log("Error occurred:", error);
    return {
      statusCode: 500,
    };
  }

  return {
    statusCode: 200,
  };
};

In order to create that function, repeat steps 1 to 8 in the same manner as in the Creating Lambda function — $connect section with the function name “videoChat-joinRoom”.
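A Scan with a filter reads the whole table on every call, which is fine for this workshop but gets expensive as connections grow. One common alternative is a global secondary index with roomId as its partition key, queried with QueryCommand from @aws-sdk/lib-dynamodb. The sketch below only builds the query input. The index name roomId-index is hypothetical, and you would have to create that index on the table yourself, since the steps in this post do not create one.

```javascript
// Build the input for a Query against a (hypothetical) GSI on roomId.
// Usage inside the handler, assuming the index exists:
//   const result = await docClient.send(
//     new QueryCommand(buildRoomQuery(process.env.TABLE_NAME, roomId))
//   );
function buildRoomQuery(tableName, roomId, indexName = "roomId-index") {
  return {
    TableName: tableName,
    IndexName: indexName,
    // A key condition is evaluated on the index, so only the matching
    // items are read, unlike a FilterExpression applied after a Scan.
    KeyConditionExpression: "roomId = :roomId",
    ExpressionAttributeValues: { ":roomId": String(roomId) },
  };
}

console.log(buildRoomQuery("videoChat-table", 123));
```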

Associating your Lambda function with API Gateway joinRoom route

1. Repeat steps 1 to 6 in the same manner as in the Associating your Lambda function with API Gateway $connect route section, but now the Route key should be “joinRoom”.

2. In the Cloud9 environment, open two terminals, then use both of them to connect to WebSocket API using wscat.

3. In both terminals, enter {"action": "joinRoom", "roomId": "123"} (use straight quotes, otherwise the message is not valid JSON). The terminal that joined first will display a message formatted as {"type":"newParticipant","participantConnectionId":"diZbufWlSQ0CFHQ="}.

Creating Lambda function — sendMsgToRoom

The function receives a message from one participant and then sends it to the other participants in a room, with a mechanism the same as the joinRoom function.

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, ScanCommand } from "@aws-sdk/lib-dynamodb";
import {
  ApiGatewayManagementApiClient,
  PostToConnectionCommand,
} from "@aws-sdk/client-apigatewaymanagementapi";

const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

export const handler = async (event) => {
  console.log(event);

  const body = JSON.parse(event.body);
  const roomId = String(body.roomId);
  const message = String(body.message);

  const { connectionId, domainName, stage } = event.requestContext;
  const callbackUrl = `https://${domainName}/${stage}`;

  // Scan to get all participants in the room
  const scanCommand = new ScanCommand({
    TableName: process.env.TABLE_NAME,
    FilterExpression: "roomId = :roomId",
    ExpressionAttributeValues: {
      ":roomId": roomId
    },
  });

  let result;
  try {
    result = await docClient.send(scanCommand);
  } catch (error) {
    console.log("Error occurred:", error);
    return {
      statusCode: 500,
    };
  }

  const apiGwClient = new ApiGatewayManagementApiClient({ endpoint: callbackUrl });

  // Forward the message to every participant except the sender
  const promises = result.Items.map(async (item) => {
    if (item.connectionId !== connectionId) {
      const postCommand = new PostToConnectionCommand({
        ConnectionId: item.connectionId,
        Data: message
      });
      await apiGwClient.send(postCommand);
    }
  });

  try {
    await Promise.all(promises);
  } catch (error) {
    console.log("Error occurred:", error);
    return {
      statusCode: 500,
    };
  }

  return {
    statusCode: 200,
  };
};

After deploying like the other three functions, you will again need two terminals to test your function. Initialize new connections in two terminals and make them join the same room, then send the message with the format:

{"action": "sendMsgToRoom", "roomId": "123", "message": "hello world!"}

You will see that “hello world!” is sent to the second terminal.
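One thing to be aware of: with Promise.all, a single failed delivery makes the whole invocation return 500, even if the other participants received the message. A connection can disappear without its $disconnect cleanup having run yet, in which case posting to it fails with a GoneException (HTTP 410). Below is a sketch of a more tolerant fan-out. fanOut and its post parameter are hypothetical names; in the handler, post would wrap apiGwClient.send(new PostToConnectionCommand(...)).

```javascript
// Send to every connection except the sender and report what happened.
// post(connectionId) must return a promise that rejects on failure.
async function fanOut(connectionIds, senderId, post) {
  const targets = connectionIds.filter((id) => id !== senderId);
  const results = await Promise.allSettled(
    targets.map(async (id) => post(id))
  );

  const stale = []; // connections that no longer exist
  const failed = []; // any other delivery error
  results.forEach((result, index) => {
    if (result.status === "fulfilled") return;
    const reason = result.reason;
    const isGone =
      reason?.name === "GoneException" ||
      reason?.$metadata?.httpStatusCode === 410;
    (isGone ? stale : failed).push(targets[index]);
  });

  return {
    delivered: targets.length - stale.length - failed.length,
    stale,
    failed,
  };
}
```

The stale list can then be used to delete the leftover items from the table, while failed tells you whether the invocation should still be reported as an error.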

Testing the full application with WebSocket enabled

Now let’s get back to the application in the Cloud9 environment.

1. Create a file named .env in the root folder of the project. Copy the following snippet and paste it into that file.

VITE_WEBSOCKET_URI=<your-websocket-url>

2. Start your application and then open two browser tabs joining the same room to test the video chat feature.

npm run dev
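If the two tabs do not connect, the .env value is the first thing I would check. Vite exposes variables prefixed with VITE_ to client code through import.meta.env, so a small guard like the one below can catch a missing or mistyped URL early. resolveWebSocketUrl is a hypothetical helper and not part of the repository, and the host name in the example is made up.

```javascript
// Validate the WebSocket URL taken from an env-like object.
// In the Vite app you would pass import.meta.env as the argument.
function resolveWebSocketUrl(env) {
  const raw = env.VITE_WEBSOCKET_URI;
  if (!raw) {
    throw new Error("VITE_WEBSOCKET_URI is not set; check your .env file");
  }
  const url = new URL(raw); // throws on a malformed URL
  if (url.protocol !== "wss:") {
    throw new Error(`expected a wss:// URL, got ${url.protocol}//`);
  }
  return url.toString();
}

// Example with a made-up endpoint:
console.log(
  resolveWebSocketUrl({
    VITE_WEBSOCKET_URI: "wss://abc123.execute-api.us-east-1.amazonaws.com/production",
  })
);
```

Remember that Vite reads .env when the dev server starts, so restart npm run dev after changing the file.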

Hosting your application on AWS Amplify

Amplify allows you to deploy your application from code in GitHub, GitLab, Bitbucket or AWS CodeCommit. You can also build your app yourself and then host it on Amplify manually, which is the approach we will use in this section.

1. To prepare for the deployment, you will need an S3 bucket. Navigate to S3, and then click on Create bucket.

2. Enter a globally unique name for your bucket. Scroll down and click on Create bucket.

4. Go to your S3 bucket, and click on Permissions.

5. Scroll down to Bucket Ownership, and click on Edit.

6. Click on ACLs enabled, then tick I acknowledge that ACLs will be restored. Click on Save changes.

7. Go back to your Cloud9 environment. In the terminal, run the command below to build your application. You will have a dist folder containing the build artifacts.

cd ~/environment/workshop-serverless-video-chat
npm run build

8. Zip all files in the dist folder. After running this command, you will have a file named output.zip.

cd ./dist 
cp ../lobby.html .
zip -r output.zip ./*

9. Upload output.zip to your S3 bucket.

aws s3 cp ./output.zip s3://<your-bucket-name>

10. Navigate to the Amplify Console, and click on Create new app.

11. Click on Deploy without Git, then click on Next.

12. Enter your app name. Select Amazon S3 for Method, then find and select your output.zip file. Click on Next.

Wait for the deployment to complete. Then click on View deployed URL.

Cleaning up resources

Use the following steps to tear down your resources.

1. Delete the application hosted on Amplify.

2. Delete the API Gateway.

3. Delete the S3 Bucket.

4. On the AWS Console, search for a service named “Resource Groups & Tag Editor”.

5. Click on Tag editor. Select the region you are in, then select All supported resource types. For tag key and value, enter “App” and “videoChat”. Then click on Search resources.

6. Carefully follow the list and delete your resources.
