Using AWS Lambda with Timescale Cloud for IoT Data
While it's already quite easy to get started with AWS Lambda and TimescaleDB, here are some tips & tricks to help with your integration.
We speak with a ton of users that are capturing IoT data from devices & sensors and storing that data in TimescaleDB. A common pattern we are seeing is IoT data landing in TimescaleDB via AWS Lambda.
The IoT -> Lambda -> TimescaleDB reference architecture looks something like this:
For the AWS IoT to AWS Lambda part, AWS provides an IoT Rules Action that can call Lambda functions. You can read more about that here. So, then how do you go about setting up the AWS Lambda to TimescaleDB integration?
In this blog, we want to highlight how easy it is to integrate AWS Lambda with TimescaleDB (which can be easily hosted on Timescale Cloud).
Let’s get to it!
What is AWS Lambda?
AWS Lambda is a serverless computing platform provided by Amazon. Lambda makes it very easy to call or automatically trigger function code (written in languages such as Go, Node.js, Java, or Python) to execute on the platform. You don’t have to worry about server maintenance or scaling; that is all handled by the Lambda service. You just write your code (to effectively do whatever you want), and AWS Lambda handles the execution of that code.
Using AWS Lambda to access TimescaleDB
Since TimescaleDB is powered by PostgreSQL and is accessible via SQL, you can connect to and work with TimescaleDB from just about any code. Since that code can be made into a Lambda function, the integration is really just as simple as:
- Use a Postgres / SQL client library that is available for your code (in the language of your choice)
- Write code that speaks SQL to TimescaleDB
- Set up a Lambda function to execute your code
Notable Tips & Tricks
Even though the above bullet points are pretty simple, there are a couple of tips & tricks we want to highlight to help you along the way.
#1 TimescaleDB can be accessed using PostgreSQL client libraries
As mentioned above, TimescaleDB leverages PostgreSQL and is accessible via SQL so you can use client libraries to connect your code to TimescaleDB. Here is a short listing of client libraries to use with TimescaleDB.
#2 Use AWS Lambda Layers to load client libraries
Clearly, the AWS Lambda folks know your function code will need to pull in libraries and dependencies. To help with this, they include a capability called AWS Lambda Layers, which allows you to package (as a ZIP) and upload your dependencies so your function code can import them as necessary.
For example, if we want to write Lambda function code in Node.js that calls TimescaleDB, we will need the Node PostgreSQL (pg) client library available for our Node.js code. Use the following instructions (thank you!) to build a ZIP package that includes the pg library that we can load into a Lambda Layer.
mkdir node-pg-tsdb
cd node-pg-tsdb
# Lambda looks for Node.js Layer libraries under nodejs/node_modules
mkdir nodejs
cd nodejs
npm init -y
npm install --save pg
cd ..
zip -r node-pg-tsdb.zip nodejs -x "*.DS_Store"
Now create the Layer in Lambda by uploading the ZIP file.
Add this new Layer to your function code in Lambda.
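Once the Layer is attached, a quick sanity check can confirm that the library actually resolves from your function. The sketch below relies on Node's built-in `require.resolve`; the helper name `canResolve` and the diagnostic handler are hypothetical.

```javascript
// Returns true when Node can locate the named module on its search path.
// In Lambda, Layer contents are extracted under /opt.
function canResolve(moduleName) {
  try {
    require.resolve(moduleName);
    return true;
  } catch (err) {
    // Only "not found" means the module is missing; rethrow anything else.
    if (err.code === 'MODULE_NOT_FOUND') return false;
    throw err;
  }
}

// Hypothetical diagnostic handler: reports whether pg is available.
exports.handler = async () => ({ pgAvailable: canResolve('pg') });
```

If `pgAvailable` comes back `false`, the Layer is probably not attached to the function or the ZIP does not have the directory layout Lambda expects.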
#3 Leverage Lambda environment variables to set client connection properties
Rather than coding your connection parameters directly into your function, you can leverage Lambda environment variables for these parameters. The environment variables are made available to your code by Lambda during execution.
For example, the Node.js pg client library environment settings are documented here.
You can get the connection setting for your TimescaleDB instance from the Timescale Cloud portal. If you are new to Timescale Cloud, use this guide to get started and create your TimescaleDB instance.
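Because a missing variable tends to surface only as a connection failure at runtime, it can help to check the environment up front. The sketch below assumes the standard PostgreSQL variable names that the pg library reads (`PGHOST`, `PGPORT`, `PGUSER`, `PGPASSWORD`, `PGDATABASE`); the helpers `missingPgEnv` and `assertPgEnv` are hypothetical names.

```javascript
// Standard PostgreSQL environment variables that the pg client reads
// when new Client() is called without a config object.
const REQUIRED_PG_VARS = ['PGHOST', 'PGPORT', 'PGUSER', 'PGPASSWORD', 'PGDATABASE'];

// Returns the names of required variables that are unset or empty.
function missingPgEnv(env = process.env) {
  return REQUIRED_PG_VARS.filter((name) => !env[name]);
}

// Throws a descriptive error instead of letting the client fall back to defaults.
function assertPgEnv(env = process.env) {
  const missing = missingPgEnv(env);
  if (missing.length > 0) {
    throw new Error(`Missing connection settings: ${missing.join(', ')}`);
  }
}
```

Calling `assertPgEnv()` at the top of the handler would make a misconfigured function fail with a message that names the missing settings.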
Now in your function code, just load the library, create a client, and connect. Once those steps are completed, you can leverage SQL and start working with time-series data in TimescaleDB. For reference, you can find the TimescaleDB API documentation here.
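const { Client } = require('pg');
exports.handler = async () => {
  // With no config object, pg reads PGHOST, PGUSER, PGPASSWORD, etc. from the environment
  const client = new Client();
  await client.connect();
};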
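As an illustration of the SQL you might run once connected, the sketch below creates a hypertable and then aggregates readings into five-minute buckets using TimescaleDB's `create_hypertable` and `time_bucket` functions. The table `sensor_data`, its columns, and the helper names are hypothetical, and `client` is assumed to be an already connected pg client.

```javascript
// Hypothetical schema for device readings.
const CREATE_TABLE_SQL = `
  CREATE TABLE IF NOT EXISTS sensor_data (
    time        TIMESTAMPTZ      NOT NULL,
    device_id   TEXT             NOT NULL,
    temperature DOUBLE PRECISION
  )`;

// Converts the plain table into a hypertable partitioned on the time column.
const CREATE_HYPERTABLE_SQL =
  "SELECT create_hypertable('sensor_data', 'time', if_not_exists => TRUE)";

// Average temperature per device in five-minute buckets since a given time.
const BUCKET_SQL = `
  SELECT time_bucket('5 minutes', time) AS bucket,
         device_id,
         avg(temperature) AS avg_temp
  FROM sensor_data
  WHERE time > $1
  GROUP BY bucket, device_id
  ORDER BY bucket`;

// One-time setup: create the table, then turn it into a hypertable.
async function setupSchema(client) {
  await client.query(CREATE_TABLE_SQL);
  await client.query(CREATE_HYPERTABLE_SQL);
}

// Returns one row per (bucket, device) with the averaged temperature.
async function recentAverages(client, since) {
  const result = await client.query(BUCKET_SQL, [since]);
  return result.rows;
}
```

Schema setup is usually something you would run once outside the ingestion path, while a query like `recentAverages` could sit behind a separate read-oriented function.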
Advanced Topic: Network Security
A security best practice is to restrict network access to your TimescaleDB instance to the smallest range of IP addresses possible. Timescale Cloud supports configuring an Allowed IP Address setting, which only allows access if the source traffic is in the specified IP range. To learn how to do this yourself, check out this tutorial.
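To make the idea of an allowed range concrete, the sketch below checks whether an IPv4 address falls inside a CIDR block. It is purely illustrative: the actual enforcement happens on the Timescale Cloud side, the function names are hypothetical, and the sample addresses come from ranges reserved for documentation.

```javascript
// Converts dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.split('.');
  const valid =
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) {
    throw new Error(`Invalid IPv4 address: ${ip}`);
  }
  return parts.reduce((acc, p) => acc * 256 + Number(p), 0);
}

// Returns true when ip lies inside the CIDR block (e.g. "203.0.113.0/24").
// A bare address without a prefix is treated as a single-host /32 range.
function ipInCidr(ip, cidr) {
  const [base, prefixText = '32'] = cidr.split('/');
  const prefix = Number(prefixText);
  if (!/^\d{1,2}$/.test(prefixText) || prefix > 32) {
    throw new Error(`Invalid CIDR prefix: ${cidr}`);
  }
  // Two addresses share a block when they agree after dropping the host bits.
  const blockSize = 2 ** (32 - prefix);
  return (
    Math.floor(ipv4ToInt(ip) / blockSize) ===
    Math.floor(ipv4ToInt(base) / blockSize)
  );
}

console.log(ipInCidr('203.0.113.10', '203.0.113.0/24'));
console.log(ipInCidr('203.0.114.10', '203.0.113.0/24'));
```

The narrower the prefix you configure, the fewer source addresses match, which is why a single static IP (a /32 range) is the tightest setting.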
To take advantage of this IP access control when using Lambda, you’ll want your function code to execute and appear as if it’s coming from behind a static IP. You can then use the Allowed IP Address setting in Timescale Cloud to limit access to only your Lambda function code.
For this advanced networking setup in Lambda, I encourage you to take a look at this blog that documents how to set up Lambda functions with a NAT Gateway and Elastic IP for a static public IP address. Once the Lambda side is set up, use the static IP and configure the Allowed IP Address setting for your TimescaleDB instance from the Timescale Cloud portal to restrict access ONLY to traffic from that IP.
What’s Next
The above just showed how easy it is to integrate AWS Lambda with a TimescaleDB instance (that is hosted on Timescale Cloud).
Here are a couple of resources for you to try out:
- Sign up for Timescale Cloud to get a hosted TimescaleDB instance
- Try the Timescale MTA example application, where we used a script running on Lambda to handle data ingestion