Tuesday, August 22, 2017

IoT Time Series Data with InfluxDB and Knowi



InfluxDB is a time series NoSQL database built specifically for use cases centered around IoT/sensor data. This is a brief introduction to InfluxDB, as well as to analytics on InfluxDB using Knowi.

IoT

Internet of Things (the umbrella term for network-aware devices, wearables, homes, cars and other connected devices) appears to be shedding its hype and starting to come of age. The projections are certainly eye-popping, with an expected 50-200 billion connected devices by 2020, $1.7 trillion in spending, and 173 million wearable devices by 2019.

Given the volume and velocity of sensor data, NoSQL databases are often the preferred option for storing it. Commonly used datastores include Cassandra, HDFS, Graphite and, more recently, Druid and InfluxDB.

InfluxDB

InfluxDB is an open source database, created and supported by the InfluxData team. It can be downloaded here.

Highlights:

InfluxDB does not require a schema defined upfront. Queries use a SQL-like syntax with easy extensions for time-based queries. Write and query APIs are exposed through a REST API.
Given its focus on time series data, every record is stored with a timestamp component, which acts like a primary index.

It uses a Line Protocol format, with measurements (like a SQL table), tags (like an indexed column) and fields (like an unindexed column).

Data Format:
<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]

Example:
stock,symbol=AAPL bid=127.46,ask=127.48 1434067467100293230

In the example:
stock is the measurement.
symbol is a tag.
bid and ask are fields.
The trailing number is the Unix timestamp in nanoseconds (optional: if it is omitted, InfluxDB will insert the time for the record automatically).
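The format above can be sketched as a small helper that assembles one record. This is a minimal illustration, not an official client: the function name to_line_protocol is hypothetical, it assumes numeric (float) field values only, and it does not escape spaces or commas in names and values.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build one Line Protocol record (hypothetical helper, no escaping)."""
    # Tags are optional; each one is appended to the measurement after a comma.
    tag_part = "".join(f",{key}={value}" for key, value in tags.items())
    # Fields are comma-separated key=value pairs; at least one is expected.
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    line = f"{measurement}{tag_part} {field_part}"
    # The timestamp is optional; when omitted, the server assigns one.
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


line = to_line_protocol(
    "stock",
    {"symbol": "AAPL"},
    {"bid": 127.46, "ask": 127.48},
    1434067467100293230,
)
print(line)
```

Called with the values from the example, the helper reproduces the same record.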

Setup:

InfluxDB is straightforward to download and set up.

a. Download the appropriate package here.
b. Start the service.

To import some data:
Download test data: curl https://s3-us-west-1.amazonaws.com/noaa.water.database.0.9/NOAA_data.txt > NOAA_data.txt
(Water level data from the National Oceanic and Atmospheric Administration)

Insert: influx -import -path=NOAA_data.txt -precision=s
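Besides the command-line import, points can also be written over the HTTP API. The sketch below only builds the request and does not send it. It assumes an InfluxDB 1.x server on the default port 8086 and its /write endpoint with db and precision parameters; the database name example_db, the function name build_write_request and the sample points are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url, database, lines, precision="s"):
    """Prepare (but do not send) a POST to the InfluxDB 1.x /write endpoint."""
    query = urlencode({"db": database, "precision": precision})
    # Several Line Protocol records go in one body, separated by newlines.
    body = "\n".join(lines).encode("utf-8")
    return Request(f"{base_url}/write?{query}", data=body, method="POST")


# Hypothetical sample points, with timestamps in seconds to match precision="s".
points = [
    "water,location=site_a level=2.1 1503360000",
    "water,location=site_b level=3.4 1503360000",
]
req = build_write_request("http://localhost:8086", "example_db", points)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it to a running server.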

Queries:

Queries can be executed against InfluxDB using InfluxQL, a SQL-like query language. Example:

SELECT * FROM mydb WHERE time > now() - 1d

This returns all tags and fields for the mydb measurement for the last day.
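A query like this can also be issued over HTTP. The following sketch assumes the InfluxDB 1.x /query endpoint and its JSON response layout (results, then series, each with columns and values). The database name example_db, both function names and the sample payload are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_query_url(base_url, database, influxql):
    """Build the GET URL for the InfluxDB 1.x /query endpoint."""
    return f"{base_url}/query?" + urlencode({"db": database, "q": influxql})


def rows_from_response(payload):
    """Flatten a 1.x query response into a list of column-name -> value dicts."""
    rows = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            for values in series["values"]:
                rows.append(dict(zip(series["columns"], values)))
    return rows


url = build_query_url(
    "http://localhost:8086", "example_db",
    "SELECT * FROM mydb WHERE time > now() - 1d",
)

# Hypothetical response payload, shaped like what the server returns.
sample = {"results": [{"statement_id": 0, "series": [{
    "name": "mydb",
    "columns": ["time", "ask", "bid", "symbol"],
    "values": [["2017-08-22T00:00:00Z", 127.48, 127.46, "AAPL"]],
}]}]}
rows = rows_from_response(sample)
print(rows[0]["symbol"], rows[0]["bid"])
```

Flattening into dictionaries keeps each value next to its column name, which is convenient for feeding the rows to a charting or analytics layer.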

Analytics:

InfluxData provides Chronograf, a visualization tool to interface with InfluxDB. This might be an option to consider for time series charts if you are planning to store all your data in InfluxDB alone.   

Knowi provides native integration with InfluxDB to query, aggregate, visualize and analyze data from InfluxDB. It also allows multi-datasource joins across your polyglot persistence architecture (data across various NoSQL and SQL databases, REST APIs and files), all the way to producing interactive dashboards for non-technical users.

To see InfluxQL in action:

  1. Go to the InfluxDB Instant Visualization page at Knowi. It’s configured with a live InfluxDB database and InfluxDB queries.
     Note: The query generator section can be used to discover measurements, tags and fields, and to build basic InfluxQL dynamically.
  2. Click on Show me for instant visualizations.
  3. Experiment with your own time series InfluxQL queries against the data.

Enjoy!

Resources