In the IoT world, general-purpose databases can’t cut it
- by 7wData
We live in an age of instrumentation, where everything that can be measured is being measured so that it can be analyzed and acted upon, preferably in real time or near real time. This instrumentation and measurement is happening in both the physical world and the virtual world of IT.
For example, in the physical world, a solar energy company has instrumented all its solar panels to provide remote monitoring and battery management. Usage information is collected from a customer’s panels and sent via mobile networks to a database in the cloud. The data is analyzed, and the resulting information is used to configure and adapt each customer’s system to extend the life of the battery and control the product. If an abnormality or problem is detected, an alert can be sent to a service agent to mitigate the problem before it worsens. Thus, proactive customer service is enabled based on real-time data coming from the solar energy system at a customer’s installation.
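The alerting step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `is_abnormal`, the three-standard-deviation rule, and the sample voltages are all hypothetical, and the article does not say how the solar company actually detects problems.

```python
from statistics import mean, stdev

def is_abnormal(history, reading, k=3.0):
    """Flag a reading that deviates from recent history by more than k standard deviations.

    `history` is a list of recent readings for one panel (hypothetical data).
    The k-sigma rule is an assumed, simple detection strategy.
    """
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return reading != mu
    return abs(reading - mu) > k * sigma

# Hypothetical battery voltages reported by one customer's system.
recent_voltages = [12.6, 12.5, 12.6, 12.7, 12.6]

if is_abnormal(recent_voltages, 11.8):
    print("alert: notify a service agent")
```

A production system would likely use per-device baselines and more robust statistics, but the shape is the same: compare each new timestamped reading against recent history and raise an alert when it falls outside the expected range.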
In the IT world, events are being measured to determine when to autoscale a system’s virtual infrastructure. For example, a company might want to correlate a number of things taking place at once — visitors to a website, product lookups, purchase transactions, etc. — to determine when to burst the cloud capacity for a short time to accommodate more sales or other kinds of activity.
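As a rough sketch of that kind of correlation, the snippet below averages several metrics over a recent window and recommends a burst only when more than one of them is elevated at the same time. The metric names, thresholds, and the `should_burst` function are assumptions made for illustration, not part of any particular autoscaling product.

```python
def should_burst(window, thresholds, min_breached=2):
    """Decide whether to add temporary capacity.

    window:      dict mapping metric name -> list of samples from the recent window
    thresholds:  dict mapping metric name -> average level considered "elevated"
    min_breached: how many metrics must be elevated together (assumed policy)
    Returns (decision, list of elevated metric names).
    """
    breached = []
    for name, limit in thresholds.items():
        samples = window.get(name, [])
        if samples and sum(samples) / len(samples) > limit:
            breached.append(name)
    return len(breached) >= min_breached, breached

# Hypothetical per-minute counts over the last three minutes.
window = {
    "visitors": [900, 1100, 1300],
    "product_lookups": [400, 450, 500],
    "purchases": [30, 45, 60],
}
thresholds = {"visitors": 1000, "product_lookups": 600, "purchases": 40}

burst, elevated = should_burst(window, thresholds)
print(burst, elevated)
```

Requiring several metrics to move together is one simple way to avoid scaling on a spike in a single signal, such as a crawler inflating visitor counts without any matching rise in purchases.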
The idea of measuring everything is to become more data-driven as a business, to be able to make better business decisions and take timely actions based on events, metrics, or other time-based data. This is happening across all industries as companies use their digital transformations to change the way they do business.
Much of this data is time-series data, where it’s important to stamp the precise time at which an event occurs or a metric is measured. The data can then be observed and analyzed over time to understand what changes are taking place within the system.
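To make the idea concrete, here is a minimal sketch of what a timestamped measurement might look like and how a series of them can be analyzed over time. The `Point` class, the `battery_voltage` metric, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class Point:
    """One time-series sample: what was measured, its value, and exactly when."""
    timestamp: datetime
    metric: str
    value: float

def rate_of_change(points):
    """Average change in value per second between the earliest and latest point."""
    ordered = sorted(points, key=lambda p: p.timestamp)
    elapsed = (ordered[-1].timestamp - ordered[0].timestamp).total_seconds()
    if elapsed == 0:
        return 0.0
    return (ordered[-1].value - ordered[0].value) / elapsed

# Hypothetical readings taken every 10 seconds, stamped in UTC.
start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
readings = [
    Point(start + timedelta(seconds=10 * i), "battery_voltage", v)
    for i, v in enumerate([12.6, 12.5, 12.3, 12.0])
]

print(rate_of_change(readings))  # negative: the voltage is falling over time
```

Without the timestamp, the four values are just numbers. With it, they describe a trend, which is what makes questions like "how fast is this changing?" answerable.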
Time-series databases can grow quite large, depending on how many events or metrics they are collecting and storing. Consider the case of autonomous vehicles, which are collecting and evaluating an enormous number of data points every second to determine how the vehicle should operate.
A general-purpose database, such as Cassandra or MySQL, isn’t well suited for time-series data. A database that is purpose-built to handle time-series data needs capabilities that general-purpose databases don’t have.
The founder of InfluxData, Paul Dix, saw this unique need, and he built the InfluxData Platform specifically to accumulate, analyze, and act on time-series data. He started with an open-source project that contained InfluxDB, the core database. InfluxDB was a quick hit on GitHub among developers. After that, he raised some funding and kicked off three more open-source projects to round out the InfluxData Platform. Those projects included:
Telegraf: A data collector that runs on sources such as a network device, an application, a sensor, or a standalone server. It collects the data and sends it to the InfluxDB database. Open-source contributors have developed more than 160 Telegraf plug-ins to date.