Efficient and scalable data processing with Amazon Kinesis

30th January 2014

When it comes to developing Internet-of-Things systems, a lot of public focus is placed on the hardware and networking infrastructure required to make them a physical reality. However, when designing a system, the processing and analysis of collected data requires equal or greater effort, and anything that can make this easier or more cost-efficient is welcome.

One example of efficient data processing for the Internet of Things is Amazon Kinesis, a new managed service for real-time processing of streaming data at massive scale that adds big-data services to the Amazon Web Services line-up.

Kinesis can collect and process hundreds of terabytes of data an hour from hundreds of thousands of sources, allowing you to write applications that process information in real time from all sorts of different data sources.

Data can be harvested from almost anything, such as sensors and instruments, user interfaces, or other sources. Let’s take a quick look at Kinesis and its potential role in Internet-of-Things applications.

The Kinesis service accepts real-time data, replicates it and delivers it to applications running on Amazon’s cloud, allowing those applications to tap big data as it arrives. This makes it possible to collect and analyse large amounts of information in real time, answering questions about the current state of your data without waiting.

With Kinesis, developers can get more creative about what to do with large amounts of data flowing in live. Developers building applications on Amazon’s cloud services can now more easily take advantage of sensors collecting data, which is an important development for realising the potential of large-scale analytics on data collected from Internet-of-Things networks.

This certainly makes Amazon Web Services an attractive choice for developers seeking to put large-scale data collected from sensor networks to work in the cloud.

The system can be scaled elastically for real-time processing of streaming data on a large or small scale, taking in large streams of data records that can be consumed in real time by multiple data-processing applications running on instances of Amazon’s Elastic Compute Cloud (EC2).

Kinesis data-processing applications use the Amazon Kinesis Client Library to read data from a Kinesis stream and process it in real time. The processed records can be emitted to dashboards, used to generate alerts, or passed to a variety of other Amazon big data services such as Amazon Simple Storage Service (S3), Amazon Elastic MapReduce (EMR), or Amazon Redshift.

Interoperability and compatibility with existing, established Amazon cloud computing services and products are important factors, and are likely to give Kinesis a significant advantage in uptake and usability among established Amazon Web Services users. Kinesis applications can also emit data into another Kinesis stream, enabling more complex data processing.
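
To make the read-and-process step concrete, here is a minimal sketch in Python. It is not the Kinesis Client Library; it assumes the lower-level get_shard_iterator and get_records calls exposed by a Kinesis client such as boto3.client("kinesis"), and it assumes each record payload is JSON. The function name, the handler and the JSON payload format are illustrative choices rather than anything Kinesis prescribes.

```python
import json


def read_shard_once(client, stream_name, shard_id, handle, limit=100):
    """Fetch one batch of records from a shard and pass each payload to `handle`.

    `client` is assumed to expose the Kinesis get_shard_iterator and
    get_records calls, for example boto3.client("kinesis").
    Returns the iterator to use for the next poll (None if the shard is closed).
    """
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start from the oldest stored record
    )["ShardIterator"]

    response = client.get_records(ShardIterator=iterator, Limit=limit)
    for record in response["Records"]:
        # Assumption: producers wrote JSON-encoded payloads.
        handle(json.loads(record["Data"]))
    return response.get("NextShardIterator")
```

In a real application the handler might update a dashboard or raise an alert, and the Kinesis Client Library would take care of details this sketch leaves out, such as tracking progress and balancing shards across workers.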

With Kinesis applications, you can build real-time dashboards, capture exceptions and generate alerts, output data to drive user interactions, and output data to Amazon S3, DynamoDB or other cloud computing services.

Kinesis makes it possible to respond to changes in your data stream in seconds, at any data scale – for example, in Internet of Things applications, such a response may take the form of activating a certain device or automation system in a specified way.

You can create a new stream, set the throughput requirements, and start streaming data quickly and easily. Kinesis automatically provisions and manages the storage required to reliably and durably collect your data stream.
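
As a rough illustration of those steps, the sketch below creates a stream and sends one sensor reading to it. It assumes a Kinesis client such as boto3.client("kinesis") is passed in; the stream name, sensor id and JSON payload are hypothetical and chosen only for the example.

```python
import json


def create_stream_and_wait(client, stream_name, shard_count=1):
    """Create a stream with the given number of shards and wait until it exists.

    `client` is assumed to be a Kinesis client, e.g. boto3.client("kinesis").
    """
    client.create_stream(StreamName=stream_name, ShardCount=shard_count)
    client.get_waiter("stream_exists").wait(StreamName=stream_name)


def send_reading(client, stream_name, sensor_id, reading):
    """Put one JSON-encoded sensor reading on the stream."""
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(reading).encode("utf-8"),
        PartitionKey=sensor_id,  # records sharing a key land on the same shard
    )
```

With AWS credentials configured, usage would look something like create_stream_and_wait(kinesis, "sensor-readings") followed by send_reading(kinesis, "sensor-readings", "sensor-42", {"temperature_c": 21.5}).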

Kinesis will scale up or down based on your needs, seamlessly scaling to match the data throughput rate and volume of your data, from megabytes to terabytes per hour.
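
Stream throughput is provisioned in units called shards, so matching capacity to your data rate comes down to simple arithmetic. The sketch below assumes the commonly documented per-shard limits of 1 MB per second and 1,000 records per second for writes, and 2 MB per second for reads; treat these figures as assumptions and check the current Kinesis limits before relying on them.

```python
import math

# Assumed per-shard limits; verify against the current Kinesis documentation.
WRITE_MB_PER_S_PER_SHARD = 1.0
WRITE_RECORDS_PER_S_PER_SHARD = 1000
READ_MB_PER_S_PER_SHARD = 2.0


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Return the smallest shard count that covers all three throughput needs."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_s / WRITE_MB_PER_S_PER_SHARD),
        math.ceil(records_per_s / WRITE_RECORDS_PER_S_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_S_PER_SHARD),
    )
```

Under these assumptions, a small sensor network writing 0.1 MB per second needs a single shard, while ingesting one terabyte per hour (roughly 278 MB per second) would call for about 278 shards.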

This allows your systems to reliably collect, process, and transform all of your data in real time before delivering it to the data stores of your choice, where it can be used by existing or new applications. Connectors enable integration with Amazon S3, Amazon Redshift, and Amazon DynamoDB.

Kinesis provides developers with client libraries that enable the design and operation of real-time data processing applications – a new class of big data applications which can continuously analyze data at any volume and throughput in real time.

Kinesis is cost-effective for workloads of any scale. As with other Amazon cloud computing services, you pay as you go, and only for the resources you use. Initially you can start by provisioning low-throughput streams, and pay only a low hourly rate for the throughput you need.

Kinesis enables sophisticated streaming data processing, because one Kinesis application may emit Kinesis stream data into another Kinesis stream. Near-real-time aggregation of data enables processing logic that can extract complex key performance indicators and metrics from that data.

Complex data-processing graphs can be generated by emitting data from multiple Kinesis applications to another Kinesis stream for downstream processing by a different Kinesis application. You can use data ingested into Kinesis for simple data analysis, metrics and reporting, all in real time.

For example, metrics and reporting for system and application logs ingested into the Kinesis stream are available in real time, allowing data-processing application logic to work on such data as it is streaming in, rather than waiting for batches of data to be sent to the data-processing applications later.

Data taken into Kinesis streams is stored durably, and the streams themselves scale elastically. The delay between the time a record is added to the stream and the time it can be retrieved is less than 10 seconds. In other words, Kinesis applications can start consuming data from the stream less than 10 seconds after it is added, which is useful in applications where real-world actuation or control of automation devices needs to happen relatively quickly.
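
One stage of such a graph might look like the following sketch: an application averages a batch of sensor readings and emits one summary record per sensor into a second stream for a downstream application to consume. The field names sensor_id and value, the downstream stream name and the use of a boto3-style put_record call are all assumptions made for illustration.

```python
import json
from collections import defaultdict


def summarise(readings):
    """Average the 'value' field per sensor for one batch of readings."""
    totals = defaultdict(lambda: [0.0, 0])  # sensor_id -> [sum, count]
    for reading in readings:
        entry = totals[reading["sensor_id"]]
        entry[0] += reading["value"]
        entry[1] += 1
    return {sensor_id: total / count for sensor_id, (total, count) in totals.items()}


def forward_summary(client, downstream_stream, readings):
    """Emit one aggregated record per sensor into a second Kinesis stream.

    `client` is assumed to be a Kinesis client, e.g. boto3.client("kinesis").
    """
    for sensor_id, mean in summarise(readings).items():
        client.put_record(
            StreamName=downstream_stream,
            Data=json.dumps({"sensor_id": sensor_id, "mean": mean}).encode("utf-8"),
            PartitionKey=sensor_id,
        )
```

Keeping the aggregation in its own function means the same logic can be tested without AWS access and reused by whichever application sits at that point in the graph.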

By using a powerful and scalable system such as Kinesis, you can get the power you need without paying for surplus processing capacity, while still having reserves ready on demand. But how do you get started with Kinesis and your Internet-of-Things plans?

Simply join us for an obligation-free and confidential discussion about your ideas and how we can help bring them to life – click here to contact us, or telephone 1800 810 124.

LX is an award-winning electronics design company based in Sydney, Australia. LX services include full turnkey design, electronics, hardware, software and firmware design. LX specialises in embedded systems and wireless technologies design.

Published by LX Pty Ltd for itself and the LX Group of companies, including LX Design House, LX Solutions, LX Consulting and LX Innovations.

Muhammad Awais
