Kinesis

Real-time big data: application logs, metrics, IoT, clickstreams

 

Kinesis Streams - Low-latency streaming ingest at scale

Kinesis Analytics - Perform real-time analytics on streams using SQL

Kinesis Firehose - Load streams into S3, Redshift, Elasticsearch, Splunk

kinesis.PNG

·    Kinesis Streams

o Streams are divided into ordered shards (partitions)

shard.PNG

§ One stream is made of many shards

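§ Write limit: 1 MB/s or 1,000 records/s per shard

§ Read limit: 2 MB/s per shard

§ Billing is per provisioned shard; you can provision as many shards as you need (subject to account quotas, which can be raised)

§ Batching is available (for example, several records per PutRecords call)

§ The number of shards can change over time (resharding: split or merge shards)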

§ Records are ordered per shard

 

o Data retention is 1 day by default and can be extended up to 7 days (current AWS documentation allows up to 365 days)

o Ability to reprocess and replay data

o Multiple applications can consume the same stream

o Real-time processing with scalable throughput

o Once data is inserted into Kinesis it can't be deleted (immutability); records only expire when the retention period ends

Kinesis Data Streams enables real-time processing of streaming big data. It provides ordering of records, as well as the ability to read and/or replay records in the same order to multiple Amazon Kinesis Applications. The Amazon Kinesis Client Library (KCL) delivers all records for a given partition key to the same record processor, making it easier to build multiple applications reading from the same Amazon Kinesis data stream (for example, to perform counting, aggregation, and filtering).
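How a partition key ends up on one shard can be sketched as follows. Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash-key range contains that value. The function name `shard_for_key` and the assumption that the shards split the hash space into equal ranges are illustrative only; real shards carry explicit starting and ending hash keys that need not be equal in size.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit hash key


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would receive this partition key.

    Assumption (hypothetical): shard_count shards divide the 128-bit
    hash space into equal, contiguous ranges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # Scale the hash key into the range [0, shard_count)
    return hash_key * shard_count // HASH_SPACE


# The same key always maps to the same shard, which is what preserves
# per-key ordering and lets one record processor see all of its records.
first = shard_for_key("user-42", 4)
second = shard_for_key("user-42", 4)
print(first == second)
```

Because the mapping depends only on the key and the shard ranges, a key that receives a large share of the traffic will concentrate load on a single shard.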

·      Use the Kinesis scaling utility to modify the number of shards in a stream
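A rough way to choose a shard count from the per-shard limits in these notes (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads) is sketched below. The function name `shards_needed` and the example workload numbers are assumptions for illustration; the sketch ignores headroom for traffic spikes and uneven partition keys.

```python
import math

# Per-shard limits as listed in these notes
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_per_sec: float,
                  records_per_sec: float,
                  read_mb_per_sec: float) -> int:
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_sec / READ_MB_PER_SHARD),
    )


# Hypothetical workload: 5 MB/s in, 3,000 records/s, 12 MB/s read out
print(shards_needed(5, 3000, 12))
```

In this example the read side is the bottleneck: 12 MB/s of reads needs 6 shards, while the write side alone would need only 5.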

 

·    Kinesis Security

o Control Access/authorization using IAM policies

o Encryption in Flight using HTTPS Endpoints

o Encryption at Rest using KMS

o Client-side encryption is possible but harder (you manage the encryption and keys yourself)

o VPC endpoints are available to access Kinesis from within a VPC
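As an illustration of access control with IAM policies, the sketch below builds a minimal policy document that lets a producer write to a single stream. The helper name `producer_policy`, the account ID, region, and stream name are placeholders, not values from these notes; `kinesis:PutRecord` and `kinesis:PutRecords` are the IAM actions for the write calls.

```python
import json


def producer_policy(stream_arn: str) -> dict:
    """Build an IAM policy document allowing writes to one stream only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kinesis:PutRecord", "kinesis:PutRecords"],
                "Resource": stream_arn,
            }
        ],
    }


# Placeholder ARN: region, account ID, and stream name are made up
policy = producer_policy(
    "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream"
)
print(json.dumps(policy, indent=2))
```

A consumer would need a different set of read actions; keeping producer and consumer policies separate follows the usual least-privilege approach.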

 

Kinesis Analytics

·    Perform real-time analytics on Kinesis streams using SQL

·    Pay only for actual consumption

·    Can create streams out of real-time queries

Kinesis Firehose

·    Fully Managed Service

·    Load streams into S3, Redshift, Elasticsearch, Splunk

·    Automatic scaling