Redshift
·
Redshift is loosely based on
PostgreSQL, but it is not used for OLTP
·
Redshift is OLAP: online analytical
processing (data warehouse)
·
Advertised as 10x better performance than other
data warehouses, scales to PBs of data
·
Columnar storage of data (instead of
row based)
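
A toy sketch of why columnar storage helps analytical queries. This is only an illustration of the layout idea, not Redshift's actual on-disk format; the table, field names, and values are made up.

```python
# Toy illustration of row-based vs. columnar layout (not Redshift's real storage format).
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def total_amount_row_store(records):
    """Row-based: every whole record is visited just to read one field."""
    return sum(record["amount"] for record in records)


# Columnar: each column is kept as its own contiguous list.
columns = {name: [record[name] for record in rows] for name in rows[0]}


def total_amount_column_store(cols):
    """Columnar: the aggregate reads only the single column it needs."""
    return sum(cols["amount"])


row_total = total_amount_row_store(rows)
col_total = total_amount_column_store(columns)
print(row_total, col_total)
```

Both functions return the same total; the difference is how much unrelated data has to be read to get there, which is what matters at data warehouse scale.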
·
Massively Parallel Query Execution
(MPP), highly distributed
·
Pay as you go, based on the instances
provisioned
·
Has a SQL interface for running
queries
·
BI tools such as Amazon QuickSight or
Tableau integrate with it
·
Data is loaded from S3, DynamoDB,
DMS, other DBs
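
Loading from S3 is typically done with the SQL COPY command. A minimal sketch that builds such a statement; the table name, bucket, and IAM role ARN are hypothetical placeholders, and the helper does no escaping of its inputs.

```python
def build_copy_statement(table, s3_uri, iam_role_arn, data_format="PARQUET"):
    """Return a Redshift COPY statement that loads `table` from `s3_uri`.

    Inputs are interpolated as-is, so only pass trusted values.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {data_format};"
    )


# Hypothetical names, for illustration only.
copy_sql = build_copy_statement(
    table="sales",
    s3_uri="s3://example-bucket/sales/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(copy_sql)
```

The resulting statement would be run through any SQL client connected to the cluster.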
·
From 1 node to 128 nodes, up to 160 GB of
space per node
·
Leader node for query planning and
results aggregation
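
A rough sketch of the MPP idea: a leader hands slices of the data to compute nodes, each node aggregates its own slice, and the leader combines the partial results. Threads stand in for nodes here, and all function names are invented for the illustration; this is not how Redshift is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node_partial(slice_of_values):
    """Work done by one simulated compute node: aggregate its own slice."""
    return sum(slice_of_values), len(slice_of_values)


def leader_node_average(values, node_count=4):
    """Simulated leader node: distribute slices, then combine partial results."""
    # Round-robin distribution of values across the simulated nodes.
    slices = [values[i::node_count] for i in range(node_count)]
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute_node_partial, slices))
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count  # assumes `values` is non-empty


amounts = list(range(1, 101))
average = leader_node_average(amounts)
print(average)
```

Note that the leader combines sums and counts rather than averaging the per-node averages, which is why the partial results carry both numbers.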
·
Backup & Restore:
o 2 copies of data in each AZ, across 3 AZs, so a minimum of 6 copies in total
o 2 types of replicas: Aurora (automated failover) and MySQL replicas
o Automated backups by default; retention is 1 day by default and 35 days at most;
snapshots can be shared with other AWS accounts
o Snapshots can be replicated asynchronously to another region for DR
·
Redshift Enhanced VPC Routing: all
COPY / UNLOAD traffic goes through the VPC
o With Amazon Redshift Enhanced VPC Routing, Amazon Redshift forces all COPY and UNLOAD
traffic between your cluster and your data repositories through your Amazon
VPC. By using Enhanced VPC Routing, you can use standard VPC features, such as
VPC security groups, network access control lists (ACLs), VPC endpoints, VPC
endpoint policies, internet gateways, and Domain Name System (DNS) servers.
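
Enhanced VPC Routing is a cluster-level setting. A minimal sketch of switching it on for an existing provisioned cluster with boto3's Redshift `modify_cluster` call; the cluster identifier is a hypothetical placeholder, and the helper names are made up for this example. It assumes boto3 is installed and AWS credentials are configured.

```python
def enhanced_vpc_routing_params(cluster_identifier):
    """Build the keyword arguments for boto3's Redshift `modify_cluster` call."""
    return {"ClusterIdentifier": cluster_identifier, "EnhancedVpcRouting": True}


def enable_enhanced_vpc_routing(cluster_identifier):
    """Turn on Enhanced VPC Routing for an existing cluster (makes a real AWS call)."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    client = boto3.client("redshift")
    return client.modify_cluster(**enhanced_vpc_routing_params(cluster_identifier))


# Hypothetical cluster name; only the parameters are built here, no AWS call is made.
params = enhanced_vpc_routing_params("example-cluster")
print(params)
```

Once the setting is on, COPY and UNLOAD traffic only reaches S3 if the VPC provides a route to it, for example through a VPC endpoint.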
·
Redshift Spectrum: run SQL queries directly against data stored in
Amazon S3 buckets, for fast, complex analysis on objects without loading them into the cluster first.
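
A sketch of what a Spectrum setup might look like in SQL, built as strings: an external schema pointing at a data catalog database, then an ordinary query against an external table. The schema, database, table, column names, and role ARN are all hypothetical, and the external table is assumed to already be defined in the catalog.

```python
def spectrum_statements(schema, catalog_database, iam_role_arn, table):
    """Return (create_schema_sql, query_sql) for a hypothetical Spectrum setup."""
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_database}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )
    # The query looks like normal SQL; the data itself stays in S3.
    query = (
        f"SELECT region, SUM(amount) AS total_amount\n"
        f"FROM {schema}.{table}\n"
        f"GROUP BY region;"
    )
    return create_schema, query


create_schema_sql, query_sql = spectrum_statements(
    schema="spectrum_demo",
    catalog_database="demo_db",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleSpectrumRole",
    table="sales",
)
print(create_schema_sql)
print(query_sql)
```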
·
Operations: OLAP
·
Security: IAM, VPC, KMS, SSL, monitoring (similar to RDS)
·
Reliability: highly available, auto-healing features
·
Performance: 10x the performance of other data warehouses, compression
·
Cost Optimization: pay per node provisioned, 1/10th the cost of
alternatives
·
Use case: BI / Analytics / Data Warehouse