
Big Data, QlikView, and Redshift
So far, we have focused on data sets that are small enough to be analyzed in memory. For data sets that are too
large to be held in memory, QlikView’s Direct Discovery technology provides data analysis capabilities. Direct
Discovery is a hybrid approach that allows QlikView to access data residing in the database. The architecture of
Direct Discovery places small reference data in memory and accesses large fact data in the database. Amazon Redshift
has been tested with Direct Discovery and is known to perform well with millions of rows of data.
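The hybrid idea can be illustrated with a minimal sketch. It is not QlikView's implementation: it uses Python's built-in sqlite3 module as a stand-in for Redshift so that it runs anywhere, and the table names (region, sales) and the helper total_sales_for are hypothetical. The small reference table is loaded into memory, while the aggregation over the large fact table is executed inside the database.

```python
import sqlite3

# Stand-in database; in the setup described here this role is played by Redshift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE region (region_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales (region_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO region VALUES (?, ?)", [(1, "North"), (2, "South")])
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10.0), (1, 5.0), (2, 7.5)])

# Small reference data: loaded fully into memory once.
regions_in_memory = dict(conn.execute("SELECT region_id, name FROM region"))


def total_sales_for(region_name):
    """Resolve the selection in memory, then aggregate the fact table in the database."""
    ids = [rid for rid, name in regions_in_memory.items() if name == region_name]
    if not ids:
        return 0.0
    placeholders = ",".join("?" for _ in ids)
    row = conn.execute(
        f"SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region_id IN ({placeholders})",
        ids,
    ).fetchone()
    return row[0]
```

Only the aggregated result crosses the connection; the fact rows themselves never leave the database, which is the point of keeping large fact data out of memory.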
Keep in mind the following key points to make sure Direct Discovery performs well:
• The Redshift cluster and QlikView components are in the same AWS zone
• Redshift data uses correct column types and sizes
• Redshift data is sorted during inserts depending on the query pattern
• If multiple clusters are used, take advantage of zone maps so table scans are more efficient
• Cursors and fetch sizes are set correctly
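The fetch-size point in the list above can be sketched generically. The source does not say how cursors and fetch sizes are configured for QlikView or its database driver, so the following is only an illustration of the underlying idea using the standard Python DB-API call fetchmany, with sqlite3 standing in for Redshift. The table name fact, the helper fetch_in_batches, and the fetch size of 4 are hypothetical.

```python
import sqlite3


def fetch_in_batches(cursor, fetch_size):
    """Yield rows from an already executed DB-API cursor, fetch_size rows at a time."""
    while True:
        batch = cursor.fetchmany(fetch_size)
        if not batch:
            break
        yield batch


# Stand-in database with ten fact rows; a real fact table would be far larger.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO fact VALUES (?, ?)", [(i, i * 1.5) for i in range(10)])

cur = conn.execute("SELECT id, amount FROM fact ORDER BY id")
batch_sizes = [len(batch) for batch in fetch_in_batches(cur, fetch_size=4)]
```

A fetch size that is too small multiplies round trips to the database, while one that is too large increases memory use on the client, so the right value depends on row width and network latency.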
Note: All tests were performed with high-performance EC2 instances (m3.large and m3.xlarge) in the same AWS
zone as the Redshift cluster.
In conclusion, Amazon Redshift and Qlik provide organizations the following new capability: to quickly create the
right infrastructure to host big data environments, perform a multitude of discoveries within all of their data
assets, and quickly obtain valuable insights to better manage their businesses.