Cloud Data Platform - New Paradigm
In the beginning, in the late 1980s, data warehouses served as the decision-support and business-intelligence applications.
Technologies kept evolving, and massively parallel processing (MPP) architectures led to systems able to handle larger data sizes.
But these warehouses excelled only at handling structured data. Today we deal with unstructured and semi-structured data, and with data of high variety, velocity and volume.
Data warehouses are not suited to this, and they are also cost-inefficient.
As volumes of data from a variety of sources increased, the need for a single system to house this data and serve many analytic products led to the building of data lakes about a decade ago. These data lakes were repositories for raw data in a variety of formats.
But these data lakes had their drawbacks: they did not support ACID (atomicity, consistency, isolation and durability) transactions.
Data lakes also did not enforce data quality and lacked consistency, making it impossible to mix appends and reads, or batch and streaming jobs.
Organizations need systems that support SQL analytics, real-time monitoring, data science, machine learning and artificial intelligence.
Advances in these fields have led to models that process unstructured data (text, images, video, audio).
This led to multiple systems coexisting: a data lake, several data warehouses, streaming systems, and time-series, graph and image databases.
Multiple systems brought in complexity and delay, since data professionals needed to move or copy data between different systems.
This led to the cloud data platform design, in which one can implement data structures and data management on low-cost storage.
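To make the idea of data management on low-cost storage a little more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the design of any particular platform: a local directory stands in for cheap object storage, and the names ToyTable, stage, commit, read and the _log folder are all hypothetical. Data files are written first, and they become visible only once an entry naming them is added to an ordered commit log, so a reader never sees a half-finished append.

```python
import json
import os
import tempfile
import uuid


class ToyTable:
    """Hypothetical toy table: data files plus an ordered commit log in one directory."""

    def __init__(self, root):
        self.root = root
        self.log_dir = os.path.join(root, "_log")
        os.makedirs(self.log_dir, exist_ok=True)

    def _versions(self):
        # Committed versions are the numeric names of the log entries.
        return sorted(int(name.split(".")[0]) for name in os.listdir(self.log_dir))

    def stage(self, rows):
        # Write a data file. Readers ignore it until a log entry references it.
        name = f"part-{uuid.uuid4().hex}.json"
        with open(os.path.join(self.root, name), "w") as f:
            json.dump(rows, f)
        return name

    def commit(self, data_file):
        # Claim the next version number with exclusive create (mode "x"),
        # which fails if another writer already created that entry.
        # Assumption: a real system would use an atomic put-if-absent on its
        # storage layer so that the entry's content also appears atomically.
        while True:
            versions = self._versions()
            next_version = versions[-1] + 1 if versions else 0
            path = os.path.join(self.log_dir, f"{next_version:08d}.json")
            try:
                with open(path, "x") as f:
                    json.dump({"add": data_file}, f)
                return next_version
            except FileExistsError:
                continue  # another writer took this version; retry with the next one

    def read(self):
        # Replay the log in order and load only the data files it references.
        rows = []
        for version in self._versions():
            with open(os.path.join(self.log_dir, f"{version:08d}.json")) as f:
                entry = json.load(f)
            with open(os.path.join(self.root, entry["add"])) as f:
                rows.extend(json.load(f))
        return rows


with tempfile.TemporaryDirectory() as root:
    table = ToyTable(root)
    table.commit(table.stage([{"id": 1}]))
    staged = table.stage([{"id": 2}])  # written to storage but not yet committed
    print(table.read())                # only the committed row is visible
    table.commit(staged)
    print(table.read())                # both rows are visible after the commit
```

The design choice worth noticing is that the storage itself stays simple and cheap; the transactional behaviour comes entirely from the small log layered on top of it.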