Since the publication of the Microsoft/Keystone “Data & Analytics Maturity Model & Business Impact” white paper, the data landscape has evolved significantly. While the original model provided a strong foundation for assessing data platform maturity, today's organizations face new expectations and technologies — from unified architectures and ML platforms to AI-driven data interaction and domain-driven design.
To reflect this evolution, we have updated and restructured the technology areas used in the assessment to better align with modern architectures, use cases, and platform principles.
These updates aim to:
While still foundational, operational databases are now often embedded in third-party applications or standardized platforms. This area covers transactional systems relevant to internal products or custom applications, but it carries a lower weight in scoring because it offers little differentiation in most organizations.
A unified platform that combines capabilities traditionally split between enterprise data warehouses (structured data) and data lakes (unstructured/semi-structured data). It supports both SQL analytics and machine learning, provides a central source of truth, and enables scalable, cost-efficient storage and compute. This area assesses architecture, freshness, query performance, and data integration.
Focuses on the technical foundation that enables platform scalability, reliability, and automation. This includes cloud provisioning, infrastructure-as-code, monitoring, cost controls, and deployment pipelines. It also considers how easily environments can be managed and extended by the data engineering team.
Covers the tools and methods that business users rely on to access and interact with data. This includes traditional dashboards, self-service BI, and the growing use of natural language interfaces, AI copilots, and GenAI-powered assistants that lower the barrier to insight. This area evaluates reach, ease of use, and semantic layer maturity.
Encompasses the systems, workflows, and talent needed to develop, train, and operationalize analytical models. This includes notebooks, data science platforms, ML pipelines, MLOps capabilities, and model monitoring. It evaluates both analytical depth and operational integration — critical for predictive and prescriptive use cases.
Combines centralized governance (e.g., access control, lineage, PII compliance) with data product thinking and orchestration principles. It evaluates how well data ownership is defined across domains, how reusable assets are managed, and how workflows are governed across the platform. Metadata management, mesh maturity, and SLA enforcement are key elements.
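To make the weighting idea concrete, the following minimal sketch shows one way per-area scores could be combined into a single weighted result. It is illustrative only and is not the scoring method of the assessment itself: the area labels, the weight values, and the 1 to 4 score scale are all assumptions chosen for the example. The only detail taken from the text is that operational databases carry a lower weight than the other areas.

```python
from typing import Mapping

# Hypothetical area labels and weights (not taken from the assessment).
# Only the relative ordering is grounded in the text: operational
# databases are weighted lower than the other technology areas.
AREA_WEIGHTS = {
    "operational_databases": 0.5,
    "unified_data_platform": 1.0,
    "platform_infrastructure": 1.0,
    "data_access_and_bi": 1.0,
    "data_science_and_ml": 1.0,
    "governance_and_data_products": 1.0,
}


def weighted_maturity(scores: Mapping[str, float]) -> float:
    """Return the weighted average of per-area scores.

    `scores` maps each area label to a score on an assumed 1 to 4 scale.
    """
    missing = set(AREA_WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")

    total_weight = sum(AREA_WEIGHTS.values())
    weighted_sum = sum(AREA_WEIGHTS[area] * scores[area] for area in AREA_WEIGHTS)
    return weighted_sum / total_weight


if __name__ == "__main__":
    example = {area: 4.0 for area in AREA_WEIGHTS}
    example["operational_databases"] = 1.0  # weak score in the low-weight area
    print(round(weighted_maturity(example), 2))
```

Because the weights are normalized by their sum, a weak score in a low-weight area pulls the overall result down less than the same score in a fully weighted area would.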
The original Keystone model categorized organizational maturity into four levels (Reactive, Informative, Predictive, and Transformative) based on how effectively data was used across the enterprise. This structure provided a useful progression, but organizations now operate in environments shaped by GenAI, real-time decisioning, cloud-native platforms, and data product thinking.
To reflect these shifts, we have updated the maturity stages to better align with modern enterprise needs and capabilities. The revised structure puts greater emphasis on:
Organizations at this stage have fragmented systems, limited documentation, and ad-hoc reporting. Data is often siloed, inaccessible, or outdated. Analytics is primarily backward-looking and built on manual processes.
Key Characteristics:
These organizations have laid the groundwork for scale. They have established a unified data architecture (e.g., a lakehouse), introduced BI standards, and formalized governance. Data products begin to emerge, and self-service is encouraged.
Key Characteristics:
Analytics becomes predictive and action-oriented. Machine learning and automation are embedded into operations. Teams iterate on models and use data proactively to anticipate business needs.
Key Characteristics:
At this stage, data is central to how the organization runs and innovates. GenAI tools make data accessible to all employees, and governance is federated yet standardized. Teams respond to signals in real time and continuously improve based on feedback loops.
Key Characteristics: