User:SukuDan/sandbox

From Wikipedia, the free encyclopedia

Data Product

Definition:

A Data Product is a curated and trustworthy dataset, or a combination of datasets, whose lineage is traceable to its primitive sources, such as systems of record or other digital data producers.

It is decoupled from the source or producer and is owned by a 'Data Product Owner', who is responsible for a clearly defined schema and for metadata management. It may be enriched with reference data and is designed to meet specific business needs or analytical objectives.

It can be consumed by any authorised user and implements appropriate access-rights governance frameworks to demonstrate compliance with statutory requirements.

This concept emphasizes treating data as a product, ensuring it is managed, maintained, and delivered with a service focus on timeliness, quality, usability, and trustworthy value creation.
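
As a rough illustration of the definition above, a data product can be pictured as a descriptor that bundles an owner, a schema, metadata, and lineage back to its sources. The Python sketch below is hypothetical: the class names `DataProduct` and `SourceRef`, their fields, and the sample values are assumptions made for illustration and are not taken from any standard or library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    """A primitive source (hypothetical), e.g. a system of record."""
    name: str
    kind: str  # e.g. "system_of_record" or "reference_data"


@dataclass
class DataProduct:
    """Illustrative descriptor; field names are assumptions, not a standard."""
    name: str
    owner: str                 # the 'Data Product Owner'
    schema: dict[str, type]    # column name -> expected Python type
    lineage: list[SourceRef] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)

    def trace_lineage(self) -> list[str]:
        """Return the names of the sources this product is derived from."""
        return [src.name for src in self.lineage]


# Sample values are invented for the example.
orders = DataProduct(
    name="customer_orders",
    owner="orders-data-product-owner",
    schema={"order_id": str, "amount": float},
    lineage=[
        SourceRef("erp_orders", "system_of_record"),
        SourceRef("currency_codes", "reference_data"),
    ],
    metadata={"refresh": "daily"},
)
print(orders.trace_lineage())
```

Keeping lineage as explicit source references, rather than free text, is one possible way to make traceability checkable by tooling.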

Key Characteristics of a Data Product:

Purpose-Driven: Developed to address operational needs, business questions, analytical requirements, and archival requirements.

Curated and Enriched: Involves the integration of data from multiple sources, including systems of record and reference data, to provide comprehensive insights.

Managed Lifecycle: Subject to continuous management, including updates, quality assurance, and user feedback incorporation.

User-Centric Design: Decoupled from its sources and structured to be easily discoverable, accessible, and interpretable by end-users, facilitating informed decision-making.

Ownership for Governance: Each Data Product has an identified "Data Product Owner" who is responsible for specifying the schema and the quality parameters that any source must meet to add its data to the product.

User access rights governance frameworks: Access to Data Products is managed through appropriate access-rights frameworks that implement security policies for access and demonstrate compliance with statutory requirements and with organisational policies for ensuring data quality.
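
The ownership and access-rights characteristics above can be sketched as a simple gate: a source may add data only if it meets the owner's schema and quality parameters, and only authorised users may read. The code below is a minimal sketch under stated assumptions; the `GovernedProduct` class, the `max_null_fraction` quality parameter, and the sample users and rows are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GovernedProduct:
    """Hypothetical data product with an owner-defined schema and access list."""
    schema: dict[str, type]
    max_null_fraction: float   # quality parameter set by the owner (assumed)
    authorised_users: set[str]
    rows: list[dict] = field(default_factory=list)

    def accept(self, batch: list[dict]) -> bool:
        """Admit a batch only if it matches the schema and quality threshold."""
        if not batch or not self.schema:
            return False
        nulls = 0
        for row in batch:
            if set(row) != set(self.schema):
                return False  # missing or unexpected columns
            for column, expected in self.schema.items():
                value = row[column]
                if value is None:
                    nulls += 1
                elif not isinstance(value, expected):
                    return False  # wrong type for this column
        total_cells = len(batch) * len(self.schema)
        if nulls / total_cells > self.max_null_fraction:
            return False  # too many missing values
        self.rows.extend(batch)
        return True

    def read(self, user: str) -> list[dict]:
        """Return a copy of the data to authorised users only."""
        if user not in self.authorised_users:
            raise PermissionError(f"{user} is not authorised")
        return list(self.rows)


product = GovernedProduct(
    schema={"id": int, "city": str},
    max_null_fraction=0.25,
    authorised_users={"analyst"},
)
print(product.accept([{"id": 1, "city": "Pune"}, {"id": 2, "city": None}]))
print(product.accept([{"id": "3", "city": "Oslo"}]))
```

In this sketch the first batch is admitted and the second is rejected because `id` is a string rather than an integer. A real governance framework would typically also record who accessed what, which is omitted here.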

Examples of Data Products:

  • Trustworthy stream of data: Used for analytics and/or for integration between systems.
  • Dashboards and Reports: Visual representations of key performance indicators (KPIs) derived from various data sources.
  • Recommendation Engines: Systems that analyze user behavior and preferences to suggest products or services.
  • Predictive Models: Analytical tools that forecast future trends based on historical data.
  • Information Archives: Storage of Information for longer term analytical requirements.

Academic Perspectives:

The concept of data products has been explored in academic literature. For instance, Hasan and Legner (2023) define data products as “a managed artifact that satisfies recurring information needs and creates value through transforming and packaging relevant data elements into consumable form.” 

Additionally, Microsoft’s Cloud Adoption Framework discusses the importance of treating data as a product to enhance data quality and usability.

Significance In Data Management:

Adopting a data product approach aligns with modern data management strategies, such as data mesh, which advocates for decentralized data ownership and treating data as a product to improve scalability and agility.

  • Capability to Create a Digital Twin: Organizational data inherently outlives applications, which undergo periodic upgrades, replacements, or migrations as technology evolves. In a data-centric approach, Data Products act as persistent Digital Twins by extracting and holding data from the System of Record (SoR). These Digital Twins ensure that critical data remains accessible and usable beyond the lifecycle of the application that originally generated it.
  • Benefits from De-coupling sources from consumers: By decoupling data from applications, organizations eliminate the disruptions and inefficiencies caused by frequent changes in application landscapes. Instead, new applications can seamlessly utilize the Digital Twin, fostering long-term data integrity and reducing the organizational strain associated with application-centric approaches where data often gets siloed or fragmented as applications come and go.
  • Essential Component of Event-Based Architecture: In an event-based architecture, Data Products play a crucial role in analyzing Business Events that occur within the services of Value Chain Functions. Each event represents a moment in the business process, capturing key details about its role in managing the value stream. By creating Data Products from these events, organizations can extract rich insights—beyond the immediate operational context—into patterns, trends, and anomalies. These insights provide a deeper understanding of the event’s impact on the overall business process, enabling better decision-making, performance optimization, and strategic planning. Furthermore, by decoupling the data from the event’s source application, these Data Products become reusable assets, supporting downstream analytics, AI models, and compliance reporting, ensuring a resilient and insightful approach to managing value streams.
  • Trustworthy Embedded Quality: Trustworthiness and embedded quality are essential characteristics of Data Products. These attributes ensure that data serves as a reliable foundation for analytical and AI engines, driving accurate insights and better business outcomes. By meeting statutory requirements and fostering confidence, trustworthy data becomes a cornerstone for organizational success and compliance.
  • Consumer-Centricity: Like any product, a data product prioritizes the end-user experience, ensuring ease of access, clarity, and relevance.
  • Automated Governance: Data products usually leverage automated governance tools to enforce policies, ensure compliance, and maintain data integrity without manual intervention.
  • Lifecycle Management: Data products are not static; they evolve through a managed lifecycle that includes versioning to track changes, retirement of outdated or irrelevant products, and feedback loops to incorporate user input and adapt to new requirements.
  • Business Value Alignment: A data product is purpose-driven and directly linked to operational needs, business objectives, or problem solving, which makes it possible to measure the value delivered.
  • Built for Analytics and Action: Data products can be crafted to be directly usable for analytical purposes (e.g., dashboards, machine learning models), operational actions (e.g., triggering workflows or notifications), and prediction.
  • Role in Future-Ready Architectures: Data products are a foundational element of a future-ready enterprise architecture that places a data mesh as the central nervous system of the enterprise.
  • Data Fabric: Data products serve as building blocks in an interconnected data ecosystem; a data fabric can be woven from multiple data products.
  • Enables Collaborative Autonomy: Data Product-based Streams empower autonomous teams to operate independently within their own application and DevOps environments while seamlessly collaborating across business lines, business units, or other organizational domains. This approach allows data producers to retain control over their processes while sharing data in a standardized and accessible manner. By bridging the gap between autonomous teams, Data Product Streams enable a paradigm of collaborative autonomy, fostering interconnectedness without compromising individual team independence. This dynamic enhances efficiency, innovation, and adaptability across the organization.
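
The event-based and decoupling points in the list above can be illustrated with a small sketch in which business events from different source applications are folded into one reusable summary that no longer depends on any single application. The `BusinessEvent` type, the `build_event_product` function, and the sample events are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessEvent:
    """Hypothetical event emitted by a source application."""
    source_app: str
    event_type: str
    amount: float


def build_event_product(events: list[BusinessEvent]) -> dict[str, dict[str, float]]:
    """Summarise events per type, dropping any reference to the source app."""
    totals: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for event in events:
        totals[event.event_type] += event.amount
        counts[event.event_type] += 1
    return {
        event_type: {"count": counts[event_type], "total": totals[event_type]}
        for event_type in totals
    }


# Events from an old and a new application feed the same product.
events = [
    BusinessEvent("legacy_billing", "invoice_issued", 100.0),
    BusinessEvent("new_billing", "invoice_issued", 50.0),
    BusinessEvent("new_billing", "payment_received", 120.0),
]
summary = build_event_product(events)
print(summary["invoice_issued"])
```

Because consumers read the summary rather than either billing application, replacing `legacy_billing` with `new_billing` in this sketch does not change what downstream users see.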

References:

Hasan, S., & Legner, C. (2023). Data products, data mesh, and data fabric. Business & Information Systems Engineering. 

DAMA NL Foundation (2023). Fact Sheet: Data As A Product As Part of Data Mesh. https://dama-nl.org/product/13496/

Dehghani, Z. (2022). Data Mesh: Delivering Data-Driven Value at Scale.