Data Models (Nodes)
The Reasoning Flows component for defining logical data entities and connecting them to the Reasoning Atlas.
Purpose
Data Models — also referred to as Nodes — represent structured, logical data entities integrated directly into Reasoning Flows. Unlike standalone data sources, these objects define semantic entities (e.g., customers, transactions, products) that other components in the workflow can reference and reuse.
By embedding Data Models into a project, teams maintain clarity, structure, and consistency across workflows while anchoring key business concepts within the logic of each process.
Where It Fits in Reasoning Flows
In the overall Reasoning Flows architecture:
- Extract & Load brings raw or external data into the system.
- Repository Tables register datasets for reuse.
- Transform & Prepare cleans, standardizes, and prepares data.
- AI & Machine Learning builds and trains intelligent models.
- Visual Objects turn processed data into reports or APIs.
- Data Models (Nodes) define the logical meaning and structure of data entities for use within Reasoning Flows.
- Reasoning Atlas connects Data Models to organizational context, semantics, and generative reasoning.
Goal: Data Models act as the semantic bridge between data pipelines and business concepts, ensuring every data flow is traceable, documented, and aligned with the enterprise knowledge graph.
Key Features
Defines Logical Data Models
Establishes structured entities (e.g., Customer, Product, Transaction) that are part of Reasoning Flows.
Reference for Downstream Objects
Serves as a centralized data contract — enabling downstream pipelines, APIs, or ML components to use consistent structures.
Promotes Modularity and Clarity
Encapsulates business logic within well-defined data entities, simplifying documentation and maintenance.
Ensures Consistency
Keeps schema definitions and transformations aligned across multiple workflows.
Recommended Use Cases
- Integrating business entities (like customers or orders) as part of a data transformation or AI pipeline
- Structuring and documenting how datasets interact across the workflow lifecycle
- Anchoring and tracing inputs, transformations, and outputs for governance and explainability
- Enabling semantic linkage between Reasoning Flows and the Reasoning Atlas for reasoning and generative AI
Visual Example
Example: A "Customer Model" Node defines customer-level attributes consumed by ML pipelines and exposed as a unified business entity.
Connection to Data Layers
Data Models should be built on properly classified data:
| Data Layer | Data Model Usage |
|---|---|
| RAW | Not recommended — clean data first |
| CLEAN | Acceptable for development and testing |
| GOLD | Primary source for production Data Models |
| OPTIMIZED | Used when Data Models require specific performance tuning |
Important: Only GOLD and OPTIMIZED tables are indexed in the Knowledge Catalog for reasoning and LLM-powered exploration.
Best Practices
- Define Data Models early in project design to serve as anchors for business concepts.
- Reuse the same Node across multiple workflows for consistency. Avoid creating duplicate definitions for the same business entity.
- Always link Data Models to source tables from the CLEAN or GOLD layers for traceability.
- Document entity relationships and data lineage in the Reasoning Atlas.
- Use clear naming conventions for discoverability (see the sketch after this list):
  - dm_customer — Customer data model
  - dm_product — Product data model
  - dm_transaction — Transaction data model
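As a rough illustration of how the naming convention and layer guidance fit together, the hypothetical mapping below pairs each dm_* Data Model with a GOLD-layer source table. The table names and helper function are assumptions for this sketch, not part of the product.

```python
# Hypothetical mapping of Data Models to their GOLD-layer source tables.
# All names are illustrative; follow your own ARPIA Data Layer naming.
DATA_MODEL_SOURCES = {
    "dm_customer": "gold_customers",
    "dm_product": "gold_products",
    "dm_transaction": "gold_transactions",
}

def source_table_for(data_model: str) -> str:
    """Return the registered source table for a Data Model, enforcing the dm_ prefix."""
    if not data_model.startswith("dm_"):
        raise ValueError(f"Data Model names should use the dm_ prefix: {data_model!r}")
    return DATA_MODEL_SOURCES[data_model]
```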
Related Documentation
- Reasoning Atlas Overview: Understand how Data Models connect to organizational knowledge and semantics.
- Transform & Prepare: Learn how data is standardized before being modeled.
- Repository Tables: See how base data is registered for reuse.
- ARPIA Data Layer Framework: Follow best practices for naming and tagging data layers (RAW → CLEAN → GOLD → OPTIMIZED).
