Transform & Prepare Overview
The Reasoning Flows layer dedicated to data refinement, enrichment, and transformation.
🧭 Purpose
The Transform & Prepare object type enables comprehensive data processing within Reasoning Flows.
These tools allow teams to clean, structure, and transform data, preparing it for advanced analysis, AI modeling, and operational integration.
This includes filtering out irrelevant information, standardizing formats, handling missing or inconsistent values, and merging data from multiple sources.
By refining the data at this stage, teams ensure that it is accurate, consistent, and analysis-ready — forming the foundation for reliable insights, ML performance, and AI reasoning.
🔹 Where It Fits in Reasoning Flows
In the Reasoning Flows architecture:
- Extract & Load brings raw or external data into the repository.
- Repository Tables register structured datasets for reuse across workflows.
- Transform & Prepare refines and cleans this data for analysis or model training.
- Knowledge Atlas connects refined data to the Generative AI and Semantic reasoning layers.
Goal: Transform & Prepare bridges the transition from data ingestion to data intelligence — it’s where raw data becomes business-ready information.
🧩 Development Environments
AutoML
This object type provides access to the machine learning and AI tools available within the Arpia platform.
These tools support a wide range of functions, from generating text embeddings to training custom machine learning models tailored to specific needs.
- Singular-AI Text Embeddings
Allows you to generate semantic embeddings from raw text data for use in AI pipelines.
Ideal for NLP-based tasks such as similarity search, classification, or clustering.
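Once generated, such embeddings are typically compared by cosine similarity for tasks like similarity search. A minimal, self-contained sketch of that comparison (the short vectors below are illustrative stand-ins for real embedding output, not what the tool actually produces):

```python
# Illustrative: comparing two semantic embeddings with cosine similarity.
# The 3-dimensional vectors stand in for real embedding output.
import math

def cosine_similarity(a, b):
    """Return the cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

doc_embedding = [0.2, 0.8, 0.1]      # e.g. "quarterly revenue report"
query_embedding = [0.25, 0.75, 0.05] # e.g. "Q3 sales figures"

score = cosine_similarity(doc_embedding, query_embedding)
# Scores near 1.0 indicate semantically similar text.
```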
Extract
These objects automate the extraction process from MySQL-compatible databases registered as Data Sources within Reasoning Flows.
Extractions can either perform direct table-to-table transfers or use custom SQL queries to precisely define the data retrieved from the source.
- AP DataPipe Engine - MySQL
GUI-based tool for moving data directly from a MySQL source into a destination table.
- Python 3.12 DataPipe Engine
Script-based version of DataPipe that supports Python logic for dynamic extraction and transformation workflows.
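As a rough illustration of the script-based route, the sketch below pulls rows with a custom SQL query and loads them into a destination table. Here `sqlite3` and the table names stand in for a registered MySQL Data Source; the real engine's API will differ:

```python
# Sketch of a query-driven extraction step: a custom SQL query defines
# exactly which rows leave the source, then they load into a destination
# table. sqlite3 stands in for a MySQL Data Source.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE src_orders (id INTEGER, status TEXT, amount REAL)")
cur.executemany("INSERT INTO src_orders VALUES (?, ?, ?)",
                [(1, "paid", 120.0), (2, "void", 0.0), (3, "paid", 80.5)])

# The custom query precisely defines the data retrieved from the source.
rows = cur.execute(
    "SELECT id, amount FROM src_orders WHERE status = 'paid'"
).fetchall()

# Direct load into the destination table.
cur.execute("CREATE TABLE dst_orders (id INTEGER, amount REAL)")
cur.executemany("INSERT INTO dst_orders VALUES (?, ?)", rows)
conn.commit()
```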
High Performance Computing
These objects provide open development environments for writing and executing custom code — ideal for advanced data processing, ML model training, or complex business logic.
- PHP 7.4 Application
Full code environment for procedural logic, integrations, or custom backends.
- Python 3.8 Advanced ML Application
Python environment for advanced processing, modeling, and data manipulation.
- Python 3.8 Advanced ML & Plotly
Same as above, but preconfigured to support rich data visualizations using Plotly.
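The custom logic these environments host can be as simple as a feature-scaling step run before model training. A minimal sketch (the function and data are illustrative, not part of the platform API):

```python
# Illustrative custom-code step: min-max scaling a numeric feature
# into the [0, 1] range before it is fed to a model.
def min_max_scale(values):
    """Scale a list of numbers into the [0, 1] range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [(v - lo) / span for v in values]

revenue = [100.0, 250.0, 400.0]
scaled = min_max_scale(revenue)
```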
Notification Engine
This object type provides access to Arpia's Notification Engine, enabling configuration of automated email notifications and alerts.
Requires a Mailgun API key for operation.
- AP Notification Engine
GUI-based setup for defining email templates, triggers, and delivery rules across your workflows.
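Under the hood, Mailgun sends are plain HTTP POSTs authenticated with the API key via HTTP basic auth. A hedged sketch of that request shape (the domain, key, and addresses are placeholders, and nothing is actually sent here):

```python
# Sketch of the Mailgun send-message call the Notification Engine relies
# on. Values are placeholders; this only assembles the request pieces.
def build_mailgun_request(domain, api_key, sender, to, subject, text):
    """Assemble the URL, auth, and form fields for a Mailgun send."""
    return {
        "url": f"https://api.mailgun.net/v3/{domain}/messages",
        "auth": ("api", api_key),  # Mailgun uses HTTP basic auth
        "data": {"from": sender, "to": to,
                 "subject": subject, "text": text},
    }

req = build_mailgun_request(
    "mg.example.com", "key-placeholder",
    "alerts@mg.example.com", "team@example.com",
    "Flow completed", "The nightly Transform & Prepare run finished.",
)
# A real sender would pass these to an HTTP client, e.g.
# requests.post(req["url"], auth=req["auth"], data=req["data"]).
```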
Prepare & Transform Tools
These objects enable flexible data transformation and preparation tailored to project needs.
Capabilities include adding indexes, converting data types, transforming date formats, executing SQL logic, or segmenting text for downstream AI applications.

- AP Prepared Table
GUI-based tool that converts an existing table into a modifiable dataset. Supports field-by-field data cleaning, retyping, and transformation.
- AP Transform String to Binary
Converts text fields into binary encodings — ideal for classification flags.
- AP Transform String to Numeric
Converts categorical or string-based values into numeric form for aggregation or modeling.
- AP Transform Dates to Numeric
Converts date fields into numeric representations (e.g., timestamps, day-of-week).
- AP SQL Code Execution
Code block object for executing custom SQL logic. Perfect for logic-heavy preparation, complex joins, or multi-step transformations.
- SingularAI Text Splitter
Splits long text entries into smaller segments — useful for tokenization, summarization, or RAG-based workflows.
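As an illustration of what text splitting involves, here is a minimal character-based chunker with overlap. The parameters and logic are assumptions for the sketch, not the SingularAI Text Splitter's actual algorithm:

```python
# Illustrative text segmentation: break a long string into overlapping
# chunks sized for downstream embedding or RAG use.
def split_text(text, chunk_size=40, overlap=10):
    """Split text into chunks of at most chunk_size characters,
    each sharing `overlap` characters with its predecessor."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks

passage = "Transform and Prepare refines raw data into analysis-ready tables."
chunks = split_text(passage, chunk_size=30, overlap=5)
```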
Web-Hook Sender
This object enables external integration through webhooks, allowing Reasoning Flows to send event-based payloads to external systems in real time.
- AP Web-Hook Sender
GUI-based tool for defining webhook payloads and mapping data to third-party systems or APIs.
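A webhook dispatch boils down to serializing an event payload as JSON and POSTing it to a subscriber URL. A sketch with placeholder event names and fields (no request is sent):

```python
# Illustrative webhook payload construction: wrap a record in a small
# event envelope, serialized as JSON. Field names are placeholders.
import json

def build_webhook_payload(event, record):
    """Wrap a data record in a minimal event envelope."""
    return json.dumps({"event": event, "data": record})

body = build_webhook_payload(
    "table.refreshed",
    {"table": "clean_orders", "rows": 1250},
)
# A real sender would POST `body` with Content-Type: application/json
# to the configured third-party endpoint.
```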
🧠 Best Practices
- Use Transform & Prepare objects as the standard layer between ingestion and modeling.
- Align all transformed tables with the Arpia Data Layer Framework (RAW → CLEAN → GOLD → OPTIMIZED).
- Use SQL Code Execution or Python DataPipe for complex transformation logic.
- Document transformation rules and results in the Knowledge Atlas for governance and reusability.
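The RAW → CLEAN step of that layering convention can be pictured as a single SQL promotion: read the raw table, fix types, drop unusable rows, and write a clean table. A sketch using `sqlite3` and illustrative table names as stand-ins for the repository:

```python
# Sketch of a RAW -> CLEAN promotion: cast text amounts to numeric and
# keep only parseable rows. sqlite3 stands in for the real repository.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sales (amount TEXT, sold_on TEXT)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", [
    ("19.99", "2024-01-05"),
    ("n/a",   "2024-01-06"),  # unparseable amount: excluded downstream
    ("42.50", "2024-01-07"),
])

# CLEAN layer: numeric types, valid rows only.
conn.execute("""
    CREATE TABLE clean_sales AS
    SELECT CAST(amount AS REAL) AS amount, sold_on
    FROM raw_sales
    WHERE amount GLOB '[0-9]*'
""")
total = conn.execute("SELECT SUM(amount) FROM clean_sales").fetchone()[0]
```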
🔗 Related Documentation
- Extract & Load in Arpia — Knowledge Base Overview
Learn how data is ingested into Reasoning Flows before transformation.
- Repository Table Overview
Understand how registered tables act as reusable data sources across workflows.
- Arpia Data Layer Framework
Review how data classification and governance are applied across Reasoning Flows.
- Knowledge Atlas Overview
Explore how prepared datasets integrate into Arpia’s Generative AI and Semantic reasoning ecosystem.