Transform & Prepare

Transform & Prepare Overview

The Transform & Prepare object type handles data processing within our Workshop Project. Its tools clean, structure, and transform data to prepare it for analysis: filtering out irrelevant information, standardizing formats, handling missing or inconsistent values, and merging data from multiple sources. Refining the data at this stage keeps it accurate, consistent, and ready for analytics, reporting, or integration into downstream processes, which in turn supports data quality and reliable insights.

Development Environments

AutoML

This object type provides access to the machine learning and AI tools available on the ARPIA platform. These tools support a wide range of functions, from generating text embeddings to training custom machine learning models tailored to specific needs.

  • SingularAI Text Embeddings
    Allows you to generate semantic embeddings from raw text data for use in AI pipelines. Ideal for NLP-based tasks like similarity search, classification, or clustering.
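
As a rough illustration of how embeddings support similarity search, the sketch below ranks a few texts against a query by cosine similarity. It does not call the ARPIA platform: the three-dimensional vectors are made-up stand-ins for real embeddings, and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def most_similar(query, corpus):
    """Return the corpus key whose vector is closest to the query vector."""
    return max(corpus, key=lambda key: cosine_similarity(query, corpus[key]))

# Toy vectors standing in for real text embeddings (values are invented).
corpus = {
    "invoice overdue": [0.9, 0.1, 0.0],
    "payment late": [0.8, 0.2, 0.1],
    "weather report": [0.0, 0.1, 0.9],
}
query = [0.45, 0.05, 0.0]  # same direction as "invoice overdue"
best = most_similar(query, corpus)
```

Real embeddings have hundreds or thousands of dimensions, but the comparison works the same way.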


Extract

These objects automate the extraction process from MySQL-compatible databases registered as Data Sources within the ARPIA workspace. Extractions can be configured to either transfer data directly from table to table or to use custom SQL queries, enabling precise retrieval of desired information from the database.



  • AP DataPipe Engine - MySQL
    Form-based tool to move data directly from a MySQL source into a destination table.
  • Python 3.12 DataPipe Engine
    Script-based version of DataPipe that supports Python logic for dynamic extraction and transformation workflows.
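
The custom-SQL extraction mode described above can be pictured with a small sketch. It is not the DataPipe API: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a MySQL Data Source, and the `extract` helper, table names, and columns are all hypothetical.

```python
import sqlite3

def extract(conn, query, dest_table, columns):
    """Run a SELECT and load its rows into dest_table; return the row count.

    dest_table and columns are interpolated into the SQL text, so they must
    come from trusted configuration, never from user input.
    """
    rows = conn.execute(query).fetchall()
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO {dest_table} ({col_list}) VALUES ({placeholders})", rows
    )
    conn.commit()
    return len(rows)

# In-memory database standing in for a registered Data Source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "north", 120.0), (2, "south", 80.0), (3, "north", 45.5)],
)
conn.execute("CREATE TABLE orders_north (id INTEGER, amount REAL)")

# A custom query retrieves only the rows and columns of interest.
copied = extract(
    conn,
    "SELECT id, amount FROM orders WHERE region = 'north'",
    "orders_north",
    ["id", "amount"],
)
```

A plain table-to-table transfer is the same call with a `SELECT` that names every column and has no `WHERE` clause.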

High Performance Computing

These objects provide open development environments where you can write and execute code to solve complex, nested problems and create customized processes tailored to your specific needs.


  • PHP 7.4 Application
    Full code environment for procedural logic, integrations, or custom backends.
  • Python 3.8 Advanced ML Application
    Python environment designed for advanced processing, modeling, and data manipulation.
  • Python 3.8 Advanced ML & Plotly
    The same environment as the Python 3.8 Advanced ML Application, pre-configured to support rich data visualizations using Plotly.

Notification Engine

This object type provides access to ARPIA's Notification Engine, allowing you to configure and customize the platform's email notification system. A Mailgun API key is required to enable it.


  • AP Notification Engine
    GUI-based setup for defining email templates, triggers, and delivery rules across your workflows.

Prepare & Transform Tools

These objects enable flexible transformation and preparation of input data to meet specific project requirements. Capabilities include adding indexes to tables, converting data types (e.g., string to binary or numeric formats), transforming dates into numeric formats, and executing SQL code for advanced data preparation. They also include the SingularAI Text Splitter, which segments text as needed.


  • AP Prepared Table
    GUI-based tool that lets you take an existing database table and convert it into a modifiable dataset. Once prepared, you can go field by field to transform types, formats, or clean values.

  • AP Transform String to Binary
    Converts text fields into binary encodings, which is useful for flagging or classification models. GUI-driven setup.

  • AP Transform String to Numeric
    Converts categorical or string-based values into numeric representations, preparing them for use in models or aggregation. Also GUI-configurable.

  • AP Transform Dates to Numeric
    Transforms date fields into numerical formats (e.g., Unix timestamps or day-of-week values), making them usable in time series or modeling tasks.

  • AP SQL Code Execution
    A code block object that allows direct SQL execution on registered data sources. Best for defining logic-heavy preparation or chained transformations. Fully programmable, which makes it a good fit for complex business rules.

  • SingularAI Text Splitter
    Splits long text entries into chunks (for tokenization, summarization, etc.) using prebuilt rules. Useful for RAG pipelines or text-heavy workflows.
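
To make the transformations listed above concrete, here is a minimal Python sketch of each one. These functions are illustrative only and are not how the AP tools are implemented: the names, the integer-code scheme, and the word-count splitting rule are all assumptions.

```python
from datetime import datetime, timezone

def string_to_binary(value, positive_values):
    """Map a text value to 1 if it is one of positive_values, else 0."""
    return 1 if value.strip().lower() in positive_values else 0

def string_to_numeric(values):
    """Give each distinct string an integer code, in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]

def date_to_numeric(date_text):
    """Return (Unix timestamp, day of week) for a YYYY-MM-DD date at UTC midnight."""
    dt = datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp()), dt.weekday()  # Monday is 0

def split_text(text, max_words, overlap=0):
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words, which helps keep context
    across chunk boundaries in retrieval workflows.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

flag = string_to_binary(" Yes ", {"yes", "y", "true"})
codes = string_to_numeric(["red", "blue", "red", "green"])
stamp, weekday = date_to_numeric("2024-01-01")
chunks = split_text("a b c d e", max_words=3, overlap=1)
```

The overlap parameter is one common design choice for text splitting; the rules the SingularAI Text Splitter actually applies may differ.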

Web-Hook Sender

This object type configures the platform’s webhook sender, which sends webhook events to a specified URL.

  • AP Web-Hook Sender
    GUI tool for creating webhook payloads and mapping data to external systems in real time. Great for event-based integrations with third-party platforms.
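
For readers unfamiliar with webhooks, the sketch below shows what a typical webhook event looks like on the wire: an HTTP POST with a JSON body. It uses only the Python standard library. The URL, event name, and payload shape are hypothetical and do not reflect the format the AP Web-Hook Sender actually emits.

```python
import json
import urllib.request

def build_webhook_request(url, event_type, record):
    """Build (but do not send) a JSON POST request describing one event."""
    payload = {"event": event_type, "data": record}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical receiver URL and event; replace with the real endpoint.
request = build_webhook_request(
    "https://example.com/hooks/orders",
    "order.created",
    {"id": 42, "amount": 19.99},
)

# Sending is left out so the sketch runs offline. To deliver the event:
# urllib.request.urlopen(request, timeout=10)
```

The receiving system parses the JSON body and maps its fields to its own data model.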