AI Workshop
ARPIA Data Workshop Overview
The ARPIA Data Workshop is an integrated development platform designed to enable teams to build, deploy, and manage data-driven solutions with high efficiency. It supports a wide range of use cases—from machine learning models and ETL pipelines to custom APIs, web applications, and automated data workflows.
Built with a flexible architecture, the Workshop combines GUI-based tools with open coding environments, providing both low-code and pro-code capabilities. It leverages Docker containerization for scalable compute, supports scheduling and orchestration of data tasks, and facilitates collaboration through reusable assets and project cloning features.
Capabilities and Use Cases
The Workshop enables the development of solutions spanning:
- Machine Learning model training and prediction pipelines
- Data extraction, transformation, and loading (ETL) from various sources
- Custom application and API development
- Text intelligence using embeddings, classification, and segmentation
- Automated notification systems (e.g., email alerts)
- Real-time or scheduled execution of complex workflows
Object Categories in the Workshop
Extract and Load
This category enables automated data extraction from registered data sources (e.g., MySQL-compatible databases) and supports loading the results into tables managed within the ARPIA platform. These objects can use direct table-to-table transfers or custom SQL queries to define the scope of data retrieval.
Key Objects:
- AP DataPipe Engine - MySQL
- AP DataPipe Engine - File
- Python 3.12 DataPipe Engine
Note: The AP DataPipe Engine provides a GUI-based configuration form for mapping source and destination tables, enabling rapid setup for common ETL tasks.
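As a rough illustration of what an extract-and-load object does under the hood, the sketch below pulls rows from a MySQL-compatible source with a custom SQL query and bulk-loads them into a destination table. The hostnames, credentials, table names, and the pymysql dependency are placeholders for illustration, not ARPIA's internal API.

```python
# Illustrative sketch only: hosts, credentials, and table names are
# placeholders, and pymysql stands in for the platform's own connector.
import pymysql

source = pymysql.connect(host="source-db.example.com", user="reader",
                         password="...", database="sales")
dest = pymysql.connect(host="dwh.example.com", user="loader",
                       password="...", database="project_repo")

with source.cursor() as cur:
    # A custom SQL query defines the scope of data retrieval.
    cur.execute(
        "SELECT id, customer, amount FROM orders WHERE created_at >= %s",
        ("2024-01-01",),
    )
    rows = cur.fetchall()

with dest.cursor() as cur:
    # Table-to-table load into a table managed within the platform.
    cur.executemany(
        "INSERT INTO orders_raw (id, customer, amount) VALUES (%s, %s, %s)",
        rows,
    )
dest.commit()
source.close()
dest.close()
```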
Transform and Prepare
These objects focus on refining data before it is used for analytics or modeling. They support data cleaning, format standardization, index generation, data type conversion, date processing, and custom SQL transformations. GUI-based tools allow for quick configuration, while SQL objects provide full scripting control.
Key Objects:
- AP Prepared Table
- AP Transform String to Binary
- AP Transform String to Numeric
- AP Transform Dates to Numeric
- AP SQL Code Execution
- AP Model Render
- SingularAI Text Splitter
Prepared Table: Converts raw database tables into datasets that support field-level transformation and analysis.
SQL Code Execution: Executes custom SQL logic as a standalone process within the pipeline.
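To make the transformation categories concrete, here is a minimal pandas sketch of the kinds of conversions objects such as AP Transform String to Numeric and AP Transform Dates to Numeric perform. The DataFrame and column names are hypothetical; the GUI objects encapsulate equivalent logic without requiring code.

```python
import pandas as pd

# Hypothetical raw table with string-encoded fields.
df = pd.DataFrame({
    "amount": ["19.99", "5.00", "bad-value"],
    "signup_date": ["2024-01-15", "2024-03-02", "2024-06-30"],
})

# String-to-numeric: invalid entries become NaN instead of raising.
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

# Dates-to-numeric: parse, then convert to days since the Unix epoch.
dates = pd.to_datetime(df["signup_date"])
df["signup_days"] = (dates - pd.Timestamp("1970-01-01")).dt.days
```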
AI and Machine Learning
This category offers both AutoML tools and custom development environments for machine learning. AutoML tools include GUI-driven workflows for training, deploying, and predicting with models. Dedicated GPU environments are available for high-performance training workloads.
Key Objects:
- AP AutoML Engine
- Singular-AI Text Embeddings
- AP Generative AI Workflow
- AP AutoML GPU Engine
AutoML objects: Focused on low-code model development with visual workflows.
GPU engines: Designed for larger datasets and more complex model training, leveraging containerized GPU acceleration.
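The following sketch shows, in plain scikit-learn, the train-then-predict pattern that the AutoML objects wrap in a visual workflow. The dataset and model choice are illustrative assumptions: an actual AutoML engine would read from a prepared table and select and tune models automatically.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative dataset; an AutoML engine would read from a prepared table.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# An AutoML engine searches over models and hyperparameters; a single
# model is fixed here for the sake of the example.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```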
High Performance Computing
These are open development environments that allow teams to write and execute custom code using supported languages. Ideal for advanced data processing, ML model development, API services, and custom application logic.
Key Objects:
- PHP 7.4 Application
- PHP 8.2 Application
- Python 3.8 Advanced ML Application
- Python 3.8 Advanced ML & Plotly
- Python 3 FastAPI
- Python 3.9 Google Cloud Speech
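As an example of what one of these environments can host, here is a minimal FastAPI service of the kind the Python 3 FastAPI object is designed to run. The endpoint and payload are hypothetical.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ScoreRequest(BaseModel):
    customer_id: int
    amount: float

@app.post("/score")
def score(req: ScoreRequest) -> dict:
    # Placeholder logic; a real object might call a trained model here.
    return {"customer_id": req.customer_id, "risk": min(req.amount / 1000, 1.0)}

@app.get("/health")
def health() -> dict:
    return {"status": "ok"}
```

In practice such a service would be started with an ASGI server, e.g. `uvicorn main:app`.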
Notebooks
The ARPIA Notebooks object allows for the development of interactive Python notebooks directly within the Workshop environment. This object supports data exploration, experimentation, and the documentation of logic and results in a single interface.
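A typical notebook cell might profile a table before a pipeline is built around it; the file name below is a stand-in for a table in the project's data repository.

```python
import pandas as pd

# Stand-in for a table in the project's data repository.
df = pd.read_csv("orders_raw.csv")

df.info()             # column types and non-null counts
print(df.describe())  # summary statistics for numeric columns
df.head()             # in a notebook, the last expression renders inline
```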
Notification Engine
This object allows configuration and execution of custom notifications using email services and requires a Mailgun API key for operation. These objects can be triggered within workflows to send status updates, alerts, or summaries.
Key Object:
- AP Notification Engine
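Because the engine is Mailgun-backed, its behavior corresponds to a call like the following against Mailgun's HTTP API. The domain, key, and addresses are placeholders.

```python
import requests

# Placeholders: substitute your Mailgun domain, API key, and addresses.
MAILGUN_DOMAIN = "mg.example.com"
MAILGUN_API_KEY = "key-..."

resp = requests.post(
    f"https://api.mailgun.net/v3/{MAILGUN_DOMAIN}/messages",
    auth=("api", MAILGUN_API_KEY),
    data={
        "from": "workshop@mg.example.com",
        "to": "team@example.com",
        "subject": "Pipeline finished",
        "text": "The nightly ETL run completed successfully.",
    },
)
resp.raise_for_status()
```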
Web-Hook Sender
This object type enables integration with external systems via webhook calls. Users can define webhook URLs and payloads to trigger downstream services based on events occurring within the Workshop.
Key Object:
- AP Web-Hook Sender
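Conceptually, a webhook call is an HTTP POST to a user-defined URL carrying a user-defined payload. A minimal sketch, with a hypothetical URL and payload:

```python
import requests

# Hypothetical URL and payload; both are defined per integration.
webhook_url = "https://hooks.example.com/arpia-events"
payload = {
    "event": "pipeline.completed",
    "object": "orders_etl",
    "status": "success",
}

resp = requests.post(webhook_url, json=payload, timeout=10)
resp.raise_for_status()
```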
Development Interface
Each object within the Workshop features a robust development interface that includes:
- Global Files: Shared code or libraries accessible by multiple objects in the same project.
- Data Repository Access: Direct integration with the project’s internal data tables and schemas.
- Execution and Scheduling: Support for on-demand or scheduled runs of objects and workflows.
- Project and Object Cloning: Rapid duplication of entire projects or individual objects for reuse.
- Dynamic Parameter Configuration: Runtime parameter injection to support flexible and scalable executions.
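How parameters reach an object at runtime is platform-specific; as an assumed illustration only, the pattern is equivalent to reading injected values at startup, for example from environment variables:

```python
import os
from datetime import date

# Assumed delivery mechanism for illustration: parameters injected as
# environment variables at run time, with sensible defaults.
RUN_DATE = os.environ.get("RUN_DATE", date.today().isoformat())
BATCH_SIZE = int(os.environ.get("BATCH_SIZE", "1000"))
DRY_RUN = os.environ.get("DRY_RUN", "false").lower() == "true"

print(f"Running for {RUN_DATE} with batch size {BATCH_SIZE} (dry run: {DRY_RUN})")
```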
Deployment and Compute Infrastructure
The Workshop runs on a containerized infrastructure based on Docker. It offers two types of compute resource configurations:
- Shared Container Resources: Suitable for general workloads; cost-effective and managed across tenants.
- Dedicated Container Resources: Reserved compute environments offering guaranteed performance, designed for enterprise-scale needs.
Summary
The ARPIA Data Workshop is a unified environment for data development, supporting both GUI-based and code-driven workflows. It empowers teams to extract insights from data, build intelligent applications, and deploy solutions across the organization. Whether handling structured transformations, deploying AI models, or integrating with external APIs, the Workshop provides the tools and infrastructure to deliver robust, scalable outcomes.