Repository Table
Repository Table Overview
The Repository Table object in the Data Workshop allows you to load data from a repository table into your workflow. This object serves as the main entry point for structured data and makes it available for use across other components in the pipeline.
Once registered, the Repository Table acts as a referenceable data source. It can be used to build datasets, train models, enrich features, or support any downstream ML process within the same workflow.

Key Capabilities
- Direct data access: Load an existing table from your source repository without writing custom queries or scripts.
- Reusable reference: Once added, the table is available to all other objects that accept tabular input.
- Centralized updates: Any change to the source (schema or content) is reflected automatically across all dependent objects.
- Workflow consistency: Ensures that every component in the ML pipeline uses the same source of truth.
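The "reusable reference" and "centralized updates" capabilities can be pictured with a small conceptual sketch. This is not the Data Workshop API: `SOURCE_REPOSITORY`, `RepositoryTable`, and `DownstreamObject` are hypothetical names invented for illustration. The sketch only assumes that downstream objects hold a reference to the table rather than a private copy of its data.

```python
from dataclasses import dataclass
from typing import Dict, List

Row = Dict[str, object]

# Hypothetical in-memory stand-in for a source repository (illustration only).
SOURCE_REPOSITORY: Dict[str, List[Row]] = {
    "customers": [
        {"id": 1, "age": 34, "churned": 0},
        {"id": 2, "age": 51, "churned": 1},
    ]
}


@dataclass
class RepositoryTable:
    """Hypothetical reference to a table; rows are read from the source on demand."""
    name: str

    def rows(self) -> List[Row]:
        # Return a fresh list each time so callers always see the current content.
        return list(SOURCE_REPOSITORY[self.name])

    def columns(self) -> List[str]:
        rows = self.rows()
        return sorted(rows[0].keys()) if rows else []


@dataclass
class DownstreamObject:
    """Hypothetical consumer (for example a Dataset or Feature Builder)."""
    label: str
    source: RepositoryTable  # a shared reference, not a copy of the data

    def row_count(self) -> int:
        return len(self.source.rows())


table = RepositoryTable("customers")
dataset = DownstreamObject("Dataset", table)
feature_builder = DownstreamObject("Feature Builder", table)

# A change in the source is visible to every object that references the table.
SOURCE_REPOSITORY["customers"].append({"id": 3, "age": 27, "churned": 0})
counts = (dataset.row_count(), feature_builder.row_count())
```

Because both consumers read through the same `RepositoryTable` reference, neither needs to be updated when the source changes.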
Typical Usage Flow
- Add Repository Table
  Select a table from the repository and register it via the Repository Table object.
- Connect it to downstream objects
  Use the registered table as input for other components, such as:
  - Dataset object: to prepare training and test sets
  - Train Model object: to feed training data
  - Feature Builder object: to apply transformations or enrichments
- Leverage it across the workflow
  Any step that requires structured data can now reference the Repository Table without redefining its schema or logic.
Example Workflow
- Add a Repository Table → feed it into a Dataset → use that to train a model.
- Or: Repository Table → Feature Builder → Dataset → Train Model.
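The two chains above can be sketched in plain Python to show how the same table feeds either path. Every function name here (`load_repository_table`, `feature_builder`, `dataset`, `train_model`) is a hypothetical stand-in, not a Data Workshop call. The sample rows are made up, and the "model" is only a mean-label baseline chosen to keep the example short.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]


def load_repository_table() -> List[Row]:
    # Hypothetical stand-in for a registered Repository Table (made-up rows).
    return [
        {"age": 20.0, "spend": 100.0, "label": 0.0},
        {"age": 30.0, "spend": 300.0, "label": 1.0},
        {"age": 40.0, "spend": 200.0, "label": 0.0},
        {"age": 50.0, "spend": 500.0, "label": 1.0},
    ]


def feature_builder(rows: List[Row]) -> List[Row]:
    # Add a derived column; copies each row so the source table is not mutated.
    return [{**r, "spend_per_year": r["spend"] / r["age"]} for r in rows]


def dataset(rows: List[Row], train_fraction: float = 0.75) -> Tuple[List[Row], List[Row]]:
    # Simple ordered split into training and test sets.
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def train_model(train_rows: List[Row]) -> float:
    # Trivial baseline "model": the mean of the training labels.
    return sum(r["label"] for r in train_rows) / len(train_rows)


table = load_repository_table()

# Chain 1: Repository Table -> Dataset -> Train Model
train_a, test_a = dataset(table)
model_a = train_model(train_a)

# Chain 2: Repository Table -> Feature Builder -> Dataset -> Train Model
train_b, test_b = dataset(feature_builder(table))
model_b = train_model(train_b)
```

Both chains start from the same `table` value, which mirrors the idea that the Repository Table is defined once and then referenced wherever structured data is needed.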
Best Practices
- Use Repository Tables as the starting point for any data-driven object.
- Avoid using raw queries elsewhere in the workflow if the table is already registered as a Repository Table.