
What is the Data Catalog?

The narrative of a Data-Centric platform must, as expected, start with the data. Data Sources are the primary way of ingesting and understanding data.

To ensure that large volumes of data can be processed through the entire platform, the platform provides integrated, direct Connectors for several types of data sources: relational database management systems (RDBMS), data warehouses, remote object storage, and local files. These Connectors offer:

Using a Connector, it’s possible to connect to specific Datasets, yielding a Data Catalog. Currently, YData supports tabular and time series data (including transactional data), in a variety of file formats and in multi-table RDBMS settings.

The list of Data Sources is accessible via Data Sources, on the sidebar. The status of the connection to the data is periodically checked. Clicking on a Data Source will open its details page (see below).


Creating a new Dataset

To open the simplified wizard for Data Source creation, go to Data Catalog > + Create Dataset.

Choosing a specific Connector when configuring a Data Source


From there, a specific Connector must be chosen and the simplified connection details will be requested:

The required details for connecting to a file stored on AWS S3. The information required depends on the type of connector.
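As a rough illustration of how the requested details vary by connector type, the following minimal Python sketch checks a set of connection details against the fields a given connector would need. The connector type names, the field names, and the `missing_fields` helper are all hypothetical, chosen only for illustration. They are not the platform's actual schema or API.

```python
# Hypothetical mapping from connector type to the fields its form asks for.
# These names are illustrative assumptions, not the platform's real schema.
REQUIRED_FIELDS = {
    "aws-s3": ("access_key_id", "secret_access_key", "region", "path"),
    "rdbms": ("host", "port", "database", "username", "password"),
    "local-file": ("path",),
}


def missing_fields(connector_type, details):
    """Return the required fields that are absent or empty for this connector type."""
    if connector_type not in REQUIRED_FIELDS:
        raise ValueError(f"Unknown connector type: {connector_type!r}")
    return [name for name in REQUIRED_FIELDS[connector_type] if not details.get(name)]


# Placeholder values only; never hard-code real credentials.
s3_details = {
    "access_key_id": "AKIA-EXAMPLE",
    "secret_access_key": "example-secret",
    "region": "eu-west-1",
    "path": "s3://example-bucket/data.csv",
}

print(missing_fields("aws-s3", s3_details))           # []
print(missing_fields("rdbms", {"host": "db.local"}))  # ['port', 'database', 'username', 'password']
```

The same details satisfy one connector type and fall short for another, which is why the wizard asks for the Connector first and only then requests the matching connection details.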
