
Target:

After any task has been performed on the data extracted from the source, a Target step loads the resulting data into a database, a file system, or cloud storage.

 

Dim load:

“Dim load” refers to loading data into dimension tables. Dimension tables are one of the two types of tables in a data warehouse, the other being fact tables.

  • Dimension tables contain descriptive attributes that provide context for the measures in the fact table. For example, in a sales data warehouse, a dimension table for time would have attributes such as date, month, quarter, and year.  
  • A product dimension table would contain attributes such as product name, category, and brand. 
  • Loading data into dimension tables involves inserting or updating the dimension records based on the source data. This process is typically performed using an Extract, Transform, Load (ETL) tool or process. 
  • During the “Dim load” process, the ETL tool may perform transformations such as data cleansing, mapping, and validation to ensure the data is accurate and consistent with the rest of the data warehouse.  
  • Once the data has been loaded into the dimension tables, the user can use it for reporting, analysis, and data mining. 
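To make the upsert idea concrete, here is a minimal sketch of a dimension-table load using Python’s built-in sqlite3 module; the dim_product table and its columns are hypothetical stand-ins for a real warehouse dimension:

```python
import sqlite3

# Minimal dimension load: upsert descriptive attributes keyed on the
# dimension's natural key. Requires SQLite >= 3.24 for ON CONFLICT.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_product ("
    "product_id INTEGER PRIMARY KEY, name TEXT, category TEXT, brand TEXT)"
)

source_rows = [
    (1, "Laptop", "Electronics", "Acme"),      # hypothetical source data
    (2, "Desk Chair", "Furniture", "SitWell"),
]

# Insert new dimension records, or update attributes if the key exists.
conn.executemany(
    "INSERT INTO dim_product (product_id, name, category, brand) "
    "VALUES (?, ?, ?, ?) "
    "ON CONFLICT(product_id) DO UPDATE SET "
    "name = excluded.name, category = excluded.category, brand = excluded.brand",
    source_rows,
)
conn.commit()
print(conn.execute("SELECT * FROM dim_product").fetchall())
```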

 

  1. Enter the Step name, which will appear in the grid. 
  2. Form type: Select the mode type as per the requirement (the available types are described below). 

 

Update Insert:

This will update the current data and insert the new data into the table.  

For example, the current data in an existing table consists of employees and their salaries: 

Here we are taking the form type Update_Insert, a table from the Vertica connection, and serial number as the key value. Users can add or delete keys as per the requirement. The step will insert the new data and update the existing data in the table. 

 

 

A total of 5 employees’ records are present here. 

Now we will load data that updates one employee’s record and adds a new employee to this table. 

In the output, we can see that Vipin’s salary is updated from 16000 to 14000, and a new employee, Bipul, is added to the table by the Update Insert function. 
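The behavior above can be sketched in a few lines of Python, with sqlite3 standing in for the Vertica table; the sr_no key column and the second existing row are assumptions for illustration:

```python
import sqlite3

# Update Insert (upsert) keyed on serial number, mirroring the
# employee/salary example above. Schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salary (sr_no INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO salary VALUES (?, ?, ?)",
                 [(1, "Vipin", 16000), (2, "Asha", 18000)])

incoming = [(1, "Vipin", 14000),   # existing key -> row is updated
            (3, "Bipul", 15000)]   # new key      -> row is inserted

conn.executemany(
    "INSERT INTO salary (sr_no, name, salary) VALUES (?, ?, ?) "
    "ON CONFLICT(sr_no) DO UPDATE SET name = excluded.name, salary = excluded.salary",
    incoming,
)
conn.commit()
print(conn.execute("SELECT * FROM salary ORDER BY sr_no").fetchall())
# [(1, 'Vipin', 14000), (2, 'Asha', 18000), (3, 'Bipul', 15000)]
```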

 

Delete Insert:

This deletes the current data and adds only the new data to the table.  

For example, the current data in the existing table consists of employees and their salaries: 

Here we are taking the form type Delete_Insert, the table src_data_salary from the Vertica connection, and serial number as the key value. Users can add or delete keys as per the requirement. The step will delete the existing data and insert only the newly added data into the table. 

 

This is the current data in the table, which holds 5 employees’ records. 


 

Here we added one more employee, Akshat, to the table. 

In the output, we can see that the step deleted the data previously present in the table and added only the new record, i.e., Akshat’s. 
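A minimal sketch of the same Delete Insert semantics, with sqlite3 standing in for the Vertica table src_data_salary (the existing rows and salary figures are illustrative):

```python
import sqlite3

# Delete Insert: remove the rows currently in the target table,
# then insert only the newly arrived rows. Schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_data_salary (sr_no INTEGER, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO src_data_salary VALUES (?, ?, ?)",
                 [(1, "Vipin", 16000), (2, "Asha", 18000)])

new_rows = [(6, "Akshat", 20000)]

with conn:  # one transaction covering both the delete and the insert
    conn.execute("DELETE FROM src_data_salary")
    conn.executemany("INSERT INTO src_data_salary VALUES (?, ?, ?)", new_rows)

print(conn.execute("SELECT * FROM src_data_salary").fetchall())
# [(6, 'Akshat', 20000)] -- only the new record remains
```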

     3. Select the connection that was built earlier. 

     4. Click Get Table, and all the available tables in the connection will appear. 

     5. Select the table from the drop-down. 

     6. Click on Get Columns to select the columns present in the table.  

 

Encrypt Load:

Encryption is one way to ensure data security during the ETL process. Encrypting data involves transforming it into an unreadable format that can only be read with a decryption key. This makes it difficult for unauthorized users to access or tamper with the data. 

  •  In an ETL process, encryption can be applied to the data during the load stage. This means the data is encrypted before storing it in the destination system, such as a data warehouse.  
  • Depending on the ETL tool used, encryption can be implemented in different ways, such as using a built-in encryption function or calling an external encryption library. 

  1. Enter the Step name, which will appear in the grid. 
  2. Select the connection that was built earlier. 
  3. Set Password for encrypting the table; the user should remember this password to access the table in the future. 
  4. Click on Get Tables; select the table from the drop-down. (All the tables present in the selected connection will appear in the drop-down) 
  5. Mode: The user has to select the mode type: 
    • Append: “Append” refers to the process of adding new data or records to an existing dataset or database. When appending, data is typically added to the end of the existing dataset or database without modifying or deleting any existing data. 
    • Truncate & Load: Truncate is a database operation that removes all data from a table, effectively deleting all rows in the table. This operation is often faster than deleting rows one by one because it does not generate any undo logs or transaction logs, and it does not log individual row deletions. 
  6. Click on Get Columns, and all the columns will appear at the bottom; the user can select the columns as per the requirement. 
  7. Click Ok.
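Lumenore’s internal encryption scheme is not documented here, but the following sketch illustrates the general idea of load-stage encryption described above: a password-derived key encrypts each value before it is written to the target table. The cryptography package, table name, and sample rows are assumptions:

```python
import base64, hashlib, sqlite3
from cryptography.fernet import Fernet  # pip install cryptography

# Derive a symmetric key from the user's password. The salt and
# iteration count here are demo values only.
password, salt = b"s3cret", b"static-salt-for-demo"
key = base64.urlsafe_b64encode(hashlib.pbkdf2_hmac("sha256", password, salt, 100_000))
fernet = Fernet(key)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE secure_load (id INTEGER, payload BLOB)")

# Encrypt each value before it is stored in the target table.
for row_id, value in [(1, "Vipin,16000"), (2, "Asha,18000")]:
    conn.execute("INSERT INTO secure_load VALUES (?, ?)",
                 (row_id, fernet.encrypt(value.encode())))
conn.commit()

# Reading the data back requires the same password-derived key.
token = conn.execute("SELECT payload FROM secure_load WHERE id = 1").fetchone()[0]
print(fernet.decrypt(token).decode())  # -> Vipin,16000
```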

 

Lumenore Dataset: 

The Lumenore dataset step lets the user create and name the dataset and its columns. The user can build a dataset from different sources; for example, they can connect a database and an Excel file through a Joiner and use the result for creating the schema and joins. 

  1. Enter the Step name, which will appear in the grid. 
  2. Loading method: Select how the data should be loaded; the available methods are described below. 
  • Update Insert: This will update the current data and insert the new data into the table.  

For example, the current data in the existing table is as follows: 

 

 

Now we are trying to update Dhoni’s city to Ranchi and insert a new record for Rohit using UPDATE INSERT mode. 

 

While loading the data into the table, the user should select the mode ‘Update Insert’ and select key columns based on their requirement. 

The output of the graph: 

 

 

  • Delete Insert: This deletes the current data and adds only the new data to the table. For example, if the current data is 1234 and the new data is 56, the output will be 56. 

For example, the current data in the existing table is as follows: 

 

We are now trying to insert a new record using DELETE INSERT mode. Here we are selecting id as the Key Column. 

Here Dhoni’s and Rohit’s records have to be deleted, and Kohli’s record should be inserted. 

The output of the graph: 

 

  • Append: “Append” refers to the process of adding new data or records to an existing dataset or database. When appending, data is typically added to the end of the existing dataset or database without modifying or deleting any existing data. 

For example, we are trying to create a new dataset by using this data. 

 

 

Now we are trying to create a Dataset and Dataset table by using APPEND. 

The output of the graph: 

 

  • Truncate & Load: Truncate is a database operation that removes all data from a table, effectively deleting all rows in the table. This operation is often faster than deleting rows one by one because it does not generate any undo logs or transaction logs, and it does not log individual row deletions. 

     3. Click on Get Datasets, and select the dataset from the drop-down. (Datasets created earlier will appear.) 

     4. Check Encrypt Table to encrypt the table. 

     5. Click on Get Tables, and select the dataset table from the drop-down. 

     6. Click Ok.

 

Lumenore Data Loader:

It is used to load bulk data into the database and improves the data loading speed.

  1. Enter the Step name, which will appear in the grid. 
  2. Select the connection that was built earlier. 
  3. Click on Get Tables; select the table from the drop-down. (All the tables present in the selected connection will appear in the drop-down) 
  4. Rejected Table: the user can specify a table to capture records that are rejected during the load. 
  5. Truncate: Truncate is a database operation that removes all data from a table, effectively deleting all rows in the table. This operation is often faster than deleting rows one by one because it does not generate any undo logs or transaction logs, and it does not log individual row deletions. 
  6. Match by Position: source columns are mapped to target columns by position (1st to 1st, 2nd to 2nd, 3rd to 3rd). 

          Match by Name: source columns are mapped to target columns with the same name. 

       7. Click Ok.
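A small sketch of the two column-matching strategies from step 6; the source and target column names are hypothetical:

```python
# Match by Position vs Match by Name, as plain Python mappings.
source_cols = ["emp_id", "emp_name", "emp_salary"]
target_cols = ["id", "name", "salary"]
row = {"emp_id": 7, "emp_name": "Akshat", "emp_salary": 20000}

# Match by Position: 1st source column -> 1st target column, and so on.
by_position = {t: row[s] for s, t in zip(source_cols, target_cols)}
print(by_position)  # {'id': 7, 'name': 'Akshat', 'salary': 20000}

# Match by Name: columns map only where source and target names match.
target_cols_named = ["emp_id", "emp_name", "emp_salary"]
by_name = {c: row[c] for c in target_cols_named if c in row}
print(by_name)
```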

  

Vertica Bulk Loader:

Vertica bulk loader is a high-speed data loading tool designed to load large volumes of data into Vertica, a columnar data warehouse. It is optimized for high-performance data loading and can load data into Vertica at up to several gigabytes per second. 

 

  • Vertica bulk loader uses various techniques for high-speed data loading, including parallelization and compression. By breaking up the data into smaller chunks and loading them in parallel, the Vertica bulk loader can achieve high throughput rates. 
  • Vertica bulk loader is typically used when large volumes of data need to be loaded into Vertica quickly and efficiently. For example, it might load data from log files or sensor data streams into Vertica for analysis. It is also commonly used in data warehousing scenarios where large data sets must be regularly loaded into Vertica. 
  1. Enter the Step name, which will appear in the grid. 
  2. Select the connection that was built earlier. 
  3. Click on Get Tables; select the table from the drop-down. (All the tables present in the selected connection will appear in the drop-down) 
  4. Rejected Table: the user can specify a table to capture records that are rejected during the load. 
  5. Truncate: Truncate is a database operation that removes all data from a table, effectively deleting all rows in the table. This operation is often faster than deleting rows one by one because it does not generate any undo logs or transaction logs, and it does not log individual row deletions. 
  6. Match by Position: source columns are mapped to target columns by position (1st to 1st, 2nd to 2nd, 3rd to 3rd). 

          Match by Name: source columns are mapped to target columns with the same name. 

      7. Click Ok.
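As a hedged illustration of what a high-speed Vertica load looks like under the hood, the sketch below streams a CSV file through Vertica’s COPY statement using the open-source vertica-python driver; the connection details, file, and table names are placeholders, and this is not necessarily how Lumenore’s step is implemented:

```python
import vertica_python  # pip install vertica-python

# Placeholder connection details for a hypothetical Vertica cluster.
conn_info = {
    "host": "vertica.example.com", "port": 5433,
    "user": "dbadmin", "password": "***", "database": "analytics",
}

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    with open("salaries.csv") as f:
        # COPY streams the file into the table in bulk rather than
        # inserting row by row. REJECTED DATA AS TABLE captures rows
        # that fail to load, which is the role the Rejected Table
        # option plays above.
        cur.copy(
            "COPY src_data_salary FROM STDIN DELIMITER ',' "
            "REJECTED DATA AS TABLE salary_rejects",
            f,
        )
    conn.commit()
```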

 

Table Output:

Tables are used to store and organize data. The output tables in ETL are the result of transforming and processing the data from the source system. 

The structure and content of the output tables depend on the business requirements and the transformation rules applied during the ETL process. Generally, the output tables should be designed to meet downstream applications’ reporting or analytical needs. 

  1. Enter the Step name, which will appear in the grid. 
  2. Mode: The user has to select the mode type: 
  • Append: “Append” refers to the process of adding new data or records to an existing dataset or database. When appending, data is typically added to the end of the existing dataset or database without modifying or deleting any existing data. 
  • Truncate & Load: Truncate is a database operation that removes all data from a table, effectively deleting all rows in the table. This operation is often faster than deleting rows one by one because it does not generate any undo logs or transaction logs, and it does not log individual row deletions. 
  • Update Insert: This will update the current data and insert the new data into the file.  

For example, the current data in the existing table is as follows: 

 

Now we are trying to update Dhoni’s city to Ranchi and insert a new record for Rohit using UPDATE INSERT mode. Here Dhoni’s city should change to Ranchi, and Rohit’s record should be inserted. 

 

While loading the data into the table, the user should select the mode ‘Update Insert’ and select key columns based on their requirement. 

 

The output of the graph: 

 

  • Delete Insert: This deletes the current data and adds only the new data to the table.  

 

For example, the current data in the existing table is as follows: 

 

We are now trying to insert a new record using DELETE INSERT mode. Here we are selecting id as the Key Column. Here Dhoni’s and Kohli’s records have to be deleted, and Rohit’s record should be inserted. 

The output of the graph: 

 

     3. Click on Get Tables; select the table from the drop-down. (All the tables present in the selected connection will appear in the drop-down) 

     4. Batch: A batch is a group of rows; the user can select the batch size. 

For example, if the table has 5000 rows, the user can make 5 batches of 1000 rows (see the sketch after these steps). 

     5. Click Ok. 
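A minimal sketch of the batching behavior from step 4, writing 5000 rows as 5 batches of 1000; sqlite3 and the schema are stand-ins:

```python
import sqlite3

# Batched loading: commit in fixed-size chunks instead of one large
# transaction. Table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, value TEXT)")

rows = [(i, f"row-{i}") for i in range(5000)]
batch_size = 1000

for start in range(0, len(rows), batch_size):
    batch = rows[start:start + batch_size]
    conn.executemany("INSERT INTO target VALUES (?, ?)", batch)
    conn.commit()  # one commit per batch keeps transactions small

print(conn.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # 5000
```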

 

Target File:

The “target file” refers to the file or database table where the transformed data is ultimately loaded or stored. The user can download the data in CSV/Excel or other formats through the target file. 

  1. Enter the Step name, which will appear in the grid. 
  2. Enter the File Name. 
  3. Enter the File Type (for example, CSV or Excel). 
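A minimal sketch of a target-file step: query the loaded table and write the rows to a CSV file. The table, columns, and file name are hypothetical:

```python
import csv, sqlite3

# Export transformed data to a CSV target file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, name TEXT, city TEXT)")
conn.executemany("INSERT INTO target VALUES (?, ?, ?)",
                 [(1, "Dhoni", "Ranchi"), (2, "Rohit", "Mumbai")])

cur = conn.execute("SELECT id, name, city FROM target")
with open("target_output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cur.description])  # header row
    writer.writerows(cur)                                 # data rows
```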
