Essay comparing 2 ETL Tools


Assignment Description
Write an essay comparing and contrasting at least 2 enterprise-class ETL solutions. The purpose of this assignment is not to state the obvious but rather to illuminate subtle differences or unexpected similarities between the two (or more) chosen solutions. The essay must be a single MS Word document that follows professional standards. Review the rubric below and be sure to ask if you have any questions. This is a major assignment and will be marked appropriately.
Details
- The essay should compare/contrast at least 2 different solutions in detail and mention other ETL solutions to a lesser degree.
- The purpose of this assignment is to gauge your overall familiarity with ETL solutions available on the market today.
Possible elements you might address
- Which is better, custom-written code or a packaged solution?
- Should performance be considered?
- Are there any privacy or security considerations?
- Are there hardware / cost considerations?
- How long does it typically take to implement?
- What kind of knowledge (user training) does the client need to maintain it in-house? Should outsourcing be considered?
- On-site vs. cloud solutions. Do cost considerations outweigh security concerns?
- Assume the reader already knows what ETL means and has an understanding of the required concepts.
- Be direct, clear, and professional. Do not add fluff or unnecessary details.
- Use proper notation to cite all your sources. While direct comparisons of products can be found on the net, make sure you demonstrate your knowledge and understanding.
- Be sure any sources you use are credible and are as current as possible.
Answered 6 days after Feb 02, 2022


Abhijeet answered on Feb 09 2022
Essay on Comparison of ETL Tools
Student Name
Student ID
Department
Date
    
Contents
1. Introduction
2. Scalability
3. Cloud vs. On-Premises
4. Privacy and Security
5. Performance
6. Cost and Hardware Considerations
7. Conclusion
8. References
1. Introduction
ETL is a process that extracts data from a source, applies transformations, and finally loads the data into a target, in most cases a data warehouse. Most data-related applications rely heavily on ETL systems: ETL is the backbone of analytical tools, BI tools, report generation systems, and similar applications. ETL tools are needed because simple relational databases alone cannot answer complex business questions. ETL enables a company to analyse its data and use it to make strategic business decisions. It also helps companies compare source and target systems. An important feature of ETL is that it can read data from multiple sources, such as relational databases, flat files, JSON, and Kafka streams, and load the output into targets such as flat files and data warehouses.
ETL also performs data migration, moving data between storage systems. It offers two modes of extraction: full and partial. A full extraction loads an entire table or database as it is. A partial extraction loads data from a table conditionally, for example only the records for a particular period, or only the first or last few records. ETL can also reconcile data between two databases or two files, generate reports, and distribute them on a schedule. The extraction step mainly involves data validation, data type checks, and the removal of unwanted and duplicate data.
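The difference between full and partial extraction can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module, not the mechanism of any particular ETL product; the `orders` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3


def make_source():
    """Build a small in-memory source table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, "2022-01-05", 120.0),
            (2, "2022-01-20", 75.5),
            (3, "2022-02-01", 310.0),
            (4, "2022-02-03", 42.0),
        ],
    )
    return conn


def extract_full(conn):
    """Full extraction: read every row of the table as it is."""
    return conn.execute(
        "SELECT id, order_date, amount FROM orders ORDER BY id"
    ).fetchall()


def extract_partial(conn, start_date, end_date):
    """Partial extraction: read only rows inside a date range (inclusive)."""
    return conn.execute(
        "SELECT id, order_date, amount FROM orders "
        "WHERE order_date BETWEEN ? AND ? ORDER BY id",
        (start_date, end_date),
    ).fetchall()


def extract_last_n(conn, n):
    """Partial extraction: read only the last n rows by id, in ascending order."""
    rows = conn.execute(
        "SELECT id, order_date, amount FROM orders ORDER BY id DESC LIMIT ?",
        (n,),
    ).fetchall()
    return rows[::-1]
```

For example, `extract_partial(make_source(), "2022-02-01", "2022-02-28")` returns only the February orders, while `extract_full` returns the whole table.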
The transformation step includes data type conversions and spelling checks. It also enforces data rules, for example that an age cannot exceed a certain value. It filters out columns or records that are not required at the destination. Transformed data is stored in staging tables. Data can also be merged at this stage, for example by joining two CSV files on shared columns, and rows and columns can be transposed.
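The transformation operations described above (rule checks, column filtering, merging, and transposing) can be sketched with the Python standard library. This is an illustrative sketch only; the column names, the `MAX_AGE` limit, and all helper names are hypothetical and not taken from any ETL tool.

```python
import csv
import io

MAX_AGE = 120  # hypothetical upper bound for the age rule


def read_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def validate_age(rows, max_age=MAX_AGE):
    """Convert 'age' to int and split rows into (valid, rejected)."""
    valid, rejected = [], []
    for row in rows:
        try:
            age = int(row["age"])
        except (KeyError, TypeError, ValueError):
            rejected.append(row)  # missing or non-numeric age
            continue
        if 0 <= age <= max_age:
            valid.append({**row, "age": age})
        else:
            rejected.append(row)  # age breaks the data rule
    return valid, rejected


def keep_columns(rows, columns):
    """Drop every column that the destination does not need."""
    return [{c: row[c] for c in columns} for row in rows]


def merge_on(left, right, key):
    """Inner-join two row lists on a shared key column."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def transpose(table):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(column) for column in zip(*table)]
```

A typical sequence would be `read_csv` on each input, `validate_age` to separate bad records, `merge_on` to combine the two inputs, and `keep_columns` before writing to a staging table.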
Loading data into the target system is the last step of ETL. Because the volume of data is large, the loading process must be well optimized. Most organizations load data at night, and the load has to finish before business hours start. Before loading, there should be a checkpoint from which the system can be recovered if the load fails. There are several types of loading: an incremental load, in which data is appended to the existing system or table, and a full refresh, in which the existing data is replaced with freshly loaded data. After loading, certain verifications need to be done, such as comparison with history tables and testing of the BI tools.
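As a rough sketch of the two load types and of recovery after a failed load, the example below uses a database transaction as the checkpoint: if any row fails, the target returns to its pre-load state. It assumes Python's built-in sqlite3 module and a hypothetical `target` table; commercial ETL tools implement checkpointing in their own ways.

```python
import sqlite3


def make_target():
    """Create an empty in-memory target table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT)")
    return conn


def load(conn, rows, mode="incremental"):
    """Load rows into target; return True on success, False on failure.

    mode="incremental" appends to the existing rows.
    mode="full" deletes the existing rows first (full refresh).
    The transaction acts as the checkpoint: on any database error,
    everything done in this call is rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            if mode == "full":
                conn.execute("DELETE FROM target")
            conn.executemany("INSERT INTO target (id, value) VALUES (?, ?)", rows)
    except sqlite3.Error:
        return False
    return True


def row_count(conn):
    """Simple post-load verification: count the rows in the target."""
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

Comparing `row_count` before and after a load is one basic form of the post-load verification mentioned above; a failed full refresh leaves the previous data in place because the delete is rolled back together with the inserts.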
Below is a diagram demonstrating an ETL process.
The sections below cover the areas that matter most when comparing ETL tools.
2. Scalability
Scalability enables high performance and high availability, and it is a crucial element of any ETL tool.
Informatica PowerCenter provides two kinds of scalability: vertical and horizontal. Vertical scalability means adding CPUs to the existing hosts to handle more load, which increases the performance and throughput of the application. Horizontal scalability means increasing the number of hosts and so adding more hardware capacity. The number of instances at the source and the destination can also be increased.
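The general idea behind horizontal scaling, splitting the data into partitions and processing them in parallel, can be sketched generically. The example below is not Informatica PowerCenter's API; it is a hypothetical illustration in which worker threads stand in for additional hosts and `transform` is a placeholder for real transformation logic.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_workers):
    """Split rows into n_workers round-robin partitions."""
    return [rows[i::n_workers] for i in range(n_workers)]


def transform(part):
    """Placeholder transformation applied to one partition."""
    return [value * 2 for value in part]


def run_partitioned(rows, n_workers):
    """Process each partition on its own worker and combine the results."""
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(transform, parts))
    # Sort so the output does not depend on how the rows were partitioned.
    return sorted(value for part in results for value in part)
```

Raising `n_workers` changes how the work is divided but not the result, which is the property that lets a partitioned job be spread over more hosts.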
IBM DataStage is highly...