Infosys’ blog on industry solutions, trends, business process transformation and global implementation in Oracle.


ETL vs. ELT

Data warehousing is going through a rapid transformation, driven by big data, sensor data, social media data, and endless behavioral data about website and mobile app users. This has given rise to new methodologies; ELT, for example, is heard more and more in today's analytical environments. We have used ETL in the past, but what happens when the "T" and the "L" are switched, given that both approaches serve the same need?


etl1.png

What is ETL?

ETL is an acronym for Extract, Transform and Load. The process extracts data from outside sources or transactional systems such as various RDBMS sources, transforms it to fit the model of the data warehouse or operational needs (for example, by performing calculations or concatenations), and then loads it into the target database or data warehouse.
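
As a rough illustration of the three steps, here is a minimal Python sketch using the standard-library sqlite3 module. The table names (orders, dim_orders), the columns, the sample rows and the helper function names are all assumptions made up for this example; a real ETL tool would look quite different, but the order of the steps is the same.

```python
import sqlite3

def extract(source):
    # Extract: read raw rows from the transactional source system.
    return source.execute(
        "SELECT first_name, last_name, quantity, unit_price FROM orders"
    ).fetchall()

def transform(rows):
    # Transform: runs outside the target database (plain Python stands in
    # for the ETL tool). One concatenation and one calculation.
    return [(f"{first} {last}", qty * price) for first, last, qty, price in rows]

def load(target, rows):
    # Load: write the already-transformed rows into the warehouse table.
    target.execute(
        "CREATE TABLE IF NOT EXISTS dim_orders (customer TEXT, total REAL)"
    )
    target.executemany("INSERT INTO dim_orders VALUES (?, ?)", rows)
    target.commit()

def run_etl(source, target):
    load(target, transform(extract(source)))

def make_demo_source():
    # Hypothetical source data, only for this example.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (first_name TEXT, last_name TEXT, "
        "quantity INTEGER, unit_price REAL)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [("Asha", "Rao", 2, 10.0), ("Ben", "Lee", 1, 4.5)],
    )
    source.commit()
    return source

if __name__ == "__main__":
    demo_target = sqlite3.connect(":memory:")
    run_etl(make_demo_source(), demo_target)
    print(demo_target.execute("SELECT customer, total FROM dim_orders").fetchall())
```

Note that the target only ever sees the finished columns (customer, total); the raw source columns never reach the warehouse.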


etl2.png


The ETL process works like a pipeline, and data should flow steadily through it. Unlike a physical pipeline, an ETL tool can accept more data than it was sized for, but if it takes in too much it fails with memory or disk space errors. In other words, the ETL tool holds the data until the output is written, so the less work it has to do on the incoming data, the more easily it can keep data flowing all the way through the process.
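
One common way to keep data flowing without exhausting memory is to move it in fixed-size batches instead of holding the whole data set at once. The sketch below only illustrates that idea; the helper names (batched_rows, run_pipeline), the batch size and the list used as a stand-in for the target table are assumptions for the example.

```python
from itertools import islice

def batched_rows(rows, batch_size):
    # Yield lists of at most batch_size rows, so the pipeline holds
    # only one batch of input at a time.
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

def run_pipeline(source_rows, transform, sink, batch_size=1000):
    # Extract -> transform -> load, one batch at a time.
    # `sink` is a plain list standing in for the target table.
    loaded = 0
    for batch in batched_rows(source_rows, batch_size):
        sink.extend(transform(row) for row in batch)
        loaded += len(batch)
    return loaded

if __name__ == "__main__":
    target_rows = []
    count = run_pipeline(range(10), lambda value: value * 2, target_rows, batch_size=4)
    print(count, target_rows)
```

With a real database as the sink, each batch would be written and released before the next one is read, which is what keeps memory use bounded.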

 

Advantages of ETL:

  • Development is less complex and takes less time, since the process is designed around the required output.

  • Many tools on the market implement ETL, which gives flexibility in choosing the right one.

  • The ETL process is mature and has well-defined best practices and processes.

  • Since ETL has been in use for a decade or more, ETL experts are available in abundance.

 

Disadvantages of ETL:

  • It loads only the data types defined at design time, so adding a new data type costs time and money. Hence it is less flexible.

  • The hardware, maintenance and licensing costs of ETL tools are high.

  • ETL tools are mostly limited to processing relational data.


What is ELT?

ELT is a different approach to data movement. Once the data is extracted, it is loaded unaltered into a staging table or the target database, and the target system then performs the transformation.


etl3.JPG



ELT is a cost-effective process, since loading the data takes less time than in the ETL model.
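
A minimal Python sketch of this load-first approach, again using the standard-library sqlite3 module, might look as follows. The staging and warehouse table names (stg_orders, dim_orders), the columns and the sample rows are assumptions for illustration only; the point is that the transformation is expressed in SQL and executed by the target database itself.

```python
import sqlite3

def load_raw(target, rows):
    # Load: copy the extracted rows into a staging table unchanged.
    target.execute(
        "CREATE TABLE IF NOT EXISTS stg_orders "
        "(first_name TEXT, last_name TEXT, quantity INTEGER, unit_price REAL)"
    )
    target.executemany("INSERT INTO stg_orders VALUES (?, ?, ?, ?)", rows)
    target.commit()

def transform_in_target(target):
    # Transform: the target database does the work, using its own SQL engine.
    target.execute("DROP TABLE IF EXISTS dim_orders")
    target.execute(
        """
        CREATE TABLE dim_orders AS
        SELECT first_name || ' ' || last_name AS customer,
               quantity * unit_price          AS total
        FROM stg_orders
        """
    )
    target.commit()

if __name__ == "__main__":
    # Hypothetical rows, as they might arrive from the extract step.
    extracted = [("Asha", "Rao", 2, 10.0), ("Ben", "Lee", 1, 4.5)]
    warehouse = sqlite3.connect(":memory:")
    load_raw(warehouse, extracted)
    transform_in_target(warehouse)
    print(warehouse.execute("SELECT customer, total FROM dim_orders").fetchall())
```

Because the raw rows stay in the staging table, a new transformation can be added later without extracting the data again.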

 

Advantages of ELT:

  • Since the ELT process loads and transforms data in smaller chunks, it makes project management easier.

  • It uses the same hardware for processing and storage, minimizing additional hardware cost.

  • It can process both semi-structured and unstructured data.

  • It is flexible: because the data is both stored and processed in the target system, it can be merged easily into the warehouse structure.

 

Disadvantages of ELT:

  • Since ELT has not been widely used, its processes are less mature. However, many industries are implementing ELT and its popularity is increasing.

  • Due to limited usage, the choice of implementation tools is limited.

  • Limited adoption has also affected the availability of ELT experts. This is changing, however, as more people work with ELT.

 

Difference between ELT and ETL

ELT should be used when there are large volumes of data, for example on a Hadoop cluster, a cloud installation or a data appliance, or when the source and target databases are the same.

ETL, by contrast, is designed as a pipeline: as the data runs from the source system to the target, it is transformed by the ETL tool, whereas in ELT the transformations are carried out by the target database.

Thus, although an ELT implementation is more intricate than the pipeline approach of ETL, it is often preferred. Designing ELT systems takes more time, but the payoff is worth it.

 

Why can we shift from ETL to ELT?

ETL helped to cope with the restrictions of old-style data center infrastructures, which, with the cloud, are no longer a barrier. For organizations with large data sets, even of only a few terabytes, load time can run to hours, depending on the complexity of the transformation rules.

ELT is the future of data warehousing. Businesses of any size can take advantage of current technologies to analyze larger pools of data quickly, with less maintenance.

 

Conclusion

Using ELT has the following benefits:

  • It has the capability to work with big data.

  • It is more efficient when the data sets are used in "schema on read" analytics.

  • It can leverage the DBMS for both processing and storage.

  • Since it doesn't require an ETL tool, we can eliminate the ETL server.

These are a few of the benefits that ELT offers now. Down the road, the scope of the technology will potentially grow as built-in data integration tools for Hadoop and NoSQL solutions continue to evolve.
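
To make the "schema on read" point a little more concrete, here is a small Python sketch in which raw JSON records are stored exactly as received and a schema is applied only when the data is read. The table name raw_events, the field names and the sample records are hypothetical, and this is just one simple way to illustrate the idea, not how any particular platform implements it.

```python
import json
import sqlite3

def load_raw_events(db, raw_lines):
    # Load: store each record as received; no schema is enforced on write.
    db.execute("CREATE TABLE IF NOT EXISTS raw_events (payload TEXT)")
    db.executemany(
        "INSERT INTO raw_events VALUES (?)", [(line,) for line in raw_lines]
    )
    db.commit()

def read_with_schema(db, fields):
    # Read: apply the schema (the fields this analysis cares about) only now.
    # Records that lack a field yield None instead of failing the load.
    result = []
    for (payload,) in db.execute("SELECT payload FROM raw_events ORDER BY rowid"):
        record = json.loads(payload)
        result.append(tuple(record.get(field) for field in fields))
    return result

if __name__ == "__main__":
    events = sqlite3.connect(":memory:")
    load_raw_events(
        events,
        ['{"user": "a", "page": "/home"}', '{"user": "b", "device": "mobile"}'],
    )
    print(read_with_schema(events, ["user", "device"]))
```

Two different analyses can read the same stored records with two different field lists, without reloading anything.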


Authors:

Sathya Bhama & Smriti
