About Kumar Gauraw

Kumar Gauraw is a senior IT professional with over 16 years of experience, having served in various capacities at Fortune 500 companies in the fields of Data Integration, Data Modeling, Data Migration, and architecting end-to-end data warehousing solutions. Kumar has strong experience with a wide spectrum of tools and technologies related to BI and Data Integration.

When Complete Compare On Corrupt Data Model Crashes Erwin

January 17th, 2016 | Data Modeling

According to the CA ERwin help documentation, the Cut, Copy and Paste functions are fully supported. You can simply select the model objects you want to copy and paste them into a new or existing model diagram without any problem. You can read more about this functionality here: https://supportcontent.ca.com/cadocs/0/CA%20ERwin%20Data%20Modeler%20r9%205-ENU/Bookshelf_Files/HTML/ERwin%20Help/index.htm?toc.htm?Cut_Copy_and_Paste_Functions.html However, the development community seems to be [...]

Actually, Spark Adds Power To Hadoop In Real-Time Processing

January 5th, 2016 | Big Data

Since Apache Spark came into existence in 2014, it has received massive recognition, and the developer community has loved it, all for good reasons. Apache Spark is a fast, in-memory data processing engine with elegant development APIs that allow developers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. However, [...]

How To Load Star Schema Dimensions And Facts In Parallel

December 19th, 2015 | Data Integration Concepts

One of the bottlenecks often encountered in ETL development for a star schema is the need to load the dimension tables before any fact table can be loaded. This happens because fact tables need the surrogate keys from the dimension tables in any dimensional data model. I always wondered if there could be a better way [...]

Using Oracle’s Partition Exchange Load for Very Large Target Tables

September 15th, 2015 | Oracle

A very common situation in Data Integration is that a well-performing ETL load starts to slow down. A job that has been running and loading data just fine suddenly becomes too slow and turns into a bottleneck. There are many technical explanations for why this happens. Just to name a few: stale statistics, outdated hardware, software, [...]

The Relationship Between Data Quality and Master Data Management

May 27th, 2015 | Miscellaneous

Master Data Management (MDM) is the process of creating and managing quality data in such a way that an organization can have a single master copy of its master data, such as customer, product, and item data. Usually, non-transactional data is referred to as master data within the organization, and such data can include customers, suppliers, [...]

Dimensional Modeling for Support Ticket Processing

May 19th, 2015 | Data Modeling

Dimensional modeling can become challenging in situations that are uncommon or not straightforward. Dimensional modeling with multiple stages of processing can be especially confusing and difficult. Such is the nature of a ticket-based support system, where a single ticket can go through multiple stages and multiple people. So, how do you do dimensional modeling for [...]

Benefits and Limitations of Using Virtual Columns in Oracle Database

May 9th, 2015 | Oracle

As the name suggests, Oracle introduced the concept of defining columns on a table that do not actually store values. Instead, a virtual column uses an expression based on other columns in the same table. A very interesting concept. Although these virtual columns appear as normal columns of the table when queried, they are actually not [...]

CA ERwin-Using Naming Standards, Model Templates

May 1st, 2015 | Data Modeling

Data modeling (relational or dimensional) is a job that needs more thinking than doing. It has more to do with gaining a better understanding of business processes and the data that flows through those processes than with knowing a bunch of technical terms or tools. However, having a good technical handle on tools like CA ERwin [...]

Hive Query Performance Optimization On Hadoop For Big Data

March 31st, 2015 | Big Data

If you have been around big data for any length of time and have worked on Hadoop, you have seen plenty of Pig and Hive. Even if you are just learning Hadoop and big data, Pig and Hive must seem to be the things you should know in order to have some control over big data. Well, that’s the [...]

How Master Data Management And Big Data Relate

March 30th, 2015 | Big Data

Master data is basically a shared master copy of data, such as customer, product, employee, supplier, and location data, used by several applications within an enterprise. Master Data Management (MDM) is important to organizations today because it allows an enterprise to have a single version of the truth. Without clearly defined master data, any [...]