Wednesday, 22 July 2015

Types of Facts

A fact table is the one which consists of the measurements, metrics or facts of business process. These measurable facts are used to know the business value and to forecast the future business. The different types of facts are explained in detail below.

Additive:

Additive facts are facts that can be summed up through all of the dimensions in the fact table. A sales fact is a good example for additive fact.

Semi-Additive:

Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
Eg: Daily balances fact can be summed up through the customers dimension but not through the time dimension.

Non-Additive:

Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
Eg: Facts which have percentages, ratios calculated.

Factless Fact Table:

In the real world, it is possible to have a fact table that contains no measures or facts. These tables are called "Factless Fact tables".

Eg: A fact table which has only product key and date key is a factless fact. There are no measures in this table. But still you can get the number products sold over a period of time.

A fact tables that contain aggregated facts are often called summary tables

A Case of Factless fact


 


Challenge


When we  group by Doctor to get the total amount and Total patient count which are under  the doctor the total amount is correct because there is a entry in fact table for this. Consider the case of total patient under doctor that is incorrect because not all patients have a bill amount. For this we need to design a factless fact table  

Factless  facts  are  of  two  types --  EVENT TRACKING  and COVERAGE  a event tacking only says who took part in the event while a coverage allows you to answer who did not take part  

Answer is  Simple -- Create a Factless fact table which has entry for each doctor and patient . Sum up the values for Provider you will get number of patients . Then sum up the billing fact details based on provider . Then join both facts based on provider keys and then join them with dimension.

The reason for joining with dimensions later is size of the tables for joining would increase and it would require additional hash joins. Hash joins are CPU intensive and will slow down your system also consider your dimension is 1GB with 50000 doctors and 200 million records for fact you are repeating the doctor name for each of 200 million records and doctor name is 100 KB field then you are adding that much size so your temp size the amount of memory for PGA everything will increase this will cause a slowdown.  

No comments:

Post a Comment