The best way to start implementing your dimensions and
identifying your fact tables is a question usually raised by the business user.
An example of a question would be, "What if I would like to have all
customer sales by date and region?" There your dimension is the by clause
(Sales date and region in this case). Your fact table then would be the
customer sales amount.
We have two design approaches to creating our dimensions.
The star and the snow flake
Let us start with the simpler one, the start schema. There
are certain rules that apply.
Each dimension is represented by a single dimension table.
Each dimensional table is related to or linked to a fact
table (sometimes, in the case of date dimension it may be a good idea to take
the attribute from the fact table itself).
You can see in the picture below two examples for a star
schema where the fact table would be the employee table and the two dimensions
as linked are region as well as the entity.
Figure 2: Sample of creating dimensions with star
Please note that as you start adding more dimensions tables
to the picture, you are going to arrive at a star shape with the fact table in
the middle. That is why the name is start schema. Another interesting fact is
that the relation is merely between a master and a detail table (Primary Key
(Being a dimension) and a reference key in the fact table).
Snow Flake Schema
Figure 3: Snow Flake representation
I will begin by showing a sample snow flake representation
in picture 3 and then explain using example the analogy.
You can see from the picture that at each branch we can have
multiple branches. The same case applies to the dimension table as you will see
in the analogy presented afterwards.
The other type of dimension creation is the snow flake
schema. The snow starts like a star schema. However, the dimensions are
represented by a more than one dimensional table. Sometimes, the dimension
table is not even related directly to the fact table. In the picture you can
see that the employee table is connected to the entity table directly through
the entity ID. You can also see that the employee table can be linked directly
to the region using the employee region. However, we can still treat the region
as a super dimension of the employee by using the entity as an intermediate
dimension. In the case of filtering employees by entity type, there is no
direct connection as you can see. The only way to go about it is to use the
snow flake schema to filter the data through the use of the entity table. The
picture shows only a branch of a snow flake schema.
Figure 4: A snow flake schema with one branch