Data warehouse Design Using Oracle

Unit Objective:

This unit will cover various schema design of the data warehousing modelling. It will describe schema table such as fact table and dimension table.  The practical part explains details of creating dimension table using oracle statement level, hierarchy and attributes. Displaying dimension, validation of dimension and altering dimension.

1.1 Fact table


Figure 1-2 Fact Table
fig1-2_3cQieKc.jpgFigure 1-2 Fact Table

Figure 1-2 Fact Table


1.2 Dimension Table

Dimension is a collection of reference information about a measurable in the fact table. 

The primary key column of the dimension table has uniquely identifies each dimension record or row.

The dimension tables are organized has descriptive attributes.  For example, a customer dimension’s attributes could include first and last name, birth date, gender, Qualification, Address etc.,

1.3  Star Schema

Star Schema a Multidimensional data representation of relational database schema

A star schema contains two types of tables – Fact table and dimension table.

A Fact table in the middle connected to a set of dimension tables

In a star schema, a dimension table will not have any parent table

Each dimension table have primary key that corresponds exactly to one of the components of the composite key in the fact table

steps of designing star schema

1.  Identify the business process. i.e Fact table For example - sales, reservation 

2. Identify the measures or fact data i.e in the sales fact table  net profit, sales_income, No_of _unit_sold.

3. Identify the Dimension table related to fact table. For example product, Time, Location,Branch

4. List of attributes or column in each dimension table

5.  Find the lowest level of aggregation or summary analysis in the fact table. Ex. Total sales income.

star schema
fig1-3_star_schema.jpgstar schema

star schema


1.3.1 Snowflake Schema

Snowflake Schema is a refinement of star schema where some dimensional hierarchy is normalized into third normal form and forms a set of smaller dimension tables.

 Snowflake schema keeps same fact table structure as star schema.

In the dimension, it has multiple levels with multiple hierarchies. From each hierarchy of levels any one level can be attached to Fact Table.

 Mostly lowest level hierarchy is attached to Fact Table. These hierarchies help to drill down the data from topmost hierarchies to the lowermost hierarchies.

the tables which describe the dimensions will already exist and will typically be normalized.

Snow Flakes
fig1-22snowflakes.jpgSnow Flakes

Snow Flakes


1.3.2. Galaxy schema or Fact cancellation schema

  •   Multiple fact tables share dimension tables.
  •  This schema is viewed as collection of stars hence called galaxy schema or fact constellation.
  •  Sophisticated application requires such schema.
  • For Data warehouse the fact constellation schema is commonly used since it can model multiple, interrelated subjects.


galaxy
galaxy_ss72lBp.jpggalaxy

galaxy


1.4 CREATE DIMENSION statement

Before creating dimension, first create  product table as a base table and then create dimension table product_dim.

conn  hr/hr

Create table product( product_id number(5) primary key,prod_category varchar2(15),product_subcategory varchar2(15),prod_category_desc varchar2(20),Prod_subcate_desc varchar2(20),prod_price number(8,3),prod_status varchar2(5),brand_name varchar2(15),supplier_name varchar2(20));


The SQL CREATE DIMENSION statement supports the specification of levels, hierarchies  and attributes. 

SYNTAX:

CREATE DIMENSION <DIMENSION_NAME>

  LEVEL <LEVEL NAME> IS<TABLE_NAME.COLUMN NAME>

HIERARCHY <HIERARCHY NAME>

(<CHILD LEVEL_NAME> CHILD OF <PARENT LEVEL_NAME>)

ATTRIBUTE <LEVEL NAME> DETERMINES (<COLUMN NAME>)

1.5  LEVEL

LEVEL is a first part of a Dimension declaration, which involves the specification of levels.

All Dimension names are unique in the database, dimension is a database object like table, index, view etc.,  If product_dim is already exist then for adding or modifying level use ALTER dimension otherwise drop dimension product_dim again re-create CREATE DIMENSION product_dim

 Most dimensions involve multiple levels, in the example there are three Levels namely pid, prod_sub_cate, prod_cate

CREATE DIMENSION product_dim

LEVEL pid IS (product.product_id)

LEVEL prod_sub_Cate IS (product.product_subcategory) 

LEVEL prod_cate IS (product.prod_category)

create Level

fig1-5_create_level.jpg
create Level

create Level

Suppose HR user don't privilege have to create dimension, then grant create dimension privilege to Hr user then create dimension.

1.5.1 Identical Level name

All the level name in the dimension  to be unique not identical. If both the level has the same  level name. So, it gives error message.

Identical Level name
fig1-8_identical_level_name.jpgIdentical Level name

Identical Level name


1.5.2.  Don't repeat the same column as level in different dimension

Using same column not allow to create as level even though the dimension is different and level name is different. Therefore only one time the column name of the table is allowed to create dimension. Duplicating the column for level and dimension is not allowed.

same column name don't allow to create level
fig1-9_same_level_dont_repeat_diff_dim.jpgsame column name don't allow to create level

same column name don't allow to create level


1.6 Hierarchies

Hierarchies are used for data aggregation in data warehousing, it is  logical structures that use ordered levels of organizing data.

Example, time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level.

Using hierarchies to drill down into data in the data warehousing to  view different levels of granularity.

For example, city is a child of state (because you can aggregate city-level data up to state).

A hierarchical relationship is a functional dependency from one level of a hierarchy to the next level in the hierarchy, the CHILD OF relationship denotes that each child's level value is associated with one and only one parent level value

If the columns of a parent level and child level are in different relations, then the connection between them also requires a 1:n join relationship

Column of each hierarchy level are non-null that hierarchical integrity should be maintained.

Hierarchy level cannot be joined to itself either directly or indirectly.

A single dimension definition can contain multiple hierarchies.

Non-hierarchical dimension are referred as “Flat Dimension”, there is only a single level in a dimension

Here, we already created product_dim, so ALTER dimension add hierarchy or otherwise drop dimension product_dim and re-create product_dim again.

ALTER DIMENSION product_dim Add

HIERARCHY prod_rollup

( pid CHILD OF

prod_sub_cate CHILD OF

prod_cate)

Hierarchy
fig1-6_hierarchy.jpgHierarchy

Hierarchy


1.6.1. Hierarchy Using Join key

  • Join key is used in snowflakes schema  for joining parent dimension and child dimension. 
  • Hierarchy level cannot be joined to itself either directly or indirectly.

step 1: drop the brand_name from product table because it is normalized into another dimension called brand table

alter table product drop column brand_name;

step 2: Normalize the product table and create new table brand with three column

create table brand(brand_id number(5) primary key, brand_name varchar2(15),brand_desc varchar2(20));

Step 3: Alter the table product and create foreign key relationship between product and brand table. 

alter table product 

add 

brand_id number(5) references brand(brand_id);

step 4: create level for brand table primary key

step 5: Alter dimension add the level brand id

ALTER DIMENSION PROCUT_DIM 

ADD

LEVEL BID IS BRAND.BRAND_ID;

step 6: create hierarchy brand_rollup using join key with product brand_id and brand.brand_id

ALTER DIMENSION PROCUT_DIM 

ADD

HIERARCHY BRAND_ROLLUP

(PID CHILD OF BID

JOIN KEY PRODUCT.BRAND_ID REFERENCES BID);


1.7 Attributes

  •  The ATTRIBUTE clause defines functional dependency relationships involving dimension levels and non source columns in dimension tables.
  •  The attribute_clause specify the columns that are uniquely determined by a hierarchy level. There must be a 1:1 attribute relationship between hierarchy levels and their dependent dimension attributes

 The dependent_columns need not have been specified in the level_clause. For example prod_status is not specified in the level name.

ALTER DIMENSION PRODUCT_DIM ADD

ATTRIBUTE PID DETERMINES PROD_STATUS;

                                           OR 

ATTRIBUTE PID DETERMINES PRODUCT.PROD_STATUS;

 The columns in level must all come from the same table as the dependent_columns.

ALTER DIMENSION PRODUCT_DIM ADD

ATTRIBUTE PID DETERMINES BRAND.BRAND_DESC;

because level pid from product table but attributes brand_desc  from brand table.

Attribute level table differ from column table
fig1-12_hierarchy_5le8veR.jpgAttribute level table differ from column table

Attribute level table differ from column table


1.7.1 Extended_attribute_clause

The only difference is that this clause assigns a name to the attribute that is different from the level name.

If the extended_attribute_clause is used to create multiple columns determined by a hierarchy level.

ALTER DIMENSION PRODUCT_DIM 

ADD

ATTRIBUTE SUPP LEVEL PID DETERMINES SUPPLIER_NAME;

In the above example SUPP is attribute name.

1.8 ALTER DIMENSION

If dimension is already exist then using alter dimension statement it is easy to include hierarchy or level or attributes into the dimension. In the same manner removing or modifying the level, hierarchy or attribute also possible using alter dimension...drop  statement.

1.8.1 Alter Dimension using ADD

Add level

ALTER DIMENSION PRODUCT_DIM

ADD 

LEVEL PPRICE IS PRODUCT.PROD_PRICE;

Add Hierarchy

ALTER DIMENSION PRODUCT_DIM

ADD

HIERARCHY PRICE_ROLLUP

(PID CHILD OF PROD_CATE CHILD OF PPRICE);

Add Attribute

ALTER DIMENSION PRODUCT_DIM

ADD

ATTRIBUTE psubcate level pid  DETERMINES

(product.product_subcategory, Prod_subcate_desc);

1.8.2 Alter Dimension using DROP

CASCADE and RESTRICT is used for restricting DROP IN ALTER DIMENSION.

 

cascade and restrict using drop dimension
fig1-11_alter_dim_using_drop_cascade_and_restrict.jpgcascade and restrict using drop dimension

cascade and restrict using drop dimension

you can drop one attribute column without dropping them all.

The following statement illustrates how you can drop a single column without dropping all columns:

ALTER DIMENSION product_dim

DROP ATTRIBUTE psubcate LEVEL pid  COLUMN Prod_subcate_desc;

1.9.  DROP Dimension

A dimension is removed using the DROP DIMENSION statement. For example:

 DROP DIMENSION <Dimension name>

DROP DIMENSION product_dim;

Drop dimension
fig1-7_drop_dim_rqTZPKR.jpgDrop dimension

Drop dimension


1.10 Display Dimension

As mentioned earlier in the database schema, the dimension is one of the objects like table, index. This statement gives how many user objects created by the user, here in where condition given object type is dimension. So it will display how many dimension created by user.   Like "select * from tab" show how many tables are existed in the current user same manner it will show how many dimensions are created by the user.

SQL> SET LINESIZE 400;

SQL> select object_type,object_name FROM user_objects where object_type='DIMENSION';

OBJECT_TYPE         OBJECT_NAME

------------------- ---------------------------------------------------------------------

DIMENSION           PRODUCT_DIM

1.11 Dimension status

“All_dimensions” gives the number of dimension created by the user and also tells the status of each dimension table whether it is, ERROR, NEEDS_COMPILE, VALID etc.

dimension status
fig1-14.jpgdimension status

dimension status


1.12 Retrieve Dimension

To view the definition of a dimension, use the DESCRIBE_DIMENSION procedure in the DBMS_DIMENSION package.

SET SERVEROUTPUT ON FORMAT WRAPPED; --to improve the display of info 

EXECUTE DBMS_DIMENSION.DESCRIBE_DIMENSION ('PRODUCT_DIM');

 it will display how many level, hierarchy and attributes are available within the dimension. 

Retrieve dimension
fig1-13_describe_dimension.jpgRetrieve dimension

Retrieve dimension


1.13 Validate Dimension

For verifying the validity of the relationship specified in the dimensions  use DBMS_DIMENSION.VALIDATE_DIMENSION procedure periodically.

This procedure is easy to use and has only four parameters: 

* dimension:  who created the dimension or owner of the dimension object 

* incremental: set to TRUE to check only the new rows for tables of this dimension. 

* check_nulls: set to TRUE to verify that all columns that are not in the levels containing a SKIP WHEN NULL clauses are not null. 

* statement_id: user-supplied unique identifiers to identify the result of each run of the procedure


In this example dimension='hr.product_dim', incremental = FALSE, check_nulls=TRUE, statement_id='error catch'

EXECUTE DBMS_DIMENSION.VALIDATE_DIMENSION('hr.product_dim',FALSE, TRUE,'error catch');

1.13.1  Dimension_Exceptions

Before running the VALIDATE_DIMENSION procedure, need to create a local table, DIMENSION_EXCEPTIONS.

 

Dimension Exception
fig1-15_dimension_exception.jpgDimension Exception

Dimension Exception

If the VALIDATE_DIMENSION procedure encounters any errors, the errors are stored in DIMENSION_EXCEPTIONS table.  Querying this table will identify the exceptions that were found. 

For validating Dimension

Step 1: check whether dimension_exception table is exit or not,  then it will be created only once. 

creating dimension exception
fig1-16_dimension_exception.jpgcreating dimension exception

creating dimension exception

Step 2: suppose it is not exist, then create the dimension_exceptions table.

create table dimension_exceptions(STATEMENT_ID VARCHAR2(25),OWNER VARCHAR2(25),TABLE_NAME VARCHAR2(20),DIMENSION_NAME   VARCHAR2(25),RELATIONSHIP VARCHAR2(25),BAD_ROWID VARCHAR2(35));

Step 3: Execute the validate_dimension procedure

EXECUTE DBMS_DIMENSION.VALIDATE_DIMENSION('hr.product_dim',FALSE, TRUE,'error catch');

Step 4: check there is any error in the dimension exception stored

display dimension exception
fig1-17_display_dimension_exceptions.jpgdisplay dimension exception

display dimension exception

At present, there is no error occurred in the dimension_exceptions table. 

Step 5:  check with errors

Try to insert two records in the base table store.

SQL> Insert into product(PRODUCT_ID,PROD_CATEGORY) values (11,'Electronics');

1 row created.

SQL> insert into product (PRODUCT_ID,PROD_CATEGORY) values(11,'Cosmetics');

insert into product (PRODUCT_ID,PROD_CATEGORY) values(11,'Cosmetics')

*

ERROR at line 1:

ORA-00001: unique constraint (HR.SYS_C0011199) violated

Select * from dimension_exceptions;

Error Dimension Exception
fig1-18_error_dimension_exception.jpgError Dimension Exception

Error Dimension Exception


1.13.2 Revalidate Dimension status

Suppose after creating the dimension for product table called “product_dim”, alter the structure of the  base table any column data type or size  of the column. 

Describe product table
fig1-19_desc_product.jpgDescribe product table

Describe product table

Modify the prod_category varchar2(15)  into prod_category varchar2(20);

select dimension_name, compile_state from all_dimensions;

Alter table product modify (prod_category varchar2(20));

compile state
fig1-20_compile_state.jpgcompile state

compile state


To revalidate the dimension “product_dim”, use the COMPILE option; now check the status of dimension again, the status has changed as “VALID”

Alter dimension Product_dim compile;

select dimension_name, compile_state from all_dimensions;

product compile state
fig1-21_valid_state_of_product_dim.jpgproduct compile state

product compile state


Unit summary

At the end of the unit, student can learn how to design schema and how to create, modify and display the dimension. They also understand status of dimension.

Practical Exercise

Using hr schema,

Create dimension patientinfo_dim using patient table. The patient table has following columns patient id, patient name, DOB, gender, Address, disease, test, medicine attributes.

The patient table normalized into doctor table which has following attributes doctor id, doctor name, yrs_of_experience and specialization.

Design patientinfo_dim has hierarchy name as "medicine_rollup" which has following level patient_id, patient_name, disease, test and medicine.

create attributes "patient_info" which has columns are DOB, gender.

 Do the following :

1) Include attribute as "Address" in patientinfo_dim

2) Display hierarchy  and level of patientinfo_dim

3) Display how many dimensions are created by user hr.

4) Remove the level "test" from patientinfo_dim

5) Display the patient-dim status

6)  Include "patient_dr_rollup" hierarchy in patientinfo_dim dimension for joining the patient and doctor  table 

7) Remove the gender column from patient_info attribute

8) Change the disease column size as varchar2(40) in base table product.

9) Check the productinfo_dim validation status and  write appropriate statement to execute  "compile_state" of patientinfo_dim as valid.

10) insert the below given values in the Patient table, patiend_id column is primary key column

patient_id                   Patient_name                 Gender

P11                              Sara                                  F

P11                               Wafa                                F

While inserting check and store dimension  Exception information. 

Question

1. Explain design steps of Star schema

2. Differentiae star and snowflake schema

3.  what type of column data available in fact table

4.  How level is related in hierarchy using create dimension statement.

Video Tutorial Link

YouTube Tutorial 2: Create Dimension Snow Flakes Schema

YouTube Tutorial 3: Create Dimension Statement Syntax

YouTube Tutorial 4: Create Level

 YouTube Tutorial 5- Identical Level Name

YouTube Tutorial 6: Create Hierarchy

YouTube Tutorial 7: Hierarchy using Join Key

YouTube Tutorial 8 : Create Attribute

YouTube Tutorial 9: Alter Dimension using ADD

YouTube Tutorial 10: Alter Dimension Using Drop

YouTube Tutorial 11: Drop Dimension

Lecturer Notes

Lecturer Notes 1: Create dimension Star Schema

Lecturer Notes 2: Create Dimension Snowflakes Schema

Lecturer Notes 3: Create Dimension Statement Syntax

Lecturer Notes 4: Create Dimension Level

Lecturer Notes- OER Unit 1 Schema Design

References

http://www.vertabelo.com/blog/technical-articles/data-warehouse-modeling-star-schema-vs-snowflake-schema

https://learndatamodeling.com/blog/designing-star-schema/

https://docs.oracle.com/cd/B28359_01/server.111/b28313/dimen.htm

Return to top