what is a junk dimension ?

Answer Posted / sanju

A junk dimension is a convenient grouping of typically
low-cardinality flags and indicators. By creating an
abstract dimension, these flags and indicators are removed
from the fact table while placing them into a useful
dimensional framework.[2] A Junk Dimension is a dimension
table consisting of attributes that do not belong in the
fact table or in any of the existing dimension tables. The
nature of these attributes is usually text or various flags,
e.g. non-generic comments or just simple yes/no or
true/false indicators. These kinds of attributes are
typically remaining when all the obvious dimensions in the
business process have been identified and thus the designer
is faced with the challenge of where to put these attributes
that do not belong in the other dimensions.

One solution is to create a new dimension for each of the
remaining attributes, but due to their nature, it could be
necessary to create a vast number of new dimensions
resulting in a fact table with a very large number of
foreign keys. The designer could also decide to leave the
remaining attributes in the fact table but this could make
the row length of the table unnecessarily large if, for
example, the attributes is a long text string.

The solution to this challenge is to identify all the
attributes and then put them into one or several Junk
Dimensions. One Junk Dimension can hold several true/false
or yes/no indicators that have no correlation with each
other, so it would be convenient to convert the indicators
into a more describing attribute. An example would be an
indicator about whether a package had arrived, instead of
indicating this as “yes” or “no”, it would be converted into
“arrived” or “pending” in the junk dimension. The designer
can choose to build the dimension table so it ends up
holding all the indicators occurring with every other
indicator so that all combinations are covered. This sets up
a fixed size for the table itself which would be 2^x rows,
where x is the number of indicators. This solution is
appropriate in situations where the designer would expect to
encounter a lot of different combinations and where the
possible combinations are limited to an acceptable level. In
a situation where the number of indicators are large, thus
creating a very big table or where the designer only expect
to encounter a few of the possible combinations, it would be
more appropriate to build each row in the junk dimension as
new combinations are encountered. To limit the size of the
tables, multiple junk dimensions might be appropriate in
other situations depending on the correlation between
various indicators.

Junk dimensions are also appropriate for placing attributes
like non-generic comments from the fact table. Such
attributes might consist of data from an optional comment
field when a customer places an order and as a result will
probably be blank in many cases. Therefore the junk
dimension should contain a single row representing the
blanks as a surrogate key that will be used in the fact
table for every row returned with a blank comment field[3]

Is This Answer Correct ?    0 Yes 1 No



Post New Answer       View All Answers


Please Help Members By Posting Answers For Below Questions

Tell me can we override a native sql query within informatica? Where do we do it? How do we do it?

632


Implementation methodology

1260


What is repository manager?

536


How can you generate reports in informatica?

581


What is the difference between power center and power mart? What is the procedure for creating independent data marts from informatica 7.1?

605






How to update or delete the rows in a target, which do not have key fields?

651


What is parallel processing in informatica?

1030


Can we create multiple integration service on single repository?

699


Explain lookup transformation is active in informatica

662


what is informatica metadata?

602


What is a surrogate key?

563


CANNOT USE PARAMETER FILE! Hi all, I am trying to use parameter file for my workflow. This could help me to filter records where CITY = 'Portland' Following is what I have done: **in Designer - create new parameter : $$PARA_FIL, Parameter, String, IsExprVar=TRUE, Initial value = [empty] - Source Qualifier/ Properties/Source Filter: CUSTOMERS.CITY='$$PARA_FIL' **Create Parameter file: C:\Informatica\PowerCenter8.6.0 \server\infa_shared\BWParam\DynamicParamTest.txt $$PARA_FIL='Portland' **Configure workflow to use the parameter file: Edit Workflow/Properties/Parameter Filename: C:\Informatica\PowerCenter8.6.0 \server\infa_shared\BWParam\DynamicParamTest.txt I also configured directory of parameter file for session task. However, I just got this in the session log: [SQ_CUSTOMERS] SQL Query [SELECT CUSTOMERS.CUSTOMER_ID FROM CUSTOMERS WHERE CUSTOMERS.CITY='$$PARA_FIL'] No record has been loaded to target. It seems that the parameter file has not been read. I cannot understand the reason why. Could any of you kindly suggest me anything? Thanks

7797


Differentiate between reusable transformation and mapplet.

574


One of the optimizing technique to improve the session performance is push down optimization,by using push down optimization we push as much as transformation logic to source/target database,but this degrades the d/b performance,how to overcome this?

1749


What is domain and gateway node?

694