Write a short note on data warehousing & three-Tier
Warehouse architecture in detail.



Answer / kashish

Data warehousing covers the techniques for designing,
building, maintaining and retrieving data. A data warehouse
is designed and built to support the decision-making
process within an organization. When the production
databases are copied into the warehouse, it becomes easier
to answer queries without hampering the consistency of the
production system.

A data warehouse is actually a set of new concepts and
important tools that have evolved into a technology. With
the help of data warehousing, it becomes easier for an
organization to overcome the problems it faces in providing
key information to the people who need it.

Over the last two decades, a number of experiences and
technologies have come together to form the field of data
warehousing. It can be described as a well-organized and
resourceful method of managing and reporting on non-uniform
data from sources scattered throughout an organization.

Because it may hold hundreds of gigabytes of transactions,
a data warehouse needs to be large. For this reason,
smaller "data marts" are often designed for an individual
department or product line. A data warehouse system is an
influential and necessary platform for merging data from
old and new applications. Rules can be transferred to a
warehouse without much effort. The prime feature of a data
warehouse is that it collects, records and filters data and
provides basic data to different systems at higher levels.

Here are some of the benefits of a data warehouse:

o With data warehousing, you can provide a common data
model for different interest areas regardless of the data's
source. This makes it easier to report on and analyze
information.

o Many inconsistencies are identified and resolved before
information is loaded into the data warehouse. This makes
reporting and analysis simpler.

o The information in the warehouse is under the control of
its users, so even if the source system is purged over
time, the information can be stored safely and easily for a
longer period.

o Because it is separate from the operational systems, a
data warehouse allows data to be retrieved without slowing
down the operational systems.

o Data warehousing enhances the value of operational
business applications and customer relationship management
systems.

o Data warehousing also supports decision-support
applications such as trend reports, exception reports and
reports that analyze actual performance.
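
As a rough illustration of the first two benefits above, the
sketch below maps records from two source systems onto one
common model and resolves a few inconsistencies before
loading. The source names (billing, CRM), the field names and
the alias table are all hypothetical and are not taken from
any particular product.

```python
from dataclasses import dataclass

# Hypothetical lookup used to resolve inconsistent country spellings.
COUNTRY_ALIASES = {
    "usa": "US",
    "united states": "US",
    "uk": "GB",
    "united kingdom": "GB",
}


@dataclass(frozen=True)
class Customer:
    """Common data model shared by every source (assumed shape)."""
    customer_id: str
    name: str
    country: str


def normalize_country(value: str) -> str:
    """Map known spellings to one code; otherwise just tidy the input."""
    key = value.strip().lower()
    return COUNTRY_ALIASES.get(key, value.strip().upper())


def from_billing(row: dict) -> Customer:
    """Convert a row from the (hypothetical) billing system."""
    return Customer(
        customer_id=str(row["cust_no"]).strip(),
        name=row["full_name"].strip().title(),
        country=normalize_country(row["country"]),
    )


def from_crm(row: dict) -> Customer:
    """Convert a row from the (hypothetical) CRM system."""
    return Customer(
        customer_id=str(row["id"]).strip(),
        name=f'{row["first"]} {row["last"]}'.strip().title(),
        country=normalize_country(row["nation"]),
    )
```

Once both sources produce the same Customer shape, reports can
be written against that one model without caring which system
a record came from.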

THREE-TIER WAREHOUSE ARCHITECTURE IN DETAIL:-
4.1 Architecture Choices
Selection of an architecture will determine, or be
determined by, where the data warehouses and/or data marts
themselves will reside and where the control resides. For
example, the data can reside in a central location that is
managed centrally. Or, the data can reside in distributed
local and/or remote locations that are either managed
centrally or independently.

The architecture choices we consider in this book are
global, independent, interconnected, or some combination of
all three. The implementation choices to be considered are
top down, bottom up, or a combination of both. It should be
understood that the architecture choices and the
implementation choices can also be used in combinations.
For example, a data warehouse architecture could be
physically distributed, managed centrally, and implemented
from the bottom up starting with data marts that service a
particular workgroup, department, or line of business.

4.1.1 Global Warehouse Architecture
A global data warehouse is considered one that will support
all, or a large part, of the corporation that has the
requirement for a more fully integrated data warehouse with
a high degree of data access and usage across departments
or lines-of-business. That is, it is designed and
constructed based on the needs of the enterprise as a
whole. It could be considered to be a common repository for
decision support data that is available across the entire
organization, or a large subset thereof.

© Copyright IBM Corp. 1998

A common misconception is that a global data warehouse is
centralized. The term global is used here to reflect the
scope of data access and usage, not the physical structure.
The global data warehouse can be physically centralized or
physically distributed throughout the organization. A
physically centralized global warehouse is to be used by
the entire organization that resides in a single location
and is managed by the Information Systems (IS) department.
A distributed global warehouse is also to be used by the
entire organization, but it distributes the data across
multiple physical locations within the organization and is
managed by the IS department.

When we say that the IS department manages the data
warehouse, we do not necessarily mean that it controls the
data warehouse. For example, the distributed locations
could be controlled by a particular department or line of
business. That is, they decide what data goes into the data
warehouse, when it is updated, which other departments or
lines of business can access it, which individuals in those
departments can access it, and so forth. However, to manage
the implementation of these choices requires support in a
more global context, and that support would typically be
provided by IS. For example, IS would typically manage
network connections. Figure 5 shows the two ways that a
global warehouse can be implemented. In the top part of the
figure, you see that the data warehouse is distributed
across three physical locations. In the bottom part of the
figure, the data warehouse resides in a single, centralized
location.

Figure 5. Global Warehouse Architecture. The two primary
architecture approaches.

Data for the data warehouse is typically extracted from
operational systems and possibly from data sources external
to the organization with batch processes during off-peak
operational hours. It is then filtered to eliminate any
unwanted data items and transformed to meet the data
quality and usability requirements. It is then loaded into
the appropriate data warehouse databases for access by end
users.
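
The extract, filter, transform and load steps described above
can be sketched as a small batch job. This is only a minimal
illustration using Python's built-in sqlite3 module with
in-memory databases; the table names (orders, fact_orders),
the columns and the "cancelled" filter rule are assumptions
made for the example, not part of the original text.

```python
import sqlite3


def extract(source: sqlite3.Connection) -> list:
    """Read raw rows from the (hypothetical) operational table."""
    return source.execute(
        "SELECT order_id, customer, amount, status FROM orders"
    ).fetchall()


def filter_rows(rows: list) -> list:
    """Drop unwanted items; here, cancelled orders (assumed rule)."""
    return [row for row in rows if row[3] != "cancelled"]


def transform(rows: list) -> list:
    """Clean up names and round amounts to meet quality requirements."""
    return [
        (order_id, customer.strip().upper(), round(amount, 2))
        for order_id, customer, amount, _status in rows
    ]


def load(warehouse: sqlite3.Connection, rows: list) -> None:
    """Write the cleaned rows into the warehouse table."""
    warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    warehouse.commit()


def run_batch(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Run one batch, for example during off-peak hours."""
    rows = transform(filter_rows(extract(source)))
    load(warehouse, rows)
    return len(rows)


def build_demo():
    """Create a tiny operational source and an empty warehouse."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (order_id INTEGER, customer TEXT, "
        "amount REAL, status TEXT)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            (1, " alice ", 19.999, "paid"),
            (2, "bob", 5.0, "cancelled"),
            (3, "Carol", 7.5, "paid"),
        ],
    )
    source.commit()
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    return source, warehouse


if __name__ == "__main__":
    src, wh = build_demo()
    print(run_batch(src, wh), "rows loaded")
```

Keeping each step as its own function makes it easy to swap in
a different source or a stricter filter without touching the
load step.
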
Data Modeling Techniques for Data Warehousing

A global warehouse architecture enables end users to have
more of an enterprisewide or corporatewide view of the
data. It should be certain that this is a requirement,
however, because this type of environment can be very time
consuming and costly to implement.

4.1.2 Independent Data Mart Architecture
An independent data mart architecture implies stand-alone
data marts that are controlled by a particular workgroup,
department, or line of business and are built solely to
meet their needs. There may, in fact, not even be any
connectivity with data marts in other workgroups,
departments, or lines of business. For example, data for
these data marts may be generated internally. The data may
be extracted from operational systems but would then
require the support of IS. IS would not control the
implementation but would simply help manage the
environment. Data could also be extracted from sources of
data external to the organization. In this case IS could be
involved unless the appropriate skills were available
within the workgroup, department, or line of business. The
top part of Figure 6 depicts the independent data mart
structure. Although the figure depicts the data coming from
operational or external data sources, it could also come
from a global data warehouse if one exists.

The independent data mart architecture requires some
technical skills to implement, but the resources and
personnel could be owned and managed by the workgroup,
department, or line of business. These types of
implementation typically have minimal impact on IS
resources and can result in a very fast implementation.
However, the minimal integration and lack of a more global
view of the data can be a constraint. That is, the data in
any particular data mart will be accessible only to those
in the workgroup, department, or line of business that owns
the data mart. Be sure that this is a known and accepted
situation.

Figure 6. Data Mart Architectures. They can be independent
or interconnected.

Chapter 4. Data Warehousing Architecture and Implementation
Choices

4.1.3 Interconnected Data Mart Architecture
An interconnected data mart architecture is basically a
distributed implementation. Although separate data marts
are implemented in a particular workgroup, department, or
line of business, they can be integrated, or
interconnected, to provide a more enterprisewide or
corporatewide view of the data. In fact, at the highest
level of integration, they can become the global data
warehouse. Therefore, end users in one department can
access and use the data on a data mart in another
department. This architecture is depicted in the bottom of
Figure 6. Although the figure depicts the data coming from
operational or external data sources, it could also come
from a global data warehouse if one exists.

This architecture brings with it many other functions and
capabilities that can be selected. Be aware, however, that
these additional choices can bring with them additional
integration requirements and complexity as compared to the
independent data mart architecture. For example, you will
now need to consider who controls and manages the
environment. You will need to consider the need for another
tier in the architecture to contain, for example, data
common to multiple data marts. Or, you may need to elect a
data sharing schema across the data marts. Either of these
choices adds a degree of complexity to the architecture.
But, on the positive side, there can be significant benefit
to the more global view of the data.
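
One way to picture the extra tier holding data common to
multiple data marts is sketched below. It is only an
illustration under stated assumptions: the two marts
(sales_mart, support_mart), the shared customer table and all
column names are hypothetical, and SQLite's ATTACH DATABASE is
used merely as a convenient stand-in for separately owned
databases.

```python
import sqlite3


def build_marts() -> sqlite3.Connection:
    """Create a shared tier plus two separately attached data marts."""
    conn = sqlite3.connect(":memory:")  # "main" plays the shared tier
    conn.execute("ATTACH DATABASE ':memory:' AS sales_mart")
    conn.execute("ATTACH DATABASE ':memory:' AS support_mart")

    # Data common to both marts lives once, in the shared tier.
    conn.execute(
        "CREATE TABLE main.customer (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    # Each department keeps its own data in its own mart.
    conn.execute("CREATE TABLE sales_mart.orders (customer_id INTEGER, amount REAL)")
    conn.execute(
        "CREATE TABLE support_mart.tickets (customer_id INTEGER, severity TEXT)"
    )

    conn.executemany(
        "INSERT INTO main.customer VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )
    conn.executemany(
        "INSERT INTO sales_mart.orders VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )
    conn.executemany(
        "INSERT INTO support_mart.tickets VALUES (?, ?)",
        [(1, "high"), (2, "low"), (2, "low")],
    )
    conn.commit()
    return conn


def customer_overview(conn: sqlite3.Connection) -> list:
    """Combine both marts through the shared customer table."""
    return conn.execute(
        """
        SELECT c.name,
               (SELECT COALESCE(SUM(o.amount), 0)
                  FROM sales_mart.orders AS o
                 WHERE o.customer_id = c.customer_id) AS total_sales,
               (SELECT COUNT(*)
                  FROM support_mart.tickets AS t
                 WHERE t.customer_id = c.customer_id) AS ticket_count
          FROM main.customer AS c
         ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    for row in customer_overview(build_marts()):
        print(row)
```

Because both marts refer to the same customer_id values, a
user in one department can join against the other
department's data. The cost is that someone has to own and
maintain the shared table, which is the added complexity the
text warns about.
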
Interconnected data marts can be independently controlled
by a workgroup, department, or line of business. They
decide what source data to load into the data mart, when to
update it, who can access it, and where it resides. They
may also elect to provide the tools and skills necessary to
implement the data mart themselves. In this case, minimal
resources would be required from IS. IS could, for example,
provide help in cross-department security, backup and
recovery, and the network connectivity aspects of the
implementation. In contrast, interconnected data marts
could be controlled and managed by IS. Each workgroup,
department, or line of business would have its own data
mart, but the tools, skills, and resources necessary to
implement the data marts would be provided by IS.

