General Datawarehousing



“General data warehousing guide for job interview preparation. Explore a list of frequently asked questions (FAQs), with answers, drawn from a number of data warehousing interviews.”



40 General Datawarehousing Questions And Answers

2⟩ Explain SSL?

The Secure Sockets Layer (SSL) is a protocol for managing the security of a message transmission on the Internet. SSL has been succeeded by Transport Layer Security (TLS), which is based on SSL. SSL uses a program layer located between the Internet's Hypertext Transfer Protocol (HTTP) and Transmission Control Protocol (TCP) layers. SSL was included as part of both the Microsoft and Netscape browsers and most Web server products. Developed by Netscape, SSL also gained the support of Microsoft and other Internet client/server developers and became the de facto standard until it evolved into Transport Layer Security. The "sockets" part of the term refers to the sockets method of passing data back and forth between a client and a server program in a network, or between program layers in the same computer. SSL uses public-and-private key encryption (originally the system from RSA), which also includes the use of a digital certificate.

TLS and SSL are an integral part of most Web browsers (clients) and Web servers. If a Web site is on a server that supports SSL, SSL can be enabled and specific Web pages can be identified as requiring SSL access. Historically, a Web server could be SSL-enabled by using Netscape's SSLRef program library, which could be downloaded for noncommercial use or licensed for commercial use.

TLS and SSL are not directly interoperable. However, TLS 1.0 included a mechanism that let a TLS implementation fall back to SSL 3.0 when the other side supported SSL but not TLS.
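
As a rough illustration of the point that TLS has taken over from SSL, the sketch below uses Python's standard `ssl` module to build a client-side context that refuses legacy protocol versions. The choice of TLS 1.2 as the minimum is an assumption made for this example, not something the answer above prescribes.

```python
import ssl

# Build a client-side context with the standard library's defaults:
# certificate verification and hostname checking are both enabled.
context = ssl.create_default_context()

# Refuse SSL and early TLS versions. TLS 1.2 as the floor is an
# assumption chosen for this sketch.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print("certificates verified:", context.verify_mode == ssl.CERT_REQUIRED)
print("hostname checked:", context.check_hostname)
```

A context like this would then be passed to a socket or HTTP client; no network connection is opened by the code itself.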


3⟩ Explain the Difference between OLTP and OLAP?

The main differences between OLTP and OLAP are:

1. User and System Orientation

OLTP: customer-oriented, used for transaction and query processing by clerks, clients and IT professionals.

OLAP: market-oriented, used for data analysis by knowledge workers (managers, executives, analysts).

2. Data Contents

OLTP: manages current data, very detail-oriented.

OLAP: manages large amounts of historical data, provides facilities for summarization and aggregation, and stores information at different levels of granularity to support the decision-making process.

3. Database Design

OLTP: adopts an entity-relationship (ER) model and an application-oriented database design.

OLAP: adopts a star, snowflake or fact constellation model and a subject-oriented database design.

4. View

OLTP: focuses on the current data within an enterprise or department.

OLAP: spans multiple versions of a database schema due to the evolutionary process of an organization; integrates information from many organizational locations and data stores.
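
To make the contrast concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `sales` table, its columns and its rows are invented for illustration. The first query is the kind of keyed, detail-level lookup an OLTP system serves; the second is the kind of summarizing query an OLAP system is built for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "East", 2022, 100.0),
        (2, "East", 2023, 150.0),
        (3, "West", 2022, 200.0),
        (4, "West", 2023, 50.0),
    ],
)

# OLTP-style access: fetch one detailed record by its key.
order = conn.execute(
    "SELECT region, amount FROM sales WHERE order_id = ?", (2,)
).fetchone()

# OLAP-style access: summarize historical data at a coarser granularity.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(order)
print(totals)
```

In practice the two workloads would run on separate systems with different schemas; a single table is used here only to keep the sketch short.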


4⟩ What are semi-additive and factless facts, and in which scenarios would you use such fact tables?

Semi-additive facts can be summed across some dimensions but not across others. Snapshot facts are semi-additive: a balance can be added up across accounts, but across time it has to be aggregated in another way, such as an average.

Example: average daily balance.

A fact table without numeric fact columns is called a factless fact table.

Example: promotion facts.

Such a table records which promotions applied to a transaction (for example, product samples); it contains no measures, only the keys of the dimensions involved.
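
The two ideas can be sketched in a few lines of Python. All data, names and helper functions below are hypothetical. The balance snapshots are summed across accounts for a single day but averaged over time, and the factless table is analyzed by counting rows because it has no measure to add up.

```python
from collections import Counter

# Daily balance snapshots: (day, account, balance). Values are made up.
snapshots = [
    ("2024-01-01", "A", 100.0),
    ("2024-01-01", "B", 300.0),
    ("2024-01-02", "A", 200.0),
    ("2024-01-02", "B", 300.0),
]

def total_balance_on(day):
    """Summing balances across accounts for one day is meaningful."""
    return sum(balance for d, _, balance in snapshots if d == day)

def average_daily_balance(account):
    """Across time, a balance is averaged rather than summed."""
    values = [balance for _, acct, balance in snapshots if acct == account]
    return sum(values) / len(values)

# Factless fact table: only dimension keys (day, store, promotion).
promotion_facts = [
    ("2024-01-01", "store_1", "free_sample"),
    ("2024-01-01", "store_2", "free_sample"),
    ("2024-01-02", "store_1", "coupon"),
]

# With no measure column, analysis means counting rows.
promotions_per_store = Counter(store for _, store, _ in promotion_facts)

print(total_balance_on("2024-01-01"))
print(average_daily_balance("A"))
print(promotions_per_store)
```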


5⟩ What is a VLDB?

VLDB stands for Very Large Database.

It is an environment or storage space, managed by a relational database management system (RDBMS), that holds vast quantities of information.


6⟩ What is real-time data warehousing?

Real-time data warehousing is a combination of two things: 1) real-time activity and 2) data warehousing. Real-time activity is activity that is happening right now. The activity could be anything such as the sale of widgets. Once the activity is complete, there is data about it.

Data warehousing captures business activity data. Real-time data warehousing captures business activity data as it occurs. As soon as the business activity is complete and there is data about it, the completed activity data flows into the data warehouse and becomes available instantly. In other words, real-time data warehousing is a framework for deriving information from data as the data becomes available.


8⟩ What are SCD1, SCD2 and SCD3?

SCD stands for slowly changing dimension.

SCD1: only the updated values are maintained; the old value is overwritten.

Example: when a customer's address is modified, we update the existing record with the new address.

SCD2: both historical and current information are maintained, using

A) Effective dates

B) Versions

C) Flags

or a combination of these.

SCD3: historical and current information are maintained by adding new columns to the target table.
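
The three types can be sketched in plain Python as follows. The row layout and every column name (`version`, `start_date`, `end_date`, `is_current`, `previous_address`) are assumptions chosen for illustration; real ETL tools implement these patterns in their own ways.

```python
from datetime import date

def apply_scd1(row, new_address):
    """Type 1: overwrite the value; no history is kept."""
    updated = dict(row)
    updated["address"] = new_address
    return updated

def apply_scd2(rows, customer_id, new_address, change_date):
    """Type 2: close the current row and append a new version."""
    result = []
    next_version = 1
    for row in rows:
        row = dict(row)  # copy so the caller's rows are left untouched
        if row["customer_id"] == customer_id:
            next_version = max(next_version, row["version"] + 1)
            if row["is_current"]:
                row["is_current"] = False
                row["end_date"] = change_date
        result.append(row)
    result.append({
        "customer_id": customer_id,
        "address": new_address,
        "version": next_version,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })
    return result

def apply_scd3(row, new_address):
    """Type 3: keep the prior value in an additional column."""
    updated = dict(row)
    updated["previous_address"] = row["address"]
    updated["address"] = new_address
    return updated

history = [{
    "customer_id": 1, "address": "Old St", "version": 1,
    "start_date": date(2020, 1, 1), "end_date": None, "is_current": True,
}]
for version_row in apply_scd2(history, 1, "New Rd", date(2024, 6, 1)):
    print(version_row)
```

The Type 2 sketch combines all three tracking devices from the answer (effective dates, a version number and a current flag); in practice one or two of them are often enough.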


9⟩ What is a lookup table?

A lookup table is one that is used when updating a warehouse. When the lookup is placed on the target table (fact table / warehouse) and based on the primary key of the target, it restricts the load to new or updated records, according to the lookup condition.
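
A minimal sketch of that behavior, assuming the target table is modeled as a Python dict keyed by primary key (the function name and data shapes are hypothetical): each incoming row is looked up by key, then inserted if it is new, updated if it has changed, and skipped otherwise.

```python
def load_with_lookup(target, incoming):
    """Load `incoming` rows into `target`, touching only new or changed rows.

    Both arguments map primary key -> row; `target` plays the role of
    the lookup and is modified in place.
    """
    inserted, updated, skipped = [], [], []
    for key, row in incoming.items():
        existing = target.get(key)  # the lookup on the primary key
        if existing is None:
            target[key] = row
            inserted.append(key)
        elif existing != row:
            target[key] = row
            updated.append(key)
        else:
            skipped.append(key)
    return inserted, updated, skipped

warehouse = {1: {"name": "a"}, 2: {"name": "b"}}
batch = {2: {"name": "b2"}, 1: {"name": "a"}, 3: {"name": "c"}}
print(load_with_lookup(warehouse, batch))
```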


10⟩ What are data marts?

Data marts are designed to help managers make strategic decisions about their business.

A data mart is a subset of corporate-wide data that is of value to a specific group of users.

There are two types of data marts:

1. Independent data marts: sourced from data captured from OLTP systems or external providers, or from data generated locally within a particular department or geographic area.

2. Dependent data marts: sourced directly from enterprise data warehouses.


11⟩ What is the level of granularity of a fact table?

Level of granularity means the level of detail that you put into the fact table in a data warehouse. For example, based on the design you can decide to store sales data for each transaction. The level of granularity then determines how much detail you keep for each transactional fact: you might store every individual product sale, or aggregate sales up to the minute and store only that.
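
The difference between the two grains can be sketched as follows. The transactions and the `aggregate_to_minute` helper are invented for illustration: three transaction-level rows collapse into two rows once the grain is coarsened to one row per product per minute.

```python
from collections import defaultdict
from datetime import datetime

# Finest grain: one row per individual sale (timestamp, product, quantity).
transactions = [
    (datetime(2024, 1, 1, 9, 0, 5), "widget", 2),
    (datetime(2024, 1, 1, 9, 0, 40), "widget", 3),
    (datetime(2024, 1, 1, 9, 1, 10), "widget", 1),
]

def aggregate_to_minute(rows):
    """Coarser grain: one row per (minute, product) with summed quantity."""
    totals = defaultdict(int)
    for timestamp, product, quantity in rows:
        minute = timestamp.replace(second=0, microsecond=0)
        totals[(minute, product)] += quantity
    return dict(totals)

per_minute = aggregate_to_minute(transactions)
for key, quantity in per_minute.items():
    print(key, quantity)
```

Once the data is stored at the coarser grain, the individual sales can no longer be recovered, which is why the grain is normally decided at design time.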


12⟩ What are the different methods of loading dimension tables?

Conventional load:

Before the data is loaded, all the table constraints are checked against the data.

Direct load (faster loading):

All the constraints are disabled and the data is loaded directly. Later, the data is checked against the table constraints, and the bad data is not indexed.


13⟩ Explain piconet?

The original Piconet was a USB-style expansion port on RM Nimbus computers.

These days, a piconet is an ad-hoc computer network linking a user group of devices using Bluetooth technology protocols to allow one master device to interconnect with up to seven active slave devices (because a three-bit MAC address is used). Up to 255 further slave devices can be inactive, or parked, which the master device can bring into active status at any time.

A piconet typically has a range of about 10 m and a transfer rate between about 400 and 700 kbit/s, depending on whether synchronous or asynchronous connection is used.

All parked slaves have an 8-bit parked member address (PMA), and all active slaves have a 3-bit active member address (AMA). The AMA is used by the master to send packets to a specific slave and to identify which slave has sent a response packet.


14⟩ What is ODS?

1. ODS means Operational Data Store.


2. A collection of operational or base data that is extracted from operational databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse. The ODS may also be used to audit the data warehouse to ensure that summarized and derived data are calculated properly. The ODS may further become the enterprise shared operational database, allowing operational systems that are being reengineered to use the ODS as their operational database.


15⟩ What is Normalization, First Normal Form, Second Normal Form, Third Normal Form?

1. Normalization is a process for assigning attributes to entities. It reduces data redundancies, helps eliminate data anomalies, and produces controlled redundancies to link tables.

2. Normalization is the analysis of functional dependency between attributes / data items of user views. It reduces a complex user view to a set of small and stable subgroups of fields / relations.

1NF: Repeating groups must be eliminated, dependencies can be identified, all key attributes are defined, and there are no repeating groups in the table.

2NF: The table is already in 1NF and includes no partial dependencies (no attribute is dependent on a portion of the primary key). It is still possible for the table to exhibit transitive dependency: attributes may be functionally dependent on non-key attributes.

3NF: The table is already in 2NF and contains no transitive dependencies.
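
As a small worked example of the 2NF and 3NF steps, the sketch below decomposes two flat row sets in plain Python. The tables and values are invented for illustration. In the first, `product_name` depends on only part of the composite key; in the second, `dept_name` depends on a non-key attribute.

```python
# Rows keyed by (order_id, product_id). product_name depends only on
# product_id, a partial dependency that violates 2NF.
order_lines_flat = [
    (1, "P1", "Widget", 2),
    (1, "P2", "Gadget", 1),
    (2, "P1", "Widget", 5),
]

# 2NF: move the partially dependent attribute into its own table.
products = {pid: name for _, pid, name, _ in order_lines_flat}
order_lines = [(oid, pid, qty) for oid, pid, _, qty in order_lines_flat]

# Rows keyed by emp_id. dept_name depends on dept_id, a non-key
# attribute: a transitive dependency that violates 3NF.
employees_flat = [
    ("E1", "D1", "Sales"),
    ("E2", "D1", "Sales"),
    ("E3", "D2", "Finance"),
]

# 3NF: move the transitively dependent attribute into its own table.
departments = {did: name for _, did, name in employees_flat}
employees = [(eid, did) for eid, did, _ in employees_flat]

print(products, order_lines)
print(departments, employees)
```

After the decomposition, each product name and department name is stored once, so renaming one no longer requires updating several rows.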


16⟩ What is a snowflake schema?

In a snowflake schema, each dimension has a primary dimension table, to which one or more additional dimension tables can join. The primary dimension table is the only table that can join to the fact table.
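
A minimal sketch of this layout, using Python's built-in `sqlite3` module with hypothetical table and column names: `fact_sales` joins only to the primary dimension table `dim_product`, which in turn joins to the additional dimension table `dim_category`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_category (
    category_id INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);
INSERT INTO dim_category VALUES (1, 'Tools'), (2, 'Toys');
INSERT INTO dim_product VALUES
    (10, 'Hammer', 1), (11, 'Wrench', 1), (20, 'Kite', 2);
INSERT INTO fact_sales VALUES (10, 5.0), (11, 7.0), (20, 3.0), (10, 5.0);
""")

# The fact table reaches dim_category only through dim_product.
sales_by_category = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()

print(sales_by_category)
```

In a star schema the category name would instead be stored directly on the product dimension, trading some redundancy for one fewer join.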


17⟩ What are the various ETL tools on the market?

Various ETL tools used in the market are:

1. Informatica

2. DataStage

3. MS SQL Server DTS (Integration Services since SQL Server 2005)

4. Ab Initio

5. SQL*Loader

6. Sunopsis

7. Oracle Warehouse Builder

8. Data Junction


19⟩ What is pre-emptive and non-pre-emptive?

Pre-emptive means taken as a measure against something possible, anticipated, or feared; preventive; deterrent. Example: a pre-emptive tactic against a ruthless business rival.

Non-pre-emptive is the exact opposite of pre-emptive: no such preventive measures have been taken.


20⟩ Why should you put your data warehouse on a different system than your OLTP system?

An OLTP system is basically "data oriented" (ER model) and not "subject oriented" (dimensional model). That is why we design a separate, subject-oriented OLAP system.

Moreover, a complex query fired on an OLTP system causes heavy overhead on the OLTP server, which directly affects day-to-day business.
