Extract, Transform, Load (ETL) Questions And Answers

1⟩ What is a metadata extension?

Informatica allows end users and partners to extend the metadata stored in the repository by associating information with individual objects in the repository. For example, when you create a mapping, you can store your contact information with the mapping. You associate information with repository metadata using metadata extensions.

Informatica Client applications can contain the following types of metadata extensions:

Vendor-defined. Third-party application vendors create vendor-defined metadata extensions. You can view and change the values of vendor-defined metadata extensions, but you cannot create, delete, or redefine them.

User-defined. You create user-defined metadata extensions using PowerCenter/PowerMart. You can create, edit, delete, and view user-defined metadata extensions. You can also change the values of user-defined extensions.

2⟩ What is the ETL process? How many steps does it contain? Explain with an example.

ETL stands for extract, transform, load: you extract data from the source, apply the business rules to it, and then load it into the target.

In Informatica, the steps are:

1. Define the source (create the ODBC data source and the connection to the source database).

2. Define the target (create the ODBC data source and the connection to the target database).

3. Create the mapping (apply the business rules here by adding transformations, and define how data flows from the source to the target).

4. Create the session (a set of instructions that runs the mapping).

5. Create the workflow (a set of instructions that runs the session).
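
The same extract, transform, load flow can be sketched outside Informatica. The following is a minimal illustration in Python using in-memory SQLite databases. The table names (src_orders, tgt_orders), the columns, and the business rule (convert cents to a decimal amount and upper-case the country code) are assumptions made up for this example; none of it is an Informatica API.

```python
import sqlite3


def extract(source_conn):
    """Extract: read all rows from the (hypothetical) source table."""
    return source_conn.execute(
        "SELECT order_id, amount_cents, country FROM src_orders"
    ).fetchall()


def transform(rows):
    """Transform: apply the example business rules to each row."""
    return [
        (order_id, amount_cents / 100.0, country.strip().upper())
        for order_id, amount_cents, country in rows
    ]


def load(target_conn, rows):
    """Load: write the transformed rows into the target table."""
    target_conn.executemany(
        "INSERT INTO tgt_orders (order_id, amount, country) VALUES (?, ?, ?)",
        rows,
    )
    target_conn.commit()


def run_etl():
    # Define the source and the target (here: two in-memory databases).
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE src_orders (order_id INTEGER, amount_cents INTEGER, country TEXT)"
    )
    source.executemany(
        "INSERT INTO src_orders VALUES (?, ?, ?)",
        [(1, 1250, " us"), (2, 399, "de ")],
    )
    target.execute(
        "CREATE TABLE tgt_orders (order_id INTEGER, amount REAL, country TEXT)"
    )
    # Run the data flow once, end to end.
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT order_id, amount, country FROM tgt_orders ORDER BY order_id"
    ).fetchall()
```

Calling run_etl() returns the rows as they arrive in the target, with the business rules already applied.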

3⟩ What is Informatica metadata, and where is it stored?

Informatica metadata is data about data, such as the definitions of sources, targets, mappings, sessions, and workflows. It is stored in the Informatica repository.

4⟩ How do you call shell scripts from Informatica?

Specify the full path of the shell script in the post-session properties of the session or workflow. A workflow can also run the script from a Command task.

5⟩ What is partitioning? What are the types of partitioning?

If you use PowerCenter, you can increase the number of partitions in a pipeline to improve session performance. Increasing the number of partitions allows the Informatica Server to create multiple connections to sources and process partitions of source data concurrently.

When you create a session, the Workflow Manager validates each pipeline in the mapping for partitioning. You can specify multiple partitions in a pipeline if the Informatica Server can maintain data consistency when it processes the partitioned data.

When you configure the partitioning information for a pipeline, you must specify a partition type at each partition point in the pipeline.

The partition type determines how the Informatica Server redistributes data across partition points.

The Workflow Manager allows you to specify the following partition types:

Round-robin partitioning. The Informatica Server distributes data evenly among all partitions. Use round-robin partitioning where you want each partition to process approximately the same number of rows.

For more information, see Round-Robin Partitioning.

Hash partitioning. The Informatica Server applies a hash function to a partition key to group data among partitions. If you select hash auto-keys, the Informatica Server uses all grouped or sorted ports as the partition key. If you select hash user keys, you specify a number of ports to form the partition key. Use hash partitioning where you want to ensure that the Informatica Server processes groups of rows with the same partition key in the same partition. For more information, see Hash Partitioning.

Key range partitioning. You specify one or more ports to form a compound partition key. The Informatica Server passes data to each partition depending on the ranges you specify for each port. Use key range partitioning where the sources or targets in the pipeline are partitioned by key range. For more information, see Key Range Partitioning.

Pass-through partitioning. The Informatica Server passes all rows at one partition point to the next partition point without redistributing them. Choose pass-through partitioning where you want to create an additional pipeline stage to improve performance, but do not want to change the distribution of data across partitions.
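
To make the four partition types concrete, here is a small Python sketch of how rows could be distributed under each one. It illustrates the ideas only and is not how the Informatica Server is implemented. The function names, the use of CRC32 as the hash function, and the representation of key ranges as a sorted list of boundaries are all assumptions chosen for the example.

```python
import zlib
from bisect import bisect_right


def round_robin(rows, n):
    """Deal rows out in turn so each partition gets about the same number."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows, n, key):
    """Send rows with the same partition key value to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        # CRC32 is used only because it is stable between runs.
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        parts[digest % n].append(row)
    return parts


def key_range_partition(rows, boundaries, key):
    """Route rows by range; boundaries [3, 6] means <3, 3 to 5, and >=6."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect_right(boundaries, row[key])].append(row)
    return parts


def pass_through(parts):
    """Hand the existing partitions to the next stage unchanged."""
    return parts
```

For example, key_range_partition(rows, [3, 6], "id") places ids below 3 in the first partition, ids 3 to 5 in the second, and ids of 6 or more in the third.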

6⟩ Do we need an ETL tool? When do we go for the tools on the market?

ETL tool:

An ETL tool extracts (E) data from multiple source systems (such as RDBMSs, flat files, mainframes, SAP, and XML), transforms (T) it according to business requirements, and loads (L) it into target locations (such as tables and files).

Need for an ETL tool:

An ETL tool is typically required when data is scattered across different systems (such as RDBMSs, flat files, mainframes, SAP, and XML).

7⟩ Can we use procedural logic inside Informatica? If yes, how? If not, how can we use external procedural logic in Informatica?

Yes, you can use the Advanced External Procedure transformation. For details, see the section on this transformation in the Informatica Transformation Guide. You can write the procedure in C++ on UNIX, and in C++, Visual Basic, or Visual C++ on Windows.

8⟩ Explain mapping, session, worklet, workflow, and mapplet.

A mapping represents the data flow from sources to targets.

A mapplet is a reusable object that contains a set of transformations.

A workflow is a set of instructions that tells the Informatica Server how to execute the tasks.

A worklet is an object that represents a set of tasks.

A session is a set of instructions that describes how and when to move data from sources to targets.

9⟩ What is a staging area? Do we need it? What is its purpose?

Data staging is a collection of processes used to prepare source system data for loading into a data warehouse. The staging area is the intermediate storage where this work is done before the data reaches the warehouse. Staging includes the following steps:

1. Source data extraction

2. Data transformation (restructuring)

3. Data transformation (data cleansing, value transformations)

4. Surrogate key assignment
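
As a rough illustration of the staging steps listed above, the following Python sketch restructures, cleanses, and assigns surrogate keys to a few already-extracted rows. The field names (id, name, gender), the gender code mapping, and the function name stage_customers are hypothetical and exist only for this example.

```python
from itertools import count

# Hypothetical value mapping used for cleansing.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def stage_customers(raw_rows, key_sequence=None):
    """Prepare raw source rows for loading into the warehouse."""
    if key_sequence is None:
        key_sequence = count(1)  # surrogate keys 1, 2, 3, ...
    staged = []
    for raw in raw_rows:
        # Restructuring: split the single "name" field into two columns.
        first, _, last = raw["name"].strip().partition(" ")
        # Cleansing and value transformation: normalise codes, default to "U".
        gender = GENDER_CODES.get(raw.get("gender", "").strip().lower(), "U")
        staged.append({
            "customer_key": next(key_sequence),  # surrogate key
            "source_id": raw["id"],              # natural key from the source
            "first_name": first.title(),
            "last_name": last.strip().title(),
            "gender": gender,
        })
    return staged
```

The surrogate key is generated in staging and is independent of the source system's own identifier, which is kept alongside it as source_id.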

10⟩ What are full load and incremental (or refresh) load?

Full load: completely erasing the contents of one or more tables and reloading them with fresh data.

Incremental load: applying ongoing changes to one or more tables on a predefined schedule.

The first time data is loaded is called the initial load or full load.

Subsequent loads of new or modified data are called incremental loads or delta loads.
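
The difference between the two load types can be sketched in a few lines of Python with SQLite. The customers table and its columns are assumptions for the example, and the update-then-insert logic is only one simple way to apply changes.

```python
import sqlite3


def full_load(conn, rows):
    """Full load: erase the target table and reload it with fresh data."""
    conn.execute("DELETE FROM customers")
    conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", rows)
    conn.commit()


def incremental_load(conn, changed_rows):
    """Incremental (delta) load: apply only new or modified rows."""
    for row_id, name in changed_rows:
        cur = conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?", (name, row_id)
        )
        if cur.rowcount == 0:
            # No existing row was updated, so this is a new record.
            conn.execute(
                "INSERT INTO customers (id, name) VALUES (?, ?)", (row_id, name)
            )
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    full_load(conn, [(1, "Ann"), (2, "Bob")])           # initial load
    incremental_load(conn, [(2, "Robert"), (3, "Cy")])  # later delta
    return conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
```

In demo(), the incremental run touches only rows 2 and 3; row 1 is left exactly as the full load wrote it.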

11⟩ Can we override a native SQL query within Informatica? Where and how do we do it?

Yes, we can override the native SQL query in the Source Qualifier and Lookup transformations.

In the Lookup transformation, use the "SQL override" option in the lookup properties. In the Source Qualifier transformation, enter the query in the SQL Query property.

12⟩ What are snapshots? What are materialized views, and where do we use them? What is a materialized view log?

Snapshots are read-only copies of a master table located on a remote node; they are refreshed periodically to reflect changes made to the master table. Snapshots are mirrors or replicas of tables.

Views are built from the columns of one or more tables. A view with a single base table can generally be inserted into, updated, and deleted from; a view based on multiple tables generally cannot.

Materialized view:

A precomputed table comprising aggregated or joined data from fact and possibly dimension tables. Also known as a summary or aggregate table.

Materialized view log:

In Oracle, a table associated with the master table that records changes to it, so that the materialized view can be refreshed incrementally instead of being rebuilt.

13⟩ Is there any way to read MS Excel data directly into Informatica? Is it possible to use an Excel file as a target?

We cannot import an Excel file directly into Informatica.

Instead, define the Microsoft Excel ODBC driver on the system, and define named ranges in the Excel sheet.

Then, in Informatica, open the folder and choose Sources -> Import from Database, select the Excel ODBC driver, connect, and select the Excel sheet name.

14⟩ What are the various methods of getting incremental (delta) records from the source systems?

One foolproof method is to maintain a field called 'Last Extraction Date' and then impose a condition in the code such as 'current_extraction_date > last_extraction_date'.

First method: if the source has a column that identifies when each record was inserted, it is easy to put a filter condition in the Source Qualifier.

Second method: if the source has no column that identifies when each record was inserted, do a target lookup on the primary key to determine which records are new, and then insert them.
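
Both methods can be expressed as simple filters. The sketch below is in Python rather than Informatica, and the field names updated_at and id are assumptions standing in for the source's date column and primary key.

```python
from datetime import datetime


def delta_by_timestamp(source_rows, last_extraction_date):
    """First method: keep rows dated after the last extraction."""
    return [r for r in source_rows if r["updated_at"] > last_extraction_date]


def delta_by_target_lookup(source_rows, target_keys):
    """Second method: keep rows whose primary key is not yet in the target."""
    known = set(target_keys)
    return [r for r in source_rows if r["id"] not in known]


# Example data: three source rows, two of which are newer than the last run.
SOURCE_ROWS = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]
```

Note that the target lookup, as sketched here, only finds new records. Detecting changed records without a date column would also require comparing the non-key columns.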

15⟩ Explain the test procedures used to check whether the data is loaded in the back end, the performance of the mapping, and the quality of the data loaded in Informatica.

The best procedure is to use the Debugger, where you can monitor each step of the mapping and see how data is loaded, based on the breakpoint conditions you set.

16⟩ Explain the different lookup methods used in Informatica.

1. Connected lookup

2. Unconnected lookup

A connected lookup receives input from the pipeline, sends output to the pipeline, and can return any number of values. It does not contain a return port.

An unconnected lookup can return only one column. It contains a return port.
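
The contrast can be pictured with a small Python analogy. This is an illustration only, not Informatica code: the lookup table, its columns, and the function names are invented for the example.

```python
# Hypothetical lookup source, keyed by product id.
LOOKUP_TABLE = {
    10: {"name": "Widget", "price": 2.5},
    20: {"name": "Gadget", "price": 7.0},
}


def connected_lookup(rows):
    """Sits in the pipeline: every row flows through and gains several columns."""
    for row in rows:
        match = LOOKUP_TABLE.get(row["product_id"], {})
        yield {**row, "name": match.get("name"), "price": match.get("price")}


def unconnected_lookup(product_id):
    """Called on demand like a function: returns a single value."""
    match = LOOKUP_TABLE.get(product_id)
    return match["price"] if match else None
```

In this analogy, connected_lookup processes every row in the stream, while unconnected_lookup is called only where an expression needs it, and its single result plays the role of the return port.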

17⟩ When do we analyze the tables? How do we do it?

The ANALYZE statement allows you to validate and compute statistics for an index, table, or cluster. These statistics are used by the cost-based optimizer when it calculates the most efficient plan for retrieval. In addition to its role in statement optimization, ANALYZE also helps in validating object structures and in managing space in your system. You can choose the following operations: COMPUTE, ESTIMATE, and DELETE. Early versions of Oracle7 produced unpredictable results when the ESTIMATE operation was used, so it is best to compute your statistics.

Example: the following query counts, for each owner, how many tables have and have not been analyzed.

-- NUM_ROWS is null until statistics have been gathered for a table.
-- DECODE treats two nulls as equal, so no sentinel value is needed.
select OWNER,
       sum(decode(NUM_ROWS, null, 0, 1)) analyzed,
       sum(decode(NUM_ROWS, null, 1, 0)) not_analyzed,
       count(TABLE_NAME) total
from dba_tables
where OWNER not in ('SYS', 'SYSTEM')
group by OWNER;

18⟩ Suppose we have some 10,000-odd records in the source system and load them into the target. How do we ensure that all 10,000 records loaded into the target do not contain any garbage values?

You can check the session log and confirm that the number of rows fetched from the source matches the number of rows loaded into the target.
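
Matching row counts show that nothing was dropped, but not that the values are sound. One possible complement, sketched here in Python under assumed field names (id, email) and made-up validation rules, is to run simple checks over the loaded rows and report any that fail.

```python
def has_positive_id(row):
    """Example rule: the id must be a positive integer."""
    return isinstance(row["id"], int) and row["id"] > 0


def has_plausible_email(row):
    """Example rule: the email must at least contain an '@'."""
    return "@" in row["email"]


def reconcile(source_rows, target_rows, validators):
    """Compare row counts and collect target rows that fail any rule."""
    bad_rows = [
        row for row in target_rows
        if not all(check(row) for check in validators)
    ]
    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "counts_match": len(source_rows) == len(target_rows),
        "bad_rows": bad_rows,
    }
```

A load would then be considered clean only if counts_match is true and bad_rows is empty.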

19⟩ What are the different versions of Informatica?

Here are some popular versions of Informatica:

Informatica PowerCenter 4.1, 5.1, 6.1.2, 7.1.2, 8.1, 8.5, and 8.6.

20⟩ How can we determine what records to extract?

When reading a table, some dimension key must indicate whether a record needs to be extracted. Usually this comes from the time dimension (e.g. date >= 1st of the current month) or from a transaction flag (e.g. Order Invoiced Stat). A foolproof approach is to add an archive flag to each record that is reset whenever the record changes.
