Warehouse DataStage



“Frequently asked Data Warehouse DataStage interview questions and answers. Use this guide to learn Data Warehouse DataStage, and feel free to contribute your own suggestions, questions, and answers on any question using the comment feature on the page.”



37 Warehouse DataStage Questions And Answers

1⟩ What is a merge and how is it done? Please explain with a simple example using two tables.

Merge is used to join two tables. It takes the key columns and sorts them in ascending or descending order. Consider two tables, Emp and Dept. To join them, we use DeptNo as the common key: specify that column as the key, sort DeptNo in ascending order, and join the two tables on it.


2⟩ How do you run a job from the command prompt in UNIX?

Use the dsjob command:

dsjob -run -jobstatus projectname jobname

To pass job parameters as well:

dsjob -run -jobstatus -param parameter1=value1 -param parameter2=value2 projectname jobname

If the exit status is 0, the run was successful; statuses 1 and 2 mean the job finished with warnings but still succeeded; any other value means the job aborted.
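The steps above can be sketched in a small shell script. The project, job, and parameter names here are hypothetical placeholders; the script only builds and prints the command line, since executing it requires a DataStage engine tier:

```shell
# Hypothetical project, job, and parameter names -- substitute your own.
PROJECT=dwh_proj
JOB=load_customers

# Build the dsjob invocation: -run starts the job, -jobstatus waits for
# completion so the exit code reflects the job's final status, and each
# -param passes one job parameter as name=value.
CMD="dsjob -run -jobstatus -param SRC_DIR=/data/in -param TGT_DB=dwh $PROJECT $JOB"
echo "$CMD"

# On a real engine tier you would execute it and test the exit code:
#   $CMD; rc=$?
#   [ "$rc" -eq 0 ] && echo "run OK"
```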


4⟩ What enhancements were made in DataStage 7.5 compared with 7.0?

Many new stages were introduced compared with DataStage 7.0. In server jobs we gained the Stored Procedure stage and Command stage, and a generate-report option was added on the File tab. In job sequences, many activities were introduced, such as Start Loop, End Loop, Terminate Loop, and User Variables. In parallel jobs, the Surrogate Key stage and Stored Procedure stage were introduced.


5⟩ What is job control? How is it used? Explain with steps.

JCL stands for Job Control Language; it is used to run a number of jobs at a time, with or without loops. Steps: click Edit in the menu bar, select 'Job Properties', and enter the parameters, for example:

Parameter  Prompt   Type
STEP_ID    STEP_ID  string
Source     SRC      string
DSN        DSN      string
Username   unm      string
Password   pwd      string

After entering the parameters, go to the job control tab, select the jobs from the list box, and run the job.

Job control can also be achieved using a job sequence in DataStage 8.0.1, with or without loops. From the menu, select New > Sequence Job and place the corresponding stages from the palette.


7⟩ How do you kill a job in DataStage?

By killing the respective process ID.

First, click Job in the menu bar, then click 'Cleanup Resources'; there you will find the PID. With that, go to Administrator, open the Command tab, run DS.TOOLS, and supply the PID to kill the job.
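Alternatively, a running job can be stopped from the command line with dsjob -stop. A minimal sketch (the project and job names are hypothetical, and the script only prints the command, since running it requires a DataStage engine):

```shell
# Hypothetical project and job names -- substitute your own.
PROJECT=dwh_proj
JOB=load_customers

# dsjob -stop asks the engine to stop the running job -- the
# command-line equivalent of stopping it from the GUI.
STOP_CMD="dsjob -stop $PROJECT $JOB"
echo "$STOP_CMD"
```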


9⟩ What other performance tuning have you done in your last project to increase the performance of slowly running jobs?

Minimise the usage of Transformer stages (use Copy, Modify, Filter, or Row Generator instead).

Use SQL Code while extracting the data

Handle the nulls

Minimise the warnings

Reduce the number of lookups in a job design

Use no more than 20 stages in a job.

Use an IPC stage between two passive stages; it reduces processing time.

Drop indexes before data loading and recreate after loading data into tables

Generally we cannot avoid lookups if our requirements make them compulsory.

There is no hard limit on the number of stages (such as 20 or 30), but we can break the job into smaller jobs and use Data Set stages to pass the data between them.

The IPC stage is provided in server jobs, not in parallel jobs.

Check the write cache of the hash file. If the same hash file is used both for lookup and as the target, disable this option.

If the hash file is used only for lookup then "enable Preload to memory". This will improve the performance. Also, check the order of execution of the routines.

Don't use more than 7 lookups in the same transformer; introduce new transformers if it exceeds 7 lookups.

Use Preload to memory option in the hash file output.

Use Write to cache in the hash file input.

Write into the error tables only after all the transformer stages.

Reduce the width of the input record - remove the columns that you would not use.

Cache the hash files you are reading from and writing into. Make sure your cache is big enough to hold the hash files.

Use ANALYZE.FILE or HASH.HELP to determine the optimal settings for your hash files.

This would also minimize overflow on the hash file.

If possible, break the input into multiple threads and run multiple instances of the job.

Staged the data coming from ODBC/OCI/DB2UDB stages, or any database on the server, using hash/sequential files for optimum performance and for data recovery in case the job aborts.

Tuned the OCI stage's 'Array Size' and 'Rows per Transaction' numerical values for faster inserts, updates, and selects.

Tuned the 'Project Tunables' in Administrator for better performance.

Used sorted data for Aggregator.

Sorted the data as much as possible in the database and reduced the use of DS Sort for better performance of jobs.

Removed the data not used from the source as early as possible in the job.

Worked with DB-admin to create appropriate Indexes on tables for better performance of DS queries

Converted some of the complex joins/business in DS to Stored Procedures on DS for faster execution of the jobs.

If an input file has an excessive number of rows and can be split-up then use standard logic to run jobs in parallel.

Before writing a routine or a transform, make sure the required functionality is not already available in one of the standard routines supplied in the SDK or DS Utilities categories.

Constraints are generally CPU intensive and take a significant amount of time to process. This may be the case if the constraint calls routines or external macros but if it is inline code then the overhead will be minimal.

Try to have the constraints in the 'Selection' criteria of the jobs itself. This will eliminate the unnecessary records even getting in before joins are made.

Tuning should occur on a job-by-job basis.

Use the power of DBMS.

Try not to use a sort stage when you can use an ORDER BY clause in the database.

Using a constraint to filter a record set is much slower than performing a SELECT … WHERE ….

Make every attempt to use the bulk loader for your particular database. Bulk loaders are generally faster than using ODBC or OLE.


10⟩ Explain What are Sequencers?

A sequencer allows you to synchronize the control flow of multiple activities in a job sequence. It can have multiple input triggers as well as multiple output triggers. The sequencer operates in two modes:

ALL mode: all of the inputs to the sequencer must be TRUE for any of the sequencer outputs to fire.

ANY mode: output triggers can be fired if any of the sequencer inputs are TRUE.


11⟩ How to create Containers?

There are two types of containers:

1. Local Container

2. Shared Container

A local container is available to that particular job only, whereas shared containers can be used anywhere in the project.

Local container:

Step 1: Select the stages required.

Step 2: Edit > Construct Container > Local

Shared container:

Step 1: Select the stages required.

Step 2: Edit > Construct Container > Shared

Shared containers are stored in the Shared Containers branch of the tree structure.


14⟩ Explain the difference between the DRS and ODBC stages.

To answer your question: the DRS stage should be faster than the ODBC stage, as it uses native database connectivity. You will need to install and configure the required database clients on your DataStage server for it to work.

The Dynamic Relational Stage was leveraged for PeopleSoft so that a job could run on any of the supported databases. It supports ODBC connections too. Read more about that in the plug-in documentation.

ODBC uses the ODBC driver for a particular database; DRS is a stage that tries to make switching from one database to another seamless. It uses the native connectivity for the chosen target.


17⟩ What happens when the output of a hash file is connected to a transformer? What error does it throw?

If a hash file's output is connected to a transformer stage, the hash file is treated as a lookup file when there is a primary link to the same transformer stage; if there is no primary link, the hash file link itself is treated as the primary link. You can do SCD in a server job using this lookup functionality. This does not return any error code.


18⟩ What is the difference between symmetric multiprocessing and massively parallel processing?

Symmetric Multiprocessing (SMP): some hardware resources may be shared by the processors. Processors communicate via shared memory and run a single operating system.

Cluster or Massively Parallel Processing (MPP): known as shared-nothing, because each processor has exclusive access to its hardware resources. Cluster systems can be physically dispersed. The processors each have their own operating system and communicate via a high-speed network.
