Datastage Interview Questions with answers
Question.1 What is Meta Data Repository?
Answer: Meta Data is a data about the data. It also contains
Query statistics
ETL statistics
Business subject area
Source Information
Target Information
Source to Target mapping Information.
Question.2 What is Data Stage Engine?
Answer: It is a JAVA engine running at the background.
Question.3 What is Dimensional Modeling?
Answer: Dimensional Modeling is a logical design technique that seeks to present the data in a standard framework that is, intuitive and allows for high performance access.
Question.4 What is Star Schema?
Answer: Star Schema is a de-normalized multi-dimensional model. It contains centralized fact tables surrounded by dimensions table. Dimension Table: It contains a primary key and description about the fact table. Fact Table: It contains foreign keys to the dimension tables, measures and aggregates.
Question.5 What is surrogate Key?
Answer: It is a 4-byte integer which replaces the transaction / business / OLTP key in the dimension table. We can store up to 2 billion record.
Question.6 Why we need surrogate key?
Answer: It is used for integrating the data may help better for primary key. Index maintenance, joins, table size, key updates, disconnected inserts and partitioning.
Question.7 What is Snowflake schema?
Answer: It is partially normalized dimensional model in which at two represents least one dimension or more hierarchy related tables.
Question.8 Explain Types of Fact Tables?
Answer:
Factless Fact: It contains only foreign keys to the dimension tables.
Additive Fact: Measures can be added across any dimensions.
Semi-Additive: Measures can be added across some dimensions. Eg, % age, discount
Non-Additive: Measures cannot be added across any dimensions. Eg, Average
Conformed Fact: The equation or the measures of the two fact tables are the same under the facts are measured across the dimensions with a same set of measures.
Question.9 Explain the Types of Dimension Tables?
Answer: Conformed Dimension: If a dimension table is connected to more than one fact table, the granularity that is defined in the dimension table is common across between the fact tables. Junk Dimension: The Dimension table, which contains only flags. Monster Dimension: If rapidly changes in Dimension are known as Monster Dimension. De-generative Dimension: It is line item-oriented fact table design.
Question.10 What are stage variables?
Answer: Stage variables are declaratives in Transformer Stage used to store values. Stage variables are active at the run time. (Because memory is allocated at the run time).
Question.11 What is sequencer?
Answer: It sets the sequence of execution of server jobs.
Question.12 What are Active and Passive stages?
Answer: Active Stage: Active stage model the flow of data and provide mechanisms for combining data streams, aggregating data and converting data from one data type to another. Eg, Transformer, aggregator, sort, Row Merger etc. Passive Stage: A Passive stage handles access to Database for the extraction or writing of data. Eg, IPC stage, File types, Universe, Unidata, DRS stage etc.
Question.13 What is ODS?
Answer: Operational Data Store is a staging area where data can be rolled back.
Question.14 What are Macros?
Answer: They are built from Data Stage functions and do not require arguments. A number of macros are provided in the JOBCONTROL.H file to facilitate getting information about the current job, and links and stages belonging to the current job. These can be used in expressions (for example for use in Transformer stages), job control routines, filenames and table names, and before/after subroutines.
These macros provide the functionality of using the DSGetProiectInfo, DSGetJobInfo, DSGetStazeInfo, and DSGetLinkInfo functions with the DSJ.ME token as the JobHandle and can be used in all active stages and before/after subroutines. The macros provide the functionality for all the possible InfoType arguments for the DSGet…Info functions. See the Function call help topics for more details.
The available macros are:
DSHostName
DSProjectName
DSJobStatus
DSJobName
DSJobStartDate
DSJobStartTime
DSJobStarfrimestamp
DSJobWaveNo
DSJobInvocations
DSJobInvocationId
DSStageName
DSStageLastErr
DSStageType DSStagelnRovehlum
DSStageVarList
DSLinkRowCount
DSLinkLastErr
Question.15 What is key MgtGetNextValue?
Answer: It is a Built-in transform Sequential numbers. Its input type is literal string & output type is string.
Question.16 What are stages?
Answer: The stages are either passive or active stages. Passive stages handle access to databases for extracting or writing data. Active stages model the flow of data and provide mechanisms for combining data streams, aggregating data, and converting data from one data type to another.
Question.17 What index is created on Data Warehouse?
Answer: Bitmap index is created in Data Warehouse.
Question.18 What is container?
Answer: A container is a group of stages and links. Containers enable you to simplify and modularize your server job designs by replacing complex areas of the diagram with a single container stage. You can also use shared containers as a way of incorporating server job functionality into parallel jobs.
Learn DataStage by Expert