Friday, November 23, 2012

Informatica Interview Questions and Answers

Informatica Interview Questions and Answers

Q. What type of repositories can be created using Informatica Repository Manager?
A. Informatica PowerCenter includeds following type of repositories :
•Standalone Repository : A repository that functions individually and this is unrelated to any other repositories.
•Global Repository : This is a centralized repository in a domain. This repository can contain shared objects across the repositories in a domain. The objects are shared through global shortcuts.
•Local Repository : Local repository is within a domain and it’s not a global repository. Local repository can connect to a global repository using global shortcuts and can use objects in it’s shared folders.
•Versioned Repository : This can either be local or global repository but it allows version control for the repository. A versioned repository can store multiple copies, or versions of an object. This features allows to efficiently develop, test and deploy metadata in the production environment.
Q. What is a code page?
A. A code page contains encoding to specify characters in a set of one or more languages. The code page is selected based on source of the data. For example if source contains Japanese text then the code page should be selected to support Japanese text.
When a code page is chosen, the program or application for which the code page is set, refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.
Q. Which all databases PowerCenter Server on Windows can connect to?
A. PowerCenter Server on Windows can connect to following databases:
•IBM DB2
•Informix
•Microsoft Access
•Microsoft Excel
•Microsoft SQL Server
•Oracle
•Sybase
•Teradata
Q. Which all databases PowerCenter Server on UNIX can connect to?
A. PowerCenter Server on UNIX can connect to following databases:
•IBM DB2
•Informix
•Oracle
•Sybase
•Teradata
Informatica Mapping Designer
Q. How to execute PL/SQL script from Informatica mapping?
A. Stored Procedure (SP) transformation can be used to execute PL/SQL Scripts. In SP Transformation PL/SQL procedure name can be specified. Whenever the session is executed, the session will call the pl/sql procedure.
Q. How can you define a transformation? What are different types of transformations available in Informatica?
A. A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data. Below are the various transformations available in Informatica:
•Aggregator
•Application Source Qualifier
•Custom
•Expression
•External Procedure
•Filter
•Input
•Joiner
•Lookup
•Normalizer
•Output
•Rank
•Router
•Sequence Generator
•Sorter
•Source Qualifier
•Stored Procedure
•Transaction Control
•Union
•Update Strategy
•XML Generator
•XML Parser
•XML Source Qualifier
Q. What is a source qualifier? What is meant by Query Override?
A. Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When a relational or a flat file source definition is added to a mapping, it is connected to a Source Qualifier transformation.
PowerCenter Server generates a query for each Source Qualifier Transformation whenever it runs the session. The default query is SELET statement containing all the source columns. Source Qualifier has capability to override this default query by changing the default settings of the transformation properties. The list of selected ports or the order they appear in the default query should not be changed in overridden query.
Q. What is aggregator transformation?
A. The Aggregator transformation allows performing aggregate calculations, such as averages and sums. Unlike Expression Transformation, the Aggregator transformation can only be used to perform calculations on groups. The Expression transformation permits calculations on a row-by-row basis only.
Aggregator Transformation contains group by ports that indicate how to group the data. While grouping the data, the aggregator transformation outputs the last row of each group unless otherwise specified in the transformation properties.
Various group by functions available in Informatica are : AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.
Q. What is Incremental Aggregation?
A. Whenever a session is created for a mapping Aggregate Transformation, the session option for Incremental Aggregation can be enabled. When PowerCenter performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally.
Q. How Union Transformation is used?
A. The union transformation is a multiple input group transformation that can be used to merge data from various sources (or pipelines). This transformation works just like UNION ALL statement in SQL, that is used to combine result set of two SELECT statements.
Q. Can two flat files be joined with Joiner Transformation?
A. Yes, joiner transformation can be used to join data from two flat file sources.
Q. What is a look up transformation?
A. This transformation is used to lookup data in a flat file or a relational table, view or synonym. It compares lookup transformation ports (input ports) to the source column values based on the lookup condition. Later returned values can be passed to other transformations.
Q. Can a lookup be done on Flat Files?
A. Yes.
Q. What is the difference between a connected look up and unconnected look up?
A. Connected lookup takes input values directly from other transformations in the pipleline.
Unconnected lookup doesn’t take inputs directly from any other transformation, but it can be used in any transformation (like expression) and can be invoked as a function using :LKP expression. So, an unconnected lookup can be called multiple times in a mapping.
Q. What is a mapplet?
A. A mapplet is a reusable object that is created using mapplet designer. The mapplet contains set of transformations and it allows us to reuse that transformation logic in multiple mappings.
Q. What does reusable transformation mean?
A. Reusable transformations can be used multiple times in a mapping. The reusable transformation is stored as a metadata separate from any other mapping that uses the transformation. Whenever any changes to a reusable transformation are made, all the mappings where the transformation is used will be invalidated.
Q. What is update strategy and what are the options for update strategy?
A. Informatica processes the source data row-by-row. By default every row is marked to be inserted in the target table. If the row has to be updated/inserted based on some logic Update Strategy transformation is used. The condition can be specified in Update Strategy to mark the processed row for update or insert.
Following options are available for update strategy :
•DD_INSERT : If this is used the Update Strategy flags the row for insertion. Equivalent numeric value of DD_INSERT is 0.
•DD_UPDATE : If this is used the Update Strategy flags the row for update. Equivalent numeric value of DD_UPDATE is 1.
•DD_DELETE : If this is used the Update Strategy flags the row for deletion. Equivalent numeric value of DD_DELETE is 2.
•DD_REJECT : If this is used the Update Strategy flags the row for rejection. Equivalent numeric value of DD_REJECT is 3.
Feb122011

Server Performance Data Collection – Collectl

Server Performance Data Collection
1. Disk
2. CPU
3. Network … Etc
There are a number of times in which you find yourself needing performance data. These can include benchmarking, monitoring a system’s general heath or trying to determine what your system was doing at some time in the past. Sometimes you just want to know what the system is doing right now. Depending on what you’re doing, you often end up using different tools, each designed to for that specific situation.
Unlike most monitoring tools that either focus on a small set of statistics, format their output in only one way, run either interatively or as a daemon but not both, collectl tries to do it all. You can choose to monitor any of a broad set of subsystems which currently include buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp.
By default it shows cpu, network and disk stats in brief format. The key point of this format is all output appears on a single line making it much easier to spot spikes or other anomalies in the output:
[oracle@Mudhalvan] collectl
while writing to an NFS mounted filesystem, collectl displays interrupts, memory usage and nfs activity with timestamps. Keep in mind that you can mix and match any data and in the case of brief format you simply need to have a window wide enough to accommodate your output.
[oracle@Mudhalvan] collectl -sjmf -oT
You can also display the same information in verbose format, in which case you get a single line for each type of data at the expense of more screen real estate, as can be seen in this example of network data during NFS writes. Note how you can actually see the network traffic stall while waiting for the server to physically write the data.
[oracle@Mudhalvan] collectl -sn –verbose -oT
[oracle@Mudhalvan] collectl -sJ -oTm
Output can also be saved in a rolling set of logs for later playback or displayed interactively in a variety of formats. If all that isn’t enough there are additional mechanisms for supplying data to external tools by generating output as s-expressions, a format of choice for some tools such as supermon or in another format called list-expressions. This output can be written to a file or sent over a socket. You can even create files in space-separated format for plotting with external packages like gnuplot or colplot, part of the collectl utilities project, which provides a web-based interface to gnuplot.
Collectl runs on all linux distros (it’s included as part of Fedora) and only requires perl. If the perl Time::Hires module is installed, you will be able to use fractional intervals and display timestamps in msecs. If the Compress::Zlib module is installed the recorded data will be compressed and therefore use on average 90% less storage when recording to a file. Also note that the above links are not for RPMs. If you’d rather work with RPMs there are far too many versions out there to link to and so I’m sorry to say you’re on your own.
Did you know there was an inconsistency in the way Linux reported disk metrics that wasn’t even noticed/fixed until the 2.6-14 kernel was released? Collectl did. Or how about the fact that network stats may not accurately reported by most network monitoring tools at one second intervals? See this page for a description of the problem and how you can get more accurate stats by simply running collectl at a sub-second interval.
You can have the updated version here
Feb112011

Data Guard Background processes

Data Guard Background processes
DMON – The Data Guard Broker process.
SNP – The snapshot process.
MRP – Managed recovery process – For Data Guard, the background process that applies archived redo log to the standby database.
ORBn – performs the actual rebalance data extent movements in an Automatic Storage Management instance. There can be many of these at a time, called ORB0, ORB1, and so forth.
OSMB – is present in a database instance using an Automatic Storage Management disk group. It communicates with the Automatic Storage Management instance.
RFS – Remote File Server process – In Data Guard, the remote file server process on the standby database receives archived redo logs from the primary database.
QMN – Queue Monitor Process (QMNn) – Used to manage Oracle Streams Advanced Queuing.
Feb042011

How to set Data Protection Mode of a Primary Database

How to set Data Protection Mode of a Primary Database
Step 1 Select a data protection mode that meets your availability, performance and data protection requirements.
Step 2 Verify that redo transport is configured to at least one standby database The value of the LOG_ARCHIVE_DEST_n database initialization parameter that corresponds to the standby database must include the redo transport attributes listed for the data protection mode that you are moving to.
Maximum Protection, Availability – AFFIRM,SYNC, DB_UNIQUE_NAME
Maximum Performance – NOAFFIRM, ASYNC, DB_UNIQUE_NAME
Step 3 Verify that the DB_UNIQUE_NAME database initialization parameter has been set to a unique name on the primary and standby database.
Step 4 Verify that the LOG_ARCHIVE_CONFIG database initialization parameter has been defined on the primary and standby database and that its value includes a DG_CONFIG list that includes the DB_UNIQUE_NAME of the primary and standby database.
SQL> ALTER SYSTEM SET LOG_ARCHIVE_CONFIG=’DG_CONFIG=(MMPRY,MMSTB)’;
Step 5 Shut down the primary database and restart it in mounted mode if the protection mode is being set to Maximum Protection or being changed from Maximum Performance to Maximum Availability. If the primary database is an Oracle Real Applications Cluster, shut down all of the instances and then start and mount a single instance.
For example:
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;
Step 6 Set the data protection mode.
Execute the following SQL statement on the primary database:
SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE {AVAILABILITY | PERFORMANCE | PROTECTION};
Step 7 Open the primary database.
SQL> ALTER DATABASE OPEN;
Step 8 Confirm that the primary database is operating in the new protection mode.
SQL> SELECT PROTECTION_MODE FROM V$DATABASE;

Dec302010

What are Oracle Clusterware processes for 10g on Unix and Linux

Cluster Synchronization Services (ocssd) — Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd) — The crs process manages cluster resources (which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on) based on the resource’s configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user
Event manager daemon (evmd) —A background process that publishes events that crs creates.
Process Monitor Daemon (OPROCD) —This process monitor the cluster and provide I/O fencing. OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon) —Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.

Dec252010

DBA Interview Questions – Part 2

1) Primary database scn is 22 and standby SCN is 24 how you will resolve this issue.
2) Archive log got deleted before shipped to standby location how you will resolve this issue.
3) 3 users running same select but one of them response time is 1hr and for other few minutes how u will diagnose this.
4) What is advantage of ASM over os storage and raw storage.
5) What is remastering in RAC
6) How you can see database version at OS level.
7) A job is running since long it was working fine earlier how you will check that
8) How you will suggest joins to developer
9) 3 databases sizes 10gb, 100gb, 1TB how which database will take more time.
10) You lost control file database is using RMAN backup with nocatalog. If you loose whole db as well same time how you recover database.
11) what is sync, and async in standby how it works.
12) Shut immediate is taking too much time how you diagnose what is the issue.
13) How you take backup of OCR.
14) What will happen if voting disk is down.
15) What will happen if virtual IP is down in RAC.
VIP – Virtual IP address in RAC
VIP is mainly used for fast connection in failover.
Until 9i RAC faileover we used physical IP address of another server. When the connection request come from a client to server, then failure of first server listener then RAC redirect the connection request to second available server using physical IP address. Hence it is physical IP address rediretion to second physical IP address is possible only after we get timeout error from First Physical IP address. So connection should wait a while for getting TCP connection timeout.
From RAC 10g we can use the VIP to save connection timeout wait, Because ONS (Oracle Notification Service) maintains communication between each nodes and listeners. Once ONS found any listener down or node down, it will notify another nodes and listeners. While new connection is trying to establish connection to failure node or listener, virtual IP of failure node automatically divert to surviving node. This Process will not wait for TCP/IP timeout event. So new connection will be faster even one listener/node failed.
Nov252010

RMAN Interview Questions

RMAN Interview Questions
1) if one archive deleted before applying on standby and u do not have backup how u resolve this issue?. Database size is in TB so recreating Standby is not good idea.
2) if block change tacking is enabled on production db will it be enabled on cloned db automatically?
3) You have assigned 3-4 channels in rman backup how u will identified whether all are utilized or not?
4) how u do performance tuning for rman backups?
5) can we register manual cold backup in rman?
6) can we recreate oracle inventory if we do not have any backup?
7) what is diff in imagecopy & normal backup in terms of performance
8) 10K blocks in ur db got corrupted how u will resolve this

Jul012010

Explain the relationship among database, tablespace, and datafile.

The database is logically divided into one or more tablespaces.
For each tablespace we explicitly create one or more datafiles.
Jun282010

How to recover if we delete a data file at OS level no backup for that?

bring the database to the mount state
sql>alter database create datafile ;
sql>recover datafile ;
sql>alter database open;

Jun222010

How do you see how many instances are running?

In Linux, Unix the command: ps -ef|grep pmon
In Windows: services.msc

1 comment:

  1. Informatica interview questions are crucial for assessing candidates' VPN Movies Lag skills in data integration and ETL (Extract, Transform, Load).

    ReplyDelete