Friday, June 15, 2012

1Z0-233 : Oracle Apps DBA Certification Questions


It's time to crack the 11i certification (check our previous post here) and rush towards the R12 certification.
       I believe basic knowledge (which script does what, which table is used for what purpose, and how to use the ad* utilities) plus some hands-on practice is enough to clear the Oracle Apps DBA certification track.
So let's start with some basic things.
Patching:
X. Which utility is used to apply an application patch?
"adpatch"
X. What are the various options available with adpatch?
options=nocheckfile, noautoconfig, nocompiledb, nocompilejsp, nocopyportion, nodatabaseportion, nogenerateportion, hotpatch, integrity, prereq, noprereq, nomaintainmrc, validate, nojcopy, noforcecopy, forcecopy, nolink, nogenform, nogenrep
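For example, a non-interactive hotpatch run that skips compiling invalid objects and JSPs might look something like this (the patch number, driver name and defaults file are placeholders, not from the original post):
$ adpatch defaultsfile=$APPL_TOP/admin/$TWO_TASK/adalldefaults.txt \
          logfile=u1234567.log patchtop=$PATCH_TOP/1234567 driver=u1234567.drv \
          workers=8 interactive=no options=hotpatch,nocompiledb,nocompilejsp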
X. Which utility is used to apply a database patch?
"opatch"
X. The .log and .lgi files for adpatch will be under the $APPL_TOP/admin/<SID>/log directory.
X. What are .lgi files for? They are informational log files: adpatch records informational messages there (for example, actions it skipped because the files were already up to date), while the main activity and errors go to the .log file.
X. Before applying a patch, put the system into maintenance mode by running
$ adadmin and then selecting option 5 (Change Maintenance Mode), then choosing Enable/Disable Maintenance Mode. (You can also toggle maintenance mode using $AD_TOP/patch/115/sql/adsetmmd.sql.)
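A minimal sketch of toggling maintenance mode from the command line with adsetmmd.sql, which takes ENABLE or DISABLE as its argument (the APPS password is a placeholder):
$ sqlplus apps/<apps_pwd> @$AD_TOP/patch/115/sql/adsetmmd.sql ENABLE
  ... apply the patch with adpatch ...
$ sqlplus apps/<apps_pwd> @$AD_TOP/patch/115/sql/adsetmmd.sql DISABLE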
X. By default, where does the adadmin log file go?
$APPL_TOP/admin/$SID/log/adadmin.log (adadmin prompts for the log file name; adadmin.log is the default).
-
Workers:
X. How do you check what adpatch is doing?
"adctrl" is the utility used to check the status of the ad workers, including those started by adpatch.
X. What are the various stages of an adpatch worker?
Workers start out in the "waiting" stage, then jobs are "assigned" and become "running". If no error occurs the worker moves to "completed" (waiting -> assigned -> running -> completed).
If an error occurs the flow is waiting -> assigned -> running -> failed -> fixed -> restarted -> completed.
X. What is the default number of workers?
Two times the number of CPUs on the database server.
-
Deferred jobs
Have you ever seen a worker status of "deferred" while using adctrl?
Once you start applying a patch, the manager assigns jobs and a unique ID to each worker. The manager also inserts one row into the FND_INSTALL_PROCESSES table for each worker, with information about who is doing what.
Two columns of FND_INSTALL_PROCESSES are important for us: 1) CONTROL_CODE and 2) STATUS. The manager monitors the workers through these two columns to track the status of the assigned jobs. Once a worker is done with its first job, the manager assigns it the next job and updates the status to <assigned>.
Once every worker's status is completed for all the jobs, the manager tells the workers to shut down and drops the FND_INSTALL_PROCESSES table.
X. What happens if a worker is working on a particular job, say updating some table, but that table is locked? What does the worker do in that situation?
This is where the AD_DEFERRED_JOBS table comes in. It is created at the same time as FND_INSTALL_PROCESSES and is dropped along with it. The first time a worker finds the table locked, that job fails, the manager automatically defers it, and a new job is assigned to the worker.
    A job waiting in AD_DEFERRED_JOBS will be assigned again until it either completes or finally fails.
In this situation the worker simply waits until the lock is released or the timeout is reached, and we (the DBAs) need not take any action until the job status becomes failed. Once it shows as failed we can fix the problem and restart the worker.
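A quick way to peek at what each worker is doing while a patch runs is to query the table directly (a sketch, assuming a connection as APPS; adctrl shows the same information interactively):
SQL> select worker_id, status, control_code
     from fnd_install_processes
     order by worker_id;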
X. Why is the APPS schema present in the 11i application?
It reduces traffic, because all the product schemas grant full privileges to the APPS schema, so APPS has full access to the complete Oracle application.
Products such as AP and AR own the code for their respective modules, and APPS has access to all of these code objects too. APPS also has synonyms for all the base tables and sequences, for example:
OBJECT_NAME                     OBJECT_TYPE        STATUS  OWNER
------------------------------- ------------------ ------- ------
BEN_PL_PCP_S                    SYNONYM            VALID   APPS
BEN_PL_PCP_TYP_S                SYNONYM            VALID   APPS
BOM_CTO_ORDER_DEMAND_S1         SYNONYM            VALID   APPS
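A sketch of the kind of query that produces a listing like the one above (object names will vary from instance to instance):
SQL> select object_name, object_type, status, owner
     from dba_objects
     where owner = 'APPS' and object_type = 'SYNONYM' and rownum <= 3;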
File System
X. What directories will be there under any "product" top? (To know more about the various tops in 11i, click here.)
cd $AP_TOP (AP is one example of a product top; every product top - AD, AP, GL, and so on - has the same directories underneath.)
admin  bin  forms  help  html  lib  log  mds  media  mesg  out  patch  reports  sql  xml
Here the "forms" directory contains all the .fmx files (the compiled versions of the forms).
X. Where are the .fmb (form source) files stored?
They are under $AU_TOP/forms/US.
Note: .fmb and .fmx files are different.
X. When you run f60gen to compile a form, the .fmx is placed under $PRODUCT_TOP/forms/US/ of the form's respective product.
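A minimal sketch of regenerating a single form with f60gen (the form name and APPS password are placeholders; adadmin's Generate Applications Files menu drives the same utility):
$ f60gen module=$AU_TOP/forms/US/APXINWKB.fmb userid=apps/<apps_pwd> \
         output_file=$AP_TOP/forms/US/APXINWKB.fmx \
         module_type=form batch=yes compile_all=special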
X. What is stored in the "mesg" directory under each product top?
"mesg" contains the language-specific message files and error messages for the product.
X. What is stored in the "bin" directory under each product top?
The "bin" directory contains executable files; for example, in $AP_TOP/bin you will see APPBCF  APTZGF  apxamex.ctl  apxboav.ctl  apxdiner.ctl  apxgecmc.ctl  apxusbv.ctl  APXXTR
X. How do you compile the APPS schema, and when should you compile it?
You can use the "adadmin" utility to compile the APPS schema (other methods, such as utlrp.sql, also exist). Usually you compile APPS after applying patches or a maintenance pack, after an upgrade, after runtime errors from the AD_DDL package, or whenever there are invalid objects in the APPS schema.
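A sketch of the utlrp.sql alternative, run as SYSDBA (note that it recompiles all invalid objects in the database, not just those owned by APPS):
SQL> @?/rdbms/admin/utlrp.sql
SQL> select owner, count(*)
     from dba_objects
     where status = 'INVALID'
     group by owner;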
X. How many database connections are allowed during a fresh installation of Oracle Applications?
100.
-
ADADMIN :- day to day use

AD Administration Main Menu
   1.    Generate Applications Files menu
   2.    Maintain Applications Files menu  -- snapshot details
   3.    Compile/Reload Applications Database Entities menu  -- compile the APPS schema (invalid objects or runtime errors)
   4.    Maintain Applications Database Entities menu
   5.    Change Maintenance Mode  -- while patching
   6.    Exit AD Administration
 Maintain Snapshot Information
   1.    List snapshots
   2.    Update current view snapshot
   3.    Create named snapshot
   4.    Export snapshot to file
   5.    Import snapshot from file
   6.    Delete named snapshot(s)
Hence a question on maintaining snapshots could be: "What can Maintain Snapshot Information do?"
a. You can create a copy of an existing snapshot.
b. You can list the current view snapshot and the named snapshots stored in your system.
c. You can update the current view snapshot with any changes since the last update.
d. You can export an existing snapshot to a file, for storage or for use on another system.
e. You can delete the current view snapshot.
And the answer is: you can export/import snapshots, delete named snapshots (you can't delete the current view snapshot), update the current view snapshot, create a named snapshot, and list all the existing snapshots.

DATAPUMP 2


Directory:
In the Data Pump syntax we find DIRECTORY. What does this directory mean? Is it an OS-level directory?
Well, you could say that here "directory" means a database directory object.
We need to create a database-level directory object as well as the OS-level directory.
How and why?  ==>>  Data Pump is server based, not client based. Server based means that Data Pump creates all its dump files on the server, not on the client machine.
Oracle background processes are responsible for all the I/O operations on the dump files.
With the old export/import we were able to write the dump file wherever we could find space, but in 10g we have a specific location, which is pointed to by this directory object.
You can't let just any user specify an absolute file path on the server, so the directory object also prevents users from overwriting server files (which would be possible if you could write a dump file anywhere).
Data Pump maps to a specific OS directory through this directory object.
Example
SQL> desc dba_directories;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                     NOT NULL VARCHAR2(30)
 DIRECTORY_NAME                            NOT NULL VARCHAR2(30)
 DIRECTORY_PATH                                     VARCHAR2(4000)

SQL> select * from dba_directories where DIRECTORY_NAME like '%DATA_PUMP%';

OWNER     DIRECTORY_NAME    DIRECTORY_PATH
--------- ----------------- --------------------------------------------
SYS       DATA_PUMP_DIR     /u01/oracle/product/TEST/10.2.0/rdbms/log/
===>> Want to create the directory right now and then do a Data Pump export?
SQL> create directory test_dir1 as '/u01/oracle/product/TEST/10.2.0/export';
Directory created.
Then where is the security, if any database user can create a directory? ===> To create a directory you need the DBA role OR the "CREATE ANY DIRECTORY" privilege.
Now, if you want to grant user priya privileges on the newly created directory test_dir1:
SQL> GRANT READ, WRITE ON DIRECTORY test_dir1 TO priya;
Grant succeeded.
To do a Data Pump export you need the WRITE privilege on the directory, and for import you need READ access to the export dump files.
Is READ alone sufficient for import? Nope: you need to write the log file too, so you should also have the WRITE privilege on the directory for import ;-)
Once the directory is created, all Data Pump export and import jobs can use the DIRECTORY parameter to specify the name of the directory object
(DIRECTORY=test_dir1); thus DIRECTORY points at the OS-level directories and files.
$ expdp priya/passwd directory=test_dir1 dumpfile=test_dir_export.dmp
If you have already created the default directory with the name DATA_PUMP_DIR, then there is no need to specify the DIRECTORY parameter in the export/import commands.
Oracle will automatically look for the directory named by DATA_PUMP_DIR.
SQL> CREATE DIRECTORY TEST_DIR2 AS '/u01/oracle/product/10.2.0/export';
Directory created.
$ export DATA_PUMP_DIR=TEST_DIR2
$ expdp priya/passwd tables=emp dumpfile=emp.dmp
$ expdp priya/passwd tables=emp logfile=test_dir2:priya.log
Here test_dir2 is the directory and priya.log is the Data Pump file name; the colon (:) separates the two values. (A Data Pump file can be a dump file, log file or SQL file.)
Some practical examples:
1) Import of tables from scott's account into priya's account.
Using the old import utility:
$ imp username/password FILE=scott.dmp FROMUSER=scott TOUSER=priya TABLES=(*)
Data Pump import:
$ impdp username/password DIRECTORY=test_dir1 DUMPFILE=scott.dmp
TABLES=scott.emp REMAP_SCHEMA=scott:priya
Here FROMUSER/TOUSER is replaced with REMAP_SCHEMA, so the EMP table owned by scott is imported into the priya schema.
2) Export of an entire database to a dump file with all GRANTS, INDEXES, and data:
$ exp username/password FULL=y FILE=dba.dmp GRANTS=y INDEXES=y ROWS=y
$ expdp username/password FULL=y INCLUDE=GRANT INCLUDE=INDEX DIRECTORY=test_dir1 DUMPFILE=dba.dmp CONTENT=ALL
The INCLUDE parameter allows you to specify which objects you want to keep in the export job. The EXCLUDE parameter allows you to specify which objects you want to keep out of the export job.
           You cannot mix the two parameters in one job. Both parameters work with Data Pump import as well.
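For example, a sketch of a schema-level export that leaves out optimizer statistics and object grants (the schema name is a placeholder; name-based filters such as EXCLUDE=TABLE:"IN ('EMP')" are usually easier to put in a parameter file to avoid shell quoting problems):
$ expdp priya/passwd schemas=priya directory=test_dir1 dumpfile=priya_noextras.dmp exclude=statistics exclude=grant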
REMAP
• REMAP_TABLESPACE - This allows you to easily import a table into a different tablespace from the one it was originally exported from. The databases have to be 10.1 or later.
Example:
$ impdp username/password REMAP_TABLESPACE=tbs_1:tbs_6 DIRECTORY=dpumpdir1 DUMPFILE=employees.dmp
• REMAP_DATAFILE - This is a very useful feature when you move databases between platforms that have different file naming conventions. This parameter changes the source datafile name to the target datafile name in all SQL statements where the source datafile is referenced. Because the REMAP_DATAFILE value uses quotation marks, it's best to specify the parameter within a parameter file.
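A sketch of such a parameter file and the call that uses it (all file names, the directory and the dump file are placeholders):
$ cat remap.par
DIRECTORY=test_dir1
DUMPFILE=full.dmp
FULL=Y
REMAP_DATAFILE="'C:\ORADATA\users01.dbf':'/u01/oradata/users01.dbf'"
$ impdp username/password parfile=remap.par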

DATAPUMP


Data Pump is a 10g new feature used to do export/import, which is also described as unloading/loading data. There are a couple of drawbacks compared to the normal export/import:
1) Data Pump export/import jobs need more startup time than the old export and import utilities, so for small jobs it is better to use the normal export/import utility.
2) Data Pump exports data only to disk, not to tape.
-
Now you will have the question: why use Data Pump export/import at all?
1) Oracle claims that Data Pump decreases total export time and that import can be 15 to 30 times faster than the normal import method.
2) You can estimate the job beforehand, i.e. how much space and time the export or import will need (see the sketch after this list).
3) You can directly load one database from a remote instance.
4) You can select exactly which objects get exported or imported (fine-grained object selection).
5) So, all in all, it is a superset of the old export/import utility.
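A sketch of point 2: estimating the size of a schema export without writing a dump file (the schema name is a placeholder; the elapsed time of the estimate itself gives a rough feel for how long the real job will run):
$ expdp priya/passwd schemas=priya estimate_only=y estimate=blocks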
The commands used to run Data Pump are:
$ expdp username/passwd (various parameters)
$ expdp help=y
The Data Pump export utility unloads the data into operating system files known as dump files. It writes these files in a proprietary format which only the Data Pump import utility understands when loading the data from the dump files into the same or another database.
==> You can take Data Pump export dump files from one operating system and import them into a database running on a different type of platform (i.e. from Linux to Solaris).
So when you export using Data Pump you cannot import the dump with the old import utility.
Mechanics of Data Pump:
To perform a complete Data Pump job it uses several processes which are new background processes in 10g: the master process, the worker processes and the shadow processes.
Master process (MCP): the master control process, recognized as DMnn; on the server it appears as <instance>_DMnn_<pid>.
The MCP:
1) creates jobs and controls them
2) creates and manages the worker processes
3) monitors the jobs and logs the progress
4) maintains the job state and restart information in the master table
5) manages the necessary files, including the dump file set.
So here comes the question: what is the master table?
The MCP uses a special table called the master table (MT) to log the location of the various database objects in the export dump file. The MCP maintains information such as the job state and restart information in the master table.
Job state?? Restart??
Yes: you can temporarily stop a running export or import job and then, after making some alterations, restart it.
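A sketch of stopping and restarting a job from the interactive prompt (the job name priyaexp is a placeholder; you can press Ctrl-C in the running expdp session or attach from another terminal):
$ expdp priya/passwd attach=priyaexp
Export> stop_job=immediate
$ expdp priya/passwd attach=priyaexp
Export> start_job
Export> continue_client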
Where does this master table reside? Oracle creates the master table in the schema of the user running the Data Pump job, at the beginning of every export job.
So the master table holds information such as the parameters of the export job, the status of all worker processes, and the location of the objects inside the dump file set.
Here the dump file plays the same role as the old export/import dump file, and we can also have a log file which records the complete operation. One more file used by Data Pump is the SQLFILE.
SQLFILE ???? -- Yup. Data Pump import has a special parameter called SQLFILE which writes all the DDL statements it would execute during the import job to a file.
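A sketch of SQLFILE usage: nothing is actually imported, the DDL is just written out for review (file names and directory are placeholders):
$ impdp priya/passwd directory=test_dir1 dumpfile=scott.dmp sqlfile=scott_ddl.sql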
The job name and the master table name will be the same.
So what could be the job name? We can use the JOBNAME parameter to provide our own job name (JOBNAME=priyaexp). This parameter is optional; if you don't specify it, Data Pump generates a unique system name of the form <user>_<operation>_<mode>_%N.
Suppose the username is SYSTEM and you are doing a full database export; then the name will be SYSTEM_EXPORT_FULL_01.

It's just a simple change, right?

Putting a database in archivelog mode? Set the destination and the archive log format, shutdown immediate, startup mount, alter database archivelog, alter database open, alter system archive log current. Done.
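For reference, a minimal sketch of those steps (the destination and format values below are illustrative, not the exact ones from this environment):
SQL> alter system set log_archive_dest_1='LOCATION=/u02/archive/ORCL' scope=spfile;
SQL> alter system set log_archive_format='ORCL_%t_%s_%r.dbf' scope=spfile;
SQL> shutdown immediate
SQL> startup mount
SQL> alter database archivelog;
SQL> alter database open;
SQL> alter system archive log current;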


Except, alter system archive log current failed:


ORA-19504: failed to create file "/u02/archive/ORCL/ORCL_1_1_686533687.dbf"
ORA-27040: file create error, unable to create file
Linux Error: 13: Permission denied




Huh, that's OK. This is an Oracle EBS environment and I had created the archive destination directory with the applmgr user, not oracle. No problem: su - root, chown oracle:dba . , done.


alter system archive log current again, success!


Can't remember what query I issued next (it's been a hazy night) but I was greeted with the following error:


Errors in file /u01/oracle/ORCL/db/tech_st/10.2.0/admin/ORCL_myserver/bdump/orcl_smon_10547.trc:
ORA-00376: file 5 cannot be read at this time
ORA-01110: data file 5: '/u01/oracle/ORCL/db/apps_st/data/system05.dbf'




WTF? Just some background about this environment: I had only cloned it the day before and I didn't have any backups. I was putting the database in archivelog mode so I could take backups.


This error typically means the datafile is offline, though I'm not sure why it happened here. So I bring it online:


alter database datafile 5 online;
recover datafile 5;




But:


ORA-00283: recovery session canceled due to errors
ORA-00264: no recovery required




Hrmm.. ok, so I try to open the database:
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel




WTF? Check the alert log:


Errors in file  /u01/oracle/ORCL/db/tech_st/10.2.0/admin/ORCL_rc1657/bdump/uat2_lgwr_11557.trc:
ORA-00600: internal error code, arguments: [3712], [1], [1], [1388],  
[3796296244], [1388], [3796296241], []




Lovely. At this point I have 3 options:


1. Work with Oracle Support.
2. Rebuild the environment since I don't have any backups. Developers lose 1 day.
3. Back it up, research the error and see if I can find a solution.


I worked with Oracle support up until 6:30am, at which point I believe my analyst was finished for the day and went home. This is a UAT environment, so the SR is a P2. I wasn't expecting 24x7 support, so this was just bad timing.


I searched Metalink and didn't find any solid hits. So I tried Google. Not many hits, but I received one exact match, in Chinese. Google graciously offered to translate the page for me, and about the only useful text on the page was:


"Who came across the internal bug? 我用了alter database clear logfile group 后再重启动数据库就发生这个错误了。 I used the alter database clear logfile group restart the database after this error has occurred. "


At this point I had an hour left before testers started and only 2 options remaining. I figured I was going to have to rebuild the database anyway, so I'd try clearing the log files. I cleared the inactive groups without issue. The only group remaining was the current one, and if the clear logfile group command failed I'd have to proceed with the rebuild.


From the alert log:
alter database clear unarchived logfile group 1
Tue May 12 08:14:34 2009
Thread 1 advanced to log sequence 6 (preparation to clear logfile)
WARNING! CLEARING REDO LOG WHICH HAS NOT BEEN ARCHIVED. BACKUPS TAKEN
    BEFORE 05/12/2009 08:14:34 (CHANGE 1) CANNOT BE USED FOR RECOVERY.
Clearing online log 1 of thread 1 sequence number 5
Tue May 12 08:17:34 2009
Completed: alter database clear unarchived logfile group 1




Success!!!


Alter database open; Success!!!




That's my 6 hr simple change (most of that time was working with Oracle Support, taking a backup [only 5MB/s IO??!] of the environment before I played around with it, etc.). Now I need some sleep. You know you're tired when you almost use dish washing liquid for cream in your coffee. Luckily an attentive coworker prevented disaster! Normally an all-nighter doesn't leave me this drained, but the previous night I only had 3 hrs sleep.

Friday, March 20, 2009


Monitoring your RMAN backup scripts

There is an easy way to monitor your RMAN backup scripts in a unix environment. I've seen quite a few DBAs log the rman output to a file and then grep it for errors. Ex.


rman target / CMDFILE /home/oracle/scripts/rman.sql LOG $logfile
errors=`cat $logfile | grep -i error | wc -l`
if test $errors -gt 0
then
   mailx -s "BACKUP FAILED" email1@myserver.com email2@myserver.com < $logfile
fi
Another method would be to examine the return code from the rman client. Rman returns a value of 0 to the shell if the backup was successful. Otherwise it returns a nonzero value and from what I have seen this is always 1. (Have you seen any other values?)


So another way of writing the above script would be


rman target / CMDFILE /home/oracle/scripts/rman.sql LOG $logfile
status=$?

if [ $status -gt 0 ] ; then
  mailx -s "BACKUP FAILED: TESTDB" email1@myserver.com < $logfile
else
  mailx -s "Successful Backup: TESTDB" email1@myserver.com < $logfile
fi




As you can see above, after rman executes I set a status variable; $? is automatically set to the return code of the rman client. After that I check the value, and if there was an error I send an email saying the backup failed, with the contents of the log file in the body of the message. If the backup was successful, I still send a copy of the logfile.


The reason I do this is because each day I will expect to see either a successful or failed email. I've had issues in the past with cron not executing scheduled jobs (for numerous reasons). So if I don't receive an email I'll verify the backup manually.

Friday, January 16, 2009


Recovery Scenario – Losing a Redo Log file.

I'm in the process of studying for the OCP exam and I personally find it much easier to remember something if I write it down. What better way to do this than creating a blog post? Unfortunately it's some pretty basic stuff, so it might not garner much interest.
My test environment has the following redo log configuration:
SQL> select group#, sequence#, bytes/1024/1024 as "Size (MB)", members, status
  2  from v$log;

    GROUP#  SEQUENCE#  Size (MB)    MEMBERS STATUS
---------- ---------- ---------- ---------- ----------------
         1          5         50          2 INACTIVE
         2          6         50          2 CURRENT
         3          4         50          2 INACTIVE

SQL> select group#, member
  2  from v$logfile
  3  order by 1;

    GROUP# MEMBER
---------- ----------------------------------------
         1 /home/oracle/oradata/orcl/redo01.log
         1 /home/oracle/oradata/orcl/redo01a.log
         2 /home/oracle/oradata/orcl/redo02a.log
         2 /home/oracle/oradata/orcl/redo02.log
         3 /home/oracle/oradata/orcl/redo03a.log
         3 /home/oracle/oradata/orcl/redo03.log

As you can see, there are 3 redo log groups with 2 members each.   In order to simulate a failure I am going to remove one of the members.






SQL> !rm /home/oracle/oradata/orcl/redo01a.log

SQL> alter system switch logfile;

System altered.

SQL> /

System altered.






Oracle will not detect a problem until it tries to use the missing redo log file, so I had to issue two logfile switches before redo01a.log was accessed. At that point the following errors show up in the alert log.




Errors in file /home/oracle/product/10.2.0/db_1/admin/orcl/bdump/orcl_lgwr_2648.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/home/oracle/oradata/orcl/redo01a.log'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
Thu Jan 15 15:40:44 2009
Errors in file /home/oracle/product/10.2.0/db_1/admin/orcl/bdump/orcl_lgwr_2648.trc:
ORA-00321: log 1 of thread 1, cannot update log file header
ORA-00312: online log 1 thread 1: '/home/oracle/oradata/orcl/redo01a.log'
Thu Jan 15 15:40:45 2009
Errors in file /home/oracle/product/10.2.0/db_1/admin/orcl/bdump/orcl_lgwr_2648.trc:
ORA-00313: open failed for members of log group 1 of thread 1





As well, if we take another look at v$logfile, the missing redo log will have a status of INVALID:








SQL>

    GROUP# MEMBER                                   STATUS
---------- ---------------------------------------- -------
         3 /home/oracle/oradata/orcl/redo03.log
         2 /home/oracle/oradata/orcl/redo02.log
         1 /home/oracle/oradata/orcl/redo01.log
         1 /home/oracle/oradata/orcl/redo01a.log    INVALID
         2 /home/oracle/oradata/orcl/redo02a.log
         3 /home/oracle/oradata/orcl/redo03a.log






In this particular case, the loss of the redo01a.log file is not critical. Since the group is multiplexed (more than one member), Oracle can operate without impacting users. Each time Oracle tries to use the missing redo log file, the errors will be displayed again in the alert log.




There are two methods to recover from this scenario:





  1. Issue the CLEAR LOGFILE command.


  2. Drop the missing member and add a new one.




Which one should you choose? That depends on the circumstances surrounding why the redo log is missing. The CLEAR LOGFILE command reinitializes the logfile group, similar to dropping and adding each member. If the failure is due to a corrupted disk, then this command will fail as it recreates the group in place. If this is the case, you will need to use the second method of dropping and adding a new member so you can specify a new location.




Let's try issuing the CLEAR LOGFILE command:






SQL> alter database clear logfile group 1;

Database altered.











That's it, everything is back to normal.




Let's try the other method of dropping and adding a new member:








SQL> !rm /home/oracle/oradata/orcl/redo01a.log

SQL> alter database drop logfile member '/home/oracle/oradata/orcl/redo01a.log';

Database altered.

SQL> alter database add logfile member '/home/oracle/oradata/orcl/redo01a.log' to group 1;

Database altered.




 




So, what happens if you lose all members of a redo log group? The steps you have to take depend on the group's status and whether or not the database is in archivelog mode.




INACTIVE:

  • If the affected redo log group has a status of INACTIVE it is no longer required for crash recovery. If you are in NOARCHIVELOG mode, issue either CLEAR LOGFILE or recreate the group manually.
  • If you are in ARCHIVELOG mode and the group has been archived, use either of the steps above.
  • If you are in ARCHIVELOG mode and the group hasn't been archived, then issue CLEAR UNARCHIVED LOGFILE (see the sketch after this list). If you don't specify the UNARCHIVED keyword you will receive an error. After you execute this command you will have a gap in your archivelogs, so perform a complete backup (including the control file) ASAP.
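A sketch of that last case (the group number is just an example; take a fresh backup immediately afterwards):
SQL> alter database clear unarchived logfile group 1;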




ACTIVE:

  • If the redo log group has a status of ACTIVE, it is still required for crash recovery. Issue the command ALTER SYSTEM CHECKPOINT; if it succeeds, then follow the steps above for INACTIVE.
  • If the checkpoint fails, then you need to perform recovery. If you are in NOARCHIVELOG mode you need to perform a complete recovery from the last cold backup.
  • If the checkpoint fails and you are in ARCHIVELOG mode, you can perform an INCOMPLETE recovery up to the previous log file.




CURRENT:

The current redo log group is the one Oracle is currently writing to. If you lose this group or its files become corrupted, LGWR may terminate. For example:




Corrupted files:




Fri Jan 30 14:14:55 2009
Errors in file /apps/oravis/db/tech_st/10.2.0/admin/test/bdump/test_lgwr_25599.trc:
ORA-00316: log 1 of thread 1, type 8944 in header is not log file
ORA-00312: online log 1 thread 1: '/oradata/dbf2/testdata/test/redo01.log'
Fri Jan 30 14:14:55 2009
LGWR: terminating instance due to error 316
Instance terminated by LGWR, pid = 25599





Missing files:




Fri Jan 30 15:04:34 2009
Errors in file /apps/oravis/db/tech_st/10.2.0/admin/test/bdump/test_lgwr_1434.trc:
ORA-00320: cannot read file header from log 1 of thread 1
ORA-00312: online log 1 thread 1: '/oradata/dbf2/testdata/test/redo01.log'
ORA-27091: unable to queue I/O
ORA-27072: File I/O error
SVR4 Error: 11: Resource temporarily unavailable
Additional information: 4
Additional information: 1
LGWR: terminating instance due to error 320
Instance terminated by LGWR, pid = 1434












  • If LGWR terminates, you have to recover the database:
    • If you are in ARCHIVELOG mode, perform an incomplete recovery up to the previous log file.
    • If you are in NOARCHIVELOG mode, perform a complete recovery from the last cold backup.
  • If the database is still online, you can try to clear the logfile group. If the command is not successful then you will need to recover.

Thursday, May 15, 2008


Refreshing an 11i Database using Rman

In order to facilitate troubleshooting we maintain a test environment which is a nightly copy of our 11i production environment. Since this environment is usually used to test data fixes it has to be as up to date as possible. To perform the database refresh we use rman's duplicate feature.


The goal of this article isn't to provide the entire set of scripts and send you on your way. I think it's safe to say that most EBS environments aren't identical, so it's not like you could take them and execute with no issues. Instead I'll highlight the steps we follow and some of the key scripts.


NOTE: This doesn't include any pre-setup steps such as, if this is the first time duplicating the database, making sure you have the parameters db_file_name_convert and log_file_name_convert specified in your test environment's init file.

  • Step 1: Shutdown the test environment. If you are using 10g then remove any tempfiles; in 10g rman now includes tempfile information, and if the files already exist you will encounter errors (check this previous post). Startup the database in nomount mode.
  • Step 2: Build a Rman Script. There are a couple of ways to recover to a point in time and we have decided to use SCN numbers. Since this process needs to be automated, we query productions rman catalog and determine the proper SCN to use and build an rman script. Here it is:
    set feedback off
    set echo off
    set serverout on
    spool $HOME/scripts/prod_to_vis.sql
    declare
    vmax_fuzzy number;
    vmax_ckp number;
    scn number;
    db_name varchar2(3) := 'VIS';
    log_file_dest1 varchar2(30) := '/dbf/visdata/';
    begin
    select max(absolute_fuzzy_change#)+1,
         max(checkpoint_change#)+1
         into vmax_fuzzy, vmax_ckp
    from rc_backup_datafile;
    if vmax_fuzzy > vmax_ckp then
    scn := vmax_fuzzy;
    else
    scn := vmax_ckp;
    end if;
    dbms_output.put_line('run {');
    dbms_output.put_line('set until scn '||to_char(scn)||';');
    dbms_output.put_line('allocate auxiliary channel ch1 type disk;');
    dbms_output.put_line('allocate auxiliary channel ch2 type disk;');
    dbms_output.put_line('duplicate target database to '||db_name);
    dbms_output.put_line('logfile group 1 ('||chr(39)||log_file_dest1||'log01a.dbf'||chr(39)||',');
    dbms_output.put_line(chr(39)||log_file_dest1||'log01b.dbf'||chr(39)||') size 10m,');
    dbms_output.put_line('group 2 ('||chr(39)||log_file_dest1||'log02a.dbf'||chr(39)||',');
    dbms_output.put_line(chr(39)||log_file_dest1||'log02b.dbf'||chr(39)||') size 10m;}');
    dbms_output.put_line('exit;');
    end;
    /
    spool off;
    
    This script produces a spool file called prod_to_vis.sql:
    
    run {                                                                        
    set until scn 14085390202;                                                   
    allocate auxiliary channel ch1 type disk;                                    
    allocate auxiliary channel ch2 type disk;                                    
    duplicate target database to VIS                                             
    logfile group 1 ('/dbf/visdata/log01a.dbf',                          
    '/dbf/visdata/log01b.dbf') size 10m,                                 
    group 2 ('/oradata/dbf/visdata/log02a.dbf',                                  
    '/dbf/visdata/log02b.dbf') size 10m;}                                
    exit; 
    Note: Our production nightly backups are on disk which are NFS mounted to our test server.
  • Step 3: Execute the rman script. Launch rman, connect to the target, catalog, auxiliary and execute the script above:

    ie.
    rman> connect target sys/syspasswd@PROD catalog rmancat/catpasswd@REPO auxiliary /


    You may want to put some error checking around rman to alert you if it fails. We have a wrapper script which supplies the connection information and calls the rman script above. Our refresh is critical so if it fails we need to be paged.
    rman @$SCRIPTS/prod_to_vis.sql
    if [ $? != 0 ]
    then
         echo Failed
         echo "RMAN Dupcliate Failed!"|mailx -s "Test refresh failed" pageremail@mycompany.com
         exit 1
    fi
  • Step 4: If production is in archivelog mode but test isn't, then mount the database and alter database noarchivelog;
  • Step 5: If you are using a hotbackup for cloning then you need to execute adupdlib.sql. This updates libraries with correct OS paths. (Appendix B of Note:230672.1)
  • Step 6: Change passwords. For database accounts such as sys, system and other non-applications accounts, change the passwords using alter user. For applications accounts such as apps/applsys, the module schemas, sysadmin, etc., use FNDCPASS to change their passwords.

    ie. To change the apps password:

    FNDCPASS apps/<production apps passwd> 0 Y system/<system passwd> SYSTEM APPLSYS <new apps passwd>
  • Step 7: Run autoconfig.
  • Step 8: Drop any database links that aren't required in the test environment, or repoint them to the proper test environments.
  • Step 9: Follow Section 3: Finishing Tasks of Note:230672.1
    • Update any profile options which still reference the production instance.

      Example:

      UPDATE FND_PROFILE_OPTION_VALUES SET
      profile_option_value = REPLACE(profile_option_value,'PROD','TEST')
      WHERE profile_option_value like '%PROD%';


      Specifically check the FND_PROFILE_OPTION_VALUES, ICX_PARAMETERS, WF_NOTIFICATION_ATTRIBUTES and WF_RESOURCES tables and look for production hostnames and ports. We also update the forms title bar with the date the environment was refreshed:

      UPDATE apps.FND_PROFILE_OPTION_VALUES SET
      profile_option_value = 'TEST:'||' Refreshed from '||'Production: '||SYSDATE
      WHERE profile_option_id = 125
      ;
    • Cancel Concurrent requests. We don't need concurrent requests which are scheduled in production to keep running in test. We use the following update to cancel them. Also, we change the number of processes for the standard manager.

      update fnd_concurrent_requests
      set phase_code='C',
      status_code='D'
      where phase_code = 'P'
      and concurrent_program_id not in (
      select concurrent_program_id
      from fnd_concurrent_programs_tl
      where user_concurrent_program_name like '%Synchronize%tables%'
      or user_concurrent_program_name like '%Workflow%Back%'
      or user_concurrent_program_name like '%Sync%responsibility%role%'
      or user_concurrent_program_name like '%Workflow%Directory%')
      and (status_code = 'I' OR status_code = 'Q');

      update FND_CONCURRENT_QUEUE_SIZE
      set min_processes = 4
      where concurrent_queue_id = 0;
  • Step 10: Perform any custom/environment specific steps. We have some custom modules which required some modifications as part of cloning.
  • Step 11: Startup all of the application processes. ($S_TOP/adstrtal.sh)

    NOTE: If you have an application tier you may have to run autoconfig before starting up the services.


Hopefully this article was of some use even tho it was pretty vague at times. If you have any questions feel free to ask. Any corrections or better methods don't hesitate to leave a comment either.

Thursday, January 17, 2008


Control File Autobackups

One of the many features included in rman is the ability to automatically back up your control file. What is a control file? The control file contains information about your database, including its physical structure. If you are using rman without a recovery catalog then your backup information is also stored in the control file. For an overview of control files take a look at Oracle's documentation.


It is very important for (some) recovery purposes to ensure you have a recent copy of your control file, especially if it holds your RMAN backup records. There is one feature of control file autobackups which a lot of DBAs seem to be unfamiliar with, which is why I thought I'd write up a brief article on it. If you have this feature enabled and you make a structural change to the database, the control file is automatically backed up, i.e. if you add a datafile, rename a file, etc. This is information which could affect your ability to recover. This backup will be placed on disk even if your regular backup goes to tape.


A quick test. First, ensure you have control file autobackups enabled and select a location on disk. By default the control file backup will be placed in your $ORACLE_HOME/dbs directory unless you specify a location.


oravis@myserver=> rman target / nocatalog

Recovery Manager: Release 10.2.0.2.0 - Production on Wed Jan 16 14:48:49 2008

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: VIS (DBID=767668735)
using target database control file instead of recovery catalog

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/oradata/bck/vis_backup/%F';

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/oradata/bck/vis_backup/%F';




A quick test to make sure the backup works and is on disk:


oravis@myserver=> rman target / nocatalog

Recovery Manager: Release 10.2.0.2.0 - Production on Wed Jan 16 15:25:24 2008

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: VIS (DBID=767668735)
using target database control file instead of recovery catalog

RMAN> backup current controlfile;

Starting backup at 16-JAN-08
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=414 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
channel ORA_DISK_1: starting piece 1 at 16-JAN-08
channel ORA_DISK_1: finished piece 1 at 16-JAN-08
piece handle=/mnt1/app/visdb/1020_64bit/dbs/01j6adrd_1_1 tag=TAG20080116T152532 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 16-JAN-08

Starting Control File Autobackup at 16-JAN-08
piece handle=/oradata/bck/vis_backup/c-767668735-20080116-00 comment=NONE
Finished Control File Autobackup at 16-JAN-08

RMAN> exit

Recovery Manager complete.
oravis@myserver=> ls -alrt /oradata/bck/vis_backup
total 42036
drwxrwxr-x  28 oravis   dba         1024 Jan 16 14:47 ..
-rw-rw----   1 oravis   dba      19398656 Jan 16 15:25 c-767668735-20080116-00
drwxrwxr-x   2 oravis   dba          512 Jan 16 15:25 .




Next we need to make a structural change to the database to see if the control file is automatically backed up. For this example I'll add a small datafile to the system tablespace.


SQL> alter tablespace SYSTEM add datafile '/oradata/dbf/visdata/system09.dbf' size 25M;

Tablespace altered.




A directory listing of the backup location should show a second control file backup:


oravis@myserver=> ls -alrt /oradata/bck/vis_backup
total 79972
drwxrwxr-x  28 oravis   dba         1024 Jan 16 14:47 ..
-rw-rw----   1 oravis   dba      19398656 Jan 16 15:25 c-767668735-20080116-00
-rw-rw----   1 oravis   dba      19398656 Jan 16 15:26 c-767668735-20080116-01
drwxrwxr-x   2 oravis   dba          512 Jan 16 15:26 .




I promise my next post will be a bit more advanced and related to EBS.

Friday, August 31, 2007


Rman and Windows

Thought I would point out a nice little utility I've used on Windows 2003 environments. The program is called linkd and it is part of the Windows Server 2003 Resource Kit. Linkd is a utility which allows you to create junction points on a Windows server. "Junction point" is just the term used for a directory symbolic link.


Why is this handy? Recently I was asked to restore a Windows production database on a test server. As you may know, rman is very picky about paths. For example, if you place your backup in a directory called d:\backup\nightly and would like to restore it on another server, then you have to place the backup in the same path.


Unfortunately there wasn't enough free space on D: for me to place the backup. However, there was more than enough on the H: drive. I used the linkd utility to create a symbolic link from D:\backup\nightly to H:\backup\nightly (the target can be any path) and ran the restore.
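A sketch of what that looks like, assuming the backup pieces were copied to H:\backup\nightly first (linkd takes the junction to create followed by its target):

C:\> linkd D:\backup\nightly H:\backup\nightly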


On unix, symbolic links are common knowledge; not so on Windows... In this case I took over from a couple of DBAs who had been trying to get this working for over a day (24 hrs straight). They had opened a case with Oracle Support, and before I took it over they were trying to extract the datafiles manually from the backup. Yikes, painful. So hopefully this tool will become more well known.


Related links:


How to create and manipulate NTFS junction points - MS Support Site
Junction V1.05 - MS Technet
Windows 2003 Resource Kit download page

Monday, July 16, 2007


ORA-00376: file 6 cannot be read at this time

It seems that each night I try to hit the sack a little early (1am), I get paged shortly after. Last night was no exception, with a page coming in from one of our application support teams. Users in Australia were having issues logging into one of our customer support applications.


All of the environments I support are unix based, but our team does support some Windows servers, so while I'm on call I'm responsible for those as well.


The first thing I do is check the alert log and I notice the following errors repeating over and over:


Errors in file d:\orant9i\admin\orcl\udump\orcl_j000_2096.trc:
ORA-12012: error on auto execute of job 22
ORA-00376: file 6 cannot be read at this time
ORA-01110: data file 6: 'D:\ORCL\MYDATA01.DBF'
ORA-06512: at "SCHEMA.STOREDPROCEDURE", line 3
ORA-06512: at line 1



I seemed to recall some of our other Windows-based database servers having had issues in the past with backup software locking the files. So the next thing I did was call the Intel support group in Australia and ask them if any backups had been executed that day. Indeed, the past weekend's full backups had failed and they had initiated some manual backups that day.


The support person confirmed that the OS backup had completed, so I restarted the database. The database opened successfully and I checked the alert logs to verify there were no errors. As it had in the past occurrences, these steps resolved the issue.


Since I couldn't get back to sleep I kept an eye on the alert log and surfed the web on my home PC. About 20 minutes later the error appeared again followed by an email from a manager in Australia.


I was kind of surprised by this, so I checked v$datafile. The status of 3 files was "RECOVER". My first thought was: why do I always get the fun issues when I'm on pager?!


This was a relatively small database, and before I attempt any recovery I always take a cold backup (if time allows). Once I restarted the database I executed the following commands for each datafile that was marked as needing recovery:


SQL> recover datafile 'c:\my\datafile.dbf';
SQL> alter database datafile 'c:\my\datafile.dbf' online;

Media recovery was successful and the problem did not reoccur. So, just a note for those Windows DBAs out there: if you have automated backup software, ensure that your datafile directories are excluded. While on unix the worst case would be an inconsistent backup, on Windows file locking behaves differently and could cause an outage like the one I experienced.

Well, I'm off to bed shortly and hopefully will get a good night's sleep!

Wednesday, July 11, 2007


Resetlogs and your Rman catalog

This is one of those "I knew it at one point but it's been so long" items. Also, the manner in which it was discovered had us scratching our heads for a minute. I try not to blog about work-specific items; you never know who is watching out there, and I'm sure we have all heard of bloggers who have been fired. Playing it safe, let's just say someone made a boo boo.....


So just a reminder that when a database is opened with resetlogs, you're creating a new incarnation of the database. If you're using an rman catalog, you will have to connect to your catalog and execute RESET DATABASE so that the new incarnation becomes the current one in the catalog. Otherwise you will get an error:
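A minimal sketch of the fix (the catalog connect string rman/rmanpwd@catdb is a placeholder):

rman target / catalog rman/rmanpwd@catdb
RMAN> reset database;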


Recovery Manager: Release 9.2.0.8.0 - Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: MYDB (DBID=2508956331)
connected to recovery catalog database

RMAN> 2> 3> 4> 5> 6>
Starting backup at 11-JUL-07
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 07/11/2007 03:30:00
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20003: target database incarnation not found in recovery catalog

Thursday, June 28, 2007


10.2.0.2 Rman duplicate issue

Each night we use rman's duplicate database feature to refresh a test environment to aid in troubleshooting production issues. This past weekend we upgraded our production database from 9i to 10gR2. We also applied a lot of patches, and the simplest way to update our test environment was to reclone it.


Last night we used our refresh scripts (rman duplicate) to rebuild the test database. Early this morning users began reporting errors in the application.
Since this problem wasn't reproducible in production, we knew it had to be something with the refresh. The first thing we did was look for differences between the production and test databases. Some objects were invalid in test, but recompiling them didn't resolve the issue. Checking through the log files for the refresh, I noticed that autoconfig didn't complete successfully.




Uploading Context file /apps/appsora/oraappl/admin/ora_myserver.xml
Context File upload for /apps/appsora/oraappl/admin/ora_myserver.xml failed.
Check log file created by FNDCPUCF program.
. . .
[AutoConfig Error Report]
The following report lists errors AutoConfig encountered during each
phase of its execution. Errors are grouped by directory and phase.
The report format is:


[PROFILE PHASE]
AutoConfig could not successfully execute the following scripts:
Directory: /apps/appsora/oracomn/admin/install/ora_myserver
afcpctx.sh INSTE8_PRF 1


AutoConfig is exiting with status 1



afcpctx.sh uploads your context file to the database by calling FNDCPUCF. I assume FNDCPUCF stands for FND Concurrent Process Upload Context File, but I can't seem to find much information on it. Since it fires off a concurrent request, the next step was to check the log files it generated for errors. The following error was in one of them:

Uploading context file...
Oracle error -1187: ORA-01187: cannot read from file 702 because it failed verification tests
ORA-01110: data file 702: '/oradata/dbf/oradata/temp02.dbf' has been detected in FND_GSM_UTIL.APPEND_CTX_FRAGMENT.


I hadn't encountered that error before, but a quick search on Metalink turned up the following note:


Subject: RMAN After 10.2.0.2 Duplicate Database Gets Ora-1187 And Ora-1110 on TempFiles
Doc ID: Note:374934.1


In 9i, tempfile information is removed from the database during the duplication process and you need to re-add the tempfiles manually. In 10g, tempfiles are now included. However, if the files already exist on the OS then the tempfiles are not re-created and you will see errors in the alert log. So, in order to resolve this issue, we dropped the tempfiles and re-added them (as per the Metalink note).

To drop tempfiles:

alter database tempfile '/u01/oracle/mydb/temp01.dbf' drop;

To add a new one (assuming your temporary tablespace is named temp):

alter tablespace temp add tempfile '/u01/oracle/mydb/temp01.dbf' size 500M;


Once this was done, autoconfig completed without error and we restarted the application.