Friday, January 11, 2013

Introduction to Recovery

To restore a physical backup of a datafile or control file is to reconstruct it and make it available to the Oracle database server. To recover a restored datafile is to update it by applying archived redo logs and online redo logs, that is, records of changes made to the database after the backup was taken. If you use RMAN, then you can also recover datafiles with incremental backups, which are backups of a datafile that contain only blocks that changed after a previous incremental backup.
After the necessary files are restored, media recovery must be initiated by the user. Media recovery involves various operations to restore, roll forward, and roll back a backup of database files.
Media recovery applies archived redo logs and online redo logs to recover the datafiles. Whenever a change is made to a datafile, the change is first recorded in the online redo logs. Media recovery selectively applies the changes recorded in the online and archived redo logs to the restored datafile to roll it forward.
To correct problems caused by logical data corruptions or user errors, you can use Oracle Flashback. Oracle Flashback Database and Oracle Flashback Table let you quickly recover to a previous time.
Figure 15-2 illustrates the basic principle of backing up, restoring, and performing media recovery on a database.
Figure 15-2 Media Recovery
Description of Figure 15-2 follows
Description of "Figure 15-2 Media Recovery"
Unlike media recovery, Oracle performs crash recovery and instance recovery automatically after an instance failure. Crash and instance recovery recover a database to its transaction-consistent state just before instance failure. By definition, crash recovery is the recovery of a database in a single-instance configuration or an Oracle Real Application Clusters configuration in which all instances have crashed. In contrast, instance recovery is the recovery of one failed instance by a live instance in an Oracle Real Application Clusters configuration.
This section contains the following topics:

Overview of Media Recovery

The type of recovery that takes a backup and applies redo is called media recovery. Media recovery updates a backup to either to the current or to a specified prior time. Typically, the term "media recovery" refers to recovery of datafiles. Block media recovery is a more specialized operation that you use when just a few blocks in one or more files have been corrupted. In any case, you always use a restored backup to perform the recovery.
This section contains the following topics:

Complete Recovery

Complete recovery involves using redo data or incremental backups combined with a backup of a database, tablespace, or datafile to update it to the most current point in time. It is called complete because Oracle applies all of the redo changes contained in the archived and online logs to the backup. Typically, you perform complete media recovery after a media failure damages datafiles or the control file.
You can perform complete recovery on a database, tablespace, or datafile. If you are performing complete recovery on the whole database, then you must:
  • Mount the database
  • Ensure that all datafiles you want to recover are online
  • Restore a backup of the whole database or the files you want to recover
  • Apply online or archived redo logs, or a combination of the two
If you are performing complete recovery on a tablespace or datafile, then you must:
  • Take the tablespace or datafile to be recovered offline if the database is open
  • Restore a backup of the datafiles you want to recover
  • Apply online or archived redo logs, or a combination of the two

Incomplete Recovery

Incomplete recovery, or point-in-time recovery, uses a backup to produce a noncurrent version of the database. In other words, you do not apply all of the redo records generated after the most recent backup. You usually perform incomplete recovery of the whole database in the following situations:
  • Media failure destroys some or all of the online redo logs.
  • A user error causes data loss, for example, a user inadvertently drops a table.
  • You cannot perform complete recovery because an archived redo log is missing.
  • You lose your current control file and must use a backup control file to open the database.
To perform incomplete media recovery, you must restore all datafiles from backups created prior to the time to which you want to recover and then open the database with the RESETLOGS option when recovery completes. The RESETLOGS operation creates a new incarnation of the database—in other words, a database with a new stream of log sequence numbers starting with log sequence 1.
Before using the OPEN RESETLOGS command to open the database in read/write mode after an incomplete recovery, it is a good idea to first open the database in read-only mode, and inspect the data to make sure that the database was recovered to the correct point. If the recovery was done to the wrong point, then it is easier to re-run the recovery if no OPEN RESETLOGS has been done. If you open the database read-only and discover that not enough recovery was done, then just run the recovery again to the desired time. If you discover that too much recovery was done, then you must restore the database again and re-run the recovery.
Note:
Flashback Database is another way to perform incomplete recovery.
Tablespace Point-in-Time Recovery
The tablespace point-in-time recovery (TSPITR) feature lets you recover one or more tablespaces to a point in time that is different from the rest of the database. TSPITR is most useful when you want to:
  • Recover from an erroneous drop or truncate table operation
  • Recover a table that has become logically corrupted
  • Recover from an incorrect batch job or other DML statement that has affected only a subset of the database
  • Recover one independent schema to a point different from the rest of a physical database (in cases where there are multiple independent schemas in separate tablespaces of one physical database)
  • Recover a tablespace on a very large database (VLDB) rather than restore the whole database from a backup and perform a complete database roll-forward
TSPITR has the following limitations:
  • You cannot use it on the SYSTEM tablespace, an UNDO tablespace, or any tablespace that contains rollback segments.
  • Tablespaces that contain interdependent data must be recovered together. For example, if two tables are in separate tablespaces and have a foreign key relationship, then both tablespaces must be recovered at the same time; you cannot recover just one of them. Oracle can enforce this limitation when it detects data relationships that have been explicitly declared with database constraints. There could be other data relationships that are not declared with database constraints. Oracle cannot detect these, and the DBA must be careful to always restore a consistent set of tablespaces.
Incomplete Media Recovery Options
Because you are not completely recovering the database to the most current time, you must tell Oracle when to terminate recovery. You can perform the following types of media recovery.
Type of Recovery Function
Time-based recovery Recovers the data up to a specified point in time.
Cancel-based recovery Recovers until you issue the CANCEL statement (not available when using Recovery Manager).
Change-based recovery Recovers until the specified SCN.
Log sequence recovery Recovers until the specified log sequence number (only available when using Recovery Manager).

Datafile Media Recovery

Datafile media recovery is used to recover from a lost or damaged current datafile or control file. It is also used to recover changes that were lost when a tablespace went offline without the OFFLINE NORMAL option. Both datafile media recovery and instance recovery must repair database integrity. However, these types of recovery differ with respect to their additional features. Media recovery has the following characteristics:
  • Applies changes to restored backups of damaged datafiles.
  • Can use archived logs as well as online logs.
  • Requires explicit invocation by a user.
  • Does not detect media failure (that is, the need to restore a backup) automatically. After a backup has been restored, however, detection of the need to recover it through media recovery is automatic.
  • Has a recovery time governed solely by user policy (for example, frequency of backups, parallel recovery parameters, number of database transactions since the last backup) rather than by Oracle internal mechanisms.
The database cannot be opened if any of the online datafiles needs media recovery, nor can a datafile that needs media recovery be brought online until media recovery is complete. The following scenarios necessitate media recovery:
  • You restore a backup of a datafile.
  • You restore a backup control file (even if all datafiles are current).
  • A datafile is taken offline (either by you or automatically by Oracle) without the OFFLINE NORMAL option.
Unless the database is not open by any instance, datafile media recovery can only operate on offline datafiles. You can initiate datafile media recovery before opening a database even when crash recovery would have sufficed. If so, crash recovery still runs automatically at database open.
Note that when a file requires media recovery, you must perform media recovery even if all necessary changes are contained in the online logs. In other words, you must still run recovery even though the archived logs are not needed. Media recovery could find nothing to do — and signal the "no recovery required" error — if invoked for files that do not need recovery.

Block Media Recovery

Block media recovery is a technique for restoring and recovering individual data blocks while all database files remain online and available. If corruption is limited to only a few blocks among a subset of database files, then block media recovery might be preferable to datafile recovery.
The interface to block media recovery is provided by RMAN. If you do not already use RMAN as your principal backup and recovery solution, then you can still perform block media recovery by cataloging into the RMAN repository the necessary user-managed datafile and archived redo log backups.
See Also:
Oracle Database Backup and Recovery Reference for information on how to catalog user-managed datafile and archived log backups and to perform block media recovery

Overview of RMAN and User-Managed Restore and Recovery

You have a choice between two basic methods for recovering physical files. You can:
  • Use the RMAN utility to restore and recover the database
  • Restore backups by means of operating system utilities, and then recover by running the SQL*Plus RECOVER command
Whichever method you choose, you can recover a database, tablespace, or datafile. Before performing media recovery, you need to determine which datafiles to recover. Often you can use the fixed view V$RECOVER_FILE. This view lists all files that require recovery and explains the error that necessitates recovery.
See Also:
Oracle Database Backup and Recovery Reference for more about using V$ views in a recovery scenario

RMAN Restore and Recovery

The basic RMAN recovery commands are RESTORE and RECOVER. Use RESTORE to restore datafiles from backup sets or from image copies on disk, either to their current location or to a new location. You can also restore backup sets containing archived redo logs, but this is usually unnecessary, because RMAN automatically restores the archived logs that are needed for recovery and deletes them after the recovery is finished. Use the RMAN RECOVER command to perform media recovery and apply archived logs or incremental backups.
RMAN automates the procedure for recovering and restoring your backups and copies.
See Also:
Oracle Database Backup and Recovery Reference for details about how to restore and recover using RMAN

User-Managed Restore and Recovery

If you do not use RMAN, then you can restore backups with operating system utilities and then run the SQL*Plus RECOVER command to recover the database. You should follow these basic steps:
  1. After identifying which files are damaged, place the database in the appropriate state for restore and recovery. For example, if some but not all datafiles are damaged, then take the affected tablespaces offline while the database is open.
  2. Restore the files with an operating system utility. If you do not have a backup, it is sometimes possible to perform recovery if you have the necessary redo logs dating from the time when the datafiles were first created and the control file contains the name of the damaged file.
    If you cannot restore a datafile to its original location, then relocate the restored datafile and change the location in the control file.
  3. Restore any necessary archived redo log files.
  4. Use the SQL*Plus RECOVER command to recover the datafile backups.
See Also:
Oracle Database Backup and Recovery Advanced User's Guide for details about how to restore and recover with operating system utilities and SQL*Plus

Recovery Using Oracle Flashback Technology

To correct problems caused by logical data corruptions or user errors, you can use Oracle Flashback. Flashback Database and Flashback Table let you quickly recover to a previous time.
This section contains the following topics:
See Also:
"Overview of Oracle Flashback Features" for an overview of all Oracle Flashback features

Overview of Oracle Flashback Database

Oracle Flashback Database lets you quickly recover an Oracle database to a previous time to correct problems caused by logical data corruptions or user errors.
If an Oracle managed disk area, called a flash recovery area is configured, and if you have enabled the Flashback functionality, then you can use the RMAN and SQL FLASHBACK DATABASE commands to return the database to a prior time. Flashback Database is not true media recovery, because it does not involve restoring physical files. However, Flashback is preferable to using the RESTORE and RECOVER commands in some cases, because it is faster and easier, and does not require restoring the whole database.
To Flashback a database, Oracle uses past block images to back out changes to the database. During normal database operation, Oracle occasionally logs these block images in Flashback logs. Flashback logs are written sequentially, and they are not archived. Oracle automatically creates, deletes, and resizes Flashback logs in the flash recovery area. You only need to be aware of Flashback logs for monitoring performance and deciding how much disk space to allocate to the flash recovery area for Flashback logs.
The amount of time it takes to Flashback a database is proportional to how far back you need to revert the database, rather than the time it would take to restore and recover the whole database, which could be much longer. The before images in the Flashback logs are only used to restore the database to a point in the past, and forward recovery is used to bring the database to a consistent state at some time in the past. Oracle returns datafiles to the previous point-in-time, but not auxiliary files, such as initialization parameter files.
See Also:

Overview of Oracle Flashback Table

Oracle Flashback Table lets you recover tables to a specified point in time with a single statement. You can restore table data along with associated indexes, triggers, and constraints, while the database is online, undoing changes to only the specified tables. Flashback Table does not address physical corruption; for example, bad disks or data segment and index inconsistencies.
Flashback Table works like a self-service repair tool. Suppose a user accidentally deletes some important rows from a table and wants to recover the deleted rows. You can restore the table to the time before the deletion and see the missing rows in the table with the FLASHBACK TABLE statement.
You can revert the table and its contents to a certain wall clock time or user-specified system change number (SCN). Use Flashback Table with Oracle Flashback Version query and Flashback Transaction Query to find a time to which the table should be restored back to.
See Also:
"Overview of Oracle Flashback Query" for information about Oracle Flashback Query
"Overview of Oracle Flashback Database" for information about reverting an entire database to an earlier point in time
For Flashback Table to succeed, the system must retain enough undo information to satisfy the specified SCN or timestamp, and the integrity constraints specified on the tables cannot be violated. Also, row movement must be enabled.
The point of time in the past that you use Flashback Table to go to is controlled by the undo retention of the system. Oracle Database 10g automatically tunes a parameter called the undo retention period. The undo retention period indicates the amount of time that must pass before old undo information—that is, undo information for committed transactions—can be overwritten. The database collects usage statistics and tunes the undo retention period based on these statistics and on undo tablespace size.
Note:
Oracle strongly recommends that you run your database in automatic undo management mode. In addition, set the undo retention to an interval large enough to include the oldest data you anticipate needing.
See Also:

Other Types of Oracle Recovery

This section contains the following topics:

Overview of Redo Application

Database buffers in the buffer cache in the SGA are written to disk only when necessary, using a least-recently-used (LRU) algorithm. Because of the way that the database writer process uses this algorithm to write database buffers to datafiles, datafiles could contain some data blocks modified by uncommitted transactions and some data blocks missing changes from committed transactions.
Two potential problems can result if an instance failure occurs:
  • Data blocks modified by a transaction might not be written to the datafiles at commit time and might only appear in the redo log. Therefore, the redo log contains changes that must be reapplied to the database during recovery.
  • After the roll forward phase, the datafiles could contain changes that had not been committed at the time of the failure. These uncommitted changes must be rolled back to ensure transactional consistency. These changes were either saved to the datafiles before the failure, or introduced during the roll forward phase.
To solve this dilemma, two separate steps are generally used by Oracle for a successful recovery of a system failure: rolling forward with the redo log (cache recovery) and rolling back with the rollback or undo segments (transaction recovery).
Overview of Cache Recovery
The online redo log is a set of operating system files that record all changes made to any database block, including data, index, and rollback segments, whether the changes are committed or uncommitted. All changes to Oracle blocks are recorded in the online log.
The first step of recovery from an instance or disk failure is called cache recovery or rolling forward, and involves reapplying all of the changes recorded in the redo log to the datafiles. Because rollback data is also recorded in the redo log, rolling forward also regenerates the corresponding rollback segments
Rolling forward proceeds through as many redo log files as necessary to bring the database forward in time. Rolling forward usually includes online redo log files (instance recovery or media recovery) and could include archived redo log files (media recovery only).
After rolling forward, the data blocks contain all committed changes. They could also contain uncommitted changes that were either saved to the datafiles before the failure, or were recorded in the redo log and introduced during cache recovery.
Overview of Transaction Recovery
You can run Oracle in either manual undo management mode or automatic undo management mode. In manual mode, you must create and manage rollback segments to record the before-image of changes to the database. In automatic undo management mode, you create one or more undo tablespaces. These undo tablespaces contain undo segments similar to traditional rollback segments. The main difference is that Oracle manages the undo for you.
Undo blocks (whether in rollback segments or automatic undo tablespaces) record database actions that should be undone during certain database operations. In database recovery, the undo blocks roll back the effects of uncommitted transactions previously applied by the rolling forward phase.
After the roll forward, any changes that were not committed must be undone. Oracle applies undo blocks to roll back uncommitted changes in data blocks that were either written before the failure or introduced by redo application during cache recovery. This process is called rolling back or transaction recovery.
Figure 15-3 illustrates rolling forward and rolling back, the two steps necessary to recover from any type of system failure.
Figure 15-3 Basic Recovery Steps: Rolling Forward and Rolling Back
Description of Figure 15-3 follows
Description of "Figure 15-3 Basic Recovery Steps: Rolling Forward and Rolling Back"
Oracle can roll back multiple transactions simultaneously as needed. All transactions systemwide that were active at the time of failure are marked as terminated. Instead of waiting for SMON to roll back terminated transactions, new transactions can recover blocking transactions themselves to get the row locks they need.

Overview of Instance and Crash Recovery

Crash recovery is used to recover from a failure either when a single-instance database fails or all instances of an Oracle Real Application Clusters database fail. Instance recovery refers to the case where a surviving instance recovers a failed instance in an Oracle Real Application Clusters database.
The goal of crash and instance recovery is to restore the data block changes located in the cache of the terminated instance and to close the redo thread that was left open. Instance and crash recovery use only online redo log files and current online datafiles. Oracle recovers the redo threads of the terminated instances together.
Crash and instance recovery involve two distinct operations: rolling forward the current, online datafiles by applying both committed and uncommitted transactions contained in online redo records, and then rolling back changes made in uncommitted transactions to their original state.
Crash and instance recovery have the following shared characteristics:
  • Redo the changes using the current online datafiles (as left on disk after the failure or SHUTDOWN ABORT)
  • Use only the online redo logs and never require the use of the archived logs
  • Have a recovery time governed by the number of terminated instances, amount of redo generated in each terminated redo thread since the last checkpoint, and by user-configurable factors such as the number and size of redo log files, checkpoint frequency, and the parallel recovery setting
Oracle performs this recovery automatically on two occasions:
  • At the first database open after the failure of a single-instance database or all instances of an Oracle Real Applications Cluster database (crash recovery).
  • When some but not all instances of an Oracle Real Application Clusters configuration fail (instance recovery). The recovery is performed automatically by a surviving instance in the configuration.
The important point is that in both crash and instance recovery, Oracle applies the redo automatically: no user intervention is required to supply redo logs. However, you can set parameters in the database server that can tune the duration of instance and crash recovery performance. Also, you can tune the rolling forward and rolling back phases of instance recovery separately.
See Also:
Oracle Database Backup and Recovery Advanced User's Guide for a discussion of instance recovery mechanics and instructions for tuning instance and crash recovery

Deciding Which Recovery Technique to Use

This section contains the following topics:

When to Use Media Recovery

Use media recovery when one or more datafiles has been physically damaged. This can happen due to hardware errors or user errors, such as accidentally deleting a file. Complete media recovery is used with individual datafiles, tablespaces, or the entire database.
Use incomplete media recovery when the database has been logically damaged. This can happen due to application error or user error, such as accidentally deleting a table or tablespace. Incomplete media recovery is used only with the whole database, not with individual datafiles or tablespaces. (If you do not want to do incomplete media recovery of the entire database, you can do tablespace point-in-time recovery with individual tablespaces.)
Use block media recovery when a small number of blocks in one or more files have been physically damaged. This usually happens due to hardware errors, such as a bad disk controller, or operating system I/O errors. Block media recovery is used with individual data blocks, and the remainder of the database remains online and available during the recovery.

When to Use Oracle Flashback

Flashback Table is a push button solution to restore the contents of a table to a given point in time. An application on top of Flashback Query can achieve this, but with less efficiency.
Flashback Database applies to the entire database. It requires configuration and resources, but it provides a fast alternative to performing incomplete database recovery.
Flashback Table uses information in the undo tablespace to restore the table. This provides significant benefits over media recovery in terms of ease of use, availability, and faster restoration.
Flashback Database and Flashback Table differ in granularity, performance, and restrictions. For a primary database, consider using Flashback Database rather than Flashback Table in the following situations:
  • There is a logical data corruption, particularly undo corruption.
  • A user error affected the whole database.
  • A user error affected a table or a small set of tables, but the impact of reverting this set of tables is not clear because of the logical relationships between tables.
  • A user error affected a table or a small set of tables, but using Flashback Table would fail because of its DDL restrictions.
  • Flashback Database works through all DDL operations, whereas Flashback Table does not. Also, because Flashback Database moves the entire database back in time, constraints are not an issue, whereas they are with Flashback Table. Flashback Table cannot be used on a standby database.

When to Use CREATE TABLE AS SELECT Recovery

To do an out of place restore of the data, perform a CTAS (CREATE TABLE AS SELECTAS OF …) using the Flashback Query SQL "AS OF …" clause. For example, to create a copy of the table as of a specific time:
CREATE TABLE old_emp AS SELECT * 
FROM employees AS OF TIMESTAMP '2002-02-05  14:15:00'

For out of place creation of the table, you only get data back. Constraints, indexes, and so on are not restored. This could take significantly more time and space than Flashback Table. However, Flashback Table only restores rows in blocks that were modified after the specified time, making it more efficient.

When to Use Import/Export Utilities Recovery

In contrast to physical backups, logical backups are exports of schema objects, like tables and stored procedures, into a binary file. Oracle utilities are used to move Oracle schema objects in and out of Oracle. Export, or Data Pump Export, writes data from an Oracle database to binary operating system files. Import, or Data Pump Import, reads export files and restores the corresponding data into an existing database.
Although import and export are designed for moving Oracle data, you can also use them as a supplemental method of protecting data in an Oracle database. You should not use Oracle import and export utilities as the sole method of backing up your data.
Oracle import and export utilities work similarly to CTAS, but they restore constraints, indexes, and so on. They effectively re-create the whole table if an export was performed earlier corresponding to the Flashback time. Flashback Table is more performance efficient than import/export utilities, because it restores only the subset of rows that got modified.

When to Use Tablespace Point-in-Time Recovery

Use tablespace point-in-time recovery when one or more tablespaces have been logically damaged, and you do not want to do incomplete media recovery of the entire database. Tablespace point-in-time recovery is used with individual tablespaces.

Flash Recovery Area

The flash recovery area is an Oracle-managed directory, file system, or Automatic Storage Management disk group that provides a centralized disk location for backup and recovery files. Oracle creates archived logs in the flash recovery area. RMAN can store its backups in the flash recovery area, and it uses it when restoring files during media recovery. The flash recovery area also acts as a disk cache for tape.
Oracle recovery components interact with the flash recovery area ensuring that the database is completely recoverable using files in flash recovery area. All files necessary to recover the database following a media failure are part of flash recovery area.
Following is a list of recovery-related files in flash recovery area:
  • Current control file
  • Online logs
  • Archived logs
  • Flashback logs
  • Control file autobackups
  • Control file copies
  • Datafile copies
  • Backup pieces

Flash Recovery Area Disk Limit

Oracle lets you define a disk limit, which is the amount of space that Oracle can use in the flash recovery area. A disk limit lets you use the remaining disk space for other purposes and not to dedicate a complete disk for the flash recovery area. It does not include any overhead that is not known to Oracle. For example, the flash recovery area disk limit does not include the extra size of a file system that is compressed, mirrored, or some other redundancy mechanism.
Oracle and RMAN create files in the flash recovery area until the space used reaches the flash recovery area disk limit. Then, Oracle deletes the minimum set of existing files from the flash recovery area that are obsolete, redundant copies, or backed up to tertiary storage. Oracle warns the user when available disk space is less than 15%, but it continues to fill the disk to 100% of the flash recovery area disk limit.
The bigger the flash recovery area, the more useful it becomes. The recommended disk limit is the sum of the database size, the size of incremental backups, and the size of all archive logs that have not been copied to tape.
If the flash recovery area is big enough to keep a copy of the tablespaces, then those tablespaces do not need to access tertiary storage. The minimum size of the flash recovery area should be at least large enough to contain archive logs that have not been copied to tape. For example, if an ASM disk group of size 100 GB is used with normal redundancy for the flash recovery area, then the flash recovery area disk limit must be set to 50 GB.
See Also:

No comments:

Post a Comment