Corrupted ControlFile Recovery

This is my real world corrupted controlfile recovery experience.

Database controlfiles are among the most critical files in oracle.
The default install of an oracle database automatically provides 3 copies of these files.
Documentations and best practice papers are littered with constant reminders about mirroring or creation of copies in different physical disks for redundancy.

Failing to heed those reminders will lead you to lost database and probably your job 😦

However if you are reading this post you may just be lucky enough to have survived 1 or 2 copies of this file which you want to use in recovery.  Right, you just needed 1 working copy to recover.

Here is a fragment of  alert log after the database crash caused by a corrupted controlfile:

Wed Feb 22 18:27:25 2012
Errors in file /oracle/KPSO/db/tech_st/11.1.0/admin/KPSO_saas/diag/rdbms/kpso/KPSO/trace/KPSO_ckpt_8024.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/oracle/KPSO/db/apps_st/data/cntrl01.dbf'
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 3
Additional information: -1
Errors in file /oracle/KPSO/db/tech_st/11.1.0/admin/KPSO_saas/diag/rdbms/kpso/KPSO/trace/KPSO_ckpt_8024.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/oracle/KPSO/db/apps_st/data/cntrl01.dbf'
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 3
Additional information: -1
CKPT (ospid: 8024): terminating the instance due to error 221
Instance terminated by CKPT, pid = 8024

Solution was simple:

1) Once the database is shutdown, it is necessary to copy the control files via the OS to the desired (mirrored) locations.
2) Modify the init.ora parameter CONTROL_FILES and indicate where the new mirrored (multiplexed) control-file can can be accessed.

So instead of overwriting the bad copy with the good one we created a copy in the new location, since it could be that the cause of the corruption was bad physical disk.

This is an EBS R12 DB so instead of using the automated startup script, startup the database manually, step by step so we can see error messages along the way, if any:

SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup mount pfile=/oracle/KPSO/db/tech_st/11.1.0/dbs/initKPSO.ora
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                  2161400 bytes
Variable Size             872416520 bytes
Database Buffers         1241513984 bytes
Redo Buffers               21794816 bytes
Database mounted.
SQL> alter database open
  2  ;

Database altered.

All is good 🙂 we’re back in business..

 

Advertisements

About this entry