标签归档:ORA-600 4194

重建控制文件丢失数据文件导致悲剧

在Oracle职业生涯中,恢复过生产环境数据库也有几百个.对于Oracle恢复我还是相当的自信,今天因为自己的一时过于自信,对于环境错了错误的判断,简单问题复杂化,差点变成悲剧
开发出来了Oracle Recovery Tools恢复MISSING00000文件故障工具,能够一键解决类似问题,实现快速恢复
数据库最初故障

Thu Sep 25 09:27:26 2014
MMON started with pid=15, OS id=1968 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = F:\oracle
Thu Sep 25 09:27:26 2014
ALTER DATABASE   MOUNT
Thu Sep 25 09:27:26 2014
MMNL started with pid=16, OS id=5976 
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_4624.trc:
ORA-00202: ????: ''F:\ORACLE\ORADATA\ORCL\CONTROL01.CTL
ORA-27070: ????/????
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Thu Sep 25 09:28:31 2014
ORA-204 signalled during: ALTER DATABASE   MOUNT...

因为硬件或者系统层面问题,导致控制文件无法正常访问

重建控制文件

Fri Sep 26 12:28:44 2014
Successful mount of redo thread 1, with mount id 1387065723
Completed: CREATE CONTROLFILE REUSE DATABASE "orcl" RESETLOGS ARCHIVELOG
    MAXLOGFILES 5
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 2
    MAXLOGHISTORY 226
LOGFILE
  GROUP 1 'F:\oracle\oradata\orcl\REDO01.LOG'  SIZE 50M,  --redo log ????
  GROUP 2 'F:\oracle\oradata\orcl\REDO02.LOG'  SIZE 50M,  --redo log ????
  GROUP 3 'F:\oracle\oradata\orcl\REDO03.LOG'  SIZE 50M  --redo log ????
-- STANDBY LOGFILE
DATAFILE
  'F:\oracle\oradata\orcl\SYSAUX01.DBF',  --sysaux???????
  'F:\oracle\oradata\orcl\SYSTEM01.DBF',
  'F:\oracle\oradata\orcl\USERS01.DBF', --user????????
  'F:\oracle\oradata\orcl\UNDOTBS01.DBF' --undo???????
CHARACTER SET ZHS16GBK
Fri Sep 26 12:29:55 2014
alter database open resetlogs
ORA-1194 signalled during: alter database open resetlogs...

埋下了雷,创建控制文件中未全部列举出来所有数据文件

进行不完全恢复,尝试resetlogs库发现redo异常

Fri Sep 26 14:13:24 2014
ALTER DATABASE   MOUNT
Fri Sep 26 14:13:24 2014
MMNL started with pid=16, OS id=9024 
Successful mount of redo thread 1, with mount id 1387037444
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Fri Sep 26 14:14:08 2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Fri Sep 26 14:15:16 2014
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 2049 计数 6143 出错
ORA-00312: 联机日志 1 线程 1: 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Fri Sep 26 14:16:24 2014
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 1 计数 8191 出错
ORA-00312: 联机日志 1 线程 1: 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 1 计数 8191 出错
ARCH: All Archive destinations made inactive due to error 333

使用隐含参数尝试拉库,报ORA-600[2662]

Fri Sep 26 14:16:45 2014
SMON: enabling cache recovery
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc  (incident=57761):
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Incident details in: f:\oracle\diag\rdbms\orcl\orcl\incident\incdir_57761\orcl_ora_3720_i57761.trc
Fri Sep 26 14:16:45 2014
ARC3 started with pid=23, OS id=9692 
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 3720): terminating the instance due to error 704
Instance terminated by USER, pid = 3720
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (3720) as a result of ORA-1092

数据库在未使用所有数据文件的情况下,进行了resetlogs操作,悲剧的本质已经注定,我的失误是没有评估好现状,还继续在错误的道路上越走越远.

我开始接手该库现况

Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Fri Sep 26 14:18:55 2014
alter database open
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open...
Fri Sep 26 14:19:31 2014
alter database open 
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open ...
Fri Sep 26 14:22:26 2014
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Fri Sep 26 14:22:26 2014
Media Recovery failed with error 16433
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...
Fri Sep 26 14:24:25 2014
ALTER DATABASE RECOVER  datafile 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'  
Media Recovery Start
Media Recovery failed with error 16433
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'  ...
Fri Sep 26 14:28:47 2014
alter database open read write
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open read write...
Fri Sep 26 14:31:48 2014
ALTER DATABASE RECOVER  datafile 'F:\oracle\oradata\orcl\SYSTEM01.DBF'  
Media Recovery Start
Media Recovery failed with error 16433
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 'F:\oracle\oradata\orcl\SYSTEM01.DBF'  ...

提示ORA-01110: 数据文件 1需要恢复,尝试recover操作

尝试recover操作

连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover database ;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.


SQL> alter database backup controlfile to trace as 'd:\ctl.txt';
alter database backup controlfile to trace as 'd:\ctl.txt'
*
第 1 行出现错误:
ORA-16433: 必须以读/写模式打开数据库。


SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.

重建控制文件

SQL> shutdown immediate;
ORA-01109: 数据库未打开


已经卸载数据库。
ORACLE 例程已经关闭。
SQL> STARTUP NOMOUNT
ORACLE 例程已经启动。

Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE orcl NORESETLOGS FORCE LOGGING ARCHIVELOG

  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'  SIZE 50M,
  9    GROUP 2 'F:\ORACLE\ORADATA\ORCL\REDO02.LOG'  SIZE 50M,
 10    GROUP 3 'F:\ORACLE\ORADATA\ORCL\REDO03.LOG'  SIZE 50M
 11  DATAFILE
 12    'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF',
 13    'F:\ORACLE\ORADATA\ORCL\SYSAUX01.DBF',
 14    'F:\ORACLE\ORADATA\ORCL\UNDOTBS01.DBF',
 15    'F:\ORACLE\ORADATA\ORCL\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;

控制文件已创建。

这一步严重发错,在恢复前未认真看alert日志,太依赖v$datafile查询出来结果,导致重建控制文件丢失数据文件,埋下大雷。根据前面alert日志报错ORA-600 2662,决定一并处理该问题,然后进行恢复

SQL> shutdown immediate;
ORA-01109: ??????


已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup pfile='d:\pfile.txt'  mount;
ORACLE 例程已经启动。

Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
SQL> recover database;
完成介质恢复。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4194], [], [

数据库报ORA-600 4194,直接修改undo_management=manual,然后尝试启动数据库

SQL> conn / as sysdba
已连接到空闲例程。
SQL> startup pfile='d:\pfile.txt'
ORACLE 例程已经启动。

Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
数据库已经打开。
SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------

F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF
F:\ORACLE\ORADATA\ORCL\SYSAUX01.DBF
F:\ORACLE\ORADATA\ORCL\UNDOTBS01.DBF
F:\ORACLE\ORADATA\ORCL\USERS01.DBF
F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00005
F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00006

已选择6行。

SQL> alter database rename file 'F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00005' 
2   to 'F:\oracle\oradata\SOURCE_DATA1.DBF';

数据库已更改。

SQL> alter database rename file 'F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00006' 
2   to 'F:\oracle\oradata\SOURCE_idx1.DBF';

数据库已更改。

SQL> shutdown immediate;
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup mount pfile='d:\pfile.txt'
ORACLE 例程已经启动。

Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
SQL> alter datafile 5 online;
alter datafile 5 online
      *
第 1 行出现错误:
ORA-00940: 无效的 ALTER 命令


SQL> alter database datafile 5 online;

数据库已更改。

SQL> alter database datafile 6 online;

数据库已更改。

SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 5 belongs to an orphan incarnation
ORA-01110: data file 5: 'F:\ORACLE\ORADATA\SOURCE_DATA1.DBF'


SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-01139: RESETLOGS 选项仅在不完全数据库恢复后有效


SQL> alter database datafile 6 offline;

数据库已更改。

SQL> alter database datafile 5 offline;

数据库已更改。

SQL> recover database until cancel;
完成介质恢复。
SQL> alter database datafile 6 online;

数据库已更改。

SQL> alter database datafile 5 online;

数据库已更改。

SQL> alter database open resetlogs;

数据库已更改。

还好结合一些隐含参数侥幸恢复成功,差点到了要使用bbed的程度,如果遇到极端情况无法处理可以参考:Oracle Recovery Tools恢复MISSING00000文件故障

这次的恢复告诉我:Oracle数据库恢复千万比大意,需要认真分析alert日志和咨询客户做了那些操作,不然可能导致万劫不复之禁地

发表在 Oracle备份恢复 | 标签为 , , , , | 评论关闭

又一起存储故障导致ORA-00333 ORA-00312恢复

数据库启动报ORA-00333 ORA-00312错误,无法正常open数据库

Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc:
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
ORA-27091: unable to queue I/O
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。

Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc:
ORA-00333: redo log read error block 63489 count 2048

Thu Aug 07 10:42:03  2014
ARC0: All Archive destinations made inactive due to error 333
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误

Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误

Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误

Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_lgwr_884.trc:
ORA-00340: IO error processing online log 3 of thread 1
ORA-00345: redo log write error block 65238 count 13
ORA-00312: online log 3 thread 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: async read/write failed
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。

Thu Aug 07 10:42:03  2014
LGWR: terminating instance due to error 340
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread 

Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread 

Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread 


Thu Aug 07 17:40:05  2014
ALTER DATABASE OPEN
Thu Aug 07 17:40:05  2014
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Thu Aug 07 17:40:06  2014
Started redo scan
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。

Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。

Thu Aug 07 17:40:06  2014
Aborting crash recovery due to error 333
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错

ORA-333 signalled during: ALTER DATABASE OPEN...

进一步检查发现在7月6日系统就已经报io异常

Sun Jul 06 10:05:23  2014
ARC0: All Archive destinations made inactive due to error 333
Sun Jul 06 10:06:07  2014
KCF: write/open error block=0xd03 online=1
     file=3 F:\ORADATA\SZCG\SYSAUX01.DBF
     error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。'
Automatic datafile offline due to write error on
file 3: F:\ORADATA\SZCG\SYSAUX01.DBF
Sun Jul 06 10:06:23  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc1_2676.trc:
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
ORA-27091: unable to queue I/O
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。

Thu Aug 07 10:36:54  2014
ARC1: All Archive destinations made inactive due to error 333
Thu Aug 07 10:37:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_m000_5832.trc:
ORA-01135: file 3 accessed for DML/query is offline
ORA-01110: data file 3: 'F:\ORADATA\SZCG\SYSAUX01.DBF'

检查硬件发现raid一块盘完全损坏,另外一块盘也处于告警状态,保护现场拷贝文件过程中发现redo02,redo03,sysaux无法拷贝,使用rman检查发现
sysaux-block


因为redo完全损坏,使用工具跳过坏块,拷贝相关有坏块文件到其他目录,重命名相关文件尝试启动数据库,依然报ORA-00333 ORA-00312

Started redo scan
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。

Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。

Thu Aug 07 17:40:06  2014
Aborting crash recovery due to error 333
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错

ORA-333 signalled during: ALTER DATABASE OPEN...

设置隐含参数_allow_resetlogs_corruption,尝试强制拉库

Started redo scan
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。

Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。

Fri Aug 08 12:13:25  2014
Aborting crash recovery due to error 333
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错

ORA-333 signalled during: ALTER DATABASE OPEN...
Fri Aug 08 12:13:45  2014
ALTER DATABASE RECOVER  database until cancel  
Fri Aug 08 12:13:45  2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Fri Aug 08 12:13:55  2014
ALTER DATABASE RECOVER    CANCEL  
Fri Aug 08 12:13:59  2014
ORA-1547 signalled during: ALTER DATABASE RECOVER    CANCEL  ...
Fri Aug 08 12:13:59  2014
ALTER DATABASE RECOVER CANCEL 
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...
Fri Aug 08 12:14:12  2014
alter database open resetlogs
Fri Aug 08 12:14:13  2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Fri Aug 08 12:54:11  2014
alter tablespace sysaux offline
Fri Aug 08 12:54:11  2014
ORA-1109 signalled during: alter tablespace sysaux offline...
Fri Aug 08 13:05:30  2014
alter database open
Fri Aug 08 13:05:30  2014
ORA-1589 signalled during: alter database open...

在offline过程中,数据库检查到sysaux数据文件为offline状态,当表空间只有一个数据文件,而且该数据文件为offline,数据库将会尝试offline sysaux表空间,但是发现该表空间文件非正常scn,无法offline 表空间,导致resetlogs操作失败。这里是操作失误应该先online相关数据文件,然后再进行resetlogs操作

Sat Aug 09 11:56:03  2014
alter database datafile 3 online
Sat Aug 09 11:56:04  2014
Completed: alter database datafile 3 online
Sat Aug 09 11:56:08  2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Sat Aug 09 11:56:18  2014
ARCH: Encountered disk I/O error 19502
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512) 
ORA-27072: 文件 I/O 错误
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 1) 函数不正确。
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512) 

Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512) 
ORA-27072: 文件 I/O 错误
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 1) 函数不正确。
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512) 

ARCH: I/O error 19502 archiving log 3 to 'F:\ARCHIVE\ARC01745_0814618167.001'
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00265: 要求实例恢复, 无法设置 ARCHIVELOG 模式

Archive all online redo logfiles failed:265
RESETLOGS after incomplete recovery UNTIL CHANGE 77983856
Resetting resetlogs activation ID 3562192628 (0xd452bef4)
Online log F:\ORADATA\SZCG\REDO01.LOG: Thread 1 Group 1 was previously cleared
Online log F:\ORADATA\SZCG\REDO02.LOG: Thread 1 Group 2 was previously cleared
Online log D:\REDO04.LOG: Thread 1 Group 4 was previously cleared
Sat Aug 09 11:56:22  2014
Setting recovery target incarnation to 3
Sat Aug 09 11:56:23  2014
Assigning activation ID 3602586269 (0xd6bb1a9d)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=33, OS id=5900
Sat Aug 09 11:56:23  2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=34, OS id=5776
Sat Aug 09 11:56:24  2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: F:\ORADATA\SZCG\REDO01.LOG
Successful open of redo thread 1
Sat Aug 09 11:56:24  2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Aug 09 11:56:24  2014
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Sat Aug 09 11:56:24  2014
ARC0: Becoming the heartbeat ARCH
Sat Aug 09 11:56:24  2014
SMON: enabling cache recovery
Sat Aug 09 11:56:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], []

Sat Aug 09 11:56:26  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], []

Sat Aug 09 11:56:26  2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 4516
ORA-1092 signalled during: alter database open resetlogs...

ORA-600 2662这个错误很熟悉,直接推SCN,数据库open,但是报ORA-600 4194

Sat Aug 09 12:01:28  2014
SMON: enabling cache recovery
Dictionary check complete
Sat Aug 09 12:01:32  2014
SMON: enabling tx recovery
Sat Aug 09 12:01:32  2014
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan 
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=34, OS id=6116
Sat Aug 09 12:01:34  2014
LOGSTDBY: Validating controlfile with logical metadata
Sat Aug 09 12:01:34  2014
LOGSTDBY: Validation complete
Sat Aug 09 12:01:34  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc:
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []

Sat Aug 09 12:01:36  2014
Doing block recovery for file 2 block 319
Resuming block recovery (PMON) for file 2 block 319
Block recovery from logseq 2, block 56 to scn 1073742003
Sat Aug 09 12:01:36  2014
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: F:\ORADATA\SZCG\REDO02.LOG
Block recovery stopped at EOT rba 2.79.16
Block recovery completed at rba 2.79.16, scn 0.1073742002
Doing block recovery for file 2 block 153
Resuming block recovery (PMON) for file 2 block 153
Block recovery from logseq 2, block 56 to scn 1073741986
Sat Aug 09 12:01:36  2014
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: F:\ORADATA\SZCG\REDO02.LOG
Block recovery completed at rba 2.66.16, scn 0.1073741988
Sat Aug 09 12:01:36  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc:
ORA-01595: error freeing extent (4) of rollback segment (10))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []

Sat Aug 09 12:01:36  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5272.trc:
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []

Sat Aug 09 12:01:36  2014
Completed: alter database open

尝试重建undo表空间并切换undo_tabspace到新undo表空间解决,因为数据库在恢复过程中使用了隐含参数强制拉库,不能保证数据一致性,强烈建议逻辑方式重建数据库
在本次故障中,所幸的是只有redo和sysaux文件损坏,如果是业务数据文件或者system数据文件损坏,恢复的后果可能更加麻烦,丢失数据可能更加多。再次说明:数据库备份非常重要,数据的安全性不能完全寄希望于硬件之上

发表在 Oracle备份恢复 | 标签为 , , , , , , , , | 评论关闭

ORACLE 8.1.7 数据库ORA-600 4194故障恢复

一个817数据库报ORA-600 4194 无法正常启动

Fri Jul 25 10:49:47 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Fri Jul 25 10:49:58 2014
ALTER DATABASE RECOVER  database  
Fri Jul 25 10:49:58 2014
Media Recovery Start
Media Recovery Log 
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Media Recovery Complete
Completed: ALTER DATABASE RECOVER  database  
Fri Jul 25 10:50:09 2014
alter database open

Beginning crash recovery of 1 threads
Fri Jul 25 10:50:09 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Fri Jul 25 10:50:09 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Fri Jul 25 10:50:09 2014
Thread 1 advanced to log sequence 3321
Thread 1 opened at log sequence 3321
  Current log# 3 seq# 3321 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Fri Jul 25 10:50:09 2014
SMON: enabling cache recovery
Fri Jul 25 10:50:10 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA03216.TRC:
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []

Fri Jul 25 10:50:10 2014
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3321 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Fri Jul 25 10:50:10 2014
SMON: disabling cache recovery
Fri Jul 25 10:50:10 2014
ORA-600 signalled during: alter database open

ORA-600 4194这个错误在数据库异常恢复中非常常见,因为库不是很重要,因此就是直接屏蔽掉故障回滚段,然后强制拉库,该库的恢复过程中,也直接使用隐含参数屏蔽回滚段
_corrupted_rollback_segments= RBS0, RBS1, RBS2, RBS3, RBS4, RBS5, RBS6, RBS_HDSYS,数据库依然无法open,进一步分析trace文件

Fri Jul 25 11:26:07 2014
ORACLE V8.1.7.0.0 - Production vsnsta=0
vsnsql=e vsnxtr=3
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Oracle8i Release 8.1.7.0.0 - Production
JServer Release 8.1.7.0.0 - Production
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Instance name: orcl

Redo thread mounted by this instance: 1

Oracle process number: 14

Windows thread id: 3648, image: ORACLE.EXE


*** SESSION ID:(11.1) 2014-07-25 11:26:07.843
*** 2014-07-25 11:26:07.843
ksedmp: internal or fatal error
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12 where us#=:1
----- Call Stack Trace -----

这里很明显看出来,数据库是在open过程中,update undo$表遭遇到ORA-600 4194,因为该过程需要使用系统回滚段,但是由于其所对应的undo和redo信息不一致,所以无法正常启动数据库.继续读trace文件

  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      space2: 0      #extents: 5      #blocks: 49    
                  last map  0x00000000  #maps: 0      offset: 4128  
      Highwater::  0x00400006  ext#: 0      blk#: 3      ext size: 9     
  #blocks in seg. hdr's freelists: 0     
  #blocks below: 0     
  mapblk  0x00000000  offset: 0     
                   Unlocked
     Map Header:: next  0x00000000  #extents: 5    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x00400003  length: 9     
   0x0040000c  length: 10    
   0x0040008f  length: 10    
   0x00400099  length: 10    
   0x004000a3  length: 10    
  
  TRN CTL:: seq: 0x003c chd: 0x004e ctl: 0x0050 inc: 0x00000000 nfb: 0x0000
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400006.003c.25 scn: 0x0000.009a4009
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00000000.003c.24 ext: 0x0  spc: 0x196   
    uba: 0x00000000.001f.14 ext: 0x1  spc: 0x16f6  
    uba: 0x00000000.0018.02 ext: 0x4  spc: 0x1f1a  
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0     
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0     
  TRN TBL::

通过这里可以看出来,数据库在启动的时候,使用system undo的block为为0×00400006,使用bbed清除掉该uba记录,让数据库启动的时候重新分配system undo block给数据库执行update undo$使用,数据库open成功

BBED> m /x 0x00000000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4188 to 4192           Dba:0x00000000
------------------------------------------------------------------------
 00000000 3c002400 00009601 00000000 1f001400 0100f616 00000000 18000200

BBED> m /x 0x0000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4028 to 4032           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 3c005000 02800100 68000000 feffff7f 06004000 3c002400
Sat Jul 26 12:09:21 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3326 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Sat Jul 26 12:09:21 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Sat Jul 26 12:09:22 2014
Thread 1 advanced to log sequence 3327
Thread 1 opened at log sequence 3327
  Current log# 3 seq# 3327 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Sat Jul 26 12:09:22 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Jul 26 12:09:39 2014
Completed: alter database open
发表在 非常规恢复 | 标签为 | 评论关闭