月归档:六月 2014

数据库恢复历史再次刷新到Oracle 7.3.2版本—redo异常恢复

有网友在QQ上找我,说Oracle 7.3的数据库,因为redo异常咨询我是否可以恢复
qq咨询


检查数据库得到以下信息

SVRMGR> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle7 Workgroup Server Release 7.3.2.2.1 - Production Release
PL/SQL Release 2.3.2.2.0 - Production
CORE Version 3.5.2.0.0 - Production
TNS for 32-bit Windows: Version 2.3.2.1.0 - Production
NLSRTL Version 3.2.2.0.0 - Production
已选择 5 行

数据文件信息
qq
qq2


redo信息
qq3


跳过redo进行恢复,在resetlogs过程中报rbs表空间坏块,然后通过dul工具获得回滚段名称,然后使用隐含参数屏蔽掉

License high water mark = 2
Starting up ORACLE RDBMS Version: 7.3.2.2.1.
System parameters with non-default values:
  processes                = 800
  shared_pool_size         = 540000000
  control_files            = D:\ORANT\DATABASE\ctl1orcl.ora, D:\ORANT\DATABASE\ctl2orcl.ora
  compatible               = 7.3.0.0.0
  log_buffer               = 327680
  log_checkpoint_interval  = 1000000
  db_files                 = 40
  db_file_simultaneous_writes= 1280
  max_rollback_segments    = 12800
  _offline_rollback_segments= RB13, RB14, RB15, RB16, RB20
  _corrupted_rollback_segments= RB13, RB14, RB15, RB16, RB20
  sequence_cache_entries   = 100
  sequence_cache_hash_buckets= 100
  remote_login_passwordfile= SHARED
  mts_servers              = 0
  mts_max_servers          = 0
  mts_max_dispatchers      = 0
  audit_trail              = NONE
  sort_area_retained_size  = 65536
  sort_direct_writes       = AUTO
  db_name                  = oracle
  open_cursors             = 800
  text_enable              = TRUE
  snapshot_refresh_processes= 1
  background_dump_dest     = %RDBMS73%\trace
  user_dump_dest           = %RDBMS73%\trace

Mon Jun 16 16:46:57 2014

PMON started
Mon Jun 16 16:46:57 2014

DBWR started
Mon Jun 16 16:46:57 2014

LGWR started
Mon Jun 16 16:46:57 2014

RECO started
Mon Jun 16 16:46:57 2014

SNP0 started
Mon Jun 16 16:46:57 2014
alter database  mount exclusive
Mon Jun 16 16:46:58 2014
Successful mount of redo thread 1.
Mon Jun 16 16:46:58 2014
Completed: alter database  mount exclusive
Mon Jun 16 16:48:15 2014
alter database open
Mon Jun 16 16:48:16 2014
Beginning crash recovery of 1 threads
Crash recovery completed successfully
Mon Jun 16 16:48:17 2014
Thread 1 advanced to log sequence 9
  Current log# 1 seq# 9 mem# 0: D:\ORANT\DATABASE\LOG2ORCL.ORA
Thread 1 opened at log sequence 9
  Current log# 1 seq# 9 mem# 0: D:\ORANT\DATABASE\LOG2ORCL.ORA
Successful open of redo thread 1.
Mon Jun 16 16:48:18 2014
SMON: enabling cache recovery
Mon Jun 16 16:48:19 2014
Completed: alter database open
Mon Jun 16 16:48:20 2014
SMON: enabling tx recovery
SMON: about to recover undo segment 14
SMON: mark undo segment 14 as needs recovery
SMON: about to recover undo segment 15
SMON: mark undo segment 15 as needs recovery
SMON: about to recover undo segment 16
SMON: mark undo segment 16 as needs recovery
SMON: about to recover undo segment 17
SMON: mark undo segment 17 as needs recovery
SMON: about to recover undo segment 18
SMON: mark undo segment 18 as needs recovery
Mon Jun 16 16:48:20 2014
Errors in file D:\ORANT\RDBMS73\trace\orclSMON.TRC:
ORA-00600: internal error code, arguments: [4306], [21], [2], [], [], [], [], []

数据库在启动过程中出现ORA-00600[4306],导致smon异常。该错误是因为在数据库open过程中smon会清理临时段从而出现该错误,通过设置event跳过,数据库算整体打开,不过在恢复过程中还遇到了

Mon Jun 16 17:53:10 2014
Errors in file D:\ORANT\RDBMS73\trace\orclDBWR.TRC:
ORA-00600: internal error code, arguments: [3600], [3], [14], [], [], [], [], []


Mon Jun 16 18:05:12 2014
Errors in file D:\ORANT\RDBMS73\trace\ORA06880.TRC:
ORA-01578: ORACLE数据块有错(文件号12, 块号46644)
ORA-01110: 文件'12'没有联机
ORA-00600: 内部错误码, 变元: [4194], [18], [5], [], [], []
ORA-00600: 内部错误码, 变元: [4194], [18], [5], [], [], []

ORA-00600[3600]是因为在offline 回滚段所在表空间锁出现的问题
ORA-00600[4194]是因为回滚段所在的表空间数据文件出现坏块所导致

发表在 非常规恢复 | 标签为 , , , , | 评论关闭

Oracle安全警示录:加错裸设备导致redo异常

最近一个朋友数据库异常了,咨询我,通过分析日志发现对方人员根本不懂aix中的裸设备和Oracle数据库然后就直接使用OEM创建新表空间,导致了数据库crash而且不能正常启动

Thread 1 advanced to log sequence 4395
  Current log# 1 seq# 4395 mem# 0: /dev/rorcl_redo01
Thu Jun 12 19:28:38 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo04' SIZE 2000M EXTENT MANAGEMENT 
LOCAL SEGMENT SPACE MANAGEMENT  AUTO 
ORA-1119 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo04' 
SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...
Thu Jun 12 19:36:23 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo03' SIZE 2000M EXTENT MANAGEMENT 
LOCAL SEGMENT SPACE MANAGEMENT  AUTO 
Thu Jun 12 19:43:56 2014
ORA-604 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo03' 
SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...
Thu Jun 12 19:48:11 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo03' SIZE 2000M EXTENT 
MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO 
Thu Jun 12 19:48:11 2014
ORA-1537 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo03'
 SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...
Thu Jun 12 19:48:20 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo04' SIZE 2000M EXTENT 
MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO 
ORA-1537 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo04' 
SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...
Fri Jun 13 00:50:37 2014
Trace dumping is performing id=[cdmp_20140613005032]
Fri Jun 13 00:50:40 2014
Reconfiguration started (old inc 4, new inc 6)
List of nodes:
 0
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE 
…………
Fri Jun 13 00:50:40 2014
Beginning instance recovery of 1 threads
Reconfiguration complete
Fri Jun 13 00:50:41 2014
 parallel recovery started with 7 processes
Fri Jun 13 00:50:43 2014
Started redo scan
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:
ORA-00316: log 3 of thread 2, type 0 in header is not log file
ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:
ORA-00316: log 3 of thread 2, type 0 in header is not log file
ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'
SMON: terminating instance due to error 316
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_lgwr_335980.trc:
ORA-00316: log  of thread , type  in header is not log file
Instance terminated by SMON, pid = 213438

从这里可以看出来,在使用OEM创建表空间的过程中犯了两个错误
1. 未分清楚aix的块设备和字符设备的命名方式
2. 对于2节点正在使用的current redo作为不适用设备当作未使用设备来创建新表空间
由于创建表空间的使用了错误的文件和错误的设备,导致2节点的当前redo(/dev/rorcl_redo03)被损坏(因为先读redo header,所以数据库中优先反馈出来的是ORA-00316: log of thread , type in header is not log file).从而导致数据库2节点先crash,然后节点1进行实例恢复,但是由于2节点的current redo已经损坏,导致实例恢复无法完成,从而两个节点都crash.因为是rac的一个节点的当前redo损坏,数据库无法正常.
如果有备份该数据库可以使用备份还原进行恢复,如果没有备份只能使用强制拉库的方法抢救数据.希望不要发生一个大的数据丢失悲剧
介绍这个案例希望给大家以警示:对数据库的裸设备操作请谨慎,不清楚切不可乱操作,否则后果严重

发表在 Oracle | 标签为 , , , | 评论关闭

记录一次ORA-00600[kdxlin:psno out of range]/ORA-00600[3020]/ORA-00600[4000]/ORA-00600[4193]的数据库恢复

尝试recover database,遭遇ORA-00600[kdxlin:psno out of range]/ORA-00600[3020]/ORA-00354错误

Media Recovery Log 
Recovery of Online Redo Log: Thread 1 Group 1 Seq 5645 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO01.LOG
Mon Jun 09 15:36:10 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p001_9604.trc:
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []

Mon Jun 09 15:36:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p002_9592.trc:
ORA-00600: internal error code, arguments: [3020], [3], [23337], [12606249], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 23337)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'

Mon Jun 09 15:36:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p001_9604.trc:
ORA-10562: Error occurred while applying redo to data block (file# 3, block# 20142)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 47841
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []

Mon Jun 09 15:36:13 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p002_9592.trc:
ORA-00600: internal error code, arguments: [3020], [3], [23337], [12606249], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 23337)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'

Errors with log 
Mon Jun 09 15:36:14 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2357 change 25400286 time 06/06/2014 04:00:41
ORA-00334: archived log: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG'

Mon Jun 09 15:36:14 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [1], [1490], [6401], [], [], [], []

Mon Jun 09 15:36:16 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-10562: Error occurred while applying redo to data block (file# 1, block# 1490)
ORA-10564: tablespace SYSTEM
ORA-01110: data file 1: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSTEM01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 203
ORA-00600: internal error code, arguments: [kddummy_blkchk], [1], [1490], [6401], [], [], [], []

Media Recovery failed with error 12801
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...

因为数据库允许少量丢失数据,且redo文件发生损坏,直接使用隐含参数屏蔽redo前滚,尝试强制拉库,报ORA-00704,ORA-00600[4000]错误

Mon Jun 09 15:57:51 2014
SMON: enabling cache recovery
Mon Jun 09 15:57:51 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_8664.trc:
ORA-00600: 内部错误代码, 参数: [4000], [1], [], [], [], [], [], []

Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_8664.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [4000], [1], [], [], [], [], [], []

Mon Jun 09 15:57:52 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_pmon_9760.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_reco_5244.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_smon_7096.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_ckpt_7924.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_lgwr_708.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_dbw0_7400.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_mman_9836.trc:
ORA-00704: bootstrap process failure

Instance terminated by USER, pid = 8664
ORA-1092 signalled during: alter database open resetlogs...

对数据库启动过程做10046,然后使用bbed修改scn绕过该错误,然后继续尝试打开数据库,报ORA-00604/ORA-00607/ORA-00600[4193]错误

Mon Jun 09 16:01:09 2014
SMON: enabling cache recovery
Mon Jun 09 16:01:10 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_7548.trc:
ORA-00600: 内部错误代码, 参数: [4193], [57], [51], [], [], [], [], []

Mon Jun 09 16:01:10 2014
Doing block recovery for file 1 block 397
Block recovery range from rba 2.3.0 to scn 0.1073741830
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG
Block recovery stopped at EOT rba 2.5.16
Block recovery completed at rba 2.5.16, scn 0.1073741830
Doing block recovery for file 1 block 9
Block recovery range from rba 2.3.0 to scn 0.1073741829
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG
Block recovery completed at rba 2.5.16, scn 0.1073741830
Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_7548.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码, 参数: [4193], [57], [51], [], [], [], [], []

Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_reco_9176.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_smon_7932.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_ckpt_7428.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_lgwr_6936.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_dbw0_404.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_mman_7968.trc:
ORA-00604: error occurred at recursive SQL level 

Instance terminated by USER, pid = 7548
ORA-1092 signalled during: ALTER DATABASE OPEN...

该错误的原因是因为数据库在启动的过程中,会事先利用上次数据库运行过程中system undo segment header指向的block,而该block异常,所以出现该错误,使用bbed/dul之类的工具清除掉undo seg header 指向block指针,然后数据库启动会重新分配一个block,从而实现数据库正常启动.

发表在 非常规恢复 | 标签为 , , , , , , , , | 评论关闭