标签归档:ORA-600 17182

ORA-600 17182导致oracle异常

正常运行的数据库突然爆ORA-00600 17182,然后直接crash,以前遇到过类似的case:分享一例由于主库逻辑坏块导致dataguard容灾失效,这又是一例数据库正常crash之后无法启动成功的case

Tue May 22 08:32:12 2018
Archived Log entry 84344 added for thread 1 sequence 90196 ID 0x103430df dest 1:
Tue May 22 09:05:42 2018
Thread 1 cannot allocate new log, sequence 90198
Private strand flush not complete
  Current log# 4 seq# 90197 mem# 0: +DATA/xifenfei/onlinelog/group_4.279.887464919
Thread 1 advanced to log sequence 90198 (LGWR switch)
  Current log# 2 seq# 90198 mem# 0: +DATA/xifenfei/onlinelog/group_2.263.887465041
Tue May 22 09:05:46 2018
Archived Log entry 84345 added for thread 1 sequence 90197 ID 0x103430df dest 1:
Tue May 22 09:07:42 2018
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_16297.trc  (incident=592822):
ORA-00600: 内部错误代码, 参数: [17182], [0x7FE274EADBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_592822/xifenfei_ora_16297_i592822.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:07:45 2018
Dumping diagnostic data in directory=[cdmp_20180522090745], requested by (instance=1, osid=16297), summary=[incident=592822].
Tue May 22 09:07:46 2018
Sweep [inc][592822]: completed
Sweep [inc2][592822]: completed
Tue May 22 09:08:29 2018
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_16297.trc  (incident=592824):
ORA-07445: 出现异常错误: 核心转储 [kghrcdepth()+168] [SIGSEGV] [ADDR:0x7FE2766ADD04] [PC:0x2C2B886] [Invalid permissions for mapped object] []
ORA-00600: 内部错误代码, 参数: [kghrcdepth:ds], [0x7FE274EADBE8], [], [], [], [], [], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [17182], [0x7FE274EADBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_592824/xifenfei_ora_16297_i592824.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:08:31 2018
Dumping diagnostic data in directory=[cdmp_20180522090831], requested by (instance=1, osid=16297), summary=[incident=592823].
Tue May 22 09:08:44 2018
Block recovery from logseq 90198, block 37639 to scn 161030804187
Recovery of Online Redo Log: Thread 1 Group 2 Seq 90198 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_2.263.887465041
Block recovery completed at rba 90198.97219.16, scn 37.2117014236
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pmon_7690.trc  (incident=592118):
ORA-00600: internal error code, arguments: [17182], [0x7F96BDA2AA70], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_592118/xifenfei_pmon_7690_i592118.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97E6B5C, kghpmfal()+216] [flags: 0x0, count: 1]
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pmon_7690.trc  (incident=592119):
ORA-07445: exception encountered: core dump [kghpmfal()+216] [SIGSEGV] [ADDR:0x0] [PC:0x97E6B5C] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F96BDA2AA70], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_592119/xifenfei_pmon_7690_i592119.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:08:45 2018
Dumping diagnostic data in directory=[cdmp_20180522090845], requested by (instance=1, osid=7690 (PMON)), summary=[incident=592118].
Tue May 22 09:08:47 2018
Sweep [inc][592824]: completed
Sweep [inc][592823]: completed
Sweep [inc][592119]: completed
Sweep [inc][592118]: completed
Sweep [inc2][592824]: completed
Sweep [inc2][592119]: completed
Sweep [inc2][592118]: completed
Tue May 22 09:08:47 2018
ARC2 (ospid: 7834): terminating the instance due to error 472
Instance terminated by ARC2, pid = 7834

无法正常open

Completed: ALTER DATABASE   MOUNT
Tue May 22 09:26:44 2018
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Started redo scan
Completed redo scan
 read 12232 KB redo, 4787 data blocks need recovery
Started redo application at
 Thread 1: logseq 90199, block 233846
Recovery of Online Redo Log: Thread 1 Group 3 Seq 90199 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_3.262.887465049
Completed redo application of 10.34MB
Completed crash recovery at
 Thread 1: logseq 90199, block 258311, scn 161030851622
 4787 data blocks read, 4787 data blocks written, 12232 redo k-bytes read
LGWR: STARTING ARCH PROCESSES
Tue May 22 09:26:45 2018
ARC0 started with pid=48, OS id=18632 
Tue May 22 09:26:46 2018
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thread 1 advanced to log sequence 90200 (thread open)
Tue May 22 09:26:46 2018
ARC1 started with pid=49, OS id=18636 
Tue May 22 09:26:46 2018
ARC2 started with pid=50, OS id=18640 
Tue May 22 09:26:46 2018
ARC3 started with pid=51, OS id=18644 
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 90200
  Current log# 5 seq# 90200 mem# 0: +DATA/xifenfei/onlinelog/group_5.280.887465135
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue May 22 09:26:46 2018
SMON: enabling cache recovery
[18512] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:2704839736 end:2704839986 diff:250 (2 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
Archived Log entry 84347 added for thread 1 sequence 90199 ID 0x103430df dest 1:
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Tue May 22 09:26:47 2018
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p019_18664.trc  (incident=624628):
ORA-00600: internal error code, arguments: [17182], [0x7F7A4A50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624628/xifenfei_p019_18664_i624628.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Starting background process QMNC
Tue May 22 09:26:48 2018
QMNC started with pid=71, OS id=18737 
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97ECF6C, kghalo()+570] [flags: 0x0, count: 1]
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p019_18664.trc  (incident=624629):
ORA-00600: internal error code, arguments: [17182], [0x7F7A4A50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624629/xifenfei_p019_18664_i624629.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:26:49 2018
Tue May 22 09:26:50 2018
Starting background process EMNC
Tue May 22 09:26:50 2018
EMNC started with pid=76, OS id=18814 
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97ECF6C, kghalo()+570] [flags: 0x0, count: 2]
Completed: ALTER DATABASE OPEN
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p019_18664.trc  (incident=624630):
ORA-07445: exception encountered: core dump [kghalo()+570] [SIGSEGV] [ADDR:0x0] [PC:0x97ECF6C] [SI_KERNEL(general_protection)] []
ORA-07445: exception encountered: core dump [kghalo()+570] [SIGSEGV] [ADDR:0x0] [PC:0x97ECF6C] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F7A4A50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624630/xifenfei_p019_18664_i624630.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:27:01 2018
Block recovery from logseq 90200, block 59 to scn 161030851961
Recovery of Online Redo Log: Thread 1 Group 5 Seq 90200 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_5.280.887465135
Block recovery stopped at EOT rba 90200.935.16
Block recovery completed at rba 90200.935.16, scn 37.2117062009
Starting background process CJQ0
Tue May 22 09:27:01 2018
SMON: slave died unexpectedly, downgrading to serial recovery
Tue May 22 09:27:01 2018
CJQ0 started with pid=56, OS id=18922 
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x9823AA3, kghalo()+567] [flags: 0x0, count: 1]
Errors in file /u01/app/diag/rdbms/posdg/posdg/trace/posdg_p019_11656.trc  (incident=1136658):
ORA-07445: exception encountered: core dump [kghalo()+567] [SIGSEGV] [ADDR:0x0] [PC:0x9823AA3] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F813F61DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/posdg/posdg/incident/incdir_1136658/posdg_p019_11656_i1136658.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/diag/rdbms/posdg/posdg/trace/posdg_ora_10925.trc:
ORA-00600: internal error code, arguments: [2252], [49410], [2147581953], [3726], [1009467392], [], [], [], [], [], [], []
Errors in file /u01/app/diag/rdbms/posdg/posdg/trace/posdg_ora_10925.trc:
ORA-00600: internal error code, arguments: [2252], [49410], [2147581953], [3726], [1009467392], [], [], [], [], [], [], []
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_18434.trc  (incident=624292):
ORA-00600: internal error code, arguments: [17182], [0x7F7488BDD7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624292/xifenfei_smon_18434_i624292.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97E64D7, kghalf()+537] [flags: 0x0, count: 1]
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_18434.trc  (incident=624293):
ORA-07445: exception encountered: core dump [kghalf()+537] [SIGSEGV] [ADDR:0x0] [PC:0x97E64D7] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F7488BDD7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624293/xifenfei_smon_18434_i624293.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:27:03 2018
Dumping diagnostic data in directory=[cdmp_20180522092703], requested by (instance=1, osid=18434 (SMON)), summary=[incident=624292].
PMON (ospid: 18383): terminating the instance due to error 474
Tue May 22 09:27:05 2018
opiodr aborting process unknown ospid (18839) as a result of ORA-1092
System state dump requested by (instance=1, osid=18383 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_diag_18402_20180522092705.trc
Tue May 22 09:27:05 2018
ORA-1092 : opitsk aborting process
Instance terminated by PMON, pid = 18383

通过对于启动过程的观察,比较明显,由于数据库无法正常回滚,导致smon进程异常,从而使得数据库无法启动成功.恢复方法比较简单,就是对异常事务进行提交或者跳过即可

发表在 ORA-xxxxx, Oracle备份恢复 | 标签为 , , , , , | 评论关闭

分享一例由于主库逻辑坏块导致dataguard容灾失效

有朋友数据库配置了物理dataguard,但是由于触发了某种Oracle bug/或者其他原因导致block出现大量逻辑坏块,结果是主备库(主库启动后就crash,备库failover后也启动后就crash)都无法正常工作,请求给予技术支持。
数据库配置了物理dataguard,但是主库在进行了某些操作之后,主库直接crash.重启主库发现,数据库启动之后,稍后数据库继续CRASH

Wed Jul 08 16:32:22 2015
Thread 1 advanced to log sequence 401531 (LGWR switch)
  Current log# 6 seq# 401531 mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_6_b9p7qhrm_.log
Archived Log entry 605194 added for thread 1 sequence 401530 ID 0xfbe74bc9 dest 1:
ARC3: Standby redo logfile selected for thread 1 sequence 401530 for destination LOG_ARCHIVE_DEST_2
******************************************************************
LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
Wed Jul 08 16:33:02 2015
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p032_413611.trc  (incident=201910):
ORA-00600: internal error code, arguments: [17114], [0x7FB03BB6DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FB03BB6DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201910/xifenfei_p032_413611_i201910.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p032_413611.trc  (incident=201911):
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x7FB03BB6DBE8], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FB03BB6DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FB03BB6DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201911/xifenfei_p032_413611_i201911.trc
Wed Jul 08 16:34:23 2015
SMON: slave died unexpectedly, downgrading to serial recovery
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc  (incident=201547):
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201547/xifenfei_smon_413207_i201547.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 401530, block 119 to scn 73226510040
Recovery of Online Redo Log: Thread 1 Group 5 Seq 401530 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_5_b9p7qhl0_.log
Recovery of Online Redo Log: Thread 1 Group 6 Seq 401531 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_6_b9p7qhrm_.log
Block recovery completed at rba 401531.1882.16, scn 17.212066009
ORACLE Instance xifenfei (pid = 16) - Error 600 encountered while recovering transaction (13, 14) on object 135520.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc:
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Wed Jul 08 16:34:24 2015
Dumping diagnostic data in directory=[cdmp_20150708163424], requested by (instance=1, osid=413207 (SMON)), summary=[incident=201547].
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc  (incident=201548):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F24D680D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201548/xifenfei_smon_413207_i201548.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc:
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F24D680D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 413207): terminating the instance due to error 474
System state dump requested by (instance=1, osid=413207 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_diag_413167_20150708163426.trc
Dumping diagnostic data in directory=[cdmp_20150708163426], requested by (instance=1, osid=413207 (SMON)), summary=[abnormal instance termination].
Instance terminated by SMON, pid = 413207

由于主库不能正常open,备库直接failover方式激活

Wed Jul 08 17:56:41 2015
alter database recover managed standby database finish
Terminal Recovery: request posted (xffdb)
Wed Jul 08 17:56:45 2015
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '07/08/2015 17:56:45'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 401533 redo required
Terminal Recovery: 
Recovery of Online Redo Log: Thread 1 Group 7 Seq 401533 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_7_bb7x31lk_.log
Identified End-Of-Redo (failover) for thread 1 sequence 401533 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 73226530800 time 07/08/2015 16:56:40
Terminal Recovery: successful completion
Wed Jul 08 17:56:45 2015
ARCH: Archival stopped, error occurred. Will continue retrying
Forcing ARSCN to IRSCN for TR 17:212086768
ORACLE Instance xffdb - Archival Error
Attempt to set limbo arscn 17:212086768 irscn 17:212086768 ORA-16014: log 7 sequence# 401533 not archived, no available destinations
ORA-00312: online log 7 thread 1: '/opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_7_bb7x31lk_.log'

Resetting standby activation ID 4226239433 (0xfbe74bc9)
Wed Jul 08 17:56:45 2015
MRP0: Media Recovery Complete (xffdb)
MRP0: Background Media Recovery process shutdown (xffdb)
Terminal Recovery: completion detected (xffdb)
Completed: alter database recover managed standby database finish
Wed Jul 08 17:56:58 2015
alter database commit to switchover to primary
ALTER DATABASE SWITCHOVER TO PRIMARY (xffdb)
Maximum wait for role transition is 15 minutes.
Backup controlfile written to trace file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_ora_485094.trc
Standby terminal recovery start SCN: 73226530482
RESETLOGS after incomplete recovery UNTIL CHANGE 73226530800
Online logfile pre-clearing operation disabled by switchover
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_1_bb7x30gw_.log: Thread 1 Group 1 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log: Thread 1 Group 2 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log: Thread 1 Group 3 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log: Thread 1 Group 4 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log: Thread 1 Group 5 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_6_bb7x31cz_.log: Thread 1 Group 6 was previously cleared
Standby became primary SCN: 73226530481
Wed Jul 08 17:56:58 2015
Setting recovery target incarnation to 3
Switchover: Complete - Database mounted as primary
Completed: alter database commit to switchover to primary

数据库激活成功后,重启激活之后数据库发现和主库出现类似情况

Wed Jul 08 17:57:25 2015
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 4243462021
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Wed Jul 08 17:57:30 2015
ALTER DATABASE OPEN
Assigning activation ID 4243462021 (0xfcee1785)
LGWR: STARTING ARCH PROCESSES
Wed Jul 08 17:57:30 2015
ARC0 started with pid=23, OS id=485230 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thread 1 advanced to log sequence 2 (thread open)
Wed Jul 08 17:57:31 2015
ARC1 started with pid=24, OS id=485236 
Wed Jul 08 17:57:31 2015
ARC2 started with pid=25, OS id=485240 
Wed Jul 08 17:57:31 2015
ARC3 started with pid=26, OS id=485244 
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 2
  Current log# 2 seq# 2 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Wed Jul 08 17:57:31 2015
NSA2 started with pid=27, OS id=485248 
[485226] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:3453451748 end:3453452018 diff:270 (2 seconds)
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Wed Jul 08 17:57:32 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64481):
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64481/xffdb_p032_485380_i64481.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Starting background process QMNC
Wed Jul 08 17:57:32 2015
QMNC started with pid=92, OS id=485512 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Dumping diagnostic data in directory=[cdmp_20150708175733], requested by (instance=1, osid=485380 (P032)), summary=[incident=64481].
Thread 1 advanced to log sequence 3 (LGWR switch)
  Current log# 3 seq# 3 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
ARC3: STARTING ARCH PROCESSES
Wed Jul 08 17:57:34 2015
ARC4 started with pid=93, OS id=485516 
Wed Jul 08 17:57:35 2015
db_recovery_file_dest_size of 204800 MB is 0.41% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
ARC4: Archival started
ARC3: STARTING ARCH PROCESSES COMPLETE
krsk_srl_archive_int: Enabling archival of deferred physical standby SRLs
Archived Log entry 273963 added for thread 1 sequence 2 ID 0xfcee1785 dest 1:
Archived Log entry 273964 added for thread 1 sequence 401533 ID 0xfbe74bc9 dest 1:
Completed: ALTER DATABASE OPEN
Wed Jul 08 17:57:36 2015
Starting background process CJQ0
Wed Jul 08 17:57:36 2015
CJQ0 started with pid=95, OS id=485554 
Shutting down archive processes
ARCH shutting down
ARC4: Archival stopped
Wed Jul 08 17:57:41 2015
Thread 1 advanced to log sequence 4 (LGWR switch)
  Current log# 4 seq# 4 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Wed Jul 08 17:57:41 2015
Archived Log entry 273965 added for thread 1 sequence 3 ID 0xfcee1785 dest 1:
Wed Jul 08 17:58:30 2015
Sweep [inc][64481]: completed
Sweep [inc2][64481]: completed
Wed Jul 08 17:58:31 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64482):
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64482/xffdb_p032_485380_i64482.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jul 08 17:58:32 2015
Dumping diagnostic data in directory=[cdmp_20150708175832], requested by (instance=1, osid=485380 (P032)), summary=[incident=64482].
Wed Jul 08 17:58:36 2015
Thread 1 advanced to log sequence 5 (LGWR switch)
  Current log# 5 seq# 5 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Wed Jul 08 17:59:02 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64483):
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64483/xffdb_p032_485380_i64483.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jul 08 17:59:04 2015
Dumping diagnostic data in directory=[cdmp_20150708175904], requested by (instance=1, osid=485380 (P032)), summary=[incident=64483].
Wed Jul 08 17:59:29 2015
Sweep [inc][64483]: completed
Sweep [inc][64482]: completed
Sweep [inc2][64483]: completed
Sweep [inc2][64482]: completed
Wed Jul 08 17:59:30 2015
Block recovery from logseq 2, block 104 to scn 73226531646
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Block recovery stopped at EOT rba 5.765.16
Block recovery completed at rba 5.765.16, scn 17.212087614
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64484):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7FE96B50D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64484/xffdb_p032_485380_i64484.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Dumping diagnostic data in directory=[cdmp_20150708175934], requested by (instance=1, osid=485380 (P032)), summary=[incident=64487].
Wed Jul 08 17:59:36 2015
SMON: slave died unexpectedly, downgrading to serial recovery
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc  (incident=64129):
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64129/xffdb_smon_485190_i64129.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 2, block 104 to scn 73226531646
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Block recovery completed at rba 5.765.16, scn 17.212087615
ORACLE Instance xffdb (pid = 16) - Error 600 encountered while recovering transaction (13, 14) on object 135520.
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc:
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Dumping diagnostic data in directory=[cdmp_20150708175937], requested by (instance=1, osid=485190 (SMON)), summary=[incident=64129].
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc  (incident=64130):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F5AED10D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64130/xffdb_smon_485190_i64130.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc:
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F5AED10D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 485190): terminating the instance due to error 474
System state dump requested by (instance=1, osid=485190 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_diag_485150_20150708175939.trc
Dumping diagnostic data in directory=[cdmp_20150708175939], requested by (instance=1, osid=485190 (SMON)), summary=[abnormal instance termination].
Instance terminated by SMON, pid = 485190

通过进一步分析确定是67号文件有异常,使用dbv检查该文件发现

DBVERIFY: Release 11.2.0.4.0 - Production on Thu Jul 9 11:08:27 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


DBVERIFY - Verification starting : FILE = /opt/oracle/database/oradata/xffdb/dk_adv_ts_main_25.dbf
Block Checking: DBA = 283035270, Block Type = KTB-managed data block
data header at 0x7f87813ea064
kdbchk: the amount of space used is not equal to block size
        used=7383 fsc=0 avsp=873 dtl=8088
Page 2016902 failed with check code 6110
Block Checking: DBA = 283035271, Block Type = KTB-managed data block
data header at 0x7f87813ec064
kdbchk: the amount of space used is not equal to block size
        used=8676 fsc=0 avsp=832 dtl=8088
Page 2016903 failed with check code 6110
Block Checking: DBA = 283035272, Block Type = KTB-managed data block
data header at 0x7f87813ee064
kdbchk: avsp(1306) > tosp(894)
Page 2016904 failed with check code 6128
Block Checking: DBA = 283035273, Block Type = KTB-managed data block
data header at 0x7f87813f0064
kdbchk: the amount of space used is not equal to block size
        used=7506 fsc=0 avsp=815 dtl=8088
Page 2016905 failed with check code 6110
Block Checking: DBA = 283035274, Block Type = KTB-managed data block
data header at 0x7f87813f2064
kdbchk: the amount of space used is not equal to block size
        used=7892 fsc=0 avsp=884 dtl=8088
Page 2016906 failed with check code 6110
…………
Block Checking: DBA = 283035368, Block Type = KTB-managed data block
data header at 0x7f87814ae064
kdbchk: the amount of space used is not equal to block size
        used=7934 fsc=0 avsp=837 dtl=8088
Page 2017000 failed with check code 6110
Block Checking: DBA = 283035369, Block Type = KTB-managed data block
data header at 0x7f87814b0064
kdbchk: the amount of space used is not equal to block size
        used=7683 fsc=0 avsp=883 dtl=8088
Page 2017001 failed with check code 6110
Block Checking: DBA = 283035370, Block Type = KTB-managed data block
data header at 0x7f87814b2064
kdbchk: the amount of space used is not equal to block size
        used=8556 fsc=0 avsp=841 dtl=8088
Page 2017002 failed with check code 6110
Block Checking: DBA = 283035371, Block Type = KTB-managed data block
data header at 0x7f87814b4064
kdbchk: the amount of space used is not equal to block size
        used=7460 fsc=0 avsp=822 dtl=8088
Page 2017003 failed with check code 6110

DBVERIFY - Verification complete

Total Pages Examined         : 4063232
Total Pages Processed (Data) : 2724435
Total Pages Failing   (Data) : 102
Total Pages Processed (Index): 1064839
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 273957
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 212065913 (17.212065913)

把datafile 67给offline之后,数据库open正常,也不再crash。这里比较明显,导致主库和被动都出现异常,主库直接crash,然后重启后,一会儿就crash;备库激活后,重启后一会儿也就crash;两个库现象相同。最后通过dbv定位到是由于某个文件出现大量逻辑坏块,导致数据库open之后,进行回滚之时crash.对于此类故障,可以通过屏蔽事务回滚,并且通过alert日志和trace文件定位到异常对象,可以删除异常对象可以暂时解决该问题。而导致数据库出现类似问题。
通过这个案例,可以的出来,由于oracle某种bug或者其他原因,导致block 逻辑损坏,而且这个损坏会传输到备库,导致备库也异常,oracle的备份,容灾不能全部依赖物理dataguard容灾。因此在条件允许情况下,建议增加物理备份和逻辑容灾(类似OGG)

发表在 Data Guard | 标签为 , , , , , , | 3 条评论

记录一次由于坏块和不恰当恢复引起各种ORA-600案例

朋友让我帮忙处理一个不能open的库,打开alert日志一看,傻眼了,里面是各种ORA-600的错误应有尽有,被折腾的够惨
故障后重启,无法启动主要表现在block坏块,引起的各种ORA-600等错误

Mon Mar 02 16:09:27 2015
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 23 processes
Started redo scan
Completed redo scan
 read 962 KB redo, 256 data blocks need recovery
Started redo application at
 Thread 1: logseq 726, block 37343
Recovery of Online Redo Log: Thread 1 Group 3 Seq 726 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/oa/redo03.log
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 1673 OF FILE 3
Completed redo application of 0.27MB
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 3104 OF FILE 3
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 3613 OF FILE 3
Mon Mar 02 16:09:28 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 272 OF FILE 3
Mon Mar 02 16:09:28 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 2512 OF FILE 3
Hex dump of (file 2, block 92889) in trace file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc
Corrupt block relative dba: 0x00816ad9 (file 2, block 92889)
Bad header found during preparing block for write
Data in bad block:
 type: 0 format: 0 rdba: 0x6ad90000
 last change scn: 0x0000.00c6a052 seq: 0x1 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x5d7e
 consistency value in tail: 0xa0520001
 check value in block header: 0x0
 block checksum disabled
Mon Mar 02 16:09:28 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p007_4196.trc  (incident=3833):
ORA-00600: internal error code, arguments: [4502], [1], [], [], [], [], [], [], [], [], [], []
Mon Mar 02 16:09:28 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p013_4208.trc  (incident=3881):
ORA-00600: internal error code, arguments: [2037], [4259067], [4244307968], [159], [243], [0], [2162032704], [100728832], [], [], [], []
Slave exiting with ORA-1172 exception
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p009_4200.trc:
ORA-01172: recovery of thread 1 stuck at block 3613 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p001_4184.trc:
ORA-01172: recovery of thread 1 stuck at block 2512 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p021_4224.trc:
ORA-10388: parallel query server interrupt (failure)
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p021_4224.trc:
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc  (incident=3697):
ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_3697/oa_dbw2_4158_i3697.trc
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xD2DDB7, kcbs_shrink_pool()+705] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_mman_4152.trc  (incident=3673):
ORA-07445: exception encountered: core dump [kcbs_shrink_pool()+705] [SIGSEGV] [ADDR:0x0] [PC:0xD2DDB7] [SI_KERNEL(general_protection)] []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_3673/oa_mman_4152_i3673.trc
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc:
Mon Mar 02 16:09:34 2015
Instance terminated by DBW2, pid = 4158

第二次重启后增加新错误ORA-00600[17182]

Mon Mar 02 16:39:50 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p002_4321.trc  (incident=4993):
ORA-00600: internal error code, arguments: [17182], [0x7F548C2BDBA8], [], [], [], [], [], [], [], [], [], []

进行了一些恢复处理后,日志中报错
主要体现在进行了不完全恢复,而且应该是对redo进行了重命名或者redo头损坏锁引起的一系列提示

Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
 read 962 KB redo, 256 data blocks need recovery
Started redo application at
 Thread 1: logseq 726, block 37343
Recovery of Online Redo Log: Thread 1 Group 3 Seq 726 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/oa/redo03.log
RECOVERY OF THREAD 1 STUCK AT BLOCK 1673 OF FILE 3
Aborting crash recovery due to error 1172
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-01172: recovery of thread 1 stuck at block 1673 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-01172: recovery of thread 1 stuck at block 1673 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: alter  database open...
Tue Mar 03 11:17:59 2015
Sweep [inc][17178]: completed
Sweep [inc][17177]: completed
Sweep [inc2][17178]: completed
Tue Mar 03 11:18:00 2015
ALTER DATABASE RECOVER  database until cancel  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 24 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Tue Mar 03 11:18:06 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-00266: name of archived log file needed
ORA-266 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL 
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'
Slave exiting with ORA-1547 exception
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'
ORA-10879 signalled during: ALTER DATABASE RECOVER CANCEL ...
Tue Mar 03 11:18:06 2015
Checker run found 4 new persistent data failures
Tue Mar 03 11:18:13 2015
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 12986989
Resetting resetlogs activation ID 3278679642 (0xc36cae5a)
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-00367: checksum error in log file header
ORA-00322: log 1 of thread 1 is not current copy
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/oa/redo01.log'
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:

再一步折腾,增加了_allow_resetlogs_corruption= TRUE之后数据库报ORA-600[2662]

Tue Mar 03 11:19:26 2015
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6864.trc  (incident=18195):
ORA-00600: internal error code, arguments: [2662], [0], [13007002], [0], [13016626], [4194545], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_18195/oa_ora_6864_i18195.trc
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6864.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [0], [13007002], [0], [13016626], [4194545], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 6864): terminating the instance due to error 704
Instance terminated by USER, pid = 6864
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (6864) as a result of ORA-1092
Tue Mar 03 11:19:29 2015
ORA-1092 : opitsk aborting process

进一步折腾,可以看出来undo已经被其offline,无法正常访问,导致系统报ORA-704和ORA-00376

Wed Mar 04 21:10:58 2015
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17074.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/oa/undotbs01.dbf'
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17074.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/oa/undotbs01.dbf'
Error 704 happened during db open, shutting down database
USER (ospid: 17074): terminating the instance due to error 704
Instance terminated by USER, pid = 17074
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (17074) as a result of ORA-1092
Wed Mar 04 21:11:00 2015
ORA-1092 : opitsk aborting process

通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)检测结果见附件(xifenfei_db_recover_20150304),这里可以知道undo 不知道怎么折腾的数据文件scn较大而且还offline,
通过一些列方法(bbed,隐含参数等)调整数据库scn,强制启动数据库,报如下错误

Wed Mar 04 22:50:23 2015
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0000.4000003e):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 17807): terminating the instance due to error 704
Instance terminated by USER, pid = 17807
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (17807) as a result of ORA-1092

根据经验,该错误怀疑是文件头scn不够大,块延迟清理导致,进一步增加scn尝试,最后依旧是ORA-00704/ORA-00604/ORA-01555错误

Wed Mar 04 22:50:23 2015
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0000.4000003e):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 17807): terminating the instance due to error 704
Instance terminated by USER, pid = 17807
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (17807) as a result of ORA-1092

根据经验,在scn上做手脚估计难以解决给问题,对其启动过程做10046和errorstack分析发现

PARSING IN CURSOR #3 len=202 dep=2 uid=0 oct=3 lid=0 tim=1425481940448439 hv=3819099649 ad='64ff91af8' sqlid='3nkd3g3ju5ph1'
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
END OF STMT
PARSE #3:c=1000,e=334,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=4,plh=0,tim=1425481940448439
BINDS #3:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f5b3253a6f0  bln=22  avl=01  flg=05
  value=0
 Bind#1
  oacdty=01 mxl=32(06) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=852 siz=32 off=0
  kxsbbbfp=7f5b3253a6b8  bln=32  avl=06  flg=05
  value="PROPS$"
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f5b3253a688  bln=24  avl=02  flg=05
  value=1
EXEC #3:c=0,e=640,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=4,plh=2853959010,tim=1425481940449147
WAIT #3: nam='db file sequential read' ela= 5 file#=1 block#=345 blocks=1 obj#=37 tim=1425481940449186
WAIT #3: nam='db file sequential read' ela= 4 file#=1 block#=44528 blocks=1 obj#=37 tim=1425481940449221
WAIT #3: nam='db file sequential read' ela= 3 file#=1 block#=5505 blocks=1 obj#=37 tim=1425481940449247

*** 2015-03-04 23:12:20.450
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
----- Current SQL Statement for this session (sql_id=g64r07v2jn8nq) -----
SELECT NULL FROM PROPS$ WHERE NAME='BOOTSTRAP_UPGRADE_ERROR'

这里可以发现是数据库在启动的过程中需要执行SELECT NULL FROM PROPS$ WHERE NAME=’BOOTSTRAP_UPGRADE_ERROR’语句,而该语句递归调用了select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null 语句。既然这样通过一些方法避免数据库启动之时查询SELECT NULL FROM PROPS$ WHERE NAME=’BOOTSTRAP_UPGRADE_ERROR’语句,果然数据库启动成功。

知识点补充
ORA-600 [4502] [a]

Arg [a] ITL entry with a lock count
Meaning: During ITL cleanout we clear all row locks but the ITL entry
	 still thinks there is an uncleared lock. Ie: ITL has a locked
	 row but there are no locked rows in the block 

大体意思是数据库发现undo 的itl已经被清除,但是block中的itl依然存在,从而出现ORA-600[4502],引起该问题除bug外主要原因是坏块

ORA-600 [2037] [a] [b] {c} [d] [e] [f] [g]

Arg [a] Relative Data Block Address (RDBA) that the redo vector is for
Arg [b] The Block format  
Arg {c} RDBA in the block itself
Arg [d] The block type
Arg [e] The sequence number
Arg [f] Flags, if set  
Arg [g] The return value from the block head/tail checker.
DESCRIPTION:
  During recovery we are examining a block to ensure that it is not
  corrupt prior to applying any change vectors.
  The block has failed this check and this exception is raised

大体意思是在恢复过程中,正在检查的块,以确保它在应用任何变化向量之前不损坏。如果检查失败排除该异常ORA-600[2037],引起该问题除bug外主要原因是坏块

ORA-600 [kcbzpbuf_1],[a],[b]

Arg [a] Corruption reason
Arg [b] Calculate checksum flag

Corruption reason:
#define KCBH_GOOD    0                                     /* block is valid */
#define KCBH_ZERO    1             /* block header was entirely zero on disk */
#define KCBH_BROKEN  2      /* corruption could be from a partial disk write */
#define KCBH_CHKVAL  3               /* The check value for the block failed */
#define KCBH_CORRUPT 4     /* this is the wrong block or is not a data block */
#define KCBH_ZERONG  5               /* all zero block and it is not allowed */

Calculate checksum flag:
The possible values are 1 (Generate Checksum - db_block_checksum is enabled - default value) 
                        0 (do not generate checksum - db_block_checksum=false)

kcbzpbuf_1是该错误的源码函数

ORA-600 [17182] [a] [b] {c} [d] [e]

DESCRIPTION:        
  Oracle has detected that the magic number in a memory chunk header has been overwritten.
  This is a heap (in memory) corruption and there is no underlying data corruption.
  The error may occur in the one of the process specific heaps 
  (the Call heap, PGA heap, or session heap) or in the shared heap (SGA).

ORACLE 发现在内存中重要的块头被重新,但是没有基础数据损坏,大部分和数据块或者内存损坏有关系.

ORA-600 [4552] [a] [b] {c} [d] [e]

DESCRIPTION:
  This assertion is raised because we are trying to unlock the rows in a 
  block, but receive an incorrect block type.
  The second argument is the block type received.

ORACLE尝试对某行进行解锁但是接收到了不正确的数据块类型,Arg [b]是接收到的数据块类型

ORA-600 [2662] [a] [b] {c} [d] [e]

DESCRIPTION:

  A data block SCN is ahead of the current SCN.

  The ORA-600 [2662] occurs when an SCN is compared to the dependent SCN 
  stored in a UGA variable.

  If the SCN is less than the dependent SCN then we signal the ORA-600 [2662]
  internal error.

ARGUMENTS:
  Arg [a]  Current SCN WRAP
  Arg [b]  Current SCN BASE
  Arg {c}  dependent SCN WRAP
  Arg [d]  dependent SCN BASE 
  Arg [e]  Where present this is the DBA where the dependent SCN came from.

主要的含义就是oracle文件头scn比某个block dependent scn小从而出现该问题

发表在 Oracle备份恢复 | 标签为 , , , , , , , , | 评论关闭