标签归档:ORA-600 KCLCHKBLK_4

ora-600 2662和ora-600 kclchkblk_4恢复

这两天连续处理两个case,一个是12.1.0.2版本数据库屏蔽一致性,强制open之后,报ORA-600 2662故障
20210429220218


这个错误本身是一个非常常见的错误,直接推scn即可解决,但是问题是12.1.0.2版本,oracle不允许以前常规的操作方法,就连oradebug都报错oradebug poke ORA-32521/ORA-32519故障解决,而且可以是rac环境,bbed修改文件头也相当麻烦,最后我们使用patch方法轻松解决

另外一例是11.2.0.4版本,强制open库报ora-600 kclchkblk_4

Wed Apr 28 21:25:38 2021
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc  (incident=564430):
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Incident details in: /u01/app/oracle/diag/rdbms/dc/dc1/incident/incdir_564430/dc1_ora_27832_i564430.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Error 704 happened during db open, shutting down database
USER (ospid: 27832): terminating the instance due to error 704
Instance terminated by USER, pid = 27832
ORA-1092 signalled during: alter database open resetlogs...

这个比较简单,参考redo异常 ORA-600 kclchkblk_4 故障恢复.

发表在 Oracle | 标签为 , , , | 留下评论

某医院存储掉线导致Oracle数据库故障恢复

xx医院存储突然掉线,导致数据库异常,现场工程师折腾了一天,问题依旧没有解决,无奈之下找到我们,希望我们能够帮忙恢复数据库.
启动报ORA-00600[2131]错误

Fri Nov 06 14:50:59 2015
ALTER DATABASE   MOUNT
This instance was first to mount
Fri Nov 06 14:50:59 2015
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.4.4)(PORT=1521))' SCOPE=MEMORY SID='xifenfei1';
NOTE: Loaded library: System 
SUCCESS: diskgroup DATA was mounted
NOTE: dependency between database xifenfei and diskgroup resource ora.DATA.dg is established
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13221.trc  (incident=191085):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
Incident details in: /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_191085/xifenfei1_ora_13221_i191085.trc
Fri Nov 06 14:51:10 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: ALTER DATABASE   MOUNT...

出现该错误的原因是由于:We are attempting to write a controlfile checkpoint progress record, but find we do not have the progress record generating this exception.由于控制文件异常导致,出现此类情况,我们一般使用单个控制文件一次尝试,如果都不可以考虑重建控制文件

由于坏块(逻辑/物理)导致数据库实例恢复无法进行

Beginning crash recovery of 2 threads
Started redo scan
kcrfr_rnenq: use log nab 393216
kcrfr_rnenq: use log nab 2
Completed redo scan
 read 4427 KB redo, 500 data blocks need recovery
Started redo application at
 Thread 1: logseq 5731, block 391398
 Thread 2: logseq 4252, block 520815
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5731 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_2.266.835331047
Recovery of Online Redo Log: Thread 2 Group 8 Seq 4252 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_8.331.835330421
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc  (incident=197486):
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_197486/xifenfei1_ora_13770_i197486.trc
Fri Nov 06 15:03:09 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc  (incident=197487):
ORA-01578: ORACLE data block corrupted (file # 2, block # 65207)
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 81045
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_197487/xifenfei1_ora_13770_i197487.trc
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01578: ORACLE data block corrupted (file # 2, block # 65207)
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 81045
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Recovery of Online Redo Log: Thread 2 Group 3 Seq 4253 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_3.332.835330505
Hex dump of (file 14, block 62536) in trace file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc
Reading datafile '+DATA/xifenfei/datafile/ht01.dbf' for corruption at rdba: 0x0380f448 (file 14, block 62536)
Reread (file 14, block 62536) found same corrupt data (logically corrupt)
RECOVERY OF THREAD 1 STUCK AT BLOCK 62536 OF FILE 14
Fri Nov 06 15:03:13 2015
Abort recovery for domain 0
Aborting crash recovery due to error 1172
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01172: recovery of thread 1 stuck at block 62536 of file 14
ORA-01151: use media recovery to recover block, restore backup if needed
Abort recovery for domain 0
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01172: recovery of thread 1 stuck at block 62536 of file 14
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN...

查看资料发现和Bug 14301592 – Several errors by corrupt blocks shifted by 2 bytes in buffer cache during recovery caused by INDEX redo apply,可以通过ALLOW 1 CORRUPTION临时解决

使用ALLOW 1 CORRUPTION进行恢复,出现ORA-07445[kdxlin]错误

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+DATA/xifenfei/onlinelog/group_3.332.835330505     
ORA-00279: change 700860458 generated at 11/05/2015 21:20:15 needed for thread
1
ORA-00289: suggestion : +ARCHIVE/xifenfei/xifenfei_1_5731_835324843.arc
ORA-00280: change 700860458 for thread 1 is in sequence #5731


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+DATA/xifenfei/onlinelog/group_2.266.835331047
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 2, block#
70104)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 82289
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV]
[ADDR:0xC] [PC:0x95FB572] [Address not mapped to object] []


ORA-01112: media recovery not started

ORA-07445[kdxlin()+4088]未找到类似说明,到了这一步,无法简单的恢复成功,只能通过设置隐含参数跳过实例恢复,尝试resetlog库

通过设置_allow_resetlogs_corruption参数继续恢复

SQL> startup pfile='/tmp/pfile.ora' mount;
ORACLE instance started.

Total System Global Area 7315603456 bytes
Fixed Size                  2267384 bytes
Variable Size            2566915848 bytes
Database Buffers         4731174912 bytes
Redo Buffers               15245312 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927],
[0], [700860464], [], [], [], [], [], [], []
Process ID: 13563
Session ID: 157 Serial number: 3

alert日志报错

Fri Nov 06 19:26:39 2015
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc  (incident=319140):
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_319140/xifenfei1_ora_13563_i319140.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Redo thread 2 internally disabled at seq 1 (CKPT)
ARC1: Archiving disabled thread 2 sequence 1
Archived Log entry 9956 added for thread 2 sequence 1 ID 0x0 dest 1:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 13563): terminating the instance due to error 600
Fri Nov 06 19:26:42 2015
Instance terminated by USER, pid = 13563
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (13563) as a result of ORA-1092
Fri Nov 06 19:26:42 2015
ORA-1092 : opitsk aborting process

这里是比较熟悉的ora-600[kclchkblk_4]错误,和ora-600[2662]错误类似,需要调整scn,由于数据库版本为11.2.0.4,无法使用常规方法调整scn,在修改控制文件,oradebug,bbed方法可供选择

使用oradebug方法处理
因为是asm环境,其他方法处理起来都相对麻烦

[oracle@wisetop1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Fri Nov 6 19:30:59 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/pfile.ora' mount;
ORACLE instance started.

Total System Global Area 7315603456 bytes
Fixed Size                  2267384 bytes
Variable Size            2566915848 bytes
Database Buffers         4731174912 bytes
Redo Buffers               15245312 bytes
Database mounted.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug poke 0x06001AE70 4 0x2FAF0800
BEFORE: [06001AE70, 06001AE74) = 00000000
AFTER:  [06001AE70, 06001AE74) = 2FAF0800
SQL> alter database open;

Database altered.

至此数据库open成功,后续就是处理一些坏块的工作,并建议客户逻辑重建库.

发表在 非常规恢复 | 标签为 , , , | 2 条评论

redo异常 ORA-600 kclchkblk_4 故障恢复

朋友和我说,他们数据库由于存储控制器异常,导致数据库无法正常启动.
数据库recover database 提示需要已经覆盖的redo

-bash-3.2$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:23:35 2014

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, Data Mining and Real Application Testing options

SQL> recover database;
ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread
2
ORA-00289: suggestion :
/export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf
ORA-00280: change 1639063379 for thread 2 is in sequence #29919


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

SQL> select thread#,group#,sequence# from v$log;

   THREAD#     GROUP#  SEQUENCE#
---------- ---------- ----------
         1          1      30261
         1          2      30262
         1          3      30258
         1          4      30259
         1          5      30260
         2          6      29966
         2          7      29967
         2          8      29968
         2          9      29969
         2         10      29970

10 rows selected.

SQL> select group#,member from v$logfile;

    GROUP# MEMBER
---------- --------------------------------------------------
         1 /dev/md/oradg/rdsk/d115
         2 /dev/md/oradg/rdsk/d116
         3 /dev/md/oradg/rdsk/d117
         4 /dev/md/oradg/rdsk/d118
         5 /dev/md/oradg/rdsk/d119
         6 /dev/md/oradg/rdsk/d120
         7 /dev/md/oradg/rdsk/d121
         8 /dev/md/oradg/rdsk/d122
         9 /dev/md/oradg/rdsk/d123
        10 /dev/md/oradg/rdsk/d124

10 rows selected.

SQL> recover database;
ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread 2
ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf
ORA-00280: change 1639063379 for thread 2 is in sequence #29919


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

数据库没归档,redo覆盖,无法继续恢复,使用_allow_resetlogs_corruption屏蔽前滚,继续恢复

SQL> startup mount pfile='/tmp/pfile.txt';
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
SQL> recover database until cancel;          
ORA-00279: change 1639063379 generated at 11/01/2014 12:58:20 needed for thread
1
ORA-00289: suggestion :
/export/home/oracle/product/10.2.0/db_1/dbs/arch1_30262_790965041.dbf
ORA-00280: change 1639063379 for thread 1 is in sequence #30262


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/dev/md/oradg/rdsk/d105'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

查看alert日志

Mon Nov 03 17:37:11 CST 2014
Completed crash recovery at
 Thread 2: logseq 1, block 3, scn 1639083385
 0 data blocks read, 0 data blocks written, 1 redo blocks read
Picked broadcast on commit scheme to generate SCNs
Mon Nov 03 17:37:12 CST 2014
Thread 2 advanced to log sequence 2 (thread open)
Thread 2 opened at log sequence 2
  Current log# 7 seq# 2 mem# 0: /dev/md/oradg/rdsk/d121
Successful open of redo thread 2
Mon Nov 03 17:37:12 CST 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Nov 03 17:37:13 CST 2014
SMON: enabling cache recovery
Mon Nov 03 17:37:13 CST 2014
Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], []
Mon Nov 03 17:37:14 CST 2014
Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], []
Mon Nov 03 17:37:14 CST 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 15700
ORA-1092 signalled during: alter database open...

参考:In 10.1.0.2: ORA-600 [kclchkblk_4] and ORA-600 [2662] After Recovery of Database (Doc ID 275902.1),删除tempfile继续打开数据库

SQL> alter database tempfile '/dev/md/oradg/rdsk/d109' drop;

Database altered.

SQL> c/109/110
  1* alter database tempfile '/dev/md/oradg/rdsk/d110' drop
SQL> /

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

数据库依然报ORA-00600kclchkblk_4]错误,因此解决该问题选择使用bbed修改文件头scn来完成,具体参考类似文章:使用bbed修复损坏datafile header

-bash-3.2$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:41:17 2014

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/pfile.txt' mount; 
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
SQL> alter database open;

Database altered.

SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d109' size 8388608000 autoextend off;

Tablespace altered.

SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d110' size 8388608000 autoextend off;

Tablespace altered.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
Database opened.

至此数据库恢复完成,建议逻辑方式重建数据库

发表在 Oracle备份恢复 | 标签为 , , | 评论关闭