Handling an ORA-01092 ORA-00604 ORA-08103 failure

The database fails to open with the following errors:

SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-08103: object no longer exists
进程 ID: 39348
会话 ID: 67 序列号: 29322

The corresponding alert log:

Mon Jul 15 10:59:46 2024
SMON: enabling cache recovery
Mon Jul 15 10:59:46 2024
Undo initialization finished serial:0 start:302658203 end:302658218 diff:15 ms (0.0 seconds)
Verifying minimum file header compatibility (11g) for tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
Mon Jul 15 10:59:46 2024
SMON: enabling tx recovery
Mon Jul 15 10:59:46 2024
Database Characterset is UTF8
Mon Jul 15 10:59:46 2024
Errors in file C:\APP\XFF\diag\rdbms\xff\xff\trace\xff_ora_46664.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-08103: 对象不再存在
Mon Jul 15 10:59:46 2024
Errors in file C:\APP\XFF\diag\rdbms\xff\xff\trace\xff_ora_46664.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-08103: 对象不再存在
Error 604 happened during db open, shutting down database
USER (ospid: 46664): terminating the instance due to error 604
Starting background process ARC2
Process ARC2 submission failed with error = 1092
Mon Jul 15 10:59:47 2024
Errors in file C:\APP\XFF\diag\rdbms\xff\xff\trace\xff_arc0_33164.trc:
ORA-00444: 后台进程 "ARC2" 启动失败
ORA-01092: ORACLE 实例终止。强制断开连接
Mon Jul 15 10:59:51 2024
Instance terminated by USER, pid = 46664
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (46664) as a result of ORA-1092

Tracing the open shows that the recursive statement delete from histgrm$ where obj# = :1 fails with ORA-08103:

=====================
PARSING IN CURSOR #18135904 lid=0 tim=302295369306 hv=3667723989 ad='7ffda7f5b500' sqlid='2mp99nzd9u1qp'
delete from histgrm$ where obj# = :1
END OF STMT
PARSE #18135904:c=0,e=1191,p=0,cr=44,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=302295369306
BINDS #16769312:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=48 off=0
  kxsbbbfp=01144048  bln=22  avl=02  flg=05
  value=66
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=01144060  bln=22  avl=02  flg=01
  value=1
EXEC #16769312:c=0,e=85,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=3,plh=2239883476,tim=302295369571
FETCH #16769312:c=0,e=7,p=0,cr=4,cu=0,mis=0,r=1,dep=2,og=3,plh=2239883476,tim=302295369592
CLOSE #16769312:c=0,e=4,dep=2,type=3,tim=302295369610
BINDS #16769312:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=48 off=0
  kxsbbbfp=01144048  bln=22  avl=02  flg=05
  value=66
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=01144060  bln=22  avl=02  flg=01
  value=2
EXEC #16769312:c=0,e=79,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=3,plh=2239883476,tim=302295369724
FETCH #16769312:c=0,e=6,p=0,cr=4,cu=0,mis=0,r=1,dep=2,og=3,plh=2239883476,tim=302295369740
CLOSE #16769312:c=0,e=4,dep=2,type=3,tim=302295369756
BINDS #18135904:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=1000001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=01145078  bln=22  avl=06  flg=05
  value=4294951147
WAIT #18135904: nam='db file sequential read' ela= 127 file#=1 block#=609 blocks=1 obj#=67 tim=302295370065
WAIT #18135904: nam='db file sequential read' ela= 188 file#=1 block#=243448 blocks=1 obj#=67 tim=302295370285
Dumping Short Stack
ksedsts()+314<-kcbzib()+17818<-kcbgtcr()+12688<-ktrgtc2()+802<-qeilbk1()+7661<-qeilsr()+185<-qerixtFetch()
…………
<-opidrv()+848<-sou2o()+94<-opimai_real()+281<-opimai()+170<-00007FFC2EAC7374<-00007FFC2FADCC91
kcbzib: dump suspect buffer, err2=8103
Encrypted block <0, 4437752> content will not be dumped. Dumping header only.
buffer tsn: 0 rdba: 0x0043b6f8 (1/243448)
scn: 0x0.0 seq: 0x01 flg: 0x05 tail: 0x00000001
frmt: 0x02 chkval: 0x11bb type: 0x00=unknown
Dump of buffer cache at level 8 for pdb=0 tsn=0 rdba=4437752
BH (0x7ffd55f95998) file#: 1 rdba: 0x0043b6f8 (1/243448) class: 1 ba: 0x7ffd5555e000
  set: 50 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
  dbwrid: 1 obj: 67 objn: 67 tsn: [0/0] afn: 1 hint: f
  hash: [0x7ffdaca1f3a0,0x7ffdaca1f3a0] lru: [0x7ffd55f95bc0,0x7ffda9328448]
  ckptq: [NULL] fileq: [NULL]
  objq: [0x7ffd9d61a3c0,0x7ffd9d61a3c0] objaq: [0x7ffd9d61a3b0,0x7ffd9d61a3b0]
  use: [0x7ffdaaf604f8,0x7ffdaaf604f8] wait: [NULL]
  st: READING md: EXCL tch: 0
  flags: only_sequential_access
  Using State Objects
    ----------------------------------------
    SO: 0x00007FFDAAF60470, type: 46, owner: 0x00007FFD9B3C5ED8, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
     proc=0x00007FFDAB2984C0, name=buffer handle, file=kcb2.h LINE:3317, pg=0 conuid=0
    (buffer) (CR) PR: 0x00007FFDAB2984C0 FLG: 0x0 SEQ: 0x439
    class bit: 0x0
    scan scn: 0.0
     cr[0]:
     sh[0]:
    kcbbfbp: [BH: 0x00007FFD55F95998, LINK: 0x00007FFDAAF604F8]
    type: normal pin
    where: qeilwhnp: qeilbk, why: 0
EXEC #18135904:c=234375,e=235311,p=2,cr=9,cu=0,mis=1,r=0,dep=1,og=4,plh=2015116224,tim=302295604662
ERROR #18135904:err=8103 tim=302295604678
STAT #18135904 id=1 cnt=0 pid=0 pos=1 obj=0 op='DELETE  HISTGRM$ (cr=0 pr=0 pw=0 time=2 us)'
STAT #18135904 id=2 obj=67 op='INDEX RANGE SCAN I_H_OBJ#_COL# (cr=0 pr=0 pw=0 time=0 us cost=3 size=376 card=47)'
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-08103: 对象不再存在
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-08103: 对象不再存在

*** 2024-07-15 10:53:30.201
USER (ospid: 39348): terminating the instance due to error 604
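
Linking the ERROR line in a trace like the one above back to the statement that raised it can be sketched in Python. This is a minimal illustration rather than the tooling actually used here: the function name failing_statements is hypothetical, the sample string is a few lines copied from the trace above, and the parser assumes the PARSING IN CURSOR / END OF STMT / ERROR layout shown there.

```python
import re

# Patterns for the two trace line types this sketch relies on.
PARSING_RE = re.compile(r"^PARSING IN CURSOR #(\d+)")
ERROR_RE = re.compile(r"^ERROR #(\d+):err=(\d+)")


def failing_statements(trace_text):
    """Return (cursor, error code, SQL text) for every ERROR line in the trace."""
    sql_by_cursor = {}  # cursor number -> most recently parsed SQL text
    failures = []
    lines = iter(trace_text.splitlines())
    for line in lines:
        parsed = PARSING_RE.match(line)
        if parsed:
            # The SQL text runs from the next line up to END OF STMT.
            sql_lines = []
            for sql_line in lines:
                if sql_line.strip() == "END OF STMT":
                    break
                sql_lines.append(sql_line.strip())
            sql_by_cursor[parsed.group(1)] = " ".join(sql_lines)
            continue
        error = ERROR_RE.match(line)
        if error:
            cursor, code = error.group(1), int(error.group(2))
            failures.append((cursor, code, sql_by_cursor.get(cursor, "<unknown>")))
    return failures


# Sample lines copied from the trace above.
sample = """\
PARSING IN CURSOR #18135904 lid=0 tim=302295369306 hv=3667723989 ad='7ffda7f5b500' sqlid='2mp99nzd9u1qp'
delete from histgrm$ where obj# = :1
END OF STMT
EXEC #18135904:c=234375,e=235311,p=2,cr=9,cu=0,mis=1,r=0,dep=1,og=4,plh=2015116224,tim=302295604662
ERROR #18135904:err=8103 tim=302295604678
"""

for cursor, code, sql in failing_statements(sample):
    print(f"cursor #{cursor}: ORA-{code:05d} in: {sql}")
```

On a full trace file the same function can be fed the file contents; cursors whose PARSING IN CURSOR entry is missing from the file are reported as <unknown>.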

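The delete from histgrm$ that runs during startup is not essential to opening the database, so the statement was kept from executing during startup, which allowed the database to open successfully: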

C:\Users\XFF>sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on 星期一 7月 15 11:15:33 2024

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> startup mount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area 6442450944 bytes
Fixed Size                  6205768 bytes
Variable Size            1493175992 bytes
Database Buffers         4932501504 bytes
Redo Buffers               10567680 bytes
数据库装载完毕。
SQL> recover database;
完成介质恢复。
SQL> alter database open;

数据库已更改。

The histgrm$ table was then repaired, and the database returned to normal.


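Analyzing and handling ORA-600 6711 at database startup

This database failure dates from a few months ago; today I re-analyzed it on Windows. The database reports ORA-600 6711 on open: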

C:\Users\XFF>SQLPLUS / AS SYSDBA

SQL*Plus: Release 12.1.0.2.0 Production on 星期日 7月 14 16:17:32 2024

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> startup mount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area 6442450944 bytes
Fixed Size                  6205768 bytes
Variable Size            1493175992 bytes
Database Buffers         4932501504 bytes
Redo Buffers               10567680 bytes
数据库装载完毕。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389],
[0], [], [], [], [], [], [], []
进程 ID: 44144
会话 ID: 67 序列号: 39084

From experience, this error is ORA-600 [6711] "Cluster Key Chain corruption", so a damaged cluster-related object is the likely cause.
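
The arguments of the ORA-00600 message above are easier to compare with trace and block dumps once they are extracted as integers. The following is a minimal Python sketch; the function name ora600_arguments is hypothetical, and the sample text is the wrapped message copied from the SQL*Plus output above.

```python
import re


def ora600_arguments(message):
    """Extract the non-empty bracketed arguments of an ORA-00600 message as ints."""
    # Empty brackets ([]) are skipped because the pattern requires digits.
    return [int(value) for value in re.findall(r"\[(\d+)\]", message)]


# The message as wrapped by SQL*Plus in the session above.
message = """ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389],
[0], [], [], [], [], [], [], []"""

args = ora600_arguments(message)
print(args)
# Hex form is convenient when matching values against rdba fields in a trace.
print([hex(a) for a in args])
```

Because the pattern only looks at bracketed digits, the same function also works on the localized form of the message that appears in the trace files.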


Tracing the startup process:

PARSING IN CURSOR #17695456 len=189 dep=4  tim=233428646426 hv=186852205 ad='7ffda1eea168' sqlid='2tkw12w5k68vd'
select user#,password,datats#,tempts#,type#,defrole,resource$, ptime,
decode(defschclass,NULL,'DEFAULT_CONSUMER_GROUP',defschclass),
spare1,spare4,ext_username,spare2 from user$ where name=:1
END OF STMT
PARSE #17695456:c=0,e=168,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=0,tim=233428646426
BINDS #17695456:
 Bind#0
  oacdty=01 mxl=32(03) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=871 siz=32 off=0
  kxsbbbfp=010b2df0  bln=32  avl=03  flg=05
  value="SYS"
EXEC #17695456:c=0,e=418,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=1457651150,tim=233428646901
WAIT #17695456: nam='db file sequential read' ela= 126 file#=1 block#=417 blocks=1 obj#=46 tim=233428647046
FETCH #17695456:c=0,e=153,p=1,cr=2,cu=0,mis=0,r=1,dep=4,og=4,plh=1457651150,tim=233428647069
STAT #17695456 id=1 cnt=1 pid=0 pos=1 obj=22 op='TABLE ACCESS BY INDEX ROWID USER$ 
  (cr=2 pr=1 pw=0 time=151 us cost=1 size=139 card=1)'
STAT #17695456 id=2 cnt=1 pid=1 pos=1 obj=46 op='INDEX UNIQUE SCAN I_USER1 (cr=1 pr=1 pw=0 time=149 us)'
CLOSE #17695456:c=0,e=2,dep=4,type=0,tim=233428647111
Incident 2601 created, dump file: C:\APP\XFF\diag\rdbms\ecp\ecp\incident\incdir_2601\ecp_ora_40516_i2601.trc
ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []

FETCH #15289752:c=2062500,e=2544215,p=13,cr=65626,cu=28,mis=0,r=0,dep=3,og=3,plh=3312420081,tim=233431176536
=====================
PARSE ERROR #387363008:len=50 dep=1 uid=0 oct=3 lid=0 tim=233431176680 err=600
select cost from resource_cost$ where resource#=:1
ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []

This operation triggered a recursive query:

PARSING IN CURSOR #387319440 len=151 dep=5 lid=0 tim=233428641503 hv=2507062328 ad='7ffd9ffa23a8' sqlid='7u49y06aqxg1s'
select /*+ rule */ bucket, endpoint, col#, epvalue, epvalue_raw, ep_repeat_count from histgrm$ 
where obj#=:1 and intcol#=:2 and row#=:3 order by bucket
END OF STMT
PARSE #387319440:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641503
BINDS #387319440:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=72 off=0
  kxsbbbfp=00eb2be0  bln=22  avl=02  flg=05
  value=22
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=00eb2bf8  bln=22  avl=02  flg=01
  value=2
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=48
  kxsbbbfp=00eb2c10  bln=22  avl=01  flg=01
  value=0
EXEC #387319440:c=0,e=105,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641652
WAIT #387319440: nam='db file sequential read' ela= 124 file#=1 block#=45660 blocks=1 obj#=66 tim=233428641792
FETCH #387319440:c=0,e=173,p=1,cr=3,cu=0,mis=0,r=20,dep=5,og=3,plh=3312420081,tim=233428641834
STAT #387319440 id=1 cnt=20 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=3 pr=1 pw=0 time=169 us cost=0 size=0 card=0)'
STAT #387319440 id=2 cnt=20 pid=1 pos=1 obj=66 op='TABLE ACCESS CLUSTER HISTGRM$ (cr=3 pr=1 pw=0 time=148 us)'
STAT #387319440 id=3 cnt=1 pid=2 pos=1 obj=65 op='INDEX UNIQUE SCAN I_OBJ#_INTCOL# (cr=2 pr=0 pw=0 time=2 us)'
CLOSE #387319440:c=0,e=36,dep=5,type=3,tim=233428641886

The corresponding trace file:

[TOC00000]
Jump to table of contents
Dump continued from file: C:\APP\XFF\diag\rdbms\ecp\ecp\trace\ecp_ora_40516.trc
[TOC00001]
ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []

[TOC00001-END]
[TOC00002]
========= Dump for incident 2601 (ORA 600 [6711]) ========
[TOC00003]
----- Beginning of Customized Incident Dump(s) -----
kdsDumpState: cdb: 0 dspdb: 0 type: 3
*** ENTER: kds state dump ***
            row 0x0043b1a5.28 continuation at: 0x0043b1a5.0 file# 1 block# 242085 slot 0 (dscnt: 0)
KDSTABN_GET: 1 ..... ntab: 2
curSlot: 0 ..... nrows: 40
Dumping kcb descriptor:
kcbds 0x0000000017100DF0 : tsn 0, rdba 0x0043b1a5, afn 1, objd 64, cls 1, tidflg 0x0 0x0 0x0
    dsflg 0x00100000, dsflg2 0x00004000, lobid 00000000:00000000, cnt 0, addr 0x00007FFD55D1C014 dx 0x0000000000000000
    env [0x0000000017178C7C]: (scn: 0x0000.54290647   xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  
    statement num=0  parent xid:  0x0000.000.00000000  st-scn: 0x0000.00000000  
    hi-scn: 0x0000.00000000  ma-scn: 0x0000.00000000  flg: 0x00000660)
kcb_dw_scan_dumpctx: not in DW scan
kdsgrp1_dump database not fully open
*** EXIT: kds state dump ***
----- End of Customized Incident Dump(s) -----
[TOC00003-END]

Dumping the block at the rdba in question confirms that the object id is 64, which matches the information reported in the trace:

DUL> rdba 0x0043b1a5

  rdba   : 0x0043b1a5=4436389
  rfile# : 1
  block# : 242085

DUL> dump datafile 1 block 242085 header
Block Header:
block type=0x06 (table/index/cluster segment data block)
block format=0xa2 (oracle 10)
block rdba=0x0043b1a5 (file#=1, block#=242085)
scn=0x0000.438d4a86, seq=1, tail=0x4a860601
block checksum value=0xd591=54673, flag=6
Data Block Header Dump:
 Object id on Block? Y
 seg/obj: 0x40=64  csc: 0x00.438d4a80  itc: 2  flg: -  typ: 1 (data)
     fsl: 0  fnx: 0x0 ver: 0x01

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0002.01f.00014b92  0x00c01897.6e20.07  C---    0  scn 0x0000.438c5fca
0x02   0x000a.01a.0011bb8e  0x00c0292c.0317.42  --U-   22  fsc 0x0000.438d4a86
Data Block Dump:
================
flag=0x0 --------
ntab=2
nrow=41
frre=23
fsbo=0x68
ffeo=0xb90
avsp=0x1ce1
tosp=0x1ce1
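
The rdba-to-file#/block# conversion that DUL prints above can be reproduced with two bit operations. This is a sketch under one stated assumption: the usual layout of a 32-bit rdba for smallfile tablespaces, with the relative file number in the upper 10 bits and the block number in the lower 22 bits. The function name decode_rdba is hypothetical.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (relative file#, block#)."""
    # Assumption: 10 bits of relative file number followed by 22 bits of
    # block number, as used for smallfile tablespaces.
    rfile = rdba >> 22
    block = rdba & 0x3FFFFF
    return rfile, block


# rdba from the kds state dump and the DUL session above.
print(decode_rdba(0x0043B1A5))  # (1, 242085)

# The same address appears in decimal as an ORA-00600 [6711] argument.
print(decode_rdba(4436389))     # (1, 242085)

# rdba of the suspect buffer in the earlier ORA-08103 trace.
print(decode_rdba(0x0043B6F8))  # (1, 243448)
```

The results agree with the file#/block# pairs that the traces and DUL report next to each rdba.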

To find out which object this id belongs to, unload obj$ with DUL:
c_obj#_intcol#


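The object is confirmed to be the cluster C_OBJ#_INTCOL#, whose table is HISTGRM$ (the table that stores histogram data for optimizer statistics). Once this is understood, the fix is straightforward: bypass access to this object while opening the database, then repair the table.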


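RMAN SBT_TAPE backups cannot be recognized by a DISK channel

Testing confirms that a backup set written through an RMAN SBT_TAPE channel cannot be restored through a DISK channel (file-system backup). For background on RMAN SBT_TAPE channel (tape library) backups, see: Simulating a tape library to implement RMAN remote backups.
Run a backup through the SBT_TAPE channel: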

[oracle@xifenfei ~]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Sun Jul 14 13:19:03 2024

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: XIFENFEI (DBID=1780931490)

RMAN> CONFIGURE DEFAULT DEVICE TYPE TO SBT_TAPE;

using target database control file instead of recovery catalog
old RMAN configuration parameters:
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
new RMAN configuration parameters:
CONFIGURE DEFAULT DEVICE TYPE TO 'SBT_TAPE';
new RMAN configuration parameters are successfully stored

RMAN> backup  format 'ctl_%T_%U.rman' current controlfile;

Starting backup at 14-JUL-24
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=147 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: SBT/SSH2-SFTP
channel ORA_SBT_TAPE_1: starting full datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
including current control file in backup set
channel ORA_SBT_TAPE_1: starting piece 1 at 14-JUL-24
channel ORA_SBT_TAPE_1: finished piece 1 at 14-JUL-24
piece handle=ctl_20240714_0r2vt3eh_1_1.rman tag=TAG20240714T131913 comment=API Version 2.0,MMS Version 1.0.9.0
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:01
Finished backup at 14-JUL-24

Confirm the location of the backup file on the file system:

[oracle@xifenfei ~]$ ls -l /tmp/rmanback/ctl_20240714_0r2vt3eh_1_1.rman
-rw-r--r-- 1 root root 9961472 Jul 14 13:19 /tmp/rmanback/ctl_20240714_0r2vt3eh_1_1.rman

Attempting to catalog the backup piece through a DISK channel fails with RMAN-07519:

RMAN> catalog start with '/tmp/rmanback/ctl_20240714_0r2vt3eh_1_1.rman';

using target database control file instead of recovery catalog
searching for all files that match the pattern /tmp/rmanback/ctl_20240714_0r2vt3eh_1_1.rman

List of Files Unknown to the Database
=====================================
File Name: /tmp/rmanback/ctl_20240714_0r2vt3eh_1_1.rman

Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files...
no files cataloged

List of Files Which Where Not Cataloged
=======================================
File Name: /tmp/rmanback/ctl_20240714_0r2vt3eh_1_1.rman
  RMAN-07519: Reason: Error while cataloging. See alert.log.

The alert log reports ORA-27048: skgfifi: file header information is invalid, among other errors:

Sun Jul 14 13:28:22 2024
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_13260.trc:
ORA-19624: operation failed, retry possible
ORA-19870: error while restoring backup piece /tmp/rmanback/ctl_20240714_0r2vt3eh_1_1.rman
ORA-19505: failed to identify file "/tmp/rmanback/ctl_20240714_0r2vt3eh_1_1.rman"
ORA-27048: skgfifi: file header information is invalid
Additional information: 8
ORA-27048: skgfifi: file header information is invalid
Additional information: 8
ORA-27048: skgfifi: file header information is invalid
Additional information: 8

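Analyzing the backup files shows that the file header of a tape-library backup is clearly different from that of a file-system backup.
[Image: file header of the DISK backup piece]

[Image: file header of the SBT_TAPE backup piece]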
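
A header comparison of this kind can be sketched in Python by reading the first bytes of each backup piece and locating the first offset where they differ. Everything here is illustrative: the function names are hypothetical, the 64-byte window is an arbitrary choice, and the demo writes two temporary files with placeholder bytes that are not real RMAN headers. For a real check, point read_header at the two backup pieces instead.

```python
import os
import tempfile


def read_header(path, size=64):
    """Read the first `size` bytes of a file."""
    with open(path, "rb") as handle:
        return handle.read(size)


def first_difference(left, right):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(left, right)):
        if a != b:
            return offset
    if len(left) != len(right):
        return min(len(left), len(right))
    return None


def hex_rows(data, width=16):
    """Format bytes as rows of 'offset  hex bytes' for side-by-side reading."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        rows.append(f"{offset:08x}  {chunk.hex(' ')}")
    return rows


# Demo with placeholder content (NOT real backup piece headers).
with tempfile.TemporaryDirectory() as workdir:
    disk_piece = os.path.join(workdir, "disk_piece.bin")
    tape_piece = os.path.join(workdir, "tape_piece.bin")
    with open(disk_piece, "wb") as handle:
        handle.write(bytes(range(32)))
    with open(tape_piece, "wb") as handle:
        handle.write(bytes(range(8)) + b"\xff" * 24)

    disk_header = read_header(disk_piece)
    tape_header = read_header(tape_piece)
    print("\n".join(hex_rows(disk_header)))
    print("\n".join(hex_rows(tape_header)))
    print("first differing offset:", first_difference(disk_header, tape_header))
```

Reading only a small fixed window keeps the comparison fast even for large backup pieces, since the difference of interest is in the header.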

