标签云
asm 恢复 asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 kfed MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-01110 ORA-01555 ORA-01578 ORA-08103 ORA-600 2662 ORA-600 2663 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 ORACLE恢复 Oracle 恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (100)
- 数据库 (1,598)
- DB2 (22)
- MySQL (70)
- Oracle (1,463)
- Data Guard (49)
- EXADATA (7)
- GoldenGate (21)
- ORA-xxxxx (158)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (13)
- ORACLE 21C (3)
- Oracle ASM (65)
- Oracle Bug (7)
- Oracle RAC (47)
- Oracle 安全 (6)
- Oracle 开发 (27)
- Oracle 监听 (27)
- Oracle备份恢复 (530)
- Oracle安装升级 (84)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (75)
- PostgreSQL (18)
- PostgreSQL恢复 (6)
- SQL Server (27)
- SQL Server恢复 (8)
- TimesTen (7)
- 达梦数据库 (2)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (36)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (19)
-
最近发表
- PostgreSQL解析wal日志之—walminer
- Oracle 19c/21c最新patch信息-202404
- PostgreSQL恢复系列:pg_filedump批量处理
- PostgreSQL部分主要字典信息
- PostgreSQL恢复系列:pg_filedump恢复字典构造
- PostgreSQL 16 源码安装
- ORA-00742 ORA-00312 恢复
- 数据库open成功后报ORA-00353 ORA-00354错误引起的一系列问题(本质ntfs文件系统异常)
- ORA-600 ktsiseginfo1故障
- ORA-00600: internal error code, arguments: [16703], [1403], [4] 原因
- 最近遇到几起ORA-600 16703故障(tab$被清空),请引起重视
- ORA-600 2662快速恢复之Patch scn工具
- TNS-12518: TNS:listener could not hand off client connection
- ora.storage无法启动报ORA-12514故障处理
- 断电引起文件scn异常数据库恢复
- ORA-16188: LOG_ARCHIVE_CONFIG settings inconsistent with previously started instance
- .[hudsonL@cock.li].mkp勒索加密数据库完美恢复
- 模拟带库实现rman远程备份
- 又一例:ORA-600 kclchkblk_4和2662故障
- Oracle误删除数据文件恢复
月归档:八月 2016
MON_MODS$表ORA-600 13013报错处理
有朋友反馈数据库启动运行一点时间之后,然后就自动crash,让我们帮忙找原因,通过分析是由于smon进程触发ORA-600 13013导致数据库异常
alert日志报错信息
Thu Aug 4 18:39:44 2016 Database Characterset is ZHS16GBK replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC QMNC started with pid=33, OS id=22935 Thu Aug 4 18:39:44 2016 Completed: ALTER DATABASE OPEN Thu Aug 4 18:39:44 2016 db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a user-specified limit on the amount of space that will be used by this database for recovery-related files, and does not reflect the amount of space available in the underlying filesystem or ASM diskgroup. Thu Aug 4 18:48:41 2016 Thread 1 advanced to log sequence 86746 Current log# 3 seq# 86746 mem# 0: /opt/ora10/oradata/ora10g/redo03.log Thu Aug 4 18:58:13 2016 Errors in file /opt/ora10/admin/ora10g/bdump/ora10g_smon_22449.trc: ORA-00600: internal error code, arguments: [13013], [5001], [482], [4198075], [40], [4198075], [17], [] Thu Aug 4 18:58:56 2016 Non-fatal internal error happenned while SMON was doing flushing of monitored table stats. SMON encountered 8 out of maximum 100 non-fatal internal errors. Thu Aug 4 18:59:06 2016 Errors in file /opt/ora10/admin/ora10g/bdump/ora10g_smon_22449.trc: ORA-00600: internal error code, arguments: [13013], [5001], [482], [4198075], [40], [4198075], [17], [] Thu Aug 4 18:59:08 2016 Errors in file /opt/ora10/admin/ora10g/bdump/ora10g_pmon_22413.trc: ORA-00474: SMON process terminated with error Thu Aug 4 18:59:08 2016 PMON: terminating instance due to error 474 Instance terminated by PMON, pid = 22413
通过trace文件大概可以发现是由于ORA-600 13013错误导致数据库crash,而且这里有类似”SMON was doing flushing of monitored table stats”错误提示,根据经验,很可能是smon把表的dml操作收集信息相关.
ORA-600 [13013] 含义
ORA-600 [13013] [a] [b] {c} [d] [e] [f] This format relates to Oracle Server 8.0.3 to 10.1 Arg [a] Passcount Arg [b] Data Object number Arg {c} Tablespace Relative DBA of block containing the row to be updated Arg [d] Row Slot number Arg [e] Relative DBA of block being updated (should be same as 1) Arg [f] Code
根据这个错误信息,以及How to resolve ORA-00600 [13013], [5001] [ID 816784.1]中的描述
ORA-600 13013 对应对象
SQL> select object_name from dba_objects where object_id=482 OBJECT_NAME -------------------------------------------------------------------------------- MON_MODS$
该对象正是和监控dml变化相关的表,smon会对其进行相关操作,以前写过一篇:MON_MODS$和MON_MODS_ALL$统计DML操作次数的文章
对于MON_MODS$表ORA-600 13013处理
SQL> analyze table mon_mods$ validate structure cascade; analyze table mon_mods$ validate structure cascade * ERROR at line 1: ORA-01499: table/index cross reference failure - see trace file SQL> select index_name from dba_indexes where table_name='MON_MODS$'; INDEX_NAME ------------------------------ I_MON_MODS$_OBJ SQL> ALTER INDEX I_MON_MODS$_OBJ REBUILD; Index altered. SQL> analyze table mon_mods$ validate structure cascade; analyze table mon_mods$ validate structure cascade * ERROR at line 1: ORA-01499: table/index cross reference failure - see trace file SQL> CREATE TABLE MON_MODS_BAK AS SELECT * FROM MON_MODS$; Table created. SQL> SELECT COUNT(*) FROM MON_MODS$; COUNT(*) ---------- 1247 SQL> C/MON_MODS$/MON_MODS_BAK; 1* SELECT COUNT(*) FROM MON_MODS_BAK SQL> / COUNT(*) ---------- 1247 SQL> TRUNCATE TABLE MON_MODS$; Table truncated. SQL> INSERT INTO MON_MODS$ SELECT * fROM MON_MODS_BAK; 1247 rows created. SQL> COMMIT; Commit complete. SQL> analyze table mon_mods$ validate structure cascade; Table analyzed.
自此关于MON_MODS$表相关的ORA-600 13013异常处理完全,当然也可以通过重建I_MON_MODS$_OBJ索引来解决,但是不能通过rebuild index解决.数据库也就不会因此而crash了.
发表在 Oracle备份恢复
评论关闭
ORA-600 4042 故障恢复
通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check) 检查结果
通过上图可以知道file 2未能正常恢复(需要看日志分析原因),file 3以前就被offline,需要历史归档(非归档状态,所以这个先放着,后续再处理)
分析file 2 不成功原因
Wed Aug 3 15:21:11 2016 ALTER DATABASE RECOVER datafile 2 Wed Aug 3 15:21:11 2016 Media Recovery Start parallel recovery started with 2 processes Wed Aug 3 15:21:11 2016 Recovery of Online Redo Log: Thread 1 Group 1 Seq 1916 Reading mem 0 Mem# 0 errs 0: /home/oracle/orabase/oradata/ORACLE/redo01.log Wed Aug 3 15:21:11 2016 Errors in file /u01/app/oracle/admin/oracle/bdump/oracle_p001_22017.trc: ORA-00600: internal error code, arguments: [3020], [2], [41], [8388649], [], [], [], [] ORA-10567: Redo is inconsistent with data block (file# 2, block# 41) ORA-10564: tablespace UNDOTBS1 ORA-01110: data file 2: '/home/oracle/orabase/oradata/ORACLE/undotbs01.dbf' ORA-10560: block type '0' Wed Aug 3 15:21:13 2016 Errors in file /u01/app/oracle/admin/oracle/bdump/oracle_p001_22017.trc: ORA-00600: internal error code, arguments: [3020], [2], [41], [8388649], [], [], [], [] ORA-10567: Redo is inconsistent with data block (file# 2, block# 41) ORA-10564: tablespace UNDOTBS1 ORA-01110: data file 2: '/home/oracle/orabase/oradata/ORACLE/undotbs01.dbf' ORA-10560: block type '0' Wed Aug 3 15:21:18 2016 Media Recovery failed with error 12801 ORA-283 signalled during: ALTER DATABASE RECOVER datafile 2 ...
通过日志可以知道由于ORA-600 3020导致file 2不能正常的恢复.
处理file 2
SQL> recover datafile 2 allow 1 corruption; Media recovery complete.
Thu Aug 4 01:58:35 2016 ALTER DATABASE RECOVER datafile 2 allow 1 corruption Media Recovery Start ALLOW CORRUPTION option must use serial recovery Thu Aug 4 01:58:35 2016 Recovery of Online Redo Log: Thread 1 Group 1 Seq 1916 Reading mem 0 Mem# 0 errs 0: /home/oracle/orabase/oradata/ORACLE/redo01.log Thu Aug 4 01:58:35 2016 Media Recovery Complete (oracle) Completed: ALTER DATABASE RECOVER datafile 2 allow 1 corruption
尝试open数据库
SQL> alter database open ; alter database open * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced
Thu Aug 4 01:59:20 2016 alter database open Thu Aug 4 01:59:21 2016 Beginning crash recovery of 1 threads parallel recovery started with 2 processes Thu Aug 4 01:59:21 2016 Started redo scan Thu Aug 4 01:59:21 2016 Completed redo scan 1619 redo blocks read, 0 data blocks need recovery Thu Aug 4 01:59:21 2016 Started redo application at Thread 1: logseq 1916, block 12724 Thu Aug 4 01:59:21 2016 Recovery of Online Redo Log: Thread 1 Group 1 Seq 1916 Reading mem 0 Mem# 0 errs 0: /home/oracle/orabase/oradata/ORACLE/redo01.log Thu Aug 4 01:59:21 2016 Completed redo application Thu Aug 4 01:59:21 2016 Completed crash recovery at Thread 1: logseq 1916, block 14343, scn 3303614971196 0 data blocks read, 0 data blocks written, 1619 redo blocks read Thu Aug 4 01:59:21 2016 LGWR: STARTING ARCH PROCESSES ARC0 started with pid=18, OS id=5542 Thu Aug 4 01:59:21 2016 ARC0: Archival started ARC1: Archival started LGWR: STARTING ARCH PROCESSES COMPLETE ARC1 started with pid=19, OS id=5544 Thu Aug 4 01:59:21 2016 Thread 1 advanced to log sequence 1917 Thread 1 opened at log sequence 1917 Current log# 2 seq# 1917 mem# 0: /home/oracle/orabase/oradata/ORACLE/redo02.log Successful open of redo thread 1 Thu Aug 4 01:59:21 2016 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Aug 4 01:59:21 2016 ARC1: STARTING ARCH PROCESSES Thu Aug 4 01:59:21 2016 ARC0: Becoming the 'no FAL' ARCH ARC0: Becoming the 'no SRL' ARCH Thu Aug 4 01:59:21 2016 SMON: enabling cache recovery Thu Aug 4 01:59:21 2016 ARC2: Archival started ARC1: STARTING ARCH PROCESSES COMPLETE ARC1: Becoming the heartbeat ARCH ARC2 started with pid=20, OS id=5546 Thu Aug 4 01:59:21 2016 db_recovery_file_dest_size of 2048 MB is 1.05% used. This is a user-specified limit on the amount of space that will be used by this database for recovery-related files, and does not reflect the amount of space available in the underlying filesystem or ASM diskgroup. Thu Aug 4 01:59:22 2016 Errors in file /u01/app/oracle/admin/oracle/udump/oracle_ora_5505.trc: ORA-00600: internal error code, arguments: [4042], [0], [], [], [], [], [], [] Thu Aug 4 01:59:23 2016 Errors in file /u01/app/oracle/admin/oracle/udump/oracle_ora_5505.trc: ORA-00600: internal error code, arguments: [4042], [0], [], [], [], [], [], [] Thu Aug 4 01:59:23 2016 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 5505 ORA-1092 signalled during: alter database open ...
由于ORA-600 4042错误导致数据库无法正常open.
分析ORA-600 4042
PARSING IN CURSOR #4 len=142 dep=1 uid=0 oct=3 lid=0 tim=1435788503594313 hv=361892850 ad='a7ab2db8' select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp, DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1 END OF STMT PARSE #4:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,tim=1435788503594311 BINDS #4: kkscoacd Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0 kxsbbbfp=2aae75802218 bln=22 avl=02 flg=05 value=3 EXEC #4:c=0,e=39,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,tim=1435788503594393 FETCH #4:c=0,e=8,p=0,cr=2,cu=0,mis=0,r=1,dep=1,og=3,tim=1435788503594412 STAT #4 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS BY INDEX ROWID UNDO$ (cr=2 pr=0 pw=0 time=8 us)' STAT #4 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=0 pw=0 time=3 us)' WAIT #1: nam='db file sequential read' ela= 10 file#=2 block#=41 blocks=1 obj#=-1 tim=1435788503594468 Dump of buffer cache at level 4 for tsn=1, rdba=8388649 BH (0x95ff3c58) file#: 2 rdba: 0x00800029 (2/41) class: 21 ba: 0x95ef0000 set: 3 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0 dbwrid: 0 obj: -1 objn: 0 tsn: 1 afn: 2 hash: [a8b77880,a8b77880] lru: [95ff3dd0,a8e70338] ckptq: [NULL] fileq: [NULL] objq: [a43da110,a43da110] use: [a8e6e658,a8e6e658] wait: [NULL] st: XCURRENT md: SHR tch: 0 flags: gotten_in_current_mode LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [65535] buffer tsn: 1 rdba: 0x00800029 (2/41) scn: 0x0000.00000000 seq: 0x01 flg: 0x01 tail: 0x00000001 frmt: 0x02 chkval: 0x0000 type: 0x00=unknown Hex dump of block: st=0, typ_found=0 Dump of memory from 0x0000000095EF0000 to 0x0000000095EF2000 095EF0000 0000A200 00800029 00000000 01010000 [....)...........] 095EF0010 00000000 00000000 00000000 00000000 [................] Repeat 509 times 095EF1FF0 00000000 00000000 00000000 00000001 [................] Dump of memory from 0x0000000095EF0014 to 0x0000000095EF1FFC 095EF0010 00000000 00000000 00000000 [............] 095EF0020 00000000 00000000 00000000 00000000 [................]
这里可以发现,file 2 block 41的type为unknown,注意观察ORA-600 3020的错误,我们发现当时报的坏块也正好是该block.基本上可以确定由于前面的allow 1 corruption操作导致了后面的ORA-600 4042的错误.官方关于ORA-600[4042]解释
通过修改undo$中的回滚段状态(参考:bbed修改undo$(回滚段)状态)
正常open数据库,修改file 3的scn并online数据文件
SQL> shutdown immediate; Database closed. Database dismounted. ORACLE instance shut down. SQL> startup mount ORACLE instance started. Total System Global Area 1224736768 bytes Fixed Size 2020384 bytes Variable Size 318770144 bytes Database Buffers 889192448 bytes Redo Buffers 14753792 bytes Database mounted. SQL> SELECT thread#, 2 a.sequence#, 3 a.group#, 4 TO_CHAR (first_change#, '9999999999999999') "SCN", 5 a.status, 6 MEMBER 7 FROM v$log a, v$logfile b 8 WHERE a.group# = B.GROUP# 9 ORDER BY a.sequence# DESC; THREAD# SEQUENCE# GROUP# SCN ---------- ---------- ---------- ---------------------------------- STATUS -------------------------------- MEMBER -------------------------------------------------------------------------------- 1 1919 1 3303615011212 CURRENT /home/oracle/orabase/oradata/ORACLE/redo01.log 1 1918 3 3303614991206 INACTIVE /home/oracle/orabase/oradata/ORACLE/redo03.log THREAD# SEQUENCE# GROUP# SCN ---------- ---------- ---------- ---------------------------------- STATUS -------------------------------- MEMBER -------------------------------------------------------------------------------- 1 1917 2 3303614971197 INACTIVE /home/oracle/orabase/oradata/ORACLE/redo02.log SQL> recover database using backup controlfile; ORA-00279: change 3303615011452 generated at 08/04/2016 02:06:52 needed for thread 1 ORA-00289: suggestion : /u01/app/oracle/flash_recovery_area/ORACLE/archivelog/2016_08_04/o1_mf_1_1919_%u _.arc ORA-00280: change 3303615011452 for thread 1 is in sequence #1919 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} /home/oracle/orabase/oradata/ORACLE/redo01.log Log applied. Media recovery complete. SQL> alter database datafile 3 online; Database altered. SQL> alter database open resetlogs; Database altered. SQL>
至此该数据库基本上恢复完成,强烈建议使用逻辑方式导出导入重建库.
ORA-600 4194/ORA-600 4193/ORA-600 4137故障解决
对于常见的undo异常错误,ORA-600 4193,ORA-600 4194,ORA-600 4137等错误的处理一般步骤.
适用版本
Oracle Database - Enterprise Edition - Version 9.2.0.1 to 11.2.0.4 [Release 9.2 to 11.2] Information in this document applies to any platform.
报错现象
The following error is occurring in the alert.log right before the database crashes. ORA-00600: internal error code, arguments: [4194], [#], [#], [], [], [], [], [] This error indicates that a mismatch has been detected between redo records and rollback (undo) records. ARGUMENTS: Arg [a] - Maximum Undo record number in Undo block Arg [b] - Undo record number from Redo block Since we are adding a new undo record to our undo block, we would expect that the new record number is equal to the maximum record number in the undo block plus one. Before Oracle can add a new undo record to the undo block it validates that this is correct. If this validation fails, then an ORA-600 [4194] will be triggered.
报错原因
This also can be cause by the following defect Bug 8240762 Abstract: Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] after SHRINK Details: Undo corruption may be caused after a shrink and the same undo block may be used for two different transactions causing several internal errors like: ORA-600 [4193] / ORA-600 [4194] for new transactions ORA-600 [4137] for a transaction rollback
处理步骤
Best practice to create a new undo tablespace. This method includes segment check. Create pfile from spfile to edit >create pfile from spfile; 1. Shutdown the instance 2. set the following parameters in the pfile undo_management = manual event = '10513 trace name context forever, level 2' 3. >startup restrict pfile=<initsid.ora> 4. >select tablespace_name, status, segment_name from dba_rollback_segs where status != 'OFFLINE'; This is critical - we are looking for all undo segments to be offline - System will always be online. If any are 'PARTLY AVAILABLE' or 'NEEDS RECOVERY' - Please open an issue with Oracle Support or update the current SR. If all offline then continue to the next step 5. Create new undo tablespace - example >create undo tablespace <new undo tablespace> datafile <datafile> size 2000M; 6. Drop old undo tablespace >drop tablespace <old undo tablespace> including contents and datafiles; 7. >shutdown immediate; 8 >startup nomount; --> Using your Original spfile 9 modify the spfile with the new undo tablespace name Alter system set undo_tablespace = '<new tablespace created in step 5>' scope=spfile; 10. >shutdown immediate; 11. >startup; --> Using spfile The reason we create a new undo tablespace first is to use new undo segment numbers that are higher then the current segments being used. This way when a transaction goes to do block clean-out the reference to that undo segment does not exist and continues with the block clean-out.
参考:tep by step to resolve ORA-600 4194 4193 4197 on database crash (Doc ID 1428786.1)