标签归档:smon_scn_time

ORA-10562 故障恢复—allow 1 corruption

朋友数据库由于存储变动,导致数据库瞬间hang住,然后直接crash,之后无法正常启动,请求技术支持.
数据库报ORA-00600[2131]错误
不能mount,可以通过重建控制文件解决

Mon Nov 30 20:35:38 2015
alter database mount
Mon Nov 30 20:35:38 2015
NOTE: Loaded library: System 
Mon Nov 30 20:35:38 2015
SUCCESS: diskgroup DATADG was mounted
Mon Nov 30 20:35:38 2015
NOTE: dependency between database xifenfei and diskgroup resource ora.DATADG.dg is established
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_26450.trc  (incident=3032256):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database mount...

尝试recover数据库

Mon Nov 30 20:45:53 2015
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 80 slaves
Mon Nov 30 20:45:56 2015
Recovery of Online Redo Log: Thread 2 Group 11 Seq 617 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo011.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 5410 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo04.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5411 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo05.log
Mon Nov 30 20:46:07 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 5412 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo06.log
Mon Nov 30 20:46:13 2015
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xC] [PC:0x95FB502, kdxlin()+4088] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr13_30480.trc  (incident=3032568):
ORA-07445: 出现异常错误: 核心转储 [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []
Mon Nov 30 20:46:17 2015
Sweep [inc][3032568]: completed
Sweep [inc2][3032568]: completed
Mon Nov 30 20:46:31 2015
Slave exiting with ORA-10562 exception
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr13_30480.trc:
ORA-10562: Error occurred while applying redo to data block (file# 2, block# 165054)
ORA-10564: tablespace SYSAUX
ORA-01110: 数据文件 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 271
ORA-00607: 当更改数据块时出现内部错误
ORA-00602: 内部编程异常错误
ORA-07445: 出现异常错误: 核心转储 [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []
Mon Nov 30 20:46:31 2015
Recovery Slave PR13 previously exited with exception 10562
Mon Nov 30 20:46:33 2015
Checker run found 28 new persistent data failures
Mon Nov 30 20:46:35 2015
Media Recovery failed with error 448
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr00_30400.trc:
ORA-00283: 恢复会话因错误而取消
ORA-00448: 后台进程正常结束
ORA-10562 signalled during: ALTER DATABASE RECOVER  database  ...

通过这里可以看到,由于在recover 操作之时,由于某种原因redo的数据无法apply到file 2 block 165054中,导致数据库recover database失败.

按照数据文件recover操作

SQL> recover datafile 1;
Media recovery complete.
SQL> recover datafile 3,4,5,6,7;
Media recovery complete.
SQL> recover datafile 8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28;              
Media recovery complete.
SQL> recover datafile 2;
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 2, block#
165054)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 271
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV]
[ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []

错误提示和recover database一样,那我们只能让恢复跳过该block继续恢复,因为根据经验判断data object# 271不是系统核心对象,不会影响数据库的启动

跳过坏块继续恢复

SQL> recover  datafile 2 allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [2], [69793], [8458401], [],
[], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 69793, file
offset is 571744256 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 272


SQL> recover  datafile 2 allow 1 corruption;
Media recovery complete.
SQL> alter database open;

Database altered.

出现了ORA-600[3020] 继续跳过坏块,然后数据库顺利open,别忘记加tempfile

处理异常对象

SQL> select object_name,object_type from dba_objects where data_object_id in(272,271);

OBJECT_NAME
--------------------------------------------------------------------------------
OBJECT_TYPE
-------------------
SMON_SCN_TIME_TIM_IDX
INDEX

SMON_SCN_TIME_SCN_IDX
INDEX

SQL> select index_name from dba_indexes where table_name='SMON_SCN_TIME';

INDEX_NAME
------------------------------
SMON_SCN_TIME_TIM_IDX
SMON_SCN_TIME_SCN_IDX

SQL> set pages 1000
SQL> set long 1000
SQL> Select dbms_metadata.get_ddl('TABLE','SMON_SCN_TIME','SYS') FROM DUAL ;

DBMS_METADATA.GET_DDL('TABLE','SMON_SCN_TIME','SYS')
--------------------------------------------------------------------------------

  CREATE TABLE "SYS"."SMON_SCN_TIME"
   (    "THREAD" NUMBER,
        "TIME_MP" NUMBER,
        "TIME_DP" DATE,
        "SCN_WRP" NUMBER,
        "SCN_BAS" NUMBER,
        "NUM_MAPPINGS" NUMBER,
        "TIM_SCN_MAP" RAW(1200),
        "SCN" NUMBER DEFAULT 0,
        "ORIG_THREAD" NUMBER DEFAULT 0           /* for downgrade */
   ) CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX" ("THREAD")

SQL> analyze table smon_scn_time validate structure cascade online;
analyze table smon_scn_time validate structure cascade online
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 2, block # 165054)
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'

SQL> truncate CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX";

Cluster truncated.

关于SMON_SCN_TIME部分处理,可以参考:关于SMON_SCN_TIME若干问题说明.至此数据库基本上恢复完成,而且运气非常好,恢复的非常完美,数据实现0丢失.

发表在 Oracle备份恢复 | 标签为 , , , , | 评论关闭

关于SMON_SCN_TIME若干问题说明

1.SMON_SCN_TIME表基础知识
0)作用
由smon收集scn和time映射关系,用于flashback/查询scn和time对应关系等操作

1)保留条数
官方文档给出说明instance number N * 12 times per hour * 24 hours * 5 days = 1440N rows,因为每次的时间间隔不是非常准确的5分钟,所以在具体的条数在实际生产环境中有一定的出入

2)采集和删除
smon进程没5分钟采集一次插入到SMON_SCN_TIME表中,同时将删除历史数据(超过5天前数据),采用下面语句
delete from smon_scn_time where thread=0 and time_mp = (select min(time_mp) from smon_scn_time where thread=0),如果有时候index出了问题,导致该语句执行很慢

3)当查询scn对应time,如果scn超过SMON_SCN_TIME表范围,将提示错误;或者查询time对应的scn,如果超过范围也同样报错。

2.当SMON_SCN_TIME表出现问题时,truncate操作语句

--找出CLUSTER
Select dbms_metadata.get_ddl('TABLE','SMON_SCN_TIME','SYS') FROM DUAL ;

9i
truncate CLUSTER "SYS"."SMON_SCN_TO_TIME";

10g
truncate  CLUSTER "SYS"."SMON_SCN_TO_TIME";

11g
truncate CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX";

3.停止/开启smon进程收集scn信息

stop
alter system set events '12500 trace name context forever, level 10';

start
alter system set events '12500 trace name context off';

4.index异常处理

--找出index
Select index_name,index_type,owner from dba_indexes where table_name='SMON_SCN_TIME';
--对应index创建语句
Select dbms_metadata.get_ddl('INDEX','SMON_SCN_TIME_TIM_IDX','SYS') FROM DUAL;

--重建index
drop index smon_scn_time_scn_idx;
drop index smon_scn_time_tim_idx;
create unique index smon_scn_time_scn_idx on smon_scn_time(scn);
create unique index smon_scn_time_tim_idx on smon_scn_time(time_mp);
analyze table smon_scn_time validate structure cascade online;

--rebuild index
alter index sys.smon_scn_time_scn_idx rebuild online;
alter index sys.smon_scn_time_tim_idx rebuild online;
analyze table smon_scn_time validate structure cascade online;

补充知识点scn计算方法SCN=(SCN_WRP * 4294967296) + SCN_BAS

发表在 Oracle | 标签为 | 一条评论