标签归档:ORA-01207

分布式存储故障导致数据库无法启动故障处理

国内xx医院使用了国外医疗行业龙头的pacs系统,由于是一个历史库,存放在分布式存储中,由于存储同时多个节点故障,导致数据库多个文件异常,数据库无法启动,三方维护人员尝试通通过rman归档进行应用日志,结果发现日志有损坏报ORA-00354 ORA-00353,无法记录恢复,希望我们给予支持
ORA-00353

Mon Apr 29 13:28:40 2024
Media Recovery failed with error 354
Mon Apr 29 13:28:40 2024
Errors in file F:\XXXXXX_DB\ORACLE\ADMIN\diag\rdbms\xxx\msxxx1\trace\xxx_pr00_4568.trc:
ORA-00283: recovery session canceled due to errors
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 487424 change 8737273868 time 04/01/2024 01:38:25
ORA-00334: archived log: 'F:\XXXXXX_DB\ORADATA\XXX\LOGARC0000052184_0922116268.0001'
ORA-283 signalled during: alter database recover logfile 'F:\XXXXXX_DB\ORADATA\XXX\LOGARC0000052184_0922116268.0001'...

接手故障之后,通过尝试恢复发现除该错误之外,还有ORA-600 4552之类错误
ORA-600-4552


跳过这些异常文件恢复,最终确认异常文件有如下部分
20240511223413

这些文件由于日志无法正常应用(有日志损坏无法应用,有日志和数据文件block不匹配导致无法应用),这样的情况直接通过自研的Oracle Recovery Tools小工具直接修改文件头信息
orarecovery

然后尝试OPEN数据库结果报ORA-1207
ORA-1207

对于这个故障可以通过rectl或者using backup ctl方式处理,然后open数据库成功
20240510224845

由于该系统是历史库,不会有新业务写入,通过对异常表和索引进行处理之后,客户测试业务可以正常访问,完成本次恢复

发表在 Oracle备份恢复 | 标签为 , , , , | 留下评论

ORA-1200/ORA-1207数据库恢复

由于系统性能问题或者底层io问题,数据库alert日志报一下控制文件损坏错误然后crash掉

Mon Nov 13 08:06:44 2023
Thread 1 advanced to log sequence 12100 (LGWR switch)
  Current log# 1 seq# 12100 mem# 0: /u01/oracle/oradata/xifenfei/redo01.log
Mon Nov 13 09:23:59 2023
********************* ATTENTION: ******************** 
 The controlfile header block returned by the OS
 has a sequence number that is too old. 
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE 
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE 
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely, 
 please do the following:
 (1) Save all copies of the controlfile for later 
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue: 
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance. 
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database. 
*****************************************************
USER (ospid: 17064): terminating the instance
Mon Nov 13 09:24:00 2023
System state dump requested by (instance=1, osid=17064), summary=[abnormal instance termination].

重启数据库报ORA-01122 ORA-01110 ORA-01207错误

Mon Nov 13 10:11:21 2023
ALTER DATABASE OPEN
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_25824.trc:
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/u01/oracle/oradata/xifenfei/system01.dbf'
ORA-01207: file is more recent than control file - old control file
ORA-1122 signalled during: ALTER DATABASE OPEN...

处理好上述错误之后遭遇ORA-01122 ORA-01110 ORA-01200,类似文章:
bbed处理ORA-01200故障
ORA-01122 ORA-01200故障处理

Mon Nov 13 10:51:48 2023
alter database open
Read of datafile '/u01/oracle/oradata/xifenfei/sysaux01.dbf' (fno 2) header failed with ORA-01200
Rereading datafile 2 header failed with ORA-01200
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_24148.trc:
ORA-01122: database file 2 failed verification check
ORA-01110: data file 2: '/u01/oracle/oradata/xifenfei/sysaux01.dbf'
ORA-01200: actual file size of 2860800 is smaller than correct size of 2867200 blocks
ORA-1122 signalled during: alter database open...

解决上述错误之后,尝试open库报ORA-00314 ORA-00312之类错误

Mon Nov 13 18:00:43 2023
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Started redo scan
Completed redo scan
 read 61894 KB redo, 589 data blocks need recovery
Started redo application at
 Thread 1: logseq 12100, block 112760
Recovery of Online Redo Log: Thread 1 Group 1 Seq 12100 Reading mem 0
  Mem# 0: /u01/oracle/oradata/xifenfei/redo01.log
Completed redo application of 1.20MB
Mon Nov 13 18:00:44 2023
Hex dump of (file 2, block 39078) in trace file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_27469.trc
Reading datafile '/u01/oracle/oradata/xifenfei/sysaux01.dbf' for corruption at rdba: 0x008098a6 (file 2, block 39078)
Reread (file 2, block 39078) found same corrupt data (logically corrupt)
RECOVERY OF THREAD 1 STUCK AT BLOCK 39078 OF FILE 2
Mon Nov 13 18:00:44 2023
Exception [type: SIGSEGV, Address not mapped to object][ADDR:0xC][PC:0x95FB838, kdxlin()+4946][flags: 0x0, count: 1]
Mon Nov 13 18:00:44 2023
Exception [type: SIGSEGV, Address not mapped to object][ADDR:0xC][PC:0x95FB4DE, kdxlin()+4088][flags: 0x0, count: 1]
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_27469.trc:
ORA-00314: log 2 of thread 1, expected sequence# 12093 doesn't match 12085
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/xifenfei/redo02.log'
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_27469.trc:
ORA-00314: log 3 of thread 1, expected sequence# 12096 doesn't match 12080
ORA-00312: online log 3 thread 1: '/u01/oracle/oradata/xifenfei/redo03.log'
ORA-00314: log 2 of thread 1, expected sequence# 12093 doesn't match 12085
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/xifenfei/redo02.log'

后面继续处理遇到类似这样错误

ALTER DATABASE RECOVER    CANCEL  
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pr00_31110.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/xifenfei/system01.dbf'
ORA-1547 signalled during: ALTER DATABASE RECOVER    CANCEL  ...
ALTER DATABASE RECOVER CANCEL 
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...
Mon Nov 13 19:06:28 2023
alter database open resetlogs
ORA-1194 signalled during: alter database open resetlogs...

最后确认其他数据文件均可recover 成功,只有file 2 无法正常recover

SQL> recover datafile 1;
Media recovery complete.
SQL> recover datafile 2;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [2], [950840], [9339448],
[], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 950840, file
offset is 3494313984 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '/u01/oracle/oradata/xifenfei/sysaux01.dbf'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover datafile 3;
Media recovery complete.
SQL> recover datafile 4;
Media recovery complete.
SQL> recover datafile 5;
Media recovery complete.
SQL> recover datafile 6;
Media recovery complete.
SQL> recover datafile 7;
Media recovery complete.

SQL> recover  datafile 2 allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [2], [2410240], [10798848],
[], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 2410240, file
offset is 2564816896 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '/u01/oracle/oradata/xifenfei/sysaux01.dbf'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'

通过bbed修改文件头,直接open数据库成功,并协助客户顺利导出数据
参考类似文章:
使用bbed修复损坏datafile header
使用bbed让rac中的sysaux数据文件online

发表在 Oracle备份恢复 | 标签为 , | 评论关闭

ORACLE 8.0.5 ORA-01207故障恢复

朋友和我说,他的数据库ORACLE 8.0.5出现ORA-01207,进行了尝试恢复但是别未成功,让我协助其完成恢复
数据库版本

SVRMGR> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle8 Release 8.0.5.0.0 - Production
PL/SQL Release 8.0.5.0.0 - Production
CORE Version 4.0.5.0.0 - Production
TNS for 32-bit Windows: Version 8.0.5.0.0 - Production
NLSRTL Version 3.3.2.0.0 - Production
5 rows selected.

open数据库报ORA-01207错误

SVRMGR> alter database open;
alter database open
*
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: 'D:\ORANT\DATABASE\SYS1ORCL.ORA'
ORA-01207: file is more recent than controlfile - old controlfile

出现该错误的原因是因为控制文件里面的scn或者checkpoint_time>数据文件中的对应值,从而出现该错误,解决方法重建控制文件或者执行recover using backup controlfile 之类命令

重建控制文件,并open报ORA-600[4147]

SVRMGR> alter database backup controlfile to trace;
Statement processed.
SVRMGR> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SVRMGR> STARTUP NOMOUNT
ORACLE instance started.
Total System Global Area                         15077376 bytes
Fixed Size                                          49152 bytes
Variable Size                                    12906496 bytes
Database Buffers                                  2048000 bytes
Redo Buffers                                        73728 bytes
SVRMGR> CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS NOARCHIVELOG
     2>     MAXLOGFILES 32
     3>     MAXLOGMEMBERS 2
     4>     MAXDATAFILES 32
     5>     MAXINSTANCES 16
     6>     MAXLOGHISTORY 3260
     7> LOGFILE
     8>   GROUP 1 'D:\ORANT\DATABASE\LOG4ORCL.ORA'  SIZE 1M,
     9>   GROUP 2 'D:\ORANT\DATABASE\LOG3ORCL.ORA'  SIZE 1M,
    10>   GROUP 3 'D:\ORANT\DATABASE\LOG2ORCL.ORA'  SIZE 1M,
    11>   GROUP 4 'D:\ORANT\DATABASE\LOG1ORCL.ORA'  SIZE 1M
    12> DATAFILE
    13>   'D:\ORANT\DATABASE\SYS1ORCL.ORA',
    14>   'D:\ORANT\DATABASE\USR1ORCL.ORA',
    15>   'D:\ORANT\DATABASE\RBS1ORCL.ORA',
    16>   'D:\ORANT\DATABASE\TMP1ORCL.ORA'
    17> ;
Statement processed.
SVRMGR> recover database using backup controlfile;
ORA-00279: change 46960617 generated at 01/31/14 18:51:49 needed for thread 1
ORA-00289: suggestion : D:\ORANT\RDBMS80\ARC12900.1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\ORANT\DATABASE\LOG3ORCL.ORA
Log applied.
Media recovery complete.
SVRMGR> alter database open;
alter database open
*
ORA-00600: internal error code, arguments: [4147], [16], [1], [], [], [], [], []

The ORA-600[4147] basically indicates some kind of corruption with the UNDO (rollback segment)block, most probably due to a lost write to the rollback segment.
ORA-600[4147]是因为回滚段坏块导致(具体是因为undoblock的scn无效),解决方法是用dul找出来回滚段,并屏蔽之

继续恢复报ORA-00600[3668]

SVRMGR> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SVRMGR> startup
ORACLE instance started.
Total System Global Area                         15077376 bytes
Fixed Size                                          49152 bytes
Variable Size                                    12906496 bytes
Database Buffers                                  2048000 bytes
Redo Buffers                                        73728 bytes
Database mounted.
ORA-00600: internal error code, arguments: [3668], [1], [2], [17232], [17232], [4], [], []

ORA-00600[3668]是因为在ORACLE 7.0到9.2的版本中The FIRST time an attempt has been made to start an instance after a CREATE CONTROLFILE command has been issued.
At least one data file needs MEDIA RECOVERY.在9.2.0.x及其以后版本报:ORA-1113: file needs media recovery.
通过重建控制文件,执行recover database,再open数据库恢复成功

发表在 Oracle备份恢复 | 标签为 , , , , , , | 2 条评论