Tag archive: OSD-04016

Handling ORA-600 [2662] errors caused by a hardware failure

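A few days ago we recovered a database of more than 40 TB (see the case "ORA-00600: internal error code, arguments: [16513], [1403] recovery"). Now another database, close to 30 TB, could not be opened after a hardware failure and a series of recovery attempts by other people, and we were asked to provide technical support.
The failure was originally triggered by a storage fault: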

Fri Feb 19 09:03:49 2021
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_3460.trc:
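ORA-01114: IO error writing block to file 849 (block # 3871748)
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 1167) The device is not connected.
ORA-01114: IO error writing block to file 849 (block # 3871748)
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 1167) The device is not connected.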

After a series of operations by those other people, the database reported ORA-600 [2662] on open:

Sat Feb 20 08:19:35 2021
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Feb 20 08:19:35 2021
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5304.trc(incident=1960181):
ORA-00600:internal error code,arguments:[2662],[4],[2185364344], [4],[2185453722],[893388032],[],[],[],[],[],[]
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5304.trc:
ORA-00600:internal error code,arguments:[2662],[4],[2185364344], [4],[2185453722],[893388032],[],[],[],[],[],[]
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5304.trc:
ORA-00600:internal error code,arguments:[2662],[4],[2185364344], [4],[2185453722],[893388032],[],[],[],[],[],[]
Error 600 happened during db open, shutting down database
USER (ospid: 5304): terminating the instance due to error 600
Instance terminated by USER, pid = 5304
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (5304) as a result of ORA-1092
Sat Feb 20 08:19:42 2021
ORA-1092 : opitsk aborting process
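
As a reading aid for the ORA-600 [2662] line above, here is a minimal Python sketch that decodes its arguments. It assumes the commonly described layout of this error (current SCN wrap and base, dependent SCN wrap and base, then the RDBA of the block carrying the dependent SCN) and a smallfile RDBA with a 10-bit relative file number and a 22-bit block number. The function name `decode_2662` is hypothetical, and this is not the tooling used in the recovery itself.

```python
def decode_2662(args):
    """Decode the five numeric ORA-600 [2662] arguments (assumed layout)."""
    cur_wrap, cur_base, dep_wrap, dep_base, rdba = args
    # An SCN is treated here as wrap * 2**32 + base.
    current_scn = (cur_wrap << 32) | cur_base
    dependent_scn = (dep_wrap << 32) | dep_base
    return {
        "current_scn": current_scn,
        "dependent_scn": dependent_scn,
        # How far the block's SCN is ahead of the database's current SCN.
        "scn_gap": dependent_scn - current_scn,
        # Assumed smallfile RDBA: top 10 bits file, low 22 bits block.
        "rel_file": rdba >> 22,
        "block": rdba & 0x3FFFFF,
    }


# Values copied from the alert log excerpt above.
info = decode_2662([4, 2185364344, 4, 2185453722, 893388032])
for key, value in info.items():
    print(f"{key}: {value}")
```

Under these assumptions the block's SCN is ahead of the current SCN by 89378, and the RDBA points into relative file 213, the same file number that shows up in the later "Doing block recovery for file 213" messages.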

After the SCN was adjusted, the database got past that error and then reported ORA-600 [4194]:

Doing block recovery for file 213 block 4688
No block recovery was needed
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_7048.trc(incident=1984136):
ORA-00600: internal error code, arguments: [4194], [38.4.1381252], [0], [], [],[],[],[],[],[],[],[]
Sat Feb 20 10:50:45 2021
Doing block recovery for file 213 block 4688
No block recovery was needed
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_7048.trc:
ORA-00600: internal error code, arguments: [4194], [38.4.1381252], [0], [], [],[],[],[],[],[],[],[]
SMON (ospid: 7048): terminating the instance due to error 474
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_6652.trc(incident=1984185):
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Sat Feb 20 10:50:52 2021
Instance terminated by SMON, pid = 7048

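After the problematic transactions were handled and SMON was prevented from rolling them back, the database opened successfully but reported ORA-600 [4137]: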

Sat Feb 20 10:53:46 2021
Sweep [inc][1992133]: completed
Stopping background process MMNL
Sat Feb 20 10:53:47 2021
Trace dumping is performing id=[cdmp_20210220105347]
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_6576.trc(incident=1992134):
ORA-00600: internal error code, arguments: [4137], [23.13.3094188], [0], [0], [], [], [], [], [], [], [], []
ORACLE Instance xifenfei (pid = 14) - Error 600 encountered while recovering transaction (23, 13).
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_6576.trc:
ORA-00600: internal error code, arguments: [4137], [23.13.3094188], [0], [0], [], [], [], [], [], [], [], []
Sat Feb 20 10:53:47 2021
Sweep [inc2][1992133]: completed
Sat Feb 20 10:53:47 2021
Sweep [inc][1992134]: completed
Stopping background process MMON
Trace dumping is performing id=[cdmp_20210220105348]
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_6576.trc(incident=1992135):
ORA-00600: internal error code, arguments: [4137], [38.4.1381252], [0], [0], [], [], [], [], [], [], [], []
Starting background process MMON
Starting background process MMNL
Sat Feb 20 10:53:48 2021
MMON started with pid=16, OS id=6448 
ALTER SYSTEM enable restricted session;
Sat Feb 20 10:53:48 2021
MMNL started with pid=36, OS id=6840 
ORACLE Instance xifenfei (pid = 14) - Error 600 encountered while recovering transaction (38, 4).
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_6576.trc:
ORA-00600: internal error code, arguments: [4137], [38.4.1381252], [0], [0], [], [], [], [], [], [], [], []
Sat Feb 20 10:53:49 2021
Sweep [inc][1992135]: completed
Trace dumping is performing id=[cdmp_20210220105349]
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: alter database open
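
To see which undo segments are involved, the transaction IDs can be pulled out of the ORA-600 [4137] and [4194] lines. The sketch below is a hypothetical helper, not part of the original recovery. It assumes the second argument has the form usn.slot.sequence, which is consistent with the "recovering transaction (23, 13)" and "(38, 4)" messages in the log above.

```python
import re

# Matches ORA-600 [4137]/[4194] lines whose second argument looks like usn.slot.seq.
XID_RE = re.compile(r"ORA-00600:.*?\[(?:4137|4194)\],\s*\[(\d+)\.(\d+)\.(\d+)\]")


def failing_transactions(alert_lines):
    """Return the distinct (usn, slot, seq) triples found in the given lines."""
    found = set()
    for line in alert_lines:
        match = XID_RE.search(line)
        if match:
            found.add(tuple(int(part) for part in match.groups()))
    return sorted(found)


# Lines copied from the alert log excerpts in this post.
sample = [
    "ORA-00600: internal error code, arguments: [4194], [38.4.1381252], [0], [], [],[],[],[],[],[],[],[]",
    "ORA-00600: internal error code, arguments: [4137], [23.13.3094188], [0], [0], [], [], [], [], [], [], [], []",
    "ORA-00600: internal error code, arguments: [4137], [38.4.1381252], [0], [0], [], [], [], [], [], [], [], []",
    "ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []",
]

xids = failing_transactions(sample)
for usn, slot, seq in xids:
    print(f"undo segment {usn}, slot {slot}, sequence {seq}")
```

For this log the helper reports undo segments 23 and 38, which are the candidates to examine when dealing with the rollback segments.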

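After the problematic rollback segments were handled, the database started normally, with no further obvious ORA- errors. A dictionary check with hcheck.sql came back clean: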

HCheck Version 07MAY18 on 20-FEB-2021 11:35:11
----------------------------------------------
Catalog Version 11.2.0.1.0 (1102000100)
db_name: JYJG

                                   Catalog       Fixed
Procedure Name                     Version    Vs Release    Timestamp      Result
------------------------------ ... ---------- -- ---------- -------------- ------
.- LobNotInObj                 ... 1102000100 <=  *All Rel* 02/20 11:35:11 PASS
.- MissingOIDOnObjCol          ... 1102000100 <=  *All Rel* 02/20 11:35:11 PASS
.- SourceNotInObj              ... 1102000100 <=  *All Rel* 02/20 11:35:11 PASS
.- IndIndparMismatch           ... 1102000100 <= 1102000100 02/20 11:35:12 PASS
.- InvCorrAudit                ... 1102000100 <= 1102000100 02/20 11:35:12 PASS
.- OversizedFiles              ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- PoorDefaultStorage          ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- PoorStorage                 ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- PartSubPartMismatch         ... 1102000100 <= 1102000100 02/20 11:35:12 PASS
.- TabPartCountMismatch        ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- OrphanedTabComPart          ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- MissingSum$                 ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- MissingDir$                 ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- DuplicateDataobj            ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- ObjSynMissing               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- ObjSeqMissing               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedUndo                ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedIndex               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedIndexPartition      ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedIndexSubPartition   ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedTable               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedTablePartition      ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedTableSubPartition   ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- MissingPartCol              ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedSeg$                ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedIndPartObj#         ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- DuplicateBlockUse           ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- FetUet                      ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- Uet0Check                   ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- ExtentlessSeg               ... 1102000100 <= 1102000100 02/20 11:35:13 PASS
.- SeglessUET                  ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadInd$                     ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadTab$                     ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadIcolDepCnt               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- ObjIndDobj                  ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- TrgAfterUpgrade             ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- ObjType0                    ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadOwner                    ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- StmtAuditOnCommit           ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadPublicObjects            ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadSegFreelist              ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadDepends                  ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- CheckDual                   ... 1102000100 <=  *All Rel* 02/20 11:35:14 PASS
.- ObjectNames                 ... 1102000100 <=  *All Rel* 02/20 11:35:14 PASS
.- BadCboHiLo                  ... 1102000100 <=  *All Rel* 02/20 11:35:14 PASS
.- ChkIotTs                    ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- NoSegmentIndex              ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- BadNextObject               ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- DroppedROTS                 ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- FilBlkZero                  ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- DbmsSchemaCopy              ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- OrphanedObjError            ... 1102000100 >  1102000000 02/20 11:35:15 PASS
.- ObjNotLob                   ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- MaxControlfSeq              ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- SegNotInDeferredStg         ... 1102000100 >  1102000000 02/20 11:35:18 PASS
.- SystemNotRfile1             ... 1102000100 >   902000000 02/20 11:35:18 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000100 <=  *All Rel* 02/20 11:35:19 PASS
.- OrphanTrigger               ... 1102000100 <=  *All Rel* 02/20 11:35:19 PASS
.- ObjNotTrigger               ... 1102000100 <=  *All Rel* 02/20 11:35:19 PASS
---------------------------------------
20-FEB-2021 11:35:19  Elapsed: 8 secs
---------------------------------------
Found 0 potential problem(s) and 0 warning(s)

PL/SQL procedure successfully completed.

Statement processed.

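Although the dictionary checks out, the database was opened with its consistency checks bypassed, so we advised the customer to perform a logical migration as soon as conditions allow, to eliminate hidden risks.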

Posted in: Unconventional Recovery

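Recovering a database after a hardware failure

Database crash caused by a hardware failure
A customer's database failed because of a hardware fault, reporting ORA-00345, ORA-00312, ORA-27070 and OSD-04016: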

Tue Feb 05 16:58:26 2019
Thread 1 advanced to log sequence 17139 (LGWR switch)
  Current log# 12 seq# 17139 mem# 0: S:\ORADATA\ORCL\REDO12A.LOG
  Current log# 12 seq# 17139 mem# 1: S:\ORADATA\ORCL\REDO12B.LOG
Tue Feb 05 19:47:24 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_lgwr_2420.trc:
ORA-00345: redo log write error block 152097 count 8
ORA-00312: online log 12 thread 1: 'S:\ORADATA\ORCL\REDO12A.LOG'
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 1) Incorrect function.
ORA-00345: redo log write error block 152097 count 8
ORA-00312: online log 12 thread 1: 'S:\ORADATA\ORCL\REDO12B.LOG'
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 1) Incorrect function.
ORA-00345: redo log write error block 152105 count 1
ORA-00312: online log 12 thread 1: 'S:\ORADATA\ORCL\REDO12A.LOG'
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 1) Incorrect function.

Starting the database directly fails
After the hardware was repaired, starting the database directly raised ORA-00600 [kcratr_scan_lastbwr]:

Fri Feb 08 20:58:15 2019
alter database mount exclusive
Successful mount of redo thread 1, with mount id 1527506791
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Beginning crash recovery of 1 threads
Started redo scan
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3672.trc  (incident=41353):
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_41353\orcl_ora_3672_i41353.trc
Aborting crash recovery due to error 600
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3672.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3672.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...
Fri Feb 08 20:58:24 2019
Trace dumping is performing id=[cdmp_20190208205824]
Fri Feb 08 20:59:04 2019
alter database open 
Beginning crash recovery of 1 threads
Started redo scan
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_1696.trc  (incident=41354):
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_41354\orcl_ora_1696_i41354.trc
Aborting crash recovery due to error 600
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_1696.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_1696.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open ...

recover database fails
Running recover database raised ORA-00600 [6101], ORA-00600 [kdourp_inorder2], ORA-00600 [ktbsdp1] and ORA-00600 [3020]:

Fri Feb 08 21:09:20 2019
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 4 slaves
Fri Feb 08 21:09:21 2019
Recovery of Online Redo Log: Thread 1 Group 12 Seq 17139 Reading mem 0
  Mem# 0: S:\ORADATA\ORCL\REDO12A.LOG
  Mem# 1: S:\ORADATA\ORCL\REDO12B.LOG
Fri Feb 08 21:09:21 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr02_3780.trc  (incident=49379):
ORA-00600: internal error code, arguments: [6101], [17], [21], [0], [], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_49379\orcl_pr02_3780_i49379.trc
Fri Feb 08 21:09:21 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr01_2040.trc  (incident=49371):
ORA-00600: internal error code, arguments: [kdourp_inorder2], [34], [0], [0], [44], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_49371\orcl_pr01_2040_i49371.trc
Fri Feb 08 21:09:21 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr03_1068.trc  (incident=49387):
ORA-00600: internal error code, arguments: [ktbsdp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_49387\orcl_pr03_1068_i49387.trc
Fri Feb 08 21:09:24 2019
Trace dumping is performing id=[cdmp_20190208210924]
Slave exiting with ORA-10562 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr03_1068.trc:
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 1716972)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 204127
ORA-00600: internal error code, arguments: [ktbsdp1], [], [], [], [], [], [], [], [], [], [], []
Slave exiting with ORA-10562 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr02_3780.trc:
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 1738552)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 211606
ORA-00600: internal error code, arguments: [6101], [17], [21], [0], [], [], [], [], [], [], [], []
Slave exiting with ORA-10562 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr01_2040.trc:
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 1725898)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 73907
ORA-00600: internal error code, arguments: [kdourp_inorder2], [34], [0], [0], [44], [], [], [], [], [], [], []
Recovery Slave PR03 previously exited with exception 10562
Fri Feb 08 21:09:28 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr04_2608.trc  (incident=49395):
ORA-00600: internal error code, arguments: [3020], [4], [1739291], [18516507], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 4, block# 1739291, file offset is 1363369984 bytes)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 211552
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_49395\orcl_pr04_2608_i49395.trc
Slave exiting with ORA-600 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr04_2608.trc:
ORA-00600: internal error code, arguments: [3020], [4], [1739291], [18516507], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 4, block# 1739291, file offset is 1363369984 bytes)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 211552
Media Recovery failed with error 448
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_1548.trc:
ORA-00283: recovery session canceled due to errors
ORA-00448: normal completion of background process
Slave exiting with ORA-283 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_1548.trc:
ORA-00283: recovery session canceled due to errors
ORA-00448: normal completion of background process
ORA-10562 signalled during: ALTER DATABASE RECOVER  database  ...

These errors mainly stem from the sudden hardware failure, which caused writes to be lost.
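
The ORA-600 [3020] arguments in the log above can be cross-checked against the accompanying ORA-10567 line. The following Python sketch is illustrative only. It assumes that the three arguments are file number, block number and RDBA, that the RDBA uses the smallfile layout (10-bit file, 22-bit block), and that the database block size is 8 KiB. The name `decode_3020` is hypothetical.

```python
def decode_3020(file_no, block_no, rdba, block_size=8192):
    """Cross-check ORA-600 [3020] arguments (assumed: file#, block#, RDBA)."""
    rdba_file = rdba >> 22          # assumed 10-bit relative file number
    rdba_block = rdba & 0x3FFFFF    # assumed 22-bit block number
    byte_offset = block_no * block_size
    return {
        "rdba_file": rdba_file,
        "rdba_block": rdba_block,
        "consistent": (rdba_file, rdba_block) == (file_no, block_no),
        "byte_offset": byte_offset,
        # The same offset truncated to 32 bits, for comparison with the log.
        "byte_offset_32bit": byte_offset % 2**32,
    }


# Arguments copied from: ORA-00600 [3020], [4], [1739291], [18516507]
result = decode_3020(4, 1739291, 18516507)
for key, value in result.items():
    print(f"{key}: {value}")
```

With these assumptions the RDBA decodes back to file 4, block 1739291, and the byte offset truncated to 32 bits equals the 1363369984 printed in the ORA-10567 message. That match is an inference from the arithmetic rather than something the log states.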

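Approach
Recovering the datafiles one at a time shows that datafiles 1 to 3 recover cleanly and that datafile 4 is the one that fails: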

RMAN> recover datafile 1;

Starting recover at 09-FEB-19
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 09-FEB-19

RMAN> recover datafile 2;

Starting recover at 09-FEB-19
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 09-FEB-19

RMAN> recover datafile 3;

Starting recover at 09-FEB-19
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:02

Finished recover at 09-FEB-19

RMAN> recover datafile 4;

Starting recover at 09-FEB-19
using channel ORA_DISK_1

starting media recovery
media recovery failed
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 02/09/2019 21:48:19
ORA-00283: recovery session canceled due to errors
RMAN-11003: failure during parse/execution of SQL statement: alter database recover if needed
 datafile 4
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 1725913)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 73907
ORA-00600: internal error code, arguments: [kdourp_inorder2], [34], [43], [44], [44], [], [], [], [], [], [], []

SQL> recover datafile 4;
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 1725913)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 73907
ORA-00600: internal error code, arguments: [kdourp_inorder2], [34], [43], [44], [44], [], [], [], [], [], [], []

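-- Use bbed to modify the problematic file so that its recovery is skipped, then open the database directly
SQL> alter database open;

Database altered.

Once the database was open, the data was exported logically, a new database was built, and the data was imported into it.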

Posted in: Unconventional Recovery

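Another recovery of ASM disks formatted as a file system

Yet another customer formatted the ASM disks of a Windows RAC as NTFS. The DATA disk group consisted of three 500 GB disks; the first two were formatted and only one was left intact. After the formatting, a series of repair attempts followed (repairing the disk header, repartitioning, and other disk operations), which made the recovery harder and overwrote additional data.
Errors in the ASM alert log: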

Thu Aug 23 11:20:14 2018
NOTE: ASM client orcl1:orcl disconnected unexpectedly.
NOTE: check client alert log.
NOTE: Process state recorded in trace file d:\app\administrator\diag\asm\+asm\+asm1\trace\+asm1_ora_2260.trc
Thu Aug 23 11:20:28 2018
Errors in file d:\app\administrator\diag\asm\+asm\+asm1\trace\+asm1_lgwr_3820.trc:
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 87) The parameter is incorrect.
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:26 disk_offset(bytes):27566080 io_size:4096 operation:Write type:synchronous
	 result:I/O error process_id:3820
NOTE: unable to write any mirror side for diskgroup DATA
NOTE: cache initiating offline of disk 1 group DATA
NOTE: process 3268:3820 initiating offline of disk 1.4042301899 (DATA_0001) with mask 0x7e in group 2
WARNING: Disk DATA_0001 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 2, dsk = 1/0xf0f0a1cb, mode = 0x15
kfdp_updateDsk(): 22 
Thu Aug 23 11:20:28 2018
kfdp_updateDskBg(): 22 
ERROR: too many offline disks in PST (grp 2)
WARNING: Disk DATA_0001 in mode 0x7f offline aborted

Errors in the database alert log:

WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:422 disk_offset(bytes):442515456 io_size:16384 operation:Read type:synchronous
	 result:I/O error process_id:11992
WARNING: failed to read mirror side 1 of virtual extent 5 logical extent 0 of file 260 in 
group [2.1859146063] from disk DATA_0001  allocation unit 422 reason error; if possible,will try another mirror side 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_11992.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 5 logical extent 0 of file 260 
in group 2 on disk 1 allocation unit 422 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_11992.trc:
ORA-00202: control file: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: failed to submit an I/O operation to a disk
Thu Aug 23 11:20:13 2018
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-27070: async read/write failed
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:841 disk_offset(bytes):882532352 io_size:131072 operation:Write type:asynchronous
	 result:I/O error process_id:3224
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 240 logical extent 0 of file 259 in group 2 on disk 1 
allocation unit 841 KCF: read, write or open error, block=0x7853 online=1
        file=4 '+DATA/orcl/datafile/users.259.944422883'
        error=15081 txt: ''
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-27070: async read/write failed
OSD-04006: ReadFile() failure, unable to read from file
O/S-Error: (OS 87) The parameter is incorrect.
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:422 disk_offset(bytes):442515456 io_size:16384 operation:Read type:synchronous
	 result:I/O error process_id:3224
WARNING: failed to read mirror side 1 of virtual extent 5 logical extent 0 of file 260 in group [2.1859146063] from 
disk DATA_0001  allocation unit 422 reason error; if possible,will try another mirror side 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 5 logical extent 0 of file 260 in group 2 on disk 1 
allocation unit 422 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-00202: control file: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: failed to submit an I/O operation to a disk
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-00204: error in reading (block 41, # blocks 1) of control file
ORA-00202: control file: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: failed to submit an I/O operation to a disk
DBW1 (ospid: 3224): terminating the instance due to error 204

Because of the customer's series of recovery attempts, not even all of the disks show up in the listing:

D:\>asmtool -list
NTFS                             \Device\Harddisk0\Partition1              100M
NTFS                             \Device\Harddisk0\Partition2           102298M
NTFS                             \Device\Harddisk1\Partition1           102397M
NTFS                             \Device\Harddisk2\Partition1           204797M
--- one more disk should appear here but is not displayed
ORCLDISKDATA10                   \Device\Harddisk4\Partition1           511997M-- the disk the customer tried to repair
ORCLDISKDATA2                    \Device\Harddisk5\Partition1           511997M
ORCLDISKRECOVERY0                \Device\Harddisk6\Partition1            51197M
ORCLDISKRECOVERY1                \Device\Harddisk7\Partition1            51197M
ORCLDISKRECOVERY2                \Device\Harddisk8\Partition1            51197M
ORCLDISKCRS0                     \Device\Harddisk9\Partition1            10237M
ORCLDISKCRS1                     \Device\Harddisk10\Partition1           10237M
ORCLDISKCRS2                     \Device\Harddisk11\Partition1           10237M
NTFS                             \Device\Harddisk12\Partition2         4194174M

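After a series of host-level operations such as activating volumes and deleting partitions, disk headers were constructed with kfed so that these disks show up correctly at the OS level: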

C:\Users\Administrator>asmtool -list
NTFS                             \Device\Harddisk0\Partition1              100M
NTFS                             \Device\Harddisk0\Partition2           102298M
NTFS                             \Device\Harddisk1\Partition1           102397M
NTFS                             \Device\Harddisk2\Partition1           204797M
------ disks to be processed ------
ORCLDISKDATA0                    \Device\Harddisk3\Partition1           511997M
ORCLDISKDATA1                    \Device\Harddisk4\Partition1           511997M
ORCLDISKDATA2                    \Device\Harddisk5\Partition1           511997M
-----------------------
ORCLDISKRECOVERY0                \Device\Harddisk6\Partition1            51197M
ORCLDISKRECOVERY1                \Device\Harddisk7\Partition1            51197M
ORCLDISKRECOVERY2                \Device\Harddisk8\Partition1            51197M
ORCLDISKCRS0                     \Device\Harddisk9\Partition1            10237M
ORCLDISKCRS1                     \Device\Harddisk10\Partition1           10237M
ORCLDISKCRS2                     \Device\Harddisk11\Partition1           10237M
NTFS                             \Device\Harddisk12\Partition2         4194174M

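Because the ASM disk group's internal directory AUs were completely destroyed, the data could not be copied out through ASM directly. Instead, the data was recovered AU by AU through a low-level scan of the disks. Some data AUs had been overwritten by the NTFS format and the subsequent mistaken operations; everything else was recovered, which saved the vast majority of the data.
For datafile recovery, see: Recovering from a completely corrupted ASM disk header
A similar earlier recovery on the Windows platform: Recovering an ASM disk formatted as NTFS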
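
Working at the AU level means translating between byte offsets on a disk and allocation unit numbers. The alert log lines in this post print both values, so the relationship can be checked directly. The Python sketch below assumes a 1 MiB allocation unit, which is an inference from those log lines rather than a stated fact. The helper names are hypothetical, and this is not the scanning tool used in the recovery.

```python
AU_SIZE = 1024 * 1024  # assumed allocation unit size: 1 MiB


def locate(disk_offset, au_size=AU_SIZE):
    """Split a byte offset on the disk into (AU number, offset inside that AU)."""
    return divmod(disk_offset, au_size)


def au_byte_range(au_number, au_size=AU_SIZE):
    """Return the half-open byte range [start, end) that an AU occupies on disk."""
    start = au_number * au_size
    return start, start + au_size


# disk_offset values copied from the "WARNING: IO Failed" lines above,
# paired with the AU number printed on the same log line.
observed = [(26, 27566080), (422, 442515456), (841, 882532352)]

located = [locate(offset) for _, offset in observed]
for (logged_au, offset), (au, inner) in zip(observed, located):
    print(f"offset {offset}: AU {au} (log says {logged_au}), {inner} bytes into the AU")
```

All three offsets land in the AU number printed by the log, which supports the 1 MiB assumption for this disk group. A scanner that walks a disk AU by AU could use `au_byte_range` to know which bytes to read for each unit.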
If you run into a situation like this and cannot resolve it, contact us for professional Oracle database recovery support.
Phone: 13429648788    QQ: 107644445    E-Mail: dba@xifenfei.com

Posted in: Oracle ASM, Unconventional Recovery