标签归档:oracle 0kb

解决一次硬件恢复之后数据文件0kb的故障恢复case

客户一个比较久远系统,由于长期没有人维护,导致硬件故障,客户找人进行了硬件恢复之后,发现大量数据文件为0kb
0kb


客户这个系统是17年上线,19年进行了一次升级,提出要求,只要能够恢复到19年升级之后的系统状态即可(因为是制造业系统,大量配置信息在里,至于后续产生的数据,无所谓),基于目前的数据文件情况,肯定无法恢复出来(因为字典数据在system01.dbf中)
基于这种情况,我这边在客户恢复的整个目录文件中,再三查找,发现了一个类似rman备份的文件(是21年的),对其进行还原尝试
QQ20250615-134144

在还原过程中发现大量坏块,没有办法,最后只能采用一些方法强制rman还原出来备份中的部分文件

Corrupt block 653695 found during reading backup piece, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, corr_type=-2
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Continuing reading piece H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, no other copies available.
Fri Jun 06 14:23:26 2025
Cannot read block 1 from S:\DBFILES\BACKUP\ORA_DF1080446471_S8590_S1 - 
   restore failover to read from H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1
ORA-19505: 无法识别文件"S:\DBFILES\BACKUP\ORA_DF1080446471_S8590_S1"
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Full restore complete of datafile 2 to datafile copy H:\BAIDUNETDISK\BACKUP\BACKUP\2_SYSAUX01.DBF.Elapsed time: 0:00:04
  checkpoint is 16694678523790
Full restore complete of datafile 1 to datafile copy H:\BAIDUNETDISK\BACKUP\BACKUP\1_SYSTEM01.DBF.Elapsed time: 0:00:05
  checkpoint is 16694678523790
  Undo Optimization current scn is 16694646809619
Fri Jun 06 14:23:47 2025
Datafile rdba reconstruction error, expected block greater than 3305201, got 3304960 for datafile 4
Corrupt block 3746806 found during reading backup piece, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, corr_type=4
Datafile tail reconstruction error, expected tail of 0, got -1601108480 for datafile 4
………………
Corrupt block 4290319 found during reading backup piece, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, corr_type=-2
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Continuing reading piece H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, no other copies available.
Fri Jun 06 16:01:21 2025
Hex dump of (file 4, block 1) in trace file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_15808.trc
Corrupt block relative dba: 0x01000001 (file 4, block 1)
Bad check value found during deleting datafile copy
Data in bad block:
 type: 0 format: 2 rdba: 0x01000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0x0
 computed block checksum: 0xa601
Reread of blocknum=1, file=H:\BAIDUNETDISK\BACKUP\BACKUP\4_USERS01.DBF. found valid data
Switch of datafile 4 complete to datafile copy 
  checkpoint is 16126

很明显还原出来的system/sysaux文件可能还可以使用,但是users01.dbf肯定不行(从checkpoint is SCN)可以判断出来(users01.dbf是初始化出来的),基于这种情况,利用当前的system和sysaux打开数据库

Fri Jun 13 22:05:31 2025
Media Recovery failed with error 1610
Fri Jun 13 22:05:31 2025
Signalling error 1152 for datafile 1!
Signalling error 1152 for datafile 2!
Signalling error 1152 for datafile 3!
Signalling error 1152 for datafile 4!
Checker run found 5 new persistent data failures
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Fri Jun 13 22:05:49 2025
ALTER DATABASE RECOVER  database using backup controlfile  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 20 slaves
Fri Jun 13 22:05:49 2025
Warning: Datafile 3 (H:\BAIDUNETDISK\BACKUP\BACKUP\3_UNDOTBS01.DBF) is 
offline during full database recovery and will not be recovered
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile
ALTER DATABASE RECOVER    CANCEL  
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER    CANCEL  
Fri Jun 13 22:06:04 2025
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 16694678523790
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 2 (用于线程 1) 的成员
ORA-00312: 联机日志 2 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 3 (用于线程 1) 的成员
ORA-00312: 联机日志 3 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Clearing online redo logfile 1 H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG
Clearing online log 1 of thread 1 sequence number 33772
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Clearing online redo logfile 1 complete
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 2 (用于线程 1) 的成员
ORA-00312: 联机日志 2 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Clearing online redo logfile 2 H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG
Clearing online log 2 of thread 1 sequence number 33773
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 2 (用于线程 1) 的成员
ORA-00312: 联机日志 2 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 2 (用于线程 1) 的成员
ORA-00312: 联机日志 2 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Clearing online redo logfile 2 complete
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 3 (用于线程 1) 的成员
ORA-00312: 联机日志 3 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Clearing online redo logfile 3 H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG
Clearing online log 3 of thread 1 sequence number 33771
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 3 (用于线程 1) 的成员
ORA-00312: 联机日志 3 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: 无法打开日志组 3 (用于线程 1) 的成员
ORA-00312: 联机日志 3 线程 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Clearing online redo logfile 3 complete
Resetting resetlogs activation ID 1596759182 (0x5f2c9c8e)
Online log H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG: Thread 1 Group 1 was previously cleared
Online log H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG: Thread 1 Group 2 was previously cleared
Online log H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG: Thread 1 Group 3 was previously cleared
Fri Jun 13 22:06:05 2025
Setting recovery target incarnation to 2
Fri Jun 13 22:06:05 2025
Assigning activation ID 1908542329 (0x71c20b79)
LGWR: STARTING ARCH PROCESSES
Fri Jun 13 22:06:05 2025
ARC0 started with pid=21, OS id=3372 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Fri Jun 13 22:06:06 2025
ARC1 started with pid=22, OS id=14764 
Fri Jun 13 22:06:06 2025
ARC2 started with pid=23, OS id=9156 
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Jun 13 22:06:06 2025
ARC3 started with pid=24, OS id=24080 
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
Fri Jun 13 22:06:07 2025
SMON: enabling cache recovery
Undo initialization finished serial:0 start:160589734 end:160589750 diff:16 (0 seconds)
Dictionary check beginning
File #3 is offline, but is part of an online tablespace.
data file 3: 'H:\BAIDUNETDISK\BACKUP\BACKUP\3_UNDOTBS01.DBF'
File #4 is offline, but is part of an online tablespace.
data file 4: 'H:\BAIDUNETDISK\BACKUP\BACKUP\4_USERS01.DBF'
Fri Jun 13 22:06:07 2025
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_dbw0_8352.trc:
ORA-01157: ????/?????? 201 - ??? DBWR ????
ORA-01110: ???? 201: 'H:\BAIDUNETDISK\BACKUP\BACKUP\TEMP01.DBF'
ORA-27041: ??????
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_dbw0_8352.trc:
ORA-01186: ?? 201 ??????
ORA-01157: ????/?????? 201 - ??? DBWR ????
ORA-01110: ???? 201: 'H:\BAIDUNETDISK\BACKUP\BACKUP\TEMP01.DBF'
File 201 not verified due to error ORA-01157
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Re-creating tempfile H:\BAIDUNETDISK\BACKUP\BACKUP\TEMP01.DBF
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Fri Jun 13 22:06:07 2025
QMNC started with pid=25, OS id=20288 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open resetlogs

导出需要的业务用户字典信息,然后把客户那边提供的users01.dbf文件(users02.dbf是客户在21年之后增加的,原则上客户要的数据都在users01.dbf中)中的数据恢复到导出的字典中,完成本次数据恢复,客户远程验证业务,运行正常,客户需要的配置信息都在其中.

发表在 Oracle | 标签为 , , , , | 留下评论

Oracle 数据文件大小为0kb或者文件丢失恢复

接到一个朋友恢复请求,由于rose频繁切换导致文件系统部分数据文件变化为0kb和文件丢失.
故障现象
部分数据文件变化为0kb和文件丢失.
file_lost
file_size_0


这里比较明显,数据库的users03变为了0kb和users04丢失.数据库alert日志报错信息如下:

Completed: alter database mount exclusive
alter database open
Errors in file E:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_dbw0_12008.trc:
ORA-01157: ????/?????? 7 - ??? DBWR ????
ORA-01110: ???? 7: 'E:\APP\ADMINISTRATOR\ORADATA\ORCL\USERS03.DBF'
ORA-27047: ??????????
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 38) 已到文件结尾。
Errors in file E:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_dbw0_12008.trc:
ORA-01157: ????/?????? 8 - ??? DBWR ????
ORA-01110: ???? 8: 'E:\APP\ADMINISTRATOR\ORADATA\ORCL\USERS04.DBF'
ORA-27041: ??????
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
Errors in file E:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_12040.trc:
ORA-01157: ????/?????? 7 - ??? DBWR ????
ORA-01110: ???? 7: 'E:\APP\ADMINISTRATOR\ORADATA\ORCL\USERS03.DBF'
ORA-1157 signalled during: alter database open...
Fri May 04 09:35:10 2018
Checker run found 2 new persistent data failures

alert日志的报错也比较明显,users03是文件超过了大小(大小为0kb,读取之后肯定超过大小),users04提示无法打开文件(文件在文件系统层面已经丢失).现在问题比较明显由于文件系统故障导致文件大小为0和丢失

碎片扫描恢复
常规的方法肯定无法恢复,比较好的方法只能是底层碎片扫描重组,结合多种扫描工具,最后发现一个做底层恢复的朋友的工具效果不错,扫描结果如下
file_scan


通过工具分析坏块情况

C:\Users\Administrator>dbv FiLe=D:\0504\ORCL_TS.4_FILE.7_10.ora

DBVERIFY: Release 11.2.0.4.0 - Production on 星期六 5月 5 08:52:53 2018

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - 开始验证: FILE = D:\0504\ORCL_TS.4_FILE.7_10.ora

………………

页 382565 标记为损坏
Corrupt block relative dba: 0x01c5d665 (file 7, block 382565)
Completely zero block found during dbv:

页 382566 标记为损坏
Corrupt block relative dba: 0x01c5d666 (file 7, block 382566)
Completely zero block found during dbv:

页 382567 标记为损坏
Corrupt block relative dba: 0x01c5d667 (file 7, block 382567)
Completely zero block found during dbv:



DBVERIFY - 验证完成

检查的页总数: 1374720
处理的页总数 (数据): 27582
失败的页总数 (数据): 0
处理的页总数 (索引): 20114
失败的页总数 (索引): 0
处理的页总数 (其他): 1319752
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 1
标记为损坏的总页数: 7271
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 228271996 (0.228271996)


C:\Users\Administrator>dbv FiLe=D:\0504\ORCL_TS.4_FILE.8_8.ora

DBVERIFY: Release 11.2.0.4.0 - Production on 星期六 5月 5 08:52:53 2018

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - 开始验证: FILE = D:\0504\ORCL_TS.4_FILE.8_8.ora


DBVERIFY - 验证完成

检查的页总数: 1136896
处理的页总数 (数据): 36639
失败的页总数 (数据): 0
处理的页总数 (索引): 57038
失败的页总数 (索引): 0
处理的页总数 (其他): 1043218
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 1
标记为损坏的总页数: 0
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 228271997 (0.228271997)

C:\Users\Administrator>

scan_resulte


这里通过分析恢复的两个文件总的block数量2511618,其中连续损坏7271个block损坏,由于出现问题之后,数据库被offline这两个文件继续启动运行了几个小时,导致少量block被覆盖,恢复软件直接置空.后续的恢复比较顺利,正常open数据库,然后处理坏块对象(正好不是业务核心表的lob字段,所有部分丢失影响不是非常大).

温馨提醒:
1. 数据文件和备份不要放在同一个阵列上,更不能是同一个分区(卷)上
2. 出现此类问题之后,应当理解停止对该分区的任何写操作,方式丢失或者大小为0KB的文件被覆盖.
如果需要专业ORACLE数据库恢复技术支持,请联系我们
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

发表在 非常规恢复 | 标签为 , , , , , , , , , | 评论关闭