Case study: recovering a database whose data files were left at 0 KB after a hardware recovery

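A customer had a fairly old system that had gone unmaintained for a long time and eventually suffered a hardware failure. After the customer had someone recover the hardware, a large number of data files turned out to be 0 KB.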
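A quick way to see how many data files are affected is to list the zero-length ones. This is a minimal sketch, assuming a POSIX shell with a `find` that supports `-iname` is available on the machine holding the recovered directory; the function name `list_empty_datafiles` is made up for illustration.

```shell
# list_empty_datafiles DIR: print every *.dbf file under DIR whose size is 0 bytes.
# -iname is used because file names on Windows volumes are often upper case.
list_empty_datafiles() {
  find "$1" -type f -iname '*.dbf' -size 0 -print
}
```

Calling `list_empty_datafiles /path/to/recovered/dir` prints one path per empty data file; no output means no `.dbf` file is empty.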


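The system went live in 2017 and was upgraded once in 2019. The customer's only requirement was to get back to the state of the system right after the 2019 upgrade (it is a manufacturing system holding a large amount of configuration data; data produced afterwards did not matter). With the data files in their current state this was clearly impossible, because the data dictionary lives in system01.dbf.
Given this, I searched the entire directory tree the customer had recovered, again and again, and found a file that looked like an RMAN backup piece (from 2021), so I attempted a restore from it.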

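The restore ran into a large number of corrupt blocks. With no alternative, I ended up using some methods to force RMAN to restore whatever files it could from the backup.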

Corrupt block 653695 found during reading backup piece, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, corr_type=-2
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=653695, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Continuing reading piece H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, no other copies available.
Fri Jun 06 14:23:26 2025
Cannot read block 1 from S:\DBFILES\BACKUP\ORA_DF1080446471_S8590_S1 - 
   restore failover to read from H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1
ORA-19505: failed to identify file "S:\DBFILES\BACKUP\ORA_DF1080446471_S8590_S1"
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Full restore complete of datafile 2 to datafile copy H:\BAIDUNETDISK\BACKUP\BACKUP\2_SYSAUX01.DBF.Elapsed time: 0:00:04
  checkpoint is 16694678523790
Full restore complete of datafile 1 to datafile copy H:\BAIDUNETDISK\BACKUP\BACKUP\1_SYSTEM01.DBF.Elapsed time: 0:00:05
  checkpoint is 16694678523790
  Undo Optimization current scn is 16694646809619
Fri Jun 06 14:23:47 2025
Datafile rdba reconstruction error, expected block greater than 3305201, got 3304960 for datafile 4
Corrupt block 3746806 found during reading backup piece, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, corr_type=4
Datafile tail reconstruction error, expected tail of 0, got -1601108480 for datafile 4
………………
Corrupt block 4290319 found during reading backup piece, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, corr_type=-2
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Reread of blocknum=4290319, file=H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, found same corrupt data
Continuing reading piece H:\BAIDUNETDISK\ORA_DF1080446471_S8590_S1, no other copies available.
Fri Jun 06 16:01:21 2025
Hex dump of (file 4, block 1) in trace file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_15808.trc
Corrupt block relative dba: 0x01000001 (file 4, block 1)
Bad check value found during deleting datafile copy
Data in bad block:
 type: 0 format: 2 rdba: 0x01000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0x0
 computed block checksum: 0xa601
Reread of blocknum=1, file=H:\BAIDUNETDISK\BACKUP\BACKUP\4_USERS01.DBF. found valid data
Switch of datafile 4 complete to datafile copy 
  checkpoint is 16126

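Clearly the restored system and sysaux files might still be usable, but users01.dbf certainly was not. The "checkpoint is" SCN shows this: system and sysaux report 16694678523790, while users01.dbf reports 16126, which means it is essentially a freshly initialized file. So I opened the database using the restored system and sysaux files.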
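The checkpoint comparison can be pulled out of the alert log mechanically. The following is a small sketch, assuming the alert log uses exactly the message layout shown above ("Full restore complete of datafile N ..." or "Switch of datafile N complete ..." followed by a "checkpoint is" line); `restored_checkpoints` is a hypothetical helper name.

```shell
# restored_checkpoints ALERT_LOG: print "<file#> <checkpoint SCN>" for each
# datafile that the log reports as restored or switched to a copy.
restored_checkpoints() {
  awk '
    /^Full restore complete of datafile [0-9]+ / { file = $6; next }
    /^Switch of datafile [0-9]+ complete/        { file = $4; next }
    # the checkpoint line belongs to the most recent restore/switch message
    /^ *checkpoint is [0-9]+/ && file != ""      { print file, $3; file = "" }
  ' "$1"
}
```

A file whose SCN is far below the others, as datafile 4 is here, is a candidate for having been rebuilt from an empty header rather than restored with real content.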

Fri Jun 13 22:05:31 2025
Media Recovery failed with error 1610
Fri Jun 13 22:05:31 2025
Signalling error 1152 for datafile 1!
Signalling error 1152 for datafile 2!
Signalling error 1152 for datafile 3!
Signalling error 1152 for datafile 4!
Checker run found 5 new persistent data failures
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Fri Jun 13 22:05:49 2025
ALTER DATABASE RECOVER  database using backup controlfile  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 20 slaves
Fri Jun 13 22:05:49 2025
Warning: Datafile 3 (H:\BAIDUNETDISK\BACKUP\BACKUP\3_UNDOTBS01.DBF) is 
offline during full database recovery and will not be recovered
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile
ALTER DATABASE RECOVER    CANCEL  
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER    CANCEL  
Fri Jun 13 22:06:04 2025
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 16694678523790
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Clearing online redo logfile 1 H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG
Clearing online log 1 of thread 1 sequence number 33772
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Clearing online redo logfile 1 complete
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Clearing online redo logfile 2 H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG
Clearing online log 2 of thread 1 sequence number 33773
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Clearing online redo logfile 2 complete
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Clearing online redo logfile 3 H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG
Clearing online log 3 of thread 1 sequence number 33771
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_5812.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: 'H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Clearing online redo logfile 3 complete
Resetting resetlogs activation ID 1596759182 (0x5f2c9c8e)
Online log H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG: Thread 1 Group 1 was previously cleared
Online log H:\BAIDUNETDISK\BACKUP\BACKUP\REDO02.LOG: Thread 1 Group 2 was previously cleared
Online log H:\BAIDUNETDISK\BACKUP\BACKUP\REDO03.LOG: Thread 1 Group 3 was previously cleared
Fri Jun 13 22:06:05 2025
Setting recovery target incarnation to 2
Fri Jun 13 22:06:05 2025
Assigning activation ID 1908542329 (0x71c20b79)
LGWR: STARTING ARCH PROCESSES
Fri Jun 13 22:06:05 2025
ARC0 started with pid=21, OS id=3372 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Fri Jun 13 22:06:06 2025
ARC1 started with pid=22, OS id=14764 
Fri Jun 13 22:06:06 2025
ARC2 started with pid=23, OS id=9156 
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: H:\BAIDUNETDISK\BACKUP\BACKUP\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Jun 13 22:06:06 2025
ARC3 started with pid=24, OS id=24080 
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
Fri Jun 13 22:06:07 2025
SMON: enabling cache recovery
Undo initialization finished serial:0 start:160589734 end:160589750 diff:16 (0 seconds)
Dictionary check beginning
File #3 is offline, but is part of an online tablespace.
data file 3: 'H:\BAIDUNETDISK\BACKUP\BACKUP\3_UNDOTBS01.DBF'
File #4 is offline, but is part of an online tablespace.
data file 4: 'H:\BAIDUNETDISK\BACKUP\BACKUP\4_USERS01.DBF'
Fri Jun 13 22:06:07 2025
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_dbw0_8352.trc:
ORA-01157: cannot identify/lock data file 201 - see DBWR trace file
ORA-01110: data file 201: 'H:\BAIDUNETDISK\BACKUP\BACKUP\TEMP01.DBF'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_dbw0_8352.trc:
ORA-01186: file 201 failed verification tests
ORA-01157: cannot identify/lock data file 201 - see DBWR trace file
ORA-01110: data file 201: 'H:\BAIDUNETDISK\BACKUP\BACKUP\TEMP01.DBF'
File 201 not verified due to error ORA-01157
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Re-creating tempfile H:\BAIDUNETDISK\BACKUP\BACKUP\TEMP01.DBF
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Fri Jun 13 22:06:07 2025
QMNC started with pid=25, OS id=20288 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open resetlogs

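I exported the dictionary information for the business users that were needed, then recovered the data in the users01.dbf file supplied by the customer into that exported dictionary (users02.dbf was added by the customer after 2021; in principle everything the customer wanted is in users01.dbf). That completed the recovery. The customer verified the application remotely: it ran normally and all the required configuration data was present.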


Handling "Error in invoking target 'libasmclntsh19.ohso libasmperl19.ohso client_sharedlib'"

I recently installed Oracle 19c on a Red Hat 8.x system, and the link step failed with an error like: [FATAL] Error in invoking target 'libasmclntsh19.ohso libasmperl19.ohso client_sharedlib' of makefile '/u01/app/oracle/product/19c/db_1/rdbms/lib/ins_rdbms.mk'.

[oracle@xifenfei db_1]$ ./runInstaller -ignorePrereq -waitforcompletion -silent \
>   oracle.install.option=INSTALL_DB_SWONLY \
>   UNIX_GROUP_NAME=oinstall \
>   INVENTORY_LOCATION=${ORACLE_BASE}/oraInventory \
>   ORACLE_HOME=${ORACLE_HOME} \
>   ORACLE_BASE=${ORACLE_BASE} \
>   oracle.install.db.InstallEdition=EE \
>   oracle.install.db.OSDBA_GROUP=dba \
>   oracle.install.db.OSBACKUPDBA_GROUP=backupdba \
>   oracle.install.db.OSDGDBA_GROUP=dgdba \
>   oracle.install.db.OSKMDBA_GROUP=kmdba \
>   oracle.install.db.OSRACDBA_GROUP=dba \
>   SECURITY_UPDATES_VIA_MYORACLESUPPORT=false \
>   DECLINE_SECURITY_UPDATES=true
Launching Oracle Database Setup Wizard...

[WARNING] [INS-13014] Target environment does not meet some optional requirements.
   CAUSE: Some of the optional prerequisites are not met. See logs for details. 
/u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log
   ACTION: Identify the list of failed prerequisite checks from the log: 
/u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log. 
Then either from the log file or from installation manual 
find the appropriate configuration to meet the prerequisites and fix it manually.
The response file for this session can be found at:
 /u01/app/oracle/product/19c/db_1/install/response/db_2025-06-02_05-13-46PM.rsp

You can find the log of this install session at:
 /u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log
[FATAL] Error in invoking target 'libasmclntsh19.ohso libasmperl19.ohso client_sharedlib' of 
makefile '/u01/app/oracle/product/19c/db_1/rdbms/lib/ins_rdbms.mk'. See 
'/u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log' for details.

The install log contains the details:

INFO:
/usr/bin/ld
INFO:
: cannot find -lclntsh

INFO:
make[2]: *** [/u01/app/oracle/product/19c/db_1/rdbms/lib/env_rdbms.mk:5232: dlopenlib] Error 1

INFO:
make[2]: Leaving directory '/u01/app/oracle/product/19c/db_1/rdbms/lib'

INFO:
make[1]: *** [/u01/app/oracle/product/19c/db_1/rdbms/lib/env_rdbms.mk:5210: 
 /u01/app/oracle/product/19c/db_1/lib/libasmperl19.so] Error 2

INFO:
make[1]: Leaving directory '/u01/app/oracle/product/19c/db_1/rdbms/lib'

INFO:
make: *** [/u01/app/oracle/product/19c/db_1/rdbms/lib/env_rdbms.mk:5247: libasmperl19.ohso] Error 2

INFO: End output from spawned process.
INFO: ----------------------------------
INFO: Exception thrown from action: make
Exception Name: MakefileException
Exception String: Error in invoking target 'libasmclntsh19.ohso libasmperl19.ohso client_sharedlib' of makefile 
'/u01/app/oracle/product/19c/db_1/rdbms/lib/ins_rdbms.mk'. See 
'/u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log' for details.
Exception Severity: 1
INFO:  [Jun 2, 2025 5:16:23 PM] Adding ExitStatus STOP_INSTALL to the exit status set
INFO:  [Jun 2, 2025 5:16:23 PM] Finding the most appropriate exit status for the current application
INFO:  [Jun 2, 2025 5:16:23 PM] inventory location is/u01/app/oraInventory
INFO:  [Jun 2, 2025 5:16:23 PM] Adding ExitStatus SUCCESS_WITH_WARNINGS to the exit status set
INFO:  [Jun 2, 2025 5:16:23 PM] Finding the most appropriate exit status for the current application
INFO:  [Jun 2, 2025 5:16:23 PM] Exit Status is -4
INFO:  [Jun 2, 2025 5:16:23 PM] Shutdown Oracle Database 19c Installer
INFO:  [Jun 2, 2025 5:16:23 PM] Unloading Setup Driver
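In a long installActions log the linker message is easy to miss because it is split across several INFO lines. As a rough sketch (the helper name `missing_libs` is hypothetical, and it assumes a `grep` that supports `-o`), the unresolved `-l` names can be extracted like this:

```shell
# missing_libs LOG: print each distinct library the linker reported as
# "cannot find -l<name>", rewritten to the usual lib<name> file prefix.
missing_libs() {
  grep -o 'cannot find -l[A-Za-z0-9_+.-]*' "$1" \
    | sed 's/^cannot find -l/lib/' \
    | sort -u
}
```

For the log above this would print `libclntsh`, which is the file name prefix to look for under `$ORACLE_HOME/lib`.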

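The linker reports that -lclntsh cannot be found, which corresponds to the libclntsh shared library in the database home. Checking the related files under the database lib directory: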

[oracle@xifenfei lib]$ ls -ltr libclntsh*
lrwxrwxrwx 1 oracle oinstall      12 Jun  2 16:06 libclntsh.so.11.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall      12 Jun  2 16:06 libclntsh.so.10.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall      12 Jun  2 16:06 libclntsh.so.12.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall      12 Jun  2 16:06 libclntsh.so.18.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall      17 Jun  2 16:06 libclntsh.so -> libclntsh.so.19.1
-rwxr-xr-x 1 oracle oinstall 8057080 Jun  2 16:11 libclntshcore.so.19.1
lrwxrwxrwx 1 oracle oinstall      21 Jun  2 16:11 libclntshcore.so -> libclntshcore.so.19.1

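Every libclntsh.so.*.1 symlink resolves, through libclntsh.so, to libclntsh.so.19.1, but libclntsh.so.19.1 itself is missing. I located the file in the installation media, copied it into lib, and set its permissions: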

[oracle@xifenfei ~]$ cd $ORACLE_HOME/lib
[oracle@xifenfei lib]$ cp /tmp/libclntsh.so.19.1 ./
[oracle@xifenfei lib]$ chmod 777 libclntsh.so.19.1
[oracle@xifenfei lib]$ ls -ltr libclntsh*
lrwxrwxrwx 1 oracle oinstall       12 Jun  2 17:31 libclntsh.so.10.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall       12 Jun  2 17:31 libclntsh.so.11.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall       12 Jun  2 17:31 libclntsh.so.12.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall       12 Jun  2 17:31 libclntsh.so.18.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall       17 Jun  2 17:34 libclntsh.so -> libclntsh.so.19.1
lrwxrwxrwx 1 oracle oinstall       21 Jun  2 17:34 libclntshcore.so -> libclntshcore.so.19.1
-rwxr-xr-x 1 oracle oinstall 82573024 Jun  2 17:34 libclntsh.so.19.1
-rwxr-xr-x 1 oracle oinstall  8057080 Jun  2 17:34 libclntshcore.so.19.1
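A symlink chain whose final target is missing, like the one above, can be detected before running the installer. This is a minimal sketch using only POSIX `test` operators; `broken_links` is an illustrative name, not part of any Oracle tooling.

```shell
# broken_links DIR: print every symlink directly inside DIR whose link chain
# does not end at an existing file.
broken_links() {
  for f in "$1"/*; do
    # -L is true for a symlink; -e follows the chain and fails if the target is gone
    if [ -L "$f" ] && [ ! -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
}
```

Running `broken_links "$ORACLE_HOME/lib"` before the fix would list libclntsh.so and every versioned link pointing at it; after the file is copied back it should print nothing for them.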

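Re-running the same runInstaller command afterwards completed normally. The root cause was simply the missing libclntsh.so.19.1 file. I checked the unzip log and the file had in fact been extracted; why it later went missing is unknown.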



ORA-01171: datafile N going offline due to error advancing checkpoint

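I recently received a recovery inquiry from a customer about a data file that had gone offline. Analysis of the alert log showed that the data file was in use by another process when the database started, and as a result the file was forcibly taken offline after the database opened.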

Fri May 16 20:01:05 2025
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Fri May 16 20:01:05 2025
ALTER DATABASE OPEN
Fri May 16 20:01:06 2025
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=70, OS id=4628
Fri May 16 20:01:06 2025
ARC0: Archival started
ARC1 started with pid=74, OS id=4840
Fri May 16 20:01:06 2025
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
Fri May 16 20:01:06 2025
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_4080.trc:
ORA-01110: data file 14: 'D:\ORADATA\XIFENFEI105_DAT_1.DBF'
ORA-01114: IO error writing block to file 14 (block # 1)
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 32) The process cannot access the file because it is being used by another process.

Thread 1 opened at log sequence 172421
  Current log# 1 seq# 172421 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Fri May 16 20:01:06 2025
ARC1: STARTING ARCH PROCESSES
Fri May 16 20:01:06 2025
Successful open of redo thread 1
Fri May 16 20:01:06 2025
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Fri May 16 20:01:06 2025
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC2 started with pid=78, OS id=4056
Fri May 16 20:01:06 2025
ARC1: Becoming the heartbeat ARCH
Fri May 16 20:01:06 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri May 16 20:01:06 2025
SMON: enabling cache recovery
Fri May 16 20:01:07 2025
Successfully onlined Undo Tablespace 1.
Fri May 16 20:01:07 2025
SMON: enabling tx recovery
Fri May 16 20:01:08 2025
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=86, OS id=4492
Fri May 16 20:01:12 2025
db_recovery_file_dest_size of 51200 MB is 1.97% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Fri May 16 20:01:13 2025
Completed: ALTER DATABASE OPEN
Fri May 16 20:06:44 2025
Restarting dead background process MMON
MMON started with pid=98, OS id=4232
Fri May 16 20:07:06 2025
Shutting down archive processes
Fri May 16 20:07:11 2025
ARCH shutting down
ARC2: Archival stopped
Fri May 16 20:10:32 2025
Thread 1 advanced to log sequence 172422
  Current log# 2 seq# 172422 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Fri May 16 20:15:33 2025
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_ckpt_2496.trc:
ORA-01171: datafile 14 going offline due to error advancing checkpoint
ORA-01122: database file 14 failed verification check
ORA-01110: data file 14: 'D:\ORADATA\XIFENFEI105_DAT_1.DBF'
ORA-01208: data file is an old version - not accessing current version

Fri May 16 20:23:09 2025
Starting background process EMN0
EMN0 started with pid=82, OS id=2660

Checking the reported file with dbv confirmed that the offlined file itself was intact.
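The dbv result can also be checked in a script instead of by eye. The sketch below assumes the usual DBVERIFY summary lines ("Total Pages Failing ..." and "Total Pages Marked Corrupt ...") and a POSIX shell on the machine where the output is examined; `dbv_clean` is a hypothetical helper, and the `blocksize=8192` in the usage line is an assumption about this database.

```shell
# dbv_clean: read DBVERIFY output on stdin and succeed only if summary lines
# were seen and every "Failing" / "Marked Corrupt" page count is 0.
dbv_clean() {
  awk -F: '
    /Total Pages (Failing|Marked Corrupt)/ { seen = 1; if ($2 + 0 != 0) bad = 1 }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}

# Possible usage (not run here):
#   dbv file='D:\ORADATA\XIFENFEI105_DAT_1.DBF' blocksize=8192 2>&1 | dbv_clean
```

The function deliberately fails when no summary lines are present, so a dbv run that aborted early is not mistaken for a clean file.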


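The failure itself is relatively simple: as long as the archived logs exist, run recover datafile and then bring the file online. Here, however, the backup software runs on a schedule, so the archived logs that were needed had already been backed up and removed from disk.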
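For the simple case where the archived logs are still on disk, the two steps can be written as a short SQL*Plus script. This is only a sketch: `recover_sql` is a made-up helper that prints the statements, and `SET AUTORECOVERY ON` is added here (an assumption, not something from the original case) so that RECOVER does not prompt for log names when run non-interactively.

```shell
# recover_sql FILE_NO: print SQL*Plus statements that recover one data file
# and bring it online. Rejects anything that is not a plain file number.
recover_sql() {
  case $1 in
    ''|*[!0-9]*) echo "usage: recover_sql <file number>" >&2; return 1 ;;
  esac
  cat <<EOF
SET AUTORECOVERY ON
RECOVER DATAFILE $1;
ALTER DATABASE DATAFILE $1 ONLINE;
EOF
}

# Possible usage (not run here):
#   recover_sql 14 | sqlplus -s "/ as sysdba"
```

Printing the statements rather than executing them keeps the helper safe to review first; if recovery stops for a missing archived log, the file stays offline and the ONLINE statement fails rather than succeeding silently.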

Fri May 16 21:55:10 2025
Control autobackup written to SBT_TAPE device
	comment 'API Version 2.0,MMS Version 10.0.0.116',
	media 'V_6746190_6959024'
	handle 'c-1300253653-20250516-00'
Fri May 16 21:56:03 2025
Thread 1 cannot allocate new log, sequence 172423
Private strand flush not complete
  Current log# 2 seq# 172422 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG

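In addition, the affected data file was not a core business file, so the customer did not notice in time. By the time they did and tried recover datafile, it reported that archived logs were missing.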

Wed May 28 17:26:01 2025
alter database recover datafile list clear
Wed May 28 17:26:01 2025
Completed: alter database recover datafile list clear
Wed May 28 17:26:01 2025
alter database recover if needed
 datafile 14

Media Recovery Start
 parallel recovery started with 16 processes
ORA-279 signalled during: alter database recover if needed
 datafile 14
...
Wed May 28 17:26:11 2025
alter database recover cancel
Wed May 28 17:26:13 2025
Media Recovery Canceled
Completed: alter database recover cancel
Wed May 28 17:38:58 2025
ALTER DATABASE RECOVER  datafile 'D:\ORADATA\XIFENFEI105_DAT_1.DBF'  
Wed May 28 17:38:58 2025
Media Recovery Start
 parallel recovery started with 16 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 'D:\ORADATA\XIFENFEI105_DAT_1.DBF'  ...
Wed May 28 18:26:37 2025
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Wed May 28 18:26:38 2025
Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_05_28\O1_MF_1_172421_%U_.ARC
Errors with log D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_05_28\O1_MF_1_172421_%U_.ARC
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Wed May 28 18:26:38 2025
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Wed May 28 18:26:38 2025
Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_05_28\O1_MF_1_172421_%U_.ARC
Errors with log D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_05_28\O1_MF_1_172421_%U_.ARC
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Wed May 28 18:26:38 2025
ALTER DATABASE RECOVER CANCEL 
Wed May 28 18:26:40 2025
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL 

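The customer was fairly lucky: the archived logs needed for recovery were all still in the tape library. With tape-library channels allocated, recover datafile succeeded directly.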

RUN {
  ALLOCATE CHANNEL ch1 DEVICE TYPE 'sbt_tape' 
  PARMS="BLKSIZE=262144,ENV=(CV_mmsApiVsn=2,CV_channelPar=ch1)";
  ALLOCATE CHANNEL ch2 DEVICE TYPE 'sbt_tape' 
  PARMS="BLKSIZE=262144,ENV=(CV_mmsApiVsn=2,CV_channelPar=ch2)";
 recover datafile 14;
}



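That resolved the problem completely. The lessons from this case:
1. After a database restart, check the alert log and query the status of the data files (mainly so that a problem with a rarely used file does not go unnoticed).
2. Know the basics of your database environment, such as backups, disaster recovery, ASM disk group redundancy, storage redundancy and network redundancy, so that problems are easier to diagnose and resolve.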
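The first lesson can be turned into a routine post-restart check. As a rough sketch, the function below scans an alert log for the data file errors that appeared in this case; the name `alert_datafile_errors` and the exact list of ORA codes are choices made for illustration and can be extended.

```shell
# alert_datafile_errors ALERT_LOG: print, with line numbers, every alert log
# line carrying one of the data file related errors seen in this case.
# Exit status is non-zero when nothing matches.
alert_datafile_errors() {
  grep -n -E 'ORA-(01110|01114|01122|01157|01171|01208)' "$1"
}
```

Inside the database, a query such as `SELECT file#, status, name FROM v$datafile WHERE status NOT IN ('ONLINE', 'SYSTEM');` serves the same purpose by listing any file that is OFFLINE or in RECOVER status.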
