分类目录归档:非常规恢复

在生产环境错误执行dd命令破坏asm磁盘故障恢复

由于ssh登录错误,客户对生产环境进行了误操作把系统的一块磁盘dd到另外两个磁盘上,由于及时发现立马进行了终止操作,但是还是分别破坏了一点数据(一块盘破坏了2G多,另外一块盘破坏了1G多)
dd


通过分析udev的绑定关系确认被破坏的asm disk名称
udev

再通过asm alert日志确认破坏磁盘在asm disk中情况
asmdisk

通过上述信息基本上可以确认,asmdisk13被分别dd到了asmdisk11和asmdisk26中了部分11
26

基于这种情况,由于asm disk被破坏了1-2G多,直接修复然后正常mount磁盘组基本上没有希望,经过分析以及与客户沟通,确认他们改系统是4节点组成的集群,1/2节点上面跑2套库,3/4节点上跑2套库,数据整体放在data_dg磁盘组中,需要恢复的库是第二个顺序创建的1套库(4套库中只需恢复一套即可),由于破坏的数据本身不多,而且需要恢复的数据不是最初写入asm磁盘组,基于这样的情况,需要恢复的数据库机会比较大.

由于现在三个磁盘头信息一致(一个磁盘被dd到另外两个磁盘上),因此第一步先把损坏的两个磁盘头进行简单修复,为了便于amdu(找回ASM中数据文件)等之类数据可以识别到正确的磁盘头信息,然后进行后续的数据文件提取恢复.使用工具对数据文件进行了批量提取,提取数据完成之后,尝试recover和open库
open

虽然数据库正常打开了,不过很不幸,后台还是有一些坏块报错,通过dbv检查发现有文件有一部分坏块,类似dbv报错信息
1_1

通过分析该文件在磁盘组中各个磁盘的分布情况
map

确认该文件确实有部分block分布在被dd磁盘的破坏的范围内这个部分的数据丢失无法挽回,只能是定位到具体对象然后由业务想办法处理.相对以往的各种dd破坏的案例恢复而言,这个应该是效果比较好的一个,而且也是恢复比较容易的一个,没有使用到asm disk 基于asm au/oracle block 扫描的级别,而且system表空间没有任何损坏,数据库甚至直接open成功了,以往的一些dd案例列举:
asm磁盘dd破坏恢复
dd破坏asm磁盘头恢复
asm disk 磁盘部分被清空恢复

发表在 非常规恢复 | 标签为 , | 留下评论

expdp dmp 导出不完整导入ORA-39059 ORA-39246 故障抢救数据

客户一套nc系统,由于安装时候把库建在了比较小的分区上,运行一些时间之后,出现空间不足,现场技术人员对oracle不太熟悉,经过一系列操作(删除业务表空间,复制pdb,创建表空间等等操作),无法恢复数据库,准备使用备份的dmp进行还原,结果分析发现仅保留的最后一份dmp,是一份导出不完全的dmp文件,无法正常导入(以前处理过一个类似case:ORA-39773: parse of metadata stream failed故障处理,尝试导入报ORA-39246错:

C:\Users\XFF>impdp system/oracle@127.0.0.1/orapdb directory=expdp_dir dumpfile=xxxxx_2025-12-01_0230.dmp logfile=1.log

Import: Release 19.0.0.0.0 - Production on 星期三 12月 3 21:00:19 2025
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

连接到: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
ORA-39002: 操作无效
ORA-39059: 转储文件集不完整
ORA-39246: 无法在提供的转储文件中定位主表

分析当时当初的dmp日志,由于expdp的job表所在表空间不足导致expdp导出失败
dmp1


TABLE:"XIFENFEI"."EOM_MEASURE_POINT"
ORA-30032: 挂起的 (可恢复) 语句已超时
ORA-01691: Lob 段 XIFENFEI.SYS_LOB0000161267C00111$$ 无法通过 32 (在表空间 NNC_DATA01 中) 扩展
ORA-06512: 在 "SYS.DBMS_SYS_ERROR", line 105
ORA-06512: 在 "SYS.KUPW$WORKER", line 12620
ORA-06512: 在 "SYS.DBMS_SYS_ERROR", line 105
ORA-06512: 在 "SYS.KUPW$WORKER", line 11414
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xda5dae50     33476  package body SYS.KUPW$WORKER.WRITE_ERROR_INFORMATION
0xda5dae50     12641  package body SYS.KUPW$WORKER.DETERMINE_FATAL_ERROR
0xda5dae50     11602  package body SYS.KUPW$WORKER.CREATE_OBJECT_ROWS
0xda5dae50     15268  package body SYS.KUPW$WORKER.FETCH_XML_OBJECTS
0xda5dae50      3907  package body SYS.KUPW$WORKER.UNLOAD_METADATA
0xda5dae50     13736  package body SYS.KUPW$WORKER.DISPATCH_WORK_ITEMS
0xda5dae50      2429  package body SYS.KUPW$WORKER.MAIN
0x6524a4f0         2  anonymous block
KUPW: Object row index into parse items is: 1
KUPW: Parse item count is: 19
KUPW: In function CHECK_FOR_REMAP_NETWORK
KUPW: Nothing to remap
KUPW: In procedure BUILD_OBJECT_STRINGS - non-base info
KUPW: In procedure BUILD_SUBNAME_LIST with TABLE:XIFENFEI.EOM_MEASURE_POINT
KUPW: In function NEXT_PO_NUMBER
KUPW: PO number assigned: 34198
FORALL
KUPW: In procedure DETERMINE_FATAL_ERROR with ORA-30032: 挂起的 (可恢复) 语句已超时
ORA-01691: Lob 段 XIFENFEI.SYS_LOB0000161267C00111$$ 无法通过 32 (在表空间 NNC_DATA01 中) 扩展
作业 "XIFENFEI"."SYS_EXPORT_SCHEMA_01" 因致命错误于 星期一 12月 1 06:33:21 2025 elapsed 0 04:03:18 停止

从导出日志看,在导出大量”0 KB 0 行”记录之后提示表空间不足,expdp的job表无法扩展导致导出挂起然后超时导出终止(这个导出操作没有完全完成),从而在导入的时候出现了ORA-39059: 转储文件集不完整 ORA-39246: 无法在提供的转储文件中定位主表 的错误.对于这种故障,分析导出日志,发现运气不错,所有有数据的表都导出完成,基于这个心中就有了第一层底气,所有表数据不会丢失(因为都导出到了这个dmp中),但是非表的字典数据不完整,要想业务完整跑起来,需要找到一个完整的业务字典信息.对于大量的备份dmp被删除,然后对应分区还写入了很多数据,只能尝试看运气,通过对磁盘文件镜像,然后进行反删除恢复,找出来一个11月26日的dmp的压缩文件是完整的
good-dmp


通过这个dmp导入业务字典信息,然后再利用expdp dmp解析工具(expdp dmp被加密破坏恢复)把所有表数据出来,经过这两者组合,顺利完成数恢复,可以测试业务完全正常

发表在 Oracle, 非常规恢复 | 标签为 , , , | 评论关闭

system表空间丢失部分文件恢复

有客户因为system表空间有一个数据文件放在其他位置,当时没有正常拷贝出来(备份了oradata路径下面文件,遗漏了一个system文件),尝试启动库报ORA-01157 ORA-01147等错误

[oracle@xifenfei check_db]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sun Oct 5 21:13:28 2025

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover datafile 1;
Media recovery complete. 
SQL> recover datafile 2,3,4,5,6,7,8,9,10;   
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 11 - see DBWR trace file
ORA-01110: data file 11:
'/u01/app/oracle/product/11.2.0.4/db_1/dbs/path_to_datafile.dbf'

SQL> alter database datafile 11 offline drop;

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01147: SYSTEM tablespace file 11 is offline
ORA-01110: data file 11:
'/u01/app/oracle/product/11.2.0.4/db_1/dbs/path_to_datafile.dbf'

alert日志报错信息

Sun Oct 05 22:35:01 2025
alter database open
Sun Oct 05 22:35:01 2025
Errors in file /data/app/oracle/diag/rdbms/mtxdb1/mtxdb1/trace/mtxdb1_dbw0_5946.trc:
ORA-01157: cannot identify/lock data file 11 - see DBWR trace file
ORA-01110: data file 11: '/u01/app/oracle/product/11.2.0.4/db_1/dbs/path_to_datafile.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /data/app/oracle/diag/rdbms/mtxdb1/mtxdb1/trace/mtxdb1_ora_11264.trc:
ORA-01157: cannot identify/lock data file 11 - see DBWR trace file
ORA-01110: data file 11: '/u01/app/oracle/product/11.2.0.4/db_1/dbs/path_to_datafile.dbf'
ORA-1157 signalled during: alter database open...
Sun Oct 05 22:35:25 2025
alter database datafile 11 offline 
ORA-1145 signalled during: alter database datafile 11 offline ...
alter database datafile 11 offline drop
Completed: alter database datafile 11 offline drop
alter database open
Errors in file /data/app/oracle/diag/rdbms/mtxdb1/mtxdb1/trace/mtxdb1_ora_11264.trc:
ORA-01147: SYSTEM tablespace file 11 is offline
ORA-01110: data file 11: '/u01/app/oracle/product/11.2.0.4/db_1/dbs/path_to_datafile.dbf'
ORA-1147 signalled during: alter database open...

由于11号文件是system表空间的一个数据文件,对于这种数据文件丢失无法offline该数据文件,然后open库(也就是说在open库的时候,system表空间的数据文件必须全部online,如果有部分文件offline就会报ORA-01147).对于这样的情况,以前有过类似恢复经历:bbed打开丢失部分system数据文件库,这次的编写了一个m_scn程序实现快速处理

[oracle@xifenfei  tmp]$ cat 1.txt
1@/data/app/oracle/oradata/mtxdb1/system01.dbf
11@/tmp/11.dbf
[oracle@xifenfei  tmp]$ ./m_scn 1.txt

-------------Is processing datafile:/tmp/11.dbf-------------
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.000835728 s, 1.3 GB/s

[oracle@xifenfei tmp]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Wed Oct 8 11:27:32 2025

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_TIME                          FUZ CHECKPOINT_CHANGE#          ROW_NUM
------- ---------------------------------------- --- ------------------ ----------------
OFFLINE 2025-10-02 06:50:06                      NO      17328662858685                1
ONLINE  2025-10-02 06:50:06                      NO      17328662858685               10


SQL> alter database datafile 11 online;

Database altered.

然后重建ctl,并尝试打开库
ctl_re


然后查询11号文件中涉及的对象情况

SQL> select distinct owner,segment_name,segment_type from dba_extents where file_id=11;

OWNER                          SEGMENT_NAME                           SEGMENT_TYPE
------------------------------ -------------------------------------- ------------------
SYS                            SYSTEM                                 ROLLBACK
SYS                            I_COL1                                 INDEX
SYS                            AUD$                                   TABLE

SQL> select owner,segment_name from dba_segments where HEADER_FILE=11;

no rows selected

证明丢失的11号文件(system表空间文件),涉及的对象较少,而且不涉及核心字典,比如tab$,obj$,col$等非常核心对象,评估理论上应该不涉业务数据丢失,尝试直接expdp导出数据,但是很不幸,报ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018]错误

. . exported "XFF020"."OTHERBILLDETAIL_DEL"              6.405 MB  126048 rows
. . exported "XFF020"."POSSOLDOUT"                       7.784 MB  281413 rows
ORA-31693: Table data object "XFF020"."MATERIELTRAN" failed to load/unload and is being skipped due to error:
ORA-39068: invalid master table data in row with PROCESS_ORDER=159:1000001
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPF$FILE", line 3720
ORA-06512: at line 1
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.UNLOAD_DATA [TABLE_DATA:"XFF020"."MATERIELTRAN"] 
UPDATE "SYS"."SYS_EXPORT_FULL_01" SET processing_state = :1, processing_status = :2
    WHERE process_order = :3 AND duplicate = 0
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPW$WORKER", line 7866
ORA-31693: Table data object "XFF020"."MATERIELTRAN" failed to load/unload and is being skipped due to error:
ORA-39068: invalid master table data in row with PROCESS_ORDER=159:1000001
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPF$FILE", line 3720
ORA-06512: at line 1

ORA-06512: at "SYS.DBMS_SYS_ERROR", line 105
ORA-06512: at "SYS.KUPW$WORKER", line 9721

----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xef2fc508     21979  package body SYS.KUPW$WORKER
0xef2fc508      9742  package body SYS.KUPW$WORKER
0xef2fc508      3437  package body SYS.KUPW$WORKER
0xef2fc508     10436  package body SYS.KUPW$WORKER
0xef2fc508      1824  package body SYS.KUPW$WORKER
0xef2feb20         2  anonymous block

ORA-39097: Data Pump job encountered unexpected error -607
ORA-39065: unexpected master process exception in DISPATCH
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []

ORA-31693: Table data object "XFF020"."ANALYSEREPORT" failed to load/unload and is being skipped due to error:
ORA-39068: invalid master table data in row with PROCESS_ORDER=161:1000001
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPF$FILE", line 3720
ORA-06512: at line 1
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.UNLOAD_DATA [TABLE_DATA:"XFF020"."ANALYSEREPORT"] 
UPDATE "SYS"."SYS_EXPORT_FULL_01" SET processing_state = :1, processing_status = :2
   WHERE process_order = :3 AND duplicate = 0
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPW$WORKER", line 7866
ORA-31693: Table data object "XFF020"."ANALYSEREPORT" failed to load/unload and is being skipped due to error:
ORA-39068: invalid master table data in row with PROCESS_ORDER=161:1000001
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPF$FILE", line 3720
ORA-06512: at line 1

ORA-06512: at "SYS.DBMS_SYS_ERROR", line 105
ORA-06512: at "SYS.KUPW$WORKER", line 9721

----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xef2fc508     21979  package body SYS.KUPW$WORKER
0xef2fc508      9742  package body SYS.KUPW$WORKER
0xef2fc508      3437  package body SYS.KUPW$WORKER
0xef2fc508     10436  package body SYS.KUPW$WORKER
0xef2fc508      1824  package body SYS.KUPW$WORKER
0xef2feb20         2  anonymous block

ORA-31693: Table data object "XFF020CW"."MATERIELTRAN" failed to load/unload and is being skipped due to error:
ORA-39068: invalid master table data in row with PROCESS_ORDER=160:1000001
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPF$FILE", line 3720
ORA-06512: at line 1
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.UNLOAD_DATA [TABLE_DATA:"XFF020CW"."MATERIELTRAN"] 
UPDATE "SYS"."SYS_EXPORT_FULL_01" SET processing_state = :1, processing_status = :2
   WHERE process_order = :3 AND duplicate = 0
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPW$WORKER", line 7866
ORA-31693: Table data object "XFF020CW"."MATERIELTRAN" failed to load/unload and is being skipped due to error:
ORA-39068: invalid master table data in row with PROCESS_ORDER=160:1000001
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [11], [3], [18018], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPF$FILE", line 3720
ORA-06512: at line 1

ORA-06512: at "SYS.DBMS_SYS_ERROR", line 105
ORA-06512: at "SYS.KUPW$WORKER", line 9721

----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xef2fc508     21979  package body SYS.KUPW$WORKER
0xef2fc508      9742  package body SYS.KUPW$WORKER
0xef2fc508      3437  package body SYS.KUPW$WORKER
0xef2fc508     10436  package body SYS.KUPW$WORKER
0xef2fc508      1824  package body SYS.KUPW$WORKER
0xef2feb20         2  anonymous block

Job "SYS"."SYS_EXPORT_FULL_01" stopped due to fatal error at Wed Oct 8 11:59:29 2025 elapsed 0 00:18:48

对ORA-600 kdBlkCheckError进行分析分析(11表示文件号,3表示block),是由于导出生成的master表写入在system表空间,而system表空间中的file# 11是人工构造出来的,block 3 是位图分配信息(该信息和实际字典中存储信息不匹配),所以导致出现该错误,对于这个问题解决方法为expdp写master表不在system表空间即可,通过该操作,顺利导出数据,完成本次恢复任务
expdp_ok


发表在 非常规恢复 | 标签为 , , , , | 评论关闭