标签归档:asm disk 格式化

再一起asm disk被格式化成ext3文件系统故障恢复

国庆节前夕接到朋友求救电话asm disk被格式化成ext3格式了,具体操作如下
20191006205129


20191006205203

并且把这个分区直接挂载到/目录
20191006205239

由于/被挂载新格式化的控盘,导致asm磁盘组访问其他盘报错

Sun Sep 29 18:15:02 2019
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_b000_8094.trc:
ORA-15025: could not open disk "/dev/asmdisk/sdh"
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_b000_8094.trc:
ORA-15025: could not open disk "/dev/asmdisk/sdh"
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
WARNING: cache failed reading from group=1(DATA) fn=9 blk=0 count=1 from 
disk= 5 (DATA_0005) kfkist=0x20 status=0x02 osderr=0x0 file=kfc.c line=11596
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_b000_8094.trc:
ORA-15025: could not open disk "/dev/asmdisk/sdh"
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-15080: synchronous I/O operation to a disk failed
WARNING: cache succeeded reading from group=1(DATA) fn=9 blk=0 count=1 from 
disk= 7 (DATA_0007) kfkist=0x20 status=0x01 osderr=0x0 file=kfc.c line=11637
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_b000_8094.trc:
ORA-15025: could not open disk "/dev/asmdisk/sdh"
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
WARNING: PST-initiated drop of 1 disk(s) in group 1(.2380027701))

重启系统之后,重试mount 磁盘组

GMON dismounting group 1 at 18 for pid 31, osid 44279
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0003 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0004 in mode 0x1 marked for de-assignment
NOTE: Disk DATA_0005 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0006 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0007 in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0009 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0010 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0011 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0012 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0013 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0014 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "8" is missing from group number "1" 
ERROR: ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:2587:2} */

由于sdb(asm disk 8)被格式化,导致data磁盘组无法正常mount.这个客户运气比较好data 磁盘组是normal模式,但是由于mount到/,导致disk 4被强制drop,因此无法mount成功,但是通过一系列处理数据实现完美恢复,0数据丢失
20191006211707


如果磁盘组是外部冗余,请参考:
又一例asm格式化文件系统恢复
一次完美的asm disk被格式化ntfs恢复
oracle asm disk格式化恢复—格式化为ext4文件系统
oracle asm disk格式化恢复—格式化为ntfs文件系统

发表在 非常规恢复 | 标签为 , , , | 评论关闭

oracle asm disk格式化恢复—格式化为ntfs文件系统

接到网友请求,由于操作人员粗心把asm disk的磁盘映射到另外的机器上,并且格式化为了win ntfs文件系统,导致asm 磁盘组异常,数据库无法使用
asm 日志报ORA-27072错

Mon Nov 30 12:00:13 2015
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27070: async read/write failed
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 21) 设备未就绪。
WARNING: IO Failed. group:1 disk(number.incarnation):0.0xf0f0bbfb disk_path:\\.\ORCLDISKDATA0
	 AU:1 disk_offset(bytes):2093056 io_size:4096 operation:Write type:synchronous
	 result:I/O error process_id:868
WARNING: disk 0.4042308603 (DATA_0000) not responding to heart beat
ERROR: too many offline disks in PST (grp 1)
WARNING: Disk DATA_0000 in mode 0x7f will be taken offline
Mon Nov 30 12:00:13 2015
NOTE: process 576:37952 initiating offline of disk 0.4042308603 (DATA_0000) with mask 0x7e in group 1
WARNING: Disk DATA_0000 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 1, dsk = 0/0xf0f0bbfb, mode = 0x15
kfdp_updateDsk(): 5 
kfdp_updateDskBg(): 5 
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
ERROR: no PST quorum in group: required 1, found 0
WARNING: Disk DATA_0000 in mode 0x7f offline aborted
Mon Nov 30 12:00:14 2015
SQL> alter diskgroup DATA dismount force /* ASM SERVER */ 
NOTE: cache dismounting (not clean) group 1/0xBB404B03 (DATA) 
Mon Nov 30 12:00:14 2015
NOTE: halting all I/Os to diskgroup DATA
Mon Nov 30 12:00:14 2015
NOTE: LGWR doing non-clean dismount of group 1 (DATA)
NOTE: LGWR sync ABA=367.7265 last written ABA 367.7265
NOTE: cache dismounted group 1/0xBB404B03 (DATA) 
kfdp_dismount(): 6 
kfdp_dismountBg(): 6 
NOTE: De-assigning number (1,0) from disk (\\.\ORCLDISKDATA0)
NOTE: De-assigning number (1,1) from disk (\\.\ORCLDISKDATA1)
NOTE: De-assigning number (1,2) from disk (\\.\ORCLDISKDATA2)
NOTE: De-assigning number (1,3) from disk (\\.\ORCLDISKDATA3)
NOTE: De-assigning number (1,4) from disk (\\.\ORCLDISKDATA4)
NOTE: De-assigning number (1,5) from disk (\\.\ORCLDISKDATA5)
NOTE: De-assigning number (1,6) from disk (\\.\ORCLDISKDATA6)
NOTE: De-assigning number (1,7) from disk (\\.\ORCLDISKDATA7)
SUCCESS: diskgroup DATA was dismounted
NOTE: cache deleting context for group DATA 1/-1153414397
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER */
ERROR: PST-initiated MANDATORY DISMOUNT of group DATA

这里的asm日志很明显由于asm disk无法正常访问,报ORA-27072错误,磁盘组强制dismount.

分析磁盘情况
asm-disk1
asm-disk2


通过与客户沟通,确定从I到O本为asm disk 被格式化为了NTFS文件系统的磁盘,结合asmtool分析可以发现还有一个asm disk没有格式化掉,该磁盘组中一个共有8个磁盘格式化掉了7个.

通过kfed分析磁盘信息

C:\Users\Administrator>kfed read '\\.\J:'
kfbh.endian:                        235 ; 0x000: 0xeb
kfbh.hard:                           82 ; 0x001: 0x52
kfbh.type:                          144 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         78 ; 0x003: 0x4e
kfbh.block.blk:               542328404 ; 0x004: T=0 NUMB=0x20534654
kfbh.block.obj:                 2105376 ; 0x008: TYPE=0x0 NUMB=0x2020
kfbh.check:                        2050 ; 0x00c: 0x00000802
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                    63488 ; 0x014: 0x0000f800
kfbh.spare1:                   16711743 ; 0x018: 0x00ff003f
kfbh.spare2:                       2048 ; 0x01c: 0x00000800
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=2
kfbh.endian:                         70 ; 0x000: 0x46
kfbh.hard:                           73 ; 0x001: 0x49
kfbh.type:                           76 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         69 ; 0x003: 0x45
kfbh.block.blk:                  196656 ; 0x004: T=0 NUMB=0x30030
kfbh.block.obj:                33563364 ; 0x008: TYPE=0x0 NUMB=0x22e4
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                    65537 ; 0x010: 0x00010001
kfbh.fcn.wrap:                    65592 ; 0x014: 0x00010038
kfbh.spare1:                        416 ; 0x018: 0x000001a0
kfbh.spare2:                       1024 ; 0x01c: 0x00000400
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=256
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                           13 ; 0x002: KFBTYP_PST_NONE
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:              2147483648 ; 0x004: T=1 NUMB=0x0
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                    17662471 ; 0x00c: 0x010d8207
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=510
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: T=0 NUMB=0xfe
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                   717599272 ; 0x00c: 0x2ac5b228
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:    ORCLDISKDATA6 ; 0x000: length=13
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:           54 ; 0x00c: 0x00000036
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
…………

通过分析,可以确定asm disk的备份block没有被覆盖,原则上可以通过备份block实现磁盘组恢复,从而减小了恢复难度

kfed恢复磁盘头

C:\Users\Administrator> kfed repair '\\.\J:'
C:\Users\Administrator>kfed read '\\.\J:' 
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: T=0 NUMB=0xfe
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                   717599272 ; 0x00c: 0x2ac5b228
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:    ORCLDISKDATA6 ; 0x000: length=13
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:           54 ; 0x00c: 0x00000036
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
…………

确定asm disk相关信息
对于7个被格式化的磁盘都进行类似处理之后,通过工具看到相关磁盘信息如下
asm-disk3


恢复处理
根据ntfs的文件系统分布,我们可以知道,虽然asm disk header备份block正常,但是asm disk中间部分依旧有不少au会被破坏
ntfs


这样的情况,不合适直接使用工具拷贝出来datafile(由于可能记录block的字典正好被覆盖,导致拷贝出来的文件异常,在恢复过程中我们也做了试验小文件拷贝ok,大文件拷贝然后使用dbv检测有很多坏块),我们采用工具(asm disk header 彻底损坏恢复)从底层扫描直接重组出来asm disk中的数据文件,然后结合拷贝出来的控制文件,redo文件,参数文件,然后通过重命名相关路径,然后直接open数据库

Q:\>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on 星期三 1月 22 16:08:18 2014

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


SQL> set pages 1000
SQL> col name for a100
SQL> set lines 150
SQL> select file#,name from v$datafile;

     FILE# NAME
---------- --------------------------------------------------------------------
         1 +DATA/vspdb/datafile/system.256.778520603
         2 +DATA/vspdb/datafile/sysaux.257.778520603
         3 +DATA/vspdb/datafile/undotbs1.258.778520603
         4 +DATA/vspdb/datafile/users.259.778520603
         5 +DATA/vspdb/datafile/vsp_tbs.293.779926097
        …………
       147 +DATA/vspdb/datafile/index_dg.418.864665747
       148 +DATA/vspdb/datafile/data_dg.419.864667053
       149 +DATA/vspdb/datafile/vsp_mm_tbs.420.890410367
       150 +DATA/vspdb/datafile/vsp_mm_tbs.421.890410457

SQL> select member from v$logfile;

MEMBER
-------------------------------------------------------------------------------------
+DATA/vspdb/onlinelog/group_7.263.862676593
+DATA/vspdb/onlinelog/group_7.262.862676601
+DATA/vspdb/onlinelog/group_4.410.862652291
+DATA/vspdb/onlinelog/group_4.411.862652307
+DATA/vspdb/onlinelog/group_5.412.862653715
+DATA/vspdb/onlinelog/group_5.413.862653727
+DATA/vspdb/onlinelog/group_6.414.862676425
+DATA/vspdb/onlinelog/group_6.415.862676433

重命名数据文件和redo文件,open数据库

SQL> recover database;
完成介质恢复。
SQL> alter database open;

数据库已更改。

已用时间:  00: 00: 04.51

由于部分block被覆盖,使用空块代替,导致数据访问到该block就会出现ora-8103(模拟普通ORA-08103并解决,模拟极端ORA-08103并解决)错误,对于该种对象,最简单处理方法就是直接通过dul抽出来数据然后truncate table重新导入数据,当然如果你想彻底安全逻辑方式重建库最靠谱
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:13429648788    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

发表在 Oracle ASM, 非常规恢复 | 标签为 , , , , | 评论关闭