
Recovering lost ASM disk partitions

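A friend reported that after setting up an active-active configuration on their xx storage and rebooting the host, GI would not start. Analysis showed that the partition information on every disk from that storage had been lost, so asmlib could not discover the disks (the ASM disks were built on partitions).
The symptoms looked like the following (disk partitions missing):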

--fdisk -l (partial output)
Disk /dev/mapper/datahds1: 1099.5 GB, 1099511627776 bytes
255 heads, 63 sectors/track, 133674 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

--ls -l /dev/mapper/ shows no partition entries
lrwxrwxrwx 1 root root      7 May  6 03:44 datahds1 -> ../dm-1
lrwxrwxrwx 1 root root      7 May  6 03:26 datahds2 -> ../dm-3
lrwxrwxrwx 1 root root      7 May  6 03:26 datahds3 -> ../dm-8
lrwxrwxrwx 1 root root      7 May  6 03:26 ocrhds1 -> ../dm-0
lrwxrwxrwx 1 root root      7 May  6 03:26 ocrhds2 -> ../dm-2
lrwxrwxrwx 1 root root      7 May  6 03:26 ocrhds3 -> ../dm-4

The ASM alert log shows

SUCCESS: diskgroup DATADG was mounted
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 3
SUCCESS: diskgroup OCRHDS was mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

Analyzing the system log

May  6 02:23:27 db2 kernel: sdb: unknown partition table
May  6 02:23:27 db2 kernel: sde: unknown partition table
May  6 02:23:27 db2 kernel: sdc: unknown partition table
May  6 02:23:27 db2 kernel: sdf: unknown partition table
May  6 02:23:27 db2 kernel: sdd: unknown partition table
May  6 02:23:27 db2 kernel: sdj:Dev sdj: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sdi: sdi1
May  6 02:23:27 db2 kernel: sdk: sdk1
May  6 02:23:27 db2 kernel: sdg: unknown partition table
May  6 02:23:27 db2 kernel: sdl: sdl1
May  6 02:23:27 db2 kernel: sdm:Dev sdm: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sdo:Dev sdo: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sdn:Dev sdn: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sdp:Dev sdp: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sds:Dev sds: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sdh:
May  6 02:23:27 db2 kernel: sdt: sdt1
May  6 02:23:27 db2 kernel: sdv:Dev sdv: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sdq:Dev sdq: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sd 1:0:1:9: [sdr] Very big device. Trying to use READ CAPACITY(16).
May  6 02:23:27 db2 kernel: sdr:Dev sdr: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sd 2:0:0:9: [sdab] Very big device. Trying to use READ CAPACITY(16).
May  6 02:23:27 db2 kernel: sdab: unknown partition table
May  6 02:23:27 db2 kernel: sdac: unknown partition table
May  6 02:23:27 db2 kernel: sdw: sdw1
May  6 02:23:27 db2 kernel: sdu:Dev sdu: unable to read RDB block 0
May  6 02:23:27 db2 kernel: unable to read partition table
May  6 02:23:27 db2 kernel: sdx: sdx1
May  6 02:23:27 db2 kernel: sdy: sdy1
May  6 02:23:27 db2 kernel: sdaa: sdaa1
May  6 02:23:27 db2 kernel: sdz: sdz1
May  6 02:23:27 db2 kernel: sdae: unknown partition table
May  6 02:23:27 db2 kernel: sdaf: unknown partition table
May  6 02:23:27 db2 kernel: sdag: unknown partition table
May  6 02:23:27 db2 kernel: sdai:
May  6 02:23:27 db2 kernel: sdah: unknown partition table
May  6 02:23:27 db2 kernel: sdad: unknown partition table
May  6 02:23:28 db2 mcelog: failed to prefill DIMM database from DMI data

The error here is fairly obvious: unknown partition table. The partition information on the disks is damaged, and fdisk cannot find any partitions.
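With this many devices, it can help to group the kernel messages by outcome instead of reading them line by line. The following is a minimal sketch, assuming syslog lines in the same shape as the excerpt above; the function name and the category labels are made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as "... kernel: sdb: unknown partition table",
# "... kernel: sdi: sdi1" and "... kernel: sdj:Dev sdj: unable to read RDB block 0".
SCAN_LINE = re.compile(r"kernel:\s+(sd[a-z]+):\s*(.*)$")


def classify_partition_scan(lines):
    """Group device names by what the kernel reported while scanning partitions."""
    result = defaultdict(list)
    for line in lines:
        m = SCAN_LINE.search(line)
        if not m:
            continue  # continuation lines and unrelated messages
        dev, rest = m.group(1), m.group(2).strip()
        if re.fullmatch(r"(?:%s\d+\s*)+" % re.escape(dev), rest):
            result["partitioned"].append(dev)           # e.g. "sdi: sdi1"
        elif "unknown partition table" in rest:
            result["unknown_table"].append(dev)
        elif "unable to read" in rest:
            result["unreadable"].append(dev)
        else:
            result["no_partitions_listed"].append(dev)  # e.g. "sdh:"
    return dict(result)
```

Fed the log above, the `unknown_table` and `unreadable` groups give the list of paths that need a closer look, while `partitioned` shows which devices still expose a partition.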

partprobe does not help either

[root@db2 oracle]# partprobe /dev/mapper/ocrhds3
[root@db2 oracle]# 
[root@db2 oracle]# ls -l /dev/mapper/ocrhds3*
lrwxrwxrwx 1 root root 7 May  6 07:30 /dev/mapper/ocrhds3 -> ../dm-4

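From the information above, the partition tables on the disks are most likely damaged. All we can do now is hope that we are lucky and the actual data inside the partitions is intact.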

Analyzing the actual partition data on disk

[root@db2 ~]$ dd if=/dev/mapper/datahds1 of=/tmp/datahds1.dd bs=1024k count=50
[root@db2 ~]$ dd if=/tmp/datahds1.dd of=/tmp/xff01.dd  bs=32256 skip=1
[grid@db2 ~]$ kfed read /tmp/xff01.dd |more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3110278718 ; 0x00c: 0xb963163e
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKHDSDATA1 ; 0x000: length=16
kfdhdb.driver.reserved[0]:   1146307656 ; 0x008: 0x44534448
kfdhdb.driver.reserved[1]:    826364993 ; 0x00c: 0x31415441
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:             DATADG_0000 ; 0x028: length=11
kfdhdb.grpname:                  DATADG ; 0x048: length=6
kfdhdb.fgname:              DATADG_0000 ; 0x068: length=11
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             33050696 ; 0x0a8: HOUR=0x8 DAYS=0x2 MNTH=0x4 YEAR=0x7e1
kfdhdb.crestmp.lo:           3813740544 ; 0x0ac: USEC=0x0 MSEC=0x44 SECS=0x35 MINS=0x38
kfdhdb.mntstmp.hi:             33050701 ; 0x0b0: HOUR=0xd DAYS=0x2 MNTH=0x4 YEAR=0x7e1
kfdhdb.mntstmp.lo:            411385856 ; 0x0b4: USEC=0x0 MSEC=0x150 SECS=0x8 MINS=0x6

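From the analysis above we can tentatively conclude that the data in the partition is very likely intact: the ASM disk header is good, and since overwrites generally proceed from the start of a device toward the end, an intact header means the blocks after it are very unlikely to have been overwritten.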
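When the partition start is not known in advance, the dumped image can be scanned for a block that looks like an ASM disk header. This is a rough sketch, not a replacement for kfed. It relies only on two layout facts visible in the kfed output above: kfbh.type at byte 2 equals 1 (KFBTYP_DISKHEAD), and the provision string beginning with ORCLDISK follows the 32-byte kfbh header. The function and constant names are hypothetical.

```python
SECTOR = 512
PROVSTR_OFFSET = 32       # kfdhdb.driver.provstr follows the 32-byte kfbh header
SIGNATURE = b"ORCLDISK"   # start of the provision string, as in the kfed output


def find_asm_header_offsets(path, limit_bytes=50 * 1024 * 1024):
    """Return sector-aligned byte offsets whose block looks like an ASM disk header."""
    with open(path, "rb") as f:
        data = f.read(limit_bytes)
    hits = []
    last_start = len(data) - PROVSTR_OFFSET - len(SIGNATURE)
    for off in range(0, last_start + 1, SECTOR):
        sig_at = off + PROVSTR_OFFSET
        if data[off + 2] == 1 and data[sig_at:sig_at + len(SIGNATURE)] == SIGNATURE:
            hits.append(off)
    return hits
```

On a disk with the legacy 255-head, 63-sectors-per-track geometry shown by fdisk above, a first partition conventionally starts at sector 63, so a first hit at 63 * 512 = 32256 would be the expected result. Later hits may be other header-like blocks, such as a backup copy of the header, so each candidate should still be confirmed with kfed.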

Prepare new disks and dd the partition contents directly onto the new devices

dd if=/dev/mapper/ocrhds1 of=/dev/mapper/ocrhdsnew1 skip=1 bs=32256
dd if=/dev/mapper/ocrhds2 of=/dev/mapper/ocrhdsnew2 skip=1 bs=32256
dd if=/dev/mapper/ocrhds3 of=/dev/mapper/ocrhdsnew3 skip=1 bs=32256
dd if=/dev/mapper/datahds1 of=/dev/mapper/datahdsnew1 skip=1 bs=32256
dd if=/dev/mapper/datahds2 of=/dev/mapper/datahdsnew2 skip=1 bs=32256
dd if=/dev/mapper/datahds3 of=/dev/mapper/datahdsnew3 skip=1 bs=32256
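What these dd commands do can be expressed as a short sketch: skip the gap in front of the first partition and copy the remaining bytes to the start of the new device, so that the ASM disk header lands at offset 0 of the target. The function name and the chunk size are illustrative choices, and the offset must be the partition start confirmed earlier with kfed.

```python
def copy_from_offset(src_path, dst_path, offset, chunk_size=1024 * 1024):
    """Copy everything in src after `offset` bytes to the start of dst.

    Returns the number of bytes copied. dst must already exist
    (a block device, or a pre-created file when testing).
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        src.seek(offset)
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

Reading in 1 MB chunks after a single seek issues far fewer I/O requests than a block size equal to the offset, while producing the same bytes on the target.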

Rescan the disks with asmlib

[root@db1 disks]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "HDSOCR3"
Instantiating disk "HDSDATA2"
Instantiating disk "HDSDATA1"
Instantiating disk "HDSDATA3"
Instantiating disk "HDSOCR1"
Instantiating disk "HDSOCR2"
[root@db1 disks]# ls -ltr
total 0
brw-rw---- 1 grid asmadmin  8, 160 May  6 13:49 HDSOCR3
brw-rw---- 1 grid asmadmin  8, 192 May  6 13:49 HDSDATA2
brw-rw---- 1 grid asmadmin  8, 176 May  6 13:49 HDSDATA1
brw-rw---- 1 grid asmadmin  8, 208 May  6 13:49 HDSDATA3
brw-rw---- 1 grid asmadmin  8, 128 May  6 13:49 HDSOCR1
brw-rw---- 1 grid asmadmin  8, 144 May  6 13:49 HDSOCR2

Verify the copied partition with kfed

[root@db2 tmp]# /oracle/app/11.2.0/grid_1/bin/kfed read /dev/oracleasm/disks/HDSDATA1
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3110278718 ; 0x00c: 0xb963163e
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKHDSDATA1 ; 0x000: length=16
kfdhdb.driver.reserved[0]:   1146307656 ; 0x008: 0x44534448
kfdhdb.driver.reserved[1]:    826364993 ; 0x00c: 0x31415441
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:             DATADG_0000 ; 0x028: length=11
kfdhdb.grpname:                  DATADG ; 0x048: length=6
kfdhdb.fgname:              DATADG_0000 ; 0x068: length=11
kfdhdb.capname:                         ; 0x088: length=0
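Comparing two kfed dumps by eye is error-prone, so a small parser can turn the output into a dictionary and check the fields this case depends on. This is a minimal sketch that assumes the `name: value ; offset: annotation` line shape shown above; the helper names are hypothetical.

```python
def parse_kfed(text):
    """Parse `kfed read` output into {field: (value, annotation)}."""
    fields = {}
    for line in text.splitlines():
        if ";" not in line:
            continue
        left, annotation = line.split(";", 1)
        if ":" not in left:
            continue
        name, value = left.split(":", 1)
        fields[name.strip()] = (value.strip(), annotation.strip())
    return fields


def header_looks_valid(fields):
    """Check block type, header status and disk name, as read from the dump."""
    block_type = fields.get("kfbh.type", ("", ""))[1]
    status = fields.get("kfdhdb.hdrsts", ("", ""))[1]
    disk_name = fields.get("kfdhdb.dskname", ("", ""))[0]
    return (
        block_type.endswith("KFBTYP_DISKHEAD")
        and status.endswith("KFDHDR_MEMBER")
        and disk_name != ""
    )
```

Parsing the dump taken from the image and the dump taken from the new device, then comparing the two dictionaries, confirms that the copy preserved the header, for example that kfbh.check is 3110278718 in both.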

ASM and the database start normally

[grid@db2 ~]$ asmcmd
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576   3145710  2378034                0         2378034              0             N  DATADG/
MOUNTED  NORMAL  N         512   4096  1048576     15342    14416             5114            4651              0             Y  OCRHDS/
ASMCMD> 

[oracle@db2 ~]$ sqlplus  / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sat May 6 13:54:21 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 3.6077E+10 bytes
Fixed Size                  2260648 bytes
Variable Size            7247757656 bytes
Database Buffers         2.8723E+10 bytes
Redo Buffers              104382464 bytes
Database mounted.
Database opened.
SQL> 

With the recovery above, the lost ASM disk partitions were restored with zero data loss.

Recovering a disk group damaged by adding an ASM disk

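We received a recovery request from a customer who wanted us to step in and recover their data. The rough sequence of events: in September 2016 a hardware problem caused one disk of a normal-redundancy disk group (which had only 2 disks) to be lost. On March 6, 2017 the operations team tried to add that disk back into the disk group. After the add succeeded with the force option, the disk group dismounted and could not be mounted again.
Disk group creation information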

Fri Jun 24 19:31:38 2016
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
SUCCESS: diskgroup DATADG was mounted
SUCCESS: CREATE DISKGROUP DATADG NORMAL REDUNDANCY  DISK '/dev/asm-diskdata01' SIZE 1048576M ,
'/dev/asm-diskdata02' SIZE 1048576M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='4M' /* ASMCA */

This shows that datadg is a normal-redundancy disk group with a 4M AU.

Automatic drop of the failed ASM disk

Mon Sep 12 11:41:54 2016
WARNING: Waited 15 secs for write IO to PST disk 1 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 1.
Mon Sep 12 11:41:55 2016
NOTE: process _b000_+asm1 (19491) initiating offline of disk 1.3915923833 (DATADG_0001) with mask 0x7e in group 1
NOTE: checking PST: grp = 1
GMON checking disk modes for group 1 at 9 for pid 29, osid 19491
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: checking PST for grp 1 done.
NOTE: sending set offline flag message 2870990318 to 1 disk(s) in group 1
WARNING: Disk DATADG_0001 in mode 0x7f is now being offlined
NOTE: initiating PST update: grp = 1, dsk = 1/0xe9684179, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 10 for pid 29, osid 19491
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: PST update grp = 1 completed successfully 
NOTE: initiating PST update: grp = 1, dsk = 1/0xe9684179, mask = 0x7e, op = clear
GMON updating disk modes for group 1 at 11 for pid 29, osid 19491
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: cache closing disk 1 of grp 1: DATADG_0001
NOTE: PST update grp = 1 completed successfully 
Mon Sep 12 11:42:55 2016
WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
Mon Sep 12 11:44:58 2016
WARNING: PST-initiated drop of 1 disk(s) in group 1(.1137226115))
SQL> alter diskgroup DATADG drop disk DATADG_0001 force /* ASM SERVER */ 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Mon Sep 12 11:44:59 2016
GMON updating for reconfiguration, group 1 at 12 for pid 29, osid 19491
NOTE: cache closing disk 1 of grp 1: (not open) DATADG_0001
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: group 1 PST updated.
Mon Sep 12 11:44:59 2016
NOTE: membership refresh pending for group 1/0x43c8b183 (DATADG)
Mon Sep 12 11:45:02 2016
NOTE: successfully read ACD block gn=1 blk=0 via retry read
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_lgwr_3526.trc:
ORA-15062: ASM disk is globally closed
GMON querying group 1 at 13 for pid 18, osid 3532
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x43c8b183 (DATADG)
SUCCESS: alter diskgroup DATADG drop disk DATADG_0001 force /* ASM SERVER */
NOTE: starting rebalance of group 1/0x43c8b183 (DATADG) at power 1
SUCCESS: PST-initiated drop disk in group 1(1137226115))
Starting background process ARB0
Mon Sep 12 11:45:03 2016
ARB0 started with pid=35, OS id=19945 
NOTE: assigning ARB0 to group 1/0x43c8b183 (DATADG) with 1 parallel I/O
cellip.ora not found.
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATADG
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
Mon Sep 12 11:46:21 2016
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Mon Sep 12 11:46:24 2016
GMON updating for reconfiguration, group 1 at 14 for pid 36, osid 20110
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
NOTE: group 1 PST updated.
WARNING: offline disk number 1 has references (54679 AUs)
Mon Sep 12 11:46:24 2016
NOTE: membership refresh pending for group 1/0x43c8b183 (DATADG)
GMON querying group 1 at 15 for pid 18, osid 3532
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x43c8b183 (DATADG)
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x43c8b183 (DATADG)

Here we can see that on September 12, 2016, because disk 1 stopped responding, ASM kicked it out of the disk group.

Adding the force-dropped disk back

Mon Mar 06 15:36:54 2017
SQL> alter diskgroup DATADG add disk '/dev/asm-diskdata01' name DATADG_0000 
NOTE: GroupBlock outside rolling migration privileged region
ORA-15032: not all alterations performed
ORA-15029: disk '/dev/asm-diskdata01' is already mounted by this instance
ERROR: alter diskgroup DATADG add disk '/dev/asm-diskdata01' name DATADG_0000
Mon Mar 06 15:38:27 2017
SQL>  alter diskgroup DATADG add disk '/dev/asm-diskdata02' name DATADG_0001 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: requesting all-instance membership refresh for group=1
NOTE: Disk DATADG_0001 in mode 0x7f marked for de-assignment
ERROR: ORA-15033 signalled during reconfiguration of diskgroup DATADG
Mon Mar 06 15:38:28 2017
NOTE: membership refresh pending for group 1/0x31584f6b (DATADG)
Mon Mar 06 15:38:31 2017
GMON querying group 1 at 7 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
GMON querying group 1 at 8 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)
ORA-15032: not all alterations performed
ORA-15033: disk '/dev/asm-diskdata02' belongs to diskgroup "DATADG"
ERROR:  alter diskgroup DATADG add disk '/dev/asm-diskdata02' name DATADG_0001
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
Mon Mar 06 16:04:14 2017
SQL> alter diskgroup DATADG add disk '/dev/asm-diskdata02' name DATADG_0001 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: requesting all-instance membership refresh for group=1
NOTE: Disk DATADG_0001 in mode 0x7f marked for de-assignment
ERROR: ORA-15033 signalled during reconfiguration of diskgroup DATADG
Mon Mar 06 16:04:15 2017
NOTE: membership refresh pending for group 1/0x31584f6b (DATADG)
Mon Mar 06 16:04:18 2017
GMON querying group 1 at 9 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
GMON querying group 1 at 10 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)
ORA-15032: not all alterations performed
ORA-15033: disk '/dev/asm-diskdata02' belongs to diskgroup "DATADG"
ERROR: alter diskgroup DATADG add disk '/dev/asm-diskdata02' name DATADG_0001
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
Mon Mar 06 16:23:28 2017
SQL> alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/adm-diskdata02' name DATA_0001 
NOTE: GroupBlock outside rolling migration privileged region
ORA-15032: not all alterations performed
ORA-15031: disk specification '/dev/adm-diskdata02' matches no disks
ERROR: alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/adm-diskdata02' name DATA_0001
Mon Mar 06 16:24:48 2017
SQL> alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: requesting all-instance membership refresh for group=1
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
ERROR: ORA-15033 signalled during reconfiguration of diskgroup DATADG
Mon Mar 06 16:24:49 2017
NOTE: membership refresh pending for group 1/0x31584f6b (DATADG)
Mon Mar 06 16:24:52 2017
GMON querying group 1 at 11 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
GMON querying group 1 at 12 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)
ORA-15032: not all alterations performed
ORA-15033: disk '/dev/asm-diskdata02' belongs to diskgroup "DATADG"
ERROR: alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
Mon Mar 06 16:26:07 2017
SQL> alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001 force  
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: requesting all-instance membership refresh for group=1
NOTE: initializing header on grp 1 disk DATA_0001
NOTE: requesting all-instance disk validation for group=1
Mon Mar 06 16:26:10 2017
NOTE: skipping rediscovery for group 1/0x31584f6b (DATADG) on local instance.
NOTE: requesting all-instance disk validation for group=1
NOTE: skipping rediscovery for group 1/0x31584f6b (DATADG) on local instance.
Mon Mar 06 16:26:15 2017
GMON updating for reconfiguration, group 1 at 13 for pid 28, osid 12861
NOTE: group 1 PST updated.
NOTE: initiating PST update: grp = 1
GMON updating group 1 at 14 for pid 28, osid 12861
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATADG: updated PST location: disk 0002 (PST copy 1)
NOTE: PST update grp = 1 completed successfully 
NOTE: membership refresh pending for group 1/0x31584f6b (DATADG)
GMON querying group 1 at 15 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
NOTE: cache opening disk 2 of grp 1: DATA_0001 path:/dev/asm-diskdata02
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
GMON querying group 1 at 16 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)
Mon Mar 06 16:26:19 2017
SUCCESS: alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001 force 
NOTE: starting rebalance of group 1/0x31584f6b (DATADG) at power 1
Mon Mar 06 16:26:20 2017
Starting background process ARB0
Mon Mar 06 16:26:20 2017
ARB0 started with pid=32, OS id=25833 
NOTE: assigning ARB0 to group 1/0x31584f6b (DATADG) with 1 parallel I/O
cellip.ora not found.
WARNING:cache read  a corrupt block: group=1(DATADG) dsk=0 blk=0 disk=0 
        (DATADG_0000)incarn=3915956130 au=0 blk=0 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25833.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
NOTE:a corrupted block from group DATADG was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25833.trc
WARNING:cache read (retry)a corrupt block:group=1(DATADG) 
         dsk=0 blk=0 disk=0(DATADG_0000)incarn=3915956130 au=0 blk=0 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25833.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
ERROR: cache failed to read group=1(DATADG) dsk=0 blk=0 from disk(s): 0(DATADG_0000)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
NOTE: cache initiating offline of disk 0 group DATADG
NOTE:process _arb0_+asm1 (25833) initiating offline of disk 0.3915956130(DATADG_0000)with mask 0x7e in group 1
NOTE: checking PST: grp = 1
GMON checking disk modes for group 1 at 17 for pid 32, osid 25833
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
ERROR: too many offline disks in PST (grp 1)
NOTE: checking PST for grp 1 done.
NOTE: initiating PST update: grp = 1, dsk = 0/0xe968bfa2, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 18 for pid 32, osid 25833
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
ERROR: Disk 0 cannot be offlined, since all the disks [0, 1] with mirrored data would be offline.
ERROR: too many offline disks in PST (grp 1)
Mon Mar 06 16:26:23 2017
NOTE: cache dismounting (not clean) group 1/0x31584F6B (DATADG) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 25889, image: oracle@DBN01 (B000)
Mon Mar 06 16:26:23 2017
NOTE: halting all I/Os to diskgroup 1 (DATADG)
Mon Mar 06 16:26:23 2017
NOTE: LGWR doing non-clean dismount of group 1 (DATADG)
NOTE: LGWR sync ABA=19.2851 last written ABA 19.2851
WARNING: Offline for disk DATADG_0000 in mode 0x7f failed.
Mon Mar 06 16:26:23 2017
kjbdomdet send to inst 2
detach from dom 1, sending detach message to inst 2
Mon Mar 06 16:26:23 2017
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 8)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25833.trc  (incident=65537):
ORA-15335: ASM metadata corruption detected in disk group 'DATADG'
ORA-15130: diskgroup "DATADG" is being dismounted
ORA-15066: offlining disk "DATADG_0000" in group "DATADG" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
Incident details in:/u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_65537/+ASM1_arb0_25833_i65537.trc
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE 
 3189 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
ERROR: ORA-15130 in COD recovery for diskgroup 1/0x31584f6b (DATADG)
ERROR: ORA-15130 thrown in RBAL for group number 1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_3468.trc:
ORA-15130: diskgroup "DATADG" is being dismounted
Mon Mar 06 16:26:23 2017
WARNING: dirty detached from domain 1
NOTE: cache dismounted group 1/0x31584F6B (DATADG) 

---subsequent mount attempts fail
SQL> ALTER DISKGROUP DATADG MOUNT  /* asm agent *//* {1:18003:2} */ 
NOTE: cache registered group DATADG number=1 incarn=0xb368408f
NOTE: cache began mount (first) of group DATADG number=1 incarn=0xb368408f
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
WARNING:GMON has insufficient disks to maintain consensus.
Minimum required is 2:updating 1 PST copies from a total of 2.
ERROR: GMON failed to obtain a quorum ofsupporting disks in group 1
NOTE: cache dismounting (clean) group 1/0xB368408F (DATADG) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 27651, image: oracle@DBN01 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xB368408F (DATADG) 
NOTE: cache ending mount (fail) of group DATADG number=1 incarn=0xb368408f
NOTE: cache deleting context for group DATADG 1/0xb368408f
GMON dismounting group 1 at 12 for pid 30, osid 27651
NOTE: Disk DATA_0001 in mode 0x9 marked for de-assignment
ERROR: diskgroup DATADG was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATADG" cannot be mounted
ORA-15315: Write errors in disk group DATADG could lead to inconsistent ASM metadata.
ERROR: ALTER DISKGROUP DATADG MOUNT  /* asm agent *//* {1:18003:2} */

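From this we can see that the first several attempts to add the ASM disk failed for various reasons, and that the last attempt used the force keyword to push the automatically dropped disk back into datadg. Unfortunately, once the add succeeded and the rebalance started, a corrupt block was found on disk 0. This raised ORA-15196, the rebalance could not continue, and the whole datadg disk group dismounted automatically. Later attempts to mount datadg reported inconsistent metadata directly, because the disk header of disk 0 was already damaged.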
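The sequence of attempts is easier to follow once each statement is paired with its outcome. The sketch below is one way to do that, assuming the alert log echoes each statement after `SQL>` and again after `SUCCESS:` or `ERROR:`, as in the excerpt above; the function name is made up for illustration.

```python
import re


def sql_outcomes(log_lines):
    """Pair each 'SQL> ...' statement with the later SUCCESS/ERROR line echoing it."""
    def norm(s):
        # Ignore case and whitespace differences between the two echoes.
        return re.sub(r"\s+", " ", s).strip().lower()

    pending = {}   # normalized statement -> statement as logged
    results = []   # (statement, outcome) in the order outcomes were logged
    for line in log_lines:
        if line.startswith("SQL>"):
            stmt = line[4:].strip()
            pending[norm(stmt)] = stmt
            continue
        m = re.match(r"(SUCCESS|ERROR):\s*(.*)", line)
        if m:
            key = norm(m.group(2))
            if key in pending:
                results.append((pending.pop(key), m.group(1)))
    return results
```

Lines such as `ERROR: ORA-15033 signalled during reconfiguration` are ignored because they do not echo a pending statement, so only the final result of each attempt is reported.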

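Analyzing disk 0 with kfed
The disk header was backed up with dd and analyzed on Windows. The header of disk 0, which had previously been healthy, is damaged (all zeros).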
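An all-zero header can also be confirmed directly on the dd backup without kfed. This is a minimal sketch, assuming the 4096-byte ASM metadata block size (the default); the function name is hypothetical.

```python
def block_is_all_zero(path, offset=0, block_size=4096):
    """Return True if the block at `offset` contains only zero bytes."""
    with open(path, "rb") as f:
        f.seek(offset)
        block = f.read(block_size)
    if len(block) < block_size:
        raise ValueError("short read: file is smaller than offset + block_size")
    return not any(block)
```

Running the same check at other offsets of the backup shows whether the damage is limited to the first block or extends further into the disk.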

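The fault is now fairly clear, and the recovery approach is largely settled, escalating step by step:
Option 1: fix the disk header with kfed, then try to mount the disk group (related post: "Manually repairing an abnormal ASM DISK HEADER")
Option 2: copy the data files out directly with tools such as amdu or dul (related post: "Retrieving data files from ASM")
Option 3: reassemble the data files from the underlying AUs (related post: "Recovering from a completely destroyed asm disk header")
In the actual recovery we were fairly lucky and option 1 was enough. After repairing the disk header with kfed, the following errors were reported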

SQL> alter diskgroup DATADG mount 
NOTE: cache registered group DATADG number=1 incarn=0x5134d0d4
NOTE: cache began mount (first) of group DATADG number=1 incarn=0x5134d0d4
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: Assigning number (1,0) to disk (/dev/asm-diskdata01)
Tue Mar 07 19:03:40 2017
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 27 for pid 28, osid 13837
NOTE: Assigning number (1,1) to disk ()
GMON querying group 1 at 28 for pid 28, osid 13837
NOTE: cache closing disk 1 of grp 1: (not open) 
NOTE: cache opening disk 0 of grp 1: DATADG_0000 path:/dev/asm-diskdata01
NOTE: F1X0 found on disk 0 au 2 fcn 0.178802
NOTE: cache opening disk 2 of grp 1: DATA_0001 path:/dev/asm-diskdata02
NOTE: cache mounting (first) normal redundancy group 1/0x5134D0D4 (DATADG)
Tue Mar 07 19:03:40 2017
* allocate domain 1, invalid = TRUE 
kjbdomatt send to inst 2
Tue Mar 07 19:03:40 2017
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=19.2851 group=1 (DATADG)
NOTE: starting recovery of thread=2 ckpt=13.5327 group=1 (DATADG)
NOTE: advancing ckpt for group 1 (DATADG) thread=2 ckpt=13.5327
NOTE: advancing ckpt for group 1 (DATADG) thread=1 ckpt=19.2852
NOTE: cache recovered group 1 to fcn 0.365868
NOTE: redo buffer size is 512 blocks (2101760 bytes)
Tue Mar 07 19:03:40 2017
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR found thread 1 closed at ABA 19.2851
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR opening thread 1 at fcn 0.365868 ABA 20.2852
NOTE: cache mounting group 1/0x5134D0D4 (DATADG) succeeded
NOTE: cache ending mount (success) of group DATADG number=1 incarn=0x5134d0d4
Tue Mar 07 19:03:40 2017
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATADG was mounted
SUCCESS: alter diskgroup DATADG mount
Tue Mar 07 19:03:40 2017
NOTE: diskgroup resource ora.DATADG.dg is online
Tue Mar 07 19:03:41 2017
ASM Health Checker found 1 new failures
NOTE: ASM did background COD recovery for group 1/0x5134d0d4 (DATADG)
NOTE: starting rebalance of group 1/0x5134d0d4 (DATADG) at power 1
Starting background process ARB0
Tue Mar 07 19:03:42 2017
ARB0 started with pid=30, OS id=13905 
NOTE: assigning ARB0 to group 1/0x5134d0d4 (DATADG) with 1 parallel I/O
cellip.ora not found.
WARNING: cache read  a corrupt block: group=1(DATADG) dsk=0 blk=0 disk=0 
         (DATADG_0000) incarn=2202280062 au=0 blk=0 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13905.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
NOTE: a corrupted block from group DATADG was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13905.trc
WARNING:cache read (retry)a corrupt block:group=1(DATADG) dsk=0 blk=0 disk=0
        (DATADG_0000)incarn=2202280062 au=0 blk=0 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13905.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
ERROR: cache failed to read group=1(DATADG) dsk=0 blk=0 from disk(s): 0(DATADG_0000)
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
Tue Mar 07 19:03:52 2017
NOTE: client oradb1:oradb registered, osid 13989, mbr 0x1
NOTE: cache initiating offline of disk 0 group DATADG
NOTE:process _arb0_+asm1 (13905) initiating offline of disk 0.2202280062(DATADG_0000)with mask 0x7e in group 1
NOTE: checking PST: grp = 1
Tue Mar 07 19:03:52 2017
GMON checking disk modes for group 1 at 30 for pid 30, osid 13905
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
ERROR: too many offline disks in PST (grp 1)
NOTE: checking PST for grp 1 done.
NOTE: initiating PST update: grp = 1, dsk = 0/0x8344207e, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 31 for pid 30, osid 13905
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
ERROR: Disk 0 cannot be offlined, since all the disks [0, 1] with mirrored data would be offline.
ERROR: too many offline disks in PST (grp 1)
Tue Mar 07 19:03:52 2017
NOTE: cache dismounting (not clean) group 1/0x5134D0D4 (DATADG) 
WARNING: Offline for disk DATADG_0000 in mode 0x7f failed.
Tue Mar 07 19:03:52 2017
NOTE: halting all I/Os to diskgroup 1 (DATADG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 14002, image: oracle@DBN01 (B000)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13905.trc  (incident=76402):
ORA-15335: ASM metadata corruption detected in disk group 'DATADG'
ORA-15130: diskgroup "DATADG" is being dismounted
ORA-15066: offlining disk "DATADG_0000" in group "DATADG" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_76402/+ASM1_arb0_13905_i76402.trc
Tue Mar 07 19:03:52 2017
NOTE: LGWR doing non-clean dismount of group 1 (DATADG)
NOTE: LGWR sync ABA=20.2857 last written ABA 20.2857
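The bracketed arguments of ORA-15196 differ between the two failures ([endian_kfbh] ... [0 != 1] before the repair, [blk_kfbl] ... [1022 != 0] after it), so it is worth pulling them apart. The sketch below does that under an assumed reading of the fields: source location, name of the failed check, object number, block number, then found versus expected value. That reading is inferred from the logs in this post (an all-zero header yields endian 0 where 1 is expected) and is not taken from Oracle documentation. The function names are hypothetical.

```python
import re

ORA_15196 = re.compile(
    r"ORA-15196: invalid ASM block header "
    r"\[(?P<source>[^\]]+)\] \[(?P<check>[^\]]+)\] "
    r"\[(?P<obj>\d+)\] \[(?P<blk>\d+)\] "
    r"\[(?P<found>[^\]\s]+) != (?P<expected>[^\]\s]+)\]"
)


def parse_ora_15196(line):
    """Split an ORA-15196 line into its bracketed arguments, or return None."""
    m = ORA_15196.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["obj"] = int(fields["obj"])
    fields["blk"] = int(fields["blk"])
    return fields


def backup_header_block(au_size, block_size=4096):
    """Block number, within AU 1, of the second-to-last metadata block."""
    return au_size // block_size - 2
```

For this disk group, `backup_header_block(4 * 1024 * 1024)` evaluates to 1022, the block in AU 1 where 11.2 keeps a backup copy of the disk header. A found value of 1022 would be consistent with block 0 having been rebuilt from that backup copy without its block number being reset to 0, although the post does not state how the header was repaired.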

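We were fairly lucky here: datadg did mount successfully. However, the rebalance process (ARB0) still read abnormal disk header information. The repair was not fully successful, and the log shows that other blocks besides this one were also damaged, so further repair was not pursued. Instead, we bypassed ASM's ACD and COD recovery so that the disk group mounts and does not dismount again.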

SQL> alter diskgroup DATADG mount 
NOTE: cache registered group DATADG number=1 incarn=0x9c94d0eb
NOTE: cache began mount (first) of group DATADG number=1 incarn=0x9c94d0eb
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: Assigning number (1,0) to disk (/dev/asm-diskdata01)
NOTE: skip COD recovery as part of test at kfrc.c:1639 
NOTE: skip COD recovery as part of test at kfrc.c:1639 
Tue Mar 07 19:12:45 2017
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 75 for pid 28, osid 15615
NOTE: Assigning number (1,1) to disk ()
GMON querying group 1 at 76 for pid 28, osid 15615
NOTE: cache closing disk 1 of grp 1: (not open) 
NOTE: cache opening disk 0 of grp 1: DATADG_0000 path:/dev/asm-diskdata01
NOTE: F1X0 found on disk 0 au 2 fcn 0.178802
NOTE: cache opening disk 2 of grp 1: DATA_0001 path:/dev/asm-diskdata02
NOTE: cache mounting (first) normal redundancy group 1/0x9C94D0EB (DATADG)
Tue Mar 07 19:12:45 2017
* allocate domain 1, invalid = TRUE 
kjbdomatt send to inst 2
Tue Mar 07 19:12:45 2017
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=25.2870 group=1 (DATADG)
NOTE: advancing ckpt for group 1 (DATADG) thread=1 ckpt=25.2873
NOTE: cache recovered group 1 to fcn 0.365897
NOTE: redo buffer size is 512 blocks (2101760 bytes)
Tue Mar 07 19:12:45 2017
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR found thread 1 closed at ABA 25.2872
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR opening thread 1 at fcn 0.365897 ABA 26.2873
NOTE: cache mounting group 1/0x9C94D0EB (DATADG) succeeded
NOTE: cache ending mount (success) of group DATADG number=1 incarn=0x9c94d0eb
Tue Mar 07 19:12:45 2017
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATADG was mounted
SUCCESS: alter diskgroup DATADG mount
Tue Mar 07 19:12:45 2017
NOTE: diskgroup resource ora.DATADG.dg is online
NOTE: skip COD recovery as part of test at kfrc.c:1639 
NOTE: skip COD recovery as part of test at kfrc.c:1639 
NOTE: skip COD recovery as part of test at kfrc.c:1639 
NOTE: skip COD recovery as part of test at kfrc.c:1639 
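
A quick way to confirm from the alert log that the mount succeeded, and that COD recovery really was skipped, is to scan for the relevant messages. This is a rough sketch under stated assumptions: `summarize_mount` and `SAMPLE` are hypothetical names, and the message patterns are taken only from the log lines shown in this post.

```python
import re

MOUNTED = re.compile(r"^SUCCESS: diskgroup (\S+) was mounted")
NOT_MOUNTED = re.compile(r"^ERROR: diskgroup (\S+) was not mounted")
COD_SKIP = "NOTE: skip COD recovery as part of test"


def summarize_mount(log_text):
    """Collect mounted/failed disk group names and count COD-skip notes."""
    result = {"mounted": [], "failed": [], "cod_skips": 0}
    for raw in log_text.splitlines():
        line = raw.strip()
        ok = MOUNTED.match(line)
        bad = NOT_MOUNTED.match(line)
        if ok:
            result["mounted"].append(ok.group(1))
        elif bad:
            result["failed"].append(bad.group(1))
        elif line.startswith(COD_SKIP):
            result["cod_skips"] += 1
    return result


# A few lines copied from the mount log above.
SAMPLE = """\
SQL> alter diskgroup DATADG mount
NOTE: skip COD recovery as part of test at kfrc.c:1639
NOTE: skip COD recovery as part of test at kfrc.c:1639
SUCCESS: diskgroup DATADG was mounted
SUCCESS: alter diskgroup DATADG mount
"""

print(summarize_mount(SAMPLE))
```

The same function flags a failed mount when it sees an `ERROR: diskgroup ... was not mounted` line.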

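With the ASM problem solved, I logged in to the databases. Luckily, both databases opened normally and their alert logs showed no errors. I backed up the data with RMAN, rebuilt the ASM disk group, and restored the data. The recovery was complete, with zero data loss.
If you run into a similar situation and cannot resolve it, please contact us for professional Oracle database recovery support.
Phone: 13429648788    QQ: 107644445    E-Mail: dba@xifenfei.com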


Recovering from ORA-15042 at ASM startup caused by disks missing from /dev/rdisk on HP-UX

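An old friend contacted me about a customer's database: ASM could not mount its disk group and reported that two disks were missing. He asked whether it could be recovered. The system sits on an internal network, so I worked from the scattered pieces of information he sent over. The analysis is roughly as follows.
Errors in the ASM alert log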
ERROR: diskgroup DGROUP1 was not mounted

Fri Aug 12 16:03:12 EAT 2016
SQL> alter diskgroup DGROUP1 mount 
Fri Aug 12 16:03:12 EAT 2016
NOTE: cache registered group DGROUP1 number=1 incarn=0xf6781b5c
Fri Aug 12 16:03:12 EAT 2016
NOTE: Hbeat: instance first (grp 1)
Fri Aug 12 16:03:16 EAT 2016
NOTE: start heartbeating (grp 1)
Fri Aug 12 16:03:16 EAT 2016
NOTE: cache dismounting group 1/0xF6781B5C (DGROUP1) 
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGROUP1 was not mounted

Mounting the ASM disk group from a foreground session fails with ORA-15042
[Screenshot: ORA-15042]


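The error output shows clearly that the ASM disk group cannot be mounted because ASM disks 15 and 16 are missing. The best way to recover ASM is to find those two disks. Analyzing the disks that were still visible with kfed showed that ASM disk 14 corresponds to disk160 and ASM disk 17 to disk163, so my first guess was that disk161 and disk162 were the two faulty disks. The data center staff checked the hardware and found no alerts.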
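
The guess about the two missing disks is a simple interpolation between the mappings that kfed confirmed. A minimal sketch of that reasoning follows; `missing_between` and `KNOWN` are hypothetical names, and the interpolation is only a heuristic that still has to be verified on the host.

```python
def missing_between(known):
    """Guess OS disk instances for ASM disk numbers that fall in a gap.

    known maps an ASM disk number to the OS disk instance number that
    was confirmed for it (for example with kfed).
    """
    numbers = sorted(known)
    guesses = {}
    for low, high in zip(numbers, numbers[1:]):
        # Interpolate only when both numberings advance by the same amount.
        if high - low == known[high] - known[low]:
            for step in range(1, high - low):
                guesses[low + step] = known[low] + step
    return guesses


# ASM disk 14 is disk160 and ASM disk 17 is disk163, as found with kfed.
KNOWN = {14: 160, 17: 163}
print(missing_between(KNOWN))  # {15: 161, 16: 162}
```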

OS-level analysis
(Records that are irrelevant to the conclusion are omitted.)

ls -l /dev/rdisk
crw-rw----   1 oracle     dba         13 0x000070 Jan  1  2016 disk160
crw-rw----   1 oracle     dba         13 0x000073 Jan  1  2016 disk163

ls -l /dev/disk
brw-r-----   1 bin        sys          1 0x000070 Jan 13  2015 disk160
brw-r-----   1 bin        sys          1 0x000071 Jan 13  2015 disk161
brw-r-----   1 bin        sys          1 0x000072 Jan 13  2015 disk162
brw-r-----   1 bin        sys          1 0x000073 Jan 13  2015 disk163

Here we see that on this HP-UX host all the disks exist under /dev/disk, but some are missing under /dev/rdisk. I continued the analysis with ioscan.

ioscan -fNnkC disk
disk    160  64000/0xfa00/0x70  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk160   /dev/rdisk/disk160
disk    161  64000/0xfa00/0x71  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk161
disk    162  64000/0xfa00/0x72  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk162
disk    163  64000/0xfa00/0x73  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk163   /dev/rdisk/disk163
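
On a host with many LUNs, the same check can be done by parsing saved ioscan output rather than reading it by eye. The sketch below assumes the layout shown above, with one `disk` line followed by a single line of device files; `disks_missing_raw_dsf` and `IOSCAN_SAMPLE` are hypothetical names.

```python
# Output copied from the ioscan listing above.
IOSCAN_SAMPLE = """\
disk    160  64000/0xfa00/0x70  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk160   /dev/rdisk/disk160
disk    161  64000/0xfa00/0x71  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk161
disk    162  64000/0xfa00/0x72  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk162
disk    163  64000/0xfa00/0x73  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk163   /dev/rdisk/disk163
"""


def disks_missing_raw_dsf(ioscan_text):
    """Return (instance, hardware path) for disks with no /dev/rdisk file."""
    missing = []
    current = None  # (instance, hardware path) of the last "disk" line
    for line in ioscan_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "disk":
            current = (fields[1], fields[2])
        elif current and fields[0].startswith("/dev/"):
            # Assumption: all device files for a disk sit on this one line.
            if not any(f.startswith("/dev/rdisk/") for f in fields):
                missing.append(current)
            current = None
    return missing


for instance, hw_path in disks_missing_raw_dsf(IOSCAN_SAMPLE):
    print(instance, hw_path)
```

The hardware paths it reports are the values later passed to `insf -e -H`.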

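At this point it is fairly certain that device files have gone missing under /dev/rdisk. Because each rdisk entry is an aggregated device file, the next step is to check whether the device files that existed before aggregation are normal.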

ioscan -m dsf
/dev/rdisk/disk160       /dev/rdsk/c29t12d4
                         /dev/rdsk/c28t12d4
/dev/rdisk/disk163       /dev/rdsk/c29t12d7
                         /dev/rdsk/c28t12d7

ls -l /dev/rdsk
crw-r-----   1 bin        sys        188 0x1dc000 Apr 22  2014 c29t12d0
crw-r-----   1 bin        sys        188 0x1dc100 Apr 22  2014 c29t12d1
crw-r-----   1 bin        sys        188 0x1dc300 Jan 13  2015 c29t12d3
crw-r-----   1 bin        sys        188 0x1dc400 Jan 13  2015 c29t12d4
crw-r-----   1 bin        sys        188 0x1dc500 Jan 13  2015 c29t12d5
crw-r-----   1 bin        sys        188 0x1dc600 Jan 13  2015 c29t12d6
crw-r-----   1 bin        sys        188 0x1dc700 Jan 13  2015 c29t12d7

crw-r-----   1 bin        sys        188 0x1cc100 Apr 22  2014 c28t12d1
crw-r-----   1 bin        sys        188 0x1cc300 Jan 13  2015 c28t12d3
crw-r-----   1 bin        sys        188 0x1cc400 Jan 13  2015 c28t12d4
crw-r-----   1 bin        sys        188 0x1cc500 Jan 13  2015 c28t12d5
crw-r-----   1 bin        sys        188 0x1cc600 Jan 13  2015 c28t12d6
crw-r-----   1 bin        sys        188 0x1cc700 Jan 13  2015 c28t12d7

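From this we can reasonably conclude that /dev/rdsk/c28t12d5, /dev/rdsk/c28t12d6, /dev/rdsk/c29t12d5, and /dev/rdsk/c29t12d6 are the pre-aggregation device files behind /dev/rdisk/disk161 and disk162. In other words, only the character devices under /dev/rdisk have a problem; everything else is normal.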
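
The inference above amounts to picking the legacy device files whose LUN number falls between the ones already mapped to disk160 and disk163. A rough sketch of that selection, assuming the `cXtYdZ` naming shown in the listing; `candidates_between`, `LEGACY_NAMES`, and `MAPPED` are hypothetical names, and the result is a candidate list, not proof of the mapping.

```python
import re

# Legacy device file names look like c29t12d5: controller, target, LUN.
LEGACY = re.compile(r"^c(\d+)t(\d+)d(\d+)$")


def candidates_between(legacy_names, mapped):
    """Group unmapped legacy names whose LUN lies between mapped LUNs."""
    mapped_luns = sorted(int(LEGACY.match(name).group(3)) for name in mapped)
    low, high = mapped_luns[0], mapped_luns[-1]
    by_lun = {}
    for name in legacy_names:
        match = LEGACY.match(name)
        if not match or name in mapped:
            continue
        lun = int(match.group(3))
        if low < lun < high:
            by_lun.setdefault(lun, []).append("/dev/rdsk/" + name)
    return {lun: sorted(paths) for lun, paths in sorted(by_lun.items())}


# Names copied from the ls -l /dev/rdsk listing above.
LEGACY_NAMES = [
    "c29t12d0", "c29t12d1", "c29t12d3", "c29t12d4", "c29t12d5",
    "c29t12d6", "c29t12d7", "c28t12d1", "c28t12d3", "c28t12d4",
    "c28t12d5", "c28t12d6", "c28t12d7",
]
# Legacy paths that ioscan -m dsf already ties to disk160 and disk163.
MAPPED = {"c29t12d4", "c28t12d4", "c29t12d7", "c28t12d7"}

print(candidates_between(LEGACY_NAMES, MAPPED))
```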

Repairing the problem with system commands

insf -e -H 64000/0xfa00/0x71
insf -e -H 64000/0xfa00/0x72

[Screenshot: hp-asm-disk]


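/dev/rdisk/disk161 and /dev/rdisk/disk162 are now visible again, so the device files appear to be back to normal at the OS level. Next, fix the permissions and ownership of the disks.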

chmod 660 /dev/rdisk/disk161
chmod 660 /dev/rdisk/disk162
chown oracle:dba /dev/rdisk/disk161
chown oracle:dba /dev/rdisk/disk162
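
Before starting ASM, it is worth confirming that the recreated device files match the healthy ones. The sketch below checks one `ls -l` line against the mode and ownership seen on disk160; `check_raw_device`, `GOOD`, and `BAD` are hypothetical names, and the `BAD` line is an invented example of a device file left with default ownership.

```python
def check_raw_device(ls_line, owner="oracle", group="dba", mode="crw-rw----"):
    """Return a list of mismatches for one 'ls -l' line of a raw device."""
    fields = ls_line.split()
    found = {"mode": fields[0], "owner": fields[2], "group": fields[3]}
    wanted = {"mode": mode, "owner": owner, "group": group}
    return [
        f"{key}: found {found[key]}, expected {wanted[key]}"
        for key in ("mode", "owner", "group")
        if found[key] != wanted[key]
    ]


# A healthy line copied from the ls -l /dev/rdisk output above.
GOOD = "crw-rw----   1 oracle     dba         13 0x000070 Jan  1  2016 disk160"
# Hypothetical line: a device file that still has default ownership.
BAD = "crw-r-----   1 bin        sys         13 0x000071 Jan  1  2016 disk161"

print(check_raw_device(GOOD))
print(check_raw_device(BAD))
```

An empty list means the device file matches the expected mode, owner, and group.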

Start ASM normally, mount the disk group, and open the database
[Screenshot: asm-mount]


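This recovery came down to diagnosing and fixing the problem at the operating system level, which gave a clean database recovery with zero data loss. A similar recovery case: unrecognized partitions preventing an ASM diskgroup from mounting.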
