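Tag archive: ORA-15131
Recovering a disk group failure caused by deleting an asmlib disk
A customer ran a drop disk operation on a disk group, then immediately ran oracleasm deletedisk at the Oracle asmlib layer and deleted the disk partition at the operating-system level. As a result, the disk group dismounted straight away.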
Tue Nov 26 16:44:04 2024
SQL> alter diskgroup data drop disk DATA_0008
NOTE: GroupBlock outside rolling migration privileged region
Tue Nov 26 08:44:05 2024
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 2/0x28dec0d5 (DATA)
NOTE: requesting all-instance membership refresh for group=2
NOTE: membership refresh pending for group 2/0x28dec0d5 (DATA)
Tue Nov 26 08:44:14 2024
GMON querying group 2 at 48 for pid 18, osid 27385
SUCCESS: refreshed membership for 2/0x28dec0d5 (DATA)
SUCCESS: alter diskgroup data drop disk DATA_0008
NOTE: starting rebalance of group 2/0x28dec0d5 (DATA) at power 2
Starting background process ARB0
Tue Nov 26 08:44:14 2024
ARB0 started with pid=38, OS id=56987
NOTE: assigning ARB0 to group 2/0x28dec0d5 (DATA) with 2 parallel I/Os
Tue Nov 26 08:44:17 2024
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Nov 26 08:44:57 2024
cellip.ora not found.
Tue Nov 26 17:08:46 2024
SQL> alter diskgroup data drop disk DATA_0008
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 17:10:30 2024
SQL> alter diskgroup data drop disk DATA_0008
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 09:34:38 2024
WARNING: cache read a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8 (DATA_0008) incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc
WARNING:cache read (retry) a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8(DATA_0008)incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ERROR: cache failed to read group=2(DATA) dsk=8 blk=98 from disk(s): 8(DATA_0008)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _arb0_+asm1(56987)initiating offline of disk 8.3911069755 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e303b, mask = 0x6a, op = clear
Tue Nov 26 09:34:38 2024
GMON updating disk modes for group 2 at 49 for pid 38, osid 56987
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Nov 26 09:34:38 2024
NOTE: cache dismounting (not clean) group 2/0x28DEC0D5 (DATA)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 89645, image: oracle@ahptdb5 (B000)
Tue Nov 26 09:34:38 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc (incident=413105):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
Tue Nov 26 09:34:39 2024
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x28dec0d5 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_27385.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ERROR: ORA-15335 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: stopping process ARB0
Tue Nov 26 09:34:40 2024
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=716.2684 last written ABA 716.2684
After the partition was re-created and the disk header was repaired with kfed repair, mounting the disk group again failed:
SQL> alter diskgroup data mount
NOTE: cache registered group DATA number=2 incarn=0x73bec220
NOTE: cache began mount (first) of group DATA number=2 incarn=0x73bec220
NOTE: Assigning number (2,16) to disk (/dev/oracleasm/disks/DATA208)
NOTE: Assigning number (2,15) to disk (/dev/oracleasm/disks/DATA207)
NOTE: Assigning number (2,14) to disk (/dev/oracleasm/disks/DATA206)
NOTE: Assigning number (2,13) to disk (/dev/oracleasm/disks/DATA205)
NOTE: Assigning number (2,12) to disk (/dev/oracleasm/disks/DATA204)
NOTE: Assigning number (2,11) to disk (/dev/oracleasm/disks/DATA203)
NOTE: Assigning number (2,10) to disk (/dev/oracleasm/disks/DATA202)
NOTE: Assigning number (2,9) to disk (/dev/oracleasm/disks/DATA201)
NOTE: Assigning number (2,6) to disk (/dev/oracleasm/disks/DATA07)
NOTE: Assigning number (2,5) to disk (/dev/oracleasm/disks/DATA06)
NOTE: Assigning number (2,4) to disk (/dev/oracleasm/disks/DATA05)
NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
NOTE: Assigning number (2,3) to disk (/dev/oracleasm/disks/DATA04)
NOTE: Assigning number (2,2) to disk (/dev/oracleasm/disks/DATA03)
NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
NOTE: Assigning number (2,8) to disk (/dev/oracleasm/disks/DATA101)
Tue Nov 26 11:48:22 2024
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 83 for pid 27, osid 15781
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
NOTE: F1X0 found on disk 0 au 2 fcn 0.127835487
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/oracleasm/disks/DATA03
NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/oracleasm/disks/DATA04
NOTE: cache opening disk 4 of grp 2: DATA_0004 path:/dev/oracleasm/disks/DATA05
NOTE: cache opening disk 5 of grp 2: DATA_0005 path:/dev/oracleasm/disks/DATA06
NOTE: cache opening disk 6 of grp 2: DATA_0006 path:/dev/oracleasm/disks/DATA07
NOTE: cache opening disk 8 of grp 2: DATA_0008 path:/dev/oracleasm/disks/DATA101
NOTE: cache opening disk 9 of grp 2: DATA_0009 path:/dev/oracleasm/disks/DATA201
NOTE: cache opening disk 10 of grp 2: DATA_0010 path:/dev/oracleasm/disks/DATA202
NOTE: cache opening disk 11 of grp 2: DATA_0011 path:/dev/oracleasm/disks/DATA203
NOTE: cache opening disk 12 of grp 2: DATA_0012 path:/dev/oracleasm/disks/DATA204
NOTE: cache opening disk 13 of grp 2: DATA_0013 path:/dev/oracleasm/disks/DATA205
NOTE: cache opening disk 14 of grp 2: DATA_0014 path:/dev/oracleasm/disks/DATA206
NOTE: cache opening disk 15 of grp 2: DATA_0015 path:/dev/oracleasm/disks/DATA207
NOTE: cache opening disk 16 of grp 2: DATA_0016 path:/dev/oracleasm/disks/DATA208
NOTE: cache mounting (first) external redundancy group 2/0x73BEC220 (DATA)
Tue Nov 26 11:48:22 2024
* allocate domain 2, invalid = TRUE
kjbdomatt send to inst 2
Tue Nov 26 11:48:22 2024
NOTE: attached to recovery domain 2
NOTE: starting recovery of thread=1 ckpt=716.1536 group=2 (DATA)
NOTE: starting recovery of thread=2 ckpt=763.6248 group=2 (DATA)
NOTE: recovery initiating offline of disk 8 group 2 (*)
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _user15781_+asm1 (15781) initiating offline of disk 8.3911069996 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e312c, mask = 0x6a, op = clear
GMON updating disk modes for group 2 at 84 for pid 27, osid 15781
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
Tue Nov 26 11:48:23 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
NOTE: recovery (pass 2) of diskgroup 2 (DATA) caught error ORA-15130
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15781.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15131: block 97 of file 8 in diskgroup 2 could not be read
ORA-15196: invalid ASM block header [kfc.c:7600] [endian_kfbh] [2147483656] [97] [0 != 1]
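The bracketed arguments of the repeated ORA-15196 lines are easier to compare once they are parsed. The Python sketch below is only an illustration, not an Oracle tool. It assumes the arguments are, in order, the source location, the failed check, a file number, a block number, and the found/expected values. It also assumes that a file number with the high bit (0x80000000) set refers to physically addressed metadata on the disk whose number sits in the low bits. That reading matches dsk=8 in the log, but it remains an assumption.

```python
import re

# Assumed argument order: [source:line] [check] [file number] [block] [found != expected]
ORA_15196 = re.compile(
    r"ORA-15196: invalid ASM block header "
    r"\[(?P<src>[^\]]+)\] \[(?P<check>[^\]]+)\] "
    r"\[(?P<file>\d+)\] \[(?P<block>\d+)\] \[(?P<cmp>[^\]]+)\]"
)

PHYSICAL_FLAG = 0x80000000  # assumption: high bit marks physically addressed metadata


def decode_ora_15196(text):
    """Return the distinct ((kind, number), block, check) tuples found in text."""
    seen = []
    for m in ORA_15196.finditer(text):
        file_no = int(m.group("file"))
        if file_no & PHYSICAL_FLAG:
            target = ("disk", file_no & 0x7FFFFFFF)
        else:
            target = ("file", file_no)
        item = (target, int(m.group("block")), m.group("check"))
        if item not in seen:  # the alert log repeats the same error many times
            seen.append(item)
    return seen


sample = (
    "ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]\n"
    "ORA-15196: invalid ASM block header [kfc.c:7600] [endian_kfbh] [2147483656] [97] [0 != 1]\n"
)
for target, block, check in decode_ora_15196(sample):
    print(target, block, check)
```

Under these assumptions, both the rebalance failure (block 98) and the later mount failure (block 97) point at metadata blocks of disk 8, the disk that was deleted.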
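The customer had run oracleasm deletedisk, which in our experience clears the first 1 MB of the ASM disk (the disk header area). In addition, the disk was removed just as the drop disk operation had triggered a rebalance. Given both facts, repairing that 1 MB and mounting the disk group for continued use is unlikely to work. The recommended options are:
1. Recover the data out of this disk group directly and then open the database
2. Extract the core table data the customer needs directly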
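Whether the start of a device really was zeroed can be checked read-only on a copy of the device. The sketch below is a hypothetical helper, not part of oracleasm or kfed. It counts how many leading bytes of a file are zero, up to 1 MiB. On a header that has already been rewritten (for example by kfed repair), the count no longer reflects the original wipe.

```python
def leading_zero_bytes(path, limit=1024 * 1024, chunk=64 * 1024):
    """Count how many bytes at the start of path are zero, up to limit bytes."""
    count = 0
    with open(path, "rb") as f:  # opened read-only; nothing is written
        while count < limit:
            data = f.read(min(chunk, limit - count))
            if not data:
                break  # reached end of file before the limit
            rest = data.lstrip(b"\x00")
            count += len(data) - len(rest)
            if rest:
                break  # first non-zero byte found
    return count
```

Calling it on an image taken with dd (the image path is whatever you chose when copying) and getting 1048576 back would be consistent with the first 1 MB having been cleared.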
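An earlier customer case with a similar operation, where the disk was re-created through asmlib: "Sharing a zero-data-loss recovery case after oracleasm createdisk re-created an ASM disk"
A database recovery case after partition information was deleted: "Recovering an Oracle ASM disk after its partition was deleted"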
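Manually setting permissions on multipath devices causes the ASM disk group mount to fail with ORA-15032 and ORA-15131
The customer's hardware vendor rebuilt the RAID at a low level and then presented the LUNs to the ASM host. Mounting the data_dg disk group then failed with ORA-15032 and ORA-15131, and the disk group could not be mounted. This combination is uncommon: ASM normally reports either that a specific block cannot be accessed or that an ASM disk is missing.
Remote analysis showed the following in the alert log: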
Wed Jul 31 04:55:17 2024
NOTE: attached to recovery domain 1
NOTE: cache recovered group 1 to fcn 0.1814063801
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Wed Jul 31 04:55:17 2024
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA_DG)
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
WARNING: cache failed reading from group=1(DATA_DG) fn=1 blk=3 count=1 from disk= 0 (DATA_DG_0000) kfkist=0x20 status=0x02 osderr=0x0 file=kfc.c line=11596
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
ORA-15080: synchronous I/O operation to a disk failed
ERROR: cache failed to read group=1(DATA_DG) fn=1 blk=3 from disk(s): 0(DATA_DG_0000)
ORA-15080: synchronous I/O operation to a disk failed
NOTE: cache initiating offline of disk 0 group DATA_DG
NOTE: process _lgwr_+asm2 (8681) initiating offline of disk 0.3915927124 (DATA_DG_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe9684e54, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 42 for pid 15, osid 8681
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline for disk DATA_DG_0000 in mode 0x7f failed.
Wed Jul 31 04:55:17 2024
NOTE: halting all I/Os to diskgroup 1 (DATA_DG)
NOTE: LGWR caught ORA-15131 while mounting diskgroup 1
ORA-15080: synchronous I/O operation to a disk failed
NOTE: cache initiating offline of disk 0 group DATA_DG
NOTE: process _lgwr_+asm2 (8681) initiating offline of disk 0.3915927124 (DATA_DG_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe9684e54, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 42 for pid 15, osid 8681
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline for disk DATA_DG_0000 in mode 0x7f failed.
Wed Jul 31 04:55:17 2024
NOTE: halting all I/Os to diskgroup 1 (DATA_DG)
NOTE: LGWR caught ORA-15131 while mounting diskgroup 1
ERROR: ORA-15131 signalled during mount of diskgroup DATA_DG
NOTE: cache dismounting (clean) group 1/0xA868BD55 (DATA_DG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 16915, image: oracle@xffdb2 (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
Wed Jul 31 04:55:18 2024
List of instances: 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 9)
Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE
2 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
freeing rdom 1
WARNING: dirty detached from domain 1
WARNING: thread recovery enqueue was not held for domain 1 when doing a dirty detach
NOTE: cache dismounted group 1/0xA868BD55 (DATA_DG)
NOTE: cache ending mount (fail) of group DATA_DG number=1 incarn=0xa868bd55
NOTE: cache deleting context for group DATA_DG 1/0xa868bd55
GMON dismounting group 1 at 43 for pid 29, osid 16915
NOTE: Disk DATA_DG_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0003 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0004 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0005 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA_DG was not mounted
ORA-15032: not all alterations performed
ORA-15131: block of file in diskgroup could not be read
ERROR: alter diskgroup data_dg mount
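The repeated ORA-15025 / ORA-27041 groups in the log above can be reduced to a list of disks that could not be opened, together with the operating-system reason. The following Python sketch is a hypothetical helper for that. It assumes each ORA-15025 entry is followed by a "Linux-x86_64 Error: N: reason" line and an "Additional information" line, as in this log.

```python
import re

OPEN_FAILURE = re.compile(
    r'ORA-15025: could not open disk "(?P<path>[^"]+)"'
    r".*?Error: (?P<errno>\d+): (?P<reason>[A-Za-z ]+?)\s+Additional information",
    re.DOTALL,  # the pieces may be on separate lines or flattened onto one
)


def unopenable_disks(log_text):
    """Map each disk path named in ORA-15025 to (errno, reason) from the OS error."""
    result = {}
    for m in OPEN_FAILURE.finditer(log_text):
        result[m.group("path")] = (int(m.group("errno")), m.group("reason"))
    return result


sample = (
    'ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"\n'
    "ORA-27041: unable to open file\n"
    "Linux-x86_64 Error: 13: Permission denied\n"
    "Additional information: 3\n"
)
print(unopenable_disks(sample))
```

For this alert log the only entry is /dev/mapper/xffdb_data01_new with error 13 (Permission denied), which points at device permissions rather than at damaged ASM metadata.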
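This essentially confirms that wrong permissions on /dev/mapper/xffdb_data01_new caused the read of disk=0 fn=1 blk=3 to fail. It is somewhat odd that the permission error appeared only when this block was read, rather than earlier when the disk header was read. Further analysis confirmed that the permissions on xffdb_data01_new were wrong.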
xffdb2:/oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace$ls -l /dev/mapper/
total 0
crw-rw---- 1 root root 10, 58 Jul 26 12:24 control
lrwxrwxrwx 1 root root 8 Jul 31 04:21 mpathe -> ../dm-17
lrwxrwxrwx 1 root root 7 Jul 31 04:28 mpathf -> ../dm-7
lrwxrwxrwx 1 root root 8 Jul 31 04:55 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root 8 Jul 31 04:55 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root 7 Jul 31 04:55 xffdb_data03 -> ../dm-2
lrwxrwxrwx 1 root root 7 Jul 31 04:55 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root 8 Jul 31 04:55 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root 7 Jul 31 04:55 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root 8 Jul 31 04:28 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root 7 Jul 31 04:28 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root 7 Jul 31 04:59 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root 7 Jul 31 04:59 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root 7 Jul 31 04:59 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root 8 Jul 31 04:59 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root 8 Jul 26 12:24 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root 7 Jul 26 12:24 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root 7 Jul 26 12:24 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root 8 Jul 26 12:24 vg_xffdb2-LogVol02 -> ../dm-16
xffdb2:/oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace$ls -l /dev/dm*
brw-rw---- 1 root disk 253, 0 Jul 26 12:24 /dev/dm-0
brw-rw---- 1 root disk 253, 1 Jul 26 12:24 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:13 /dev/dm-10
brw-rw---- 1 root disk 253, 11 Jul 31 04:28 /dev/dm-11
brw-rw---- 1 root disk 253, 12 Jul 31 04:55 /dev/dm-12
brw-rw---- 1 grid asmadmin 253, 13 Jul 31 04:55 /dev/dm-13
brw-rw---- 1 grid asmadmin 253, 14 Jul 31 04:55 /dev/dm-14
brw-rw---- 1 root disk 253, 15 Jul 26 12:24 /dev/dm-15
brw-rw---- 1 root disk 253, 16 Jul 26 12:24 /dev/dm-16
brw-rw---- 1 root disk 253, 17 Jul 31 04:21 /dev/dm-17
brw-rw---- 1 grid asmadmin 253, 2 Jul 31 04:55 /dev/dm-2
brw-rw---- 1 grid asmadmin 253, 3 Jul 31 04:59 /dev/dm-3
brw-rw---- 1 grid asmadmin 253, 4 Jul 31 05:13 /dev/dm-4
brw-rw---- 1 grid asmadmin 253, 5 Jul 31 04:55 /dev/dm-5
brw-rw---- 1 grid asmadmin 253, 6 Jul 31 04:55 /dev/dm-6
brw-rw---- 1 root disk 253, 7 Jul 31 04:28 /dev/dm-7
brw-rw---- 1 grid asmadmin 253, 8 Jul 31 05:13 /dev/dm-8
brw-rw---- 1 root disk 253, 9 Jul 31 04:28 /dev/dm-9
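Further checking confirmed that the three xffdb_*_new disks had been mirrored over after the hardware recovery. The on-site engineer had then manually changed the permissions on /dev/dm-12 through /dev/dm-14 and tried to mount the disk group, which produced this error. Querying v$asm_disk again showed that none of the xffdb_*_new disks were in the list: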
GROUP_NUMBER DISK_NUMBER HEADER_STATUS STATE  PATH
------------ ----------- ------------- ------ ----------------------------
           0           2 MEMBER        NORMAL /dev/mapper/xffdb_data03
           0           3 MEMBER        NORMAL /dev/mapper/xffdb_data06
           0           4 MEMBER        NORMAL /dev/mapper/xffdb_data04
           3           1 MEMBER        NORMAL /dev/mapper/xffdb_vote2
           2           0 MEMBER        NORMAL /dev/mapper/xffdb_log1
           3           2 MEMBER        NORMAL /dev/mapper/xffdb_vote3
           2           1 MEMBER        NORMAL /dev/mapper/xffdb_log2

7 rows selected.
A closer look at the disk permissions:
xffdb2:/dev/mapper$ls -ltr
total 0
crw-rw---- 1 root root 10, 58 Jul 26 12:24 control
lrwxrwxrwx 1 root root 7 Jul 26 12:24 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root 8 Jul 26 12:24 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root 7 Jul 26 12:24 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root 8 Jul 26 12:24 vg_xffdb2-LogVol02 -> ../dm-16
lrwxrwxrwx 1 root root 8 Jul 31 04:21 mpathe -> ../dm-17
lrwxrwxrwx 1 root root 7 Jul 31 04:28 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root 8 Jul 31 04:28 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root 7 Jul 31 04:28 mpathf -> ../dm-7
lrwxrwxrwx 1 root root 8 Jul 31 04:55 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root 8 Jul 31 04:59 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root 7 Jul 31 04:59 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root 7 Jul 31 04:59 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root 7 Jul 31 04:59 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root 8 Jul 31 05:15 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root 8 Jul 31 05:15 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root 7 Jul 31 05:15 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root 7 Jul 31 05:15 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root 7 Jul 31 05:15 xffdb_data03 -> ../dm-2
xffdb2:/dev/mapper$ls -l /dev/dm*
brw-rw---- 1 root disk 253, 0 Jul 26 12:24 /dev/dm-0
brw-rw---- 1 root disk 253, 1 Jul 26 12:24 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:22 /dev/dm-10
brw-rw---- 1 root disk 253, 11 Jul 31 04:28 /dev/dm-11
brw-rw---- 1 root disk 253, 12 Jul 31 04:55 /dev/dm-12
brw-rw---- 1 root disk 253, 13 Jul 31 05:15 /dev/dm-13
brw-rw---- 1 root disk 253, 14 Jul 31 05:15 /dev/dm-14
brw-rw---- 1 root disk 253, 15 Jul 26 12:24 /dev/dm-15
brw-rw---- 1 root disk 253, 16 Jul 26 12:24 /dev/dm-16
brw-rw---- 1 root disk 253, 17 Jul 31 04:21 /dev/dm-17
brw-rw---- 1 grid asmadmin 253, 2 Jul 31 05:15 /dev/dm-2
brw-rw---- 1 grid asmadmin 253, 3 Jul 31 04:59 /dev/dm-3
brw-rw---- 1 grid asmadmin 253, 4 Jul 31 05:22 /dev/dm-4
brw-rw---- 1 grid asmadmin 253, 5 Jul 31 05:15 /dev/dm-5
brw-rw---- 1 grid asmadmin 253, 6 Jul 31 05:15 /dev/dm-6
brw-rw---- 1 root disk 253, 7 Jul 31 04:28 /dev/dm-7
brw-rw---- 1 grid asmadmin 253, 8 Jul 31 05:22 /dev/dm-8
brw-rw---- 1 root disk 253, 9 Jul 31 04:28 /dev/dm-9
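Matching each /dev/mapper alias to the owner of its dm-N node by eye is error-prone with this many devices. The Python sketch below joins the text of the two ls -l listings on the dm-N name. It is a hypothetical helper that only parses pasted output in the format shown above. The expected owner grid:asmadmin is taken from the devices in the listing that ASM can already read.

```python
import re

# "alias -> ../dm-N" entries from ls -l /dev/mapper
LINK = re.compile(r"(\S+) -> \.\./(dm-\d+)")
# block-device lines from ls -l /dev/dm*: mode, links, owner, group, major, minor, date, path
NODE = re.compile(
    r"(?<!\S)b[rwx-]{9}\s+\d+\s+(\S+)\s+(\S+)\s+\d+,\s*\d+\s+.*?/dev/(dm-\d+)"
)


def owners_by_alias(mapper_listing, dm_listing):
    """Return {alias: (owner, group)} by joining the two listings on dm-N."""
    owner_of = {dm: (owner, group) for owner, group, dm in NODE.findall(dm_listing)}
    return {alias: owner_of.get(dm) for alias, dm in LINK.findall(mapper_listing)}


def wrong_owner(aliases, mapper_listing, dm_listing, expected=("grid", "asmadmin")):
    """Return the aliases whose device node is not owned by expected."""
    owners = owners_by_alias(mapper_listing, dm_listing)
    return sorted(a for a in aliases if owners.get(a) != expected)


mapper = (
    "lrwxrwxrwx 1 root root 8 Jul 31 05:15 xffdb_data01_new -> ../dm-14\n"
    "lrwxrwxrwx 1 root root 7 Jul 31 05:15 xffdb_data03 -> ../dm-2\n"
)
nodes = (
    "brw-rw---- 1 root disk 253, 14 Jul 31 05:15 /dev/dm-14\n"
    "brw-rw---- 1 grid asmadmin 253, 2 Jul 31 05:15 /dev/dm-2\n"
)
print(wrong_owner(["xffdb_data01_new", "xffdb_data03"], mapper, nodes))
```

Fed the full listings above and the six data_dg aliases, the same function would flag the three xffdb_*_new devices, whose nodes dm-12, dm-13 and dm-14 are owned by root:disk.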
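On checking again, the permissions of all three disks had reverted to root:disk, so the grid user could not access them. At this point it was fairly clear that the permissions of the three restored multipath disks changed back whenever the devices were accessed. This usually happens when the devices are not bound by udev rules. After udev was used to bind the owner, group, and permissions for these three disks, the permissions no longer changed and v$asm_disk showed the disks normally: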
[root@xffdb2 rules.d]# ls -l /dev/dm*
brw-rw---- 1 root disk 253, 0 Jul 31 05:26 /dev/dm-0
brw-rw---- 1 root disk 253, 1 Jul 31 05:26 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:26 /dev/dm-10
brw-rw---- 1 root disk 253, 11 Jul 31 05:26 /dev/dm-11
brw-rw---- 1 grid asmadmin 253, 12 Jul 31 05:26 /dev/dm-12
brw-rw---- 1 grid asmadmin 253, 13 Jul 31 05:26 /dev/dm-13
brw-rw---- 1 grid asmadmin 253, 14 Jul 31 05:26 /dev/dm-14
brw-rw---- 1 root disk 253, 15 Jul 31 05:26 /dev/dm-15
brw-rw---- 1 root disk 253, 16 Jul 31 05:26 /dev/dm-16
brw-rw---- 1 root disk 253, 17 Jul 31 05:26 /dev/dm-17
brw-rw---- 1 grid asmadmin 253, 2 Jul 31 05:26 /dev/dm-2
brw-rw---- 1 grid asmadmin 253, 3 Jul 31 05:26 /dev/dm-3
brw-rw---- 1 grid asmadmin 253, 4 Jul 31 05:26 /dev/dm-4
brw-rw---- 1 grid asmadmin 253, 5 Jul 31 05:26 /dev/dm-5
brw-rw---- 1 grid asmadmin 253, 6 Jul 31 05:26 /dev/dm-6
brw-rw---- 1 root disk 253, 7 Jul 31 05:26 /dev/dm-7
brw-rw---- 1 grid asmadmin 253, 8 Jul 31 05:26 /dev/dm-8
brw-rw---- 1 root disk 253, 9 Jul 31 05:26 /dev/dm-9
[root@xffdb2 rules.d]# ls -l /dev/mapper/
total 0
crw-rw---- 1 root root 10, 58 Jul 31 05:26 control
lrwxrwxrwx 1 root root 8 Jul 31 05:26 mpathe -> ../dm-17
lrwxrwxrwx 1 root root 7 Jul 31 05:26 mpathf -> ../dm-7
lrwxrwxrwx 1 root root 8 Jul 31 05:26 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root 8 Jul 31 05:26 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root 7 Jul 31 05:26 xffdb_data03 -> ../dm-2
lrwxrwxrwx 1 root root 7 Jul 31 05:26 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root 8 Jul 31 05:26 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root 7 Jul 31 05:26 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root 8 Jul 31 05:26 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root 7 Jul 31 05:26 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root 7 Jul 31 05:26 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root 7 Jul 31 05:26 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root 7 Jul 31 05:26 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root 8 Jul 31 05:26 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root 8 Jul 31 05:26 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root 7 Jul 31 05:26 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root 7 Jul 31 05:26 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root 8 Jul 31 05:26 vg_xffdb2-LogVol02 -> ../dm-16
[root@xffdb2 rules.d]#
SQL> /

GROUP_NUMBER DISK_NUMBER HEADER_STATUS STATE  PATH
------------ ----------- ------------- ------ ----------------------------
           0           0 MEMBER        NORMAL /dev/mapper/xffdb_data01_new
           0           1 MEMBER        NORMAL /dev/mapper/xffdb_data05_new
           0           2 MEMBER        NORMAL /dev/mapper/xffdb_data03
           0           3 MEMBER        NORMAL /dev/mapper/xffdb_data06
           0           4 MEMBER        NORMAL /dev/mapper/xffdb_data04
           0           5 MEMBER        NORMAL /dev/mapper/xffdb_data02_new
           3           1 MEMBER        NORMAL /dev/mapper/xffdb_vote2
           2           0 MEMBER        NORMAL /dev/mapper/xffdb_log1
           3           2 MEMBER        NORMAL /dev/mapper/xffdb_vote3
           2           1 MEMBER        NORMAL /dev/mapper/xffdb_log2

10 rows selected.
The disk group then mounted successfully:
SQL> alter diskgroup data_dg mount
NOTE: cache registered group DATA_DG number=1 incarn=0x4178bd5e
NOTE: cache began mount (first) of group DATA_DG number=1 incarn=0x4178bd5e
NOTE: Assigning number (1,0) to disk (/dev/mapper/xffdb_data01_new)
NOTE: Assigning number (1,4) to disk (/dev/mapper/xffdb_data05_new)
NOTE: Assigning number (1,2) to disk (/dev/mapper/xffdb_data03)
NOTE: Assigning number (1,5) to disk (/dev/mapper/xffdb_data06)
NOTE: Assigning number (1,3) to disk (/dev/mapper/xffdb_data04)
NOTE: Assigning number (1,1) to disk (/dev/mapper/xffdb_data02_new)
Wed Jul 31 05:27:47 2024
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 46 for pid 29, osid 26738
NOTE: cache opening disk 0 of grp 1: DATA_DG_0000 path:/dev/mapper/xffdb_data01_new
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATA_DG_0001 path:/dev/mapper/xffdb_data02_new
NOTE: cache opening disk 2 of grp 1: DATA_DG_0002 path:/dev/mapper/xffdb_data03
NOTE: cache opening disk 3 of grp 1: DATA_DG_0003 path:/dev/mapper/xffdb_data04
NOTE: cache opening disk 4 of grp 1: DATA_DG_0004 path:/dev/mapper/xffdb_data05_new
NOTE: cache opening disk 5 of grp 1: DATA_DG_0005 path:/dev/mapper/xffdb_data06
NOTE: cache mounting (first) external redundancy group 1/0x4178BD5E (DATA_DG)
Wed Jul 31 05:27:47 2024
* allocate domain 1, invalid = TRUE
kjbdomatt send to inst 1
Wed Jul 31 05:27:47 2024
NOTE: attached to recovery domain 1
NOTE: cache recovered group 1 to fcn 0.1814063801
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Wed Jul 31 05:27:47 2024
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA_DG)
NOTE: LGWR found thread 1 closed at ABA 12401.4517
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATA_DG)
NOTE: LGWR opening thread 1 at fcn 0.1814063801 ABA 12402.4518
NOTE: cache mounting group 1/0x4178BD5E (DATA_DG) succeeded
NOTE: cache ending mount (success) of group DATA_DG number=1 incarn=0x4178bd5e
Wed Jul 31 05:27:47 2024
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATA_DG was mounted
SUCCESS: alter diskgroup data_dg mount
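Important reminder: if the owner or permissions of a multipath device are changed by hand, they may revert to the default root:disk when the device is accessed. For such devices, set the permissions and ownership through udev instead.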
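The article does not show the udev rules that were used, so the following is only a sketch of one commonly used rule shape for device-mapper multipath nodes. It matches on the DM_UUID property and sets owner, group, and mode. The UUID value in the usage line is a placeholder, not taken from this system. The real value would have to be read from the device, for example with udevadm info --query=property --name=/dev/dm-14. The Python helper only renders the rule text and does not touch the system.

```python
def render_udev_rule(dm_uuid, owner="grid", group="asmadmin", mode="0660"):
    """Render one udev rule line that pins ownership of a device-mapper node.

    dm_uuid is the DM_UUID property of the multipath device (placeholder here).
    """
    return (
        f'KERNEL=="dm-*", ENV{{DM_UUID}}=="{dm_uuid}", '
        f'OWNER="{owner}", GROUP="{group}", MODE="{mode}"'
    )


# Placeholder UUID for illustration only
print(render_udev_rule("mpath-EXAMPLEWWID"))
```

One such line per ASM device would go into a rules file under /etc/udev/rules.d/ (the file name is your choice). The rules take effect after udev reloads them and the devices are re-triggered, for example with udevadm control --reload-rules followed by udevadm trigger.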