标签云
asm恢复 bbed bootstrap$ dul kcbzib_kcrsds_1 kccpb_sanity_check_2 kcratr_nab_less_than_odr kgegpa MySQL恢复 ORA-00312 ORA-00704 ORA-00742 ORA-01110 ORA-01190 ORA-01200 ORA-01555 ORA-01578 ORA-01595 ORA-600 2662 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 ORACLE恢复 Oracle 恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (106)
- 数据库 (1,801)
- DB2 (22)
- MySQL (79)
- Oracle (1,637)
- Data Guard (53)
- EXADATA (8)
- GoldenGate (24)
- ORA-xxxxx (166)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (15)
- ORACLE 21C (3)
- Oracle 23ai (8)
- Oracle ASM (69)
- Oracle Bug (8)
- Oracle RAC (54)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (29)
- Oracle备份恢复 (611)
- Oracle安装升级 (101)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (86)
- PostgreSQL (33)
- pdu工具 (7)
- PostgreSQL恢复 (11)
- SQL Server (32)
- SQL Server恢复 (13)
- TimesTen (7)
- 达梦数据库 (3)
- 达梦恢复 (1)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (42)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (25)
-
最近发表
- Oracle数据块编辑工具( Oracle Block Editor Tool)-obet
- Oracle坏块修复工具:Patch_blk
- ORA-01172 ORA-01151故障处理
- C_OBJ#_INTCOL#坏块导致数据库无法open故障处理
- ORA-600 kkkicreatecgmap:!efn3
- Oracle 19c 202510补丁(RUs+OJVM)-19.29
- 记录一次raid恢复之后数据库故障处理(ora-01200,ORA-26101,ORA-600)
- nbu备份文件img格式直接rman恢复
- ORA-600 kokasgi1故障处理(sys被重命名)
- Patch_SCN for Linux 功能完善
- ORA-600 2662错误处理-202510
- system表空间丢失部分文件恢复
- arm环境vg损坏mysql数据库恢复
- redhat系列7/8进入单用户模式
- Failed to open \EFI\redhat\grubx64.efi – Not Found 故障处理
- 11.2.0.4升级到19c详细操作过程
- Postgres数据库truncate表无有效备份恢复
- 一次幸运的ORA-07445 kdxlin故障恢复
- ORA-704 ORA-604 ORA-1426故障分析处理
- ORA-600 4194引起SMON encountered 100 out of maximum 100 non-fatal internal errors故障
标签归档:mount
asm disk误设置pvid导致asm diskgroup无法mount恢复
有朋友找到我说他们把以前存储到AIX直连的存储切换为含光纤交换机的存储网络后,RAC无法启动,让我给予支持.通过分析是由于换链路之后开始磁盘顺序不对,维护人员对其asm disk 设置了pvid,导致asm 磁盘组无法正常mount,从而使得含votedisk的dg的asm disk无法正常访问,从而RAC的cssd进程无法启动,同样数据文件的磁盘组也无法mount,通过kfed修复成功,实现数据0丢失.
平台版本信息(2节点RAC)
$ sqlplus -v SQL*Plus: Release 11.2.0.4.0 Production $ uname -a AIX db2 1 7 00F9733E4C00
GI日志报错信息
2014-12-20 16:44:08.769: [ohasd(6946818)]CRS-2769:Unable to failover resource 'ora.diskmon'. 2014-12-20 16:44:11.775: [cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log 2014-12-20 16:44:26.791: [cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; 、Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log 2014-12-20 16:44:41.812: [cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
从这里可以看出来是由于RAC启动过程中无法获得votedisk使得其无法正常启动,通过分析日志找出来votedisk相关磁盘
2014-12-15 17:36:15.424: [cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk4; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log 2014-12-15 17:36:15.433: [cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk5; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log 2014-12-15 17:36:15.445: [cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk6; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
从这里可以知道rhdisk4,5,6为votedisk对应磁盘,使用kfed查看磁盘头信息
$kfed read /dev/rhdisk4
kfbh.endian: 201 ; 0x000: 0xc9
kfbh.hard: 194 ; 0x001: 0xc2
kfbh.type: 212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt: 193 ; 0x003: 0xc1
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
1102BEE00 C9C2D4C1 00000000 00000000 00000000 [................]
1102BEE10 00000000 00000000 00000000 00000000 [................]
Repeat 6 times
1102BEE80 00F9733D 67553E0A 00000000 00000000 [..s=gU>.........]
1102BEE90 00000000 00000000 00000000 00000000 [................]
Repeat 246 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]
$kfed read /dev/rhdisk4 blkn=1
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt: 2 ; 0x003: 0x02
kfbh.block.blk: 1 ; 0x004: blk=1
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3883664132 ; 0x00c: 0xe77c0304
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdfsb.aunum: 0 ; 0x000: 0x00000000
kfdfsb.max: 254 ; 0x004: 0x00fe
kfdfsb.cnt: 23 ; 0x006: 0x0017
kfdfsb.bound: 0 ; 0x008: 0x0000
kfdfsb.flag: 1 ; 0x00a: B=1
kfdfsb.ub1spare: 0 ; 0x00b: 0x00
kfdfsb.spare[0]: 0 ; 0x00c: 0x00000000
kfdfsb.spare[1]: 0 ; 0x010: 0x00000000
kfdfsb.spare[2]: 0 ; 0x014: 0x00000000
kfdfse[0].fse: 119 ; 0x018: FREE=0x7 FRAG=0x7
kfdfse[1].fse: 16 ; 0x019: FREE=0x0 FRAG=0x1
…………
$kfed read /dev/rhdisk4 blkn=510
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 254 ; 0x004: blk=254
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3460116983 ; 0x00c: 0xce3d31f7
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: CRS_0000 ; 0x028: length=8
kfdhdb.grpname: CRS ; 0x048: length=3
kfdhdb.fgname: CRS_0000 ; 0x068: length=8
…………
由上述分析可以基本上确定是asm disk header 被破坏,进一步分析破坏原因
[db2/dev#]lspv hdisk0 00f9733ef7cf27e9 rootvg active hdisk1 00f9733e21b953e6 rootvg active hdisk2 00f9733e21b97a83 appvg active hdisk3 00f9733e21b98434 appvg active hdisk4 00f9733d67553e0a None hdisk5 00f9733d67553f31 None hdisk6 00f9733d67554011 None hdisk7 00f9733d67554165 None hdisk8 00f9733d675541e5 None hdisk9 00f9733d675542e4 None hdisk10 none None [db2/dev#]ls -l rhdisk* crw------- 2 root system 24, 1 Oct 18 11:45 rhdisk0 crw------- 1 root system 24, 3 Oct 18 13:27 rhdisk1 crw------- 1 root system 24, 5 Dec 20 20:02 rhdisk10 crw------- 1 root system 24, 2 Oct 18 13:32 rhdisk2 crw------- 1 root system 24, 0 Oct 18 13:32 rhdisk3 crw-rw---- 1 grid asmadmin 24, 8 Dec 20 20:02 rhdisk4 crw-rw---- 1 grid asmadmin 24, 9 Dec 20 20:02 rhdisk5 crw-rw---- 1 grid asmadmin 24, 10 Dec 20 20:02 rhdisk6 crw-rw---- 1 grid asmadmin 24, 4 Dec 20 20:02 rhdisk7 crw-rw---- 1 grid asmadmin 24, 6 Dec 20 20:02 rhdisk8 crw-rw---- 1 grid asmadmin 24, 7 Dec 20 20:02 rhdisk9
从这里基本上可以看出来,是由于磁盘头被重写了pvid,导致asm disk header 被破坏.进一步分析asm log,确定哪些磁盘被用作asm disk
SQL> CREATE DISKGROUP CRS NORMAL REDUNDANCY DISK '/dev/rhdisk4', '/dev/rhdisk5', '/dev/rhdisk6' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */ NOTE: Assigning number (1,0) to disk (/dev/rhdisk4) NOTE: Assigning number (1,1) to disk (/dev/rhdisk5) NOTE: Assigning number (1,2) to disk (/dev/rhdisk6) NOTE: initializing header on grp 1 disk CRS_0000 NOTE: initializing header on grp 1 disk CRS_0001 NOTE: initializing header on grp 1 disk CRS_0002 SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY DISK '/dev/rhdisk9' SIZE 614400M ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */ NOTE: Assigning number (2,0) to disk (/dev/rhdisk9) NOTE: initializing header on grp 2 disk DATA_0000 SQL> CREATE DISKGROUP FBA EXTERNAL REDUNDANCY DISK '/dev/rhdisk8' SIZE 204800M ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */ NOTE: Assigning number (3,0) to disk (/dev/rhdisk8) NOTE: initializing header on grp 3 disk FBA_0000 SQL> CREATE DISKGROUP ARCH EXTERNAL REDUNDANCY DISK '/dev/rhdisk7' SIZE 102400M ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */ NOTE: Assigning number (4,0) to disk (/dev/rhdisk7) NOTE: initializing header on grp 4 disk ARCH_0000
这里可以确定asm disk为rhdisk[4-9],通过kfed分析全部和rhdisk4一样的问题,也符合lspv查询出来的结果,使用kfed repair修复asm disk header后
SQL> alter diskgroup data mount;
Diskgroup altered.
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> alter diskgroup arch mount;
Diskgroup altered.
SQL> alter diskgroup crs mount;
Diskgroup altered.
SQL> select group_number,disk_number,path from v$asm_disk;
GROUP_NUMBER DISK_NUMBER PATH
------------ ----------- --------------------------------------------------
2 0 /dev/rhdisk4
2 1 /dev/rhdisk5
2 2 /dev/rhdisk6
1 0 /dev/rhdisk7
4 0 /dev/rhdisk8
3 0 /dev/rhdisk9
6 rows selected.
SQL> select group_number,name from v$asm_diskgroup;
GROUP_NUMBER NAME
------------ ------------------------------------------------------------
1 ARCH
2 CRS
3 DATA
4 FBA
这里证明通过kfed对磁盘头的修复,asm磁盘组已经全部mount成功,GI状态也恢复正常
[db2/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.CRS.dg
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.DATA.dg
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.FBA.dg
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.LISTENER.lsnr
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.asm
ONLINE ONLINE db1 Started
ONLINE ONLINE db2 Started
ora.gsd
OFFLINE OFFLINE db1
OFFLINE OFFLINE db2
ora.net1.network
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.ons
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.registry.acfs
ONLINE ONLINE db1
ONLINE ONLINE db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE db1
ora.cvu
1 ONLINE ONLINE db1
ora.db1.vip
1 ONLINE ONLINE db1
ora.db2.vip
1 ONLINE ONLINE db2
ora.nkora.db
1 ONLINE ONLINE db1 Open
2 ONLINE ONLINE db2 Open
ora.oc4j
1 ONLINE ONLINE db1
ora.scan1.vip
1 ONLINE ONLINE db1
这里忽略了一个问题,在修复磁盘头之前没有清除pvid,导致磁盘头修复后,pvid依然存储在odm中
[db2/dev#]lspv hdisk0 00f9733ef7cf27e9 rootvg active hdisk1 00f9733e21b953e6 rootvg active hdisk2 00f9733e21b97a83 appvg active hdisk3 00f9733e21b98434 appvg active hdisk4 00f9733d67553e0a None hdisk5 00f9733d67553f31 None hdisk6 00f9733d67554011 None hdisk7 00f9733d67554165 None hdisk8 00f9733d675541e5 None hdisk9 00f9733d675542e4 None hdisk10 none None
通过分析发现fba磁盘组中无任何记录,使用该磁盘组进行直接清除pvid测试
$ sqlplus / as sysasm SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:13:31 2014 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> alter diskgroup fba dismount; Diskgroup altered. SQL> exit Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options $ exit You have mail in /usr/spool/mail/root [db2/#]chdev -l hdisk8 -a pv=clear hdisk8 changed [db2/#]lspv hdisk0 00f9733ef7cf27e9 rootvg active hdisk1 00f9733e21b953e6 rootvg active hdisk2 00f9733e21b97a83 appvg active hdisk3 00f9733e21b98434 appvg active hdisk4 00f9733d67553e0a None hdisk5 00f9733d67553f31 None hdisk6 00f9733d67554011 None hdisk7 00f9733d67554165 None hdisk8 none None hdisk9 00f9733d675542e4 None hdisk10 none None [db2/#]su - grid $ sqlplus / as sysasm SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:15:19 2014 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> alter diskgroup fba mount; Diskgroup altered. SQL> exit Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options
通过测试直接清除pvid asm 磁盘头依然工作正常,关闭GI,使用chdev清除hdisk[4-9]所有pvid,启动GI一切正常
[db1/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.CRS.dg
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.DATA.dg
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.FBA.dg
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.LISTENER.lsnr
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.asm
ONLINE ONLINE db1 Started
ONLINE ONLINE db2 Started
ora.gsd
OFFLINE OFFLINE db1
OFFLINE OFFLINE db2
ora.net1.network
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.ons
ONLINE ONLINE db1
ONLINE ONLINE db2
ora.registry.acfs
ONLINE ONLINE db1
ONLINE ONLINE db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE db1
ora.cvu
1 ONLINE ONLINE db1
ora.db1.vip
1 ONLINE ONLINE db1
ora.db2.vip
1 ONLINE ONLINE db2
ora.nkora.db
1 ONLINE ONLINE db1 Open
2 ONLINE ONLINE db2 Open
ora.oc4j
1 ONLINE ONLINE db1
ora.scan1.vip
1 ONLINE ONLINE db1
[db1/#]lspv
hdisk0 00f9733df7c7a9db rootvg active
hdisk1 00f9733d21dad8fe rootvg active
hdisk2 00f9733d21dbd08b appvg active
hdisk3 00f9733d21dbd2ab appvg active
hdisk4 none None
hdisk5 none None
hdisk6 none None
hdisk7 none None
hdisk8 none None
hdisk9 none None
hdisk10 none None
至此设置pvid导致asm disk header损坏的asm 恢复正常,实现数据0丢失。
温馨提示:aix asm disk磁盘中不能设置pvid,否则将会导致asm disk header 损坏,无法正常mount
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971 Q Q:107644445
E-Mail:dba@xifenfei.com
发表在 Oracle ASM, 非常规恢复
标签为 asm, asm不能mount, asm恢复, kfbtTraverseBlock, KFED-00322, mount, ORA-15042, pvid
一条评论

加我QQ(107644445)
