标签归档:asm mount

ORA-600 kfrValAcd30 恢复

有客户由于存储控制器损坏,修复控制器之后,asm无法正常mount,报ORA-600 kfrValAcd30错误,让我们提供技术支持
kfrValAcd30


asm alert日志报错

Wed Apr 03 16:50:57 2019
SQL> alter diskgroup data mount 
NOTE: cache registered group DATA number=1 incarn=0x14248741
NOTE: cache began mount (first) of group DATA number=1 incarn=0x14248741
NOTE: Assigning number (1,0) to disk (ORCL:DATAVOL1)
Wed Apr 03 16:51:03 2019
NOTE: start heartbeating (grp 1)
kfdp_query(DATA): 15 
kfdp_queryBg(): 15 
NOTE: cache opening disk 0 of grp 1: DATAVOL1 label:DATAVOL1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (first) external redundancy group 1/0x14248741 (DATA)
Wed Apr 03 16:51:04 2019
* allocate domain 1, invalid = TRUE 
Wed Apr 03 16:51:04 2019
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=27.2697 group=1 (DATA)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_15951.trc  (incident=23394):
ORA-00600: internal error code, arguments: [kfrValAcd30], [DATA], [1], [27], [2697], [28], [2697], [], [], [], [], []
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM2/incident/incdir_23394/+ASM2_ora_15951_i23394.trc
Abort recovery for domain 1
NOTE: crash recovery signalled OER-600
ERROR: ORA-600 signalled during mount of diskgroup DATA
ORA-00600: internal error code, arguments: [kfrValAcd30], [DATA], [1], [27], [2697], [28], [2697], [], [], [], [], []
ERROR: alter diskgroup data mount
NOTE: cache dismounting (clean) group 1/0x14248741 (DATA) 
NOTE: lgwr not being msg'd to dismount
freeing rdom 1
Wed Apr 03 16:51:05 2019
Sweep [inc][23394]: completed
Sweep [inc2][23394]: completed
Wed Apr 03 16:51:05 2019
Trace dumping is performing id=[cdmp_20190403165105]
NOTE: detached from domain 1
NOTE: cache dismounted group 1/0x14248741 (DATA) 
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0x14248741
kfdp_dismount(): 16 
kfdp_dismountBg(): 16 
NOTE: De-assigning number (1,0) from disk (ORCL:DATAVOL1)
ERROR: diskgroup DATA was not mounted
NOTE: cache deleting context for group DATA 1/337938241

mos相关记录
参考:ORA-600 [KFRVALACD30] in ASM (Doc ID 2123013.1)
kfrValAcd30-mos


ORA-00600: internal error code, arguments: [kfrValAcd30]可能是bug或者硬件故障导致.基于客户的情况,最大可能就是由于硬件故障导致asm 磁盘组的acd无法正常进行,从而无法mount成功.如果运气好,通过kfed相关修复可以正常mount成功,运气不好可以通过底层进行恢复数据文件,从而最大限度恢复数据.

发表在 Oracle ASM, 非常规恢复 | 标签为 , , | 评论关闭

hp平台rdisk中磁盘丢失导致asm启动报ORA-15042恢复

有老朋友找到我,说一个客户的数据库异常,问题是asm无法正常mount,提示是缺少两块磁盘.问我是否可以恢复.因为是内网环境,通过他那边发过来的零零散散的信息,大概分析如下
asm alert日志报错
ERROR: diskgroup DGROUP1 was not mounted

Fri Aug 12 16:03:12 EAT 2016
SQL> alter diskgroup DGROUP1 mount 
Fri Aug 12 16:03:12 EAT 2016
NOTE: cache registered group DGROUP1 number=1 incarn=0xf6781b5c
Fri Aug 12 16:03:12 EAT 2016
NOTE: Hbeat: instance first (grp 1)
Fri Aug 12 16:03:16 EAT 2016
NOTE: start heartbeating (grp 1)
Fri Aug 12 16:03:16 EAT 2016
NOTE: cache dismounting group 1/0xF6781B5C (DGROUP1) 
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGROUP1 was not mounted

前台尝试mount asm 磁盘组报错ORA-15042
ORA-15042


从这里可以明显的看出来asm 磁盘组无法正常mount,是由于缺少asm disk 15,16.如果想恢复asm,最好的方法就是找出来这两个磁盘.通过kfed对现在的磁盘进行分析,最后我们发现asm disk 14对应的磁盘为disk160,,asm disk 17对应的disk163,根据第一感觉很可能是disk161和disk161两块盘异常,让机房检查硬件无任何告警

OS层面分析
省略和本次结论无关的记录

ls -l /dev/rdisk
crw-rw----   1 oracle     dba         13 0x000070 Jan  1  2016 disk160
crw-rw----   1 oracle     dba         13 0x000073 Jan  1  2016 disk163

ls -l /dev/disk
brw-r-----   1 bin        sys          1 0x000070 Jan 13  2015 disk160
brw-r-----   1 bin        sys          1 0x000071 Jan 13  2015 disk161
brw-r-----   1 bin        sys          1 0x000072 Jan 13  2015 disk162
brw-r-----   1 bin        sys          1 0x000073 Jan 13  2015 disk163

这里我们发现在hp unix中/dev/disk下面磁盘都存在,但是/dev/rdisk下面丢失,通过ioscan相关命令继续分析

ioscan -fNnkC disk
disk    160  64000/0xfa00/0x70  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk160   /dev/rdisk/disk160
disk    161  64000/0xfa00/0x71  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk161
disk    162  64000/0xfa00/0x72  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk162
disk    163  64000/0xfa00/0x73  esdisk   CLAIMED     DEVICE       HP      OPEN-V
                      /dev/disk/disk163   /dev/rdisk/disk163

这里我们基本上可以确定是/dev/rdisk下面的盘发生丢失.进一步分析,因为rdisk是聚合后的盘符,那我们分析聚合前的盘符是否正常

ioscan -m dsf
/dev/rdisk/disk160       /dev/rdsk/c29t12d4
                         /dev/rdsk/c28t12d4
/dev/rdisk/disk163       /dev/rdsk/c29t12d7
                         /dev/rdsk/c28t12d7

ls -l /dev/rdsk
crw-r-----   1 bin        sys        188 0x1dc000 Apr 22  2014 c29t12d0
crw-r-----   1 bin        sys        188 0x1dc100 Apr 22  2014 c29t12d1
crw-r-----   1 bin        sys        188 0x1dc300 Jan 13  2015 c29t12d3
crw-r-----   1 bin        sys        188 0x1dc400 Jan 13  2015 c29t12d4
crw-r-----   1 bin        sys        188 0x1dc500 Jan 13  2015 c29t12d5
crw-r-----   1 bin        sys        188 0x1dc600 Jan 13  2015 c29t12d6
crw-r-----   1 bin        sys        188 0x1dc700 Jan 13  2015 c29t12d7

crw-r-----   1 bin        sys        188 0x1cc100 Apr 22  2014 c28t12d1
crw-r-----   1 bin        sys        188 0x1cc300 Jan 13  2015 c28t12d3
crw-r-----   1 bin        sys        188 0x1cc400 Jan 13  2015 c28t12d4
crw-r-----   1 bin        sys        188 0x1cc500 Jan 13  2015 c28t12d5
crw-r-----   1 bin        sys        188 0x1cc600 Jan 13  2015 c28t12d6
crw-r-----   1 bin        sys        188 0x1cc700 Jan 13  2015 c28t12d7

通过这里我们基本上可以大概判断出来/dev/rdsk/c28t12d5,/dev/rdsk/c28t12d6,/dev/rdsk/c29t12d5,/dev/rdsk/c29t12d6就是我们需要找的/dev/rdisk/disk161和disk162的聚合之前的盘符.也就是说,现在我们判断只有/dev/rdisk下面的字符设备有问题,其他均正常.

通过系统命令修复异常

insf -e -H 64000/0xfa00/0x71
insf -e -H 64000/0xfa00/0x72

hp-asm-disk


现在已经可以正常看到/dev/rdisk/disk161和/dev/rdisk/disk162盘符,初步判断,os层面盘符已经恢复正常.修改磁盘权限和所属组

chmod 660 /dev/rdisk/disk161
chmod 660 /dev/rdisk/disk162
chown oracle:dba /dev/rdisk/disk161
chown oracle:dba /dev/rdisk/disk162

正常启动asm,mount磁盘组,open数据库
asm-mount


这次的恢复,主要是从操作系统层面判断解决问题,从而实现数据库完美恢复,数据0丢失.有类似恢复案例:分区无法识别导致asm diskgroup无法mount
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:13429648788    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

发表在 Oracle ASM, Unix | 标签为 , , | 评论关闭

oracle asm系列文章汇总

为了方便大家更容易的查看相关asm内容,今天(2016年7月28日)对asm的相关文章进行了汇总整理.如果有asm相关的其他问题,可以通过手机(13429648788)或者QQ(107644445)交流
远程访问ASM
ASMCMD常用命令
ASM简单管理(1)
ASM简单管理(2)
找回ASM中数据文件
bbed修改ASM中数据
ASM迁移至文件系统
普通库迁移至ASM存储
使用dd复制asm中文件
配置Oracle ASM磁盘
ASM中磁盘组权限设置
监控asm disk磁盘性能
create spfile to asm
pvid=yes导致asm无法mount—ASM恢复案例
asm数据文件迁移(os–>asm)
asm数据文件迁移(asm–>asm)
asm数据文件迁移(asm–>os)
通过ftp/http拷贝asm中文件
ASM DISK HEADER 备份与恢复
手工修复ASM DISK HEADER 异常
asm disk header 彻底损坏恢复—ASM恢复案例
ASM未正常启动,使用dd找回数据文件
ORACLE 12C ASM 新特性:共享密码文件
asm备份元数据之md_backup和md_restore
分区无法识别导致asm diskgroup无法mount—ASM恢复案例
使用losetup实现linux普通文件做asm disk
asm 加磁盘导致磁盘组损坏恢复—ASM恢复案例
asm磁盘头全部损坏数据0丢失恢复—ASM恢复案例
Oracle异常恢复前备份保护现场建议—ASM环境
多cpu环境中运行root.sh失败,asm报ORA-04031
asmlib异常报ORA-00600[kfklLibFetchNext00]
存储精简卷导致asm磁盘组异常
因asm sga_target设置不当导致11gr2 rac无法正常启动
asm disk误设置pvid导致asm diskgroup无法mount恢复—ASM恢复案例
oracle asm disk格式化恢复—格式化为ntfs文件系统—ASM恢复案例
aoracle asm disk格式化恢复—格式化为ext4文件系统—ASM恢复案例
使用_asm_allow_only_raw_disks实现普通文件做asm disk
ORACLE 12C RAC修改ocr/votedisk/asm spfile所在磁盘组名称
使用asm disk header 自动备份信息恢复异常asm disk header
分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例—ASM恢复案例
asm磁盘组操作不当导致数据文件丢失恢复—ASM恢复案例
ADHU(ASM Disk Header Utility)—asm disk header备份恢复工具
How to Get the Contents of an Spfile on ASM when ASM/GRID is down
ORA-15042: ASM disk “N” is missing from group number “M” 故障恢复—ASM恢复案例

发表在 Oracle ASM, 专题索引 | 标签为 , , , | 评论关闭