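Tag archive: ORA-15040
Exadata disk failure leaves a disk group unmountable: recovery (Oracle engineered system disk group recovery)
An Oracle Exadata customer was in the middle of replacing a failed disk when a second disk on a cell node failed. As a result, the datac1 disk group (normal redundancy) could no longer be mounted.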
Thu Jul 20 22:01:21 2023 SQL> alter diskgroup datac1 mount force NOTE: cache registered group DATAC1 number=1 incarn=0x0728ad12 NOTE: cache began mount (first) of group DATAC1 number=1 incarn=0x0728ad12 NOTE: Assigning number (1,35) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_11_dm01celadm03) NOTE: Assigning number (1,31) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_07_dm01celadm03) NOTE: Assigning number (1,24) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_00_dm01celadm03) NOTE: Assigning number (1,25) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_01_dm01celadm03) NOTE: Assigning number (1,27) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_03_dm01celadm03) NOTE: Assigning number (1,33) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_09_dm01celadm03) NOTE: Assigning number (1,30) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_06_dm01celadm03) NOTE: Assigning number (1,28) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_04_dm01celadm03) NOTE: Assigning number (1,26) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_02_dm01celadm03) NOTE: Assigning number (1,1) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_08_dm01celadm03) NOTE: Assigning number (1,34) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_10_dm01celadm03) NOTE: Assigning number (1,29) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_05_dm01celadm03) NOTE: Assigning number (1,3) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_07_dm01celadm02) NOTE: Assigning number (1,4) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_06_dm01celadm02) NOTE: Assigning number (1,5) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_00_dm01celadm02) NOTE: Assigning number (1,6) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_10_dm01celadm02) NOTE: Assigning number (1,7) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_08_dm01celadm02) NOTE: Assigning number (1,8) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_03_dm01celadm02) NOTE: Assigning number (1,9) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_11_dm01celadm02) NOTE: 
Assigning number (1,10) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_01_dm01celadm02) NOTE: Assigning number (1,11) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_04_dm01celadm02) NOTE: Assigning number (1,21) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_05_dm01celadm02) NOTE: Assigning number (1,43) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_02_dm01celadm02) NOTE: Assigning number (1,36) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_07_dm01celadm01) NOTE: Assigning number (1,37) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_09_dm01celadm01) NOTE: Assigning number (1,38) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_11_dm01celadm01) NOTE: Assigning number (1,0) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_08_dm01celadm01) NOTE: Assigning number (1,40) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_00_dm01celadm01) NOTE: Assigning number (1,41) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_03_dm01celadm01) NOTE: Assigning number (1,42) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_06_dm01celadm01) NOTE: Assigning number (1,44) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_05_dm01celadm01) NOTE: Assigning number (1,45) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_01_dm01celadm01) NOTE: Assigning number (1,46) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_02_dm01celadm01) NOTE: Assigning number (1,47) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_10_dm01celadm01) NOTE: Assigning number (1,2) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_04_dm01celadm01) Thu Jul 20 22:01:28 2023 NOTE: GMON heartbeating for grp 1 GMON querying group 1 at 450 for pid 30, osid 171838 NOTE: Assigning number (1,32) to disk () NOTE: Assigning number (1,39) to disk () GMON querying group 1 at 451 for pid 30, osid 171838 NOTE: cache closing disk 32 of grp 1: (not open) NOTE: process _user171838_+asm1 (171838) initiating offline of disk 39.3915945266 () with mask 0x7e[0x7f] in group 1 NOTE: initiating PST update: grp = 1, dsk = 39/0xe9689532, mask = 0x6a, op = clear GMON 
updating disk modes for group 1 at 452 for pid 30, osid 171838 NOTE: cache closing disk 32 of grp 1: (not open) ERROR: Disk 39 cannot be offlined, since all the disks [39, 32] with mirrored data would be offline. ERROR: too many offline disks in PST (grp 1) WARNING: Offline for disk in mode 0x7f failed. NOTE: cache dismounting (not clean) group 1/0x0728AD12 (DATAC1) NOTE: messaging CKPT to quiesce pins Unix process pid: 171838, image: oracle@dm01dbadm01.gyzq.cn (TNS V1-V3) NOTE: dbwr not being msg'd to dismount NOTE: lgwr not being msg'd to dismount NOTE: cache dismounted group 1/0x0728AD12 (DATAC1) NOTE: cache ending mount (fail) of group DATAC1 number=1 incarn=0x0728ad12 NOTE: cache deleting context for group DATAC1 1/0x0728ad12 NOTE: cache closing disk 32 of grp 1: (not open) GMON dismounting group 1 at 453 for pid 30, osid 171838 NOTE: Disk DATAC1_CD_08_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_08_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_04_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_07_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_06_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_00_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_10_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_08_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_03_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_11_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_01_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_04_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_05_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_00_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_01_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk 
DATAC1_CD_02_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_03_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_04_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_05_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_06_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_07_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk in mode 0x1 marked for de-assignment NOTE: Disk DATAC1_CD_09_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_10_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_11_DM01CELADM03 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_07_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_09_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_11_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_00_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_03_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_06_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_02_DM01CELADM02 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_05_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_01_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_02_DM01CELADM01 in mode 0x7f marked for de-assignment NOTE: Disk DATAC1_CD_10_DM01CELADM01 in mode 0x7f marked for de-assignment ERROR: diskgroup DATAC1 was not mounted ORA-15032: not all alterations performed ORA-15040: diskgroup is incomplete ORA-15066: offlining disk "39" in group "DATAC1" may result in a data loss ORA-15042: ASM disk "39" is missing from group number "1" ORA-15042: ASM disk "32" is missing from group number "1" ERROR: alter diskgroup datac1 mount force
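Root cause: ASM disk 32 had already failed and was being replaced (the rebalance had not completed) when ASM disk 39 also failed. Some data on these two disks was mirrored only between the two of them, so the disk group could not be mounted.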
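The failure mode can be sketched with a toy model. This is a minimal illustration, not ASM's actual allocation logic: the extent map below is hypothetical, and only the disk numbers 32 and 39 come from the log. An extent is lost only when every disk holding one of its copies has failed.

```python
def lost_extents(extent_map, failed_disks):
    """Return extent ids whose mirror copies all sit on failed disks."""
    failed = set(failed_disks)
    return sorted(ext for ext, copies in extent_map.items()
                  if set(copies) <= failed)

# Hypothetical layout: extent id -> the two disks holding its copies.
extent_map = {0: (32, 39), 1: (32, 7), 2: (39, 11), 3: (5, 24)}

print(lost_extents(extent_map, [32]))      # one failed disk: nothing is lost
print(lost_extents(extent_map, [32, 39]))  # both partners failed: extent 0 is lost
```

With one disk down, every extent still has a surviving copy. Once both disks of a mirrored pair are gone, ASM cannot offline the second disk, which matches the "Disk 39 cannot be offlined" error in the log.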
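Checking the celldisk and griddisk status on the cell nodes confirmed that the underlying physical disks were damaged.
In this situation both copies of some normal-redundancy data are partly lost, so the data cannot be recovered directly. We recovered at the underlying disk level (see an earlier Oracle Exadata case: "Oracle Exadata bad disk leaves disk group unmountable: recovery"), after which the data was restored fairly smoothly with zero loss of business data.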
SQL> alter diskgroup datac1 mount;
Diskgroup altered.
SQL> alter diskgroup datac1 check all;
Diskgroup altered.
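Before the actual recovery, the customer had already tried various things on their own: creating a new mirror disk and then inserting a new disk, forcing the disk group up, dropping the failed disks, and so on. This damaged the original failure scene and made the recovery harder, but we were able to compensate for it and achieved the expected result (zero loss of business data).
Posted in: Oracle Backup and Recovery
Tags: exadata mount, exadata bad disk recovery, exadata recovery, exadata disk group recovery, ORA-15040, ORA-15042, ORA-15066, xd bad disk recovery, xd recovery, engineered system data recovery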
Database recovery after fdisk partitioning damaged an ASM disk
Attempting to mount the data disk group:
SQL> alter diskgroup DATADG mount
NOTE: cache registered group DATADG number=1 incarn=0xbc43fafd
NOTE: cache began mount (first) of group DATADG number=1 incarn=0xbc43fafd
NOTE: Assigning number (1,0) to disk (/dev/raw/raw2)
Thu Jun 02 10:14:33 2022
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 27 for pid 27, osid 3853
NOTE: Assigning number (1,1) to disk ()
GMON querying group 1 at 28 for pid 27, osid 3853
NOTE: cache dismounting (clean) group 1/0xBC43FAFD (DATADG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 3853, image: oracle@node1 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xBC43FAFD (DATADG)
NOTE: cache ending mount (fail) of group DATADG number=1 incarn=0xbc43fafd
NOTE: cache deleting context for group DATADG 1/0xbc43fafd
GMON dismounting group 1 at 29 for pid 27, osid 3853
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
ERROR: diskgroup DATADG was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing from group number "1"
ERROR: alter diskgroup DATADG mount
Thu Jun 02 10:14:33 2022
ASM Health Checker found 1 new failures
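When the alert log is long, the missing disk numbers can be pulled out mechanically. The following is a small sketch that only assumes the ORA-15042 message format seen in the logs on this page; the function name `missing_disks` is made up for illustration.

```python
import re

# Matches lines such as: ORA-15042: ASM disk "1" is missing from group number "1"
MISSING_DISK = re.compile(
    r'ORA-15042: ASM disk "(\d+)" is missing from group number "(\d+)"')

def missing_disks(alert_text):
    """Return sorted (group_number, disk_number) pairs reported as missing."""
    return sorted({(int(group), int(disk))
                   for disk, group in MISSING_DISK.findall(alert_text)})

sample = ('ORA-15040: diskgroup is incomplete '
          'ORA-15042: ASM disk "1" is missing from group number "1" '
          'ERROR: alter diskgroup DATADG mount')
print(missing_disks(sample))
```

The regular expression does not depend on line breaks, so it works on flattened log text as well as on the original file.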
The error is clear: the disk with disk number 1 in datadg is missing. Check the disks with fdisk:
Disk /dev/sdb: 42.9 GB, 42949672960 bytes 64 heads, 32 sectors/track, 40960 cylinders Units = cylinders of 2048 * 512 = 1048576 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x0006c2be Device Boot Start End Blocks Id System Disk /dev/sda: 53.7 GB, 53687091200 bytes 64 heads, 32 sectors/track, 51200 cylinders Units = cylinders of 2048 * 512 = 1048576 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00061443 Device Boot Start End Blocks Id System /dev/sda1 * 2 2049 2097152 83 Linux Partition 1 does not end on cylinder boundary. /dev/sda2 2050 10241 8388608 82 Linux swap / Solaris Partition 2 does not end on cylinder boundary. /dev/sda3 10242 12289 2097152 83 Linux Partition 3 does not end on cylinder boundary. /dev/sda4 12290 51200 39844864 5 Extended Partition 4 does not end on cylinder boundary. /dev/sda5 12291 14338 2097152 83 Linux /dev/sda6 14340 50178 36699136 83 Linux /dev/sda7 50180 51200 1045504 83 Linux Disk /dev/sdc: 214.7 GB, 214748364800 bytes 255 heads, 63 sectors/track, 26108 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x1b3fba6b Device Boot Start End Blocks Id System /dev/sdc1 1 1045 8393931 83 Linux /dev/sdc2 1046 26108 201318547+ 83 Linux Disk /dev/sdd: 536.9 GB, 536870912000 bytes 255 heads, 63 sectors/track, 65270 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x4c63ecad Device Boot Start End Blocks Id System /dev/sdd1 1 65270 524281243+ 83 Linux Disk /dev/sde: 536.9 GB, 536870912000 bytes 255 heads, 63 sectors/track, 65270 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Sector size (logical/physical): 512 bytes / 
512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00000000 Disk /dev/sdf: 536.9 GB, 536870912000 bytes 255 heads, 63 sectors/track, 65270 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00000000
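According to the customer, the affected disk should be a 500G disk, and sdb carries a partition table. kfed analysis confirmed that sdc1 is the OCR disk and sdc2 is one of the datadg disks. The other datadg disk must be one of sdd, sde, or sdf. kfed showed that neither sde nor sdf can be an ASM disk (one holds a file system, the other is a completely unused blank disk). So if the datadg disk has not been lost altogether, it must be sdd. We copied the first 100M of the disk with dd and analyzed the copy with kfed to confirm: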
E:\TEMP\xff>kfed read sdd.dd kfbh.endian: 0 ; 0x000: 0x00 kfbh.hard: 0 ; 0x001: 0x00 kfbh.type: 0 ; 0x002: KFBTYP_INVALID kfbh.datfmt: 0 ; 0x003: 0x00 kfbh.block.blk: 0 ; 0x004: blk=0 kfbh.block.obj: 0 ; 0x008: file=0 kfbh.check: 0 ; 0x00c: 0x00000000 kfbh.fcn.base: 0 ; 0x010: 0x00000000 kfbh.fcn.wrap: 0 ; 0x014: 0x00000000 kfbh.spare1: 0 ; 0x018: 0x00000000 kfbh.spare2: 0 ; 0x01c: 0x00000000 006648400 00000000 00000000 00000000 00000000 [................] Repeat 26 times 0066485B0 00000000 00000000 4C63ECAD 01000000 [..........cL....] 0066485C0 FE830001 003FFFFF CB370000 00003E7F [......?...7..>..] 0066485D0 00000000 00000000 00000000 00000000 [................] Repeat 1 times 0066485F0 00000000 00000000 00000000 AA550000 [..............U.] 006648600 00000000 00000000 00000000 00000000 [................] Repeat 223 times KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0] E:\TEMP\xff>kfed read sdd1.dd kfbh.endian: 0 ; 0x000: 0x00 kfbh.hard: 0 ; 0x001: 0x00 kfbh.type: 0 ; 0x002: KFBTYP_INVALID kfbh.datfmt: 0 ; 0x003: 0x00 kfbh.block.blk: 0 ; 0x004: blk=0 kfbh.block.obj: 0 ; 0x008: file=0 kfbh.check: 0 ; 0x00c: 0x00000000 kfbh.fcn.base: 0 ; 0x010: 0x00000000 kfbh.fcn.wrap: 0 ; 0x014: 0x00000000 kfbh.spare1: 0 ; 0x018: 0x00000000 kfbh.spare2: 0 ; 0x01c: 0x00000000 006768400 00000000 00000000 00000000 00000000 [................] Repeat 26 times 0067685B0 00000000 00000000 70D364B4 FE000000 [.........d.p....] 0067685C0 FE83FFFF D13FFFFF BB7603EB 00003A93 [......?...v..:..] 0067685D0 00000000 00000000 00000000 00000000 [................] Repeat 1 times 0067685F0 00000000 00000000 00000000 AA550000 [..............U.] 006768600 02038201 00000008 80000001 826037C1 [.............7`.] 006768EA0 00000079 00800105 0000007A 00800105 [y.......z.......] 006768EB0 0000007C 00800105 0000007D 00800105 [|.......}.......] 0067693C0 0000015C 00800105 0000015D 00800105 [\.......].......] 
0067693D0 0000015F 00800105 00000160 00800105 [_.......`.......] 0067693E0 00000161 00800105 00000163 00800105 [a.......c.......] 0067693F0 00000164 00800105 00000166 00800105 [d.......f.......] KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0] E:\TEMP\xff>kfed read sdd.dd blkn=1|more kfbh.endian: 1 ; 0x000: 0x01 kfbh.hard: 130 ; 0x001: 0x82 kfbh.type: 2 ; 0x002: KFBTYP_FREESPC kfbh.datfmt: 2 ; 0x003: 0x02 kfbh.block.blk: 1 ; 0x004: blk=1 kfbh.block.obj: 2147483649 ; 0x008: disk=1 kfbh.check: 2197087544 ; 0x00c: 0x82f4e538 kfbh.fcn.base: 616391 ; 0x010: 0x000967c7 kfbh.fcn.wrap: 0 ; 0x014: 0x00000000 kfbh.spare1: 0 ; 0x018: 0x00000000 kfbh.spare2: 0 ; 0x01c: 0x00000000 kfdfsb.aunum: 0 ; 0x000: 0x00000000 kfdfsb.max: 254 ; 0x004: 0x00fe kfdfsb.cnt: 254 ; 0x006: 0x00fe kfdfsb.bound: 0 ; 0x008: 0x0000 kfdfsb.flag: 1 ; 0x00a: B=1 kfdfsb.ub1spare: 0 ; 0x00b: 0x00 kfdfsb.spare[0]: 0 ; 0x00c: 0x00000000 kfdfsb.spare[1]: 0 ; 0x010: 0x00000000 kfdfsb.spare[2]: 0 ; 0x014: 0x00000000
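The same check can be scripted against the dd image. The sketch below assumes two layout facts: an ASM disk header carries the string ORCLDISK right after the 32-byte kfbh block header (kfed lists kfdhdb.driver.provstr at offset 0x000 of kfdhdb), and an MBR ends with the bytes 55 AA at offset 510 with the 32-bit disk identifier at offset 440. The function name `classify_block0` is hypothetical.

```python
import struct

def classify_block0(buf):
    """Inspect the first 512 bytes of a disk image for ASM and MBR markers."""
    if len(buf) < 512:
        raise ValueError("need at least one 512-byte sector")
    info = {
        "asm_header": buf[32:40] == b"ORCLDISK",         # provstr after kfbh
        "mbr_signature": buf[510:512] == b"\x55\xaa",    # MBR boot signature
        "disk_identifier": None,
    }
    if info["mbr_signature"]:
        # Little-endian 32-bit identifier, as printed by fdisk.
        info["disk_identifier"] = "0x%08x" % struct.unpack_from("<I", buf, 440)[0]
    return info

# Synthetic sector mimicking what fdisk left on sdd (identifier from the dump).
sector = bytearray(512)
struct.pack_into("<I", sector, 440, 0x4C63ECAD)
sector[510:512] = b"\x55\xaa"
print(classify_block0(bytes(sector)))
```

In the sdd.dd dump the word 4C63ECAD at offset 0x1B8 equals the "Disk identifier: 0x4c63ecad" that fdisk printed for /dev/sdd, and the sector ends in 55 AA, which is why block 0 reads as a partition table rather than an ASM header.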
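From the output above, it is almost certain that sdd used to be an ASM disk and was then partitioned with fdisk. On that basis, the disk group was repaired; the disk header after the repair reads: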
E:\TEMP\xff>kfed read sdd.ok kfbh.endian: 1 ; 0x000: 0x01 kfbh.hard: 130 ; 0x001: 0x82 kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD kfbh.datfmt: 1 ; 0x003: 0x01 kfbh.block.blk: 0 ; 0x004: blk=0 kfbh.block.obj: 2147483649 ; 0x008: disk=1 kfbh.check: 424926402 ; 0x00c: 0x1953dcc2 kfbh.fcn.base: 0 ; 0x010: 0x00000000 kfbh.fcn.wrap: 0 ; 0x014: 0x00000000 kfbh.spare1: 0 ; 0x018: 0x00000000 kfbh.spare2: 0 ; 0x01c: 0x00000000 kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8 kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000 kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000 kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000 kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000 kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000 kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000 kfdhdb.compat: 186646528 ; 0x020: 0x0b200000 kfdhdb.dsknum: 1 ; 0x024: 0x0001 kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER kfdhdb.dskname: DATADG_0001 ; 0x028: length=11 kfdhdb.grpname: DATADG ; 0x048: length=6 kfdhdb.fgname: DATADG_0001 ; 0x068: length=11 kfdhdb.capname: ; 0x088: length=0 kfdhdb.crestmp.hi: 33074858 ; 0x0a8: HOUR=0xa DAYS=0x15 MNTH=0xb YEAR=0x7e2 kfdhdb.crestmp.lo: 2375520256 ; 0x0ac: USEC=0x0 MSEC=0x1e4 SECS=0x19 MINS=0x23 kfdhdb.mntstmp.hi: 33074858 ; 0x0b0: HOUR=0xa DAYS=0x15 MNTH=0xb YEAR=0x7e2 kfdhdb.mntstmp.lo: 2375522304 ; 0x0b4: USEC=0x0 MSEC=0x1e6 SECS=0x19 MINS=0x23 kfdhdb.secsize: 512 ; 0x0b8: 0x0200 kfdhdb.blksize: 4096 ; 0x0ba: 0x1000 kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000 kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80 kfdhdb.dsksize: 512000 ; 0x0c4: 0x0007d000 kfdhdb.pmcnt: 6 ; 0x0c8: 0x00000006 kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001 kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002 kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000 kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000 kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000 kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000 kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000 kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000 kfdhdb.grpstmp.hi: 33072461 ; 0x0e4: 
HOUR=0xd DAYS=0xa MNTH=0x9 YEAR=0x7e2 kfdhdb.grpstmp.lo: 3452534784 ; 0x0e8: USEC=0x0 MSEC=0x260 SECS=0x1c MINS=0x33
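The packed timestamps in the header can be decoded by hand to sanity-check a repaired header. The bit layout below is inferred from the HOUR/DAYS/MNTH/YEAR and USEC/MSEC/SECS/MINS values that kfed prints next to the raw numbers, so treat it as an assumption rather than documented format; `decode_kfed_timestamp` is an illustrative name.

```python
from datetime import datetime

def decode_kfed_timestamp(hi, lo):
    """Decode a kfed hi/lo timestamp pair into a datetime (inferred layout)."""
    year = hi >> 14
    month = (hi >> 10) & 0xF
    day = (hi >> 5) & 0x1F
    hour = hi & 0x1F
    minute = (lo >> 26) & 0x3F
    second = (lo >> 20) & 0x3F
    msec = (lo >> 10) & 0x3FF
    usec = lo & 0x3FF
    return datetime(year, month, day, hour, minute, second, msec * 1000 + usec)

disk_created = decode_kfed_timestamp(33074858, 2375520256)   # kfdhdb.crestmp
group_created = decode_kfed_timestamp(33072461, 3452534784)  # kfdhdb.grpstmp
print(disk_created, group_created)

# dsksize (in AUs) times ausize gives the disk size in bytes.
print(512000 * 1048576)
```

The decoded values agree with kfed's own annotations (year 0x7e2 = 2018, month 0xb = 11, and so on). The size product is 536870912000 bytes, the same figure fdisk reports for /dev/sdd, which suggests the ASM disk occupied the whole device.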
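Finally, back up the database with rman and rebuild the disk group, achieving zero data loss.
Posted in: Oracle Backup and Recovery
Tags: asm disk partition recovery, asm fdisk, asm cannot mount, asm recovery, fdisk asm recovery, ORA-15040, ORA-15042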
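Recovery after an ASM disk was added to a VG
We received a recovery request from a customer: one disk of an Oracle ASM disk group had been added to a VG, the disk group could no longer be mounted, and the database could not start. After logging in remotely and analyzing the system, we found the following.
OS-level analysis
Shell history
The history shows fairly clearly that a disk was turned into a PV, added to the VG, and that 199G was then allocated to lv_home. LVM information at the OS level: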
-- View PV information
[root@xff1 ~]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               VolGroup
  PV Size               277.98 GiB / not usable 3.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              71161
  Free PE               0
  Allocated PE          71161
  PV UUID               F6QO3f-065n-mwTW-Xbq2-Xx2y-c8HD-Tkr7V7

  --- Physical volume ---
  PV Name               /dev/sdg     <---- the newly added disk
  VG Name               VolGroup
  PV Size               200.00 GiB / not usable 4.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              51199
  Free PE               255
  Allocated PE          50944
  PV UUID               i69vUG-nCIK-dtxL-FvpD-2WZd-bvLv-n7lwrb

[root@xff1 ~]# lvdisplay
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_root
  LV Name                lv_root
  VG Name                VolGroup
  LV UUID                JUNnkN-m4zq-D0gh-h42b-cUM1-Wh1q-ZMtQE4
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2017-07-19 20:08:47 +0800
  LV Status              available
  # open                 1
  LV Size                50.00 GiB
  Current LE             12800
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_home
  LV Name                lv_home
  VG Name                VolGroup
  LV UUID                eZTkLt-cNGX-371i-m8Bd-VdD9-q6Hz-wYDRIJ
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2017-07-19 20:08:54 +0800
  LV Status              available
  # open                 1
  LV Size                422.97 GiB   <----- the LV is now 422G, presumably the result of the 199G extension
  Current LE             108281
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_swap
  LV Name                lv_swap
  VG Name                VolGroup
  LV UUID                54P9ok-VpwO-zM68-hvwY-9GBf-89yb-8xQAMn
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2017-07-19 20:09:23 +0800
  LV Status              available
  # open                 1
  LV Size                4.00 GiB
  Current LE             1024
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

[root@xff1 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   50G  3.9G   43G   9% /
tmpfs                          63G  509M   63G   1% /dev/shm
/dev/sda1                     477M   44M  408M  10% /boot
/dev/mapper/VolGroup-lv_home  417G  226G  170G  58% /home   <---- 199G was added but only 170G is free, so at least about 30G has been written since the extension
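Given this, it is almost certain that sdg was added to VolGroup and allocated to lv_home, and that data has been written to it (/home has only 170G free, while lv_home was extended by 199G).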
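The arithmetic behind that estimate can be written out. This is a rough sketch using only the pvdisplay and df figures above; `gib_from_extents` is an illustrative helper, the df numbers are rounded, and the result is a lower bound on data written after the extension, not a measurement of what actually landed on sdg.

```python
def gib_from_extents(extents, pe_size_mib=4):
    """Convert a count of LVM physical extents to GiB (PE size from pvdisplay)."""
    return extents * pe_size_mib / 1024

extended_gib = gib_from_extents(50944)   # Allocated PE on /dev/sdg
free_gib = 170                           # Avail for /home in df -h
min_written_gib = extended_gib - free_gib

print(extended_gib, min_written_gib)     # prints: 199.0 29.0
```

The 50944 allocated extents on sdg account for exactly the 199G extension, and the gap to the free space is about 29G, in line with the "roughly 30G" estimate used in the analysis.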
ASM-level analysis
The ASM disk group cannot be mounted and reports that a disk is missing:
SQL> ALTER DISKGROUP DATA MOUNT /* asm agent *//* {1:12056:279} */
NOTE: cache registered group DATA number=1 incarn=0xa1dbff16
NOTE: cache began mount (first) of group DATA number=1 incarn=0xa1dbff16
NOTE: Assigning number (1,2) to disk (/dev/asmdisk3)
NOTE: Assigning number (1,1) to disk (/dev/asmdisk2)
Sat Apr 25 13:04:58 2020
ERROR: no read quorum in group: required 1, found 0 disks
NOTE: cache dismounting (clean) group 1/0xA1DBFF16 (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 81552, image: oracle@rac2db1 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xA1DBFF16 (DATA)
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0xa1dbff16
NOTE: cache deleting context for group DATA 1/0xa1dbff16
GMON dismounting group 1 at 19 for pid 30, osid 81552
NOTE: Disk DATA_0001 in mode 0x9 marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x9 marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15040: diskgroup is incomplete
ERROR: ALTER DISKGROUP DATA MOUNT /* asm agent *//* {1:12056:279} */
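The error is clear: the ASM disk header has been replaced by LVM metadata (because the ASM disk was added to the VG). Based on the earlier analysis, probably more than 30G of data was written to this disk. Using kfed to inspect an arbitrary AU confirms that it has been overwritten, which supports the initial assessment.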
root@xff1:/home/oracle11g$kfed read /dev/asmdisk1 aun=10000 kfbh.endian: 51 ; 0x000: 0x33 kfbh.hard: 55 ; 0x001: 0x37 kfbh.type: 32 ; 0x002: *** Unknown Enum *** kfbh.datfmt: 42 ; 0x003: 0x2a kfbh.block.blk: 1329801248 ; 0x004: blk=1329801248 kfbh.block.obj: 1128615502 ; 0x008: file=347726 kfbh.check: 1094999892 ; 0x00c: 0x41445f54 kfbh.fcn.base: 675103060 ; 0x010: 0x283d4154 kfbh.fcn.wrap: 1448232275 ; 0x014: 0x56524553 kfbh.spare1: 1598374729 ; 0x018: 0x5f454349 kfbh.spare2: 1162690894 ; 0x01c: 0x454d414e 7F7843EAD400 2A203733 4F432820 43454E4E 41445F54 [37 * (CONNECT_DA] 7F7843EAD410 283D4154 56524553 5F454349 454D414E [TA=(SERVICE_NAME] 7F7843EAD420 6361723D 29626432 44494328 5250283D [=rac2db)(CID=(PR] 7F7843EAD430 4152474F 3A443D4D 4341505C DFCF3153 [OGRAM=D:\PACS1..] 7F7843EAD440 B3BEB7BB 6369445C 65536D6F 72657672 [....\DicomServer] 7F7843EAD450 445C524D 6D6F6369 76726553 524D7265 [MR\DicomServerMR] 7F7843EAD460 6578652E 4F482829 573D5453 362D4E49 [.exe)(HOST=WIN-6] 7F7843EAD470 51414C38 54553645 28294A30 52455355 [8LAQE6UT0J)(USER] 7F7843EAD480 6D64413D 73696E69 74617274 2929726F [=Administrator))] 7F7843EAD490 202A2029 44444128 53534552 5250283D [) * (ADDRESS=(PR] 7F7843EAD4A0 434F544F 743D4C4F 28297063 54534F48 [OTOCOL=tcp)(HOST] 7F7843EAD4B0 2E30313D 2E303831 30332E31 4F502829 [=10.180.1.30)(PO] 7F7843EAD4C0 343D5452 37333539 2A202929 74736520 [RT=49537)) * est] 7F7843EAD4D0 696C6261 2A206873 63617220 20626432 [ablish * rac2db ] 7F7843EAD4E0 3231202A 0A343135 2D534E54 31353231 [* 12514.TNS-1251] 7F7843EAD4F0 54203A34 6C3A534E 65747369 2072656E [4: TNS:listener ] 7F7843EAD500 73656F64 746F6E20 72756320 746E6572 [does not current] 7F7843EAD510 6B20796C 20776F6E 7320666F 69767265 [ly know of servi] 7F7843EAD520 72206563 65757165 64657473 206E6920 [ce requested in ] 7F7843EAD530 6E6E6F63 20746365 63736564 74706972 [connect descript] ……………… 7F7843EAE300 6F636944 7265536D 4D726576 69445C52 [DicomServerMR\Di] 7F7843EAE310 536D6F63 65767265 2E524D72 29657865 
[comServerMR.exe)] 7F7843EAE320 534F4828 49573D54 4F302D4E 314B304A [(HOST=WIN-0OJ0K1] 7F7843EAE330 4955304E 55282954 3D524553 696D6441 [N0UIT)(USER=Admi] 7F7843EAE340 7473696E 6F746172 29292972 28202A20 [nistrator))) * (] 7F7843EAE350 52444441 3D535345 4F525028 4F434F54 [ADDRESS=(PROTOCO] 7F7843EAE360 63743D4C 48282970 3D54534F 312E3031 [L=tcp)(HOST=10.1] 7F7843EAE370 312E3038 2930332E 524F5028 35353D54 [80.1.30)(PORT=55] 7F7843EAE380 29383632 202A2029 61747365 73696C62 [268)) * establis] 7F7843EAE390 202A2068 32636172 2A206264 35323120 [h * rac2db * 125] 7F7843EAE3A0 540A3431 312D534E 34313532 4E54203A [14.TNS-12514: TN] 7F7843EAE3B0 696C3A53 6E657473 64207265 2073656F [S:listener does ] 7F7843EAE3C0 20746F6E 72727563 6C746E65 6E6B2079 [not currently kn] 7F7843EAE3D0 6F20776F 65732066 63697672 65722065 [ow of service re] 7F7843EAE3E0 73657571 20646574 63206E69 656E6E6F [quested in conne] 7F7843EAE3F0 64207463 72637365 6F747069 34320A72 [ct descriptor.24] KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][32]
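The hex columns in the dump above can be turned back into text by hand. kfed prints each 16-byte row as four 32-bit words in little-endian order, so reversing the bytes of every word restores the original byte stream. A minimal sketch, with `decode_kfed_row` as an illustrative name:

```python
def decode_kfed_row(words):
    """Turn one row of kfed hex words into printable ASCII ('.' for the rest)."""
    raw = b"".join(bytes.fromhex(word)[::-1] for word in words.split())
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

print(decode_kfed_row("2A203733 4F432820 43454E4E 41445F54"))
print(decode_kfed_row("283D4154 56524553 5F454349 454D414E"))
```

The two rows decode to "37 * (CONNECT_DA" and "TA=(SERVICE_NAME", the same text kfed shows in its right-hand column. This is listener log content, not ASM metadata.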
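The kfed output above shows that AU 10000 now contains listener.log entries written after the database failed (the database software is installed under /home), which is further proof that the disk was overwritten. The following output shows that sdg is asmdisk1: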
[root@xff1 dev]# ls -l sdg
brw-rw---- 1 root disk 8, 96 Apr 25 00:05 sdg
[root@xff1 dev]# ls -l asmdisk1
brw-rw---- 1 grid asmadmin 8, 96 Apr 25 00:05 asmdisk1
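Both nodes show the same major and minor numbers (8, 96), which is what identifies them as the same block device. The comparison can be automated over saved `ls -l` output; the parser below is a sketch that assumes the usual `ls -l` layout for device nodes, and `device_numbers` is an illustrative name.

```python
import re

# Device nodes show "major, minor" where regular files show a size.
DEV_LINE = re.compile(r'^[bc]\S+\s+\d+\s+\S+\s+\S+\s+(\d+),\s*(\d+)\s')

def device_numbers(ls_line):
    """Extract (major, minor) from an `ls -l` line for a block/char device."""
    match = DEV_LINE.match(ls_line)
    if match is None:
        raise ValueError("not a device node line: %r" % ls_line)
    return int(match.group(1)), int(match.group(2))

sdg = device_numbers("brw-rw---- 1 root disk 8, 96 Apr 25 00:05 sdg")
asmdisk1 = device_numbers("brw-rw---- 1 grid asmadmin 8, 96 Apr 25 00:05 asmdisk1")
print(sdg, asmdisk1, sdg == asmdisk1)
```

On a live system the same pair can be read with `os.stat(path).st_rdev` together with `os.major` and `os.minor`, which avoids parsing text.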
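As things stand, the data disk group consists of three 200G disks. The first one was accidentally added to the VG and more than 30G of data was written to it, so the disk group cannot simply be repaired at the ASM level with kfed and then mounted. The only option is Oracle ASM disk block reassembly (the technique used when an ASM disk header is completely destroyed) to recover the data that was not overwritten.
The customer was fairly lucky. From the only remaining backups, a few days of failed backups from December 2019, we were able to locate all the data files (no archive logs) and then force the database open. By combining the latest data-file contents recovered from fragments with the December 2019 backup, we recovered the vast majority of the business data and kept the customer's loss to a minimum. Disk operations on Oracle RAC database servers must be carried out with great care.
If you are unlucky enough to have an Oracle ASM disk damaged in a similar way (formatted, partly overwritten with dd, turned into an LV, and so on) and need recovery support, you can contact us for a professional recovery assessment, so that data is rescued as completely and as quickly as possible and losses are minimized.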
Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com
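Some earlier cases in which a formatted ASM disk was recovered:
Another case of recovering an ASM disk formatted with a file system
A flawless recovery of an ASM disk formatted as NTFS
Oracle ASM disk format recovery: formatted as an ext4 file system
Oracle ASM disk format recovery: formatted as an NTFS file system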