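Recovering an ASM disk group damaged by adding a disk

We received a recovery request from a customer who asked us to step in and recover the data. In outline: in September 2016 a hardware problem caused one disk of a normal-redundancy disk group (which had only two disks) to be lost. On March 6, 2017, the operations team tried to add that disk back into the disk group. After the add succeeded with the force option, the disk group dismounted and could no longer be mounted.
Disk group creation information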

Fri Jun 24 19:31:38 2016
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
SUCCESS: diskgroup DATADG was mounted
SUCCESS: CREATE DISKGROUP DATADG NORMAL REDUNDANCY  DISK '/dev/asm-diskdata01' SIZE 1048576M ,
'/dev/asm-diskdata02' SIZE 1048576M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='4M' /* ASMCA */

This shows that DATADG is a normal-redundancy disk group with a 4M AU size.

ASM automatically drops the failed disk

Mon Sep 12 11:41:54 2016
WARNING: Waited 15 secs for write IO to PST disk 1 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 1.
Mon Sep 12 11:41:55 2016
NOTE: process _b000_+asm1 (19491) initiating offline of disk 1.3915923833 (DATADG_0001) with mask 0x7e in group 1
NOTE: checking PST: grp = 1
GMON checking disk modes for group 1 at 9 for pid 29, osid 19491
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: checking PST for grp 1 done.
NOTE: sending set offline flag message 2870990318 to 1 disk(s) in group 1
WARNING: Disk DATADG_0001 in mode 0x7f is now being offlined
NOTE: initiating PST update: grp = 1, dsk = 1/0xe9684179, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 10 for pid 29, osid 19491
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: PST update grp = 1 completed successfully 
NOTE: initiating PST update: grp = 1, dsk = 1/0xe9684179, mask = 0x7e, op = clear
GMON updating disk modes for group 1 at 11 for pid 29, osid 19491
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: cache closing disk 1 of grp 1: DATADG_0001
NOTE: PST update grp = 1 completed successfully 
Mon Sep 12 11:42:55 2016
WARNING: Waited 15 secs for write IO to PST disk 0 in group 1.
Mon Sep 12 11:44:58 2016
WARNING: PST-initiated drop of 1 disk(s) in group 1(.1137226115))
SQL> alter diskgroup DATADG drop disk DATADG_0001 force /* ASM SERVER */ 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Mon Sep 12 11:44:59 2016
GMON updating for reconfiguration, group 1 at 12 for pid 29, osid 19491
NOTE: cache closing disk 1 of grp 1: (not open) DATADG_0001
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: group 1 PST updated.
Mon Sep 12 11:44:59 2016
NOTE: membership refresh pending for group 1/0x43c8b183 (DATADG)
Mon Sep 12 11:45:02 2016
NOTE: successfully read ACD block gn=1 blk=0 via retry read
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_lgwr_3526.trc:
ORA-15062: ASM disk is globally closed
GMON querying group 1 at 13 for pid 18, osid 3532
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x43c8b183 (DATADG)
SUCCESS: alter diskgroup DATADG drop disk DATADG_0001 force /* ASM SERVER */
NOTE: starting rebalance of group 1/0x43c8b183 (DATADG) at power 1
SUCCESS: PST-initiated drop disk in group 1(1137226115))
Starting background process ARB0
Mon Sep 12 11:45:03 2016
ARB0 started with pid=35, OS id=19945 
NOTE: assigning ARB0 to group 1/0x43c8b183 (DATADG) with 1 parallel I/O
cellip.ora not found.
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATADG
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
Mon Sep 12 11:46:21 2016
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Mon Sep 12 11:46:24 2016
GMON updating for reconfiguration, group 1 at 14 for pid 36, osid 20110
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
NOTE: group 1 PST updated.
WARNING: offline disk number 1 has references (54679 AUs)
Mon Sep 12 11:46:24 2016
NOTE: membership refresh pending for group 1/0x43c8b183 (DATADG)
GMON querying group 1 at 15 for pid 18, osid 3532
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x43c8b183 (DATADG)
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x43c8b183 (DATADG)

This shows that on September 12, 2016, disk 1 stopped responding and ASM dropped it from the disk group.

Adding the force-dropped disk back
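The drop log also contains the line "offline disk number 1 has references (54679 AUs)", meaning the dropped disk was still referenced by that many allocation units. As a rough sketch, the count can be turned into a size using the 4M AU size from the creation log. The function name and the regular expression are illustrative assumptions, not part of any Oracle tool.

```python
import re

AU_SIZE_MIB = 4  # from 'au_size'='4M' in the CREATE DISKGROUP log entry

# Matches e.g. "WARNING: offline disk number 1 has references (54679 AUs)"
REF_PATTERN = re.compile(r"offline disk number (\d+) has references \((\d+) AUs\)")

def referenced_mib(alert_line, au_size_mib=AU_SIZE_MIB):
    """Return (disk_number, MiB still referenced), or None if the line does not match."""
    match = REF_PATTERN.search(alert_line)
    if match is None:
        return None
    disk_number = int(match.group(1))
    au_count = int(match.group(2))
    return disk_number, au_count * au_size_mib

line = "WARNING: offline disk number 1 has references (54679 AUs)"
disk, mib = referenced_mib(line)
print(f"disk {disk}: {mib} MiB (~{mib / 1024:.1f} GiB) still referenced")
```

With the values from the log this works out to 218716 MiB, a little under 214 GiB.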

Mon Mar 06 15:36:54 2017
SQL> alter diskgroup DATADG add disk '/dev/asm-diskdata01' name DATADG_0000 
NOTE: GroupBlock outside rolling migration privileged region
ORA-15032: not all alterations performed
ORA-15029: disk '/dev/asm-diskdata01' is already mounted by this instance
ERROR: alter diskgroup DATADG add disk '/dev/asm-diskdata01' name DATADG_0000
Mon Mar 06 15:38:27 2017
SQL>  alter diskgroup DATADG add disk '/dev/asm-diskdata02' name DATADG_0001 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: requesting all-instance membership refresh for group=1
NOTE: Disk DATADG_0001 in mode 0x7f marked for de-assignment
ERROR: ORA-15033 signalled during reconfiguration of diskgroup DATADG
Mon Mar 06 15:38:28 2017
NOTE: membership refresh pending for group 1/0x31584f6b (DATADG)
Mon Mar 06 15:38:31 2017
GMON querying group 1 at 7 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
GMON querying group 1 at 8 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)
ORA-15032: not all alterations performed
ORA-15033: disk '/dev/asm-diskdata02' belongs to diskgroup "DATADG"
ERROR:  alter diskgroup DATADG add disk '/dev/asm-diskdata02' name DATADG_0001
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
Mon Mar 06 16:04:14 2017
SQL> alter diskgroup DATADG add disk '/dev/asm-diskdata02' name DATADG_0001 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: requesting all-instance membership refresh for group=1
NOTE: Disk DATADG_0001 in mode 0x7f marked for de-assignment
ERROR: ORA-15033 signalled during reconfiguration of diskgroup DATADG
Mon Mar 06 16:04:15 2017
NOTE: membership refresh pending for group 1/0x31584f6b (DATADG)
Mon Mar 06 16:04:18 2017
GMON querying group 1 at 9 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
GMON querying group 1 at 10 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)
ORA-15032: not all alterations performed
ORA-15033: disk '/dev/asm-diskdata02' belongs to diskgroup "DATADG"
ERROR: alter diskgroup DATADG add disk '/dev/asm-diskdata02' name DATADG_0001
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
Mon Mar 06 16:23:28 2017
SQL> alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/adm-diskdata02' name DATA_0001 
NOTE: GroupBlock outside rolling migration privileged region
ORA-15032: not all alterations performed
ORA-15031: disk specification '/dev/adm-diskdata02' matches no disks
ERROR: alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/adm-diskdata02' name DATA_0001
Mon Mar 06 16:24:48 2017
SQL> alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: requesting all-instance membership refresh for group=1
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
ERROR: ORA-15033 signalled during reconfiguration of diskgroup DATADG
Mon Mar 06 16:24:49 2017
NOTE: membership refresh pending for group 1/0x31584f6b (DATADG)
Mon Mar 06 16:24:52 2017
GMON querying group 1 at 11 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
GMON querying group 1 at 12 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)
ORA-15032: not all alterations performed
ORA-15033: disk '/dev/asm-diskdata02' belongs to diskgroup "DATADG"
ERROR: alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
Mon Mar 06 16:26:07 2017
SQL> alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001 force  
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: requesting all-instance membership refresh for group=1
NOTE: initializing header on grp 1 disk DATA_0001
NOTE: requesting all-instance disk validation for group=1
Mon Mar 06 16:26:10 2017
NOTE: skipping rediscovery for group 1/0x31584f6b (DATADG) on local instance.
NOTE: requesting all-instance disk validation for group=1
NOTE: skipping rediscovery for group 1/0x31584f6b (DATADG) on local instance.
Mon Mar 06 16:26:15 2017
GMON updating for reconfiguration, group 1 at 13 for pid 28, osid 12861
NOTE: group 1 PST updated.
NOTE: initiating PST update: grp = 1
GMON updating group 1 at 14 for pid 28, osid 12861
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
NOTE: group DATADG: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATADG: updated PST location: disk 0002 (PST copy 1)
NOTE: PST update grp = 1 completed successfully 
NOTE: membership refresh pending for group 1/0x31584f6b (DATADG)
GMON querying group 1 at 15 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
NOTE: cache opening disk 2 of grp 1: DATA_0001 path:/dev/asm-diskdata02
NOTE: Attempting voting file refresh on diskgroup DATADG
NOTE: Refresh completed on diskgroup DATADG. No voting file found.
GMON querying group 1 at 16 for pid 18, osid 3468
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)
Mon Mar 06 16:26:19 2017
SUCCESS: alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001 force 
NOTE: starting rebalance of group 1/0x31584f6b (DATADG) at power 1
Mon Mar 06 16:26:20 2017
Starting background process ARB0
Mon Mar 06 16:26:20 2017
ARB0 started with pid=32, OS id=25833 
NOTE: assigning ARB0 to group 1/0x31584f6b (DATADG) with 1 parallel I/O
cellip.ora not found.
WARNING:cache read  a corrupt block: group=1(DATADG) dsk=0 blk=0 disk=0 
        (DATADG_0000)incarn=3915956130 au=0 blk=0 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25833.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
NOTE:a corrupted block from group DATADG was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25833.trc
WARNING:cache read (retry)a corrupt block:group=1(DATADG) 
         dsk=0 blk=0 disk=0(DATADG_0000)incarn=3915956130 au=0 blk=0 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25833.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
ERROR: cache failed to read group=1(DATADG) dsk=0 blk=0 from disk(s): 0(DATADG_0000)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
NOTE: cache initiating offline of disk 0 group DATADG
NOTE:process _arb0_+asm1 (25833) initiating offline of disk 0.3915956130(DATADG_0000)with mask 0x7e in group 1
NOTE: checking PST: grp = 1
GMON checking disk modes for group 1 at 17 for pid 32, osid 25833
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
ERROR: too many offline disks in PST (grp 1)
NOTE: checking PST for grp 1 done.
NOTE: initiating PST update: grp = 1, dsk = 0/0xe968bfa2, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 18 for pid 32, osid 25833
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
ERROR: Disk 0 cannot be offlined, since all the disks [0, 1] with mirrored data would be offline.
ERROR: too many offline disks in PST (grp 1)
Mon Mar 06 16:26:23 2017
NOTE: cache dismounting (not clean) group 1/0x31584F6B (DATADG) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 25889, image: oracle@DBN01 (B000)
Mon Mar 06 16:26:23 2017
NOTE: halting all I/Os to diskgroup 1 (DATADG)
Mon Mar 06 16:26:23 2017
NOTE: LGWR doing non-clean dismount of group 1 (DATADG)
NOTE: LGWR sync ABA=19.2851 last written ABA 19.2851
WARNING: Offline for disk DATADG_0000 in mode 0x7f failed.
Mon Mar 06 16:26:23 2017
kjbdomdet send to inst 2
detach from dom 1, sending detach message to inst 2
Mon Mar 06 16:26:23 2017
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 8)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25833.trc  (incident=65537):
ORA-15335: ASM metadata corruption detected in disk group 'DATADG'
ORA-15130: diskgroup "DATADG" is being dismounted
ORA-15066: offlining disk "DATADG_0000" in group "DATADG" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483648] [0] [0 != 1]
Incident details in:/u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_65537/+ASM1_arb0_25833_i65537.trc
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE 
 3189 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
ERROR: ORA-15130 in COD recovery for diskgroup 1/0x31584f6b (DATADG)
ERROR: ORA-15130 thrown in RBAL for group number 1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_3468.trc:
ORA-15130: diskgroup "DATADG" is being dismounted
Mon Mar 06 16:26:23 2017
WARNING: dirty detached from domain 1
NOTE: cache dismounted group 1/0x31584F6B (DATADG) 

--- Subsequent mount errors
SQL> ALTER DISKGROUP DATADG MOUNT  /* asm agent *//* {1:18003:2} */ 
NOTE: cache registered group DATADG number=1 incarn=0xb368408f
NOTE: cache began mount (first) of group DATADG number=1 incarn=0xb368408f
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
WARNING:GMON has insufficient disks to maintain consensus.
Minimum required is 2:updating 1 PST copies from a total of 2.
ERROR: GMON failed to obtain a quorum ofsupporting disks in group 1
NOTE: cache dismounting (clean) group 1/0xB368408F (DATADG) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 27651, image: oracle@DBN01 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xB368408F (DATADG) 
NOTE: cache ending mount (fail) of group DATADG number=1 incarn=0xb368408f
NOTE: cache deleting context for group DATADG 1/0xb368408f
GMON dismounting group 1 at 12 for pid 30, osid 27651
NOTE: Disk DATA_0001 in mode 0x9 marked for de-assignment
ERROR: diskgroup DATADG was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATADG" cannot be mounted
ORA-15315: Write errors in disk group DATADG could lead to inconsistent ASM metadata.
ERROR: ALTER DISKGROUP DATADG MOUNT  /* asm agent *//* {1:18003:2} */

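As the log shows, the first several attempts to add the ASM disk failed for various reasons. The last attempt used the force keyword, which forced the automatically dropped disk back into DATADG. Unfortunately, once the add succeeded and the rebalance started, ASM found a corrupt block on disk 0 and raised ORA-15196. The rebalance could not continue, and the whole DATADG disk group dismounted automatically. Later mount attempts failed outright, reporting inconsistent ASM metadata, because the disk header of disk 0 was already damaged.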
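When an alert log is this long, it can help to reduce it to one line per ADD DISK attempt. The following is a minimal sketch that pairs each "SQL> alter diskgroup ... add ..." statement with the ORA- codes logged before its final ERROR/SUCCESS line. The function name and the matching rules are assumptions based only on the excerpt above, so other log layouts may need different patterns.

```python
import re

def summarize_attempts(lines):
    """Pair each 'SQL> alter diskgroup ... add ...' with its ORA- codes and final outcome."""
    attempts = []
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("SQL>") and " add " in line.lower():
            # A new attempt starts; remember the statement text without the prompt.
            current = {"statement": line[4:].strip(), "errors": [], "outcome": None}
            attempts.append(current)
        elif current is not None and current["outcome"] is None:
            code = re.match(r"(ORA-\d{5}):", line)
            if code and code.group(1) not in current["errors"]:
                current["errors"].append(code.group(1))
            elif re.match(r"(ERROR|SUCCESS):\s+alter diskgroup", line, re.IGNORECASE):
                # The echoed statement after ERROR:/SUCCESS: closes the attempt.
                current["outcome"] = line.split(":", 1)[0]
    return attempts

sample = [
    "SQL> alter diskgroup DATADG add disk '/dev/asm-diskdata01' name DATADG_0000 ",
    "ORA-15032: not all alterations performed",
    "ORA-15029: disk '/dev/asm-diskdata01' is already mounted by this instance",
    "ERROR: alter diskgroup DATADG add disk '/dev/asm-diskdata01' name DATADG_0000",
    "SQL> alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001 force  ",
    "SUCCESS: refreshed membership for 1/0x31584f6b (DATADG)",
    "SUCCESS: alter diskgroup DATADG add FAILGROUP DATA_0001 disk '/dev/asm-diskdata02' name DATA_0001 force ",
]

for attempt in summarize_attempts(sample):
    print(attempt["outcome"], attempt["errors"], attempt["statement"])
```

Lines such as "ERROR: ORA-15033 signalled during reconfiguration" are deliberately not treated as the end of an attempt, since the statement echo follows later.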

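Analyzing disk 0 with kfed
The disk header was backed up with dd and copied to a Windows machine for analysis. The header of disk 0, which had previously been healthy, turned out to be corrupted (all zeros).
[Image: asm-kfed]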
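An all-zero header is also consistent with the "[endian_kfbh] ... [0 != 1]" arguments of the ORA-15196 above, since the endian byte would read 0 instead of 1. As a minimal sketch, a dd'ed header block can be checked without kfed as follows. The offsets (endian at byte 0, type at byte 2, block number at byte 4, provision string "ORCLDISK" at byte 32), the 4096-byte block size and the little-endian layout are assumptions based on typical kfed output, and the file name in the comment is hypothetical.

```python
import struct

ASM_BLOCK_SIZE = 4096  # assumption: default ASM metadata block size

def inspect_header_block(block):
    """Summarize a few fields of the first metadata block of a dd'ed ASM disk image."""
    if len(block) < ASM_BLOCK_SIZE:
        raise ValueError("need at least one full metadata block")
    block = bytes(block[:ASM_BLOCK_SIZE])
    return {
        "all_zero": not any(block),                             # True for a wiped header
        "endian": block[0],                                     # kfbh.endian, 1 expected here
        "type": block[2],                                       # kfbh.type, 1 = disk header
        "block_number": struct.unpack_from("<I", block, 4)[0],  # kfbh.block.blk
        "provstr": block[32:40].decode("ascii", errors="replace"),
    }

# Hypothetical usage with a file produced by something like
#   dd if=/dev/asm-diskdata01 of=disk0_header.dd bs=4096 count=1
# with open("disk0_header.dd", "rb") as f:
#     print(inspect_header_block(f.read()))

print(inspect_header_block(bytes(ASM_BLOCK_SIZE))["all_zero"])
```

This only flags obviously wiped or foreign blocks; kfed remains the tool for reading and repairing the full structure.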


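The failure was now fairly clear, and so was the recovery approach, tried in escalating order:
Option 1: fix the disk header with kfed, then try to mount the disk group (related post: Manually repairing an abnormal ASM DISK HEADER)
Option 2: copy the data files out directly with tools such as amdu or dul (related post: Recovering data files from ASM)
Option 3: reassemble the data files from the underlying AUs (related post: Recovery from a completely destroyed ASM disk header)
In the actual recovery we were lucky and option 1 was enough. After the disk header was repaired with kfed, the mount produced the following log: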

SQL> alter diskgroup DATADG mount 
NOTE: cache registered group DATADG number=1 incarn=0x5134d0d4
NOTE: cache began mount (first) of group DATADG number=1 incarn=0x5134d0d4
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: Assigning number (1,0) to disk (/dev/asm-diskdata01)
Tue Mar 07 19:03:40 2017
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 27 for pid 28, osid 13837
NOTE: Assigning number (1,1) to disk ()
GMON querying group 1 at 28 for pid 28, osid 13837
NOTE: cache closing disk 1 of grp 1: (not open) 
NOTE: cache opening disk 0 of grp 1: DATADG_0000 path:/dev/asm-diskdata01
NOTE: F1X0 found on disk 0 au 2 fcn 0.178802
NOTE: cache opening disk 2 of grp 1: DATA_0001 path:/dev/asm-diskdata02
NOTE: cache mounting (first) normal redundancy group 1/0x5134D0D4 (DATADG)
Tue Mar 07 19:03:40 2017
* allocate domain 1, invalid = TRUE 
kjbdomatt send to inst 2
Tue Mar 07 19:03:40 2017
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=19.2851 group=1 (DATADG)
NOTE: starting recovery of thread=2 ckpt=13.5327 group=1 (DATADG)
NOTE: advancing ckpt for group 1 (DATADG) thread=2 ckpt=13.5327
NOTE: advancing ckpt for group 1 (DATADG) thread=1 ckpt=19.2852
NOTE: cache recovered group 1 to fcn 0.365868
NOTE: redo buffer size is 512 blocks (2101760 bytes)
Tue Mar 07 19:03:40 2017
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR found thread 1 closed at ABA 19.2851
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR opening thread 1 at fcn 0.365868 ABA 20.2852
NOTE: cache mounting group 1/0x5134D0D4 (DATADG) succeeded
NOTE: cache ending mount (success) of group DATADG number=1 incarn=0x5134d0d4
Tue Mar 07 19:03:40 2017
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATADG was mounted
SUCCESS: alter diskgroup DATADG mount
Tue Mar 07 19:03:40 2017
NOTE: diskgroup resource ora.DATADG.dg is online
Tue Mar 07 19:03:41 2017
ASM Health Checker found 1 new failures
NOTE: ASM did background COD recovery for group 1/0x5134d0d4 (DATADG)
NOTE: starting rebalance of group 1/0x5134d0d4 (DATADG) at power 1
Starting background process ARB0
Tue Mar 07 19:03:42 2017
ARB0 started with pid=30, OS id=13905 
NOTE: assigning ARB0 to group 1/0x5134d0d4 (DATADG) with 1 parallel I/O
cellip.ora not found.
WARNING: cache read  a corrupt block: group=1(DATADG) dsk=0 blk=0 disk=0 
         (DATADG_0000) incarn=2202280062 au=0 blk=0 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13905.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
NOTE: a corrupted block from group DATADG was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13905.trc
WARNING:cache read (retry)a corrupt block:group=1(DATADG) dsk=0 blk=0 disk=0
        (DATADG_0000)incarn=2202280062 au=0 blk=0 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13905.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
ERROR: cache failed to read group=1(DATADG) dsk=0 blk=0 from disk(s): 0(DATADG_0000)
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
Tue Mar 07 19:03:52 2017
NOTE: client oradb1:oradb registered, osid 13989, mbr 0x1
NOTE: cache initiating offline of disk 0 group DATADG
NOTE:process _arb0_+asm1 (13905) initiating offline of disk 0.2202280062(DATADG_0000)with mask 0x7e in group 1
NOTE: checking PST: grp = 1
Tue Mar 07 19:03:52 2017
GMON checking disk modes for group 1 at 30 for pid 30, osid 13905
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
ERROR: too many offline disks in PST (grp 1)
NOTE: checking PST for grp 1 done.
NOTE: initiating PST update: grp = 1, dsk = 0/0x8344207e, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 31 for pid 30, osid 13905
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATADG
ERROR: Disk 0 cannot be offlined, since all the disks [0, 1] with mirrored data would be offline.
ERROR: too many offline disks in PST (grp 1)
Tue Mar 07 19:03:52 2017
NOTE: cache dismounting (not clean) group 1/0x5134D0D4 (DATADG) 
WARNING: Offline for disk DATADG_0000 in mode 0x7f failed.
Tue Mar 07 19:03:52 2017
NOTE: halting all I/Os to diskgroup 1 (DATADG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 14002, image: oracle@DBN01 (B000)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13905.trc  (incident=76402):
ORA-15335: ASM metadata corruption detected in disk group 'DATADG'
ORA-15130: diskgroup "DATADG" is being dismounted
ORA-15066: offlining disk "DATADG_0000" in group "DATADG" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [2147483648] [0] [1022 != 0]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_76402/+ASM1_arb0_13905_i76402.trc
Tue Mar 07 19:03:52 2017
NOTE: LGWR doing non-clean dismount of group 1 (DATADG)
NOTE: LGWR sync ABA=20.2857 last written ABA 20.2857
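A side note on the "[blk_kfbl] ... [1022 != 0]" arguments in the ORA-15196 above: the block read at AU 0, block 0 carried block number 1022 where 0 was expected. To our knowledge, ASM 11.1.0.7 and later keep a backup copy of the disk header in the second-to-last metadata block of allocation unit 1, so one possible reading (not confirmed by the logs) is that the repaired header still carried the block number of that backup copy. The arithmetic below is a sketch assuming 4096-byte metadata blocks and the 4M AU size from the creation log.

```python
ASM_BLOCK_SIZE = 4096        # assumption: ASM metadata block size in bytes
AU_SIZE = 4 * 1024 * 1024    # 'au_size'='4M' from the CREATE DISKGROUP log entry

def backup_header_location(au_size=AU_SIZE, block_size=ASM_BLOCK_SIZE):
    """Return (block index inside AU 1, byte offset on disk) of the assumed backup header."""
    blocks_per_au = au_size // block_size
    block_in_au = blocks_per_au - 2                       # second-to-last block of the AU
    byte_offset = 1 * au_size + block_in_au * block_size  # AU 1 starts one AU into the disk
    return block_in_au, byte_offset

print(backup_header_location())             # 4M AU
print(backup_header_location(1024 * 1024))  # 1M AU, for comparison
```

For a 4M AU the block index is 1022, which is the value ASM complained about.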

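Fortunately, DATADG did mount at this point. However, the rebalance process (ARB0) still read bad disk header information: the repair was not fully successful, and the log shows other corrupt blocks besides this one, so we decided against further repair. Instead, we bypassed ASM's ACD and COD recovery so that the disk group mounts and no longer dismounts.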

SQL> alter diskgroup DATADG mount 
NOTE: cache registered group DATADG number=1 incarn=0x9c94d0eb
NOTE: cache began mount (first) of group DATADG number=1 incarn=0x9c94d0eb
NOTE: Assigning number (1,2) to disk (/dev/asm-diskdata02)
NOTE: Assigning number (1,0) to disk (/dev/asm-diskdata01)
NOTE: skip COD recovery as part of test at kfrc.c:1639 
NOTE: skip COD recovery as part of test at kfrc.c:1639 
Tue Mar 07 19:12:45 2017
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 75 for pid 28, osid 15615
NOTE: Assigning number (1,1) to disk ()
GMON querying group 1 at 76 for pid 28, osid 15615
NOTE: cache closing disk 1 of grp 1: (not open) 
NOTE: cache opening disk 0 of grp 1: DATADG_0000 path:/dev/asm-diskdata01
NOTE: F1X0 found on disk 0 au 2 fcn 0.178802
NOTE: cache opening disk 2 of grp 1: DATA_0001 path:/dev/asm-diskdata02
NOTE: cache mounting (first) normal redundancy group 1/0x9C94D0EB (DATADG)
Tue Mar 07 19:12:45 2017
* allocate domain 1, invalid = TRUE 
kjbdomatt send to inst 2
Tue Mar 07 19:12:45 2017
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=25.2870 group=1 (DATADG)
NOTE: advancing ckpt for group 1 (DATADG) thread=1 ckpt=25.2873
NOTE: cache recovered group 1 to fcn 0.365897
NOTE: redo buffer size is 512 blocks (2101760 bytes)
Tue Mar 07 19:12:45 2017
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR found thread 1 closed at ABA 25.2872
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATADG)
NOTE: LGWR opening thread 1 at fcn 0.365897 ABA 26.2873
NOTE: cache mounting group 1/0x9C94D0EB (DATADG) succeeded
NOTE: cache ending mount (success) of group DATADG number=1 incarn=0x9c94d0eb
Tue Mar 07 19:12:45 2017
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATADG was mounted
SUCCESS: alter diskgroup DATADG mount
Tue Mar 07 19:12:45 2017
NOTE: diskgroup resource ora.DATADG.dg is online
NOTE: skip COD recovery as part of test at kfrc.c:1639 
NOTE: skip COD recovery as part of test at kfrc.c:1639 
NOTE: skip COD recovery as part of test at kfrc.c:1639 
NOTE: skip COD recovery as part of test at kfrc.c:1639 

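With the ASM problem solved, we logged in to the databases. Luckily, both databases opened normally with no errors in their alert logs. We backed up the data with RMAN, rebuilt the ASM disk group, and restored the data. The recovery was complete with zero data loss.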


ORA-600 [3005] recovery

Opening the database raises ORA-600 [3005]

D:\>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on 星期二 3月 7 23:04:25 2017

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover datafile 1;
完成介质恢复。
SQL> recover datafile 2;
完成介质恢复。
SQL> recover datafile 3;
完成介质恢复。
SQL> recover datafile 4;
完成介质恢复。
SQL> recover datafile 5;
完成介质恢复。
SQL> recover datafile 6;
完成介质恢复。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [3005], [1], [8242], [29937], [0], [0], [], [],
[], [], [], []

Querying the database state

SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,FUZZY,
  4  count(*) ROW_NUM
  5  FROM v$datafile_header
  6  GROUP BY status, checkpoint_change#, checkpoint_time,fuzzy
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_CHANGE# CHECKPOINT_TIM FUZZY     ROW_NUM
-------------- ------------------ -------------- ------ ----------
ONLINE                  227036249 06-3月 -17     NO              5
ONLINE                  227036252 06-3月 -17     NO              1

SQL> set numw 16
SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,last_change#,
  4  count(*) ROW_NUM
  5  FROM v$datafile
  6  GROUP BY status, checkpoint_change#, checkpoint_time,last_change#
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_CHANGE# CHECKPOINT_TIM     LAST_CHANGE#
-------------- ------------------ -------------- ----------------
         ROW_NUM
----------------
ONLINE                  227036249 06-3月 -17
               4

ONLINE                  227036252 06-3月 -17
               1

SYSTEM                  227036249 06-3月 -17
               1

The MOS description of ORA-600 [3005]

VERSIONS:
versions 10.2 and later
DESCRIPTION:
Raised during pass one of the two pass recovery processing, which
reads and merges open redo threads into a hash table of blocks
that need recovery.
During examination of the change vectors of online redologs, this
error is raised if no online redo log could be opened to cover the start RBA.
ARGUMENTS:
Arg [a] Thread
Arg [b] Redo Log File Sequence
Arg [c] Redo Log File Block Number
Arg [d] SCN Wrap
Arg [e] SCN Base
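Applying this argument list to the error above gives thread 1, redo log sequence 8242, block 29937, which matches the current log sequence 8242 reported in the alert log excerpt at the end of this post. A small sketch of that mapping follows. The function and key names are illustrative assumptions, and the regular expression simply collects the bracketed values in order.

```python
import re

# Names follow the ARGUMENTS list in the description above.
ARG_NAMES = ["thread", "log_sequence", "log_block", "scn_wrap", "scn_base"]

def decode_ora600_3005(message):
    """Map the arguments that follow [3005] to the names in the MOS description."""
    values = re.findall(r"\[(\d*)\]", message)
    if not values or values[0] != "3005":
        raise ValueError("not an ORA-600 [3005] message")
    numeric = [int(v) for v in values[1:1 + len(ARG_NAMES)] if v != ""]
    return dict(zip(ARG_NAMES, numeric))

msg = "ORA-00600: internal error code, arguments: [3005], [1], [8242], [29937], [0], [0], [], []"
print(decode_ora600_3005(msg))
```

Because only the bracketed values are parsed, the same function works on the localized message text shown earlier.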

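According to the official description, the error is raised when no online redo log can be opened to cover the recovery start RBA; in other words, the redo information read through the control file during startup does not match the online logs. Recreating the control file works around the problem.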

SQL> shutdown immediate;
ORA-01109: 数据库未打开


已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup nomount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area      10288615424 bytes
Fixed Size                        2184672 bytes
Variable Size                  7482640928 bytes
Database Buffers               2785017856 bytes
Redo Buffers                     18771968 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "ORACLEDO" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2336
  7  LOGFILE
  8    GROUP 1 'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\SYSTEM01.DBF',
 13    'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\SYSAUX01.DBF',
 14    'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\UNDOTBS01.DBF',
 15    'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\USERS01.DBF',
 16    'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\XIFENFEI01.DBF',
 17    'D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\XIFENFEI0102.DBF'
 18  CHARACTER SET AL32UTF8
 19  ;

控制文件已创建。

SQL> recover database;
完成介质恢复。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [],
[], [], [], []
进程 ID: 4036
会话 ID: 96 序列号: 1

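This error is a familiar one and can be handled with the usual procedure for undo corruption.
Additional notes
An accurate description of ORA-600 [3005] and how to handle it may only be available with an internal account. In fact, before its final crash this database had already reported a control file problem, and then the instance crashed.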

Mon Mar 06 10:16:37 2017
Thread 1 advanced to log sequence 8242 (LGWR switch)
  Current log# 1 seq# 8242 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORACLEDOC\REDO01.LOG
Mon Mar 06 11:06:31 2017
********************* ATTENTION: ******************** 
 The controlfile header block returned by the OS
 has a sequence number that is too old. 
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE 
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE 
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely, 
 please do the following:
 (1) Save all copies of the controlfile for later 
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue: 
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance. 
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database. 
*****************************************************
MMON (ospid: 3320): terminating the instance
Mon Mar 06 11:06:32 2017
opiodr aborting process unknown ospid (1528) as a result of ORA-1092
Mon Mar 06 11:06:32 2017
ORA-1092 : opitsk aborting process
Mon Mar 06 11:06:32 2017
opiodr aborting process unknown ospid (2852) as a result of ORA-1092
Mon Mar 06 11:06:32 2017
ORA-1092 : opitsk aborting process
Mon Mar 06 11:06:33 2017
opiodr aborting process unknown ospid (3836) as a result of ORA-1092
Mon Mar 06 11:06:33 2017
ORA-1092 : opitsk aborting process
Instance terminated by MMON, pid = 3320

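Oracle DUL support for Oracle 12.2 (12c)

With the official release of Oracle 12.2 approaching, we tested the latest DUL (Data UnLoader: 11.2.0.1.1) and found that its 12c support still has problems. The author states fairly clearly in this version that 12c is supported, but the test results are unsatisfactory.
DUL 11 support for Oracle 12c (the Compatible parameter indicates 12c support)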

[oracle@localhost dul]$ ./dul

Data UnLoader: 11.2.0.1.1 - Internal Only - on Tue Feb 21 23:39:08 2017
with 64-bit io functions and the decompression option

Copyright (c) 1994 2016 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only


Within one week you will need a more recent DUL version for this os
DUL: Warning: Compatible is set to 11 Values can be 6|7|8|9|10|11|12

Running DUL against a 12.2 database gives the following result

[oracle@xifenfei ~]$ ss

SQL*Plus: Release 12.2.0.1.0 Production on Tue Feb 21 23:54:43 2017

Copyright (c) 1982, 2016, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> set lines 150
SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production              0
PL/SQL Release 12.2.0.1.0 - Production                                                    0
CORE    12.2.0.1.0      Production                                                        0
TNS for Linux: Version 12.2.0.1.0 - Production                                            0
NLSRTL Version 12.2.0.1.0 - Production                                                    0


[root@xifenfei dul]# ./dul

Data UnLoader: 11.2.0.1.1 - Internal Only - on Tue Feb 21 23:53:37 2017
with 64-bit io functions and the decompression option

Copyright (c) 1994 2016 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only


Within one week you will need a more recent DUL version for this os
Found db_id = 766547632
Found db_name = ORCL12C
DUL> bootstrap;
Probing file = 1, block = 520
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow

DUL: Warning: Recreating file "BOOTSTRAP.ctl"
      60 rows unloaded
Reading BOOTSTRAP.dat 60 entries loaded
Parsing Bootstrap$ contents
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 12
 OBJ$: segobjno 18, file 1 block 240
 TAB$: segobjno 2, tabno 1, file 1  block 144
 COL$: segobjno 2, tabno 5, file 1  block 144
 USER$: segobjno 10, tabno 1, file 1  block 208
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   72655 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
    2162 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
  114785 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
     130 rows unloaded
Reading USER.dat 130 entries loaded
Reading OBJ.dat
DUL: FATAL Error: File OBJ.dat, line 22079: identifier too long

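The error "DUL: FATAL Error: File OBJ.dat, line 22079: identifier too long" makes the cause fairly clear: while loading the OBJ$ data, DUL hit an object name that is too long and the load failed.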


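This happens because Oracle 12c (from 12.2) allows object and column names up to 128 bytes long, instead of the 30 bytes of earlier versions. The DUL author has already added a limit for this in dict.ddl, but so far it does not appear to take effect.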
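To make the length limit concrete, here is a minimal sketch that flags names longer than the old 30-byte limit. The function name and the sample names are hypothetical, and the sketch does not reproduce DUL's own parser or the OBJ.dat file format; it only shows the kind of check that fails on a 12.2 dictionary.

```python
OLD_LIMIT = 30    # identifier limit before 12.2, in bytes
NEW_LIMIT = 128   # identifier limit from 12.2 on, in bytes


def too_long_for_limit(names, limit=OLD_LIMIT):
    """Return (position, name, byte_length) for every name exceeding limit."""
    hits = []
    for pos, name in enumerate(names, start=1):
        size = len(name.encode("utf-8"))  # the limit is counted in bytes
        if size > limit:
            hits.append((pos, name, size))
    return hits


# Hypothetical sample names, not taken from the OBJ.dat in this post
sample = ["OBJ$", "I_UNDO1", "A" * 30, "B" * 31, "C" * 128]

for pos, name, size in too_long_for_limit(sample):
    print(f"entry {pos}: {size} bytes, starts with {name[:10]}")
```

Run against the old limit, the last two sample names are reported; against `NEW_LIMIT` none are, which is the behaviour a 12.2-aware loader would need.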

After special handling, DUL supports 12c without problems

[root@xifenfei dul]# ./dul

Data UnLoader: 11.2.0.1.1 - Internal Only - on Wed Feb 22 00:04:40 2017
with 64-bit io functions and the decompression option

Copyright (c) 1994 2016 Bernard van Duijnen All rights reserved.


 Strictly Oracle Internal Use Only


Within one week you will need a more recent DUL version for this os
Found db_id = 766547632
Found db_name = ORCL12C

DUL> bootstrap;
Probing file = 1, block = 520
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow

DUL: Warning: Recreating file "BOOTSTRAP.ctl"
      60 rows unloaded
Reading BOOTSTRAP.dat 60 entries loaded
Parsing Bootstrap$ contents
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 12
 OBJ$: segobjno 18, file 1 block 240
 TAB$: segobjno 2, tabno 1, file 1  block 144
 COL$: segobjno 2, tabno 5, file 1  block 144
 USER$: segobjno 10, tabno 1, file 1  block 208
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   72655 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
    2162 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
  114785 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
     130 rows unloaded
Reading USER.dat 130 entries loaded
Reading OBJ.dat 72655 entries loaded and sorted 72655 entries
Reading TAB.dat 2162 entries loaded
Reading COL.dat 114785 entries loaded and sorted 114785 entries
DUL> unload table sys.undo$;
. unloading table                     UNDO$      21 rows unloaded
DUL> exit

Life is DUL without it

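DUL supports 12c here only after this special handling. I hope the author completes the related work in later releases; in my recovery experience DUL remains a very powerful tool.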


Oracle 12.2 still supports bbed

With the full official release of Oracle 12.2, testing shows that bbed still works in 12.2, and the procedure is exactly the same as for 11g.
Linux version

[oracle@xifenfei db_1]$ more /etc/oracle-release 
Oracle Linux Server release 7.2
[oracle@xifenfei db_1]$ uname -a
Linux xifenfei 3.8.13-98.7.1.el7uek.x86_64 #2 SMP Wed Nov 25 13:51:41 PST 2015 x86_64 x86_64 x86_64 GNU/Linux
[oracle@xifenfei db_1]$ 

Database version

SQL>  select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production              0
PL/SQL Release 12.2.0.1.0 - Production                                                    0
CORE    12.2.0.1.0      Production                                                        0
TNS for Linux: Version 12.2.0.1.0 - Production                                            0
NLSRTL Version 12.2.0.1.0 - Production                                                    0

Files required by bbed

[oracle@xifenfei bbed10_linux_64]$ ls -ltr
total 32
-rw-r--r-- 1 oracle oinstall 10270 Jul 25  2000 bbedus.msg
-rw-r--r-- 1 oracle oinstall  3306 Apr 20  2010 ssbbded.o
-rw-r--r-- 1 oracle oinstall  3976 Apr 20  2010 sbbdpt.o
-rw-r--r-- 1 oracle oinstall  8704 Apr 20  2010 bbedus.msb

Compiling bbed

[oracle@xifenfei bbed10_linux_64]$ 
[oracle@xifenfei bbed10_linux_64]$ cp *.o $ORACLE_HOME/rdbms/lib/
[oracle@xifenfei bbed10_linux_64]$ cp bbed* $ORACLE_HOME/rdbms/mesg/
[oracle@xifenfei bbed10_linux_64]$ cd $ORACLE_HOME
[oracle@xifenfei db_1]$ make -f ./rdbms/lib/ins_rdbms.mk BBED=./bin/bbed ./bin/bbed

Linking BBED utility (bbed)
rm -f bin/bbed
/u01/app/oracle/product/12.2.0/db_1/bin/orald -o bin/bbed -m64 -z noexecstack -Wl,--disable-new-dtags -L
/u01/app/oracle/product/12.2.0/db_1/rdbms/lib/ -L/u01/app/oracle/product/12.2.0/db_1/lib/ -L/u01/app/oracle/product/12.2.0/db_1/lib/stubs/  /u01/app/oracle/product/12.2.0/db_1/lib/s0main.o 
/u01/app/oracle/product/12.2.0/db_1/rdbms/lib/ssbbded.o /u01/app/oracle/product/12.2.0/db_1/rdbms/lib/sbbdpt.o `cat 
/u01/app/oracle/product/12.2.0/db_1/lib/ldflags`    -lncrypt12 -lnsgr12 -lnzjs12 -ln12 -lnl12 -lngsmshd12
 -ldbtools12 -lclntsh -lclntshcore  `cat /u01/app/oracle/product/12.2.0/db_1/lib/ldflags`    -lncrypt12 -lnsgr12
 -lnzjs12 -ln12 -lnl12 -lngsmshd12 -lnro12 `cat /u01/app/oracle/product/12.2.0/db_1/lib/ldflags`    
-lncrypt12 -lnsgr12 -lnzjs12 -ln12 -lnl12 -lngsmshd12 -lnnz12 -lzt12 -lztkg12 -lztkg12 -lclient12 -lnnetd12  
-lvsn12 -lcommon12 -lgeneric12 -lmm -lsnls12 -lnls12  -lcore12 -lsnls12 -lnls12 -lcore12 -lsnls12 
-lnls12 -lxml12 -lcore12 -lunls12 -lsnls12 -lnls12 -lcore12 -lnls12 `cat /u01/app/oracle/product/12.2.0/db_1/lib/ldflags
`    -lncrypt12 -lnsgr12 -lnzjs12 -ln12 -lnl12 -lngsmshd12 -lnro12 `cat /u01/app/oracle/product/12.2.0/db_1/lib/ldflags
`    -lncrypt12 -lnsgr12 -lnzjs12 -ln12 -lnl12 -lngsmshd12 -lclient12 -lnnetd12  -lvsn12 -lcommon12 -lgeneric12  
 -lsnls12 -lnls12  -lcore12 -lsnls12 -lnls12 -lcore12 -lsnls12 -lnls12 -lxml12 -lcore12 -lunls12 -lsnls12 -lnls12 
-lcore12 -lnls12 -lclient12 -lnnetd12  -lvsn12 -lcommon12 -lgeneric12 -lsnls12 -lnls12  -lcore12 -lsnls12
 -lnls12 -lcore12 -lsnls12 -lnls12 -lxml12 -lcore12 -lunls12 -lsnls12 -lnls12 -lcore12 -lnls12  
 `cat /u01/app/oracle/product/12.2.0/db_1/lib/sysliblist` -Wl,-rpath,/u01/app/oracle/product/12.2.0/db_1/lib -lm   
 `cat /u01/app/oracle/product/12.2.0/db_1/lib/sysliblist` -ldl -lm   -L/u01/app/oracle/product/12.2.0/db_1/lib

Testing bbed

[oracle@xifenfei db_1]$ bbed blocksize=8192 filename=/u01/app/oracle/oradata/orcl12c/system01.dbf
Password: 

BBED: Release 2.0.0.0.0 - Limited Production on Fri Feb 17 00:57:56 2017

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> map
 File: /u01/app/oracle/oradata/orcl12c/system01.dbf (0)
 Block: 1                                     Dba:0x00000000
------------------------------------------------------------
 Data File Header

 struct kcvfh, 1248 bytes                   @0       

 ub4 tailchk                                @8188    


BBED> p kcvfh
struct kcvfh, 1248 bytes                    @0       
   struct kcvfhbfh, 20 bytes                @0       
      ub1 type_kcbh                         @0        0x0b
      ub1 frmt_kcbh                         @1        0xa2
      ub2 wrp2_kcbh                         @2        0x0000
      ub4 rdba_kcbh                         @4        0x00400001
      ub4 bas_kcbh                          @8        0x00000000
      ub2 wrp_kcbh                          @12       0x0000
      ub1 seq_kcbh                          @14       0x01
      ub1 flg_kcbh                          @15       0x04 (KCBHFCKV)
      ub2 chkval_kcbh                       @16       0x38d0
      ub2 spare3_kcbh                       @18       0x0000
   struct kcvfhhdr, 76 bytes                @20      
      ub4 kccfhswv                          @20       0x00000000
      ub4 kccfhcvn                          @24       0x0c200000  <------- data file version
      ub4 kccfhdbi                          @28       0x2db096b0
      text kccfhdbn[0]                      @32      O
      text kccfhdbn[1]                      @33      R
      text kccfhdbn[2]                      @34      C
      text kccfhdbn[3]                      @35      L
      text kccfhdbn[4]                      @36      1
      text kccfhdbn[5]                      @37      2
      text kccfhdbn[6]                      @38      C
      text kccfhdbn[7]                      @39       
      ub4 kccfhcsq                          @40       0x00000714
      ub4 kccfhfsz                          @44       0x00019500
      s_blkz kccfhbsz                       @48       0x00
      ub2 kccfhfno                          @52       0x0001
      ub2 kccfhtyp                          @54       0x0003
      ub4 kccfhacid                         @56       0x00000000
      ub4 kccfhcks                          @60       0x00000000

Testing shows that bbed fully supports 12.2 data files, and the file header format has not changed much.
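The hexadecimal header values in the bbed output can be decoded by hand. The sketch below is only an illustration: the 8/4/8/4/8-bit split used for kccfhcvn is an assumed layout that happens to fit the value shown, and the rdba split (upper 10 bits relative file number, lower 22 bits block number) applies to smallfile tablespaces. The function names are hypothetical.

```python
def decode_version(word):
    """Split a 32-bit version word as 8/4/8/4/8 bits (assumed layout)."""
    parts = (
        (word >> 24) & 0xFF,
        (word >> 20) & 0x0F,
        (word >> 12) & 0xFF,
        (word >> 8) & 0x0F,
        word & 0xFF,
    )
    return ".".join(str(p) for p in parts)


def decode_rdba(rdba):
    """Split a smallfile relative DBA into (relative file#, block#)."""
    return rdba >> 22, rdba & 0x3FFFFF


print(decode_version(0x0C200000))  # kccfhcvn from the dump
print(decode_rdba(0x00400001))     # rdba_kcbh from the dump
print(0x2DB096B0)                  # kccfhdbi from the dump, as decimal
```

With the values from the dump this gives 12.2.0.0.0 for kccfhcvn, file 1 block 1 for rdba_kcbh, and 766547632 for kccfhdbi, which is the same db_id that DUL reported for ORCL12C in the previous post.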


Modifying bootstrap$ to influence database execution plans

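In an earlier article I described using ue to edit the Oracle binary in order to change the execution plan of SQL run during database startup. Here is another method: modifying the bootstrap$ table to change the execution plan of the startup SQL. The experiment makes the database stop using the i_undo1 index. [This method is extremely dangerous and is for testing only]
Tracing a normal database startup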

[oracle@localhost .oradata]$ ss

SQL*Plus: Release 11.2.0.1.0 Production on Sat Feb 4 23:07:41 2017

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount;
ORACLE instance started.

Total System Global Area 2421825536 bytes
Fixed Size                  2215744 bytes
Variable Size            1828716736 bytes
Database Buffers          570425344 bytes
Redo Buffers               20467712 bytes
Database mounted.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12
Statement processed.
SQL> oradebug TRACEFILE_NAME
/opt/oracle/diag/rdbms/test/test/trace/test_ora_19003.trc
SQL> alter database open;

Database altered.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Analyzing the 10046 trace file

PARSING IN CURSOR #2 len=196 dep=1 uid=0 oct=9 lid=0 tim=1486220893978359 hv=24291558 ad='ef95ff70' sqlid='fqkyj700r5a76'
CREATE UNIQUE INDEX I_UNDO1 ON UNDO$(US#) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE 
(  INITIAL 64K NEXT 1024K MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 34 EXTENTS (FILE 1 BLOCK 320))
END OF STMT
PARSE #2:c=0,e=565,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=241391319,tim=1486220893978358
EXEC #2:c=0,e=123,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=241391319,tim=1486220893978566
STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='INDEX BUILD UNIQUE I_UNDO1 (cr=0 pr=0 pw=0 time=0 us)'
STAT #2 id=2 cnt=0 pid=1 pos=1 obj=0 op='SORT CREATE INDEX (cr=0 pr=0 pw=0 time=0 us cost=0 size=0 card=0)'
STAT #2 id=3 cnt=0 pid=2 pos=1 obj=15 op='TABLE ACCESS FULL UNDO$ (cr=0 pr=0 pw=0 time=0 us)'
CLOSE #2:c=0,e=5,dep=1,type=0,tim=1486220893978690

…………

PARSING IN CURSOR #5 len=142 dep=1 uid=0 oct=3 lid=0 tim=1486220894169144 hv=361892850 ad='ef934cb0' sqlid='7bd391hat42zk'
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
END OF STMT
PARSE #5:c=0,e=474,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=3,plh=0,tim=1486220894169143
BINDS #5:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f4f4556b0a8  bln=22  avl=02  flg=05
  value=1
EXEC #5:c=1000,e=802,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=3,plh=906473769,tim=1486220894170055
WAIT #5: nam='db file sequential read' ela= 1018 file#=1 block#=321 blocks=1 obj#=34 tim=1486220894171138
WAIT #5: nam='db file sequential read' ela= 2620 file#=1 block#=225 blocks=1 obj#=15 tim=1486220894173817
FETCH #5:c=0,e=3770,p=2,cr=2,cu=0,mis=0,r=1,dep=1,og=3,plh=906473769,tim=1486220894173859
STAT #5 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS BY INDEX ROWID UNDO$ (cr=2 pr=2 pw=0 time=0 us)'
STAT #5 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=1 pw=0 time=0 us)'
CLOSE #5:c=0,e=9,dep=1,type=0,tim=1486220894173944

The trace shows clearly that during startup the database first runs create index i_undo1, and then the query from undo$ where us#=:1 accesses the table through the I_UNDO1 index.
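Reading the plan out of the STAT lines by eye works for one cursor, but a small script helps with a full trace. The following is a minimal sketch that rebuilds an indented plan from the STAT lines of one cursor; the regular expression and the function name are my own assumptions based on the trace lines shown above, not an official trace parser.

```python
import re

# Matches lines such as:
# STAT #5 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 ...)'
STAT_RE = re.compile(
    r"STAT #(\d+) id=(\d+) cnt=\d+ pid=(\d+) pos=\d+ obj=\d+ op='(.+?) \(cr="
)


def plan_lines(trace_text, cursor):
    """Return the row-source operations of one cursor, indented by depth."""
    depth = {0: -1}  # pid=0 marks the root of the plan
    lines = []
    for cur, row_id, parent_id, op in STAT_RE.findall(trace_text):
        if int(cur) != cursor:
            continue
        depth[int(row_id)] = depth.get(int(parent_id), -1) + 1
        lines.append("  " * depth[int(row_id)] + op)
    return lines


trace = """\
STAT #5 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS BY INDEX ROWID UNDO$ (cr=2 pr=2 pw=0 time=0 us)'
STAT #5 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=1 pw=0 time=0 us)'
"""

for line in plan_lines(trace, 5):
    print(line)
```

For the two STAT lines of cursor #5 this prints the table access with the I_UNDO1 index scan nested one level below it.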

Deleting the i_undo1 entry from bootstrap$

SQL> select line#,obj#,sql_text from bootstrap$ where sql_text like '%I_UNDO1%';

     LINE#       OBJ#
---------- ----------
SQL_TEXT
--------------------------------------------------------------------------------
        34         34
CREATE UNIQUE INDEX I_UNDO1 ON UNDO$(US#) PCTFREE 10 INITRANS 2 MAXTRANS 255 STO
RAGE (  INITIAL 64K NEXT 1024K MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0
OBJNO 34 EXTENTS (FILE 1 BLOCK 320))


SQL> DELETE FROM BOOTSTRAP$ WHERE OBJ#=34;

1 row deleted.

SQL> COMMIT;

Commit complete.

SQL> SHUTDOWN IMMEDIATE;
Database closed.
Database dismounted.
ORACLE instance shut down.

Tracing database startup again

SQL> STARTUP MOUNT;
ORACLE instance started.

Total System Global Area 2421825536 bytes
Fixed Size                  2215744 bytes
Variable Size            1828716736 bytes
Database Buffers          570425344 bytes
Redo Buffers               20467712 bytes
Database mounted.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12
Statement processed.
SQL> oradebug TRACEFILE_NAME
/opt/oracle/diag/rdbms/test/test/trace/test_ora_19205.trc
SQL> alter database Open;

Database altered.

Analyzing the 10046 trace file again

PARSING IN CURSOR #5 len=142 dep=1 uid=0 oct=3 lid=0 tim=1486221250365628 hv=361892850 ad='ef935ce0' sqlid='7bd391hat42zk'
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
END OF STMT
PARSE #5:c=1000,e=536,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=3,plh=0,tim=1486221250365627
BINDS #5:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f89004340a8  bln=22  avl=02  flg=05
  value=1
EXEC #5:c=1000,e=727,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=3,plh=3995376916,tim=1486221250366467
WAIT #5: nam='db file sequential read' ela= 12 file#=1 block#=224 blocks=1 obj#=15 tim=1486221250366534
WAIT #5: nam='db file sequential read' ela= 10 file#=1 block#=225 blocks=1 obj#=15 tim=1486221250366606
FETCH #5:c=0,e=147,p=2,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=3995376916,tim=1486221250366647
STAT #5 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS FULL UNDO$ (cr=3 pr=2 pw=0 time=0 us)'
CLOSE #5:c=0,e=39,dep=1,type=0,tim=1486221250366717

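This time the trace no longer contains create index i_undo1, and the query on undo$ where us#=:1 is served by a full table scan of undo$. This clearly demonstrates that modifying the contents of bootstrap$ can change the execution plans of SQL run at database startup.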
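The same change is visible in the WAIT lines: before the delete, cursor #5 read a block of obj#=34 (I_UNDO1), and afterwards it only read blocks of obj#=15 (UNDO$). The sketch below compares the objects touched in the two traces. The regular expression and the function name are assumptions modelled on the WAIT lines shown in this post.

```python
import re

# Matches lines such as:
# WAIT #5: nam='db file sequential read' ela= 12 file#=1 block#=224 blocks=1 obj#=15 tim=...
WAIT_RE = re.compile(r"WAIT #(\d+): nam='[^']+'.*?block#=(\d+).*?obj#=(-?\d+)")


def blocks_read(trace_text, cursor):
    """Return (obj#, block#) pairs from the WAIT lines of one cursor."""
    return [
        (int(obj), int(block))
        for cur, block, obj in WAIT_RE.findall(trace_text)
        if int(cur) == cursor
    ]


before = """\
WAIT #5: nam='db file sequential read' ela= 1018 file#=1 block#=321 blocks=1 obj#=34 tim=1486220894171138
WAIT #5: nam='db file sequential read' ela= 2620 file#=1 block#=225 blocks=1 obj#=15 tim=1486220894173817
"""
after = """\
WAIT #5: nam='db file sequential read' ela= 12 file#=1 block#=224 blocks=1 obj#=15 tim=1486221250366534
WAIT #5: nam='db file sequential read' ela= 10 file#=1 block#=225 blocks=1 obj#=15 tim=1486221250366606
"""

objs_before = {obj for obj, _ in blocks_read(before, 5)}
objs_after = {obj for obj, _ in blocks_read(after, 5)}
print("objects no longer read:", sorted(objs_before - objs_after))
```

The only object that disappears is 34, the OBJNO of I_UNDO1 in the deleted bootstrap$ row.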
