Monthly archive: August 2018

Database startup fails with ORA-03113 and ORA-07445 lmebucp because of a bootstrap$ anomaly

The database cannot start normally and reports ORA-03113

SQL> startup
ORACLE instance started.

Total System Global Area 5016387584 bytes
Fixed Size                  2011136 bytes
Variable Size             905969664 bytes
Database Buffers         4093640704 bytes
Redo Buffers               14766080 bytes
Database mounted.
ORA-03113: end-of-file on communication channel

The alert log reports ORA-07445 lmebucp

Mon Aug 27 15:31:37 2018
Thread 1 advanced to log sequence 21691
Thread 1 opened at log sequence 21691
  Current log# 2 seq# 21691 mem# 0: /data/oracle/orcl/redo02.log
Successful open of redo thread 1
Mon Aug 27 15:31:37 2018
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 27 15:31:37 2018
SMON: enabling cache recovery
Mon Aug 27 15:31:37 2018
Errors in file /home/oracle/oracle/product/10.2.0/db_1/admin/orcl/udump/orcl_ora_5827.trc:
ORA-07445: exception encountered: core dump [lmebucp()+24] [SIGSEGV] 
[Address not mapped to object] [0x000000000] [] []

Tracing the startup with a 10046 trace

WAIT #1: nam='instance state change' ela= 822 layer=2 value=1 waited=1 obj#=-1 tim=1499370211971345
WAIT #1: nam='db file sequential read' ela= 29 file#=1 block#=257 blocks=1 obj#=-1 tim=1499370211971896
=====================
PARSING IN CURSOR #2 len=188 dep=1 uid=0 oct=1 lid=0 tim=1499370211972625 hv=2809067040 ad='b5fe2d00'
create table bootstrap$ ( line#         number not null,   obj#           
number not null,   sql_text   varchar2(4000) not null)   
storage (initial 50K objno 41 extents (file 1 block 257))
END OF STMT
PARSE #2:c=0,e=598,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1499370211972621
BINDS #2:
EXEC #2:c=1000,e=195,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1499370211972873
=====================
PARSING IN CURSOR #2 len=55 dep=1 uid=0 oct=3 lid=0 tim=1499370211973429 hv=2111436465 ad='b7bd0530'
select line#, sql_text from bootstrap$ where obj# != :1
END OF STMT
PARSE #2:c=0,e=472,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1499370211973426
BINDS #2:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=2b8c5d50a4d0  bln=22  avl=02  flg=05
  value=41
EXEC #2:c=1000,e=838,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1499370211974375
WAIT #2: nam='db file sequential read' ela= 27 file#=1 block#=257 blocks=1 obj#=-1 tim=1499370211974522
WAIT #2: nam='db file sequential read' ela= 21 file#=1 block#=258 blocks=1 obj#=-1 tim=1499370211974855
FETCH #2:c=1000,e=479,p=2,cr=3,cu=0,mis=0,r=0,dep=1,og=4,tim=1499370211974908
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object),
 addr: 0x0, PC: [0x348772c, lmebucp()+24]
*** 2018-08-27 15:31:37.074
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [lmebucp()+24] [SIGSEGV] 
[Address not mapped to object] [0x000000000] [] []
Current SQL statement for this session:
alter database open
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
Cannot find symbol
Cannot find symbol
Cannot find symbol
ksedst()+31          call     ksedst1()            000000001 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
ksedmp()+610         call     ksedst()             000000001 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
ssexhd()+630         call     ksedmp()             000000003 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
<0x336800eca0>       call     ssexhd()             00000000B ? 2B8C5D238D70 ?
                                                   2B8C5D238C40 ? 000000000 ?
                                                   000000000 ? 000000001 ?
 
--------------------- Binary Stack Dump ---------------------

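The trace shows that during startup the database executes select line#, sql_text from bootstrap$ where obj# != :1 and then fails with ORA-07445 lmebucp. An error like this is unusual and is typically caused by a bootstrap anomaly. Tracing further back, the create table bootstrap$ statement points at file 1 block 257, whereas from experience bootstrap$ is most commonly located at block 377 or 520. File 1 was therefore analyzed with a tool: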

DUL> dump datafile 1 block 257
Block Header:
block type=0x10 (data segment header block (unlimited extents))
block format=0xa2 (oracle 10)
block rdba=0x00400101 (file#=1, block#=257)
scn=0x0000.0000007e, seq=1, tail=0x007e1001
block checksum value=0xe75c=59228, flag=4
Data Segment Header:
  Extent Control Header
  -------------------------------------------------------------
  Extent Header:: extents: 1  blocks: 7
                  last map: 0x00000000  #maps: 0  offset: 4128
      Highwater:: 0x00400103  (rfile#=1,block#=259)
                  ext#: 0  blk#: 1   ext size:7
      #blocks in seg. hdr's freelists: 0
      #blocks below: 1
      mapblk: 0x00000000   offset: 0
      Map Header:: next: 0x00000000   #extents: 1  obj#: 41  flag: 0x40000000
  Extent Control Header
  -------------------------------------------------------------
   0x00400102  length: 7

  nfl = 1, nfb = 1, typ = 2, nxf = 0, ccnt = 0
  SEG LST:: flg:UNUSED lhd: 0x00000000 ltl: 0x00000000
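
As a cross-check on the numbers in this dump, the following minimal Python sketch decodes the rdba values and recomputes the tail value from the SCN, block type and seq shown above. It assumes a smallfile tablespace (10-bit relative file number, 22-bit block number); the function names are made up for illustration and are not part of DUL or any Oracle tool.

```python
from __future__ import annotations


def decode_rdba(rdba: int) -> tuple[int, int]:
    """Split a 32-bit relative data block address into (relative file#, block#).

    Assumes the smallfile layout: upper 10 bits file#, lower 22 bits block#.
    """
    return rdba >> 22, rdba & 0x3FFFFF


def expected_tail(scn_base: int, block_type: int, seq: int) -> int:
    """Tail check value: low 16 bits of the SCN base, then block type, then seq."""
    return ((scn_base & 0xFFFF) << 16) | ((block_type & 0xFF) << 8) | (seq & 0xFF)


# Values copied from the dump of file 1 block 257.
print(decode_rdba(0x00400101))   # block rdba (the segment header itself)
print(decode_rdba(0x00400102))   # start of the single extent
print(decode_rdba(0x00400103))   # high-water mark
print(hex(expected_tail(0x0000007E, 0x10, 1)))  # compare with tail=0x007e1001
```

The decoded values agree with the file#/block# pairs printed by the tool, and the recomputed tail matches the one in the header, which suggests the block itself is internally consistent and the problem lies in what points to it.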

The anomaly is fairly obvious: block 257 carries data_object_id=41, which corresponds to the following bootstrap$ record:
41|41|CREATE UNIQUE INDEX I_FILE1 ON FILE$(FILE#) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE ( INITIAL 64K NEXT 1024K MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 41 EXTENTS (FILE 1 BLOCK 257))
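This indicates that either the database bootstrap pointer or the records in the bootstrap$ table are abnormal. After the bootstrap-related content was repaired, the database started cleanly.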
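
The collision described above can be made explicit with a small Python sketch that pulls OBJNO and the FILE/BLOCK pair out of a bootstrap-style STORAGE clause. The regular expression and the helper name are illustrative assumptions; the two statements are copied from the trace and the record above, with whitespace normalized.

```python
import re

# Matches "... OBJNO 41 EXTENTS (FILE 1 BLOCK 257)" in either letter case.
STORAGE_RE = re.compile(
    r"OBJNO\s+(\d+)\s+EXTENTS\s*\(\s*FILE\s+(\d+)\s+BLOCK\s+(\d+)\s*\)",
    re.IGNORECASE,
)


def storage_location(sql_text):
    """Return (objno, file#, block#) from a bootstrap-style statement, or None."""
    match = STORAGE_RE.search(sql_text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


# Statement seen in the 10046 trace at "alter database open".
create_bootstrap = (
    "create table bootstrap$ ( line# number not null, obj# number not null, "
    "sql_text varchar2(4000) not null) "
    "storage (initial 50K objno 41 extents (file 1 block 257))"
)

# Record for object 41 as quoted in the post.
i_file1 = (
    "CREATE UNIQUE INDEX I_FILE1 ON FILE$(FILE#) PCTFREE 10 INITRANS 2 "
    "MAXTRANS 255 STORAGE ( INITIAL 64K NEXT 1024K MINEXTENTS 1 "
    "MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 41 EXTENTS (FILE 1 BLOCK 257))"
)

print(storage_location(create_bootstrap))
print(storage_location(i_file1))
# True means both statements claim the same object number and segment header.
print(storage_location(create_bootstrap) == storage_location(i_file1))
```

Both statements resolve to the same object number and segment header, so the table definition used at startup and the I_FILE1 index cannot both be right.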


Another case of recovering ASM disks that were formatted with a file system

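Yet another customer formatted the ASM disks of a Windows RAC as NTFS. The DATA disk group consists of three 500 GB disks; the first two were formatted, leaving one intact. After the format, the customer also attempted a series of recovery steps (repairing the disk header, repartitioning, and other disk operations), which made the recovery harder and overwrote additional data.
The ASM alert log reports errors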

Thu Aug 23 11:20:14 2018
NOTE: ASM client orcl1:orcl disconnected unexpectedly.
NOTE: check client alert log.
NOTE: Process state recorded in trace file d:\app\administrator\diag\asm\+asm\+asm1\trace\+asm1_ora_2260.trc
Thu Aug 23 11:20:28 2018
Errors in file d:\app\administrator\diag\asm\+asm\+asm1\trace\+asm1_lgwr_3820.trc:
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 87) The parameter is incorrect.
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:26 disk_offset(bytes):27566080 io_size:4096 operation:Write type:synchronous
	 result:I/O error process_id:3820
NOTE: unable to write any mirror side for diskgroup DATA
NOTE: cache initiating offline of disk 1 group DATA
NOTE: process 3268:3820 initiating offline of disk 1.4042301899 (DATA_0001) with mask 0x7e in group 2
WARNING: Disk DATA_0001 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 2, dsk = 1/0xf0f0a1cb, mode = 0x15
kfdp_updateDsk(): 22 
Thu Aug 23 11:20:28 2018
kfdp_updateDskBg(): 22 
ERROR: too many offline disks in PST (grp 2)
WARNING: Disk DATA_0001 in mode 0x7f offline aborted

The database alert log reports errors

WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:422 disk_offset(bytes):442515456 io_size:16384 operation:Read type:synchronous
	 result:I/O error process_id:11992
WARNING: failed to read mirror side 1 of virtual extent 5 logical extent 0 of file 260 in 
group [2.1859146063] from disk DATA_0001  allocation unit 422 reason error; if possible,will try another mirror side 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_11992.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 5 logical extent 0 of file 260 
in group 2 on disk 1 allocation unit 422 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_11992.trc:
ORA-00202: control file: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: failed to submit an I/O operation to a disk
Thu Aug 23 11:20:13 2018
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-27070: async read/write failed
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:841 disk_offset(bytes):882532352 io_size:131072 operation:Write type:asynchronous
	 result:I/O error process_id:3224
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 240 logical extent 0 of file 259 in group 2 on disk 1 
allocation unit 841 KCF: read, write or open error, block=0x7853 online=1
        file=4 '+DATA/orcl/datafile/users.259.944422883'
        error=15081 txt: ''
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-27070: async read/write failed
OSD-04006: ReadFile() failure, unable to read from file
O/S-Error: (OS 87) The parameter is incorrect.
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:422 disk_offset(bytes):442515456 io_size:16384 operation:Read type:synchronous
	 result:I/O error process_id:3224
WARNING: failed to read mirror side 1 of virtual extent 5 logical extent 0 of file 260 in group [2.1859146063] from 
disk DATA_0001  allocation unit 422 reason error; if possible,will try another mirror side 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 5 logical extent 0 of file 260 in group 2 on disk 1 
allocation unit 422 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-00202: control file: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: failed to submit an I/O operation to a disk
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-00204: error in reading (block 41, # blocks 1) of control file
ORA-00202: control file: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: failed to submit an I/O operation to a disk
DBW1 (ospid: 3224): terminating the instance due to error 204
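
The "IO Failed" warnings in both logs carry an AU number and a byte offset, which is enough to sanity-check the allocation unit size. The Python sketch below assumes a 1 MiB AU (an assumption, not stated in the logs) and checks that each offset falls inside the AU the log names; the regular expression and function name are illustrative only.

```python
import re

AU_SIZE = 1024 * 1024  # assumption: 1 MiB allocation units

IO_RE = re.compile(
    r"AU:(\d+)\s+disk_offset\(bytes\):(\d+)\s+io_size:(\d+)\s+operation:(\w+)"
)


def parse_io_failure(line, au_size=AU_SIZE):
    """Extract AU, offset and I/O size from an 'IO Failed' detail line."""
    match = IO_RE.search(line)
    if match is None:
        return None
    au, offset, io_size = (int(match.group(i)) for i in (1, 2, 3))
    return {
        "au": au,
        "offset": offset,
        "io_size": io_size,
        "operation": match.group(4),
        "au_from_offset": offset // au_size,  # AU implied by the byte offset
        "offset_in_au": offset % au_size,     # position inside that AU
    }


# Detail lines copied from the ASM and database alert logs.
lines = [
    "\t AU:26 disk_offset(bytes):27566080 io_size:4096 operation:Write type:synchronous",
    "\t AU:422 disk_offset(bytes):442515456 io_size:16384 operation:Read type:synchronous",
    "\t AU:841 disk_offset(bytes):882532352 io_size:131072 operation:Write type:asynchronous",
]

for line in lines:
    info = parse_io_failure(line)
    print(info["au"], info["au_from_offset"], info["offset_in_au"], info["operation"])
```

If the AU implied by the offset equals the AU reported in the log for every line, the 1 MiB assumption is consistent with these messages, which matters later when data has to be located AU by AU.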

Because the customer had already run a series of recovery operations, the disk listing was incomplete

D:\>asmtool -list
NTFS                             \Device\Harddisk0\Partition1              100M
NTFS                             \Device\Harddisk0\Partition2           102298M
NTFS                             \Device\Harddisk1\Partition1           102397M
NTFS                             \Device\Harddisk2\Partition1           204797M
---one more disk should appear here but is not displayed
ORCLDISKDATA10                   \Device\Harddisk4\Partition1           511997M--the disk the customer tried to repair
ORCLDISKDATA2                    \Device\Harddisk5\Partition1           511997M
ORCLDISKRECOVERY0                \Device\Harddisk6\Partition1            51197M
ORCLDISKRECOVERY1                \Device\Harddisk7\Partition1            51197M
ORCLDISKRECOVERY2                \Device\Harddisk8\Partition1            51197M
ORCLDISKCRS0                     \Device\Harddisk9\Partition1            10237M
ORCLDISKCRS1                     \Device\Harddisk10\Partition1           10237M
ORCLDISKCRS2                     \Device\Harddisk11\Partition1           10237M
NTFS                             \Device\Harddisk12\Partition2         4194174M

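After a series of host-level operations (activating volumes, deleting partitions, and so on), the disk headers were rebuilt with kfed so that the disks are displayed properly at the OS level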

C:\Users\Administrator>asmtool -list
NTFS                             \Device\Harddisk0\Partition1              100M
NTFS                             \Device\Harddisk0\Partition2           102298M
NTFS                             \Device\Harddisk1\Partition1           102397M
NTFS                             \Device\Harddisk2\Partition1           204797M
------disks to be recovered------
ORCLDISKDATA0                    \Device\Harddisk3\Partition1           511997M
ORCLDISKDATA1                    \Device\Harddisk4\Partition1           511997M
ORCLDISKDATA2                    \Device\Harddisk5\Partition1           511997M
-----------------------
ORCLDISKRECOVERY0                \Device\Harddisk6\Partition1            51197M
ORCLDISKRECOVERY1                \Device\Harddisk7\Partition1            51197M
ORCLDISKRECOVERY2                \Device\Harddisk8\Partition1            51197M
ORCLDISKCRS0                     \Device\Harddisk9\Partition1            10237M
ORCLDISKCRS1                     \Device\Harddisk10\Partition1           10237M
ORCLDISKCRS2                     \Device\Harddisk11\Partition1           10237M
NTFS                             \Device\Harddisk12\Partition2         4194174M

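Because the AUs holding the ASM disk group's internal directory were completely destroyed, the data could not be copied out directly through ASM. Instead, the data was recovered AU by AU through a low-level scan. Some data AUs had been overwritten by the NTFS format and the later mistaken operations; everything else was recovered, which saved the vast majority of the data.
For data file recovery, see: recovery from a completely damaged asm disk header
A similar earlier recovery on the Windows platform: recovering an asm disk formatted as ntfs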
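
To give a feel for what a low-level scan involves, here is a rough Python sketch that walks a raw byte buffer in block-sized steps and keeps the slices that look like Oracle data blocks. It is not the tool or method used in this recovery. It assumes an 8 KiB block size, a little-endian platform, the 0xa2 format byte, and the tail check (low 16 bits of the SCN base, block type, seq); `make_block` only builds synthetic test data.

```python
import struct

BLOCK_SIZE = 8192    # assumption: 8 KiB database blocks
FORMAT_BYTE = 0xA2   # block format byte assumed for 10g and later blocks
HEADER = "<BBBBIIHBB"  # type, format, 2 spare bytes, rdba, scn base, scn wrap, seq, flag


def scan_blocks(buf, block_size=BLOCK_SIZE):
    """Yield (byte offset, relative file#, block#) for plausible Oracle blocks."""
    for off in range(0, len(buf) - block_size + 1, block_size):
        btype, fmt, _s1, _s2, rdba, scn_base, _wrap, seq, _flag = struct.unpack_from(
            HEADER, buf, off
        )
        (tail,) = struct.unpack_from("<I", buf, off + block_size - 4)
        expected = ((scn_base & 0xFFFF) << 16) | (btype << 8) | seq
        if fmt == FORMAT_BYTE and btype != 0 and tail == expected:
            yield off, rdba >> 22, rdba & 0x3FFFFF


def make_block(btype, rdba, scn_base, seq, block_size=BLOCK_SIZE):
    """Build a synthetic block with a consistent header and tail (test data only)."""
    blk = bytearray(block_size)
    struct.pack_into(HEADER, blk, 0, btype, FORMAT_BYTE, 0, 0, rdba, scn_base, 0, seq, 0)
    tail = ((scn_base & 0xFFFF) << 16) | (btype << 8) | seq
    struct.pack_into("<I", blk, block_size - 4, tail)
    return bytes(blk)


# Synthetic buffer: an empty block, one valid-looking block, one block of junk.
sample = bytes(BLOCK_SIZE) + make_block(0x10, 0x00400101, 0x7E, 1) + b"\xff" * BLOCK_SIZE
print(list(scan_blocks(sample)))
```

A real recovery would read the disk AU by AU, group the hits by file number, and order them by block number before rebuilding data files; overwritten AUs simply produce no hits.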


ASM disk size limits

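Before 12c this question was barely disputed: the general consensus was that an ASM disk in a non-Exadata (XD) environment could not exceed 2 TB. Later versions changed this. When the COMPATIBLE.ASM and COMPATIBLE.RDBMS disk group attributes are set to 12.1 or greater, the ASM disk size limit depends on the AU size:
1 MB AU size: ASM disk limit is 4 PB
2 MB AU size: ASM disk limit is 8 PB
4 MB AU size: ASM disk limit is 16 PB
8 MB AU size: ASM disk limit is 32 PB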
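
The rule above can be written down as a small lookup. This Python sketch covers only the four AU sizes listed here and the non-Exadata case; the function names are hypothetical, and treating TB and PB as binary units is an assumption.

```python
TB = 1024 ** 4  # assumption: binary units
PB = 1024 ** 5

# Per-disk limits for the AU sizes listed above (AU size in MB -> bytes).
LIMITS_12_1 = {1: 4 * PB, 2: 8 * PB, 4: 16 * PB, 8: 32 * PB}


def version_tuple(version):
    """Turn '12.1.0.2' into (12, 1, 0, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def max_asm_disk_bytes(compatible_asm, compatible_rdbms, au_size_mb):
    """Maximum ASM disk size for a non-Exadata disk group, per the rule above."""
    lowest = min(version_tuple(compatible_asm), version_tuple(compatible_rdbms))
    if lowest >= (12, 1):
        if au_size_mb not in LIMITS_12_1:
            raise ValueError("AU size not covered by the list above: %r" % au_size_mb)
        return LIMITS_12_1[au_size_mb]
    return 2 * TB  # either attribute below 12.1: the 2 TB limit applies


print(max_asm_disk_bytes("12.1", "12.1", 1) // PB)      # limit in PB
print(max_asm_disk_bytes("18.0.0.0", "10.1", 4) // TB)  # limit in TB
```

The second call shows why the attribute values matter: with COMPATIBLE.RDBMS left at 10.1, the AU size has no effect and the limit stays at 2 TB.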



See: Oracle ASM Storage Limits
Default values of COMPATIBLE.ASM and COMPATIBLE.RDBMS in 18c: COMPATIBLE.RDBMS defaults to 10.1, so by default a non-Exadata environment still supports only ASM disks of up to 2 TB.
