月归档:七月 2012

通过bbed修改回滚段状态解决ORA-00704故障

undo文件丢失

SQL> startup 
ORACLE instance started.

Total System Global Area  313860096 bytes
Fixed Size                  1344652 bytes
Variable Size             218106740 bytes
Database Buffers           88080384 bytes
Redo Buffers                6328320 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 3 - see DBWR trace file
ORA-01110: data file 3: '/u01/oracle/oradata/ora11g/undotbs01.dbf'

[root@xifenfei ~]# ls -l /u01/oracle/oradata/ora11g/undotbs01.dbf
ls: /u01/oracle/oradata/ora11g/undotbs01.dbf: No such file or directory

尝试offline 数据文件


SQL> alter database datafile 3 offline;

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oracle/oradata/ora11g/undotbs01.dbf'
Process ID: 16365
Session ID: 125 Serial number: 5

通过这里的错误提示可以看到因为datafile 3丢失并且offline,而在数据库启动的时候需要去使用该数据文件中的回滚段去回滚事务,但是因为该数据文件被offline,使得回滚不能进行从而出现该错误.这里出现ORA-00704和ORA-00604的错误,根据感觉不能轻易的使用屏蔽回滚段的方法实现,但是还是做一尝试.使用其他方法找到回滚段.

屏蔽回滚段后重启库

SQL> startup mount pfile='/tmp/pfile'   force
ORACLE instance started.

Total System Global Area  313860096 bytes
Fixed Size                  1344652 bytes
Variable Size             218106740 bytes
Database Buffers           88080384 bytes
Redo Buffers                6328320 bytes
Database mounted.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oracle/oradata/ora11g/undotbs01.dbf'
Process ID: 16696
Session ID: 125 Serial number: 5

增加了合适的回滚段屏蔽,但是数据库还不能正常启动,而且依然报ORA-00704,决定对数据库启动过程做一个10046跟踪,来判断在哪一步出了问题

对数据库open过程做10046

SQL> conn  / as sysdba
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile' mount;
ORACLE instance started.

Total System Global Area  313860096 bytes
Fixed Size                  1344652 bytes
Variable Size             218106740 bytes
Database Buffers           88080384 bytes
Redo Buffers                6328320 bytes
Database mounted.

SQL> oradebug setmypid 
Statement processed.
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12
Statement processed.
SQL> oradebug TRACEFILE_NAME
/u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_16869.trc
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oracle/oradata/ora11g/undotbs01.dbf'
Process ID: 16869
Session ID: 125 Serial number: 5

--trace中发现
PARSE ERROR #3063083528:len=60 dep=1 uid=0 oct=3 lid=0 tim=1342472283605146 err=604
SELECT NULL FROM PROPS$ WHERE NAME='BOOTSTRAP_UPGRADE_ERROR'
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oracle/oradata/ora11g/undotbs01.dbf'
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oracle/oradata/ora11g/undotbs01.dbf'

通过这里可以看出来很可能是在SELECT PROPS$表的时候,需要使用到回滚段,这里有两种可能需要使用回滚段(1.有数据未提交[提交事务],2.数据块scn过大[提交事务/推进scn]).对于这个问题,我尝试着修改回滚段状态来解决这个问题

修改回滚段状态

[oracle@xifenfei ~]$ bbed listfile=listfile password=blockedit  mode=edit

BBED: Release 2.0.0.0.0 - Limited Production on Tue Jul 17 09:10:01 2012

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> set file 1 block 225
        FILE#           1
        BLOCK#          225

BBED> map
 File: /u01/oracle/oradata/ora11g/system01.dbf (1)
 Block: 225                                   Dba:0x004000e1
------------------------------------------------------------
 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0       

 struct ktbbh, 48 bytes                     @20      

 struct kdbh, 14 bytes                      @68      

 struct kdbt[1], 4 bytes                    @82      

 sb2 kdbr[21]                               @86      

 ub1 freespace[5357]                        @128     

 ub1 rowdata[2703]                          @5485    

 ub4 tailchk                                @8188    

BBED> p *kdbr[1]
rowdata[1393]
-------------
ub1 rowdata[1393]                           @6878     0x2c

BBED> x /rncnnnnn
rowdata[1393]                               @6878    
-------------
flag@6878: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@6879: 0x00
cols@6880:   17

col    0[2] @6881: 1 
col   1[20] @6884: _SYSSMU1_3138885392$
col    2[2] @6905: 1 
col    3[2] @6908: 3 
col    4[3] @6911: 128 
col    5[4] @6915: 822624 
col    6[1] @6920: 0 
col    7[3] @6922: 498 
col    8[2] @6926: 94 
col    9[1] @6929: 0 
col   10[2] @6931: 3 
col   11[2] @6934: 2 
col   12[0] @6937: *NULL*
col   13[0] @6938: *NULL*
col   14[0] @6939: *NULL*
col   15[0] @6940: *NULL*
col   16[2] @6941: 2 


BBED> m /x 02 offset 6933
 File: /u01/oracle/oradata/ora11g/system01.dbf (1)
 Block: 225              Offsets: 6933 to 7444           Dba:0x004000e1
------------------------------------------------------------------------
 0202c103 ffffffff 02c1032c 001102c1 03145f53 5953534d 55325f34 32323832 
 33383232 322402c1 0202c104 03c2022d 04c3531b 16018003 c2075303 c2031401 
 8002c102 02c103ff ffffff02 c1032c00 1102c104 145f5359 53534d55 335f3232 
 31303734 32363432 2402c102 02c10403 c2023d04 c3531b17 018003c2 074903c2 
 02290180 02c10202 c103ffff ffff02c1 032c0011 02c10514 5f535953 534d5534 
 5f313435 35333138 30303624 02c10202 c10403c2 024d04c3 4f4f5101 8003c205 
 4502c164 018002c1 0402c103 ffffffff 02c1032c 001102c1 06145f53 5953534d 
 55355f33 37383736 32323331 362402c1 0202c104 03c2025d 04c3531b 1c018003 
 c2071603 c2021701 8002c102 02c103ff ffffff02 c1032c00 1102c107 145f5359 
 53534d55 365f3234 36303234 38303639 2402c102 02c10403 c2030904 c3531b1d 
 018003c2 075a03c2 02130180 02c10202 c103ffff ffff02c1 032c0011 02c10814 
 5f535953 534d5537 5f313932 34383833 30333724 02c10202 c10403c2 031904c3 
 531b2201 8003c207 5a03c202 1d018002 c10202c1 03ffffff ff02c103 2c001102 
 c109135f 53595353 4d55385f 32373630 33383233 372402c1 0202c104 03c20329 
 04c34f2e 34018003 c2071703 c2024a01 8002c102 02c103ff ffffff02 c1032c00 
 1102c10a 145f5359 53534d55 395f3335 39333435 30363135 2402c102 02c10403 

 <32 bytes per line>

…………类似方法修改其他值

BBED> sum apply
Check value for File 1, Block 225:
current = 0x4d51, required = 0x4d51

启动数据库并解决异常undo

SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area  313860096 bytes
Fixed Size                  1344652 bytes
Variable Size             218106740 bytes
Database Buffers           88080384 bytes
Redo Buffers                6328320 bytes
Database mounted.
Database opened.

SQL> drop tablespace undotbs1;
drop tablespace undotbs1
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified


SQL> drop tablespace undotbs1 including contents;
drop tablespace undotbs1
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified

SQL> select type# from seg$ where file#=3; 

     TYPE#
----------
        10
        10
        10
        10
        10
        10
        10
        10
        10
        10

10 rows selected.

SQL> update seg$ set type#=3 where file#=3;

10 rows updated.

SQL> commit;

Commit complete.

SQL>  drop tablespace undotbs1 including contents;

Tablespace dropped.

SQL> create undo tablespace undotbs1 datafile '/u01/oracle/oradata/ora11g/undotbs01.dbf' size 10m;

Tablespace created.

相关文档
bbed修改undo$(回滚段)状态
因非常规操作导致删除表空间提示ORA-01561解决办法

发表在 非常规恢复 | 标签为 , , | 一条评论

记录8.0.5数据库恢复过程

某银行需要恢复2004年磁带备份数据库 8.0.5,当时因为开发商使用begin backup命令进行热备,然后压缩到磁带上.在硬盘紧张,rman不成熟的时代,使用这样的方法备份本身没有大问题,可是因为没有备份归档日志,给现在的恢复带来了很多的问题.
尝试不完全恢复启动库

SVRMGR> startup pfile='init.ora' mount
ORACLE instance started.
Total System Global Area                        141826208 bytes
Fixed Size                                          47264 bytes
Variable Size                                   124829696 bytes
Database Buffers                                 16777216 bytes
Redo Buffers                                       172032 bytes
Database mounted.
SVRMGR> recover database using backup controlfile until cancel;
ORA-00279: change 613561556 generated at 08/21/04 22:42:48 needed for thread 1
ORA-00289: suggestion : /oracle/product/8.0.5/dbs/arch1_55329.dbf
ORA-00280: change 613561556 for thread 1 is in sequence #55329
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SVRMGR> alter database open resetlogs;
alter database open resetlogs
*
ORA-00600: internal error code, arguments: [4146], [31400], [31370], [], [], [], [], []

这个错误是因为回滚段corruption导致,客户已经没有了一致性要求,所以解决办法是通过一些手段找出回滚段并屏蔽

屏蔽回滚段重启库

SVRMGR> shutdown abort
ORACLE instance shut down.
SVRMGR> startup  pfile='init.ora' mount
ORACLE instance started.
Total System Global Area                        141826208 bytes
Fixed Size                                          47264 bytes
Variable Size                                   124829696 bytes
Database Buffers                                 16777216 bytes
Redo Buffers                                       172032 bytes
Database mounted.
SVRMGR> alter database open;
alter database open
*
ORA-00600: internal error code, arguments: [3668], [1], [2], [55992], [55992], [4], [], []

根据ORA-00600[3668]错误的提示,因为重建控制文件,有一个数据文件因为在磁带恢复丢失,所以控制文件在上次重建的时候没有包含进去,根据经验在8.0中可以通过resetlogs解决

resetlogs重新打开数据库

SVRMGR> alter database open resetlogs;
alter database open resetlogs
*
ORA-01139: RESETLOGS option only valid after an incomplete database recovery
SVRMGR> recover datatabase using backup controlfile until cancel;
ORA-00274: illegal recovery option DATATABASE
SVRMGR> recover database using backup controlfile until cancel;
ORA-00279: change 613561558 generated at 07/30/12 09:14:49 needed for thread 1
ORA-00289: suggestion : /oracle/product/8.0.5/dbs/arch1_1.dbf
ORA-00280: change 613561558 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SVRMGR> alter database open resetlogs;
alter database open resetlogs
*
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small

根据经验可能是屏蔽了undo或者undo损坏,然后有事务可能需要读undo,无法读到对应记录,可能出现此错误,因为无trace文件,尝试推进scn解决.当然这个问题可以通过open时做10046然后深入分析也许可以找到原因.

open数据库成功
根据上面的分析,重启库,推进scn,resetlogs库解决问题

SVRMGR> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SVRMGR> startup pfile='init.ora' mount;
ORACLE instance started.
Total System Global Area                        141826208 bytes
Fixed Size                                          47264 bytes
Variable Size                                   124829696 bytes
Database Buffers                                 16777216 bytes
Redo Buffers                                       172032 bytes
Database mounted.
SVRMGR> alter database open ;
alter database open 
*
ORA-00600: internal error code, arguments: [3668], [1], [2], [55994], [55994], [4], [], []
SVRMGR> recover database using backup controlfile until cancel;
ORA-00279: change 613561560 generated at 07/30/12 09:17:11 needed for thread 1
ORA-00289: suggestion : /oracle/product/8.0.5/dbs/arch1_1.dbf
ORA-00280: change 613561560 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SVRMGR> alter database open resetlogs;
Statement processed.

这次恢复比较幸运,在备份到磁带过程中被覆盖的一个同名的数据文件,因为是index,通过查询dba_extents,然后删除并重建相关index,s数据库算恢复完全.有些时候,对数据库多一份备份,可能成为最后的救命稻草,哪怕是一份不是十分完整的备份.再次重复:对dba而言,数据库备份重于一切

发表在 Oracle备份恢复 | 标签为 | 一条评论

bbed修改undo$(回滚段)状态

指定undo$对应数据块(11g 1_225,10g 1_106)[具体时候需要查询]

BBED> set file 1 block 225
        FILE#           1
        BLOCK#          225

查询数据内容,主要关注kdbr

BBED> map
 File: /u01/oracle/oradata/ora11g/system01.dbf (1)
 Block: 225                                   Dba:0x004000e1
------------------------------------------------------------
 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0       

 struct ktbbh, 48 bytes                     @20      

 struct kdbh, 14 bytes                      @68      

 struct kdbt[1], 4 bytes                    @82      

 sb2 kdbr[31]                               @86      <===表示31条记录,从0开始 

 ub1 freespace[3644]                        @148     

 ub1 rowdata[4396]                          @3792    

 ub4 tailchk                                @8188   

查看30号回滚段内容(列举其中一个)

BBED> p *kdbr[30]
rowdata[0]
----------
ub1 rowdata[0]                              @3792     0x2c

BBED> x /rncnnnnnnn
rowdata[0]                                  @3792    
----------
flag@3792: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@3793: 0x01
cols@3794:   17

col    0[2] @3795: 30 
col   1[21] @3798: _SYSSMU30_3379578723$
col    2[2] @3820: 1 
col    3[2] @3823: 3 
col    4[3] @3826: 432 
col    5[1] @3830: 0 
col    6[1] @3832: 0 
col    7[1] @3834: 0 
col    8[1] @3836: 0 
col    9[1] @3838: 0 
col   10[2] @3840: 5     <===修改前对应值undo$.status$
col   11[2] @3843: 6 
col   12[0] @3846: *NULL*
col   13[0] @3847: *NULL*
col   14[0] @3848: *NULL*
col   15[0] @3849: *NULL*
col   16[2] @3850: 2 

BBED> x /r
rowdata[0]                                  @3792    
----------
flag@3792: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@3793: 0x01
cols@3794:   17

col    0[2] @3795:  0xc1  0x1f 
col   1[21] @3798:  0x5f  0x53  0x59  0x53  0x53  0x4d  0x55  0x33  0x30  0x5f 
 0x33  0x33  0x37  0x39  0x35  0x37  0x38  0x37  0x32  0x33  0x24 
col    2[2] @3820:  0xc1  0x02 
col    3[2] @3823:  0xc1  0x04 
col    4[3] @3826:  0xc2  0x05  0x21 
col    5[1] @3830:  0x80 
col    6[1] @3832:  0x80 
col    7[1] @3834:  0x80 
col    8[1] @3836:  0x80 
col    9[1] @3838:  0x80 
col   10[2] @3840:  0xc1  0x06   <===修改前16进制值 
col   11[2] @3843:  0xc1  0x07 
col   12[0] @3846: *NULL*
col   13[0] @3847: *NULL*
col   14[0] @3848: *NULL*
col   15[0] @3849: *NULL*
col   16[2] @3850:  0xc1  0x03 

修改回滚段状态

BBED> m /x 02 offset 3842    <===注意修改为02
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/system01.dbf (1)
 Block: 225              Offsets: 3842 to 4353           Dba:0x004000e1
------------------------------------------------------------------------
 0202c107 ffffffff 02c1032c 001102c1 


 <32 bytes per line>

BBED> p *kdbr[30]
rowdata[0]
----------
ub1 rowdata[0]                              @3792     0x2c

BBED> x /r
rowdata[0]                                  @3792    
----------
flag@3792: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@3793: 0x01
cols@3794:   17

col    0[2] @3795:  0xc1  0x1f 
col   1[21] @3798:  0x5f  0x53  0x59  0x53  0x53  0x4d  0x55  0x33  0x30  0x5f 
 0x33  0x33  0x37  0x39  0x35  0x37  0x38  0x37  0x32  0x33  0x24 
col    2[2] @3820:  0xc1  0x02 
col    3[2] @3823:  0xc1  0x04 
col    4[3] @3826:  0xc2  0x05  0x21 
col    5[1] @3830:  0x80 
col    6[1] @3832:  0x80 
col    7[1] @3834:  0x80 
col    8[1] @3836:  0x80 
col    9[1] @3838:  0x80 
col   10[2] @3840:  0xc1  0x02   <===修改后16进制值
col   11[2] @3843:  0xc1  0x07 
col   12[0] @3846: *NULL*
col   13[0] @3847: *NULL*
col   14[0] @3848: *NULL*
col   15[0] @3849: *NULL*
col   16[2] @3850:  0xc1  0x03 

BBED>  x /rncnnnnnnn
rowdata[0]                                  @3792    
----------
flag@3792: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@3793: 0x01
cols@3794:   17

col    0[2] @3795: 30 
col   1[21] @3798: _SYSSMU30_3379578723$
col    2[2] @3820: 1 
col    3[2] @3823: 3 
col    4[3] @3826: 432 
col    5[1] @3830: 0 
col    6[1] @3832: 0 
col    7[1] @3834: 0 
col    8[1] @3836: 0 
col    9[1] @3838: 0 
col   10[2] @3840: 1    <===实际展示值undo$.status$
col   11[2] @3843: 6 
col   12[0] @3846: *NULL*
col   13[0] @3847: *NULL*
col   14[0] @3848: *NULL*
col   15[0] @3849: *NULL*
col   16[2] @3850: 2  

1.修改为02(表示的实际值比修改值大1[1:DELETE,2:OFFLINE, 3:ONLINE,4:UNDEFINED,5:NEEDS RECOVERY,6:PARTLY AVAILABLE,其他表示:UNDEFINED])
2.Offset需要在10列漂移量上+2(或者11列偏移量-1)

修改验证值

BBED> sum apply
Check value for File 1, Block 225:
current = 0x9708, required = 0x9708

补充说明
1)通过dbms_metadata.get_ddl分析DBA_ROLLBACK_SEGS可以知道seg$.status$表示含义

2, 'OFFLINE'
3, 'ONLINE',
4, 'UNDEFINED'
5, 'NEEDS RECOVERY',
6, 'PARTLY AVAILABLE'
other, 'UNDEFINED'

2)该方法一般出现在使用隐含参数屏蔽回滚段无效之后使用,一般不到最后,不建议使用该方法,可能引起不可预知的后果
3)使用该方法启动数据库后,需要手工删除seg$中被bbed处理过的回滚段

发表在 非常规恢复 | 标签为 , , | 一条评论