标签归档:ORA-01115

aix平台 ORA-01115 ORA-01110 ORA-27067 故障恢复

接到朋友恢复请求,aix 5.3,Oracle 10.2.0.1平台,数据库启动报ORA-01115 ORA-01110 ORA-27067错误,数据库无法正常打开,通过分析,是由于10201在aix上面的bug导致,通过技巧规避,完美解决给问题,数据0丢失
数据库报错alert日志

Mon Aug 10 13:25:22 2015
ALTER DATABASE   MOUNT
Mon Aug 10 13:25:29 2015
Setting recovery target incarnation to 1
Mon Aug 10 13:25:29 2015
Successful mount of redo thread 1, with mount id 432339141
Mon Aug 10 13:25:29 2015
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Mon Aug 10 13:25:36 2015
alter database open
Mon Aug 10 13:25:36 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 13:25:37 2015
Started redo scan
Mon Aug 10 13:25:52 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:25:53 2015
Errors in file /dc/admin/datacent/bdump/datacent_p002_144124.trc:
ORA-01115: IO error reading block from file 2 (block # 40704)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1572864
Mon Aug 10 13:25:53 2015
Aborting crash recovery due to slave death, attempting serial crash recovery
Mon Aug 10 13:25:53 2015
Beginning crash recovery of 1 threads
Mon Aug 10 13:25:53 2015
Started redo scan
Mon Aug 10 13:26:09 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:26:12 2015
Aborting crash recovery due to error 1115
Mon Aug 10 13:26:12 2015
Errors in file /dc/admin/datacent/udump/datacent_ora_123384.trc:
ORA-01115: IO error reading block from file 2 (block # 39077)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1310720
ORA-1115 signalled during: alter database open...

这里报的前面两个错误ORA-01115 ORA-01110我们都非常熟悉,类似数据库启动遇到坏块或者io错误之时可能就会报如此错误。但是ORA-27067确实不多见,从mos上看,很多是由于rman备份之时的bug可能导致该错误。

dbv检测undo坏块文件

DBVERIFY: Release 10.2.0.1.0 - Production on Mon Aug 10 23:18:15 2015

Copyright (c) 1982, 2003, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /dc/oradata/datacent/undotbs01.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 329600
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 327504
Total Pages Processed (Seg)  : 17
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2096
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1887888 (0.1887888)

这里可以看到,undo文件本身并没有逻辑和物理的坏块,证明因为数据库异常的原因,可能是由于ORA-27067: size of I/O buffer is invalid导致。根据官方文档ORA-01115 ORA-27067 DURING PARALLEL INSTANCE RECOVERY AFTER INSTANCE CRASH中的解释,我们基本上可以确定很可能是由于10.2.0.1在aix平台的jfs2系统中,由于大量事务操作,突然abort掉数据库(也可能断电),从而数据库在启动的时候进行实例恢复,而由于内部的bug,导致实例恢复无法成功。通过我们处理后的,数据库完美启动,数据0丢失

数据库启动日志

Mon Aug 10 16:34:14 2015
alter database open
Mon Aug 10 16:34:14 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 16:34:14 2015
Started redo scan
Mon Aug 10 16:34:27 2015
Completed redo scan
 7889582 redo blocks read, 0 data blocks need recovery
Mon Aug 10 16:34:27 2015
Started redo application at
 Thread 1: logseq 664704, block 1286922
Mon Aug 10 16:34:27 2015
Recovery of Online Redo Log: Thread 1 Group 4 Seq 664704 Reading mem 0
  Mem# 0 errs 0: /dev/rredo04
Mon Aug 10 16:34:32 2015
Recovery of Online Redo Log: Thread 1 Group 5 Seq 664705 Reading mem 0
  Mem# 0 errs 0: /dev/rredo05
Mon Aug 10 16:34:38 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 664706 Reading mem 0
  Mem# 0 errs 0: /dev/rredo06
Mon Aug 10 16:34:40 2015
Completed redo application
Mon Aug 10 16:34:40 2015
Completed crash recovery at
 Thread 1: logseq 664706, block 1017805, scn 8554793334
 0 data blocks read, 0 data blocks written, 7889582 redo blocks read
Mon Aug 10 16:34:40 2015
Thread 1 advanced to log sequence 664707
Thread 1 opened at log sequence 664707
  Current log# 1 seq# 664707 mem# 0: /dev/rredo01
Successful open of redo thread 1
Mon Aug 10 16:34:40 2015
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 10 16:34:40 2015
SMON: enabling cache recovery
Mon Aug 10 16:34:40 2015
Successfully onlined Undo Tablespace 1.
Mon Aug 10 16:34:40 2015
SMON: enabling tx recovery
Mon Aug 10 16:34:41 2015
Database Characterset is ZHS32GB18030
replication_dependency_tracking turned off (no async multimaster replication found)
WARNING: AQ_TM_PROCESSES is set to 0. System operation might be adversely affected.
Mon Aug 10 16:34:41 2015
SMON: Parallel transaction recovery tried
Mon Aug 10 16:34:42 2015
db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Mon Aug 10 16:34:42 2015
Completed: alter database open
发表在 ORA-xxxxx, Oracle备份恢复 | 标签为 , , , | 2 条评论

ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O

接到网友技术支持请求,win 2003 ntfs格式文件系统,Oracle 8.1.7版本,主机重启后,数据库无法正常启动,offline datafile 15,数据库open成功,但是datafile 无法正常online,报错为:ORA-27069: skgfdisp: attempt to do I/O beyond the range of the file,请求协助处理

SQL> recover datafile 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA.DBF';
ORA-00283: 恢复会话因错误而取消
ORA-01115: 从文件 15 读取块时出现 IO 错误 (块 # 1030071)
ORA-01110: 数据文件 15: 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA.DBF'
ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O
OSD-04026: 无效的参数经过. (OS 1030071)

使用bbed,成功online datafile 15

Tue Oct 28 16:30:35 2014
ALTER DATABASE RECOVER  datafile 15  
Tue Oct 28 16:30:35 2014
Media Recovery Datafile: 15
Media Recovery Start
Media Recovery Log 
Recovery of Online Redo Log: Thread 1 Group 1 Seq 245110 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO03.LOG
Media Recovery failed with error 1115
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 15  ...
Tue Oct 28 16:32:53 2014
Shutting down instance (abort)
License high water mark = 6
Instance terminated by USER, pid = 1548
Starting up ORACLE RDBMS Version: 8.1.7.0.0.
System parameters with non-default values:
  processes                = 600
  shared_pool_size         = 52428800
  large_pool_size          = 20971520
  java_pool_size           = 20971520
  control_files            = D:\oracle\oradata\ORCL\control01.ctl, D:\oracle\oradata\ORCL\control02.ctl
  db_block_buffers         = 19200
  db_block_size            = 8192
  compatible               = 8.1.0
  log_buffer               = 32768
  log_checkpoint_interval  = 10000
  log_checkpoint_timeout   = 1800
  db_files                 = 1024
  db_file_multiblock_read_count= 8
  max_enabled_roles        = 30
  remote_login_passwordfile= EXCLUSIVE
  global_names             = TRUE
  distributed_transactions = 500
  instance_name            = ORCL
  service_names            = ORCL
  mts_dispatchers          = (PROTOCOL=TCP)(PRE=oracle.aurora.server.SGiopServer)
  open_links               = 4
  sort_area_size           = 65536
  sort_area_retained_size  = 65536
  db_name                  = ORCL
  open_cursors             = 500
  ifile                    = D:\oracle\admin\ORCL\pfile\init.ora
  os_authent_prefix        = 
  job_queue_processes      = 4
  job_queue_interval       = 10
  parallel_max_servers     = 5
  background_dump_dest     = D:\oracle\admin\ORCL\bdump
  user_dump_dest           = D:\oracle\admin\ORCL\udump
  max_dump_file_size       = 10240
  oracle_trace_collection_name= 
PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
RECO started with pid=7
SNP0 started with pid=8
SNP1 started with pid=9
SNP2 started with pid=10
SNP3 started with pid=11
Tue Oct 28 16:33:01 2014
starting up 1 shared server(s) ...
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Tue Oct 28 16:33:02 2014
ALTER DATABASE   MOUNT
Tue Oct 28 16:33:06 2014
Successful mount of redo thread 1, with mount id 1389958722.
Tue Oct 28 16:33:06 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Tue Oct 28 16:33:49 2014
ALTER DATABASE RECOVER  database until cancel  
Tue Oct 28 16:33:49 2014
Media Recovery Start
Media Recovery Log 
kcrrga: Warning.  Log sequence in archive filename wrapped
to fix length as indicated by %S in LOG_ARCHIVE_FORMAT.
Old log archive with same name might be overwritten.
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Tue Oct 28 16:34:03 2014
ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\ORADATA\ORCL\REDO02.LOG'  
Tue Oct 28 16:34:03 2014
Media Recovery Log D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Incomplete recovery applied all redo ever generated.
Recovery completed through change %s139866389
Media Recovery Complete
Completed: ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\ORADA
Tue Oct 28 16:34:29 2014
alter database datafile 15 online

Tue Oct 28 16:34:29 2014
Completed: alter database datafile 15 online
Tue Oct 28 16:34:36 2014
alter database open resetlogs

RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 139866389
Tue Oct 28 16:34:38 2014
Thread 1 opened at log sequence 1
  Current log# 2 seq# 1 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Successful open of redo thread 1.
Tue Oct 28 16:34:38 2014
SMON: enabling cache recovery
Tue Oct 28 16:34:38 2014
Dictionary check beginning
Dictionary check complete
Tue Oct 28 16:34:39 2014
SMON: enabling tx recovery
Tue Oct 28 16:34:44 2014
Completed: alter database open resetlogs

数据库datafile 15 online成功后,客户操作业务继续发生ORA-600[ktsxs_add2]错误

Tue Oct 28 17:07:42 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

Tue Oct 28 17:07:53 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

Tue Oct 28 17:08:03 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

Tue Oct 28 17:08:16 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

Tue Oct 28 17:08:23 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02308.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

Tue Oct 28 17:08:31 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

Tue Oct 28 17:08:38 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02308.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

通过分析相关日志发现是insert插入表报错,很好理解,该库的datafile 15已经超过了系统的限制,现在继续插入数据,因此报错,查询可能异常对象

SQL> col segment_name for a20
SQL> SELECT distinct OWNER, SEGMENT_NAME, SEGMENT_TYPE, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = 15
  4     AND 1030071 <= BLOCK_ID;

OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            DETAIL               TABLE


ZSF                            DETAIL1              INDEX


ZSF                            DETAIL2              INDEX



OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            DETAIL3              INDEX


ZSF                            DETAIL4              INDEX


ZSF                            FK_RECI_ORD          INDEX



OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            PREPAY1              INDEX


ZSF                            RECEDETAIL1          INDEX

创建新表空间

Create tablespace zsf_new datafile  'D:\ORACLE\ORADATA\ORCL\ZSF_DATA_new01.dbf' size 4096m;
alter tablespace zsf_new add datafile 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA_new02.dbf' 
size 128m autoextend on next 128M maxsize 4096m;

迁移异常对象到新表空间

alter table ZSF.DETAIL move tablespace ZSF_new;
alter index ZSF.DETAIL1 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL2 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL3 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL4 rebuild tablespace ZSF_new;
alter index ZSF.FK_RECI_ORD rebuild tablespace ZSF_new;
alter index ZSF.PREPAY1 rebuild tablespace ZSF_new;
alter index ZSF.RECEDETAIL1 rebuild tablespace ZSF_new;

然后对于datafile 15所在表空间增加新文件,因为已经迁移了异常对象,然后resize datafile 15小于8G,关闭自扩展,至此该数据库恢复完成

发表在 非常规恢复 | 标签为 , , , , , , | 评论关闭