Working around errors such as ORA-01248: file N was created in the future of incomplete recovery by running resetlogs more than once

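Current state of the database
Control file
[image: recover_xifenfei0]
Datafile information in the control file
[image: recover_xifenfei1]
Datafile header information
[image: recover_xifenfei2]
Redo information
[image: recover_xifenfei3]
According to the information collected by the database recovery check script (Oracle Database Recovery Check), the database is in NOARCHIVELOG mode and the redo has already been overwritten, so datafile 5 cannot be brought online directly. In this situation you can use bbed to modify the SCN in the file header to bring the file online (see "Using bbed to online the sysaux datafile in RAC"), or you can use hidden parameters such as _allow_resetlogs_corruption. In this recovery case there were 180 datafiles, of which 160 had been taken offline before the database was opened, so a large number of datafiles could not be brought online normally and the bbed workload would have been too large. Unfortunately, ORA-01248 was hit during the recovery.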

ORA-01248 raised when opening the database with resetlogs

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01248: file 5 was created in the future of incomplete recovery
ORA-01110: data file 5: 'F:\TTDATA\PUBRTS.DAT'

Alert log entries

Fri Oct 10 15:09:26 2014
alter database open resetlogs
Fri Oct 10 15:09:26 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1248 signalled during: alter database open resetlogs...
Fri Oct 10 15:15:22 2014
alter database open
Fri Oct 10 15:15:22 2014
ORA-1589 signalled during: alter database open...
Fri Oct 10 15:15:30 2014
alter database  open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1248 signalled during: alter database  open resetlogs...
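
When an alert log is long, the failed statements can be pulled out mechanically. The following is a minimal sketch in Python, not part of the original recovery: it assumes the "ORA-nnnn signalled during: ..." line format shown in the excerpt above, and the function name failed_statements is made up for illustration.

```python
import re

# Matches alert-log lines such as:
#   ORA-1248 signalled during: alter database open resetlogs...
SIGNALLED = re.compile(r"^ORA-(\d+) signalled during: (.+?)\.\.\.\s*$")


def failed_statements(alert_text):
    """Return (ora_code, statement) pairs for each statement that failed."""
    results = []
    for line in alert_text.splitlines():
        match = SIGNALLED.match(line.strip())
        if match:
            # The alert log drops leading zeros; pad back to the ORA-nnnnn form.
            code = "ORA-%05d" % int(match.group(1))
            # Collapse repeated spaces so identical statements compare equal.
            statement = " ".join(match.group(2).split())
            results.append((code, statement))
    return results


sample = """\
Fri Oct 10 15:09:26 2014
alter database open resetlogs
ORA-1248 signalled during: alter database open resetlogs...
Fri Oct 10 15:15:22 2014
ORA-1589 signalled during: alter database open...
Fri Oct 10 15:15:30 2014
ORA-1248 signalled during: alter database  open resetlogs...
"""

for code, statement in failed_statements(sample):
    print(code, "<-", statement)
```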

Trying to take the file offline and then resetlogs

SQL> ALTER DATABASE DATAFILE 5 OFFLINE;

Database altered.

SQL> ALTER DATABASE OPEN RESETLOGS;

ERROR at line 1:
ORA-01245: offline file 5 will be lost if resetlogs is done
ORA-01110: data file 5: 'F:\TTDATA\PUBRTS.DAT'

Alert log

Fri Oct 10 15:19:37 2014
ALTER DATABASE DATAFILE 5 offline
Fri Oct 10 15:19:37 2014
Completed: ALTER DATABASE DATAFILE 5 offline
Fri Oct 10 15:19:40 2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...

This error occurs because the database is in NOARCHIVELOG mode, where a datafile has to be taken offline with offline drop

Fri Oct 10 15:22:16 2014
alter database datafile 5 offline drop
Fri Oct 10 15:22:17 2014
Completed: alter database datafile 5 offline drop
Fri Oct 10 15:23:13 2014
alter database open resetlogs
Fri Oct 10 15:23:14 2014
Fri Oct 10 15:23:49 2014
RESETLOGS after complete recovery through change 1422423346
Resetting resetlogs activation ID 3503292347 (0xd0cfffbb)
Fri Oct 10 15:24:01 2014
Setting recovery target incarnation to 3
Fri Oct 10 15:24:04 2014
Assigning activation ID 3649065262 (0xd980512e)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=23, OS id=3772
Fri Oct 10 15:24:04 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=24, OS id=3668
Fri Oct 10 15:24:05 2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLTTDB\REDO01.LOG
Successful open of redo thread 1
Fri Oct 10 15:24:05 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Oct 10 15:24:05 2014
ARC0: STARTING ARCH PROCESSES
ARC2: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the 'no FAL' ARCH
ARC2 started with pid=25, OS id=636
Fri Oct 10 15:24:06 2014
ARC0: Becoming the 'no SRL' ARCH
Fri Oct 10 15:24:06 2014
ARC1: Becoming the heartbeat ARCH
Fri Oct 10 15:24:06 2014
SMON: enabling cache recovery
Fri Oct 10 15:24:07 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #5 is offline, but is part of an online tablespace.
data file 5: 'F:\TTDATA\PUBRTS.DAT'
Dictionary check complete
Fri Oct 10 15:24:19 2014
SMON: enabling tx recovery
Fri Oct 10 15:24:19 2014
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=26, OS id=868
Fri Oct 10 15:24:21 2014
LOGSTDBY: Validating controlfile with logical metadata
Fri Oct 10 15:24:21 2014
LOGSTDBY: Validation complete
Completed: alter database open resetlogs

After the open succeeds, resetlogs the database once more so that the datafile can be brought online

Fri Oct 10 15:28:44 2014
ALTER DATABASE DATAFILE 5 online
Fri Oct 10 15:28:44 2014
Completed: ALTER DATABASE DATAFILE 5 online
Fri Oct 10 15:31:46 2014
alter database open resetlogs
Fri Oct 10 15:31:46 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Setting recovery target incarnation to 4
Fri Oct 10 15:32:00 2014
Assigning activation ID 3649091231 (0xd980b69f)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=23, OS id=700
Fri Oct 10 15:32:00 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=24, OS id=3360
Fri Oct 10 15:32:01 2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLTTDB\REDO01.LOG
Successful open of redo thread 1
Fri Oct 10 15:32:01 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Oct 10 15:32:01 2014
ARC0: STARTING ARCH PROCESSES
ARC2: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the 'no FAL' ARCH
ARC2 started with pid=25, OS id=2016
Fri Oct 10 15:32:02 2014
ARC0: Becoming the 'no SRL' ARCH
Fri Oct 10 15:32:02 2014
ARC1: Becoming the heartbeat ARCH
Fri Oct 10 15:32:02 2014
SMON: enabling cache recovery
Fri Oct 10 15:32:03 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Fri Oct 10 15:32:15 2014
Dictionary check complete
Fri Oct 10 15:32:15 2014
SMON: enabling tx recovery
Fri Oct 10 15:32:15 2014
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=26, OS id=256
Fri Oct 10 15:32:17 2014
LOGSTDBY: Validating controlfile with logical metadata
Fri Oct 10 15:32:17 2014
LOGSTDBY: Validation complete
Completed: alter database open resetlogs
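
Because this case involved a large number of offline files, typing each statement by hand is tedious. The sketch below, in Python, only generates the two kinds of datafile statements that appear in the alert log above. It is an illustration under assumptions rather than the procedure actually scripted in this recovery: the function name and the file numbers are hypothetical, and whether a statement is appropriate still has to be judged for each database.

```python
def datafile_statements(file_numbers, action):
    """Build 'alter database datafile N <action>;' statements.

    action must be one of the two forms seen in the alert log above.
    """
    allowed = ("offline drop", "online")
    if action not in allowed:
        raise ValueError("action must be one of %r" % (allowed,))
    # De-duplicate and sort so the generated script is stable and reviewable.
    return [
        "alter database datafile %d %s;" % (number, action)
        for number in sorted(set(file_numbers))
    ]


# Hypothetical file numbers, for illustration only.
problem_files = [5, 7, 5, 12]

for statement in datafile_statements(problem_files, "offline drop"):
    print(statement)
```

The output is plain text, so it can be reviewed before anything is run in sqlplus.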

MOS proves unreliable once again: ORA-27163: out of memory

Database version

oracle -> 11g @xifenfei:/home/oracle$sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Sep 30 10:28:30 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE    11.2.0.4.0      Production
TNS for HPUX: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production

Database patch information

oracle -> 11g @xifenfei:/home/oracle$opatch lspatches
18973907;
18795105;
18701707;
18284357;
18020394;
17564992;
17308789;
17306264;
17259786;
17079301;
16477664;
16188701;
14368995;
14245531;
11744544;
18522515;OCW Patch Set Update : 11.2.0.4.3 (18522515)
18522509;Database Patch Set Update : 11.2.0.4.3 (18522509)

Generating the SPA report fails with ORA-27163: out of memory

SQL> SELECT dbms_sqlpa.report_analysis_task('SPA_TEST', 'HTML', 'errors','ALL') FROM dual;

ERROR:
ORA-27163: out of memory
ORA-06512: at "SYS.DBMS_SQLTUNE_INTERNAL", line 8211
ORA-06512: at "SYS.DBMS_SQLPA", line 515
ORA-06512: at line 1


no rows selected

Generating the report after setting the event

Connected.
SQL> alter session set events '31156 trace name context forever, level 0x400';

Session altered.

SQL> SELECT dbms_sqlpa.report_analysis_task('SPA_TEST', 'HTML', 'errors','ALL') FROM dual;

DBMS_SQLPA.REPORT_ANALYSIS_TASK('SPA_TEST','HTML','ERRORS','ALL')
--------------------------------------------------------------------------------
<html>
    <head>
        <title>
   SQL Performance Impact Analyzer Report

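A search on MOS turned up XML Parser Fails With ORA-27163 (Out Of Memory) (Doc ID 1599434.1), with the following description:
[image: bug-xml]
According to the note, this issue is fixed in 11.2.0.4, yet it still occurs on HP-UX. Experience tells us that MOS can no longer be trusted completely.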


Alert Log Errors: 12170 TNS-12535/TNS-00505: Operation Timed Out

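The customer reported that the system frequently raised session timeouts, which kept application testing from proceeding normally. A check of the alert log showed: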

Fatal NI connect error 12170.

  VERSION INFORMATION:
        TNS for HPUX: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for HPUX: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for HPUX: Version 11.2.0.4.0 - Production
  Time: 29-SEP-2014 20:42:56
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535

TNS-12535: TNS:operation timed out
    ns secondary err code: 12560
    nt main err code: 505

TNS-00505: Operation timed out
    nt secondary err code: 238
    nt OS err code: 0
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=10.78.199.47)(PORT=55447))
Mon Sep 29 20:42:56 2014

Most websites and MOS notes suggest that some cases of Fatal NI connect error 12170 can be addressed with the following configuration:

## Adjust listener.ora
vi $ORACLE_HOME/network/admin/listener.ora
Add:
DIAG_ADR_ENABLED_LISTENER=OFF
INBOUND_CONNECT_TIMEOUT_LISTENER=180

## Adjust sqlnet.ora
vi $ORACLE_HOME/network/admin/sqlnet.ora
Add:
DIAG_ADR_ENABLED=OFF
SQLNET.INBOUND_CONNECT_TIMEOUT=180

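These settings were already in place, yet the 12170 / TNS-12535 / TNS-00505 errors were still being reported. Going by MOS, the likely cause is that the firewall policy between the application server and the database server does not suit the workload's queries, so the firewall times the connection out (for example, the application server issues a large query that is still running on the database server and has not returned a result, but the network has already timed out and the session is terminated).
Background notes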

The 'nt secondary err code' identifies the underlying network transport, such as (TCP/IP) timeout limit. 
In the current case 60 identifies Windows underlying transport layer.

The "nt secondary err code" will be different based on the operating system:

Linux x86 or Linux x86-64: "nt secondary err code: 110"
HP-UX : "nt secondary err code: 238"
AIX: "nt secondary err code: 78"
Solaris: "nt secondary err code: 145"


The alert.log message indicates that a connection was terminated AFTER it was established to the instance.  
In this case, it was terminated 2 hours and 3 minutes after the listener handed the connection to the database. 

 This would indicate an issue with a firewall where a maximum idle time setting is in place. 

The connection would not necessarily be "idle".  This issue can arise during a long running query
or when using JDBC Thin connection pooling. If there is no data 'on the wire' for lengthy
periods of time for any reason, the firewall might terminate the connection.
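
To make the code list above easier to apply, here is a minimal sketch in Python that reads a "Tns error struct" stanza and reports which platform its "nt secondary err code" corresponds to. The mapping is taken only from the values quoted in the note above (including 60 for Windows); the function and variable names are made up for illustration.

```python
import re

# Values exactly as listed in the quoted note; nothing else is assumed.
NT_SECONDARY_PLATFORM = {
    110: "Linux x86 or Linux x86-64",
    238: "HP-UX",
    78: "AIX",
    145: "Solaris",
    60: "Windows",
}

# Captures e.g. "nt secondary err code: 238" -> ("nt secondary", "238")
FIELD = re.compile(
    r"(ns main|ns secondary|nt main|nt secondary|nt OS) err code: (\d+)"
)


def parse_tns_error_struct(stanza):
    """Return a dict such as {'ns main': 12535, 'nt secondary': 238, ...}."""
    return {name: int(value) for name, value in FIELD.findall(stanza)}


def timeout_platform(stanza):
    """Return the platform implied by the nt secondary code, or None."""
    codes = parse_tns_error_struct(stanza)
    return NT_SECONDARY_PLATFORM.get(codes.get("nt secondary"))


sample = """\
  Tns error struct:
    ns main err code: 12535

TNS-12535: TNS:operation timed out
    ns secondary err code: 12560
    nt main err code: 505

TNS-00505: Operation timed out
    nt secondary err code: 238
    nt OS err code: 0
"""

print(parse_tns_error_struct(sample))
print(timeout_platform(sample))
```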

Solution

The non-Oracle solution would be to remove or increase the firewall setting for maximum idle time.  
In cases where this is not feasible, Oracle offers the following suggestion:

The following parameter, set at the **RDBMS_HOME/network/admin/sqlnet.ora, can resolve this kind of problem.  
DCD or SQLNET.EXPIRE_TIME can mimic data transmission between the server and the client during long periods of idle time.

SQLNET.EXPIRE_TIME=n  Where <n> is a non-zero value set in minutes.  

See the following : Note 257650.1 Resolving Problems with Connection Idle Timeout With Firewall
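
Choosing <n> is left open by the note. One way to reason about it, sketched in Python below, is to keep the probe interval shorter than the firewall's idle limit so that a probe always goes out before the firewall would drop the connection. That rule of thumb, the 0.5 safety factor, and the function name are assumptions made for illustration and are not taken from the note.

```python
def expire_time_line(firewall_idle_minutes, safety_factor=0.5):
    """Build a sqlnet.ora line whose interval is below the firewall idle limit.

    firewall_idle_minutes: idle timeout configured on the firewall, in minutes.
    safety_factor: fraction of that timeout to use (assumed default: 0.5).
    """
    if not 0 < safety_factor < 1:
        raise ValueError("safety_factor must be between 0 and 1")
    if firewall_idle_minutes < 2:
        raise ValueError("idle timeout too short for a whole-minute interval")
    # SQLNET.EXPIRE_TIME takes whole minutes and must be non-zero.
    minutes = max(1, int(firewall_idle_minutes * safety_factor))
    return "SQLNET.EXPIRE_TIME=%d" % minutes


# Example: a firewall that drops connections after 60 idle minutes.
print(expire_time_line(60))
```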

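Besides the database-side fix, the problem can of course also be handled at the network firewall level, for example by increasing the idle-connection timeout.

For details, see: Alert Log Errors: 12170 TNS-12535/TNS-00505: Operation Timed Out (Doc ID 1628949.1)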
Fatal NI Connect Error 12170, ‘TNS-12535: TNS:operation timed out’ Reported in 11g Alert Log (Doc ID 1286376.1)
