9i库遇到ORA-01595/ORA-01594

在alert日志中发现ORA-01595/ORA-01594错误

Sat May 12 21:54:17 2012
Errors in file /oracle/app/admin/prmdb/bdump/prmdb2_smon_483522.trc:
ORA-01595: error freeing extent (2) of rollback segment (19))
ORA-01594: attempt to wrap into rollback segment (19) extent (2) which is being freed

分析trace文件

Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /oracle/app/product/9.2.0
System name:    AIX
Node name:      prmsvr2
Release:        3
Version:        5
Machine:        0008585FD600
Instance name: prmdb2
Redo thread mounted by this instance: 2
Oracle process number: 14
Unix process pid: 483522, image: oracle@prmsvr2 (SMON)

*** 2011-05-03 15:28:47.858
*** SESSION ID:(17.1) 2011-05-03 15:28:47.843
*** 2011-05-03 15:28:47.858
SMON: Parallel transaction recovery tried
*** 2011-07-11 17:13:52.028
SMON: Restarting fast_start parallel rollback
*** 2011-07-11 17:28:39.705
SMON: Parallel transaction recovery tried
*** 2012-05-12 21:54:17.246   --当前问题时间点
SMON: following errors trapped and ignored:
ORA-01595: error freeing extent (2) of rollback segment (19))
ORA-01594: attempt to wrap into rollback segment (19) extent (2) which is being freed

--通过trace文件,我们没有获得关于该错误的其他有用信息

查询MOS相关信息[280151.1]
出现该错误原因

This is a known problem and there is an Internal Bug:2181139 for this Issue.

The error is signaled because smon is shrinking a rollback segment and this fails 
because we need an extent to store some rollback information. This is a failure message 
for the shrinking. Subsequently smon would succeed in doing that.
--当smon在shrink rollback segment因为需要一个extent存放rollback

解决建议

Ignore these error messages.
Normally adding more undo space should solve the problem, 
but if space is not correcting the problem, please file an SR for assistance.

This error message logging is fixed in 10g.
--忽略该错误或者升级到10g
发表在 ORA-xxxxx | 标签为 | 2 条评论

关于High Versions Count总结

High Versions Count后果
sql查询–>hash对比确定是否存在shared pool中(不讨论不存在情况)–>选择合适的children
可能导致library cache latch contention

When you have unnecessary versions of a cursor, each time that cursor is executed, 
the parse engine has to search through the list of versions to see which is the cursor that you want. 
This wastes CPU cycles that you could be using on something else.

High version counts can easily cause high contention for library cache latches.
A process parsing a SQL statement with many versions (children cursors) will need to scan
through all these children while holding on to a library cache latch.
This means that other processes needing the same latch will have to wait and
can lead to significant database-wide performance degradation.

引起High Versions Count原因

•UNBOUND_CURSOR - The existing child cursor was not fully built (in other words, it was not optimized) 

•SQL_TYPE_MISMATCH - The SQL type does not match the existing child cursor 

•**OPTIMIZER_MISMATCH - The optimizer environment does not match the existing child cursor. 
For example:
select count(*) from emp; ->> 1 PARENT, 1 CHILD 
alter session set optimizer_mode=ALL_ROWS 
select count(*) from emp; ->> 1 PARENT, 2 CHILDREN 
    (The optimizer mode has changed and therefore 
       the existing child cannot be reused)
(The same applies with events - if I turned on tracing with 10046 than 
I would get the OPTIMIZER_MISMATCH again and a 3rd child) 

•OUTLINE_MISMATCH - The outlines do not match the existing child cursor 
If my user had created stored outlines previously for this command and they were 
stored in seperate categories (say "OUTLINES1" and "OUTLINES2") running:-

alter session set use_stored_outlines = OUTLINES1; 
select count(*) from emp; 
alter session set use_stored_oulines= OUTLINES2; 
select count(*) from emp;
--> Would create a 2nd child as the outline used is different than the first run. 

•STATS_ROW_MISMATCH - The existing statistics do not match the existing child cursor. 
Check that 10046/sql_trace is not set on all sessions as this can cause this. 

•LITERAL_MISMATCH - Non-data literal values do not match the existing child cursor 

•SEC_DEPTH_MISMATCH - Security level does not match the existing child cursor 

•EXPLAIN_PLAN_CURSOR - The child cursor is an explain plan cursor and should not be shared. 
Explain plan statements will generate a new child by default - the mismatch will be this. 

•BUFFERED_DML_MISMATCH - Buffered DML does not match the existing child cursor 

•PDML_ENV_MISMATCH - PDML environment does not match the existing child cursor 

•INST_DRTLD_MISMATCH - Insert direct load does not match the existing child cursor 

•SLAVE_QC_MISMATCH -The existing child cursor is a slave cursor and the new one was issued by the coordinator 
(or, the existing child cursor was issued by the coordinator and the new one is a slave cursor). 

•TYPECHECK_MISMATCH - The existing child cursor is not fully optimized 

•AUTH_CHECK_MISMATCH - Authorization/translation check failed for the existing child cursor 

The user does not have permission to access the object in any previous version of the cursor. 
A typical example would be where each user has it's own copy of a table 

•**BIND_MISMATCH - The bind metadata does not match the existing child cursor. For example:
variable a varchar2(100); 
select count(*) from emp where ename = :a ->> 1 PARENT, 1 CHILD 
variable a varchar2(400); 
select count(*) from emp where ename = :a ->> 1 PARENT, 2 CHILDREN 
   (The bind 'a' has now changed in definition)

•DESCRIBE_MISMATCH - The typecheck heap is not present during the describe for the child cursor 

•LANGUAGE_MISMATCH - The language handle does not match the existing child cursor 

•TRANSLATION_MISMATCH - The base objects of the existing child cursor do not match. 
The definition of the object does not match any current version. 
Usually this is indicative of the same issue as "AUTH_CHECK_MISMATCH" where the object is different.

•ROW_LEVEL_SEC_MISMATCH - The row level security policies do not match 

•INSUFF_PRIVS - Insufficient privileges on objects referenced by the existing child cursor 

•INSUFF_PRIVS_REM - Insufficient privileges on remote objects referenced by the existing child cursor 

•REMOTE_TRANS_MISMATCH - The remote base objects of the existing child cursor do not match 
USER1: select count(*) from table@remote_db 
USER2: select count(*) from table@remote_db 

  (Although the SQL is identical, the dblink pointed to 
   by remote_db may be a private dblink which resolves 
   to a different object altogether)

•LOGMINER_SESSION_MISMATCH 

•INCOMP_LTRL_MISMATCH 

•OVERLAP_TIME_MISMATCH - error_on_overlap_time mismatch 

•SQL_REDIRECT_MISMATCH - sql redirection mismatch 

•MV_QUERY_GEN_MISMATCH - materialized view query generation 

•USER_BIND_PEEK_MISMATCH - user bind peek mismatch 

•TYPCHK_DEP_MISMATCH - cursor has typecheck dependencies 

•NO_TRIGGER_MISMATCH - no trigger mismatch 

•FLASHBACK_CURSOR - No cursor sharing for flashback 

•ANYDATA_TRANSFORMATION - anydata transformation change 

•INCOMPLETE_CURSOR - incomplete cursor.
When bind length is upgradeable (i.e. we found a child cursor that matches everything
else except that the bind length is not long enough), we mark the old cursor is not usable
and build a new one.  This means the version can be ignored.

•TOP_LEVEL_RPI_CURSOR - top level/rpi cursor 
In a Parallel Query invocation this is expected behaviour (we purposely do not share)

•DIFFERENT_LONG_LENGTH - different long length 

•LOGICAL_STANDBY_APPLY - logical standby apply mismatch 

•DIFF_CALL_DURN - different call duration 

•BIND_UACS_DIFF - bind uacs mismatch 

•PLSQL_CMP_SWITCHS_DIFF - plsql compiler switches mismatch 

•CURSOR_PARTS_MISMATCH - cursor-parts executed mismatch 

•STB_OBJECT_MISMATCH - STB object different (now exists) 

•ROW_SHIP_MISMATCH - row shipping capability mismatch 

•PQ_SLAVE_MISMATCH - PQ slave mismatch 
Check you want to be using PX with this reason code, as the problem could be caused by running 
lots of small SQL statements which do not really need PX. If you are on < 11i you may be hitting Bug:4367986

•TOP_LEVEL_DDL_MISMATCH - top-level DDL cursor 

•MULTI_PX_MISMATCH - multi-px and slave-compiled cursor 

•BIND_PEEKED_PQ_MISMATCH - bind-peeked PQ cursor 

•MV_REWRITE_MISMATCH - MV rewrite cursor 

•ROLL_INVALID_MISMATCH - rolling invalidation window exceeded 
This is caused by the rolling invalidation capability in DBMS_STATS. 
The child cannot be shared as it's invalidation window is exceeded. See:
Note:557661.1  Rolling Cursor Invalidations with DBMS_STATS in Oracle10g (Doc ID 557661.1)

•OPTIMIZER_MODE_MISMATCH - optimizer mode mismatch 

•PX_MISMATCH - parallel query mismatch 
If running 11.1.0.6 and RAC see Bug:7352775. 
Check that if (on each instance) parallel_instance_groups is set then instance_groups is set to the same.

•MV_STALEOBJ_MISMATCH - mv stale object mismatch 

•FLASHBACK_TABLE_MISMATCH - flashback table mismatch 

•LITREP_COMP_MISMATCH - literal replacement compilation mismatch 

New in 11g :
•PLSQL_DEBUG - debug mismatch Session has debugging parameter plsql_debug set to true

•LOAD_OPTIMIZER_STATS  - Load optimizer stats for cursor sharing

•ACL_MISMATCH   -  Check ACL mismatch

•FLASHBACK_ARCHIVE_MISMATCH  - Flashback archive mismatch

•LOCK_USER_SCHEMA_FAILED  - Failed to lock user and schema

•REMOTE_MAPPING_MISMATCH  - Remote mapping mismatch

•LOAD_RUNTIME_HEAP_FAILED  - Runtime heap mismatch

•HASH_MATCH_FAILED  - Hash mismatch
Set to "Y" if sharing fails due to a hash mismatch, such as the case with mismatched histogram data 
or a range predicate marked as unsafe by literal replacement (See Bug:3461251)

New in 11.2  :
•PURGED_CURSOR - cursor marked for purging
The cursor has been marked for purging with dbms_shared_pool.purge

•BIND_LENGTH_UPGRADEABLE - bind length upgradeable
 Could not be shared because a bind variable size was smaller than the new value beiing inserted 
(marked as BIND_MISMATCH in earlier versions). 

•USE_FEEDBACK_STATS - cardinality feedback
Cardinality feedback is being used and therefore a new plan could be formed for the current execution.


•BIND_EQUIV_FAILURE - The bind value's selectivity does not match that used to optimize the existing child cursor

There is no longer  ROW_LEVEL_SEC_MISMATCH in 11.2.

High Versions Count查询
1.使用version_rpt function查询

--install version_rpt3_21
connect / as sysdba
start version_rpt3_21.sql

-- Generate reports for all cursors with more than 100 versions using SQL_ID (10g and up)
select b.* from v$sqlarea a ,table(version_rpt(a.sql_id)) b where loaded_versions >=100;

-- Generate reports for all cursors with more than 100 versions using HASH_VALUE
select b.* from v$sqlarea a ,table(version_rpt(null,a.hash_value)) b where loaded_versions>=100;

-- Generate the report for cursor with sql_id cyzznbykb509s
select * from table(version_rpt('cyzznbykb509s'));

2.直接查询

select sql_id,version_count, ADDRESS,sql_text
from   v$sqlarea 
where version_count > 10 
order by version_count, hash_value;

•Version 9.2.X.X and below : 
select * from v$sql_shared_cursor where kglhdpar = '0000000386BC2E58'
•Version 10.0.X.X and above:
select * from v$sql_shared_cursor where address = '0000000386BC2E58'

NOTE:The 'Y's denote a mismatch

High Versions Count跟踪

CURSORTRACE(10G及其以后版本)

TO trace on using:-  
alter system set events 
'immediate trace name cursortrace level 577, address hash_value';
(level 578/580 can be used for high level tracing (577=level 1, 578=level 2, 580=level 3) 

To turn off tracing use:- 
alter system set events 
'immediate trace name cursortrace level 2147483648, address 1';

Please note: BUG:5555371 exists in 10.2 (fixed in 10.2.0.4) where cursor trace cannot fully be turned off 
and single line entries will still be made to the trace file as a result. 
The w/a is to restart the instance. How invasive this BUG is depends on the executions of the cursor 
(and the size of the resultant trace file additions) 

cursordump(11GR2)
alter system set events 'immediate trace name cursordump level 16'

version_rpt3_21脚本:下载
参考:Troubleshooting: High Version Count Issues [ID 296377.1]

发表在 Oracle性能优化 | 标签为 | 4 条评论

ORACLE在AIX中产生SOFTWARE PROGRAM ABNORMALLY TERMINATED警告原因

数据库中发现如下错误
该错误的解决方案:ORA-07445[dbgrmqmqpk_query_pick_key()+0f88]

Dump file /oracle/diag/rdbms/sgerp5/sgerp5/incident/incdir_579300/sgerp5_m000_7602504_i579300.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/product/11.1.0/db_1
System name:    AIX
Node name:  sgerp5
Release:    1
Version:    6
Machine:    00C8F0564C00
Instance name: sgerp5
Redo thread mounted by this instance: 1
Oracle process number: 138
Unix process pid: 7602504, image: oracle@sgerp5 (m000)
 
*** 2012-05-11 03:52:35.200
*** SESSION ID:(752.5029) 2012-05-11 03:52:35.200
*** CLIENT ID:() 2012-05-11 03:52:35.200
*** SERVICE NAME:(SYS$BACKGROUND) 2012-05-11 03:52:35.200
*** MODULE NAME:(MMON_SLAVE) 2012-05-11 03:52:35.200
*** ACTION NAME:(Auto-Purge Slave Action) 2012-05-11 03:52:35.200
 
Dump continued from file: /oracle/diag/rdbms/sgerp5/sgerp5/trace/sgerp5_m000_7602504.trc
ORA-07445: exception encountered: core dump [dbgrmqmqpk_query_pick_key()+0f88] 
[SIGSEGV] [ADDR:0xB38F0000000049][PC:0x100213C08] [Address not mapped to object] []

errpt错误说明
在产生7445错误的同时观察aix系统错误日志发现SOFTWARE PROGRAM ABNORMALLY TERMINATED错误

sgerp5_[oracle]-->errpt -aj A924A5FC
---------------------------------------------------------------------------
LABEL:          CORE_DUMP
IDENTIFIER:     A924A5FC

Date/Time:       Fri May 11 03:52:55 BEIST 2012
Sequence Number: 471
Machine Id:      00C8F0564C00
Node Id:         sgerp5
Class:           S
Type:            PERM
WPAR:            Global
Resource Name:   SYSPROC         

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

        Recommended Actions
        CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

        Recommended Actions
        RERUN THE APPLICATION PROGRAM
        IF PROBLEM PERSISTS THEN DO THE FOLLOWING
        CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
           6
USER'S PROCESS ID:
               7602504
FILE SYSTEM SERIAL NUMBER
          14
INODE NUMBER
           0      367648
CORE FILE NAME
/oracle/diag/rdbms/sgerp5/sgerp5/cdump/core_7602504/core
PROGRAM NAME
oracle
STACK EXECUTION DISABLED
           0
COME FROM ADDRESS REGISTER
sskgmcrea 0

PROCESSOR ID
  hw_fru_id: 1
  hw_cpu_id: 2

ADDITIONAL INFORMATION
skgdbgcra 224
??
ksdbgcra 3D0
ssexhd 978
??

Symptom Data
REPORTABLE
1
INTERNAL ERROR
0
SYMPTOM CODE
PCSS/SPI2 FLDS/oracle SIG/6 FLDS/skgdbgcra VALU/224

错误原因

This error is logged when a software program abnormally ends and causes a core dump. Users might
not be exiting applications correctly, the system might have been shut down while users were
working in application, or the user's terminal might have locked up and the application stopped
1)这里也就是说如果oracle进程在aix机器上异常终止,并且产生了一个core dump文件,
  就会出现SOFTWARE PROGRAM ABNORMALLY TERMINATED警告信息
2)用户登录系统没有正常退出,而系统被关闭
3)用户强制终止一个一个lock,而导致进程停止

本次AIX日志警告原因:由于进程7602504异常终止(ORA-07445错误)并且产生了 /oracle/diag/rdbms/sgerp5/sgerp5/cdump/core_7602504/core dump 文件,从而有了AIX中的SOFTWARE PROGRAM ABNORMALLY TERMINATED警告信息

发表在 AIX, Oracle | 2 条评论