分类目录归档:ORA-xxxxx

记录一次ORA-600[13013]处理过程

在一次数据库的异常处理完成后,发现alert日志中出现ORA-600[13013]错误

Thu Mar 08 23:29:37 2012
Errors in file /opt/oracle/diag/rdbms/chf/chf/trace/chf_smon_24137.trc  (incident=38681):
ORA-00600: internal error code, arguments: [13013], [5001], [518], [4198427], [170], [4198427], [17], [], [], [], [], []
Incident details in: /opt/oracle/diag/rdbms/chf/chf/incident/incdir_38681/chf_smon_24137_i38681.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Non-fatal internal error happenned while SMON was doing flushing of monitored table stats.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

trace文件中信息
从这里可以看出是对sys.col_usage$表进行update操作导致该错误发生

Dump continued from file: /opt/oracle/diag/rdbms/chf/chf/trace/chf_smon_24137.trc
ORA-00600: internal error code, arguments: [13013], [5001], [518], [4198427], [170], [4198427], [17], [], [], [], [], []

========= Dump for incident 38681 (ORA 600 [13013]) ========

*** 2012-03-08 23:29:37.400
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=3c1kubcdjnppq) -----
update sys.col_usage$ set   equality_preds    = equality_preds    + decode(bitand(:flag,1),0,0,1),   equijoin_preds    = equijoin_preds    + decode(bitand(:flag,2),0,0
,1),   nonequijoin_preds = nonequijoin_preds + decode(bitand(:flag,4),0,0,1),   range_preds       = range_preds       + decode(bitand(:flag,8),0,0,1),   like_preds    
    = like_preds        + decode(bitand(:flag,16),0,0,1),   null_preds        = null_preds        + decode(bitand(:flag,32),0,0,1),  timestamp = :time where obj# = :ob
jn and intcol# = :coln

MOS中关于ORA-600 [13013]描述

Format: ORA-600 [13013] [a] [b] {c} [d] [e] [f]
Arg [a] Passcount
Arg [b] Data Object number
Arg {c} Tablespace Decimal Relative DBA (RDBA) of block containing the row to be updated
Arg [d] Row Slot number
Arg [e] Decimal RDBA of block being updated (Typically same as {c})
Arg [f] Code

验证MOS中描述

SQL> select dbms_utility.data_block_address_file(4198427) rfile,
  2  dbms_utility.data_block_address_block(4198427) blocks
  3  from dual;

     RFILE     BLOCKS
---------- ----------
         1       4123

SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = &FILE_ID
  4     AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
Enter value for file_id: 1
old   3:  WHERE FILE_ID = &FILE_ID
new   3:  WHERE FILE_ID = 1
Enter value for block_id: 4123
old   4:    AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
new   4:    AND 4123 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1

OWNER SEGMENT_NAME SEGMENT_TY TABLESPACE PARTI
----- ------------ ---------- ---------- -----
SYS   COL_USAGE$   TABLE      SYSTEM
--和trace文件中异常表一致

SQL>  select object_type,object_name from dba_objects where object_id=518;

OBJECT_TYPE         OBJECT_NAME
------------------- ------------------------------
TABLE               COL_USAGE$
--也和trace文件中异常表一致

分析异常表

SQL> ANALYZE TABLE sys.COL_USAGE$ VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE sys.COL_USAGE$ VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

SQL> select index_name,COLUMN_NAME,COLUMN_POSITION FROM DBA_IND_COLUMNS 
2    WHERE TABLE_NAME='COL_USAGE$';

INDEX_NAME      COLUMN_NAM COLUMN_POSITION
--------------- ---------- ---------------
I_COL_USAGE$    OBJ#                     1
I_COL_USAGE$    INTCOL#                  2

SQL> set autot trace exp
SQL>  SELECT /*+ FULL(t1) */ OBJ#,INTCOL#
  2   FROM sys.COL_USAGE$ t1
  3   MINUS
  4   SELECT /*+ index(t I_COL_USAGE$) */ OBJ#,INTCOL#
  5   FROM sys.COL_USAGE$ t where OBJ# is not null or INTCOL# is not null;

no rows selected
--无记录返回

Execution Plan
----------------------------------------------------------
Plan hash value: 399371572

------------------------------------------------------------------------------------

| Id  | Operation           | Name         | Rows  | Bytes | Cost (%CPU)| Time
   |

------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |              |  4262 | 76716 |    27  (71)| 00:00:01 |
|   1 |  MINUS              |              |       |       |            |          |
|   2 |   SORT UNIQUE       |              |  4262 | 38358 |     9  (12)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| COL_USAGE$   |  4262 | 38358 |     8   (0)| 00:00:01 |
|   4 |   SORT UNIQUE NOSORT|              |  4262 | 38358 |    18   (6)| 00:00:01 |
|*  5 |    INDEX FULL SCAN  | I_COL_USAGE$ |  4262 | 38358 |    17   (0)| 00:00:01 |

------------------------------------------------------------------------------------
--验证表两个sql是否正确(一个全表扫描,另个index 快速扫描)


SQL>  SELECT /*+ index(t I_COL_USAGE$) */ OBJ#,INTCOL#
  2   FROM sys.COL_USAGE$ t where OBJ# is not null or INTCOL# is not null
  3   MINUS
  4  SELECT /*+ FULL(t1) */ OBJ#,INTCOL#
  5   FROM sys.COL_USAGE$ t1;

      OBJ#    INTCOL#
---------- ----------
4294951004          2
4294951004          3
4294951004          4
4294951004         26
4294951004         27
4294951037          4
4294951037          5
4294951037          6
4294951037          9
4294951037         10
4294951840         11

      OBJ#    INTCOL#
---------- ----------
4294951840         12
4294951906          4
4294952709          3
4294952867          4
4294952867          9

16 rows selected.
--证明index中的记录比表中多了16条

解决问题并验证

SQL> alter index sys.I_COL_USAGE$ rebuild online;

Index altered.

SQL>  SELECT /*+ FULL(t1) */ OBJ#,INTCOL#
 FROM sys.COL_USAGE$ t1
  2    3   MINUS
  4   SELECT /*+ index(t I_COL_USAGE$) */ OBJ#,INTCOL#
  5   FROM sys.COL_USAGE$ t where OBJ# is not null or INTCOL# is not null
  6  ;

no rows selected

SQL>  SELECT /*+ index(t I_COL_USAGE$) */ OBJ#,INTCOL#
  2   FROM sys.COL_USAGE$ t where OBJ# is not null or INTCOL# is not null
  3   MINUS
  4  SELECT /*+ FULL(t1) */ OBJ#,INTCOL#
  5   FROM sys.COL_USAGE$ t1;

no rows selected

这次出现此问题的原因是因为在更新语句中使用索引找到一条记录,然后到表中去查询时该记录不存在,出现此错误,一般解决方法是重建索引
官方关于ORA-600[13013]说明

发表在 ORA-xxxxx | 标签为 , | 3 条评论

Getting ORA-01476 during execution of DBMS_STATS.GATHER_SCHEMA_STATS

alert日志

Fri Jan 27 22:00:09 2012
GATHER_STATS_JOB encountered errors.  Check the trace file.
Fri Jan 27 22:00:09 2012
Errors in file /oracle10/admin/ocs/bdump/ocs1_j001_29138.trc:
ORA-01476: divisor is equal to zero

trace内容

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /oracle10/app/product/db/10.2.0
System name:    HP-UX
Node name:      ocsdb1
Release:        B.11.23
Version:        U
Machine:        ia64
Instance name: ocs1
Redo thread mounted by this instance: 1
Oracle process number: 60
Unix process pid: 29138, image: oracle@ocsdb1 (J001)

*** ACTION NAME:(GATHER_STATS_JOB) 2012-01-27 22:00:09.308
*** MODULE NAME:(DBMS_SCHEDULER) 2012-01-27 22:00:09.308
*** SERVICE NAME:(SYS$USERS) 2012-01-27 22:00:09.308
*** SESSION ID:(988.31342) 2012-01-27 22:00:09.307
ORA-01476: divisor is equal to zero
*** 2012-01-27 22:00:09.417
GATHER_STATS_JOB: GATHER_TABLE_STATS('"OCS_SM"','"HLP_SMS_SEND"','""', ...)
ORA-01476: divisor is equal to zero

错误原因
oracle unpublished Bug 5645718

解决方法
1.Setting event 38041 at level 16

sql> connect / as sysdba
sql> alter system set events '38041 trace name context forever, level 16';

2.Patch 6319761

发表在 ORA-xxxxx | 标签为 | 2 条评论

手动提交分布式事务一例

一.alert文件中出现了很多类此记录

Fri Feb 10 05:25:01 2012
Errors in file /tmp/recover/ahcx216_reco_7956.trc:
ORA-12154: TNS:could not resolve service name
Fri Feb 10 05:25:01 2012
Errors in file /tmp/recover/ahcx216_reco_7956.trc:
ORA-12154: TNS:could not resolve service name

这里可以看出来两个信息:
1.出错的进程是rec0进程,而该进程的作用是解决分布式事务失败后遗留问题(事务提交或者回滚)
2.错误信息是tns不能被解析

二.查看trace文件

/tmp/recover/ahcx216_reco_7956.trc
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0/db_1
System name:    Linux
Node name:      localhost.localdomain
Release:        2.6.9-89.0.0.0.1.ELhugemem
Version:        #1 SMP Tue May 19 04:38:38 EDT 2009
Machine:        i686
Instance name: ahcx216
Redo thread mounted by this instance: 1
Oracle process number: 7
Unix process pid: 7956, image: oracle@localhost.localdomain (RECO)


*** SESSION ID:(6.1) 2012-02-10 04:58:24.886
*** 2012-02-10 04:58:24.886
ERROR, tran=6.1.712757, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name
………………
*** 2012-02-10 05:25:01.580
ERROR, tran=6.1.712757, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name
*** 2012-02-10 05:25:01.627
ERROR, tran=12.19.99059, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name

通过这里我们可以看出事务id分别为12.19.99059和6.1.712757有问题

三.查看dba_2pc_pending视图

SQL> select local_tran_id,state,fail_time,retry_time from dba_2pc_pending;

LOCAL_TRAN_ID     STATE            FAIL_TIME           RETRY_TIME
-------------- ---------------- --------------       --------------
6.1.712757	collecting	2010/12/1 13:39:03   2012/2/10 5:38:52
12.19.99059	collecting	2010/12/1 15:56:26   2012/2/10 5:38:52

可以看出,果然有两个分布式事务在2010年12月1日出了问题(本库是一个问题库,在把库拉起来后发现该问题)
因为是异机恢复,而且间隔时间较长,很多tns的信息都已经不存在,所以需要手工提交分布式事务

四.手动提交事务

SQL> BEGIN
  2  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');
  3  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
  4  END;
  5  /
BEGIN
*
第 1 行出现错误:
ORA-30019: Illegal rollback Segment operation in Automatic Undo mode
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 2

SQL> alter session set "_smu_debug_mode"=4;

Session altered.

-------------------------------------------------------------
--设置UNDO_SUPPRESS_ERRORS=true也可以解决此问题
alter system set UNDO_SUPPRESS_ERRORS = TRUE;
EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('<事务ID>');
commit;
alter system set UNDO_SUPPRESS_ERRORS = false;
--------------------------------------------------------------

SQL> commit;

SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');

PL/SQL procedure successfully completed.


SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
BEGIN DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059'); END;

*
ERROR at line 1:
ORA-01453: SET TRANSACTION must be first statement of transaction
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 1
--第一个分布式事务处理后,未提交导致

SQL> commit;

Commit complete.

SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');

PL/SQL procedure successfully completed.

SQL> commit;

Commit complete.

五.补充说明
开始在另一个会话中,执行失败原因

SQL> alter session set "_smu_debug_mode"=4;

会话已更改。

SQL> commit;

提交完成。

SQL> BEGIN
  2  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');
  3  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
  4  END;
  5  /
BEGIN
*
第 1 行出现错误:
ORA-01453: SET TRANSACTION must be first statement of transaction
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 3

因为这里的begin end中包含了两个事务的清理,在清理完第一个事务之后,需要提交才能够清理第二个,这里因为没有提交,导致ORA-01453错误。

发表在 ORA-xxxxx | 标签为 | 评论关闭