分类目录归档:ORA-xxxxx

large pool太小导致shared server异常

数据库出现如下错误

Fri Oct  5 09:33:54 2012
Process S001 started up but failed with error = 20
failed to start shared server 'S001', oer=20
Process S001 started up but failed with error = 20
failed to start shared server 'S001', oer=20
Process S001 started up but failed with error = 20
failed to start shared server 'S001', oer=20
Process S001 started up but failed with error = 20
failed to start shared server 'S001', oer=20
Process S001 started up but failed with error = 20
failed to start shared server 'S001', oer=20
Process S001 started up but failed with error = 20
failed to start shared server 'S001', oer=20
Process S001 started up but failed with error = 20
failed to start shared server 'S001', oer=20
Process S001 started up but failed with error = 20
failed to start shared server 'S001', oer=20

重启后错误提示变为

Successfully onlined Undo Tablespace 1.
Fri Oct  5 09:34:41 2012
SMON: enabling tx recovery
Fri Oct  5 09:34:41 2012
Database Characterset is AL32UTF8
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN
Fri Oct  5 13:53:50 2012
Errors in file /oracle/admin/ora/bdump/ora_s000_25948.trc:
ORA-04031: unable to allocate 72 bytes of shared memory 
("large pool","unknown object","session heap","trigger condition node")
Fri Oct  5 13:53:50 2012
Errors in file /oracle/admin/ora/bdump/ora_s000_25948.trc:
ORA-00600: internal error code, arguments: [ksudel1], [], [], [], [], [], [], []
ORA-04031: unable to allocate 72 bytes of shared memory 
("large pool","unknown object","session heap","trigger condition node")
Fri Oct  5 13:54:52 2012
found dead shared server 'S000', pid = (11, 1)
Fri Oct  5 17:25:59 2012
Errors in file /oracle/admin/ora/bdump/ora_s000_31081.trc:
ORA-04031: unable to allocate 72 bytes of shared memory 
("large pool","unknown object","session heap","trigger condition node")
Fri Oct  5 17:25:59 2012
Errors in file /oracle/admin/ora/bdump/ora_s000_31081.trc:
ORA-00600: internal error code, arguments: [ksudel1], [], [], [], [], [], [], []
ORA-04031: unable to allocate 72 bytes of shared memory 
("large pool","unknown object","session heap","trigger condition node")

通过这里的错误,我们可以看到是large pool不能分配72 bytes的连续内存空间而使得S000进程报错.那这两者有什么联系:我们知道S000是shared server的进程,那shared server为什么导致large pool不足呢?查询官方文档得出,如下三种情况会使用large pool

Session memory for the shared server and the Oracle XA interface 
(used where transactions interact with more than one database)

I/O server processes

Oracle backup and restore operations

主要也就是shared server/parallel query buffers/backup restore这几个操作会使用到large pool.在该案例中很明显的可以看到是因为shared server进程需要分配large pool中一部分空间,而没有连续空间从而出现该错误.数据库相关参数配置

SQL> show parameter mts;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
mts_circuits                         integer     555
mts_dispatchers                      string      (PROTOCOL=TCP) (SERVICE=oraXDB)
mts_listener_address                 string
mts_max_dispatchers                  integer     5
mts_max_servers                      integer     20
mts_multiple_listeners               boolean     FALSE
mts_servers                          integer     1
mts_service                          string      ora
mts_sessions                         integer     550

SQL> show parameter large;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
large_pool_size                      big integer 16777216

这里可以看出来,数据库明显配置了MTS,因为数据库在启动时候,最少会建立一个shared server进程,而这个时候因为large pool太小(16M),导致该进程无法正常建立,从而出现上述alert中相关错误,临时处理方法增加large pool.后续需要关注业务特点,考虑是否可以采用Oracle Dedicated server模式来处理.

发表在 ORA-xxxxx | 评论关闭

ORA-00600[kccpb_sanity_check_2]

alert日志出现如下错误
数据库在mount的时候,因为出现ORA-00600[kccpb_sanity_check_2]错误导致数据库不能正常被mount成功

--sqlplus中报错
SQL> startup nomount;
ORACLE instance started.
 
Total System Global Area 4294967296 bytes                                     
Fixed Size                  2273256 bytes                                     
Variable Size            1486573592 bytes                                     
Database Buffers         2801795072 bytes                                     
Redo Buffers                4325376 bytes                                     
  
SQL> alter database mount;
alter database mount
*
ERROR at line 1:
ORA-600: internal error code, arguments: [kccpb_sanity_check_2], [2825], 
[2824], [0x000000000], [], [], [], [] 

--alert日志
Mon Sep 24 16:35:37 2012
ALTER DATABASE   MOUNT
Mon Sep 24 16:35:41 2012
Errors in file /opt/app/oracle/admin/lhgk/udump/lhgk_ora_17034.trc:
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [2825], [2824], [0x000000000], [], [], [], []
Mon Sep 24 16:35:41 2012
ORA-600 signalled during: ALTER DATABASE   MOUNT...

错误原因

ORA-600 [kccpb_sanity_check_2] indicates that the seq# of the last read block is
higher than the seq# of the control file header block. This is indication of
the lost write of the header block during commit of the previous cf
transaction.

解决方法

1) restore a backup of a controlfile and recover

OR

2) recreate the controlfile

OR

3) restore the database from last good backup and recover

NOTE:  If you do not have any special backup of control file to restore and you are using Multiple Control File 
copies in your pfile/init.ora/spfile you can attempt to mount the database using each control file one by one.  
If you are able to mount the database with any of these control file copies you can then issue 
'alter database backup controlfile to trace' to recreate controlfile.
发表在 ORA-xxxxx | 标签为 , | 一条评论

模拟普通ORA-08103并解决

在上一篇中说到:模拟极端ORA-08103并解决,不能通过修改成坏块来解决,这里演示了是一个普通的数据块出现异常,然后通过bbed修改为坏块通过dbms_repair来解决该故障,补充说明:在11.2.0.3.3的库中,使用该方法不能重现该错误,而是直接提示ORA-01578,证明ORACLE的新版本在这一方面进行了改进
创建测试表

SQL> SELECT * FROM V$VERSION;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> create table t_xifenfei as
  2   select * from dba_objects where rownum<3000;

Table created.

SQL> SELECT owner, segment_name, EXTENT_ID, FILE_ID, BLOCK_ID, BLOCKS 
  2  FROM dba_extents 
  3  WHERE segment_name='T_XIFENFEI' AND owner='CHF';

OWNER                          SEGMENT_NAME     EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
------------------------------ --------------- ---------- ---------- ---------- ----------
CHF                            T_XIFENFEI               0          4         57          8
CHF                            T_XIFENFEI               1          4         65          8
CHF                            T_XIFENFEI               2          4         73          8
CHF                            T_XIFENFEI               3          4         81          8
CHF                            T_XIFENFEI               4          4         89          8
CHF                            T_XIFENFEI               5          4         97          8

6 rows selected.

SQL> SELECT DISTINCT dbms_rowid.rowid_block_number(rowid) blk#,
  2  dbms_rowid.rowid_relative_fno(rowid) file# 
  3   FROM t_xifenfei
  4   ORDER BY 2,1;

      BLK#      FILE#
---------- ----------
        60          4
        61          4
        62          4
        63          4
        64          4
        65          4
        66          4
        67          4
        68          4
        69          4
        70          4

      BLK#      FILE#
---------- ----------
        71          4
        72          4
        74          4
        75          4
        76          4
        77          4
        78          4
        79          4
        80          4
        81          4
        82          4

      BLK#      FILE#
---------- ----------
        83          4
        84          4
        85          4
        86          4
        87          4
        88          4
        90          4
        91          4
        92          4
        93          4
        94          4

      BLK#      FILE#
---------- ----------
        95          4
        96          4
        97          4
        98          4

37 rows selected.

模拟ORA-08103

SQL> CONN / AS SYSDBA
Connected.
SQL> SHUTDOWN IMMEDIATE
Database closed.
Database dismounted.
ORACLE instance shut down.

--破坏数据快(其实就是清空一个数据块block 95,注意dd和实际数据的block对应关系相差1)
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/XFF/users01.dbf bs=8192 seek=95 count=1 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000187113 seconds, 43.8 MB/s


SQL> STARTUP
ORACLE instance started.

Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             104860124 bytes
Database Buffers          205520896 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI;
SELECT COUNT(*) FROM CHF.T_XIFENFEI
                         *
ERROR at line 1:
ORA-08103: object no longer exists


[oracle@xifenfei ~]$ exp chf/xifenfei tables=t_xifenfei file=/tmp/t_xifenfei.dmp

Export: Release 10.2.0.4.0 - Production on Fri Jan 13 22:09:43 2012

Copyright (c) 1982, 2007, Oracle.  All rights reserved.


Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set

About to export specified tables via Conventional Path ...
. . exporting table                     T_XIFENFEI
EXP-00056: ORACLE error 8103 encountered
ORA-08103: object no longer exists
Export terminated successfully with warnings.

[oracle@xifenfei ~]$ expdp chf/xifenfei tables=t_xifenfei dumpfile=t_xifenfei.dmp

Export: Release 10.2.0.4.0 - Production on Friday, 13 January, 2012 22:10:26

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "CHF"."SYS_EXPORT_TABLE_01":  chf/******** tables=t_xifenfei dumpfile=t_xifenfei.dmp 
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 384 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
ORA-31693: Table data object "CHF"."T_XIFENFEI" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-08103: object no longer exists
Master table "CHF"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for CHF.SYS_EXPORT_TABLE_01 is:
  /u01/oracle/oracle/product/10.2.0/db_1/rdbms/log/t_xifenfei.dmp
Job "CHF"."SYS_EXPORT_TABLE_01" completed with 1 error(s) at 22:10:50

测试证明当出现ORA-08103的时候,全表扫描,exp,expdp均不能正常工作

找到出现ORA-08103数据块

SQL> alter session set max_dump_file_size=unlimited;

Session altered.

SQL> alter session set db_file_multiblock_read_count=1;

Session altered.

SQL> alter session set events 'immediate trace name trace_buffer_on level 1048576';

Session altered.

SQL>  alter session set events '10200 trace name context forever, level 1';

Session altered.

SQL> alter session set events '8103 trace name errorstack level 3';

Session altered.

SQL> alter session set events '10236 trace name context forever, level 1';

Session altered.

SQL>  alter session set tracefile_identifier='ORA8103';

Session altered.


SQL> select * from chf.t_xifenfei;

……………………

ERROR:
ORA-08103: object no longer exists

2700 rows selected.


--在trace文件结尾发现如下记录,表示读到这个数据块时发生错误
KTRVAC: path typ=0, rdba=100005f


SQL> select to_number('100005f','xxxxxxxxxxxxx') from dual;

TO_NUMBER('100005F','XXXXXXXXXXXXX')
------------------------------------
                            16777311

SQL> select
  2  dbms_utility.data_block_address_file(16777311) FILE_NO,
  3  dbms_utility.data_block_address_block(16777311) BLOCK_NO
  4  from dual;

   FILE_NO   BLOCK_NO
---------- ----------
         4         95

bbed继续破坏异常块

BBED> set filename '/u01/oracle/oradata/XFF/users01.dbf'
        FILENAME        /u01/oracle/oradata/XFF/users01.dbf

BBED> set block 95
        BLOCK#          95

BBED> map
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 95                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (00)


BBED> set count 32
        COUNT           32

BBED> d
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 95               Offsets:    0 to   31           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>

--这个就是和11gr2的区别,在11g中使用该方法来模拟ORA-08103,直接提示坏块,从而不会出现ORA-08103
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/XFF/users01.dbf
BLOCK = 95


DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 1
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0

BBED> d offset 8180
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 95               Offsets: 8180 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 00000000 

 <32 bytes per line>

BBED> m /x 01010101 offset 8188
BBED-00215: editing not allowed in BROWSE mode


BBED> set mode edit
        MODE            Edit

--修改sumcheck
BBED> m /x 01010101 offset 8188
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 95               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01010101 

 <32 bytes per line>

BBED> sum
Check value for File 0, Block 95:
current = 0x0000, required = 0x0000

测试修改为坏块效果

SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 95)
ORA-01110: data file 4: '/u01/oracle/oradata/XFF/users01.dbf'


SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');

PL/SQL procedure successfully completed.

SQL> select count(*) from chf.t_xifenfei;

  COUNT(*)
----------
      2918

通过让ORA-08103对应的块变为真正的坏块,然后使用dbms_repair或者event来跳过坏块,达到拯救数据的目的

发表在 ORA-xxxxx | 标签为 | 评论关闭