分类目录归档:Oracle

ORA-00600[kcbshlc_1]导致数据库 down 案例

一台服务器因为ORA-00600[kcbshlc_1]错误引起PMON异常导致数据库down掉

Sun Jul  8 17:20:10 2012
Errors in file /opt/oracle/admin/xff/bdump/xff_pmon_16412.trc:
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
Sun Jul  8 17:20:12 2012
Errors in file /opt/oracle/admin/xff/bdump/xff_pmon_16412.trc:
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
Sun Jul  8 17:20:12 2012
PMON: terminating instance due to error 472

分析trace文件

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /opt/oracle/product/10.2.0
System name:	Linux
Node name:	localhost.localdomain
Release:	2.6.9-89.ELsmp
Version:	#1 SMP Mon Apr 20 10:33:05 EDT 2009
Machine:	x86_64
Instance name: xff
Redo thread mounted by this instance: 1
Oracle process number: 2
Unix process pid: 16412, image: oracle@localhost.localdomain (PMON)

*** 2012-07-08 03:00:11.351
*** SERVICE NAME:(SYS$BACKGROUND) 2012-07-08 03:00:11.338
*** SESSION ID:(1105.1) 2012-07-08 03:00:11.338
 wsd 0x1f8169a6c8, sbuf (nil), setid 9, op 0
lcuridx 0, lasz (nil)
freeing in-flux r/w latch for process state: 1fc165d248
... in-flux r/w latch  1fc1fcc9b0 Child cache buffers chains level=1 child#=4753 
        Location from where latch is held: kcbgtcr: kslbegin excl: 
        Context saved from call: 113266196
        state=busy(exclusive) (val=0x2000000000000071) holder orapid = 113
    waiters [orapid (seconds since: put on list, posted, alive check)]:
     139 (2, 1341687611, 2)
     192 (2, 1341687611, 2)
     191 (2, 1341687611, 2)
     173 (2, 1341687611, 2)
     185 (2, 1341687611, 2)
     176 (2, 1341687611, 2)
     174 (2, 1341687611, 2)
     118 (2, 1341687611, 2)
     190 (2, 1341687611, 2)
     179 (2, 1341687611, 2)
     184 (1, 1341687611, 1)
     189 (1, 1341687611, 1)
     177 (1, 1341687611, 1)
     195 (1, 1341687611, 1)
     187 (1, 1341687611, 1)
     194 (1, 1341687611, 1)
     147 (1, 1341687611, 1)
     183 (1, 1341687611, 1)
     143 (1, 1341687611, 1)
     144 (1, 1341687611, 1)
     186 (1, 1341687611, 1)
     188 (1, 1341687611, 1)
     196 (1, 1341687611, 1)
     145 (1, 1341687611, 1)
     193 (1, 1341687611, 1)
     waiter count=25
*** 2012-07-08 03:50:06.228
 wsd 0x1f8169ac20, sbuf 0xac1ffafe8, setid 10, op 3
lcuridx 1, lasz 0x3c1ffc110
*** 2012-07-08 16:30:05.294
freeing in-flux r/w latch for process state: 20406507f0
... in-flux r/w latch  1f81265f28 Child cache buffers chains level=1 child#=14180 
        Location from where latch is held: kcbgtcr: kslbegin excl: 
        Context saved from call: 71341989
        state=busy(exclusive) (val=0x2000000000000066) holder orapid = 102
    waiters [orapid (seconds since: put on list, posted, alive check)]:
     121 (2, 1341736205, 2)
     116 (2, 1341736205, 2)
     125 (2, 1341736205, 2)
     140 (2, 1341736205, 2)
     145 (2, 1341736205, 2)
     waiter count=5
freeing in-flux r/w latch for process state: 1fc165f9d0
... in-flux r/w latch  1f813aec18 Child cache buffers chains level=1 child#=20914 
        Location from where latch is held: kcbrls: kslbegin: 
        Context saved from call: 96505705
        state=busy(exclusive) (val=0x200000000000007b) holder orapid = 123
*** 2012-07-08 17:20:10.876
 wsd 0x1f8169a6c8, sbuf (nil), setid 9, op 0
lcuridx 0, lasz (nil)
*** 2012-07-08 17:20:10.876
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
kgeasnmierr()+163    call     kgerinv()            0066876E0 ? 2A97200260 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000000 ?
kcbshlc()+239        call     kgeasnmierr()        0066876E0 ? 2A97200260 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000021 ?
kslilcr()+770        call     kcbshlc()            0066876E0 ? 1F801DDB28 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000021 ?
ksl_cleanup()+1567   call     kslilcr()            7FBFFFCE50 ? 000000000 ?
                                                   7FBFFFDCE0 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksuxfl()+492         call     ksl_cleanup()        000000000 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksuxda()+55          call     ksuxfl()             1FC165B8E0 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksucln()+1390        call     ksuxda()             1FC165B8E0 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksbrdp()+794         call     ksucln()             060008100 ? 000000000 ?
                                                   FFFFFFFF9720ED9F ?
                                                   1F801DDB28 ? 0066876E0 ?
                                                   000000021 ?
opirip()+616         call     ksbrdp()             060008100 ? 000000000 ?
                                                   000000001 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FBFFFF698 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FBFFFF698 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
opimai_real()+317    call     sou2o()              7FBFFFF670 ? 000000032 ?
                                                   000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
main()+116           call     opimai_real()        000000003 ? 7FBFFFF700 ?
                                                   000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
__libc_start_main()  call     main()               000000003 ? 7FBFFFF700 ?
+219                                               000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
_start()+42          call     __libc_start_main()  000713984 ? 000000001 ?
                                                   7FBFFFF848 ? 005288D00 ?
                                                   000000000 ? 000000003 ?

通过这个trace可以看出数据库运行在LINUX 64操作系统,版本是10.2.0.4。
出现错误的原因:
PMON在清理1fc165d248的时候,因为被orapid = 102持有,导致清理失败.
PMON在清理20406507f0的时候,因为被orapid = 102持有,导致清理失败.
PMON在清理1fc165f9d0的时候,因为被orapid = 123持有,导致清理失败.

查询MOS[443909.1]
发现是unpublished Bug 4723109.处理方法打上Patch 4723109.

发表在 ORA-xxxxx | 标签为 , , | 一条评论

dul 10 export_mode=true功能增强

在有次8i的库恢复中,因为硬盘损坏导致几个表出现很多诡异性坏块,尝试使用dul对其进行挖掘数据,当时使用dul 9 遇到一个难题:当一张表中有lob类型,同时又有varchar2类型,而且varchar2类型数据中包含回车键,使得解决起来很麻烦(因为export_mode=false支持lob,但是不支持字符串含回车;export_mode=true支持字符串含回车,但是不支持lob),最后放弃了对部分数据的挖掘.这个问题让我一直不甘心,今天测试dul 10 发现是用export_mode=true可以完美解决该问题
创建模拟表和插入数据

SQL> desc t_xff
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 C_BLOB                                             BLOB
 C_VARCHAR                                          VARCHAR2(4000)

SQL> declare
  2  a_blob BLOB;
  3  bfile_name BFILE := BFILENAME('ULTLOBDIR','awr_ora11g_2012-06-01_174_175.html');
  4  begin
  5  insert into t_xff(C_BLOB,C_VARCHAR) values (
  6  empty_blob(),
  7  'www.xifenfei.com
  8  WWW.XIFENFEI.COM
  9  惜分飞
 10  欢迎访问惜分飞博客
 11  提供数据库异常恢复技术支持')
 12  returning C_BLOB into a_blob;
 13  dbms_lob.fileopen(bfile_name);
 14  dbms_lob.loadfromfile(a_blob, bfile_name, dbms_lob.getlength(bfile_name));
 15  dbms_lob.fileclose(bfile_name);
 16  commit;
 17  end;
 18  /

PL/SQL procedure successfully completed.

SQL> select length(c_varchar),dbms_lob.getlength(c_blob) from t_xff;

LENGTH(C_VARCHAR) DBMS_LOB.GETLENGTH(C_BLOB)
----------------- --------------------------
               61                    4282573

SQL>  select c_varchar from t_xff;

C_VARCHAR
---------------------------------------------------------------
www.xifenfei.com
WWW.XIFENFEI.COM
惜分飞
欢迎访问惜分飞博客
提供数据库异常恢复技术支持

dul 挖数据

[oracle@xifenfei dul]$ ./dul

Data UnLoader: 10.2.0.5.13 - Internal Only - on Mon Jul  2 04:29:10 2012
with 64-bit io functions

Copyright (c) 1994 2012 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only

DUL> bootstrap;
DUL> desc chf.t_xff;
Table CHF.T_XFF
obj#= 51353, dataobj#= 51353, ts#= 4, file#= 4, block#=67
      tab#= 0, segcols= 2, clucols= 0
Column information:
icol# 01 segcol# 01       C_BLOB len 4000 type 113 BLOB
  LOB Segment: dataobj#= 51354, ts#= 4, file#= 4, block#=75 chunk=1
  LOB Index: dataobj#= 51355, ts#= 4, file#= 4, block#=83
icol# 02 segcol# 02    C_VARCHAR len 4000 type  1 VARCHAR2 cs 852(ZHS16GBK)

--export_mode=false
DUL> unload table chf.t_xff;
. unloading (index organized) table     LOB01000053      65 rows unloaded
Preparing lob metadata from lob index
Reading LOB01000053.dat 65 entries loaded and sorted 65 entries
. unloading table                     T_XFF       1 row  unloaded

--导出数据文件
-rw-r--r-- 1 oracle oinstall 6.1K Jul  2 04:15 LOB01000053.dat
-rw-r--r-- 1 oracle oinstall  335 Jul  2 04:15 LOB01000053.ctl
-rw-r--r-- 1 oracle oinstall 8.2M Jul  2 04:15 CHF_T_XFF.dat
-rw-r--r-- 1 oracle oinstall  263 Jul  2 04:15 CHF_T_XFF.ctl

----export_mode=true
DUL> unload table chf.t_xff;
. unloading (index organized) table     LOB01000053
DUL: Warning: Recreating file "LOB01000053.ctl"
      65 rows unloaded
Preparing lob metadata from lob index
Reading LOB01000053.dat 65 entries loaded and sorted 65 entries
. unloading table                     T_XFF       1 row  unloaded

--导出数据文件
-rw-r--r-- 1 oracle oinstall    6229 Jul  2 04:29 LOB01000053.dat
-rw-r--r-- 1 oracle oinstall     335 Jul  2 04:29 LOB01000053.ctl
-rw-r--r-- 1 oracle oinstall 4285027 Jul  2 04:29 CHF_T_XFF.dmp

导入数据测试
sqlldr导入

SQL> truncate table chf.t_xff;

Table truncated.

[oracle@xifenfei dul]$ sqlldr chf/xifenfei control=CHF_T_XFF.ctl

SQL*Loader: Release 10.2.0.1.0 - Production on Mon Jul 2 04:23:18 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

SQL*Loader-510: Physical record in data file (CHF_T_XFF.dat) is longer than the maximum(1048576)
SQL*Loader-2026: the load was aborted because SQL Loader cannot continue.
[oracle@xifenfei dul]$ sqlldr chf/xifenfei control=CHF_T_XFF.ctl readsize=20971520

SQL*Loader: Release 10.2.0.1.0 - Production on Mon Jul 2 04:26:50 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

SQL> select length(c_varchar),dbms_lob.getlength(c_blob) from chf.t_xff;

no rows selected

--试验结果证明在出现表中同时有lob和varchar2列(含回车)时,export_mode=false不能正常工作

imp导入

SQL> drop table chf.t_xff;

Table dropped.

[oracle@xifenfei dul]$ imp chf/xifenfei file=CHF_T_XFF.dmp full=y

Import: Release 10.2.0.1.0 - Production on Mon Jul 2 04:30:30 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

Export file created by EXPORT:V07.00.07 via conventional path

Warning: the objects were exported by Bernard's DUL, not by you

. importing Bernard's DUL's objects into CHF
. importing Bernard's DUL's objects into CHF
. . importing table                        "T_XFF"          1 rows imported

SQL> select length(c_varchar),dbms_lob.getlength(c_blob) from t_xff;

LENGTH(C_VARCHAR) DBMS_LOB.GETLENGTH(C_BLOB)
----------------- --------------------------
               61                    4282573

SQL>  select c_varchar from t_xff;

C_VARCHAR
---------------------------------------------------------------
www.xifenfei.com
WWW.XIFENFEI.COM
惜分飞
欢迎访问惜分飞博客
提供数据库异常恢复技术支持

--试验结果证明在出现表中同时有lob和varchar2列(含回车)时,export_mode=true正常工作
发表在 非常规恢复 | 标签为 | 评论关闭

DBCA Fails With ORA-15243

今天接到朋友的电话说他们装ORACLE 11G R1 RAC的时候遇到ORA-12801/ORA-15243错误,请求我帮忙解决
具体情况
AIX系统以前装过11G R2 RAC,现因为项目要求11G R1,已经重装了系统,然后安装R1,在安装到DBCA配置ASM的时候,出现ORA-12801/ORA-15243错误

ORA-12801: error signaled in parallel query server PZ99, instance wmsdb1:+ASM1(1)
ORA-15243: 11.2.0.0.0 is not a valid version number


通过SQLPLUS登录ASM1实例查询发现该有一个ORADATA磁盘组,包含了一个/dev/rhdisk1.通过询问,得出结论是这个磁盘组以前是安装R2的时候作为存储OCR和VOTINGDISK使用,重装系统的时候未对该磁盘进行处理.

处理思路[想办法清除磁盘中asm信息]
1.尝试通过sqlplus 删除该磁盘组,报该磁盘组处于dismount状态
2.尝试mount该磁盘组,提示版本无效(ORA-15243)[当前的asm程序是11.1而磁盘组信息是11.2 程序当然不一致了]
3.直接使用dd清理该asm disk header信息(dd if=/dev/zero of=/dev/rhdisk1 bs=4096 count=1)
4.重新运行dbca一切工作正常

MOS中相关文章[1460997.1]只适合linux asmlib情况

Applies to:
Oracle Server - Enterprise Edition - Version 11.1.0.7 and later
Information in this document applies to any platform.

Symptoms
On : 11.1.0.7 version, STORAGE

When attempting to create database or query gv$asm_diskgroup,
the following error occurs.

ERROR
-----------------------
ORA-12801: error signaled in parallel query server PZ99, instance dchilcmsdb2.hq.navteq.com:+ASM2 (2)
ORA-15243: 11.2.0.0.0 is not a valid version number


STEPS
-----------------------
The issue can be reproduced at will with the following steps:
1. Previously had 11GR2 installed and configured. Removed this installation then installed 
   11.1.0.7  and created diskgroups using some of the same disks previously used.
2. Attempt to create database and receive the errors. Drop the newly 
   created diskgroups and query the view still get same errors.


BUSINESS IMPACT
-----------------------
The issue has the following business impact:
Due to this issue, users cannot create new database.

Changes
 Removed 11.2.0.1 installation and installed 11.1.0.7 software without cleaning up all of 
 the diskgroup information from previous installation.

Cause
All the current information shows that we are using correct binaries and 
that the diskgroups that are being used have correct comparability settings. 
HTML shows that the disks for the old diskgroup are still being discovered. 
This in conjunction with the text of the error as follows shows that 
we are picking up 11.2.0.0.0 as version from somewhere.
ORA-15243: 11.2.0.0.0 is not a valid version number
 
Problem was caused by the disks that had been used for the 
OCR/Voting disk diskgroup in 11GR2 installation still being present and accessible.
 

Solution
As the root user execute /etc/init.d/oracleasm/deletedisk command against all the disks 
that were previously used for the OCR/Voting disk diskgroup then try the operation again.
发表在 ORA-xxxxx | 标签为 | 评论关闭