
ORA-600 kokiasg1 failure analysis (all core dictionary sequences maliciously deleted from obj$)

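Summary: the customer shut the database down normally, and the next startup failed with ORA-600 kokiasg1. Analysis of the startup confirmed that the cause was the missing IDGEN1$ sequence. After that was repaired the database started, but background processes reported many ORA-600 12803 and ORA-600 15264 errors and business users could not log in. Deeper analysis showed that the sequences of all core dictionary objects had been deleted from the data dictionary table obj$, while the obj# records of those objects still existed in seq$. The initial suspicion is that someone maliciously deleted the core dictionary sequence objects from obj$.
Database startup reports ORA-600 kokiasg1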

SQL> startup ;
ORACLE instance started.

Total System Global Area 1.4531E+10 bytes
Fixed Size                  2295256 bytes
Variable Size            2181040680 bytes
Database Buffers         1.2314E+10 bytes
Redo Buffers               33193984 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [],
[], [], [], [], []
Process ID: 5628
Session ID: 122 Serial number: 3

Corresponding alert log entries

Thu Jul 03 16:35:25 2025
Shutting down instance (immediate)
Stopping background process SMCO
Shutting down instance: further logons disabled
Thu Jul 03 16:35:26 2025
Stopping background process CJQ0
Stopping background process QMNC
Stopping background process MMNL
Stopping background process MMON
License high water mark = 272
All dispatchers and shared servers shutdown
Thu Jul 03 16:35:54 2025
alter database close normal
Thu Jul 03 16:35:54 2025
SMON: disabling tx recovery
SMON: disabling cache recovery
Thu Jul 03 16:35:54 2025
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Thread 1 closed at log sequence 296590
Successful close of redo thread 1
Completed: alter database close normal
alter database dismount
Shutting down archive processes
Archiving is disabled
Completed: alter database dismount
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Thu Jul 03 16:36:02 2025
Stopping background process VKTM
Thu Jul 03 16:36:07 2025
Instance shutdown complete
Thu Jul 03 16:36:19 2025
Adjusting the default value of parameter parallel_max_servers
from 640 to 270 due to the value of parameter processes (300)
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 16
Number of processor cores in the system is 8
Number of processor sockets in the system is 1
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =52
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Windows NT Version V6.2  
CPU                 : 16 - type 8664, 8 Physical Cores
Process Affinity    : 0x0x0000000000000000
Memory (Avail/Total): Ph:24712M/32767M, Ph+PgF:14089M/39123M 
System parameters with non-default values:
  processes                = 300
  sessions                 = 480
  nls_language             = "SIMPLIFIED CHINESE"
  nls_territory            = "CHINA"
  sga_target               = 13920M
  control_files            = "D:\APP\ADMINISTRATOR\ORADATA\orcl\CONTROL01.CTL"
  control_files            = "D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\orcl\CONTROL02.CTL"
  db_block_size            = 8192
  compatible               = "11.2.0.4.0"
  db_recovery_file_dest    = "D:\app\Administrator\fast_recovery_area"
  db_recovery_file_dest_size= 10G
  undo_tablespace          = "UNDOTBS1"
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=orclXDB)"
  job_queue_processes      = 10
  audit_file_dest          = "D:\APP\ADMINISTRATOR\ADMIN\orcl\ADUMP"
  audit_trail              = "DB"
  db_name                  = "orcl"
  open_cursors             = 300
  pga_aggregate_target     = 4639M
  diagnostic_dest          = "D:\APP\ADMINISTRATOR"
Thu Jul 03 16:36:20 2025
PMON started with pid=2, OS id=13088 
Thu Jul 03 16:36:20 2025
PSP0 started with pid=3, OS id=16168 
Thu Jul 03 16:36:21 2025
VKTM started with pid=4, OS id=7948 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Thu Jul 03 16:36:21 2025
GEN0 started with pid=5, OS id=4192 
Thu Jul 03 16:36:21 2025
DIAG started with pid=6, OS id=8232 
Thu Jul 03 16:36:21 2025
DBRM started with pid=7, OS id=16436 
Thu Jul 03 16:36:21 2025
DIA0 started with pid=8, OS id=11400 
Thu Jul 03 16:36:21 2025
MMAN started with pid=9, OS id=11108 
Thu Jul 03 16:36:21 2025
DBW0 started with pid=10, OS id=12232 
Thu Jul 03 16:36:21 2025
DBW1 started with pid=11, OS id=7368 
Thu Jul 03 16:36:21 2025
LGWR started with pid=12, OS id=13520 
Thu Jul 03 16:36:21 2025
CKPT started with pid=13, OS id=11952 
Thu Jul 03 16:36:21 2025
SMON started with pid=14, OS id=9304 
Thu Jul 03 16:36:21 2025
RECO started with pid=15, OS id=17136 
Thu Jul 03 16:36:21 2025
MMON started with pid=16, OS id=1984 
Thu Jul 03 16:36:21 2025
MMNL started with pid=17, OS id=2568 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'
starting up 1 shared server(s) ...
ORACLE_BASE from environment = D:\app\Administrator
Thu Jul 03 16:36:22 2025
alter database mount exclusive
Successful mount of redo thread 1, with mount id 1287723014
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Thread 1 opened at log sequence 296590
  Current log# 1 seq# 296590 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\orcl\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
[15144] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:3680275922 end:3680276032 diff:110 (1 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_15144.trc  (incident=7579):
ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_7579\orcl_ora_15144_i7579.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_15144.trc:
ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [], [], [], [], [], []
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_15144.trc:
ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 15144): terminating the instance due to error 600
Instance terminated by USER, pid = 15144
ORA-1092 signalled during: alter database open...

Tracing the database startup shows that the error is probably related to the IDGEN1$ object

PARSING IN CURSOR #615624160 len=30 dep=1 uid=0 oct=3 lid=0 tim=752975051401
   hv=3013659460 ad='7ffbd8f025d0' sqlid='6d8vr86tu1ku4'
select TOTAL from SYS.ID_GENS$
END OF STMT
PARSE #615624160:c=15625,e=2775,p=2,cr=14,cu=0,mis=1,r=0,dep=1,og=4,plh=1676180847,tim=752975051401
EXEC #615624160:c=0,e=6,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1676180847,tim=752975051452
WAIT #615624160: nam='db file sequential read' ela= 126 file#=1 block#=3440 blocks=1 obj#=514 tim=752975051594
WAIT #615624160: nam='db file sequential read' ela= 48 file#=1 block#=3441 blocks=1 obj#=514 tim=752975051671
FETCH #615624160:c=0,e=224,p=2,cr=3,cu=0,mis=0,r=1,dep=1,og=4,plh=1676180847,tim=752975051687
STAT #615624160 id=1 cnt=1 pid=0 pos=1 obj=514 op='TABLE ACCESS FULL ID_GENS$ (cr=3 pr=2 pw=0 time=223 us)'
CLOSE #615624160:c=0,e=15,dep=1,type=0,tim=752975051716
BINDS #12720440:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=80 off=0
  kxsbbbfp=24b1b128  bln=22  avl=01  flg=05
  value=0
 Bind#1
  oacdty=01 mxl=32(07) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=01 csi=852 siz=0 off=24
  kxsbbbfp=24b1b140  bln=32  avl=07  flg=01
  value="IDGEN1$"
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=0 off=56
  kxsbbbfp=24b1b160  bln=22  avl=02  flg=01
  value=1
EXEC #12720440:c=0,e=107,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=752975051842
FETCH #12720440:c=0,e=5,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=752975051856
CLOSE #12720440:c=0,e=0,dep=1,type=3,tim=752975051870
Incident 161 created, dump file: C:\APP\XFF\diag\rdbms\orcl\orcl\incident\incdir_161\orcl_ora_1880_i161.trc
ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [], [], [], [], [], []

ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [], [], [], [], [], []
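
The way the trace above is read (look at the last bind section before the ORA-00600 line) can be sketched in Python. This is only an illustrative helper, not a tool used in this recovery; the function name `binds_before_error` and the assumption that bind values appear on lines of the form `value=...` are mine, based on the trace excerpt shown here.

```python
import re

# Matches the value line of a bind dump in a SQL trace, e.g.  value="IDGEN1$"
BIND_VALUE = re.compile(r'^\s*value=(.*)$')

def binds_before_error(trace_lines, error_marker="ORA-00600"):
    """Return the bind values of the last BINDS section seen before the
    first line containing error_marker, or [] if the marker never appears."""
    current = []  # bind values of the section currently being read
    for line in trace_lines:
        if line.startswith("BINDS #"):
            current = []  # a new bind section starts, forget the old one
        elif error_marker in line:
            return current
        else:
            match = BIND_VALUE.match(line)
            if match:
                current.append(match.group(1).strip().strip('"'))
    return []

# Hypothetical usage with a few lines shaped like the trace above.
sample = [
    "BINDS #12720440:",
    " Bind#0",
    "  value=0",
    " Bind#1",
    '  value="IDGEN1$"',
    " Bind#2",
    "  value=1",
    "EXEC #12720440:c=0,e=107",
    "ORA-00600: internal error code, arguments: [kokiasg1]",
]
print(binds_before_error(sample))
```

A string bind that names a dictionary object right before the failure is a hint worth checking against obj$, which is what the following steps do.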

MOS confirms that when the IDGEN1$ sequence is missing from the database, startup fails with ORA-600 kokiasg1.
[Image: ORA-600-kokiasg1]


Use a tool to recover the obj$ table into a new database

E:\dump>imp test/oracle file=SYS_OBJ$.dmp full=y

Import: Release 11.2.0.4.0 - Production on Sat Jul 5 09:34:42 2025

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Export file created by EXPORT:V08.01.07 via conventional path

Warning: the objects were exported by SYS, not by you

import done in ZHS16GBK character set and AL16UTF16 NCHAR character set
export server uses UTF8 NCHAR character set (possible ncharset conversion)
. importing SYS's objects into TEST
. importing SYS's objects into TEST
. . importing table                          "OBJ$"     103764 rows imported
Import terminated successfully without warnings.

Querying test.obj$ confirms that there is no record with the object name IDGEN1$

SQL> select * from test.obj$ where name='IDGEN1$';

no rows selected

SQL>

Query the IDGEN1$ object information in the obj$ dictionary of a healthy database

SQL> select owner#, obj#,type# from obj$ where name='IDGEN1$';

    OWNER#       OBJ#      TYPE#
---------- ---------- ----------
         0       1229          6

In test.obj$, recovered from the failed database, query the objects around obj# 1229

SQL> select owner#, obj#,type#,name from test.obj$ where obj# in(1228,1229,1230);

    OWNER#       OBJ#      TYPE# NAME
---------- ---------- ---------- ------------------------------
         0       1228          2 DST$TRIGGER_TABLE
         0       1230         13 BFILE

SQL> select owner#, obj#,type#,name from obj$ where obj# in(1228,1229,1230);

    OWNER#       OBJ#      TYPE# NAME
---------- ---------- ---------- ------------------------------
         0       1228          2 DST$TRIGGER_TABLE
         0       1229          6 IDGEN1$
         0       1230         13 BFILE
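
The comparison above can be expressed as a small set difference. The sketch below is illustrative only: `missing_objects` is a hypothetical helper, the rows are plain dicts copied from the two query results, and it assumes (as the comparison itself does) that objects created by the same catalog scripts get the same obj# in a healthy database of the same version.

```python
def missing_objects(healthy_rows, recovered_rows):
    """Return the healthy obj$ rows whose obj# has no row in the recovered obj$."""
    recovered_ids = {row["obj#"] for row in recovered_rows}
    return [row for row in healthy_rows if row["obj#"] not in recovered_ids]

# Rows taken from the two query results shown above.
healthy = [
    {"owner#": 0, "obj#": 1228, "type#": 2, "name": "DST$TRIGGER_TABLE"},
    {"owner#": 0, "obj#": 1229, "type#": 6, "name": "IDGEN1$"},
    {"owner#": 0, "obj#": 1230, "type#": 13, "name": "BFILE"},
]
recovered = [
    {"owner#": 0, "obj#": 1228, "type#": 2, "name": "DST$TRIGGER_TABLE"},
    {"owner#": 0, "obj#": 1230, "type#": 13, "name": "BFILE"},
]

for row in missing_objects(healthy, recovered):
    print(row["obj#"], row["type#"], row["name"])
```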

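At this point the preliminary conclusion is that the failed database indeed cannot start because the IDGEN1$ sequence is missing. The fix is relatively simple: while the database is opening, open a new session and create the IDGEN1$ sequence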


After that, restarting the database succeeds, but attempting to log in reports ORA-600 12803
[Image: ORA-600-12803]

Checking the alert log again shows a large number of ORA-600 errors

Fri Jul 04 15:57:13 2025
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_ora_27788.trc  (incident=12239):
ORA-00600: internal error code, arguments: [12803], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Jul 04 15:58:04 2025
Errors in file C:\APP\XFF\diag\rdbms\orcl\orcl\trace\orcl_mmon_1976.trc  (incident=12184):
ORA-00600: internal error code, arguments: [15264], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
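
When the alert log contains many such entries, a quick tally by first argument helps show which ORA-600 codes dominate. This is a minimal sketch; `count_ora600` is a hypothetical helper and the regular expression assumes the message layout seen in the lines above (it does not depend on the message language).

```python
import re
from collections import Counter

# Captures the first bracketed argument of an ORA-00600 line.
ORA600_FIRST_ARG = re.compile(r'ORA-00600:[^\[]*\[([^\]]*)\]')

def count_ora600(alert_lines):
    """Count ORA-00600 occurrences grouped by their first argument."""
    counts = Counter()
    for line in alert_lines:
        match = ORA600_FIRST_ARG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical usage with lines shaped like the alert log above.
alert_sample = [
    "Fri Jul 04 15:57:13 2025",
    "ORA-00600: internal error code, arguments: [12803], [], [], []",
    "Fri Jul 04 15:58:04 2025",
    "ORA-00600: internal error code, arguments: [15264], [], [], []",
]
print(count_ora600(alert_sample))
```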

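Given these ORA-600 errors, I suspected further problems at the dictionary level. Because the original error was a sequence anomaly, this time I focused on the system sequences and used DUL to recover the seq$ table into the test user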

E:\dump>imp test/oracle file=SYS_seq$.dmp full=y

Import: Release 11.2.0.4.0 - Production on Sat Jul 5 10:10:17 2025

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Export file created by EXPORT:V08.01.07 via conventional path

Warning: the objects were exported by SYS, not by you

import done in ZHS16GBK character set and AL16UTF16 NCHAR character set
export server uses UTF8 NCHAR character set (possible ncharset conversion)
. importing SYS's objects into TEST
. importing SYS's objects into TEST
. . importing table                          "SEQ$"        359 rows imported
Import terminated successfully without warnings.

The query shows that the earlier sequence (obj#=1229) is, surprisingly, still present in seq$ (although its record is gone from obj$)

SQL> select * from test.seq$ where obj#=1229;

      OBJ# INCREMENT$   MINVALUE   MAXVALUE     CYCLE#     ORDER$      CACHE
---------- ---------- ---------- ---------- ---------- ---------- ----------
 HIGHWATER AUDIT$                                      FLAGS
---------- -------------------------------------- ----------
      1229         50          1 1.0000E+28          0          0       1000
  60267151 --------------------------------                0

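This indicates that the sequence was not removed with the DROP SEQUENCE command but was probably deleted by a direct DELETE against obj$. A test confirms that after a normal drop of a sequence, the rows in obj$ and seq$ are both removed together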

SQL> create sequence xxxx;

Sequence created.

SQL> select obj#,type# from obj$ where name='XXXX';

      OBJ#      TYPE#
---------- ----------
     87383          6

SQL> SELECT * FROM SEQ$ WHERE OBJ#=87383;

      OBJ# INCREMENT$   MINVALUE   MAXVALUE     CYCLE#     ORDER$      CACHE
---------- ---------- ---------- ---------- ---------- ---------- ----------
 HIGHWATER AUDIT$                                      FLAGS
---------- -------------------------------------- ----------
     87383          1          1 1.0000E+28          0          0         20
         1 --------------------------------                0


SQL> DROP SEQUENCE XXXX;

Sequence dropped.

SQL> SELECT * FROM SEQ$ WHERE OBJ#=87383;

no rows selected

SQL> select obj#,type# from obj$ where name='XXXX';

no rows selected
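
The inconsistency described above (a seq$ row whose obj# has no sequence entry in obj$) can be checked mechanically once both tables are available as rows. The sketch below is an assumption-laden illustration: `orphaned_sequences` is a hypothetical helper working on plain dicts, and it relies only on type#=6 identifying sequences in obj$, as shown in the queries above.

```python
def orphaned_sequences(seq_rows, obj_rows):
    """Return the obj# values present in seq$ that have no type#=6 row in obj$."""
    sequence_ids = {row["obj#"] for row in obj_rows if row["type#"] == 6}
    return sorted(row["obj#"] for row in seq_rows if row["obj#"] not in sequence_ids)

# Recovered obj$ has lost obj# 1229, while recovered seq$ still holds it.
recovered_obj = [
    {"obj#": 1228, "type#": 2},
    {"obj#": 1230, "type#": 13},
]
recovered_seq = [{"obj#": 1229}]

print(orphaned_sequences(recovered_seq, recovered_obj))
```

In a consistent dictionary the result should be empty, because DROP SEQUENCE removes both rows together, as the test above demonstrates.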

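This raises the next question: were other system sequences deleted as well? The approach is to find the obj# values of the SYS sequences in a healthy database, then compare them against the obj$ and seq$ tables in the test user
Look up the names of the SYS sequence objects in test.obj$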

SQL> select name,obj#,type# from test.obj$ where obj# in(
  2  select obj# from sys.obj$ where owner#=0 and type#=6)
  3  and type#=6;

no rows selected

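The query confirms that the object names of all core built-in sequences under SYS have been deleted in the failed database (they are definitely gone from obj$). A further check confirms the situation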

SQL> select name,ctime from test.obj$ where type#=6 and owner#=0;

no rows selected

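These checks show that essentially all system dictionary sequences have been deleted from obj$ in the failed database (a healthy database should have more than 130). Handling this is relatively simple: use the contents of seq$ to construct CREATE statements for the system sequences and create them. The database then returned to normal, which completed this recovery.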
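
Constructing the CREATE statements from seq$ can be sketched as follows. This is not the exact script used in this recovery; `create_sequence_ddl` is a hypothetical helper, the sequence name is assumed to come from obj$ of a healthy database of the same version (matched by obj#), and starting at HIGHWATER is my assumption for avoiding reuse of already issued values. The MAXVALUE of 10**28 - 1 is also an assumption about what SQL*Plus displays as 1.0000E+28.

```python
def create_sequence_ddl(name, seq_row, owner="SYS"):
    """Build a CREATE SEQUENCE statement from one seq$ row, given as a dict
    keyed by the lower-cased seq$ column names shown in the queries above."""
    parts = [
        f'CREATE SEQUENCE {owner}."{name}"',
        # Assumption: restart at HIGHWATER so that no cached value is reused.
        f'START WITH {seq_row["highwater"]}',
        f'INCREMENT BY {seq_row["increment$"]}',
        f'MINVALUE {seq_row["minvalue"]}',
        f'MAXVALUE {seq_row["maxvalue"]}',
        "CYCLE" if seq_row["cycle#"] else "NOCYCLE",
        "ORDER" if seq_row["order$"] else "NOORDER",
        # CREATE SEQUENCE rejects CACHE values below 2, so emit NOCACHE then.
        f'CACHE {seq_row["cache"]}' if seq_row["cache"] >= 2 else "NOCACHE",
    ]
    return " ".join(parts) + ";"

# The seq$ row recovered for obj# 1229 (IDGEN1$) earlier in this article.
idgen1_row = {
    "obj#": 1229,
    "increment$": 50,
    "minvalue": 1,
    "maxvalue": 10**28 - 1,  # assumed exact value behind the 1.0000E+28 display
    "cycle#": 0,
    "order$": 0,
    "cache": 1000,
    "highwater": 60267151,
}

print(create_sequence_ddl("IDGEN1$", idgen1_row))
```

Any generated statement should be reviewed against a healthy database of the same version before it is run on the damaged one.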
