标签云
asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-00742 ORA-01110 ORA-01555 ORA-01578 ORA-01595 ORA-08103 ORA-600 2131 ORA-600 2662 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 Oracle 恢复 ORACLE恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (103)
- 数据库 (1,770)
- DB2 (22)
- MySQL (77)
- Oracle (1,611)
- Data Guard (52)
- EXADATA (8)
- GoldenGate (24)
- ORA-xxxxx (166)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (15)
- ORACLE 21C (3)
- Oracle 23ai (8)
- Oracle ASM (69)
- Oracle Bug (8)
- Oracle RAC (54)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (29)
- Oracle备份恢复 (592)
- Oracle安装升级 (98)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (86)
- PostgreSQL (30)
- pdu工具 (6)
- PostgreSQL恢复 (9)
- SQL Server (32)
- SQL Server恢复 (13)
- TimesTen (7)
- 达梦数据库 (3)
- 达梦恢复 (1)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (39)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (22)
-
最近发表
- Oracle 19c 202507补丁(RUs+OJVM)-19.28
- 2025年的Oracle 8.0.5数据库恢复
- ORA-600 kokiasg1故障分析(obj$中核心字典序列全部被恶意删除)
- ORA-00756 ORA-10567故障数据0丢失恢复
- 数据库文件变成32k故障恢复
- tcp连接过多导致监听TNS-12532 TNS-12560 TNS-00502错误
- 文件系统格式化MySQL数据库恢复
- .sstop勒索加密数据库恢复
- 解决一次硬件恢复之后数据文件0kb的故障恢复case
- Error in invoking target ‘libasmclntsh19.ohso libasmperl19.ohso client_sharedlib’问题处理
- ORA-01171: datafile N going offline due to error advancing checkpoint
- linux环境oracle数据库被文件系统勒索加密为.babyk扩展名溯源
- ORA-600 ksvworkmsgalloc: bad reaper
- ORA-600 krccfl_chunk故障处理
- Oracle Recovery Tools恢复案例总结—202505
- ORA-600 kddummy_blkchk 数据库循环重启
- 记录一次asm disk加入到vg通过恢复直接open库的案例
- CHECKDB 发现了 N 个分配错误和 M 个一致性错误
- 达梦数据库dm.ctl文件异常恢复
- Oracle Recovery Tools修复ORA-00742、ORA-600 ktbair2: illegal inheritance故障
分类目录归档:Oracle
主键表插入数据不提交,外键表插入数据被阻塞
有客户和我说:他在含主外键的表中实验发现,在主表数据未提交,然后在外键表插入该数据数据时,出现外键表hang住现象.我开始以为是不同的会话,根据oracle数据库的一致性原则,应该新会话在外键表中不能知道这个记录的存在,直接报错.
可是我实验结果证明:外键表会被阻塞.分析原因如下:
模拟环境
SQL> create table t_p(id number primary key,name varchar2(100)); Table created. SQL> create table t_f(fid number primary key,pid number, foreign key(pid) references t_p(id)); Table created. --会话1 SQL> insert into t_p values(1,'xifenfei'); 1 row created. SQL> commit; Commit complete. --会话2 SQL> insert into t_f values(1,1); 1 row created. SQL> commit; --会话1 SQL> insert into t_p values(2,'XIFENFEI'); 1 row created. --会话2 SQL> insert into t_f values(2,2); --hang住
通过实验发现,当主键数据没有提交,然后在外键表中插入该数据外键数据时,该条记录会处于hang住状态(等待),那是什么原因导致了这个等待呢?对会话2做一个10046的trace,发现如下
*** 2012-05-17 17:25:41.757 WAIT #3065187488: nam='enq: TX - row lock contention' ela= 27002895 name|mode=1415053316 usn<<16 | slot=262151 sequence=588 obj#=-1 tim=1337246741756917 EXEC #3065187488:c=4000,e=27004456,p=0,cr=2,cu=14,mis=0,r=0,dep=0,og=1, plh=0,tim=1337246741757690 ERROR #3065187488:err=1013 tim=1337246741757751 STAT #3065187488 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD TABLE CONVENTIONAL (cr=0 pr=0 pw=0 time=12 us)' WAIT #3065187488: nam='SQL*Net break/reset to client' ela= 581 driver id=1650815232 break?=0 p3=0 obj#=-1 tim=1337246741782587 WAIT #3065187488: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1337246741782668
通过这个trace发现,是因为TX锁导致了外键表上的插入操作被阻塞.出现该问题的原因
有两种可能:1.两次插入(主键表和外键表分别插入)在主键表上有不兼容锁;2.外键表上有不兼容性锁.
使用oradebug跟踪会话
oradebug setmypid --EVENT 10704跟踪锁的使用情况 oradebug EVENT 10704 trace name context forever,level 10 --插入数据操作 oradebug EVENT 10704 trace name context off oradebug TRACEFILE_NAME
跟踪主键表插入数据
*** 2012-05-17 19:05:52.410 ksqgtl *** TM-00012892-00000000 mode=3 flags=0x401 timeout=21474836 *** ksqgtl: xcb=0x343c0e54, ktcdix=2147483647, topxcb=0x343c0e54 ktcipt(topxcb)=0x0 *** 2012-05-17 19:05:52.411 ksucti: init txn DID from session DID ksqgtl: ksqlkdid: 0001-0013-0000000F *** 2012-05-17 19:05:52.429 *** ksudidTrace: ksqgtl ktcmydid(): 0001-0013-0000000F ksusesdi: 0000-0000-00000000 ksusetxn: 0001-0013-0000000F ksqgtl: RETURNS 0 *** 2012-05-17 19:05:52.430 ksqgtl *** TM-00012894-00000000 mode=3 flags=0x401 timeout=21474836 *** ksqgtl: xcb=0x343c0e54, ktcdix=2147483647, topxcb=0x343c0e54 ktcipt(topxcb)=0x0 *** 2012-05-17 19:05:52.430 ksucti: init session DID from txn DID: ksqgtl: ksqlkdid: 0001-0013-0000000F *** 2012-05-17 19:05:52.430 *** ksudidTrace: ksqgtl ktcmydid(): 0001-0013-0000000F ksusesdi: 0000-0000-00000000 ksusetxn: 0001-0013-0000000F ksqgtl: RETURNS 0 *** 2012-05-17 19:05:52.431 ksqgtl *** TX-00050019-00000307 mode=6 flags=0x401 timeout=0 *** ksqgtl: xcb=0x343c0e54, ktcdix=2147483647, topxcb=0x343c0e54 ktcipt(topxcb)=0x0 *** 2012-05-17 19:05:52.431 ksucti: init session DID from txn DID: ksqgtl: ksqlkdid: 0001-0013-0000000F *** 2012-05-17 19:05:52.431 *** ksudidTrace: ksqgtl ktcmydid(): 0001-0013-0000000F ksusesdi: 0000-0000-00000000 ksusetxn: 0001-0013-0000000F ksqgtl: RETURNS 0 SQL> SELECT TO_NUMBER(12892,'xxxxxxx') from dual; TO_NUMBER(12892,'XXXXXXX') -------------------------- 75922 SQL> SELECT TO_NUMBER(12894,'xxxxxxx') from dual; TO_NUMBER(12894,'XXXXXXX') -------------------------- 75924 SQL> select object_name from dba_objects where object_id in(75922,75924); OBJECT_NAM ---------- T_P T_F
通过锁使用情况跟踪可以知道,在主键表插入一条记录时,先在主键表获得TM锁,然后外键表获得TM锁,最后主键表获得TX MODE=6的锁。
跟踪外键表插入数据
*** 2012-05-17 19:49:24.912 ksqgtl *** TM-00012892-00000000 mode=3 flags=0x401 timeout=21474836 *** ksqgtl: xcb=0x343a45e8, ktcdix=2147483647, topxcb=0x343a45e8 ktcipt(topxcb)=0x0 *** 2012-05-17 19:49:24.912 ksucti: init txn DID from session DID ksqgtl: ksqlkdid: 0001-0015-00000064 *** 2012-05-17 19:49:24.913 *** ksudidTrace: ksqgtl ktcmydid(): 0001-0015-00000064 ksusesdi: 0000-0000-00000000 ksusetxn: 0001-0015-00000064 ksqgtl: RETURNS 0 *** 2012-05-17 19:49:24.913 ksqgtl *** TM-00012894-00000000 mode=3 flags=0x401 timeout=21474836 *** ksqgtl: xcb=0x343a45e8, ktcdix=2147483647, topxcb=0x343a45e8 ktcipt(topxcb)=0x0 *** 2012-05-17 19:49:24.913 ksucti: init session DID from txn DID: ksqgtl: ksqlkdid: 0001-0015-00000064 *** 2012-05-17 19:49:24.913 *** ksudidTrace: ksqgtl ktcmydid(): 0001-0015-00000064 ksusesdi: 0000-0000-00000000 ksusetxn: 0001-0015-00000064 ksqgtl: RETURNS 0 *** 2012-05-17 19:49:24.913 ksqgtl *** TX-0002001f-0000034a mode=6 flags=0x401 timeout=0 *** ksqgtl: xcb=0x343a45e8, ktcdix=2147483647, topxcb=0x343a45e8 ktcipt(topxcb)=0x0 *** 2012-05-17 19:49:24.913 ksucti: init session DID from txn DID: ksqgtl: ksqlkdid: 0001-0015-00000064 *** 2012-05-17 19:49:24.914 *** ksudidTrace: ksqgtl ktcmydid(): 0001-0015-00000064 ksusesdi: 0000-0000-00000000 ksusetxn: 0001-0015-00000064 ksqgtl: RETURNS 0 *** 2012-05-17 19:49:24.914 ksqgtl *** TX-00050019-00000307 mode=4 flags=0x10021 timeout=21474836 *** ksqgtl: xcb=0x343a45e8, ktcdix=2147483647, topxcb=0x343a45e8 ktcipt(topxcb)=0x0 *** 2012-05-17 19:49:24.914 ksucti: init session DID from txn DID: ksqgtl: ksqlkdid: 0001-0015-00000064 *** 2012-05-17 19:49:24.914 *** ksudidTrace: ksqgtl ktcmydid(): 0001-0015-00000064 ksusesdi: 0000-0000-00000000 ksusetxn: 0001-0015-00000064 *** 2012-05-17 19:49:24.914 ksqcmi: TX,50019,307 mode=4 timeout=21474836
从这里可以发现:先在主键表和外键表上加上TM锁,然后外键表获得TX MODE=6的锁(这边成功,因为该表上未有其他级别不兼容锁),再需要在主键表上获得TX MODE=4(表结构共享锁+所有记录共享锁),但是这个时候,发现该锁上已经在主键表插入数据未提交的时候,已经含有了TX MODE=6的锁,从而使得TX MODE=4无法获得,从而使得外键表插入数据处于阻塞状态.
发表在 Oracle
评论关闭
rac redo log file被意外覆盖数据库恢复
当前日志被覆盖导致错误
朋友的一客户在一套rac上包含了两个数据库,其其中一个库增加redo group时候,覆盖了另外一个库的redo,悲剧的是刚好是current redo
Wed May 16 17:03:05 2012 ALTER DATABASE OPEN This instance was first to open Wed May 16 17:03:09 2012 Beginning crash recovery of 2 threads parallel recovery started with 15 processes Wed May 16 17:03:11 2012 Started redo scan Wed May 16 17:03:11 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc: ORA-00305: log 14 of thread 1 inconsistent; belongs to another database ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2' ORA-00305: log 14 of thread 1 inconsistent; belongs to another database ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1' Abort recovery for domain 0 Wed May 16 17:03:11 2012 Aborting crash recovery due to error 305 Wed May 16 17:03:11 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc: ORA-00305: log 14 of thread 1 inconsistent; belongs to another database ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2' ORA-00305: log 14 of thread 1 inconsistent; belongs to another database ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1' ORA-305 signalled during: ALTER DATABASE OPEN... Wed May 16 17:03:13 2012 Shutting down instance (abort)
使用_allow_resetlogs_corruption= TRUE进行恢复
Wed May 16 18:16:48 2012 SMON: enabling cache recovery Wed May 16 18:16:48 2012 Instance recovery: looking for dead threads Instance recovery: lock domain invalid but no dead threads Wed May 16 18:16:48 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2105454.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [2522], [18446744072024280773], [2522], [18446744072024247666], [], [], [] Wed May 16 18:16:50 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2105454.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [2522], [18446744072024280773], [2522], [18446744072024247666], [], [], [] Wed May 16 18:16:50 2012 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 2105454 ORA-1092 signalled during: alter database open resetlogs...
ORA-600[KCLCHKBLK_4], is signaled because the SCN in a tempfile block is too high.
The same reason caused the ORA-600[2662]s in the alert logs.
因为是临时文件的scn太大的问题,那就比较好解决:
启动数据库到mount状态,查询出来相关temp file,然后drop掉.
ORA-00600[6856]
Wed May 16 20:25:16 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2482210.trc: ORA-00339: archived log does not contain any redo ORA-00334: archived log: '/dev/rods_redo2_1_1' ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], [] ORACLE Instance odsdb1 (pid = 16) - Error 600 encountered while recovering transaction (10, 8) on object 7162533. Wed May 16 20:25:16 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2482210.trc: ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], []
这里的4193和4194是比较熟悉的,根据这里的提示猜测6856也是和undo有关系
ORA-600[6856]SMON is trying to recover a dead transaction.
But the undo application runs into an internal error (trying to delete a row that is already deleted).
因为smon回滚的时候出现上面错误,解决方法是想办法终止回滚,使用event=”10513 trace name context forever, level 2″.
ORA-00600[4193]/ORA-00600[4194]
Wed May 16 20:25:17 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc: ORA-00339: archived log does not contain any redo ORA-00334: archived log: '/dev/rods_redo2_1_1' ORA-00600: internal error code, arguments: [4194], [22], [25], [], [], [], [], [] Wed May 16 20:25:18 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc: ORA-00339: archived log does not contain any redo ORA-00334: archived log: '/dev/rods_redo2_1_1' ORA-00600: internal error code, arguments: [4194], [22], [25], [], [], [], [], [] Wed May 16 20:25:56 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc: ORA-00600: internal error code, arguments: [4193], [22248], [22252], [], [], [], [], []
太常见错误,不再做说明,虽然使用event是的库open成功,因为部分回滚段有问题,该错误还是会出现(还是喜欢直接屏蔽回滚段)
ORA-00600[ktpridestroy2]
Wed May 16 20:36:26 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2101296.trc: ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], []
This error could be the result of a corruption and involves the parallel rollback that SMON enables each startup.
解决:fast_start_parallel_rollback=false
ORA-00600[kturacf1]/ORA-00600[kcbgcur_9]
Wed May 16 20:49:15 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_j000_2007088.trc: ORA-00600: internal error code, arguments: [kturacf1], [2097152], [], [], [], [], [], [] ORA-00600: internal error code, arguments: [kcbgcur_9], [780140563], [4], [4294959056], [2097152], [], [], []
ORA-00600[kturacf1]错误未查询到原因
ORA-00600[kcbgcur_9]错误原因可能是:Buffers are pinned in a specific class order to prevent internal deadlocks.
因为这两个错误是job产生非致命错误,在这次的处理过程中可以忽略
ORA-00600[4097]
Wed May 16 21:05:05 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_j000_1716282.trc: ORA-12012: error on auto execute of job 6603 ORA-20001: ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], [] ORA-06512: at "EPBI.UP_SYSLOG_ONLINE_USER", line 141 ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
When an instance has a rollback segment offline and the instance crashes, or
the user does a shutdown abort, the rollback segment wrap number does not get
updated. If that segment is then dropped and recreated immediately after the
instance is restarted, the wrap number could be lower than existing wrap
numbers. This will cause the ORA-600[4097] to occur in subsequent
transactions using Rollback.
这个错误也是因为回滚段wrap number未被及时更新导致的异常.
总结这次恢复过程
1.因当前redo丢失,使用隐含参数打开库,然后出现ORA-600[KCLCHKBLK_4](这个错误比较少见,更何况这个库是10.2.0.4)
2.undo出现问题出现ORA-00600[6856]错误不是很多见
3.接下来都是一些列undo导致的问题,其实如果开始就直接使用隐含参数删除掉有问题回滚段,效果可能会比event好.
4.因为部分trace文件没有拿到,未做深入分析,文章列出整体恢复思路
5.本次恢复的数据库版本是10.2.0.4,根据mos描述,很多错误应不会发生,但是实际还是发生了,MOS的版本范围,不要太看重.
6.其实这篇文章的本质不是展示恢复过程,而是再一次提醒:操作数据库慎重,特别是一台机器上多套库.
library cache lock等待事件
Library cache lock介绍
Oracle利用Library cache lock和Library cache pin来实现并发控制,Library cache lock是在handle上获取的,而Library cache pin则是在data heap上获取。访问对象时,首先必须获取handle上的lock,然后将访问的数据pin在内存中。lock的作用是控制进程间的并发访问,而pin的作用是保证数据一致性,防止数据在访问时被交换出去。
lock和pin的实现类似于enqueue,在每个handle上都有lock和pin的holder list和waiter list,用来保存持有该资源和等待该资源的队列。
Library cache lock相关sql语句
--找出library cache lock等待sid,saddr信息 select sid,saddr from v$session where event= 'library cache lock'; SID SADDR ---------- -------- 16 572ed244 --找出blocked信息 select kgllkhdl Handle,kgllkreq Request, kglnaobj Object from x$kgllk where kgllkses = '572ed244' and kgllkreq > 0; HANDLE REQUEST OBJECT -------- ---------- ------------------ 62d064dc 2 EMPLOYEES --找出blocking信息 select kgllkses saddr,kgllkhdl handle,kgllkmod mod,kglnaobj object from x$kgllk lock_a where kgllkmod > 0 and exists (select lock_b.kgllkhdl from x$kgllk lock_b where kgllkses = '572ed244' /* blocked session */ and lock_a.kgllkhdl = lock_b.kgllkhdl and kgllkreq > 0); SADDR HANDLE MOD OBJECT -------- -------- ---------- ------------ 572eac94 62d064dc 3 EMPLOYEES --blocking 会话信息 select sid,username,terminal,program from v$session where saddr = '572eac94' SID USERNAME TERMINAL PROGRAM ---------- ----------- --------- -------------------------------------------- 12 SCOTT pts/20 sqlplus@goblin.forgotten.realms (TNS V1-V3) --所有blocked 会话 select sid,username,terminal,program from v$session where saddr in (select kgllkses from x$kgllk lock_a where kgllkreq > 0 and exists (select lock_b.kgllkhdl from x$kgllk lock_b where kgllkses = '572eac94' /* blocking session */ and lock_a.kgllkhdl = lock_b.kgllkhdl and kgllkreq = 0) ); SID USERNAME TERMINAL PROGRAM ---------- --------- --------- ------------------------------------------- 13 SCOTT pts/22 sqlplus@goblin.forgotten.realms (TNS V1-V3) 16 SCOTT pts/7 sqlplus@goblin.forgotten.realms (TNS V1-V3)