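Monthly archive: May 2019
Rescuing a database from logical corruption on an active-active storage system
The night before a planned vacation, a friend called for help: the Oracle database behind the core HIS system at xx hospital was raising ORA-8103 on many tables, and business could not be processed normally.
Checking the datafiles with dbv revealed runs of contiguous corrupt blocks.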
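As a rough illustration of the dbv check mentioned above, the following sketch builds one DBVERIFY command line per datafile. The `file`, `blocksize` and `logfile` keywords are standard dbv parameters; the datafile paths, the 8192-byte block size and the log directory are assumptions for illustration, not values from this case. Files stored in ASM would additionally need a `userid` parameter.

```python
def dbv_command(datafile, blocksize=8192, logfile=None):
    """Return the argument list for one dbv (DBVERIFY) run."""
    args = ["dbv", f"file={datafile}", f"blocksize={blocksize}"]
    if logfile:
        args.append(f"logfile={logfile}")
    return args


def dbv_script(datafiles, blocksize=8192, log_dir="/tmp"):
    """Return a shell script text with one dbv command per datafile."""
    lines = []
    for i, path in enumerate(datafiles, start=1):
        # log_dir and the dbv_<n>.log naming are hypothetical choices
        cmd = dbv_command(path, blocksize, f"{log_dir}/dbv_{i}.log")
        lines.append(" ".join(cmd))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical paths; replace with the real datafile list.
    print(dbv_script(["/u01/oradata/orcl/system01.dbf",
                      "/u01/oradata/orcl/users01.dbf"]))
```

The generated commands can be run one by one, and the log files then compared to see which files report corrupt pages.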
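In past experience, errors like these usually point to a problem in the underlying layers, and the system log did show a large number of disk errors.
The timestamps of these errors roughly matched the time the application reported trouble, so the initial suspicion was a hardware or OS fault. Many tables were raising ORA-8103 and some files had contiguous regions zeroed out, so the extent of the damage could not be pinned down precisely: zeroed blocks are physical corruption at the database level, whereas ORA-8103 is a logical error that cannot be detected until the affected table is accessed. We therefore reviewed the customer's hardware environment and their backup and disaster-recovery setup in order to choose the best approach.
Talking with the customer and checking the database turned up the following:
1) The storage uses vendor xx's active-active solution. A storage-level solution like this does not help with this failure, because the damage is logical corruption at the LUN level and the corrupted data was synchronized to both arrays.
2) Database disaster recovery uses a vendor's CDP replication. The customer analyzed the CDP copy and found that data synchronization was abnormal, so this option was essentially unusable as well.
3) Database backups: the storage holding the backups had a failed battery and bad disks, which made its write I/O extremely slow, so the customer had stopped the RMAN backups to the file system 3 days earlier. There was also a TSM tape-library backup, but a check showed that not a single backup had ever succeeded.
The failure spreads
Node 2 was clearly the abnormal one, so the plan was to stop the database and clusterware on node 2 and see whether node 1 improved. Instead, as soon as CRS was stopped on node 2, the database on node 1 crashed. Analysis showed that the first few MB of one ASM disk had been zeroed. This had probably happened before CRS was shut down, but because nothing had touched the disk header area, the problem had not yet been triggered. When a node shuts down it writes disk header information; ASM detected the anomaly and dismounted the disk group on node 1, which brought down the database there.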
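To illustrate what "the first few MB zeroed" means in practice, here is a minimal sketch that counts how many leading bytes of a device or image file are zero. It is only a quick sanity check under the assumption that the disk (or a dd copy of its head) is readable as an ordinary file; it is not a replacement for kfed, and the 4 MB default limit is an arbitrary choice.

```python
def leading_zero_bytes(path, limit=4 * 1024 * 1024, chunk=1024 * 1024):
    """Count how many bytes at the start of `path` are zero, up to `limit`."""
    count = 0
    with open(path, "rb") as f:
        while count < limit:
            data = f.read(min(chunk, limit - count))
            if not data:
                break  # reached end of file
            stripped = data.lstrip(b"\x00")
            count += len(data) - len(stripped)
            if stripped:
                break  # first non-zero byte found
    return count


if __name__ == "__main__":
    import sys
    # Usage: python check_head.py /path/to/disk_or_image
    n = leading_zero_bytes(sys.argv[1])
    print(f"{n} leading zero bytes")
```

A result equal to the limit means the whole inspected region is empty, which is consistent with a wiped disk header and worth confirming with kfed.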
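The situation at this point:
1) The ASM disk group is damaged (the first few MB of one disk are corrupt), so repairing the original database in place is essentially impossible.
2) The CDP data is abnormal and unusable.
3) A full backup taken 4 days earlier was found on one of the database servers.
Recovery plan:
1. The customer provisions new space, and the 4-day-old backup is restored directly to a local file system.
2. Analyze the damaged ASM disk group with low-level tools and try to extract the archived logs and redo, to recover as much data as possible.
3. Restore the 4-day-old backup and apply the extracted archived logs and redo to attempt a complete recovery.
4. Risk: even if the archived logs and redo can be extracted from the damaged ASM disk group, they may themselves be corrupt, in which case recovery to the latest data is not possible and data is lost.
What actually happened:
1. After the failure the previous night, the customer had added some undo datafiles, so a normal full restore database was not possible (the control file listed more datafiles than the backup set contained).
2. Restoring the 10.2.0.4 RAC database to a single instance then hit ORA-600 kgeade_is_1.
3. After recovery completed, sqlplus worked normally against the database, but PL/SQL Developer and the application got ORA-27092 when accessing it.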
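The post does not say how the first problem was worked around. One possible approach, shown here only as a sketch, is to restore just the datafile numbers that exist in the backup set and redirect them to the new file system with `set newname`. The file numbers, target directory and file naming below are hypothetical; `set newname for datafile`, `restore datafile` and `switch datafile all` are standard RMAN commands.

```python
def rman_restore_script(file_numbers, target_dir):
    """Build an RMAN run block restoring selected datafiles to target_dir."""
    lines = ["run {"]
    for n in file_numbers:
        # datafile_<n>.dbf is a hypothetical naming scheme
        lines.append(
            f"  set newname for datafile {n} to '{target_dir}/datafile_{n}.dbf';"
        )
    numbers = ", ".join(str(n) for n in file_numbers)
    lines.append(f"  restore datafile {numbers};")
    lines.append("  switch datafile all;")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical: datafiles 1-4 are in the backup set, later ones are not.
    print(rman_restore_script([1, 2, 3, 4], "/restore/orcl"))
```

The datafiles added after the backup would then have to be handled separately, for example recreated during recovery or taken offline, depending on what they contain.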
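In the end luck was on our side: after a series of efforts the database opened successfully and the application could access it normally. Queries on the tables that were corrupt in the original production environment no longer raise ORA-8103, and dbv checks of the previously damaged files come back clean.
A reminder to everyone once again:
1) Are your database backups actually working? Run regular recovery drills.
2) Choose a disaster-recovery solution that suits the database, and check or rehearse it regularly.
3) Active-active storage protects against hardware failure, but you still need a suitable solution to guard against logical errors at the storage level.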
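Additional SQL statements that hit ORA-1555 during database open
In a 2015 article summarizing the ORA-01555 errors commonly hit during database open, I listed the SQL statements that can raise ORA-01555 while Oracle is opening. Two new ones turned up in recent recoveries, so they are added here.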
select rowcnt,blkcnt,empcnt,avgspc,chncnt,avgrln,nvl(degree,1), nvl(instances,1) from tab$ where obj# = :1
Thu May 09 02:10:27 2019
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: bqbdby3c400p7, SCN: 0x0000.3e785fc7):
select rowcnt,blkcnt,empcnt,avgspc,chncnt,avgrln,nvl(degree,1), nvl(instances,1) from tab$ where obj# = :1
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_15929.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 91 with name "_SYSSMU91_1360910548$" too small
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_15929.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 91 with name "_SYSSMU91_1360910548$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 15929): terminating the instance due to error 704
Instance terminated by USER, pid = 15929
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (15929) as a result of ORA-1092
Thu May 09 02:10:28 2019
ORA-1092 : opitsk aborting process
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
NSA2 started with pid=41, OS id=32571518
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, Query Duration=0 sec, SCN: 0x0005.e4bea784):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /u01/app/oracle/diag/rdbms/xifenfei_std/xifenfei/trace/xifenfei_ora_18939904.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_542380376$" too small
Errors in file /u01/app/oracle/diag/rdbms/xifenfei_std/xifenfei/trace/xifenfei_ora_18939904.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_542380376$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 18939904): terminating the instance due to error 704
Instance terminated by USER, pid = 18939904
ORA-1092 signalled during: alter database open RESETLOGS...
opiodr aborting process unknown ospid (18939904) as a result of ORA-1092
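When collecting cases like the two above, it can help to pull the key fields out of the alert log automatically. The sketch below is one way to do that with regular expressions written against the two message layouts shown in this post; other Oracle versions may format the header differently, so treat the patterns as assumptions. The SCN is converted from the `wrap.base` hex form to a single number as wrap * 2^32 + base.

```python
import re

# Header line, with or without the "Query Duration" field, followed by the SQL text.
HEADER = re.compile(
    r"ORA-01555 caused by SQL statement below \(SQL ID: (?P<sql_id>\w+),"
    r".*?SCN: 0x(?P<wrap>[0-9a-fA-F]+)\.(?P<base>[0-9a-fA-F]+)\):"
    r"\s*(?P<sql>.*?)\s*Errors in file",
    re.S,
)
SEGMENT = re.compile(r'rollback segment number (\d+) with name "([^"]+)"')


def parse_ora1555(alert_text):
    """Return one dict per ORA-01555 entry found in the alert log text."""
    results = []
    for m in HEADER.finditer(alert_text):
        seg = SEGMENT.search(alert_text, m.end())
        results.append({
            "sql_id": m.group("sql_id"),
            "scn": (int(m.group("wrap"), 16) << 32) | int(m.group("base"), 16),
            "sql": " ".join(m.group("sql").split()),
            "undo_segment": seg.group(2) if seg else None,
        })
    return results


if __name__ == "__main__":
    import sys
    # Usage: python parse_1555.py alert_orcl.log
    with open(sys.argv[1], errors="replace") as f:
        for entry in parse_ora1555(f.read()):
            print(entry)
```

The undo segment name is taken from the first "rollback segment number" line after each header, which matches the logs in this post but is a simplification.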
ORA-00600 kcratr_scan_rc
An 11.2.0.4 database failed to start with ORA-600 kcratr_scan_rc.
Alert log entries for the ORA-600 kcratr_scan_rc error:
Thu May 09 01:56:01 2019
alter database open
Beginning crash recovery of 1 threads
parallel recovery started with 23 processes
Started redo scan
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_9975.trc (incident=171):
ORA-00600: internal error code, arguments: [kcratr_scan_rc], [4], [1], [39821], [190063], [], [], [], [], [], [], []
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_171/orcl_ora_9975_i171.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Aborting crash recovery due to error 600
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_9975.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_rc], [4], [1], [39821], [190063], [], [], [], [], [], [], []
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_9975.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_rc], [4], [1], [39821], [190063], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...
This shows that the database could not apply redo correctly during instance recovery. We bypassed roll-forward recovery and forced the database open:
SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 6998261760 bytes
Fixed Size 2266624 bytes
Variable Size 2684357120 bytes
Database Buffers 4294967296 bytes
Redo Buffers 16670720 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 91 with name "_SYSSMU91_1360910548$" too small
Process ID: 15929
Session ID: 5272 Serial number: 3
The database raised the familiar ORA-01092 ORA-00704 ORA-00604 ORA-01555 errors. Analyzing the trace file:
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu May 09 02:10:27 2019
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: bqbdby3c400p7, SCN: 0x0000.3e785fc7):
select rowcnt,blkcnt,empcnt,avgspc,chncnt,avgrln,nvl(degree,1), nvl(instances,1) from tab$ where obj# = :1
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_15929.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 91 with name "_SYSSMU91_1360910548$" too small
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_15929.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 91 with name "_SYSSMU91_1360910548$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 15929): terminating the instance due to error 704
Instance terminated by USER, pid = 15929
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (15929) as a result of ORA-1092
Thu May 09 02:10:28 2019
ORA-1092 : opitsk aborting process
This time the statement behind the ORA-1555 is a relatively rare one: select rowcnt,blkcnt,empcnt,avgspc,chncnt,avgrln,nvl(degree,1), nvl(instances,1) from tab$ where obj# = :1. After handling it with bbed:
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 17776
Session ID: 5272 Serial number: 3
The database returned ORA-03113. The alert log showed:
ORA-00600: internal error code, arguments: [4198], [], [], [], [], [], [], [], [], [], [], []
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_smon_17739.trc (incident=128132):
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_128132/orcl_smon_17739_i128132.trc
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_mmon_17743.trc (incident=128151):
ORA-00600: internal error code, arguments: [4412], [0x1E6BC4DF8], [0x000000000], [1], [6283], [], [], [], [], [], [], []
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_128151/orcl_mmon_17743_i128151.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
SMON: Parallel transaction recovery slave got internal error
SMON: Downgrading transaction recovery to serial
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_smon_17739.trc (incident=128133):
ORA-00600: internal error code, arguments: [4137], [10.4.1100583], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_128133/orcl_smon_17739_i128133.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_mmon_17743.trc (incident=128152):
ORA-00600: internal error code, arguments: [4406], [0x1E6BC4DF8], [0x000000000], [2], [6289], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4412], [0x1E6BC4DF8], [0x000000000], [1], [6283], [], [], [], [], [], [], []
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_128152/orcl_mmon_17743_i128152.trc
ORACLE Instance orcl (pid = 15) - Error 600 encountered while recovering transaction (10, 4).
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_smon_17739.trc (incident=128134):
ORA-00600: internal error code, arguments: [4137], [98.33.13158], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_128134/orcl_smon_17739_i128134.trc
Starting background process SMCO
Thu May 09 02:16:25 2019
SMCO started with pid=21, OS id=18119
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_smon_17739.trc (incident=128135):
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_128135/orcl_smon_17739_i128135.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Thu May 09 02:16:27 2019
Thu May 09 02:16:32 2019
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_18168.trc (incident=128620):
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-28000: the account is locked
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_128620/orcl_ora_18168_i128620.trc
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-28000: the account is locked
Block recovery from logseq 3, block 239 to scn 2147483855
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
Mem# 0: /home/u01/oradata/orcl/redo03.log
Block recovery stopped at EOT rba 3.241.16
Block recovery completed at rba 3.241.16, scn 0.2147483854
Block recovery from logseq 3, block 239 to scn 2147483853
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
Mem# 0: /home/u01/oradata/orcl/redo03.log
Block recovery completed at rba 3.241.16, scn 0.2147483854
Thu May 09 02:16:33 2019
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_18170.trc (incident=128621):
ORA-00600: internal error code, arguments: [4193], [], [], [], [], [], [], [], [], [], [], []
ORA-28000: the account is locked
Incident details in: /home/u01/diag/rdbms/orcl/orcl/incident/incdir_128621/orcl_ora_18170_i128621.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu May 09 02:16:34 2019
PMON (ospid: 17710): terminating the instance due to error 474
Instance terminated by PMON, pid = 17710
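The log is full of ORA-600 4198, ORA-600 6006, ORA-600 4412, ORA-600 4137, ORA-600 4406, ORA-600 kdsgrp1 and similar errors. In past experience these are mainly caused by undo corruption. After the corresponding handling, the database opened normally and the data was exported logically.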
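With this many different internal errors in one alert log, a quick tally of the first ORA-600 argument makes the pattern easier to see. The sketch below assumes the message layout shown in the logs above (`ORA-00600: internal error code, arguments: [code], ...`); the script and function names are only illustrative.

```python
import re
from collections import Counter

# Captures the first bracketed argument, e.g. 4137 or kdsgrp1.
ORA600 = re.compile(r"ORA-00600: internal error code, arguments: \[([^\]]*)\]")


def ora600_codes(alert_text):
    """Count ORA-600 occurrences in the alert log text by first argument."""
    return Counter(m.group(1) for m in ORA600.finditer(alert_text))


if __name__ == "__main__":
    import sys
    # Usage: python ora600_tally.py alert_orcl.log
    with open(sys.argv[1], errors="replace") as f:
        for code, n in ora600_codes(f.read()).most_common():
            print(f"ORA-600 [{code}]: {n}")
```

A cluster of codes such as 4137, 4193 and 4194 appearing together is the kind of signature the post attributes to undo problems.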