月归档:六月 2016

Resize operation completed for file#

Orale 12c DataGuard环境中备库出现Resize operation completed for file# 现象
数据库版本

[oracle@ray01 ~]$ opatch lspatches
22291127;Database Patch Set Update : 12.1.0.2.160419 (22291127)

OPatch succeeded.

SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for Linux: Version 12.1.0.2.0 - Production                                            0
NLSRTL Version 12.1.0.2.0 - Production                                                    0

alert日志提示

Mon Jun 27 20:56:37 2016
Resize operation completed for file# 18, old size 25600000K, new size 30720000K
Mon Jun 27 20:56:56 2016
Archived Log entry 210 added for thread 1 sequence 286 rlc 915405135 ID 0x2316988c dest 2:
Mon Jun 27 20:57:01 2016
Primary database is in MAXIMUM PERFORMANCE mode
RFS[211]: Assigned to RFS process (PID:22867)
RFS[211]: Selected log 11 for thread 1 sequence 287 dbid 588725388 branch 915405135
Mon Jun 27 20:57:14 2016
Resize operation completed for file# 17, old size 25600000K, new size 30720000K
Mon Jun 27 07:57:15 2016
Archived Log entry 211 added for thread 1 sequence 287 ID 0x2316988c dest 1:
Mon Jun 27 20:57:15 2016
Resize operation completed for file# 3, old size 972800K, new size 983040K
Mon Jun 27 20:57:15 2016
Primary database is in MAXIMUM PERFORMANCE mode
RFS[212]: Assigned to RFS process (PID:22873)
RFS[212]: Selected log 11 for thread 1 sequence 288 dbid 588725388 branch 915405135
Mon Jun 27 20:57:15 2016
Resize operation completed for file# 3, old size 983040K, new size 1024000K
Resize operation completed for file# 3, old size 1024000K, new size 1034240K
Mon Jun 27 20:57:54 2016
Resize operation completed for file# 15, old size 25600000K, new size 30720000K
Mon Jun 27 20:58:15 2016
Resize operation completed for file# 18, old size 30720000K, new size 33554416K
Mon Jun 27 20:58:34 2016
Resize operation completed for file# 17, old size 30720000K, new size 33554416K
Mon Jun 27 20:58:54 2016
Resize operation completed for file# 15, old size 30720000K, new size 33554416K

大量Resize operation completed for file# 操作记录,给人感觉比较烦,根据多年使用oracle的经验,这种现象很可能有隐含参数或者event屏蔽,隐含参数可以猜测到,event需要查询官方资料.

查询汗resize的隐含参数

SQL> col name for a52
SQL> col value for a24
SQL> col description for a50
set linesize 150
select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
  from x$ksppi a,x$ksppcv b
 where a.inst_id = USERENV ('Instance')
   and b.inst_id = USERENV ('Instance')
   and a.indx = b.indx
SQL> SQL>   2    3    4    5    6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name
/
  8  Enter value for param: resize
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%resize%')

NAME                                                 VALUE                    DESCRIPTION
---------------------------------------------------- ------------------------ --------------------------------------------------
_asm_skip_resize_check                               FALSE                    skip the checking of the clients for s/w compatibi
                                                                              lity for resize

_bct_public_dba_buffer_dynresize                     2                        allow dynamic resizing of public dba buffers, zero
                                                                               to disable

_disable_file_resize_logging                         FALSE                    disable file resize logging to alert log

从这里可以发现_disable_file_resize_logging参数默认值为false,表示显示文件resize的提示,设置为true应该就可以解决该问题.

发表在 ORACLE 12C | 标签为 | 评论关闭

DataGuard ora-16157故障解决

由于硬件故障,需要激活备库,由于登录错误服务器对本不该激活的服务器进行了如下操作
STANDBY DATABASE FINISH操作

[oracle@app73 ~]$ ss

SQL*Plus: Release 11.2.0.4.0 Production on Sun Jun 26 23:33:19 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning and Data Mining options

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Database altered.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;

Database altered.

对应的alert日志提示

Sun Jun 26 23:33:22 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sun Jun 26 23:33:22 2016
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /Data/oracle/diag/rdbms/commentdbdg/commentdb/trace/commentdb_pr00_11871.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 47440782676
Sun Jun 26 23:33:23 2016
MRP0: Background Media Recovery process shutdown (commentdb)
Managed Standby Recovery Canceled (commentdb)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sun Jun 26 23:33:44 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH
Attempt to do a Terminal Recovery (commentdb)
Media Recovery Start: Managed Standby Recovery (commentdb)
 started logmerger process
Sun Jun 26 23:33:44 2016
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 4 slaves
Media Recovery Waiting for thread 1 sequence 21196 (in transit)
Killing 4 processes with pids 11881,11867,11869,14314 (all RFS, wait for I/O) 
in order to disallow current and future RFS connections. Requested by OS process 7372
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '06/26/2016 23:33:48'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 21196 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 1 Group 10 Seq 21196 Reading mem 0
  Mem# 0: /Data/oracle/oradata/commentdb/std_redo10.log
Identified End-Of-Redo (failover) for thread 1 sequence 21196 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 47440782709 time 06/26/2016 23:29:51
Media Recovery Complete (commentdb)
Terminal Recovery: Enabled archive destination LOG_ARCHIVE_DEST_2
Terminal Recovery: successful completion
Forcing ARSCN to IRSCN for TR 11:196142453
Attempt to set limbo arscn 11:196142453 irscn 11:196142453
Resetting standby activation ID 3880004483 (0xe7442b83)
Sun Jun 26 23:33:48 2016
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance commentdb - Archival Error
ORA-16014: log 10 sequence# 21196 not archived, no available destinations
ORA-00312: online log 10 thread 1: '/Data/oracle/oradata/commentdb/std_redo10.log'
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH
Sun Jun 26 23:39:24 2016
RFS[4]: Assigned to RFS process 7392
RFS[4]: No connections allowed during/after terminal recovery.

很明显数据库已经挺尸mrp并且成功执行了ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH

尝试重新启动mrp应用日志

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area 2.1379E+10 bytes
Fixed Size                  2262656 bytes
Variable Size            2684356992 bytes
Database Buffers         1.8656E+10 bytes
Redo Buffers               36073472 bytes
Database mounted.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;

Database altered.

alert日志提示

Sun Jun 26 23:40:39 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (commentdb)
Sun Jun 26 23:40:39 2016
MRP0 started with pid=24, OS id=7458
MRP0: Background Managed Standby Recovery process started (commentdb)
 started logmerger process
Sun Jun 26 23:40:44 2016
Managed Standby Recovery starting Real Time Apply
MRP0: Background Media Recovery terminated with error 16157
Errors in file /Data/oracle/diag/rdbms/commentdbdg/commentdb/trace/commentdb_pr00_7460.trc:
ORA-16157: media recovery not allowed following successful FINISH recovery
Managed Standby Recovery not using Real Time Apply
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Recovery Slave PR00 previously exited with exception 16157
MRP0: Background Media Recovery process shutdown (commentdb)

[oracle@app73 trace]$ oerr ora 16157
16157, 00000, "media recovery not allowed following successful FINISH recovery"
// *Cause:  A RECOVER MANAGED STANDBY DATABASE FINISH command has previously
//          completed successfully.  Another media recovery is not allowed.
// *Action: Issue one of these operations following a FINISH recocvery:
//          ALTER DATABASE OPEN READ ONLY or
//          ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY.

前台虽然提示mrp启动成功,但是alert日志提示启动mrp进程失败,原因是由于media recovery not allowed following successful FINISH recovery,也就是说由于我开始已经执行了ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH导致无法继续启动mrp进程,也就无法继续应用日志。

查看mos看看是否有解决方案
ora-16157


根据官方的说法只能重建dg,实在不甘心,个人感觉ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH执行完成,但是我没有failover命令执行,应该数据文件没有改变,只是控制文件发生了改变。

解决ora-16157问题
从主库重新生成standby controlfile并且传输到备库,再次尝试启动mrp

--主库操作
SQL> ALTER DATABASE CREATE standby CONTROLFILE AS '/tmp/controlfs01.ctl';

Database altered.
--scp到备库

SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
SQL> startup nomount;
ORACLE instance started.

Total System Global Area 2.1379E+10 bytes
Fixed Size                  2262656 bytes
Variable Size            2684356992 bytes
Database Buffers         1.8656E+10 bytes
Redo Buffers               36073472 bytes

[oracle@app73 ~]$ cp /tmp/controlfs01.ctl /Data/oracle/oradata/commentdb/control01.ctl 
[oracle@app73 ~]$ cp /tmp/controlfs01.ctl /Data/oracle/fast_recovery_area/commentdb/control02.ctl

SQL> alter database mount;

Database altered.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;

Database altered.

观察alert日志

ARC2: Becoming the active heartbeat ARCH
Completed: alter database mount
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (commentdb)
Sun Jun 26 23:46:50 2016
MRP0 started with pid=24, OS id=7547
MRP0: Background Managed Standby Recovery process started (commentdb)
Sun Jun 26 23:46:54 2016
RFS[1]: Assigned to RFS process 7553
RFS[1]: Selected log 10 for thread 1 sequence 21197 dbid -414945661 branch 893285763
Sun Jun 26 23:46:54 2016
Primary database is in MAXIMUM PERFORMANCE mode
RFS[2]: Assigned to RFS process 7555
RFS[2]: Selected log 11 for thread 1 sequence 21198 dbid -414945661 branch 893285763
Archived Log entry 1 added for thread 1 sequence 21197 ID 0xe7442b83 dest 1:
 started logmerger process
Sun Jun 26 23:46:55 2016
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 4 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Sun Jun 26 23:46:56 2016
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Sun Jun 26 23:47:06 2016
Media Recovery Waiting for thread 1 sequence 21196
Fetching gap sequence in thread 1, gap sequence 21196-21196
Sun Jun 26 23:47:06 2016
RFS[3]: Assigned to RFS process 7567
RFS[3]: Opened log for thread 1 sequence 21196 dbid -414945661 branch 893285763
Archived Log entry 2 added for thread 1 sequence 21196 rlc 893285763 ID 0xe7442b83 dest 2:
Media Recovery Log /Data/oracle/fast_recovery_area/COMMENTDBDG/archivelog/2016_06_26/o1_mf_1_21196_cpzy7tjc_.arc
Media Recovery Log /Data/oracle/fast_recovery_area/COMMENTDBDG/archivelog/2016_06_26/o1_mf_1_21197_cpzy7gtl_.arc
Media Recovery Waiting for thread 1 sequence 21198 (in transit)
Recovery of Online Redo Log: Thread 1 Group 11 Seq 21198 Reading mem 0
  Mem# 0: /Data/oracle/oradata/commentdb/std_redo11.log

到这里已经证明,通过重建standby controlfile实现了即使执行了ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH(ORA-16157)也可以通过不重建dg,让其恢复正常(恢复dg状态).

发表在 Data Guard | 标签为 , | 评论关闭

pvid=yes导致asm无法mount

今天凌晨接到客户恢复请求,对于aix rac,两个ibm存储做mirror的环境中,客户做存储容灾演练,发现磁盘的名称发生改变,然后对其中一个磁盘设置pvid,结果悲剧了导致asm一个磁盘组无法正常起来。然后又aix端删除这些设备,然后重新扫描设备。结果不是一个磁盘组不能mount,而是整个gi就无法正常启动。希望我们给予技术支持。
查看asm 日志,确定asm disk信息
asm-disk1
asm-disk2


从这里可以确定,一共有两个asm diskgroup,每个group有两个磁盘,hdisk2和hdisk3 为hisdata,hdisk4,和hdisk5为emrdata.

使用kfed分析磁盘头

dd if=/dev/rhdisk2 of=/tmp/xifenfei/rhdisk2.dd bs=1024k count=10
dd if=/dev/rhdisk3 of=/tmp/xifenfei/rhdisk3.dd bs=1024k count=10
dd if=/dev/rhdisk4 of=/tmp/xifenfei/rhdisk4.dd bs=1024k count=10
dd if=/dev/rhdisk5 of=/tmp/xifenfei/rhdisk5.dd bs=1024k count=10
--传输到我电脑上分析

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk2.dd|grep name
kfdhdb.dskname:            HISDATA_0000 ; 0x028: length=12
kfdhdb.grpname:                 HISDATA ; 0x048: length=7
kfdhdb.fgname:             HISDATA_0000 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk3.dd|grep name
kfdhdb.dskname:            HISDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 HISDATA ; 0x048: length=7
kfdhdb.fgname:             HISDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk4.dd|grep name
kfdhdb.dskname:            EMRDATA_0000 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0000 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd|grep name

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
000000000 C1D4C2C9 00000000 00000000 00000000  [................]
000000010 00000000 00000000 00000000 00000000  [................]
  Repeat 254 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd blkn=2|grep kfbh
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                33554432 ; 0x004: blk=33554432
kfbh.block.obj:                16777344 ; 0x008: file=128
kfbh.check:                  2654889601 ; 0x00c: 0x9e3e6681
kfbh.fcn.base:               1696071680 ; 0x010: 0x65180000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd blkn=510|grep name
kfdhdb.dskname:            EMRDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

通过上述分析,基本上确定由于对hdisk5设置了pvid导致该asm disk的磁盘头损坏.这个可以直接使用asm repair功能修复(注意要clear pvid)

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd |grep name
kfdhdb.dskname:            EMRDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

启动crs到cssd进程报错分析
1. 由于删除磁盘,扫描设备导致hdisk[2-5] 权限和用户组不对
2. 由于删除,扫描磁盘导致磁盘共享模式不对
修复磁盘头和解决这两个问题之后,gi启动正常,磁盘组也正常mount,数据库也正常启动,数据0丢失,至此完美恢复
oracle-open


类似客户恢复案例:asm disk误设置pvid导致asm diskgroup无法mount恢复
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

发表在 AIX, Oracle备份恢复 | 标签为 , , , , , | 评论关闭