Troubleshooting OGG-01705

After the host rebooted unexpectedly, the OGG Replicat process failed to start and reported OGG-01705.
[Screenshot: OGG-01705 error message]


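The error message suggests that the size of the trail file written by OGG is abnormal, so check the size of that file at the OS level.
[Screenshot: trail file size at the OS level]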

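On startup, the OGG process needs to read the trail file at position 433622176, but the file size seen at the OS level is only 433609363. The file on disk is smaller, most likely because writes were lost when the OS rebooted. For this kind of problem:
In version 11.2.1.07 and later, the Replicat process can be started with the following command, which filters out records that the checkpoint table shows as already applied.
Reference: OGG Replicat Checkpoint RBA Is Larger than Local Trail Size - OGG v11.2 (Doc ID 1536741.1)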

start replicat <rep name> filterduptransactions

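For versions earlier than 11.2.1.07, use Logdump to find a suitable extrba, then reposition the Replicat with a command similar to the following.
Reference: OGG Extract / Replicat Checkpoint RBA Is Larger than Local Trail Size (Doc ID 1138409.1)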

alter rep <rep name>, extseqno 27506, extrba 92047
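
The size comparison described earlier in this post can be sketched in shell. This is a minimal sketch and not part of OGG: `rba_gap` is a hypothetical helper name, and the two numbers are the checkpoint position and the OS-level file size quoted above.

```shell
# Hypothetical helper (not an OGG tool): given a checkpoint RBA and the trail
# file size in bytes, print how far the RBA lies beyond the end of the file.
# Prints 0 when the RBA is within the file.
rba_gap() {
  rba=$1
  size=$2
  if [ "$rba" -gt "$size" ]; then
    echo $(( rba - size ))
  else
    echo 0
  fi
}

# Values from this incident: position Replicat wants to read vs. size on disk.
rba_gap 433622176 433609363   # prints 12813
```

On a live system the second argument could come from something like `stat -c %s <trail file>` (GNU coreutils), and the first from the checkpoint details shown by `info replicat <rep name>, showch`.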

Using createGoldImage to build complete 19.11 DB and Grid installation images (including the 202104 patches)

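I recently deployed a 19c RAC and applied patch 32545008 (GI Update 202104) and patch 32399816 (OJVM Update 202104). I then used createGoldImage to build installation images from the patched homes. The resulting zip files can be used directly to install GI and DB (including OJVM) with the April 2021 patches already applied.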

[oracle@dzbl1 ~]$ $ORACLE_HOME/runInstaller -createGoldImage -silent -destinationLocation /tmp/soft_img
Launching Oracle Database Setup Wizard...

Successfully Setup Software.
Gold Image location: /tmp/soft_img/db_home_2021-05-20_09-05-40PM.zip


[oracle@dzbl1 ~]$ exit
logout
[root@dzbl1 ~]# su - grid
Last login: Thu May 20 20:57:05 CST 2021
[grid@dzbl1 ~]$ ./gridSetup.sh -createGoldImage  -silent -destinationLocation /tmp/soft_img
-bash: ./gridSetup.sh: No such file or directory
[grid@dzbl1 ~]$ $ORACLE_HOME/gridSetup.sh -createGoldImage  -silent -destinationLocation /tmp/soft_img
Launching Oracle Grid Infrastructure Setup Wizard...

Successfully Setup Software.
Gold Image location: /tmp/soft_img/grid_home_2021-05-20_09-13-58PM.zip


[grid@dzbl1 ~]$ md5sum  /tmp/soft_img/grid_home_2021-05-20_09-13-58PM.zip
7cefb1be8ead8250435d5a95785d1239  /tmp/soft_img/grid_home_2021-05-20_09-13-58PM.zip
[grid@dzbl1 ~]$ md5sum /tmp/soft_img/db_home_2021-05-20_09-05-40PM.zip
325841792c44f168c524b440440773b0  /tmp/soft_img/db_home_2021-05-20_09-05-40PM.zip
[grid@dzbl1 ~]$ opatch lspatches
32585572;DBWLM RELEASE UPDATE 19.0.0.0.0 (32585572)
32584670;TOMCAT RELEASE UPDATE 19.0.0.0.0 (32584670)
32579761;OCW RELEASE UPDATE 19.11.0.0.0 (32579761)
32576499;ACFS RELEASE UPDATE 19.11.0.0.0 (32576499)
32545013;Database Release Update : 19.11.0.0.210420 (32545013)

OPatch succeeded.
[grid@dzbl1 ~]$ su - oracle
Password: 
Last login: Thu May 20 21:04:33 CST 2021 on pts/1
[oracle@dzbl1 ~]$ opatch lspatches
32399816;OJVM RELEASE UPDATE: 19.11.0.0.210420 (32399816)
32579761;OCW RELEASE UPDATE 19.11.0.0.0 (32579761)
32545013;Database Release Update : 19.11.0.0.210420 (32545013)

OPatch succeeded.
[oracle@dzbl1 ~]$ ls -l /tmp/soft_img/
total 9225956
-rw-r--r-- 1 oracle oinstall 4268265132 May 20 21:13 db_home_2021-05-20_09-05-40PM.zip
-rw-r--r-- 1 grid   oinstall 5179109549 May 20 21:21 grid_home_2021-05-20_09-13-58PM.zip
[oracle@dzbl1 ~]$ 



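Download the files to Windows, rename them following Oracle's official naming convention, and verify the MD5 checksums to confirm the files are intact.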

C:\Users\XFF>CertUtil -hashfile E:\vm_shared\LINUX.X64_1911000_grid_home.zip md5
MD5 hash of E:\vm_shared\LINUX.X64_1911000_grid_home.zip:
7cefb1be8ead8250435d5a95785d1239
CertUtil: -hashfile command completed successfully.

C:\Users\XFF>CertUtil -hashfile E:\vm_shared\LINUX.X64_1911000_db_home.zip md5
MD5 hash of E:\vm_shared\LINUX.X64_1911000_db_home.zip:
325841792c44f168c524b440440773b0
CertUtil: -hashfile command completed successfully.

With this approach you can build your own 19c installation media with the patches already included.
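
The checksum comparison done by hand above can also be scripted on the Linux side. The following is a minimal sketch, assuming GNU `md5sum` is available; `md5_matches` is a hypothetical helper name, not an Oracle tool.

```shell
# Hypothetical helper: succeed (exit status 0) only if the MD5 checksum of
# FILE equals EXPECTED, e.g. the value recorded before the file was copied.
md5_matches() {
  file=$1
  expected=$2
  # md5sum prints "<checksum>  <file name>"; keep only the checksum field.
  actual=$(md5sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Example call, using a checksum recorded in the transcript above:
# md5_matches /tmp/soft_img/grid_home_2021-05-20_09-13-58PM.zip \
#   7cefb1be8ead8250435d5a95785d1239 && echo "checksum OK"
```

Returning an exit status rather than printing text lets the check be chained with `&&` in a copy or install script.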


Problem installing 19c RAC on a public cloud: abnormal UDP on the 169.254 network

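At a customer's request we installed 19c RAC on the xx public cloud. After effort from all parties, the final state of the installation was as follows.
1. root.sh ran successfully on both nodes, CRS starts normally, and the ASM disk groups are accessible, but the ASM instance cannot start on one node and the DB instance cannot start on one node.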

---Node 1
[root@dzbl1 ~]# su - grid
Last login: Thu May 20 12:32:55 CST 2021
[grid@dzbl1 ~]$ ps -ef|grep ASM
grid       477     1  0 May19 ?        00:00:24 /u01/app/19c/grid/bin/tnslsnr ASMNET1LSNR_ASM -no_crs_notify -inherit
grid     22075 22039  0 12:42 pts/1    00:00:00 grep --color=auto ASM
[grid@dzbl1 ~]$ asmcmd
ASMCMD> lsdg
State    Type    Rebal  Sector  Logical_Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512             512   4096  4194304   1907344  1904420                0         1904420              0             N  DATA/
MOUNTED  EXTERN  N         512             512   4096  4194304   1150344  1149032                0         1149032              0             N  FRA/
MOUNTED  EXTERN  N         512             512   4096  4194304     14304    13988                0           13988              0             Y  SYSTEMDG/
ASMCMD> exit
[grid@dzbl1 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       dzbl1                    STABLE
               ONLINE  ONLINE       dzbl2                    STABLE
ora.chad
               ONLINE  ONLINE       dzbl1                    STABLE
               ONLINE  ONLINE       dzbl2                    STABLE
ora.net1.network
               ONLINE  ONLINE       dzbl1                    STABLE
               ONLINE  ONLINE       dzbl2                    STABLE
ora.ons
               ONLINE  ONLINE       dzbl1                    STABLE
               ONLINE  ONLINE       dzbl2                    STABLE
ora.proxy_advm
               OFFLINE OFFLINE      dzbl1                    STABLE
               OFFLINE OFFLINE      dzbl2                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       dzbl1                    STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.FRA.dg(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       dzbl2                    STABLE
ora.SYSTEMDG.dg(ora.asmgroup)
      1        OFFLINE OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    Started,STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       dzbl1                    STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       dzbl2                    STABLE
ora.dzbl1.vip
      1        ONLINE  ONLINE       dzbl1                    STABLE
ora.dzbl2.vip
      1        ONLINE  ONLINE       dzbl2                    STABLE
ora.dzbldb.db
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    Open,HOME=/u01/app/o
                                                             racle/product/19c/db
                                                             _1,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       dzbl2                    STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       dzbl2                    STABLE
--------------------------------------------------------------------------------
[grid@dzbl1 ~]$ 

---Node 2
[grid@dzbl2 ~]$ ps -ef|grep ASM
grid      2464     1  0 May18 ?        00:00:29 /u01/app/19c/grid/bin/tnslsnr ASMNET1LSNR_ASM -no_crs_notify -inherit
grid      6826     1  0 May19 ?        00:00:09 oracle+ASM2_asmb_dzbldb2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid     14089     1  0 12:38 ?        00:00:00 asm_m000_+ASM2
grid     15670     1  0 12:40 ?        00:00:00 oracle+ASM2_crf (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid     16503     1  0 May18 ?        00:00:05 asm_pmon_+ASM2
grid     16505     1  0 May18 ?        00:00:04 asm_clmn_+ASM2
grid     16507     1  0 May18 ?        00:00:11 asm_psp0_+ASM2
grid     16518     1  0 12:42 ?        00:00:00 oracle+ASM2 (LOCAL=NO)
grid     16562     1  0 May18 ?        00:18:22 asm_vktm_+ASM2
grid     16567     1  0 May18 ?        00:00:08 asm_gen0_+ASM2
grid     16569     1  0 May18 ?        00:00:02 asm_mman_+ASM2
grid     16573     1  0 May18 ?        00:00:06 asm_gen1_+ASM2
grid     16577     1  0 May18 ?        00:01:13 asm_diag_+ASM2
grid     16579     1  0 May18 ?        00:00:04 asm_ping_+ASM2
grid     16581     1  0 May18 ?        00:00:09 asm_pman_+ASM2
grid     16583     1  0 May18 ?        00:03:08 asm_dia0_+ASM2
grid     16585     1  0 May18 ?        00:01:41 asm_lmon_+ASM2
grid     16587     1  0 May18 ?        00:01:55 asm_lmd0_+ASM2
grid     16589     1  0 May18 ?        00:04:26 asm_lms0_+ASM2
grid     16591     1  0 May18 ?        00:02:13 asm_lmhb_+ASM2
grid     16596     1  0 May18 ?        00:00:02 asm_lck1_+ASM2
grid     16598     1  0 May18 ?        00:00:02 asm_dbw0_+ASM2
grid     16600     1  0 May18 ?        00:00:02 asm_lgwr_+ASM2
grid     16602     1  0 May18 ?        00:00:05 asm_ckpt_+ASM2
grid     16604     1  0 May18 ?        00:00:01 asm_smon_+ASM2
grid     16606     1  0 May18 ?        00:00:02 asm_lreg_+ASM2
grid     16608     1  0 May18 ?        00:00:01 asm_pxmn_+ASM2
grid     16610     1  0 May18 ?        00:00:11 asm_rbal_+ASM2
grid     16612     1  0 May18 ?        00:00:24 asm_gmon_+ASM2
grid     16614     1  0 May18 ?        00:00:06 asm_mmon_+ASM2
grid     16616     1  0 May18 ?        00:00:47 asm_mmnl_+ASM2
grid     16618     1  0 May18 ?        00:02:52 asm_imr0_+ASM2
grid     16627     1  0 May18 ?        00:00:30 asm_scm0_+ASM2
grid     16633     1  0 May18 ?        00:00:11 asm_lck0_+ASM2
grid     16662     1  0 May18 ?        00:07:10 asm_gcr0_+ASM2
grid     16699     1  0 May19 ?        00:00:00 oracle+ASM2 (LOCAL=NO)
grid     16746     1  0 May18 ?        00:00:06 asm_asmb_+ASM2
grid     16748     1  0 May18 ?        00:00:13 oracle+ASM2_asmb_+asm2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid     16756     1  0 May18 ?        00:00:00 oracle+ASM2_ocr (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid     17567     1  0 May18 ?        00:00:00 oracle+ASM2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid     17622 17536  0 12:43 pts/1    00:00:00 grep --color=auto ASM
grid     27829     1  0 May18 ?        00:00:00 oracle+ASM2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
[grid@dzbl2 ~]$ asmcmd
ASMCMD> lsdg
State    Type    Rebal  Sector  Logical_Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512             512   4096  4194304   1907344  1904420                0         1904420              0             N  DATA/
MOUNTED  EXTERN  N         512             512   4096  4194304   1150344  1149032                0         1149032              0             N  FRA/
MOUNTED  EXTERN  N         512             512   4096  4194304     14304    13988                0           13988              0             Y  SYSTEMDG/
ASMCMD> exit
[grid@dzbl2 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       dzbl1                    STABLE
               ONLINE  ONLINE       dzbl2                    STABLE
ora.chad
               ONLINE  ONLINE       dzbl1                    STABLE
               ONLINE  ONLINE       dzbl2                    STABLE
ora.net1.network
               ONLINE  ONLINE       dzbl1                    STABLE
               ONLINE  ONLINE       dzbl2                    STABLE
ora.ons
               ONLINE  ONLINE       dzbl1                    STABLE
               ONLINE  ONLINE       dzbl2                    STABLE
ora.proxy_advm
               OFFLINE OFFLINE      dzbl1                    STABLE
               OFFLINE OFFLINE      dzbl2                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       dzbl1                    STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.FRA.dg(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       dzbl2                    STABLE
ora.SYSTEMDG.dg(ora.asmgroup)
      1        OFFLINE OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    Started,STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       dzbl1                    STABLE
      2        ONLINE  ONLINE       dzbl2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       dzbl2                    STABLE
ora.dzbl1.vip
      1        ONLINE  ONLINE       dzbl1                    STABLE
ora.dzbl2.vip
      1        ONLINE  ONLINE       dzbl2                    STABLE
ora.dzbldb.db
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       dzbl2                    Open,HOME=/u01/app/o
                                                             racle/product/19c/db
                                                             _1,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       dzbl2                    STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       dzbl2                    STABLE
--------------------------------------------------------------------------------
[grid@dzbl2 ~]$ 
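
Long `crsctl stat res -t` listings like the ones above are easier to read when filtered down to the resources that should be running but are not. The following is a minimal sketch that assumes the tabular layout shown in this post; `find_down_resources` is a hypothetical helper name.

```shell
# Hypothetical helper: read `crsctl stat res -t` output on stdin and print each
# resource whose Target is ONLINE but whose State is OFFLINE.
# For cluster resources the instance number is printed after the name.
find_down_resources() {
  awk '
    # Resource name lines start at column 1 with "ora."
    /^ora\./ { name = $1; next }
    # Cluster resource rows: <instance number> <target> <state> ...
    $1 ~ /^[0-9]+$/ && $2 == "ONLINE" && $3 == "OFFLINE" { print name, $1; next }
    # Local resource rows: <target> <state> <server> ...
    $1 == "ONLINE" && $2 == "OFFLINE" { print name }
  '
}

# Example call on a cluster node:
# crsctl stat res -t | find_down_resources
```

Run against the listings above, this would report instance 1 of `ora.asm`, `ora.DATA.dg`, `ora.FRA.dg` and `ora.dzbldb.db`, plus instance 3 of `ora.ASMNET1LSNR_ASM.lsnr`.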

2. Analysis of why one DB instance and one ASM instance cannot start

--Error at instance startup
SQL>  startup
ORA-03113: end-of-file on communication channel

--Alert log on the node that cannot start
2021-05-19T12:41:32.143124+08:00
NOTE: ASMB (index:0) registering with ASM instance as Flex client 0xffffffffffffffff (reg:2449521867) (startid:1072960888) (new connection)
2021-05-19T12:41:32.349766+08:00
My CSS node number is 1
My CSS hostname is dzbl1
lmon registered with NM - instance number 1 (internal mem no 0)
2021-05-19T12:41:34.054865+08:00
Using default pga_aggregate_limit of 16384 MB
2021-05-19T12:42:16.978085+08:00
No connectivity to other instances in the cluster during startup. Hence, LMON is terminating the instance. Please check the LMON trace file for details.
 Also, please check the network logs of this instance along with clusterwide network health for problems and then re-start this instance.
LMON (ospid: ): terminating the instance due to ORA error
Cause - 'Instance is being terminated by LMON'
2021-05-19T12:42:17.115807+08:00
System state dump requested by (instance=1, osid=29660 (LMON)), summary=[abnormal instance termination]. error - 'Instance is terminating.
System State dumped to trace file /u01/app/oracle/diag/rdbms/dzbldb/dzbldb1/trace/dzbldb1_diag_29641.trc
2021-05-19T12:42:17.227469+08:00
Dumping diagnostic data in directory=[cdmp_20210519124217], requested by (instance=1, osid=29660 (LMON)), summary=[abnormal instance termination].
2021-05-19T12:42:18.344481+08:00
Instance terminated by LMON, pid = 29660

--LMON trace on the healthy node
*** 2021-05-19T12:42:29.348455+08:00
IPCLW:[0.16]{-}[CNCT]:PROTO: [1621399349248289]Warning! ACNH://0x7f3d993a7990/peer=[UNKNWN]&ospid=0&msn=993097808&seq=995707504
  (169.254.14.18:32056) has outstanding sends during delete.
IPCLW:[0.17]{-}[CNCT]:UTIL: [1621399349248289]  ACNH 0x7f3d993a7990 State: 2 SMSN: 993097806 PKT(993097808.995707504) # Pending: 2
IPCLW:[0.18]{-}[CNCT]:UTIL: [1621399349248289]   Peer: [UNKNWN].0 AckSeq: 0
IPCLW:[0.19]{-}[CNCT]:UTIL: [1621399349248289]   Flags: 0x40000000 IHint: 0x30693d920000001f THint: 0x0
IPCLW:[0.20]{-}[CNCT]:UTIL: [1621399349248289]   Local Address: 169.254.17.231:19443 Remote Address: 169.254.14.18:32056
IPCLW:[0.21]{-}[CNCT]:UTIL: [1621399349248289]   Remote PID: ver 0 flags 1 trans 2 tos 0 opts 0 xdata3 165f xdata2 70dbd629
IPCLW:[0.22]{-}[CNCT]:UTIL: [1621399349248289]             : mmsz 32768 mmr 4096 mms 4096 xdata c2a71bf9
IPCLW:[0.23]{-}[CNCT]:UTIL: [1621399349248289]   IVPort: 46944 TVPort: 7161 IMPT: 25433 RMPT: 5727   Pending Sends: Yes Unacked Sends: Yes
IPCLW:[0.24]{-}[CNCT]:UTIL: [1621399349248289]   Send Engine Queued: No sshdl -1 ssts 0 rtts 0 snderrchk 0 creqcnt 19 credits 0/0
IPCLW:[0.25]{-}[CNCT]:UTIL: [1621399349248289]   Unackd Messages 993097806 -> 993097807. SSEQ 995707502 Send Time: 
                                                  INVALID TIME SMSN # Xmits: 0 EMSN INVALID TIME
IPCLW:[0.26]{-}[CNCT]:UTIL: [1621399349248289]  Pending send queue:
IPCLW:[0.27]{-}[CNCT]:UTIL: [1621399349248289]    [0] mbuf 0x7f3d99397770 MSN 993097806 Seq 995707502 -> 995707503 # XMits: 0
IPCLW:[0.28]{-}[CNCT]:UTIL: [1621399349248289]    [1] mbuf 0x7f3d99397350 MSN 993097807 Seq 995707503 -> 995707504 # XMits: 0
kjxgfipccb: msg 0x7f3d9934a680, mbo 0x7f3d9934a670, type 24, ack 0, ref 0, stat 34
kjxgfipccb: msg 0x7f3d9934a878, mbo 0x7f3d9934a868, type 18, ack 0, ref 0, stat 34

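The trace shows that 169.254.14.18:32056 on the failing node and 169.254.17.231:19443 cannot communicate over UDP. Reference: Only One Instance of a RAC Database Can Start at a Time: Second Instance Fails to Start due to “No reconfig messages from other instances” – LMON is terminating the instance (Doc ID 2528588.1). As a result, the ASM and DB instances can only start on one node. At this point the most likely cause appears to be some restriction that the public cloud places on the 169.254 network.
As for why the ASM disk groups are mounted and CRS starts normally on both nodes: this is because Flex ASM is in use. When the local ASM instance can start, the node uses its local ASM instance; when the local ASM instance cannot start, Flex ASM still lets the node mount the disk groups.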
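
When chasing this kind of interconnect problem, it helps to pull the endpoint pairs out of the LMON trace before testing them. The following is a minimal sketch that assumes the IPCLW line format shown in the trace above; `ipclw_peers` is a hypothetical helper name.

```shell
# Hypothetical helper: read an LMON trace on stdin and print the unique
# "<local address> <remote address>" pairs found on IPCLW lines of the form
# "... Local Address: a.b.c.d:port Remote Address: e.f.g.h:port".
ipclw_peers() {
  sed -n 's/.*Local Address: \([0-9.:]*\) Remote Address: \([0-9.:]*\).*/\1 \2/p' | sort -u
}

# Example call (the trace file name is a placeholder):
# ipclw_peers < <lmon trace file>
```

For the trace above this yields the pair `169.254.17.231:19443 169.254.14.18:32056`, which are the addresses to check for UDP reachability between the nodes.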
