网关不通致使vip/lsnr资源异常

crs_stat显示节点1的listener和vip时断时续(一会online,一会offline)

rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.devdb.db   application    ONLINE    ONLINE    rac1        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    OFFLINE               
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac2        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    OFFLINE               
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac1        
rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.devdb.db   application    ONLINE    ONLINE    rac1        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    OFFLINE               
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac2        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2

查看crsd.log日志

0Attempting to start `ora.rac1.vip` on member `rac2`
0Start of `ora.rac1.vip` on member `rac2` failed.
0startRunnable: setting CLI values
0Attempting to start `ora.rac1.vip` on member `rac1`
0Start of `ora.rac1.vip` on member `rac1` succeeded.
0startRunnable: setting CLI values
0Attempting to start `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
0Start of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
u_freem: mem passed is null
0CheckResource error for ora.rac1.vip error code = 1
0In stateChanged, ora.rac1.vip target is ONLINE
0ora.rac1.vip on rac1 went OFFLINE unexpectedly
0StopResource: setting CLI values
0Attempting to stop `ora.rac1.vip` on member `rac1`
0Stop of `ora.rac1.vip` on member `rac1` succeeded.
0ora.rac1.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
0ora.rac1.vip failed on rac1 relocating.
0StopResource: setting CLI values
0Attempting to stop `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
0Stop of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
0Attempting to start `ora.rac1.vip` on member `rac2`
0Start of `ora.rac1.vip` on member `rac2` failed.
0Attempting to start `ora.rac1.vip` on member `rac2`
0Start of `ora.rac1.vip` on member `rac2` succeeded.
0CRS-1002: Resource 'ora.rac1.vip' is already running on member 'rac2'

这里可以看出由于vip资源失败,致使lsnr资源也出现失败,紧接着又是启动vip,再启动lsnr。所以使得我们通过crs_stat -t观察资源情况时,看到这两个进程一直处于波动状态

分析ora.rac1.vip.log日志

[ora.rac1.vip]: clsrcexecut:env ORACLE_CONFIG_HOME=/u01/app/oracle/product/10.2.0/crs_1
[ora.rac1.vip]: clsrcexecut:cmd=/u01/app/oracle/product/10.2.0/crs_1/bin/racgeut -e 
_USR_ORA_DEBUG=0 54 /u01/app/oracle/product/10.2.0/crs_1/bin/racgvip check rac1
[ora.rac1.vip]: clsrcexecut: rc = 1, time = 6.430s
[ora.rac1.vip]: end for resource = ora.rac1.vip, action=check,status=1,time=6.450s
[ora.rac1.vip]: ping to 192.168.1.1 via eth0 failed, rc = 1 (host=rac1)
ping to 192.168.1.1 via eth0 failed, rc = 1 (host=rac1)
[ora.rac1.vip]: clsrcstartorp: Error with malloc
[ora.rac1.vip]: ping to 192.168.1.1 via eth0 failed, rc = 1 (host=rac1)
ping to 192.168.1.1 via eth0 failed, rc = 1 (host=rac1)
Interface eth0 checked failed (host=rac1)
Invalid parameters, or failed to bring up VIP (host=rac1)

通过这里发现:从eth0网卡ping192.168.1.1(网关)不通,导致VIP资源不能正常工作

核实问题原因/解决
我们人工从节点1上ping 网关(192.168.1.1),果真不通.继续检查发现,网关服务器上意外的开启了防火墙,对部分进来的包进行了过滤,恰好节点1在被禁止之列,使得节点1 ping 网关不成功,从而出现该了该错误.关闭防火墙或者重新设置规则后,rac工作正常,未出现vip和lsnr资源出现波动情况.

发表在 Oracle RAC | 一条评论

OCR/Vote disk 维护操作

数据库版本

SQL>  select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Prod
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

ocr测试(可以online处理)

rac2-> ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     160396
         Used space (kbytes)      :       4376
         Available space (kbytes) :     156020
         ID                       : 1302494786
         Device/File Name         : /dev/raw/raw11
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

rac2-> more /etc/oracle/ocr.loc 
ocrconfig_loc=/dev/raw/raw11
local_only=false

--增加ocr镜像
[root@rac2 bin]# ./ocrconfig -replace ocrmirror /dev/raw/raw12

rac2-> ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     160396
         Used space (kbytes)      :       4376
         Available space (kbytes) :     156020
         ID                       : 1302494786
         Device/File Name         : /dev/raw/raw11
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/raw/raw12
                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded

rac2-> more /etc/oracle/ocr.loc 
#Device/file  getting replaced by device /dev/raw/raw12 
ocrconfig_loc=/dev/raw/raw11
ocrmirrorconfig_loc=/dev/raw/raw12
local_only=false

--删除ocr
[root@rac2 bin]# ./ocrconfig -replace ocr

rac2-> more /etc/oracle/ocr.loc 
#Device/file /dev/raw/raw11 being deleted 
ocrconfig_loc=/dev/raw/raw12
local_only=false

rac2-> ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     160396
         Used space (kbytes)      :       4376
         Available space (kbytes) :     156020
         ID                       : 1302494786
         Device/File Name         : /dev/raw/raw12
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

--补充删除ocr镜像
[root@rac2 bin]# ./ocrconfig -replace ocrmirror

Vote disk测试(10g offline/11g online)

--关闭crs
[root@rac2 bin]# ./crsctl stop crs
[root@rac1 bin]# ./crsctl stop crs

--查询vote disk
rac2-> crsctl query css votedisk
 0.     0    /dev/raw/raw31

--增加vote disk
[root@rac2 bin]# ./crsctl add css votedisk /dev/raw/raw23 -force
Now formatting voting disk: /dev/raw/raw23
successful addition of votedisk /dev/raw/raw23.
[root@rac2 bin]# ./crsctl add css votedisk /dev/raw/raw33 -force
Now formatting voting disk: /dev/raw/raw33
successful addition of votedisk /dev/raw/raw33.
[root@rac2 bin]# ./crsctl add css votedisk /dev/raw/raw32 -force
Now formatting voting disk: /dev/raw/raw32
successful addition of votedisk /dev/raw/raw32.

rac2-> crsctl query css votedisk
 0.     0    /dev/raw/raw31
 1.     0    /dev/raw/raw23
 2.     0    /dev/raw/raw33
 3.     0    /dev/raw/raw32

located 4 votedisk(s).

--删除vote disk
[root@rac2 bin]# ./crsctl delete css votedisk /dev/raw/raw33 -force
successful deletion of votedisk /dev/raw/raw33.

--启动crs
[root@rac2 bin]# ./crsctl start crs
[root@rac1 bin]# ./crsctl start crs

补充官方操作说明[ID 428681.1]
//www.xifenfei.com/wp-content/uploads/2012/04/OCR_Vote_disk_Maintenance_Operations.pdf

发表在 Oracle RAC | 评论关闭

RAC 10g升级到10.2.0.5

1.Back Up database
一般情况下rman备份

2.备份ocr和vote disk

[root@rac2 bin]# ./ocrconfig -export /tmp/ocr_export.bak
[root@rac2 bin]# more /etc/oracle/ocr.loc 
ocrconfig_loc=/dev/raw/raw11
local_only=FALSE
[root@rac2 bin]# dd if=/dev/raw/raw11 of=/tmp/ocr_dd.bak
[root@rac2 bin]# dd if=/dev/raw/raw31 of=/tmp/vote_dd.bak

3.Update Oracle Time Zone Definitions
Actions for the DSTv4 update in the 10.2.0.5 patchset [ID 1086400.1]

4.Stopping All Processes
滚动升级关闭一个节点所有进程,非滚动升级关闭所有进程

$ isqlplusctl stop
$ emctl stop dbconsole
$ srvctl stop service -d db_name [-s service_name_list [-i inst_name]]
$ srvctl stop instance -d db_name -i inst_name
$ srvctl stop asm -n node
$ srvctl stop listener -n node [-l listenername]
$ srvctl stop nodeapps -n node
# CRS_home/bin/crsctl stop crs(root执行,滚动升级不需要关闭)

5.Back Up the System
$ORACLE_BASE中文件,主要包括(db和crs安装文件/oraInventory文件)

6.升级crs软件
执行./runInstaller选择crs目录

执行下面命令
# CRS_home/bin/crsctl stop crs
# CRS_home/install/root102.sh

7.升级db软件
关闭crs和db所有进程(步骤同4)
执行./runInstaller选择db目录

执行下面命令
# ORACLE_HOME/root.sh

8.升级数据库
8.1)检查数据库升级需要满足条件,对存在不合适之处,进行修正
How to Download and Run Oracle’s Database Pre-Upgrade Utility [ID 884522.1]

SQL> STARTUP UPGRADE
SQL> SPOOL upgrade_info.log 
SQL> @/rdbms/admin/utlu102i.sql
SQL> SPOOL OFF
SQL> ALTER SYSTEM SET CLUSTER_DATABASE=FALSE SCOPE=spfile;
--其他根据upgrade_info.log中提示修改
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP UPGRADE

8.2)启动监听
srvctl start listener -n node

8.3)升级数据库

SQL> SPOOL patch.log
SQL> @?/rdbms/admin/catupgrd.sql
--检查patch.log,发现有错误查找原因,重新执行catupgrd.sql脚本
SQL> SPOOL OFF
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP
SQL> @?/rdbms/admin/utlrp.sql
SQL> ALTER SYSTEM SET CLUSTER_DATABASE=TRUE SCOPE=spfile;
--包括其他修改调整参数
SQL> SHUTDOWN IMMEDIATE
--使用rac管理相关命令,启动需要启动资源

9.修改相关目录权限
# ORACLE_HOME/install/changePerm.sh

具体操作步骤请阅读README.html

发表在 Oracle RAC, Oracle安装升级 | 评论关闭