月归档:六月 2015

记录解决一次Listener状态为Not All Endpoints Registered的故障

客户反馈系统异常无法正常访问,检查发现监听异常

C:\Users\Administrator>crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rac2
ora.LISTENER.lsnr
               ONLINE  INTERMEDIATE rac2                     Not All Endpoints R
                                                             egistered
ora.asm
               ONLINE  ONLINE       rac2                     Started
ora.gsd
               OFFLINE OFFLINE      rac2
ora.net1.network
               ONLINE  ONLINE       rac2
ora.ons
               ONLINE  ONLINE       rac2
ora.registry.acfs
               ONLINE  ONLINE       rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  INTERMEDIATE rac2                     Not All Endpoints R
                                                             egistered
ora.cvu
      1        ONLINE  ONLINE       rac2
ora.oc4j
      1        ONLINE  ONLINE       rac2
ora.rac.db
      1        ONLINE  ONLINE       rac2                     Open
      2        ONLINE  OFFLINE
ora.rac1.vip
      1        ONLINE  OFFLINE
ora.rac2.vip
      1        ONLINE  OFFLINE
ora.scan1.vip
      1        ONLINE  OFFLINE

C:\Users\Administrator>lsnrctl status

LSNRCTL for 64-bit Windows: Version 11.2.0.3.0 - Production on 12-6月 -2015 15:50:43

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

正在连接到 (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
LISTENER 的 STATUS
------------------------
别名                      LISTENER
版本                      TNSLSNR for 64-bit Windows: Version 11.2.0.3.0 - Production
启动日期                  12-6月 -2015 15:31:30
正常运行时间              0 天 0 小时 19 分 20 秒
跟踪级别                  off
安全性                    ON: Local OS Authentication
SNMP                      OFF
监听程序参数文件          D:\app\11.2.0\grid\network\admin\listener.ora
监听程序日志文件          D:\app\11.2.0\grid\log\diag\tnslsnr\rac2\listener\alert\log.xml
监听端点概要...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\LISTENERipc)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.63.64.70)(PORT=1521)))
监听程序不支持服务
命令执行成功

通过这里可以看到LISTENER和LISTENER_SCAN1为Not All Endpoints Registered状态,而且这个RAC只有一个节点rac2,rac1节点未加入到集群中.进一步检查IP和hosts文件

C:\Users\Administrator>ipconfig -all

Windows IP 配置

   主机名  . . . . . . . . . . . . . : rac2
   主 DNS 后缀 . . . . . . . . . . . :
   节点类型  . . . . . . . . . . . . : 混合
   IP 路由已启用 . . . . . . . . . . : 否
   WINS 代理已启用 . . . . . . . . . : 否

以太网适配器 pub:

   连接特定的 DNS 后缀 . . . . . . . :
   描述. . . . . . . . . . . . . . . : Intel(R) 82576 Gigabit Dual Port Network Connection #2
   物理地址. . . . . . . . . . . . . : 00-25-90-5A-0F-47
   DHCP 已启用 . . . . . . . . . . . : 否
   自动配置已启用. . . . . . . . . . : 是
   本地链接 IPv6 地址. . . . . . . . : fe80::c5ef:663f:7333:45f2%12(首选)
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.70(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   默认网关. . . . . . . . . . . . . : 10.63.64.126
   DHCPv6 IAID . . . . . . . . . . . : 301999504
   DHCPv6 客户端 DUID  . . . . . . . : 00-01-00-01-1A-5C-19-A1-00-25-90-5A-0F-46
   DNS 服务器  . . . . . . . . . . . : 218.30.19.40
   TCPIP 上的 NetBIOS  . . . . . . . : 已启用

以太网适配器 priv:

   连接特定的 DNS 后缀 . . . . . . . :
   描述. . . . . . . . . . . . . . . : Intel(R) 82576 Gigabit Dual Port Network Connection
   物理地址. . . . . . . . . . . . . : 00-25-90-5A-0F-46
   DHCP 已启用 . . . . . . . . . . . : 否
   自动配置已启用. . . . . . . . . . : 是
   本地链接 IPv6 地址. . . . . . . . : fe80::c88d:78ff:d2e8:bde1%11(首选)
   IPv4 地址 . . . . . . . . . . . . : 10.10.1.2(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.0
   默认网关. . . . . . . . . . . . . :
   DHCPv6 IAID . . . . . . . . . . . : 234890640
   DHCPv6 客户端 DUID  . . . . . . . : 00-01-00-01-1A-5C-19-A1-00-25-90-5A-0F-46
   DNS 服务器  . . . . . . . . . . . : fec0:0:0:ffff::1%1
                                       fec0:0:0:ffff::2%1
                                       fec0:0:0:ffff::3%1
   TCPIP 上的 NetBIOS  . . . . . . . : 已启用

--hosts文件
10.63.64.69		rac1
10.63.64.70		rac2
10.63.64.71		rac1-vip
10.63.64.72		rac2-vip
10.63.64.73		scan-cluster
10.10.1.1		rac1-priv
10.10.1.2		rac2-priv

这里可以看到主机之上的pub网卡只有一个ip 10.63.64.70,不太符合我们对rac的理解(一般来说其上应该有vip,部分情况下甚至可能有scan ip),尝试ping vip和scan ip

C:\Users\Administrator>ping 10.63.64.72

正在 Ping 10.63.64.72 具有 32 字节的数据:
来自 10.63.64.72 的回复: 字节=32 时间<1ms TTL=128
来自 10.63.64.72 的回复: 字节=32 时间<1ms TTL=128
来自 10.63.64.72 的回复: 字节=32 时间<1ms TTL=128

10.63.64.72 的 Ping 统计信息:
    数据包: 已发送 = 3,已接收 = 3,丢失 = 0 (0% 丢失),
往返行程的估计时间(以毫秒为单位):
    最短 = 0ms,最长 = 0ms,平均 = 0ms
Control-C
^C
C:\Users\Administrator>ping 10.63.64.73

正在 Ping 10.63.64.73 具有 32 字节的数据:
来自 10.63.64.73 的回复: 字节=32 时间<1ms TTL=128
来自 10.63.64.73 的回复: 字节=32 时间<1ms TTL=128
来自 10.63.64.73 的回复: 字节=32 时间<1ms TTL=128

10.63.64.73 的 Ping 统计信息:
    数据包: 已发送 = 3,已接收 = 3,丢失 = 0 (0% 丢失),
往返行程的估计时间(以毫秒为单位):
    最短 = 0ms,最长 = 0ms,平均 = 0ms

这里发现一个异常问题:crs显示只有rac2在集群之中,而该主机ip中又不存在vip和scan ip属于异常情况,但是这两个ip又可以ping通,基于这样情况,我第一反应就是vip和scanip可能飘到rac1中了,而rac1又未正常加入到crs中(因为这个库以前处理过,由于rac1的hba卡有问题,数据库无法正常启动,crs起来也无法提供工作),检查rac1机器情况

C:\Users\Administrator>crsctl status res -t
CRS-4535: 无法与集群就绪服务通信
CRS-4000: 命令 Status 失败, 或已完成但出现错误。

C:\Users\Administrator>crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac1                     Started
ora.crf
      1        ONLINE  ONLINE       rac1
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  ONLINE       rac1
ora.cssdmonitor
      1        ONLINE  ONLINE       rac1
ora.ctssd
      1        ONLINE  ONLINE       rac1                     OBSERVER
ora.drivers.acfs
      1        ONLINE  ONLINE       rac1
ora.evmd
      1        ONLINE  ONLINE       rac1
ora.gipcd
      1        ONLINE  ONLINE       rac1
ora.gpnpd
      1        ONLINE  ONLINE       rac1
ora.mdnsd
      1        ONLINE  ONLINE       rac1

以太网适配器 pub:

   连接特定的 DNS 后缀 . . . . . . . :
   描述. . . . . . . . . . . . . . . : Intel(R) 82576 Gigabit Dual Port Network Connection
   物理地址. . . . . . . . . . . . . : 00-25-90-5A-0E-E7
   DHCP 已启用 . . . . . . . . . . . : 否
   自动配置已启用. . . . . . . . . . : 是
   本地链接 IPv6 地址. . . . . . . . : fe80::409d:8c2e:446b:af42%11(首选)
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.69(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.71(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.72(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.73(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   默认网关. . . . . . . . . . . . . : 10.63.64.126
   DHCPv6 IAID . . . . . . . . . . . : 234890640
   DHCPv6 客户端 DUID  . . . . . . . : 00-01-00-01-1A-5C-19-0A-00-25-90-5A-0E-E7
   DNS 服务器  . . . . . . . . . . . : 8.8.8.8
   TCPIP 上的 NetBIOS  . . . . . . . : 已启用

以太网适配器 priv:

   连接特定的 DNS 后缀 . . . . . . . :
   描述. . . . . . . . . . . . . . . : Intel(R) 82576 Gigabit Dual Port Network Connection #2
   物理地址. . . . . . . . . . . . . : 00-25-90-5A-0E-E6
   DHCP 已启用 . . . . . . . . . . . : 否
   自动配置已启用. . . . . . . . . . : 是
   本地链接 IPv6 地址. . . . . . . . : fe80::154:dad7:f9e3:bea3%13(首选)
   IPv4 地址 . . . . . . . . . . . . : 10.10.1.1(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.0
   默认网关. . . . . . . . . . . . . :
   DHCPv6 IAID . . . . . . . . . . . : 301999504
   DHCPv6 客户端 DUID  . . . . . . . : 00-01-00-01-1A-5C-19-0A-00-25-90-5A-0E-E7
   DNS 服务器  . . . . . . . . . . . : fec0:0:0:ffff::1%1
                                       fec0:0:0:ffff::2%1
                                       fec0:0:0:ffff::3%1
   TCPIP 上的 NetBIOS  . . . . . . . : 已启用

果然这里rac2的vip和scan ip都漂到rac1中,但是crs状态属于不正常情况,由于rac1无法正常使用,关闭该主机,并重启rac2(由于rac2处于异常情况无法正常工作),后续rac2恢复正常

C:\Users\Administrator>crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rac2
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac2
ora.asm
               ONLINE  ONLINE       rac2                     Started
ora.gsd
               OFFLINE OFFLINE      rac2
ora.net1.network
               ONLINE  ONLINE       rac2
ora.ons
               ONLINE  ONLINE       rac2
ora.registry.acfs
               ONLINE  ONLINE       rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac2
ora.cvu
      1        ONLINE  ONLINE       rac2
ora.oc4j
      1        ONLINE  ONLINE       rac2
ora.rac.db
      1        OFFLINE OFFLINE                               Instance Shutdown
      2        ONLINE  ONLINE       rac2                     Open
ora.rac1.vip
      1        ONLINE  INTERMEDIATE rac2                     FAILED OVER
ora.rac2.vip
      1        ONLINE  ONLINE       rac2
ora.scan1.vip
      1        ONLINE  ONLINE       rac2

C:\Users\Administrator>lsnrctl status

LSNRCTL for 64-bit Windows: Version 11.2.0.3.0 - Production on 12-6月 -2015 17:02:46

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

正在连接到 (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
LISTENER 的 STATUS
------------------------
别名                      LISTENER
版本                      TNSLSNR for 64-bit Windows: Version 11.2.0.3.0 - Production
启动日期                  12-6月 -2015 16:44:43
正常运行时间              0 天 0 小时 18 分 3 秒
跟踪级别                  off
安全性                    ON: Local OS Authentication
SNMP                      OFF
监听程序参数文件          D:\app\11.2.0\grid\network\admin\listener.ora
监听程序日志文件          D:\app\11.2.0\grid\log\diag\tnslsnr\rac2\listener\alert\log.xml
监听端点概要...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\LISTENERipc)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.63.64.70)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.63.64.72)(PORT=1521)))
服务摘要..
服务 "+ASM" 包含 1 个实例。
  实例 "+asm2", 状态 READY, 包含此服务的 1 个处理程序...
服务 "rac" 包含 1 个实例。
  实例 "rac2", 状态 READY, 包含此服务的 1 个处理程序...
服务 "racXDB" 包含 1 个实例。
  实例 "rac2", 状态 READY, 包含此服务的 1 个处理程序...
命令执行成功

出现该问题的原因至此可以总结出来:由于rac1和rac2的集群处于异常状态,rac1持有了vip和scan ip,但是又未正常加入crs,导致rac2无法获得vip和scan ip,从而使得LISTENER和LISTENER_SCAN1为Not All Endpoints Registered状态.另外对于不能正常工作的集群节点,建议关闭crs,甚至可以考虑关闭主机,减少异常节点对正常节点的影响.关于该类问题的分析,可以从Scan Listener In INTERMEDIATE Mode Not All Endpoints Registered (Doc ID 1667873.1)中找到依据,证明是由于IP被占用导致.

发表在 Oracle RAC | 标签为 | 评论关闭

ORA-00230: operation disallowed: snapshot control file enqueue unavailable

rman 备份控制文件报ORA-00230: operation disallowed: snapshot control file enqueue unavailable错误

db1:/home/oracle>$rman target /

Recovery Manager: Release 10.2.0.3.0 - Production on Wed Jun 10 16:00:08 2015

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: ORCL (DBID=1150889877)

RMAN> backup current controlfile format '/tmp/xifenfei.ctl';

Starting backup at 10-JUN-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=628 instance=orcl1 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
waiting for snapshot control file enqueue
waiting for snapshot control file enqueue
waiting for snapshot control file enqueue
waiting for snapshot control file enqueue
waiting for snapshot control file enqueue
cannot make a snapshot control file
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 06/10/2015 16:03:10
ORA-00230: operation disallowed: snapshot control file enqueue unavailable

查看持有CF enqueue会话

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> SELECT s.SID, USERNAME AS "User", PROGRAM, MODULE,
  2  ACTION, LOGON_TIME "Logon"
  3  FROM V$SESSION s, V$ENQUEUE_LOCK l
  4  WHERE l.SID = s.SID
  5  AND l.TYPE = 'CF'
  6  AND l.ID1 = 0
  7  AND l.ID2 = 2;

       SID User
---------- ------------------------------
PROGRAM
------------------------------------------------
MODULE
------------------------------------------------
ACTION                           Logon
-------------------------------- ------------
       648 SYS
rman@db1 (TNS V1-V3)
backup full datafile
0000152 STARTED111               03-JUN-15

kill相关session

SQL> select spid from v$process where addr in(select paddr from v$session where sid=648);

SPID
------------
40108238

SQL> !ps -ef|grep 40108238
  oracle 39125244 65011720   0 15:59:27  pts/0  0:00 grep 40108238
  oracle 40108238        1   0   Jun 03      -  1:18 oracleorcl1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))


db1:/home/oracle>$kill -9 40108238

再次测试备份控制文件–OK

db1:/home/oracle>$rman target /

Recovery Manager: Release 10.2.0.3.0 - Production on Wed Jun 10 16:05:06 2015

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: ORCL (DBID=1150889877)

RMAN> backup current controlfile format '/tmp/xifenfei.ctl';

Starting backup at 10-JUN-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=632 instance=orcl1 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
channel ORA_DISK_1: starting piece 1 at 10-JUN-15
channel ORA_DISK_1: finished piece 1 at 10-JUN-15
piece handle=/tmp/xifenfei.ctl tag=TAG20150610T160516 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 10-JUN-15
发表在 ORA-xxxxx, rman备份/恢复 | 标签为 | 评论关闭

ORA-21561: OID generation failed故障解决

数据库无法登陆报ORA-21561: OID generation failed错误

[oracle@essc ~]$ sqlplus XIFENFEI/"www.xifenfei.com"@172.16.50.200/orcl

SQL*Plus: Release 11.2.0.4.0 Production on Mon Jun 1 16:52:29 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-21561: OID generation failed


Enter user-name: 
ERROR:
ORA-12545: Connect failed because target host or object does not exist


Enter user-name: 
ERROR:
ORA-12545: Connect failed because target host or object does not exist


SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

主机名无法ping通

[root@essc ~]# ping essc
ping: unknown host essc
[root@essc ~]# hostname
essc

[root@essc ~]# more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.10.11 app1

[oracle@essc ~]$ ifconfig
eth3      Link encap:Ethernet  HWaddr 00:50:56:BB:00:6B  
          inet addr:172.16.10.30  Bcast:172.16.10.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:febb:6b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9597 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4018 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1549999 (1.4 MiB)  TX bytes:470158 (459.1 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:164 errors:0 dropped:0 overruns:0 frame:0
          TX packets:164 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:11424 (11.1 KiB)  TX bytes:11424 (11.1 KiB)

修改hosts文件
让hosts中含主机名,也就是为了主机名能够ping通

[oracle@essc ~]$ more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.10.11 app1
172.16.10.30 essc

数据库登录测试

[oracle@essc ~]$ sqlplus XIFENFEI/"www.xifenfei.com"@172.16.50.200/orcl

SQL*Plus: Release 11.2.0.4.0 Production on Mon Jun 1 16:56:39 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> 
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

具体见官方说明
Ora-21561: OID Generation Failed (Doc ID 1335327.1)

APPLIES TO:
Oracle Net Services - Version 9.2.0.8 to 11.2.0.3 [Release 9.2 to 11.2]
Information in this document applies to any platform.
***Checked for relevance on 14-Jan-2013*** 

SYMPTOMS
When attempting to connect to the database using SQL*Plus or DBCA, 
the following error occurs intermittently:

ERROR
-----------------------
ORA-21561: OID generation failed

CAUSE
This could be caused by not having the host name for the 
target database fully qualified in the hosts file.
To verify if you are hitting this issue, the following symptoms should be met:

- ORA-21561: OID generation failed.
- Hosts file has un-fully qualified entry for the target database host:

127.0.0.1 loopback localhost # loopback (lo0) name/address 
10.210.9.111 dbhost
In this sample, dbhost is the target db host.

This is reported in an unpublished Bug 12597261: 
"ORA-21561 IF HOSTNAME ENVIRONMENT VARIABLE IS NOT FULLY QUALIFIED",
 which should be resolved as of 12G.

SOLUTION
Modify the hosts file to have the fully qualified host names, 
by adding the fully qualified domain name to the entry.

127.0.0.1 loopback localhost # loopback (lo0) name/address 
10.210.9.111 dbhost.sample.com

Verify that other environment and service handles are properly defined as well.

If this is a Windows environment, please check : 
Windows: Connections Fail with ORA-12640 or ORA-21561 (Doc ID 744125.1)

再次建议:修改主机名请修改完全,具体参考:linux上安装oracle10g注意事项中修改主机名部分

发表在 ORA-xxxxx | 标签为 , , , | 评论关闭