标签云
asm 恢复 asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 kfed MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-01110 ORA-01555 ORA-01578 ORA-08103 ORA-600 2662 ORA-600 2663 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 ORACLE恢复 Oracle 恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (100)
- 数据库 (1,595)
- DB2 (22)
- MySQL (70)
- Oracle (1,462)
- Data Guard (49)
- EXADATA (7)
- GoldenGate (21)
- ORA-xxxxx (158)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (13)
- ORACLE 21C (3)
- Oracle ASM (65)
- Oracle Bug (7)
- Oracle RAC (47)
- Oracle 安全 (6)
- Oracle 开发 (27)
- Oracle 监听 (27)
- Oracle备份恢复 (530)
- Oracle安装升级 (83)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (75)
- PostgreSQL (16)
- PostgreSQL恢复 (4)
- SQL Server (27)
- SQL Server恢复 (8)
- TimesTen (7)
- 达梦数据库 (2)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (36)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (19)
-
最近发表
- PostgreSQL部分主要字典信息
- PostgreSQL恢复系列:pg_filedump恢复字典构造
- PostgreSQL 16 源码安装
- ORA-00742 ORA-00312 恢复
- 数据库open成功后报ORA-00353 ORA-00354错误引起的一系列问题(本质ntfs文件系统异常)
- ORA-600 ktsiseginfo1故障
- ORA-00600: internal error code, arguments: [16703], [1403], [4] 原因
- 最近遇到几起ORA-600 16703故障(tab$被清空),请引起重视
- ORA-600 2662快速恢复之Patch scn工具
- TNS-12518: TNS:listener could not hand off client connection
- ora.storage无法启动报ORA-12514故障处理
- 断电引起文件scn异常数据库恢复
- ORA-16188: LOG_ARCHIVE_CONFIG settings inconsistent with previously started instance
- .[hudsonL@cock.li].mkp勒索加密数据库完美恢复
- 模拟带库实现rman远程备份
- 又一例:ORA-600 kclchkblk_4和2662故障
- Oracle误删除数据文件恢复
- Oracle 19C 备库DML重定向—DML Redirection
- ORA-01595/ORA-600 4194处理
- 从ORA-00283 ORA-16433报错开始恢复
月归档:六月 2012
iscsiadm主要操作命令
当前包含磁盘
[root@xifenfei ~]# fdisk -l Disk /dev/sda: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sda1 * 1 2355 18916506 83 Linux /dev/sda2 2356 2610 2048287+ 82 Linux swap / Solaris Disk /dev/sdb: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdb1 1 2610 20964793+ 83 Linux Disk /dev/sdc: 2147 MB, 2147483648 bytes 255 heads, 63 sectors/track, 261 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk /dev/sdc doesn't contain a valid partition table Disk /dev/sdd: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdd1 1 100 803218+ 83 Linux /dev/sdd2 101 1000 7229250 83 Linux
查看iscsi运行情况
[root@xifenfei ~]# rpm -aq|grep iscsi iscsi-initiator-utils-6.2.0.872-10.0.1.el5 [root@xifenfei ~]# chkconfig --list |grep iscsi iscsi 0:off 1:off 2:on 3:on 4:on 5:on 6:off iscsid 0:off 1:off 2:off 3:on 4:on 5:on 6:off [root@xifenfei ~]# ps -ef|grep iscs root 2753 2 0 Jun21 ? 00:00:00 [iscsi_eh] root 15793 1 0 09:08 ? 00:00:00 brcm_iscsiuio root 15800 1 0 09:08 ? 00:00:00 iscsid root 15802 1 0 09:08 ? 00:00:00 iscsid root 19533 15269 0 10:11 pts/1 00:00:00 grep iscs
配置iscsi存储
[root@xifenfei ~]# iscsiadm -m discovery -t sendtargets -p 192.168.1.254:3260 192.168.1.254:3260,1 iqn.2006-01.com.openfiler:tsn.32b32087937b [root@xifenfei ~]# iscsiadm -m node –T iqn.2006-01.com.openfiler:tsn.32b32087937b -p 192.168.1.254:3260 -l Logging in to [iface: default, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260] Login to [iface: default, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260] successful. [root@xifenfei ~]# iscsiadm -m node –T iqn.2006-01.com.openfiler:tsn.32b32087937b -p 192.168.1.254:3260 >--op update -n node.startup -v automatic
当前包含磁盘
[root@xifenfei ~]# fdisk -l Disk /dev/sda: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sda1 * 1 2355 18916506 83 Linux /dev/sda2 2356 2610 2048287+ 82 Linux swap / Solaris Disk /dev/sdb: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdb1 1 2610 20964793+ 83 Linux Disk /dev/sdc: 2147 MB, 2147483648 bytes 255 heads, 63 sectors/track, 261 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk /dev/sdc doesn't contain a valid partition table Disk /dev/sdd: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdd1 1 100 803218+ 83 Linux /dev/sdd2 101 1000 7229250 83 Linux Disk /dev/sde: 1073 MB, 1073741824 bytes 34 heads, 61 sectors/track, 1011 cylinders Units = cylinders of 2074 * 512 = 1061888 bytes Disk /dev/sde doesn't contain a valid partition table Disk /dev/sdf: 1073 MB, 1073741824 bytes 34 heads, 61 sectors/track, 1011 cylinders Units = cylinders of 2074 * 512 = 1061888 bytes Disk /dev/sdf doesn't contain a valid partition table Disk /dev/sdg: 1073 MB, 1073741824 bytes 34 heads, 61 sectors/track, 1011 cylinders Units = cylinders of 2074 * 512 = 1061888 bytes Disk /dev/sdg doesn't contain a valid partition table
卸载iscsi存储
[root@xifenfei ~]# iscsiadm -m node --logoutall=all Logging out of session [sid: 3, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260] Logout of [sid: 3, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260] successful. [root@xifenfei ~]# iscsiadm -m node --op delete --targetname iqn.2006-01.com.openfiler:tsn.32b32087937b
当前包含磁盘
[root@xifenfei ~]# fdisk -l Disk /dev/sda: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sda1 * 1 2355 18916506 83 Linux /dev/sda2 2356 2610 2048287+ 82 Linux swap / Solaris Disk /dev/sdb: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdb1 1 2610 20964793+ 83 Linux Disk /dev/sdc: 2147 MB, 2147483648 bytes 255 heads, 63 sectors/track, 261 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk /dev/sdc doesn't contain a valid partition table Disk /dev/sdd: 21.4 GB, 21474836480 bytes 255 heads, 63 sectors/track, 2610 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdd1 1 100 803218+ 83 Linux /dev/sdd2 101 1000 7229250 83 Linux
iscsi操作总结
增加iscsi存储 (1)发现iscsi存储:iscsiadm -m discovery -t st -p ISCSI_IP (2)查看iscsi发现记录:iscsiadm -m node (3)登录iscsi存储:iscsiadm -m node -T LUN_NAME -p ISCSI_IP -l (4)开机自动: iscsiadm -m node –T LUN_NAME -p ISCSI_IP --op update -n node.startup -v automatic 删除iscsi存储 (1)登出iscsi存储 iscsiadm -m node -T LUN_NAME -p ISCSI_IP -u (2)对出iscsi所有登录 iscsiadm -m node --logoutall=all (3)删除iscsi发现记录:iscsiadm -m node -o delete -T LUN_NAME -p ISCSI_IP 登入需验证码的节点 (1)开启认证 iscsiadm -m node -T LUN_NAME -o update --name node.session.auth.authmethod --value=CHAP *.使用-o同--op (2)添加用户 iscsiadm -m node -T LUN_NAME --op update --name node.session.auth.username --value=[用户名] (3)添加密码 iscsiadm –m node –T LUN_NAME –op update –name node.session.auth.password –value=[密码]
发表在 Linux
评论关闭
gv$视图不能查询所有节点信息
今天遇到诡异的事情,AIX 5.3 ORACLE 9I RAC的gv$视图只能查询到本地的记录,而不是所有节点.但是所有节点均运行正常,除gv$视图之外未发现其他异常.
异常时节点1信息
SQL> show parameter clu; NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ cluster_database boolean TRUE cluster_database_instances integer 2 cluster_interconnects string 192.168.6.24 SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production PL/SQL Release 9.2.0.8.0 - Production CORE 9.2.0.8.0 Production TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production NLSRTL Version 9.2.0.8.0 - Production SQL> col host_name for a10 SQL> select inst_id,HOST_NAME,STATUS from gv$instance; INST_ID HOST_NAME STATUS ---------- ---------- ------------ 1 zwq_crm1 OPEN SQL> show parameter par; NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ fast_start_parallel_rollback string LOW log_parallelism integer 1 parallel_adaptive_multi_user boolean FALSE parallel_automatic_tuning boolean FALSE parallel_execution_message_size integer 2152 parallel_instance_group string parallel_max_servers integer 5 parallel_min_percent integer 0 parallel_min_servers integer 0 parallel_server boolean TRUE parallel_server_instances integer 2 parallel_threads_per_cpu integer 2 partition_view_enabled boolean FALSE recovery_parallelism integer 0 SQL> !ps -ef|grep p0 oracrm 1929258 1 0 09:49:19 - 0:01 ora_p005_crm1 oracrm 745844 1 0 Jun 26 - 0:20 ora_p004_crm1 oraeye 2421272 3948648 0 21:53:49 pts/0 0:00 grep p0 oracrm 3060406 1 0 Jun 26 - 0:20 ora_p002_crm1 oracrm 3170868 1 0 Jun 20 - 2:13 ora_p000_crm1 oracrm 787414 1 0 Jun 26 - 0:20 ora_p001_crm1 oracrm 2552690 1 0 Jun 26 - 0:20 ora_p003_crm1
1.节点最大允许5个并发进程,现在已经启动并发进程到p005(6个)
2.gv$视图只能查询一个节点信息
异常时节点2信息
问题所有情况和1节点完全相似
SQL> col host_name for a10 SQL> select inst_id,HOST_NAME,STATUS from gv$instance; INST_ID HOST_NAME STATUS ---------- ---------- ------------ 2 zwq_crm2 OPEN SQL> show parameter par; NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ fast_start_parallel_rollback string LOW log_parallelism integer 1 parallel_adaptive_multi_user boolean FALSE parallel_automatic_tuning boolean FALSE parallel_execution_message_size integer 2152 parallel_instance_group string parallel_max_servers integer 5 parallel_min_percent integer 0 parallel_min_servers integer 0 parallel_server boolean TRUE parallel_server_instances integer 2 parallel_threads_per_cpu integer 2 partition_view_enabled boolean FALSE recovery_parallelism integer 0 SQL> !ps -ef|grep p0 oracrm 1867938 1 0 15:17:25 - 0:00 ora_p004_crm2 oracrm 2633748 1 0 09:49:19 - 0:01 ora_p005_crm2 oraeye 3059876 1007714 0 21:54:01 pts/0 0:00 grep p0 oracrm 323884 1 120 Jun 20 - 10692:47 ora_p000_crm2 oracrm 1839818 1 0 06:16:32 - 0:00 ora_p003_crm2 oracrm 459660 1 107 Jun 26 - 1857:00 ora_p001_crm2 oracrm 2351894 1 0 16:52:52 - 0:00 ora_p002_crm2
在异常2节点上做10046
SQL> oradebug setmypid Statement processed. SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 4 Statement processed. SQL> select * from gv$version; INST_ID BANNER ---------- ---------------------------------------------------------------- 2 Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production 2 PL/SQL Release 9.2.0.8.0 - Production 2 CORE 9.2.0.8.0 Production 2 TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production 2 NLSRTL Version 9.2.0.8.0 - Production SQL> oradebug EVENT 10046 trace name context off Statement processed. SQL> oradebug TRACEFILE_NAME /oracle9/app/admin/crm/udump/crm2_ora_517066.trc
分析10046内容发现
PARSING IN CURSOR #1 len=24 dep=0 uid=0 oct=3 lid=0 tim=41759005850609 hv=775381991 ad='21195808' select * from gv$version END OF STMT PARSE #1:c=20000,e=33894,p=1,cr=59,cu=3,mis=1,r=0,dep=0,og=4,tim=41759005850607 BINDS #1: kxfpg1srv could not start P006, inst 1 kxfpg1srv could not start local P006 EXEC #1:c=0,e=3540,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=41759005854254 FETCH #1:c=0,e=28,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=4,tim=41759005854347 FETCH #1:c=0,e=23,p=0,cr=0,cu=0,mis=0,r=4,dep=0,og=4,tim=41759005854678 STAT #1 id=1 cnt=5 pid=0 pos=1 obj=0 op='VIEW ' STAT #1 id=2 cnt=5 pid=1 pos=1 obj=304 op='FIXED TABLE FULL X$VERSION '
通过这里观察10046可以看到:在2节点上查询gv$instance,需要通过使用并发进程去访问1节点,但是因为1节点的parallel_max_servers为5,而当前的并发进程已经达到最大数目,从而使得想在节点1上启动并发失败,进入使得gv$视图只能查询出来本节点数据
gv$视图异常解决方法
--重启两个节点,查询正常 SQL> col host_name for a10 SQL> select inst_id,HOST_NAME,STATUS from gv$instance; INST_ID HOST_NAME STATUS ---------- ---------- ------------ 1 zwq_crm1 OPEN 2 zwq_crm2 OPEN
针对这个问题,很可能是oracle bug(因为是9i版本,我无法深究),或者是并发进程僵死所致,当时有一个想法,kill 掉数据库并发进程,因客户不同意(采用稳妥重启方案),未能通过尝试验证我的猜想.
对gv$视图正常做10046
SQL> oradebug setmypid Statement processed. SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12 Statement processed. SQL> select * from gv$version; INST_ID BANNER ---------- ---------------------------------------------------------------- 1 Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production 1 PL/SQL Release 9.2.0.8.0 - Production 1 CORE 9.2.0.8.0 Production 1 TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production 1 NLSRTL Version 9.2.0.8.0 - Production 2 Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production 2 PL/SQL Release 9.2.0.8.0 - Production 2 CORE 9.2.0.8.0 Production 2 TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production 2 NLSRTL Version 9.2.0.8.0 - Production 10 rows selected. SQL> oradebug EVENT 10046 trace name context off Statement processed. SQL> oradebug TRACEFILE_NAME /oracle9/app/admin/crm/udump/crm1_ora_1708916.trc
分析10046内容
PARSING IN CURSOR #1 len=24 dep=0 uid=0 oct=3 lid=0 tim=41752681925071 hv=775381991 ad='92ef64f0' select * from gv$version END OF STMT PARSE #1:c=0,e=6770,p=0,cr=12,cu=3,mis=1,r=0,dep=0,og=4,tim=41752681925070 BINDS #1: WAIT #1: nam='PX Deq: reap credit' ela= 21 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Join ACK' ela= 237 p1=268500992 p2=1 p3=504403208016510312 WAIT #1: nam='PX Deq: reap credit' ela= 4 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Join ACK' ela= 64 p1=268500992 p2=2 p3=504403208016510312 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Join ACK' ela= 46 p1=268566528 p2=1 p3=504403208016502096 WAIT #1: nam='PX Deq: reap credit' ela= 4 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Join ACK' ela= 855 p1=268566528 p2=2 p3=504403208016502096 WAIT #1: nam='PX Deq: reap credit' ela= 4 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq Credit: send blkd' ela= 188 p1=268566528 p2=1 p3=504403208016502096 WAIT #1: nam='PX Deq: reap credit' ela= 1 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Parse Reply' ela= 406 p1=200 p2=1 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Parse Reply' ela= 7442 p1=200 p2=2 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 4 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Parse Reply' ela= 1664 p1=200 p2=1 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 EXEC #1:c=0,e=11572,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=41752681936727 WAIT #1: nam='SQL*Net message to client' ela= 1 p1=1650815232 p2=1 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Execute Reply' ela= 166 p1=200 p2=1 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Execute Reply' ela= 310 p1=200 p2=2 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 FETCH #1:c=0,e=557,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=4,tim=41752681937320 WAIT #1: nam='SQL*Net message from client' ela= 223 p1=1650815232 p2=1 p3=0 WAIT #1: nam='SQL*Net message to client' ela= 1 p1=1650815232 p2=1 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 18 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='DFS lock handle' ela= 263 p1=1128857605 p2=9 p3=5 WAIT #1: nam='DFS lock handle' ela= 254 p1=1128857605 p2=9 p3=1 WAIT #1: nam='DFS lock handle' ela= 263 p1=1128857605 p2=9 p3=3 WAIT #1: nam='DFS lock handle' ela= 73 p1=1128857605 p2=9 p3=2 WAIT #1: nam='DFS lock handle' ela= 363 p1=1128857605 p2=9 p3=2 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 24 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Signal ACK' ela= 3 p1=0 p2=1 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: Signal ACK' ela= 155 p1=10 p2=2 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 1 p1=0 p2=0 p3=0 FETCH #1:c=0,e=1735,p=0,cr=0,cu=0,mis=0,r=9,dep=0,og=4,tim=41752681939329 WAIT #1: nam='DFS lock handle' ela= 249 p1=1128857605 p2=9 p3=1 WAIT #1: nam='DFS lock handle' ela= 258 p1=1128857605 p2=9 p3=3 WAIT #1: nam='DFS lock handle' ela= 66 p1=1128857605 p2=9 p3=2 WAIT #1: nam='DFS lock handle' ela= 369 p1=1128857605 p2=9 p3=2 WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 19 p1=0 p2=0 p3=0 WAIT #1: nam='PX Deq: reap credit' ela= 1 p1=0 p2=0 p3=0 WAIT #1: nam='enqueue' ela= 201 p1=1347616774 p2=2 p3=0 WAIT #1: nam='enqueue' ela= 41 p1=1347616774 p2=2 p3=0 WAIT #1: nam='SQL*Net message from client' ela= 6981860 p1=1650815232 p2=1 p3=0 STAT #1 id=1 cnt=0 pid=0 pos=1 obj=0 op='VIEW ' STAT #1 id=2 cnt=0 pid=1 pos=1 obj=304 op='FIXED TABLE FULL X$VERSION '
通过这里可以看出,rac在正常情况下gv$视图默认的查询就是并发方式进行
发表在 Oracle
评论关闭
因未配置Hugepage会话数添增悲剧案例
今天一朋友反馈他们的一个数据库hang住了,通过ssh也不能登录系统,他们没有办法重启系统解决问题,现在想让我帮忙找出问题原因
分析awr得出
询问朋友,他们的库一般session保持在200个左右,这次突然飙升到750以上,属于异常情况
分析监听日志
看到在截图的时间内,整体访问较频繁,某个ip访问异常频繁,通过这些信息,初步怀疑是用户的数据库内存使用完,导致系统数据库hang住.
查看系统日志
Jun 26 14:35:55 result01 kernel: [5613531.566617] Free swap = 0kB Jun 26 14:35:55 result01 kernel: [5613531.566618] Total swap = 2104504kB Jun 26 14:35:55 result01 kernel: [5613531.566620] Free swap: 0kB Jun 26 14:35:55 result01 kernel: [5613531.591073] 2359296 pages of RAM Jun 26 14:35:55 result01 kernel: [5613531.591074] 318236 reserved pages Jun 26 14:35:55 result01 kernel: [5613531.591075] 73353 pages shared Jun 26 14:35:56 result01 kernel: [5613531.591076] 529 pages swap cached Jun 26 14:35:56 result01 kernel: [5613531.591079] Out of Memory: Kill process 8904 (oracle) score 891 and children. Jun 26 14:35:56 result01 kernel: [5613531.591201] Out of memory: Killed process 8904 (oracle). Jun 26 14:35:56 result01 kernel: [5613531.592280] oracle invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
通过这个日志看出系统内存和交换分区都使用完,因为内存不够,系统开始kill掉部分oracle进程.通过这些确定是系统内存使用完导致hang住可以理解.
分析hang住原因
为什么session意外的从200添增到750的时候,系统内存被使用完
cat /proc/meminfo MemTotal: 8164240 kB SwapTotal: 2104504 kB PageTables: 69732 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 Hugepagesize: 2048 kB sga_target=3674210304 pga_aggregate_target=1732247552
从这里得出几个信息:
1.数据库总内存8g,swap配置2g
2.数据库未使用Hugepage
3.数据库设置sga和pga信息
内存参数估算
数据库总计占用内存为:(3674210304+1732247552)/1024/1024=5156M(pga可能未使用完,也可能超过)
结合实际sga_target=3674210304,会话数.
保守估计下Oracle进程占用的系统内存3674210304/(4*1024)*1.5*750/1024/1024=960M
估算如果使用Hugepage Oracle进程占用系统内存为:3674210304/(2*1024*1024)*1.5*750/1024/1024=1.9M
通过这里分析Oracle总占用内存为:5156+960=6116M
通过保守计算留给系统的内存大概为:1.8G左右
因为系统的其他操作,最终导致该系统内存耗完,系统和数据库hang住
总结说明
这是一个实实在在因为linux中因为未配置Hugepage,因为用户突增,导致系统内存消耗光,从而使得系统和数据库hang住的例子.
这个库因为sga不是非常大,所以Oracle占用系统内存不是高到离谱,如果sga配置为32g,1000个session,那就会占用12g的系统内存
通过这些可以看出在linux中配置Hugepage的优点:Hugepage不光是为了减轻cpu的负担,还可以减少系统内存的消耗;在没有极端的情况下,建议linux的数据库系统配置Hugepage.