月归档:六月 2012

iscsiadm主要操作命令

当前包含磁盘

[root@xifenfei ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        2355    18916506   83  Linux
/dev/sda2            2356        2610     2048287+  82  Linux swap / Solaris

Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        2610    20964793+  83  Linux

Disk /dev/sdc: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         100      803218+  83  Linux
/dev/sdd2             101        1000     7229250   83  Linux

查看iscsi运行情况

[root@xifenfei ~]# rpm -aq|grep iscsi
iscsi-initiator-utils-6.2.0.872-10.0.1.el5
[root@xifenfei ~]#  chkconfig --list |grep iscsi
iscsi           0:off   1:off   2:on    3:on    4:on    5:on    6:off
iscsid          0:off   1:off   2:off   3:on    4:on    5:on    6:off
[root@xifenfei ~]# ps -ef|grep iscs
root      2753     2  0 Jun21 ?        00:00:00 [iscsi_eh]
root     15793     1  0 09:08 ?        00:00:00 brcm_iscsiuio
root     15800     1  0 09:08 ?        00:00:00 iscsid
root     15802     1  0 09:08 ?        00:00:00 iscsid
root     19533 15269  0 10:11 pts/1    00:00:00 grep iscs

配置iscsi存储

[root@xifenfei ~]# iscsiadm -m discovery -t sendtargets -p 192.168.1.254:3260
192.168.1.254:3260,1 iqn.2006-01.com.openfiler:tsn.32b32087937b
[root@xifenfei ~]# iscsiadm -m node –T iqn.2006-01.com.openfiler:tsn.32b32087937b -p 192.168.1.254:3260 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260]
Login to [iface: default, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260] successful.
[root@xifenfei ~]# iscsiadm -m node –T iqn.2006-01.com.openfiler:tsn.32b32087937b -p 192.168.1.254:3260 
>--op update -n node.startup -v automatic

当前包含磁盘

[root@xifenfei ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        2355    18916506   83  Linux
/dev/sda2            2356        2610     2048287+  82  Linux swap / Solaris

Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        2610    20964793+  83  Linux

Disk /dev/sdc: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         100      803218+  83  Linux
/dev/sdd2             101        1000     7229250   83  Linux

Disk /dev/sde: 1073 MB, 1073741824 bytes
34 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 2074 * 512 = 1061888 bytes

Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 1073 MB, 1073741824 bytes
34 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 2074 * 512 = 1061888 bytes

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 1073 MB, 1073741824 bytes
34 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 2074 * 512 = 1061888 bytes

Disk /dev/sdg doesn't contain a valid partition table

卸载iscsi存储

[root@xifenfei ~]# iscsiadm -m node --logoutall=all
Logging out of session [sid: 3, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260]
Logout of [sid: 3, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260] successful.
[root@xifenfei ~]# iscsiadm -m node --op delete --targetname iqn.2006-01.com.openfiler:tsn.32b32087937b

当前包含磁盘

[root@xifenfei ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        2355    18916506   83  Linux
/dev/sda2            2356        2610     2048287+  82  Linux swap / Solaris

Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        2610    20964793+  83  Linux

Disk /dev/sdc: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         100      803218+  83  Linux
/dev/sdd2             101        1000     7229250   83  Linux

iscsi操作总结

增加iscsi存储
(1)发现iscsi存储:iscsiadm -m discovery -t st -p ISCSI_IP
(2)查看iscsi发现记录:iscsiadm -m node
(3)登录iscsi存储:iscsiadm -m node -T LUN_NAME -p ISCSI_IP -l
(4)开机自动: iscsiadm -m node –T LUN_NAME -p ISCSI_IP --op update -n node.startup -v automatic

删除iscsi存储
(1)登出iscsi存储 iscsiadm -m node -T LUN_NAME -p ISCSI_IP -u
(2)对出iscsi所有登录 iscsiadm -m node --logoutall=all
(3)删除iscsi发现记录:iscsiadm -m node -o delete -T LUN_NAME -p ISCSI_IP

登入需验证码的节点
(1)开启认证
iscsiadm -m node -T LUN_NAME -o update --name node.session.auth.authmethod --value=CHAP
*.使用-o同--op
(2)添加用户
iscsiadm -m node -T LUN_NAME --op update --name node.session.auth.username --value=[用户名]
(3)添加密码
iscsiadm –m node –T LUN_NAME –op update –name node.session.auth.password –value=[密码]
发表在 Linux | 评论关闭

gv$视图不能查询所有节点信息

今天遇到诡异的事情,AIX 5.3 ORACLE 9I RAC的gv$视图只能查询到本地的记录,而不是所有节点.但是所有节点均运行正常,除gv$视图之外未发现其他异常.
异常时节点1信息

SQL> show parameter clu;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cluster_database                     boolean     TRUE
cluster_database_instances           integer     2
cluster_interconnects                string      192.168.6.24

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
PL/SQL Release 9.2.0.8.0 - Production
CORE    9.2.0.8.0       Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production
NLSRTL Version 9.2.0.8.0 - Production

SQL> col host_name for a10
SQL> select inst_id,HOST_NAME,STATUS from gv$instance;

   INST_ID HOST_NAME  STATUS
---------- ---------- ------------
         1 zwq_crm1   OPEN

SQL> show parameter par;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     FALSE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2152
parallel_instance_group              string
parallel_max_servers                 integer     5
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     TRUE
parallel_server_instances            integer     2
parallel_threads_per_cpu             integer     2
partition_view_enabled               boolean     FALSE
recovery_parallelism                 integer     0

SQL> !ps -ef|grep p0
  oracrm 1929258       1   0 09:49:19      -  0:01 ora_p005_crm1 
  oracrm  745844       1   0   Jun 26      -  0:20 ora_p004_crm1 
  oraeye 2421272 3948648   0 21:53:49  pts/0  0:00 grep p0 
  oracrm 3060406       1   0   Jun 26      -  0:20 ora_p002_crm1 
  oracrm 3170868       1   0   Jun 20      -  2:13 ora_p000_crm1 
  oracrm  787414       1   0   Jun 26      -  0:20 ora_p001_crm1 
  oracrm 2552690       1   0   Jun 26      -  0:20 ora_p003_crm1 

1.节点最大允许5个并发进程,现在已经启动并发进程到p005(6个)
2.gv$视图只能查询一个节点信息

异常时节点2信息
问题所有情况和1节点完全相似

SQL> col host_name for a10
SQL> select inst_id,HOST_NAME,STATUS from gv$instance;

   INST_ID HOST_NAME  STATUS
---------- ---------- ------------
         2 zwq_crm2   OPEN

SQL> show parameter par;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     FALSE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2152
parallel_instance_group              string
parallel_max_servers                 integer     5
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     TRUE
parallel_server_instances            integer     2
parallel_threads_per_cpu             integer     2
partition_view_enabled               boolean     FALSE
recovery_parallelism                 integer     0

SQL> !ps -ef|grep p0
  oracrm 1867938       1   0 15:17:25      -  0:00 ora_p004_crm2 
  oracrm 2633748       1   0 09:49:19      -  0:01 ora_p005_crm2 
  oraeye 3059876 1007714   0 21:54:01  pts/0  0:00 grep p0 
  oracrm  323884       1 120   Jun 20      - 10692:47 ora_p000_crm2 
  oracrm 1839818       1   0 06:16:32      -  0:00 ora_p003_crm2 
  oracrm  459660       1 107   Jun 26      - 1857:00 ora_p001_crm2 
  oracrm 2351894       1   0 16:52:52      -  0:00 ora_p002_crm2 

在异常2节点上做10046

SQL> oradebug setmypid
Statement processed.
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 4
Statement processed.
SQL> select * from gv$version;

   INST_ID BANNER
---------- ----------------------------------------------------------------
         2 Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
         2 PL/SQL Release 9.2.0.8.0 - Production
         2 CORE 9.2.0.8.0       Production
         2 TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production
         2 NLSRTL Version 9.2.0.8.0 - Production

SQL> oradebug EVENT 10046 trace name context off
Statement processed.
SQL> oradebug TRACEFILE_NAME
/oracle9/app/admin/crm/udump/crm2_ora_517066.trc

分析10046内容发现

PARSING IN CURSOR #1 len=24 dep=0 uid=0 oct=3 lid=0 tim=41759005850609 hv=775381991 ad='21195808'
select * from gv$version
END OF STMT
PARSE #1:c=20000,e=33894,p=1,cr=59,cu=3,mis=1,r=0,dep=0,og=4,tim=41759005850607
BINDS #1:
kxfpg1srv
        could not start P006, inst 1
kxfpg1srv
        could not start local P006
EXEC #1:c=0,e=3540,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=41759005854254
FETCH #1:c=0,e=28,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=4,tim=41759005854347
FETCH #1:c=0,e=23,p=0,cr=0,cu=0,mis=0,r=4,dep=0,og=4,tim=41759005854678
STAT #1 id=1 cnt=5 pid=0 pos=1 obj=0 op='VIEW  '
STAT #1 id=2 cnt=5 pid=1 pos=1 obj=304 op='FIXED TABLE FULL X$VERSION '

通过这里观察10046可以看到:在2节点上查询gv$instance,需要通过使用并发进程去访问1节点,但是因为1节点的parallel_max_servers为5,而当前的并发进程已经达到最大数目,从而使得想在节点1上启动并发失败,进入使得gv$视图只能查询出来本节点数据

gv$视图异常解决方法

--重启两个节点,查询正常
SQL> col host_name for a10
SQL> select inst_id,HOST_NAME,STATUS from gv$instance;

   INST_ID HOST_NAME  STATUS
---------- ---------- ------------
         1 zwq_crm1   OPEN
         2 zwq_crm2   OPEN

针对这个问题,很可能是oracle bug(因为是9i版本,我无法深究),或者是并发进程僵死所致,当时有一个想法,kill 掉数据库并发进程,因客户不同意(采用稳妥重启方案),未能通过尝试验证我的猜想.

对gv$视图正常做10046

SQL> oradebug setmypid
Statement processed.
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12
Statement processed.
SQL> select * from gv$version;

   INST_ID BANNER
---------- ----------------------------------------------------------------
         1 Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
         1 PL/SQL Release 9.2.0.8.0 - Production
         1 CORE 9.2.0.8.0       Production
         1 TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production
         1 NLSRTL Version 9.2.0.8.0 - Production
         2 Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
         2 PL/SQL Release 9.2.0.8.0 - Production
         2 CORE 9.2.0.8.0       Production
         2 TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production
         2 NLSRTL Version 9.2.0.8.0 - Production

10 rows selected.

SQL> oradebug EVENT 10046 trace name context off
Statement processed.
SQL> oradebug TRACEFILE_NAME
/oracle9/app/admin/crm/udump/crm1_ora_1708916.trc

分析10046内容

PARSING IN CURSOR #1 len=24 dep=0 uid=0 oct=3 lid=0 tim=41752681925071 hv=775381991 ad='92ef64f0'
select * from gv$version
END OF STMT
PARSE #1:c=0,e=6770,p=0,cr=12,cu=3,mis=1,r=0,dep=0,og=4,tim=41752681925070
BINDS #1:
WAIT #1: nam='PX Deq: reap credit' ela= 21 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Join ACK' ela= 237 p1=268500992 p2=1 p3=504403208016510312
WAIT #1: nam='PX Deq: reap credit' ela= 4 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Join ACK' ela= 64 p1=268500992 p2=2 p3=504403208016510312
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Join ACK' ela= 46 p1=268566528 p2=1 p3=504403208016502096
WAIT #1: nam='PX Deq: reap credit' ela= 4 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Join ACK' ela= 855 p1=268566528 p2=2 p3=504403208016502096
WAIT #1: nam='PX Deq: reap credit' ela= 4 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 188 p1=268566528 p2=1 p3=504403208016502096
WAIT #1: nam='PX Deq: reap credit' ela= 1 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Parse Reply' ela= 406 p1=200 p2=1 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Parse Reply' ela= 7442 p1=200 p2=2 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 4 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Parse Reply' ela= 1664 p1=200 p2=1 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
EXEC #1:c=0,e=11572,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=41752681936727
WAIT #1: nam='SQL*Net message to client' ela= 1 p1=1650815232 p2=1 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Execute Reply' ela= 166 p1=200 p2=1 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Execute Reply' ela= 310 p1=200 p2=2 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
FETCH #1:c=0,e=557,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=4,tim=41752681937320
WAIT #1: nam='SQL*Net message from client' ela= 223 p1=1650815232 p2=1 p3=0
WAIT #1: nam='SQL*Net message to client' ela= 1 p1=1650815232 p2=1 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 18 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='DFS lock handle' ela= 263 p1=1128857605 p2=9 p3=5
WAIT #1: nam='DFS lock handle' ela= 254 p1=1128857605 p2=9 p3=1
WAIT #1: nam='DFS lock handle' ela= 263 p1=1128857605 p2=9 p3=3
WAIT #1: nam='DFS lock handle' ela= 73 p1=1128857605 p2=9 p3=2
WAIT #1: nam='DFS lock handle' ela= 363 p1=1128857605 p2=9 p3=2
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 24 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Signal ACK' ela= 3 p1=0 p2=1 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: Signal ACK' ela= 155 p1=10 p2=2 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 1 p1=0 p2=0 p3=0
FETCH #1:c=0,e=1735,p=0,cr=0,cu=0,mis=0,r=9,dep=0,og=4,tim=41752681939329
WAIT #1: nam='DFS lock handle' ela= 249 p1=1128857605 p2=9 p3=1
WAIT #1: nam='DFS lock handle' ela= 258 p1=1128857605 p2=9 p3=3
WAIT #1: nam='DFS lock handle' ela= 66 p1=1128857605 p2=9 p3=2
WAIT #1: nam='DFS lock handle' ela= 369 p1=1128857605 p2=9 p3=2
WAIT #1: nam='PX Deq: reap credit' ela= 3 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 19 p1=0 p2=0 p3=0
WAIT #1: nam='PX Deq: reap credit' ela= 1 p1=0 p2=0 p3=0
WAIT #1: nam='enqueue' ela= 201 p1=1347616774 p2=2 p3=0
WAIT #1: nam='enqueue' ela= 41 p1=1347616774 p2=2 p3=0
WAIT #1: nam='SQL*Net message from client' ela= 6981860 p1=1650815232 p2=1 p3=0
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=0 op='VIEW  '
STAT #1 id=2 cnt=0 pid=1 pos=1 obj=304 op='FIXED TABLE FULL X$VERSION '

通过这里可以看出,rac在正常情况下gv$视图默认的查询就是并发方式进行

发表在 Oracle | 评论关闭

因未配置Hugepage会话数添增悲剧案例

今天一朋友反馈他们的一个数据库hang住了,通过ssh也不能登录系统,他们没有办法重启系统解决问题,现在想让我帮忙找出问题原因
分析awr得出


询问朋友,他们的库一般session保持在200个左右,这次突然飙升到750以上,属于异常情况

分析监听日志

看到在截图的时间内,整体访问较频繁,某个ip访问异常频繁,通过这些信息,初步怀疑是用户的数据库内存使用完,导致系统数据库hang住.

查看系统日志

Jun 26 14:35:55 result01 kernel: [5613531.566617] Free swap  = 0kB
Jun 26 14:35:55 result01 kernel: [5613531.566618] Total swap = 2104504kB
Jun 26 14:35:55 result01 kernel: [5613531.566620] Free swap:            0kB
Jun 26 14:35:55 result01 kernel: [5613531.591073] 2359296 pages of RAM
Jun 26 14:35:55 result01 kernel: [5613531.591074] 318236 reserved pages
Jun 26 14:35:55 result01 kernel: [5613531.591075] 73353 pages shared
Jun 26 14:35:56 result01 kernel: [5613531.591076] 529 pages swap cached
Jun 26 14:35:56 result01 kernel: [5613531.591079] Out of Memory: Kill process 8904 (oracle) score 891 and children.
Jun 26 14:35:56 result01 kernel: [5613531.591201] Out of memory: Killed process 8904 (oracle).
Jun 26 14:35:56 result01 kernel: [5613531.592280] oracle invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0

通过这个日志看出系统内存和交换分区都使用完,因为内存不够,系统开始kill掉部分oracle进程.通过这些确定是系统内存使用完导致hang住可以理解.

分析hang住原因
为什么session意外的从200添增到750的时候,系统内存被使用完

cat /proc/meminfo
MemTotal:      8164240 kB
SwapTotal:     2104504 kB
PageTables:      69732 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

sga_target=3674210304
pga_aggregate_target=1732247552

从这里得出几个信息:
1.数据库总内存8g,swap配置2g
2.数据库未使用Hugepage
3.数据库设置sga和pga信息

内存参数估算
数据库总计占用内存为:(3674210304+1732247552)/1024/1024=5156M(pga可能未使用完,也可能超过)
结合实际sga_target=3674210304,会话数.
保守估计下Oracle进程占用的系统内存3674210304/(4*1024)*1.5*750/1024/1024=960M
估算如果使用Hugepage Oracle进程占用系统内存为:3674210304/(2*1024*1024)*1.5*750/1024/1024=1.9M
通过这里分析Oracle总占用内存为:5156+960=6116M
通过保守计算留给系统的内存大概为:1.8G左右
因为系统的其他操作,最终导致该系统内存耗完,系统和数据库hang住

总结说明
这是一个实实在在因为linux中因为未配置Hugepage,因为用户突增,导致系统内存消耗光,从而使得系统和数据库hang住的例子.
这个库因为sga不是非常大,所以Oracle占用系统内存不是高到离谱,如果sga配置为32g,1000个session,那就会占用12g的系统内存
通过这些可以看出在linux中配置Hugepage的优点:Hugepage不光是为了减轻cpu的负担,还可以减少系统内存的消耗;在没有极端的情况下,建议linux的数据库系统配置Hugepage.

发表在 Oracle | 4 条评论