月归档:一月 2013

aix中procmap 查看oracle进程占用系统内存

procmap是用来显示进程地址空间,通过这个命令找出来的“read/write”表示为进程的私有内存,如果对应到oracle 进程的LOCAL中来,也就是对应了是oracle 会话进程占用的操作系统内存,和sga与pga无关,即ORACLE数据库进程占用的额外的系统内存,在计算oracle数据库消耗内存的时候,要考虑sga+pga+process占用的内存
procmap命令使用

$procmap 7931354
7931354 : oracleccicdx (LOCAL=NO) 
100000000            95504K  read/exec         oracle
110000035             2399K  read/write        oracle
9fffffff0000000         51K  read/exec         /usr/ccs/bin/usla64
9fffffff000cfe2          0K  read/write        /usr/ccs/bin/usla64
900000000b05930          2K  read/exec         /usr/lib/libC.a[shr3_64.o]
9001000a0122930          0K  read/write        /usr/lib/libC.a[shr3_64.o]
900000000ae6b00        118K  read/exec         /usr/lib/libC.a[shrcore_64.o]
9001000a030a100         12K  read/write        /usr/lib/libC.a[shrcore_64.o]
900000000ac8000        118K  read/exec         /usr/lib/libC.a[ansicore_64.o]
9001000a0300e00         36K  read/write        /usr/lib/libC.a[ansicore_64.o]
900000000411468          0K  read/exec         /usr/lib/libicudata.a[shr_64.o]
9001000a0121468          0K  read/write        /usr/lib/libicudata.a[shr_64.o]
90000000040f738          2K  read/exec         /usr/lib/libC.a[shr2_64.o]
9001000a0314738          0K  read/write        /usr/lib/libC.a[shr2_64.o]
9000000008dd800       1699K  read/exec         /usr/lib/libC.a[ansi_64.o]
9001000a0315a00        277K  read/write        /usr/lib/libC.a[ansi_64.o]
9000000008bab00        135K  read/exec         /usr/lib/libC.a[shr_64.o]
9001000a030eb00         19K  read/write        /usr/lib/libC.a[shr_64.o]
900000000708180       1732K  read/exec         /usr/lib/libicuuc.a[shr_64.o]
9001000a035cdac        180K  read/write        /usr/lib/libicuuc.a[shr_64.o]
900000000493d80       2510K  read/exec         /usr/lib/libicui18n.a[shr_64.o]
9001000a038a148        270K  read/write        /usr/lib/libicui18n.a[shr_64.o]
900000000473200         91K  read/exec         /usr/lib/libsrc.a[shr_64.o]
9001000a01127a8         55K  read/write        /usr/lib/libsrc.a[shr_64.o]
90000000045a300         98K  read/exec         /usr/lib/libcorcfg.a[shr_64.o]
9001000a04147c8         18K  read/write        /usr/lib/libcorcfg.a[shr_64.o]
900000000b16200        750K  read/exec         /usr/lib/liblvm.a[shr_64.o]
9001000a03dd028        219K  read/write        /usr/lib/liblvm.a[shr_64.o]
900000000444f00         82K  read/exec         /usr/lib/libcfg.a[shr_64.o]
9001000a03d58f0         26K  read/write        /usr/lib/libcfg.a[shr_64.o]
90000000040e3a0          2K  read/exec         /usr/lib/libcrypt.a[shr_64.o]
9001000a0106948          0K  read/write        /usr/lib/libcrypt.a[shr_64.o]
90000001615d860          5K  read/exec         /usr/lib/libc.a[aio_64.o]
9001000a3aed568          0K  read/write        /usr/lib/libc.a[aio_64.o]
9000000003efc00        120K  read/exec         /usr/lib/libodm.a[shr_64.o]
9001000a0107cc8         40K  read/write        /usr/lib/libodm.a[shr_64.o]
900000000bd2c80        147K  read/exec         /usr/lib/libperfstat.a[shr_64.o]
9001000a041a960         14K  read/write        /usr/lib/libperfstat.a[shr_64.o]
9000000017d7000          0K  read/exec         /usr/lib/libdl.a[shr_64.o]
9001000a0517000          0K  read/write        /usr/lib/libdl.a[shr_64.o]
9000000158ed100       8636K  read/exec         /oracle/product/db10gr2/lib/libjox10.a[shr.o]
8001000a0000b78        587K  read/write        /oracle/product/db10gr2/lib/libjox10.a[shr.o]
900000000a87000        257K  read/exec         /usr/lib/libpthreads.a[shr_xpg5_64.o]
9001000a0274000        559K  read/write        /usr/lib/libpthreads.a[shr_xpg5_64.o]
900000000000800       4025K  read/exec         /usr/lib/libc.a[shr_64.o]
9001000a0000020       1047K  read/write        /usr/lib/libc.a[shr_64.o]
         Total      121863K

简化命令,统计私有内存,procmap 7931354|grep “read/write” |awk -F ” ” ‘{print $2}’,通过相关计算的出来,在当前的操作系统和数据库版本中,一个LOCAL=NO进程占用系统内存为:5758KB

补充说明
1.操作系统版本

$oslevel -r
6100-06

2.数据库版本

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production

3.通过跟踪多个LOCAL=NO进程,发现类似进程占用的系统内存相同,估算给系统oracle进程占用的内存,可以通过该值进行大概估算
4.确认ORACLE使用的内存量不是以往认识的sga+pga,实际上应该是sga+pga+所有oracle进程占用
5.在linux中使用pmap来查看

发表在 AIX, Oracle | 2 条评论

设置_smu_debug_mode实现指定session级别使用特定回滚段

通过设置_smu_debug_mode值来实现指定session级别使用特定的回滚段
_smu_debug_mode为默认值

--测试数据库版本
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Solaris: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

--_smu_debug_mode值
SQL> select a.ksppinm name,b.ksppstvl value
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name;
Enter value for param: _smu_debug_mode
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_smu_debug_mode%')

NAME                             VALUE                   
-------------------------------- ------------------------ 
_smu_debug_mode                  0                        

--undo管理模式
SQL> show parameter undo;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     900
undo_tablespace                      string      UNDOTBS1

--指定回滚段(查询dba_rollback_segs得到回滚段名称)
SQL>  set transaction use rollback segment "_SYSSMU7_1887299474$";

Transaction set.

SQL> delete from t where rownum<10;

9 rows deleted.

--查询使用回滚段
SQL> select XIDUSN from V$TRANSACTION;

    XIDUSN
----------
         9

这里可以看到在undo自动管理模式下,我们手工指定了回滚段但是被数据库给忽略,还是使用了系统自动分配的回滚段。例如这里我指定的回滚段7,但是使用了系统自动分配的回滚段9

_smu_debug_mode=45

SQL> select * from v$version;

BANNER
-------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for 32-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> show parameter undo;

NAME                                 TYPE                VALUE
------------------------------------ -------    -----------------
undo_management                      string       AUTO
undo_retention                       integer      900
undo_tablespace                      string       undo_new

SQL> select a.ksppinm name,b.ksppstvl value
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name;
Enter value for param: _smu_debug_mode
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_smu_debug_mode%')

NAME                             VALUE                    
-------------------------------- ------------------------ 
_smu_debug_mode                  45   

/*
使用alter system set "_smu_debug_mode" = 45;配置
注意:该参数只能在system级别配置
*/                   

--测试表
SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects;

Table created.

--指定回滚段
SQL> set transaction use rollback segment "_SYSSMU15_1680736333$";

Transaction set.

SQL> delete from t_xifenfei where rownum<10;

9 rows deleted.

--查询事务回滚段
SQL> select XIDUSN from V$TRANSACTION;

    XIDUSN
----------
        15

SQL> commit;

Commit complete.

--再次指定回滚段
SQL> set transaction use rollback segment "_SYSSMU17_527554872$";

Transaction set.

SQL> delete from t_xifenfei where rownum<10;

9 rows deleted.

--查询事务回滚段
SQL> select XIDUSN from V$TRANSACTION;

    XIDUSN
----------
        17

这里可以看出来通过设置”_smu_debug_mode” = 45可以很好的实现在undo自动管理模式下,指定事务在特定的回滚段,在某些极限情况下,可以通过该操作来减少回滚段争用.

发表在 Oracle | 标签为 | 一条评论

asmlib异常报ORA-00600[kfklLibFetchNext00]

一个朋友的历史库出现故障,在linux 4的平台上asm的10.2.0.1的单库,asm使用asmlib来处理。
asm不能正常mount磁盘组,可以看到asmdisk,alert日志报ORA-00600[kfklLibFetchNext00]
操作系统内核是:2.6.9-78
oracleasmlib是:2.0.2-1
asm磁盘组mount失败

--以前故障
SQL> ALTER DISKGROUP ALL MOUNT 
Thu Sep  6 14:23:16 2012
NOTE: cache registered group DGARC number=1 incarn=0x2bf96274
NOTE: cache registered group DGDATA number=2 incarn=0x2c196275
NOTE: cache registered group DGSYS number=3 incarn=0x2c196276
Thu Sep  6 14:23:16 2012
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_10204.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Thu Sep  6 14:23:16 2012
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_10204.trc:
ORA-15183: ASMLIB initialization error [/opt/oracle/extapi/64/asm/orcl/1/libasm.so]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 1: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 1/0x2BF96274 (DGARC) 
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGARC was not mounted
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 2: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 2/0x2C196275 (DGDATA) 
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGDATA was not mounted
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 3: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 3/0x2C196276 (DGSYS) 
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGSYS was not mounted

--现在故障
Thu Jan 24 13:49:45 2013
SQL> ALTER DISKGROUP ALL MOUNT 
Thu Jan 24 13:49:45 2013
NOTE: cache registered group DGARC number=1 incarn=0xf388cee9
NOTE: cache registered group DGDATA number=2 incarn=0xf3a8ceea
NOTE: cache registered group DGSYS number=3 incarn=0xf3a8ceeb
Thu Jan 24 13:49:45 2013
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_13449.trc:
ORA-00600: internal error code, arguments: [kfklLibFetchNext00], 
[18446744073709551614], [0], [], [], [], [], []
Thu Jan 24 13:49:46 2013
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_13449.trc:
ORA-00600: internal error code, arguments: [kfklLibFetchNext00], 
[18446744073709551614], [0], [], [], [], [], []
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 1: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 1/0xF388CEE9 (DGARC) 
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGARC was not mounted
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 2: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 2/0xF3A8CEEA (DGDATA) 
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGDATA was not mounted
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 3: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 3/0xF3A8CEEB (DGSYS) 
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGSYS was not mounted
Shutting down instance: further logons disabled

trace文件信息

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
kgesinv()+33         call     kgerinv()            006469D40 ? 0064E1C58 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ? 000000001 ?
kgesinw()+166        call     kgesinv()            006469D40 ? 0064E1C58 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ? 000000001 ?
kfklLibScanNext()+2  call     kgesinw()            006469D40 ? 000000000 ?
39                                                 000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfkLibFetchNext()+3  call     kfklLibScanNext()    0064DDD70 ? 7FBFFFDCD0 ?
43                                                 000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfuitrnInit()+524    call     kfkLibFetchNext()    006469D40 ? 2A971DFF90 ?
                                                   000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfkLibIterInit()+18  call     kfuitrnInit()        006469D40 ? 2A971DFCB0 ?
0                                                  2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfkLoadAllLibs()+36  call     kfkLibIterInit()     000000000 ? 00646C7E0 ?
3                                                  2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfkDiscoverString()  call     kfkLoadAllLibs()     000000000 ? 00646C7E0 ?
+107                                               2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
Cannot find symbol
Cannot find symbol
Cannot find symbol
kfdDiscoverString()  call     kfkDiscoverString()  067A53768 ? 00646C7E0 ?
+28                                                2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfdDiscoverShallow(  call     kfdDiscoverString()  067A53768 ? 000000000 ?
)+315                                              2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfgbDriver()+1174    call     kfdDiscoverShallow(  000000180 ? 000000000 ?
                              )                    2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
ksbabs()+564         call     kfgbDriver()         7FBFFFE5C0 ? 000000048 ?
                                                   000000000 ? 000000009 ?
                                                   000000009 ? 000000000 ?
ksbrdp()+727         call     ksbabs()             7FBFFFE5C0 ? 000000048 ?
                                                   000000000 ? 000000009 ?
                                                   000000009 ? 000000000 ?
opirip()+616         call     ksbrdp()             7FBFFFE5C0 ? 000000048 ?
                                                   000000001 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FBFFFF6C8 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FBFFFF6C8 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
opimai_real()+317    call     sou2o()              7FBFFFF6A0 ? 000000032 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
main()+116           call     opimai_real()        000000003 ? 7FBFFFF730 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
<0x3c9fb1c40b>       call     main()               000000003 ? 7FBFFFF730 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
 
--------------------- Binary Stack Dump ---------------------

因为客户的库是一个历史库,基本上不怎么使用,在2012年启动asm就出现了ORA-15183错误,然后在2013年重启机器后,再次启动asm就出现了ORA-00600[kfklLibFetchNext00]错误,通过2012年的错误提示,我们大概可以判断出来该问题和ASMLIB有关系,查询mos发现429945.1,发现Call Stack Trace完全一致,可以定位是该问题(如果想深入分析,可以通过strace继续分析)

ORA-600: [kfklLibFetchNext00], [18446744073709551614], [0] when mounting diskgroup in ASM

Applies to: 
Linux OS - Version: 2.0.1-1 and later   [Release: RHEL4 and later ]
Information in this document applies to any platform.
Linux Kernel - Version: 2.0.1 

Symptoms
 3 RAC db.  
2 nodes are up and functioning except for 1 node - ASM did not come back up after 
the reboot eventhough all disks show available from asmlib's perspective:
 
Changes
 All that was done with resources were stopped on Node1 and an extra LUN added. 
 A reboot was then performed.

Cause
 The cause of the issue is libasm.o corruption

Ran the following to confirm that disks are ok:
/dev/oracleasm listdisks
/usr/sbin/asmtool -I -l /dev/oracleasm -n /dev/sdg1 -a label
/usr/sbin/oracleasm-discover 'ORCL:*'
dd if=/dev/sdg1 bs=8192 count=1 | od -c
==> output checked out fine
.
kfod asm_diskstring='ORCL:*'
==> this failed on Node1
KFOD-00600: file not found; argument [610][kfklLibFetchNext00] even though libasm.o exists

You might see the following call stack as well

----- Call Stack Trace -----

kfklLibScanNext 
kfkLibFetchNext 
kfuitrnInit 
kfkLibIterInit 
kfkLoadAllLibs 
kfkDiscoverString 
kfdDiscoverString 
kfdDiscoverShallow 
kfgbDriver 

strace showed
 
 Node1-failing
-------
stat("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", {st_mode=S_IFREG|0777, st_size=19344, ...}) = 0
 getdents64(4, /* 0 entries */, 4096) = 0 <<<<
 close(4) = 0
 open("/opt/oracle/product/10.2.0/db_1/rdbms/mesg/kfodus.msb", O_RDONLY) = -1 
 ENOENT (No such file or directory)
 open("/opt/oracle/product/10.2.0/db_1/rdbms/mesg/kfodus.msb", O_RDONLY) = -1 
 ENOENT (No such file or directory)
 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a9750d000
write(1, "KFOD-00600: file not found; argu"..., 69) = 69

Node2-working
 -----
 stat("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", {st_mode=S_IFREG|0755, st_size=19344, ...}) = 0
 open("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", O_RDONLY) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\23\0"..., 832) = 832
fstat(4, {st_mode=S_IFREG|0755, st_size=19344, ...}) = 0
mmap(NULL, 1066104, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) 0x2a9750d000

通过MOS的描述,可以明确定位到问题是:libasm.o异常导致

解决方案

To implement the solution, reinstall the ASMlib RPM
>rpm -Uvh oracleasmlib-2.0.0-1
 
This replaces the /opt/oracle/extapi/64/asm/orcl/1/libasm.so
发表在 Oracle ASM | 标签为 , , | 2 条评论