ORA-27154 ORA-27300 ORA-27301 ORA-27302故障处理

根据经验对系统的内核参数做了一些调整,结果导致数据库启动失败提示报ORA-27154 ORA-27300 ORA-27301 ORA-27302错误

ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpcreates

ORA-27154


根据官方描述:Database Startup Fails with ORA-27300: OS system dependent operation:semget failed with status: 28 (Doc ID 949468.1),出现该问题原因可能是由于kernel.sem参数配置不合适当时,该库的processes配置为:20000,kernel.sem参数配置为:kernel.sem = 250 32000 100 128,参数说明:

kernel.sem = SEMMSL SEMMNS SEMOPM SEMMNI
SEMMSL - max semaphores per array
SEMMNS - max semaphores system wide
SEMOPM - max ops per semop call
SEMMNI - max number of arrays

理论上每个这样的配置最大值SEMMSL*SEMMNI=32000大于process的20000的设置,可是实际上控制每个信号集的信号数量没有达到250,而是只有156,通过ipcs命令可以看

[oracle@xifenfei ~]$ ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
0x00000000 32768      oracle     640        33554432   30                      
0x00000000 65537      oracle     640        4261412864 30                      
0xc2d167d0 98306      oracle     640        2097152    30                      
0x00000072 131075     root       444        1          1                       

------ Semaphore Arrays --------
key        semid      owner      perms      nsems   
0x450e15bd 0 	      root       666        1
0x0000cace 32769      root       666        1
0x358b172c 327683     oracle     660        104
0x9053d038 11075588   oracle     660        156
0x9053d039 11108357   oracle     660        156
0x9053d03a 11141126   oracle     660        156
0x9053d03b 11173895   oracle     660        156

从而使得SEMMSL*SEMMNI小于processes值,进而数据库启动报ORA-27154 ORA-27300 ORA-27301 ORA-27302,修改kernel.sem = 250 64000 128 256,数据库启动成功

发表在 Oracle | 标签为 , , , | 留下评论

Patch SCN工具for Linux

前几年开发了图形化的win平台的patch scn工具:一键修改Oracle SCN工具升级(patch scn)
20221006220814
最近基于linux平台开发了命令行方式的Patch scn工具,大概使用如下:
1. 数据库必须处于mount状态才能修改scn

[oracle@iZbp11c0qyuuo1gr7j98upZ tmp]$ ./Patch_SCN   328 247884300
ERROR:Oracle Database must be in Mount state in order to modify SCN

2. Patch_SCN参数提示

[oracle@iZbp11c0qyuuo1gr7j98upZ tmp]$ ./Patch_SCN   
Use Parameters: PID  SCN(10)  [ADDR(16,don't 0x)]

3.修改SCN 具体操作

1)启动数据库到mount状态
SQL> startup mount;
ORACLE instance started.

Total System Global Area  551165952 bytes
Fixed Size                  2255112 bytes
Variable Size             369100536 bytes
Database Buffers          171966464 bytes
Redo Buffers                7843840 bytes
Database mounted.

2)查询数据库当前scn
SQL> select CHECKPOINT_CHANGE# from v$database;

CHECKPOINT_CHANGE#
------------------
           2478843

3)新会话中修改数据库scn
[oracle@iZbp11c0qyuuo1gr7j98upZ tmp]$ ./Patch_SCN   328 247884300
Please press ENTER to continue...
Modify the Oracle SCN value to:EC66A0C:247884300

4)启动数据库并查询scn
SQL> ALTER DATABASE OPEN;

Database altered.

SQL>  select CHECKPOINT_CHANGE# from v$database;

CHECKPOINT_CHANGE#
------------------
         247884301 (比修改scn稍微大由于数据库已经启动会自动增加scn值)

这个小工具直接通过内存地址修改scn,绕过Oracle在一些版本中的oradebug的限制:oradebug poke ORA-32521/ORA-32519故障解决,软件还处于测试阶段,后续稳定后将对外提供软件形式服务.

发表在 小工具 | 标签为 | 留下评论

awr创建snapshot等待library cache: mutex X

客户一个11.2.0.4的库,在准备收集awr的时候发现没有snap id
awr


人工创建snapshot发现hang住了
awr_snap

查询该会话等待事件为:library cache: mutex X,查看以前mmon的子进程m000/1的trace信息

Trace file /u01/app/oracle/diag/rdbms/xff/xff/trace/xff_m000_6241.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name:    Linux
Node name:      HIS
Release:        5.4.17-2102.201.3.el7uek.x86_64
Version:        #2 SMP Fri Apr 23 09:05:55 PDT 2021
Machine:        x86_64
VM name:        VMWare Version: 6
Instance name: xff
Redo thread mounted by this instance: 1
Oracle process number: 5714
Unix process pid: 6241, image: oracle@HIS (M000)


*** 2024-06-19 11:44:39.483
*** SESSION ID:(8709.38013) 2024-06-19 11:44:39.483
*** CLIENT ID:() 2024-06-19 11:44:39.483
*** SERVICE NAME:(SYS$BACKGROUND) 2024-06-19 11:44:39.483
*** MODULE NAME:(MMON_SLAVE) 2024-06-19 11:44:39.483
*** ACTION NAME:(Auto-Flush Slave Action) 2024-06-19 11:44:39.483

DDE rules only execution for: ORA 12751
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
Executing ASYNC actions
----- START DDE Action: 'ORA_12751_DUMP' (Sync) -----
Runtime exceeded 900 seconds
Time limit violation detected at:
ksedsts()+465<-kspol_12751_dump()+145<-dbgdaExecuteAction()+1065<-dbgerRunAction()+109<-dbgerRunActions()
+4134<-dbgexPhaseII()+1873<-dbgexProcessError()+2680<-dbgeExecuteForError()+88<-dbgePostErrorKGE()+2136<-
dbkePostKGE_kgsf()+71<-kgeselv()+276<-kgesecl0()+139<-kgxWait()+1412<-kgxExclusive()+447<-
kglGetMutex()+140<-kglGetHandleReference()+69<-kglic0()+319<-kksIterCursorStat()+330<-kewrrtsq_rank_topsql()
+240<-kewrbtsq_build_topsql()+128<-kewrftsq_flush_topsql()+679<-kewrft_flush_table()+397<-
kewrftec_flush_table_ehdlcx()+766<-kewrfat_flush_all_tables()+1406<-kewrfos_flush_onesnap()+170
<-kewrfsc_flush_snapshot_c()+623<-kewrafs_auto_flush_slave()+769<-kebm_slave_main()+586<-ksvrdp()+1766
<-opirip()+674<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201
<-__libc_start_main()+245
Current Wait Stack:
 0: waiting for 'library cache: mutex X'
    idn=0x644e2de0, value=0xf3a00000000, where=0x7c
    wait_id=1189 seq_num=1190 snap_id=1
    wait times: snap=3 min 0 sec, exc=3 min 0 sec, total=3 min 0 sec
    wait times: max=infinite, heur=15 min 3 sec
    wait counts: calls=16376 os=16376
    in_wait=1 iflags=0x15b2
There is at least one session blocking this session.
  Dumping 1 direct blocker(s):
    inst: 1, sid: 3898, ser: 47299
  Dumping final blocker:
    inst: 1, sid: 3898, ser: 47299
Wait State:
  fixed_waits=0 flags=0x22 boundary=(nil)/-1
Session Wait History:
    elapsed time of 0.000016 sec since current wait
 0: waited for 'library cache: mutex X'
    idn=0x644e2de0, value=0xf3a00000000, where=0x7c
    wait_id=1188 seq_num=1189 snap_id=1
    wait times: snap=12 min 2 sec, exc=12 min 2 sec, total=12 min 2 sec
    wait times: max=infinite
    wait counts: calls=65535 os=65535
    occurred after 0.327543 sec of elapsed time
 1: waited for 'db file sequential read'
    file#=0x2, block#=0x1a5b, blocks=0x1
    wait_id=1187 seq_num=1188 snap_id=1
    wait times: snap=0.000420 sec, exc=0.000420 sec, total=0.000420 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.000251 sec of elapsed time
 2: waited for 'db file sequential read'
    file#=0x1, block#=0x82e6, blocks=0x1
    wait_id=1186 seq_num=1187 snap_id=1
    wait times: snap=0.000429 sec, exc=0.000429 sec, total=0.000429 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.001085 sec of elapsed time
 3: waited for 'db file sequential read'
    file#=0x2, block#=0x11344, blocks=0x1
    wait_id=1185 seq_num=1186 snap_id=1
    wait times: snap=0.000356 sec, exc=0.000356 sec, total=0.000356 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.000008 sec of elapsed time
 4: waited for 'db file sequential read'
    file#=0x2, block#=0x19eb, blocks=0x1
    wait_id=1184 seq_num=1185 snap_id=1
    wait times: snap=0.000397 sec, exc=0.000397 sec, total=0.000397 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.000044 sec of elapsed time
 5: waited for 'db file sequential read'
    file#=0x2, block#=0xb1659, blocks=0x1
    wait_id=1183 seq_num=1184 snap_id=1
    wait times: snap=0.000003 sec, exc=0.000003 sec, total=0.000003 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.000010 sec of elapsed time
 6: waited for 'db file sequential read'
    file#=0x2, block#=0xb1658, blocks=0x1
    wait_id=1182 seq_num=1183 snap_id=1
    wait times: snap=0.000453 sec, exc=0.000453 sec, total=0.000453 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.000009 sec of elapsed time
 7: waited for 'db file sequential read'
    file#=0x2, block#=0x19e1, blocks=0x1
    wait_id=1181 seq_num=1182 snap_id=1
    wait times: snap=0.000388 sec, exc=0.000388 sec, total=0.000388 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.000017 sec of elapsed time
 8: waited for 'db file sequential read'
    file#=0x2, block#=0x19e2, blocks=0x1
    wait_id=1180 seq_num=1181 snap_id=1
    wait times: snap=0.000415 sec, exc=0.000415 sec, total=0.000415 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.004826 sec of elapsed time
 9: waited for 'db file sequential read'
    file#=0x2, block#=0x2ffc0c, blocks=0x1
    wait_id=1179 seq_num=1180 snap_id=1
    wait times: snap=0.000404 sec, exc=0.000404 sec, total=0.000404 sec
    wait times: max=infinite
    wait counts: calls=0 os=0
    occurred after 0.000007 sec of elapsed time
Sampled Session History of session 8709 serial 38013
---------------------------------------------------
The sampled session history is constructed by sampling
the target session every 1 second. The sampling process
captures at each sample if the session is in a non-idle wait,
an idle wait, or not in a wait. If the session is in a
non-idle wait then one interval is shown for all the samples
the session was in the same non-idle wait. If the
session is in an idle wait or not in a wait for
consecutive samples then one interval is shown for all
the consecutive samples. Though we display these consecutive
samples  in a single interval the session may NOT be continuously
idle or not in a wait (the sampling process does not know).

The history is displayed in reverse chronological order.

sample interval: 1 sec, max history 120 sec
---------------------------------------------------
  [118 samples,                                            11:42:39 - 11:44:39]
    waited for 'library cache: mutex X', seq_num: 1190
      p1: 'idn'=0x644e2de0
      p2: 'value'=0xf3a00000000
      p3: 'where'=0x7c
      time_waited: >= 120 sec (still in wait)
  [3 samples,                                              11:42:39 - 11:42:38]
    idle wait at each sample
---------------------------------------------------
Sampled Session History Summary:
  longest_non_idle_wait: 'library cache: mutex X'
  [118 samples, 11:42:39 - 11:44:39]
      time_waited: >= 120 sec (still in wait)
---------------------------------------------------
----- END DDE Action: 'ORA_12751_DUMP' (SUCCESS, 8 csec) -----
----- END DDE Actions Dump (total 8 csec) -----
KGX cleanup...
KGX Atomic Operation Log 0x1de44da670
 Mutex 0x1d113cf7c8(8709, 0) idn 2de0 oper EXCL(6)
 Library Cache uid 8709 efd 7 whr 49 slp 0
 oper=0 pt1=(nil) pt2=(nil) pt3=(nil)
 pt4=(nil) pt5=(nil) ub4=0
KGX cleanup...
KGX Atomic Operation Log 0x1de44da6c8
 Mutex 0x1de9468550(3898, 0) idn 644e2de0 oper GET_EXCL(5)
 Library Cache uid 8709 efd 7 whr 124 slp 16376
 oper=0 pt1=0x1de9468410 pt2=(nil) pt3=(nil)
 pt4=(nil) pt5=(nil) ub4=0
*** KEWRAFM1: Error=12751 encountered by kewrfteh
*** KEWRAFS: Error=12751 encountered by Auto Flush Slave.
KEBM: MMON slave action policy violation. kewrmafsa_; viol=1; err=12751

一般来说类似这样的系统自动任务被阻塞很可能是由于那种bug导致,找到相关mos文档: library cache: mutex x waits during AWR Flush High Cursor Scan (Doc ID 2382741.1),确认为:Bug 19294556 AWR Flush Waiting For Cursor Scan, Library Cache : Mutex X,目前没有好的workaround,而且在11.2.0.4基础版上没有对应的patch

发表在 Oracle | 标签为 , | 留下评论