标签云
asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-00742 ORA-01110 ORA-01555 ORA-01578 ORA-01595 ORA-08103 ORA-600 2131 ORA-600 2662 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 Oracle 恢复 ORACLE恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (103)
- 数据库 (1,769)
- DB2 (22)
- MySQL (77)
- Oracle (1,610)
- Data Guard (52)
- EXADATA (8)
- GoldenGate (24)
- ORA-xxxxx (166)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (15)
- ORACLE 21C (3)
- Oracle 23ai (8)
- Oracle ASM (69)
- Oracle Bug (8)
- Oracle RAC (54)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (29)
- Oracle备份恢复 (592)
- Oracle安装升级 (97)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (86)
- PostgreSQL (30)
- pdu工具 (6)
- PostgreSQL恢复 (9)
- SQL Server (32)
- SQL Server恢复 (13)
- TimesTen (7)
- 达梦数据库 (3)
- 达梦恢复 (1)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (39)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (22)
-
最近发表
- 2025年的Oracle 8.0.5数据库恢复
- ORA-600 kokiasg1故障分析(obj$中核心字典序列全部被恶意删除)
- ORA-00756 ORA-10567故障数据0丢失恢复
- 数据库文件变成32k故障恢复
- tcp连接过多导致监听TNS-12532 TNS-12560 TNS-00502错误
- 文件系统格式化MySQL数据库恢复
- .sstop勒索加密数据库恢复
- 解决一次硬件恢复之后数据文件0kb的故障恢复case
- Error in invoking target ‘libasmclntsh19.ohso libasmperl19.ohso client_sharedlib’问题处理
- ORA-01171: datafile N going offline due to error advancing checkpoint
- linux环境oracle数据库被文件系统勒索加密为.babyk扩展名溯源
- ORA-600 ksvworkmsgalloc: bad reaper
- ORA-600 krccfl_chunk故障处理
- Oracle Recovery Tools恢复案例总结—202505
- ORA-600 kddummy_blkchk 数据库循环重启
- 记录一次asm disk加入到vg通过恢复直接open库的案例
- CHECKDB 发现了 N 个分配错误和 M 个一致性错误
- 达梦数据库dm.ctl文件异常恢复
- Oracle Recovery Tools修复ORA-00742、ORA-600 ktbair2: illegal inheritance故障
- 可能是 tempdb 空间用尽或某个系统表不一致故障处理
分类目录归档:Oracle
因未配置Hugepage会话数添增悲剧案例
今天一朋友反馈他们的一个数据库hang住了,通过ssh也不能登录系统,他们没有办法重启系统解决问题,现在想让我帮忙找出问题原因
分析awr得出
询问朋友,他们的库一般session保持在200个左右,这次突然飙升到750以上,属于异常情况
分析监听日志
看到在截图的时间内,整体访问较频繁,某个ip访问异常频繁,通过这些信息,初步怀疑是用户的数据库内存使用完,导致系统数据库hang住.
查看系统日志
Jun 26 14:35:55 result01 kernel: [5613531.566617] Free swap = 0kB Jun 26 14:35:55 result01 kernel: [5613531.566618] Total swap = 2104504kB Jun 26 14:35:55 result01 kernel: [5613531.566620] Free swap: 0kB Jun 26 14:35:55 result01 kernel: [5613531.591073] 2359296 pages of RAM Jun 26 14:35:55 result01 kernel: [5613531.591074] 318236 reserved pages Jun 26 14:35:55 result01 kernel: [5613531.591075] 73353 pages shared Jun 26 14:35:56 result01 kernel: [5613531.591076] 529 pages swap cached Jun 26 14:35:56 result01 kernel: [5613531.591079] Out of Memory: Kill process 8904 (oracle) score 891 and children. Jun 26 14:35:56 result01 kernel: [5613531.591201] Out of memory: Killed process 8904 (oracle). Jun 26 14:35:56 result01 kernel: [5613531.592280] oracle invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
通过这个日志看出系统内存和交换分区都使用完,因为内存不够,系统开始kill掉部分oracle进程.通过这些确定是系统内存使用完导致hang住可以理解.
分析hang住原因
为什么session意外的从200添增到750的时候,系统内存被使用完
cat /proc/meminfo MemTotal: 8164240 kB SwapTotal: 2104504 kB PageTables: 69732 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 Hugepagesize: 2048 kB sga_target=3674210304 pga_aggregate_target=1732247552
从这里得出几个信息:
1.数据库总内存8g,swap配置2g
2.数据库未使用Hugepage
3.数据库设置sga和pga信息
内存参数估算
数据库总计占用内存为:(3674210304+1732247552)/1024/1024=5156M(pga可能未使用完,也可能超过)
结合实际sga_target=3674210304,会话数.
保守估计下Oracle进程占用的系统内存3674210304/(4*1024)*1.5*750/1024/1024=960M
估算如果使用Hugepage Oracle进程占用系统内存为:3674210304/(2*1024*1024)*1.5*750/1024/1024=1.9M
通过这里分析Oracle总占用内存为:5156+960=6116M
通过保守计算留给系统的内存大概为:1.8G左右
因为系统的其他操作,最终导致该系统内存耗完,系统和数据库hang住
总结说明
这是一个实实在在因为linux中因为未配置Hugepage,因为用户突增,导致系统内存消耗光,从而使得系统和数据库hang住的例子.
这个库因为sga不是非常大,所以Oracle占用系统内存不是高到离谱,如果sga配置为32g,1000个session,那就会占用12g的系统内存
通过这些可以看出在linux中配置Hugepage的优点:Hugepage不光是为了减轻cpu的负担,还可以减少系统内存的消耗;在没有极端的情况下,建议linux的数据库系统配置Hugepage.
To find the TX Enqueue contention in a RAC or OPS environment
今天查找TX Enqueue看到的一篇文章,拿出来共享下
PURPOSE ------------- To find the TX Enqueue contention in a RAC or OPS environment What is TX Enqueue ? In one word oracle is maintaining queue for transaction. How Many Resources ? 1/ active transaction How Many Locks? 1/transaction + 1/process waiting for a locked row by that transaction. How Many Users? 1 + 1/ process waiting for something locked by this transaction. Who Uses? All processes What need to investigate? The mode of TX (6/4), Holding/Waiting/Requesting SCOPE & APPLICATION ===================== This document will help to analyze the application design related to transaction bottlenecks and database performance tuning. Let start with an example: =================== create table akdas (A1 number, Col1 Varchar2(10), Col2 Varchar2(10)); insert into akdas values(5,'Hello','Hi'); insert into akdas values(6,'Sudip','Datta'); insert into akdas values(7,'Preetam','Roy'); insert into akdas values(8,'Michael','Polaski'); From Node 1: ========== update akdas set a1=11 where a1=6; From Node 2: ========== update akdas set a1=12 where a1=7; update akdas set a1=11 where a1=6; /* this will wait for Node1: to complete the transaction */ This Note Is Made To Analyzing Only the TX-Mode-6 (Exclusive). 1. Now run the following query to track down the problem: Who is waiting =================================================================== prompt prompt Query 1. Waiting for TX Enqueue where mode is Exclusive prompt ===================================== prompt set linesize 100 set pagesize 66 col c1 for a15 col c1 heading "Program Name " select l.inst_id,l.SID,program c1,l.TYPE,l.ID1,l.ID2,l.LMODE,l.REQUEST from gv$lock l,gv$session s where l.type like 'TX' and l.REQUEST =6 and l.inst_id=s.inst_id and l.sid=s.sid order by id1 / Output will be here =============== INST_ID SID Program Name TY ID1 ID2 LMODE REQUEST ----------- ---------- ------------------ --- -------- -------- ---------- -------- 2 13 sqlplus@opcbsol TX 393236 780 0 6 2 (TNS V1-V3) It is clear that SID 12 of instance 2 is doing a DML and waiting on REQUEST Mode 6. 2. Let's run the next query to find who is holding =========================================== prompt prompt prompt Query 2. Holding for TX Enqueue where mode greater than 6 prompt ======================================= prompt set linesize 100 set pagesize 66 col c1 for a15 col c1 heading "Program Name " select l.inst_id,l.SID,program c1,l.TYPE,l.ID1,l.ID2,l.LMODE,l.REQUEST from gv$lock l,gv$session s where l.type like 'TX' and l.LMODE =6 and (l.ID1,l.ID2) in (select id1,id2 from gv$lock where type like 'TX' and REQUEST =6) and l.inst_id=s.inst_id and l.sid=s.sid order by id1 / Output will be here =============== INST_ID SID Program Name TY ID1 ID2 LMODE REQUEST ---------- ---------- -------------- --- ---------- -------- ----------- -------- 1 12 sqlplus@opcbsol TX 393236 780 6 0 1 (TNS V1-V3) So holder is SID 12 on instance 1. Where LMODE = 6. 3. Let's find out the exact file#, block# and Record# where it is waiting =============================================================== prompt prompt prompt Query 3. Object# ,File#, Block# and Slot# TX Enqueue in detail prompt ======================================== prompt set linesize 110 col c0 for 999 col c0 heading "INS" col c1 for a15 col c1 heading "Program Name " select inst_id c0,sid,program c1,ROW_WAIT_OBJ# object_no, ROW_WAIT_FILE# Rfile_no, ROW_WAIT_BLOCK# Block_no ,ROW_WAIT_ROW# Row_no from gv$session where (inst_id,sid) in (select inst_id,sid from gv$session_wait where p1='1415053318') / Output Will be here =============== INS SID Program Name OBJECT_NO RFILE_NO BLOCK_NO ROW_NO ----- ---------- ------------- --------------- --------- ------- 2 13 sqlplus@opcbsol 7261 9 12346 1 2 (TNS V1-V3) From the output, it is clear that it is waiting on Relative_File# 9, Block# 12346, Row Number 1. Here Row Number 1 means the slot number in the block 12346. This Row_No start from 0 (zero). 4. Let's Find the object details ============================= prompt prompt prompt Query 4. Object Involve for TX Enqueue in detail prompt =============================== prompt set linesize 100 set pagesize 100 col owner for a10 col object_name for a20 col object_type for a10 select owner,object_name,object_id,object_type from dba_objects where object_id in (select ROW_WAIT_OBJ# from gv$session where (inst_id, sid) in (select inst_id,sid from gv$session_wait where p1='1415053318')) / Output Will be here =============== OWNER OBJECT_NAME OBJECT_ID OBJECT_TYP --------- ------------ -------- ----------- AKDAS AKDAS 7261 TABLE 5. Let’s find the row value details ============================= prompt prompt prompt Query 5. Finding the row value prompt ==================== prompt select * from <Owner>.<Table Name> where rowid like DBMS_ROWID.ROWID_CREATE(1,&Object_No,&Rfile_No, &Block_No, &Row_Number) / From query 3 and 4 we will get the value for all variables. Owner = AKDAS Table_Name = AKDAS Object_No = 7261 Rfile_No = 9 Block_No = 12346 Row_Number = 1 Output Will be here =============== A1 Col1 Col2 ---------- --------------- ---------- 6 Hello Hi So we can drag down to the row value where TX Enqueue contention exists. 6. Let’s find the user activity that is "Holder" and "Waiter" ==================================================== set linesize 120 set pagesize 66 col c0 for 999 col c0 heading "INS" col c1 for a9 col c1 heading "OS User" col c2 for a9 col c2 heading "Oracle User" col c3 for a15 col c3 heading "Program Name" col b1 for a9 col b1 heading "Unix PID" col b2 for 9999 justify left col b2 heading "ORA SID" col b3 for 999999 justify left col b3 heading "SERIAL#" col sql_text for a45 set space 1 break on b1 nodup on c0 nodup on c3 nodup on c1 nodup on c2 nodup on b2 nodup on b3 skip 2 select a.inst_id c0,b.sid b2,c.spid b1, b.program c3, b.username c2,b.serial# b3, a.sql_text from gv$sql a, gv$session b, gv$process c where a.address = b.sql_address and b.paddr = c.addr and a.hash_value = b.sql_hash_value and a.inst_id=b.inst_id and a.inst_id=c.inst_id and a.inst_id like '&inst_id' and b.sid like '&sid' order by c.spid,a.hash_value / This query asks the Instance Number and Sid number, which you can get from step 1 and 2. But remember , you can see the waiter activity, but you may not see the holder activity. Reason is, the holder is sitting idle after doing the DML operation. So SQL for Holder should not be seen under gv$sql. This all query can be run for single instance database, but all GV$ view need to replace to V$ and there is no INST_ID for V$ View, that part need to be taken care.
来自:How to Find TX Enqueue Contention in RAC or OPS [ID 179582.1]
发表在 Oracle性能优化
评论关闭
Oracle 11g丢失access$恢复方法
最近接触到两个案例都是11g数据库因为异常关闭导致access$表丢失,使得数据库不能正常open.为什么这个表会丢失还未找到原因.我这里提供一种在upgrade模式下解决给问题方法.
数据库版本
SQL> select * from v$version; BANNER -------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') "xifenfei" from dual; xifenfei -------------------------------------- 2012-06-22 05:28:57
数据库启动报ORA-00704
SQL> startup ORACLE instance started. Total System Global Area 523108352 bytes Fixed Size 1346052 bytes Variable Size 448792060 bytes Database Buffers 67108864 bytes Redo Buffers 5861376 bytes Database mounted. ORA-01092: ORACLE instance terminated. Disconnection forced ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-00942: table or view does not exist Process ID: 1782 Session ID: 125 Serial number: 5
找出ORA-00704报错原因
SQL> conn / as sysdba Connected to an idle instance. SQL> startup mount; ORACLE instance started. Total System Global Area 523108352 bytes Fixed Size 1346052 bytes Variable Size 448792060 bytes Database Buffers 67108864 bytes Redo Buffers 5861376 bytes Database mounted. SQL> oradebug setmypid Statement processed. SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12 Statement processed. SQL> oradebug TRACEFILE_NAME /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_2010.trc SQL> alter database open; alter database open * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-00942: table or view does not exist Process ID: 2010 Session ID: 125 Serial number: 5
查看trace文件发现
PARSE ERROR #3063868604:len=56 dep=1 uid=0 oct=3 lid=0 tim=1340312320595472 err=942 select order#,columns,types from access$ where d_obj#=:1 ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-00942: table or view does not exist ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-00942: table or view does not exist *** 2012-06-22 04:58:40.596 USER (ospid: 2010): terminating the instance due to error 704
启动数据库至upgrade模式
SQL> startup upgrade ORACLE instance started. Total System Global Area 523108352 bytes Fixed Size 1346052 bytes Variable Size 448792060 bytes Database Buffers 67108864 bytes Redo Buffers 5861376 bytes Database mounted. Database opened.
创建access$表和index
SQL> create table access$ 2 ( d_obj# number not null, 3 order# number not null, 4 columns raw(126), 5 types number not null) 6 storage (initial 10k next 100k maxextents unlimited pctincrease 0) 7 / Table created. SQL> create index i_access1 on 2 access$(d_obj#, order#) 3 storage (initial 10k next 100k maxextents unlimited pctincrease 0) 4 / Index created. --创建语句可以在?\RDBMS\ADMIN\dcore.bsq中找到
重启数据库
SQL> shutdown immediate; Database closed. Database dismounted. ORACLE instance shut down. SQL> startup ORACLE instance started. Total System Global Area 523108352 bytes Fixed Size 1346052 bytes Variable Size 448792060 bytes Database Buffers 67108864 bytes Redo Buffers 5861376 bytes Database mounted. Database opened.
access$表作用(感谢vmcd同学提供)
When a database object is first referenced in a PL/SQL program, the PL/SQL engine checks the ACCESS$ table (owned by SYS) to see if the executor of the program has authority on that database object.
对于access$表丢失以前记录是否对系统产生严重影响还未知,希望知道的朋友告知下