Troubleshoot Oracle - hung process

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11673947/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-19 00:59:55  来源:igfitidea点击:

Troubleshoot Oracle - hung process

javadatabaseoracle

提问by sceaj

I'm trying to understand a problem we're having with a Java process that is hanging. This process has been running in production for about 4 months and earlier this week it started hanging. When I look at a thread dump of the process all of the relevant threads (3) have stacks like the following:

I'm trying to understand a problem we're having with a Java process that is hanging. This process has been running in production for about 4 months and earlier this week it started hanging. When I look at a thread dump of the process all of the relevant threads (3) have stacks like the following:

    "TxnParser_1" prio=6 tid=0x69bd3400 nid=0x2534 runnable [0x6aa2f000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at oracle.net.ns.Packet.receive(Unknown Source)
        at oracle.net.ns.DataPacket.receive(Unknown Source)
        at oracle.net.ns.NetInputStream.getNextPacket(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1099)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1070)
        at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:478)
        at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:207)
        at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:790)
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1039)
        at oracle.jdbc.driver.T4CStatement.executeMaybeDescribe(T4CStatement.java:830)
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1132)
        at oracle.jdbc.driver.OracleStatement.executeInternal(OracleStatement.java:1687)
        at oracle.jdbc.driver.OracleStatement.execute(OracleStatement.java:1653)
        - locked <0x40e22f88> (a oracle.jdbc.driver.T4CStatement)
        - locked <0x28f8d398> (a oracle.jdbc.driver.T4CConnection)
        at com.gcg.data.LogParsingInfo.initFromDB(LogParsingInfo.java:262)
        at com.gcg.om.OmQueueEntry.initParseInfoFromDB(OmQueueEntry.java:104)
        at com.gcg.om.GenericQueueEntry.run(GenericQueueEntry.java:237)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

There are no threads waiting for locks so the process isn't deadlocked. These 3 threads that are doing the work are just blocked waiting for a response from Oracle, at least that is what it looks like to me.

There are no threads waiting for locks so the process isn't deadlocked. These 3 threads that are doing the work are just blocked waiting for a response from Oracle, at least that is what it looks like to me.

Looking at Oracle, when I query v$session, it looks like one of the connections associated with these threads is currently executing a query, although I can't see the sql.

Looking at Oracle, when I query v$session, it looks like one of the connections associated with these threads is currently executing a query, although I can't see the sql.

select ... from v$session where ...;
SQL_ADDRESS      SQL_HASH_VALUE SQL_ID        SQL_CHILD_NUMBER SQL_EXEC_START SQL_EXEC_ID PREV_SQL_ADDR    PREV_HASH_VALUE PREV_SQL_ID   PREV_CHILD_NUMBER PREV_EXEC_START PREV_EXEC_ID
---------------- -------------- ------------- ---------------- -------------- ----------- ---------------- --------------- ------------- ----------------- --------------- ------------
              00              0                                                           0000000239F59EE8      1483377872 fqr8pndc6p36h                 5 26-JUL-12           32080545
              00              0                                                           0000000239F59EE8      1483377872 fqr8pndc6p36h                 5 26-JUL-12           32080546
0000000148CABD88     1784444892 a16hxxtp5sxyw                                             0000000239F59EE8      1483377872 fqr8pndc6p36h                 5 26-JUL-12           32080544

select * from v$sql where sql_id = 'a16hxxtp5sxyw';

no rows selected

My questions are:

My questions are:

  1. Am I correct in my analysis that the process is simply blocked waiting for a response from Oracle?
  2. What should I be looking for in Oracle to understand why this process is blocking?
  1. Am I correct in my analysis that the process is simply blocked waiting for a response from Oracle?
  2. What should I be looking for in Oracle to understand why this process is blocking?

Updated:

Updated:

Based on the comment regarding looking in DBA_WAITERS and DBA_LOCKS

Based on the comment regarding looking in DBA_WAITERS and DBA_LOCKS

select * from dba_waiters;

no rows selected

select * from dba_locks where BLOCKING_OTHERS <> 'Not Blocking';

no rows selected 

There were 98 rows in dba_locks but since all are 'Not Blocking' I don't think it is a locking issue? The process in question has been in this state for over 3 hours so any deadlock would have been detected by now.

There were 98 rows in dba_locks but since all are 'Not Blocking' I don't think it is a locking issue? The process in question has been in this state for over 3 hours so any deadlock would have been detected by now.

I'm of the theory that the Oracle instance is not "healthy", but I'm at a loss as to what to look at. I have a request in to reboot the Oracle server, but that hasn't been done yet.

I'm of the theory that the Oracle instance is not "healthy", but I'm at a loss as to what to look at. I have a request in to reboot the Oracle server, but that hasn't been done yet.

Follow-up question: Is it normal for the v$session to contain a sql_id that doesn't exist in v$sql and if so, under what conditions?

Follow-up question: Is it normal for the v$session to contain a sql_id that doesn't exist in v$sql and if so, under what conditions?

采纳答案by sceaj

The problem was solved and the answer was right in the v$session table. Apparently Oracle sessions can block for reasons other than just locking. Notice the column FINAL_BLOCKING_SESSION - it identifies the session that is the root cause of the blocking. We investigated session 845 and found that the client process (identified by MACHINE and PORT) no longer existed. The DBA killed session 845 and all returned to normal.

The problem was solved and the answer was right in the v$session table. Apparently Oracle sessions can block for reasons other than just locking. Notice the column FINAL_BLOCKING_SESSION - it identifies the session that is the root cause of the blocking. We investigated session 845 and found that the client process (identified by MACHINE and PORT) no longer existed. The DBA killed session 845 and all returned to normal.

SID     SERIAL# STATUS    PROGRAM          TYPE SQL_ID        PREV_SQL_ID    BLOCKING_SESSION_STATUS BLOCKING_INSTANCE BLOCKING_SESSION FINAL_BLOCKING_SESSION_STATUS FINAL_BLOCKING_INSTANCE FINAL_BLOCKING_SESSION EVENT
------- ------- --------- ---------------- ---- ------------- -------------- ----------------------- ----------------- ---------------- ----------------------------- ----------------------- ---------------------- ----------------------------
 108    22447   ACTIVE    Gcg log parser 1 USER               fqr8pndc6p36h  VALID                   1                 1581             VALID                         1                       845                    library cache: mutex X
 639    40147   ACTIVE    Gcg log parser 3 USER               fqr8pndc6p36h  VALID                   1                 1581             VALID                         1                       845                    library cache: mutex X
 742    34683   ACTIVE    Gcg log parser 2 USER a16hxxtp5sxyw fqr8pndc6p36h  VALID                   1                 1581             VALID                         1                       845                    library cache: mutex X

回答by ryenus

I also encountered this problem recently, and used this query find locking/locked sessions in Oracle:

I also encountered this problem recently, and used this query find locking/locked sessions in Oracle:

select 
   inst_id||' '||sid||','||serial# inst_sid_s#, 
   username,
   row_wait_obj#||','||row_wait_block#||','||row_wait_row# obj_lck,
   blocking_session_Status||' '||blocking_instance||','||blocking_session blk_info,
   final_blocking_session_Status||' '||final_blocking_instance||','||final_blocking_session f_blk_info,
   event, 
   seconds_in_wait 
from 
   gv$session 
where 
   lockwait is not null
order by 
   inst_id;

Source: http://www.dba-oracle.com/t_final_blocking_session_final_blocking_instance.htm

Source: http://www.dba-oracle.com/t_final_blocking_session_final_blocking_instance.htm