Android - how do I investigate an ANR?

Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors. Original: http://stackoverflow.com/questions/704311/ (page dated 2020-08-20)

Tags: android, performance, android-anr-dialog

Asked by lostInTransit

Is there a way of finding out where my app threw an ANR (Application Not Responding)? I took a look at the traces.txt file in /data and I see a trace for my application. This is what I see in the trace:

DALVIK THREADS:
"main" prio=5 tid=3 TIMED_WAIT
  | group="main" sCount=1 dsCount=0 s=0 obj=0x400143a8
  | sysTid=691 nice=0 sched=0/0 handle=-1091117924
  at java.lang.Object.wait(Native Method)
  - waiting on <0x1cd570> (a android.os.MessageQueue)
  at java.lang.Object.wait(Object.java:195)
  at android.os.MessageQueue.next(MessageQueue.java:144)
  at android.os.Looper.loop(Looper.java:110)
  at android.app.ActivityThread.main(ActivityThread.java:3742)
  at java.lang.reflect.Method.invokeNative(Native Method)
  at java.lang.reflect.Method.invoke(Method.java:515)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:739)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:497)
  at dalvik.system.NativeStart.main(Native Method)

"Binder Thread #3" prio=5 tid=15 NATIVE
  | group="main" sCount=1 dsCount=0 s=0 obj=0x434e7758
  | sysTid=734 nice=0 sched=0/0 handle=1733632
  at dalvik.system.NativeStart.run(Native Method)

"Binder Thread #2" prio=5 tid=13 NATIVE
  | group="main" sCount=1 dsCount=0 s=0 obj=0x433af808
  | sysTid=696 nice=0 sched=0/0 handle=1369840
  at dalvik.system.NativeStart.run(Native Method)

"Binder Thread #1" prio=5 tid=11 NATIVE
  | group="main" sCount=1 dsCount=0 s=0 obj=0x433aca10
  | sysTid=695 nice=0 sched=0/0 handle=1367448
  at dalvik.system.NativeStart.run(Native Method)

"JDWP" daemon prio=5 tid=9 VMWAIT
  | group="system" sCount=1 dsCount=0 s=0 obj=0x433ac2a0
  | sysTid=694 nice=0 sched=0/0 handle=1367136
  at dalvik.system.NativeStart.run(Native Method)

"Signal Catcher" daemon prio=5 tid=7 RUNNABLE
  | group="system" sCount=0 dsCount=0 s=0 obj=0x433ac1e8
  | sysTid=693 nice=0 sched=0/0 handle=1366712
  at dalvik.system.NativeStart.run(Native Method)

"HeapWorker" daemon prio=5 tid=5 VMWAIT
  | group="system" sCount=1 dsCount=0 s=0 obj=0x4253ef88
  | sysTid=692 nice=0 sched=0/0 handle=1366472
  at dalvik.system.NativeStart.run(Native Method)

----- end 691 -----

How can I find out where the problem is? The methods in the trace are all SDK methods.

Thanks.

Accepted answer by sooniln

An ANR happens when some long operation takes place in the "main" thread. This is the event loop thread, and if it is busy, Android cannot process any further GUI events in the application, and thus throws up an ANR dialog.

Now, in the trace you posted, the main thread seems to be doing fine; there is no problem. It is idling in the MessageQueue, waiting for another message to come in. In your case the ANR was likely caused by a long operation rather than something that blocked the thread permanently, so the event thread recovered after the operation finished, and your trace was captured after the ANR.

Detecting where ANRs happen is easy if it is a permanent block (a deadlock while acquiring locks, for instance), but harder if it's just a temporary delay. First, go over your code and look for vulnerable spots and long-running operations. Examples include using sockets, locks, thread sleeps, and other blocking operations from within the event thread. You should make sure these all happen in separate threads. If nothing seems to be the problem, use DDMS and enable the thread view. This shows all the threads in your application, similar to the trace you have. Reproduce the ANR, and refresh the main thread at the same time. That should show you precisely what's going on at the time of the ANR.

Answer by Dheeraj Vepakomma

You can enable StrictMode in API level 9 and above.

StrictMode is most commonly used to catch accidental disk or network access on the application's main thread, where UI operations are received and animations take place. By keeping your application's main thread responsive, you also prevent ANR dialogs from being shown to users.

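public void onCreate() {
    // ThreadPolicy reports disk and network access on the calling (main) thread,
    // which is the kind of work that leads to ANRs.
    StrictMode.setThreadPolicy(new StrictMode.ThreadPolicy.Builder()
                           .detectAll()
                           .penaltyLog()
                           .build());
    // VmPolicy reports process-wide problems such as leaked closable objects.
    StrictMode.setVmPolicy(new StrictMode.VmPolicy.Builder()
                           .detectAll()
                           .penaltyLog()
                           .penaltyDeath()
                           .build());
    super.onCreate();
}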

Using penaltyLog(), you can watch the output of adb logcat while you use your application to see the violations as they happen.

Answer by Horyun Lee

You want to know which task is holding up the UI thread. The trace file gives you hints for finding that task; you need to investigate the state of each thread.

Thread states

  • running - executing application code
  • sleeping - called Thread.sleep()
  • monitor - waiting to acquire a monitor lock
  • wait - in Object.wait()
  • native - executing native code
  • vmwait - waiting on a VM resource
  • zombie - thread is in the process of dying
  • init - thread is initializing (you shouldn't see this)
  • starting - thread is about to start (you shouldn't see this either)

Focus on the SUSPENDED and MONITOR states. A thread in the MONITOR state shows you where to start investigating, and a thread in the SUSPENDED state is probably the main cause of the deadlock.

Basic investigation steps

  1. Find "waiting to lock"
    • look for a thread in the MONITOR state, e.g. "Binder Thread #15" prio=5 tid=75 MONITOR
    • you are lucky if you find "waiting to lock"
    • example: waiting to lock <0xblahblah> (a com.foo.A) held by threadid=74
  2. This shows that tid=74 currently holds the lock, so go to tid=74
  3. tid=74 may be in the SUSPENDED state. Find out why it is stuck; that is the main cause.

The trace does not always contain "waiting to lock". In that case it is hard to find the main cause.
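
The steps above can be automated for a first pass. The following is a minimal sketch in plain Java, assuming the trace uses the line formats quoted in this answer (a quoted thread name followed by tid=N and a state, and "held by threadid=N"); other Android versions may format these lines differently, so the patterns would need adjusting. The class and method names (LockWaitScanner, findWaiters, hasCycle) are made up for illustration.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LockWaitScanner {
    // Thread header, e.g.: "Binder Thread #15" prio=5 tid=75 MONITOR
    private static final Pattern HEADER =
            Pattern.compile("^\"(.+)\".*\\btid=(\\d+)\\s+(\\w+)\\s*$");
    // Lock line, e.g.: - waiting to lock <0x...> (a com.foo.A) held by threadid=74
    private static final Pattern WAITING =
            Pattern.compile("waiting to lock.*held by threadid=(\\d+)");

    /** Maps each blocked thread id to the id of the thread holding the lock it wants. */
    static Map<Integer, Integer> findWaiters(String trace) {
        Map<Integer, Integer> waiters = new LinkedHashMap<>();
        int currentTid = -1;
        for (String raw : trace.split("\\R")) {
            String line = raw.trim();
            Matcher header = HEADER.matcher(line);
            if (header.find()) {
                currentTid = Integer.parseInt(header.group(2));
                continue;
            }
            Matcher waiting = WAITING.matcher(line);
            if (currentTid != -1 && waiting.find()) {
                waiters.put(currentTid, Integer.parseInt(waiting.group(1)));
            }
        }
        return waiters;
    }

    /** Follows the "held by" chain from startTid; true if it loops back (a deadlock). */
    static boolean hasCycle(Map<Integer, Integer> waiters, int startTid) {
        Set<Integer> seen = new HashSet<>();
        Integer tid = startTid;
        while (tid != null) {
            if (!seen.add(tid)) {
                return true;
            }
            tid = waiters.get(tid);
        }
        return false;
    }
}
```

Calling findWaiters on the text of traces.txt gives the waiter-to-holder pairs from step 1; hasCycle then follows steps 2 and 3 mechanically. A chain that ends without a loop points at the last thread in the chain as the one to inspect by hand.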

Answer by Akhil Cherian Verghese

I've been learning Android for the last few months, so I'm far from an expert, but I've been really disappointed with the documentation on ANRs.

Most of the advice seems to be geared towards avoiding them or fixing them by blindly looking through your code, which is great, but I couldn't find anything on analyzing the trace.

There are three things you really need to look for with ANR logs.

1) Deadlocks: When a thread is in the WAIT state, you can look through the details to find what it is held by ("heldby="). Most of the time it is held by itself, but if it is held by another thread, that is likely a danger sign. Go look at that thread and see what it is held by. You might find a loop, which is a clear sign that something has gone wrong. This is pretty rare, but it's the first point because when it happens, it's a nightmare.

2) Main thread Waiting: If your main thread is in the WAIT state, check if it's held by another thread. This shouldn't happen, because your UI thread shouldn't be held by a background thread.

Both of these scenarios mean you need to rework your code significantly.

3) Heavy operations on the main thread: This is the most common cause of ANRs, but sometimes one of the harder ones to find and fix. Look at the main thread details. Scroll down the stack trace until you see classes you recognize (from your app). Look at the methods in the trace and figure out whether you're making network calls, DB calls, etc. in these places.
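
Scrolling for recognizable classes, as described in point 3, can be sketched as a small filter. This is an illustrative plain-Java example, assuming the trace layout shown in the question (a quoted thread name starts each block, and frames begin with "at "); the class name MainThreadFrames and the package prefix passed in are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MainThreadFrames {
    /** Returns the frames of the "main" thread block whose class starts with packagePrefix. */
    static List<String> appFrames(String trace, String packagePrefix) {
        List<String> frames = new ArrayList<>();
        boolean inMain = false;
        for (String raw : trace.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // A quoted name starts a new thread block; only "main" is of interest.
                inMain = line.startsWith("\"main\"");
            } else if (inMain && line.startsWith("at " + packagePrefix)) {
                // Drop the leading "at " and keep the frame itself.
                frames.add(line.substring(3));
            }
        }
        return frames;
    }
}
```

An empty result, as with the trace in the question, suggests the main thread was not inside app code when the trace was written, which matches the accepted answer's reading that the thread had already recovered.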

Finally, and I apologize for shamelessly plugging my own code, you can use the Python log analyzer I wrote at https://github.com/HarshEvilGeek/Android-Log-Analyzer. This will go through your log files, open ANR files, find deadlocks, find waiting main threads, find uncaught exceptions in your agent logs, and print it all out on the screen in a relatively easy-to-read manner. Read the ReadMe file (which I'm about to add) to learn how to use it. It's helped me a ton in the last week!

Answer by Ulrich

Whenever you're analyzing timing issues, debugging often does not help, as freezing the app at a breakpoint will make the problem go away.

Your best bet is to insert lots of logging calls (Log.XXX()) into the app's different threads and callbacks and see where the delay is at. If you need a stacktrace, create a new Exception (just instantiate one) and log it.
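
As a rough illustration of this approach, the sketch below times a step and captures a stack trace by instantiating an exception without throwing it. It is plain Java so it can run anywhere; the class and method names (DelayLogging, timeMillis, currentStackTrace) are made up for the example. On Android the same strings would typically go to Log.d(), which also has an overload that accepts a Throwable directly.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class DelayLogging {
    /** Captures the current call stack as text by creating (not throwing) an exception. */
    static String currentStackTrace(String label) {
        StringWriter out = new StringWriter();
        new Exception(label).printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }

    /** Runs the task and returns how long it took, in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(() -> {
            try {
                Thread.sleep(50); // stand-in for the suspicious operation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("step took " + elapsed + " ms");
        System.out.println(currentStackTrace("who called this?"));
    }
}
```

Logging the elapsed time around each suspect callback narrows down which one is slow without pausing the app in a debugger.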

Answer by Hyman

What Triggers ANR?

Generally, the system displays an ANR if an application cannot respond to user input.

In any situation in which your app performs a potentially lengthy operation, you should not perform the work on the UI thread, but instead create a worker thread and do most of the work there. This keeps the UI thread (which drives the user interface event loop) running and prevents the system from concluding that your code has frozen.

How to Avoid ANRs

Android applications normally run entirely on a single thread by default (the "UI thread" or "main thread"). This means anything your application is doing in the UI thread that takes a long time to complete can trigger the ANR dialog, because your application is not giving itself a chance to handle the input event or intent broadcasts.

Therefore, any method that runs in the UI thread should do as little work as possible on that thread. In particular, activities should do as little as possible to set up in key life-cycle methods such as onCreate() and onResume(). Potentially long-running operations such as network or database operations, or computationally expensive calculations such as resizing bitmaps, should be done in a worker thread (or, in the case of database operations, via an asynchronous request).

Code: Worker thread with the AsyncTask class

private class DownloadFilesTask extends AsyncTask<URL, Integer, Long> {
    // Do the long-running work in here
    protected Long doInBackground(URL... urls) {
        int count = urls.length;
        long totalSize = 0;
        for (int i = 0; i < count; i++) {
            totalSize += Downloader.downloadFile(urls[i]);
            publishProgress((int) ((i / (float) count) * 100));
            // Escape early if cancel() is called
            if (isCancelled()) break;
        }
        return totalSize;
    }

    // This is called each time you call publishProgress()
    protected void onProgressUpdate(Integer... progress) {
        setProgressPercent(progress[0]);
    }

    // This is called when doInBackground() is finished
    protected void onPostExecute(Long result) {
        showNotification("Downloaded " + result + " bytes");
    }
}

Code: Execute Worker thread

To execute this worker thread, simply create an instance and call execute():

new DownloadFilesTask().execute(url1, url2, url3);
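
AsyncTask was deprecated in API level 30, so here is a hedged sketch of the same worker-thread idea using only java.util.concurrent. It is not taken from the Android documentation quoted above; slowWork is a hypothetical stand-in for a download, and the byte count it returns is fake.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BackgroundWork {
    /** Stand-in for slow work such as a download; returns a fake byte count. */
    static long slowWork(int items) throws InterruptedException {
        long total = 0;
        for (int i = 0; i < items; i++) {
            Thread.sleep(10); // simulate blocking I/O
            total += 1024;
        }
        return total;
    }

    /** Runs slowWork on a worker thread and waits for its result. */
    static long runOnWorker(int items) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<Long> task = () -> slowWork(items);
            // submit() returns immediately; the work happens on the worker thread.
            Future<Long> pending = worker.submit(task);
            // get() blocks the caller, so it must never be called on a UI thread.
            return pending.get();
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Processed " + runOnWorker(3) + " bytes");
    }
}
```

In an Android app the result would be handed back to the UI thread (for example through a Handler or runOnUiThread) instead of blocking on get(); the blocking call here only keeps the example self-contained.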

Source

http://developer.android.com/training/articles/perf-anr.html

Answer by yaniv

My issue with an ANR: after much work, I found out that a thread was accessing a resource that did not exist in the layout. Instead of getting an exception, I got an ANR.

Answer by phnmnn

You need to look for "waiting to lock" in the /data/anr/traces.txt file.

For more details, see: Engineer for High Performance with Tools from Android & Play (Google I/O '17)

Answer by alijandro

Based on @Horyun Lee's answer, I wrote a small Python script to help investigate ANRs from traces.txt.

The ANRs are rendered as graphics by graphviz if you have graphviz installed on your system.

$ ./anr.py --format png ./traces.txt

If ANRs are detected in traces.txt, a PNG is generated, which is more intuitive to read.

The sample traces.txt file used above was taken from here.

Answer by Mr-IDE

Consider using the ANR-Watchdog library to accurately track and capture ANR stack traces in a high level of detail. You can then send them to your crash reporting library. I recommend using setReportMainThreadOnly() in this scenario. You can either make the app throw a non-fatal exception at the freeze point, or make the app force quit when the ANR happens.

Note that the standard ANR reports sent to your Google Play Developer console are often not accurate enough to pinpoint the exact problem. That's why a third-party library is needed.
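
To make the watchdog idea concrete, here is a minimal plain-Java sketch of the general technique: post a tiny task to the monitored thread and check whether it runs in time. This is not the ANR-Watchdog library's code or API; StallWatchdog, isResponsive, and describe are hypothetical names, and a single-threaded executor stands in for the Android main thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class StallWatchdog {
    /** Posts a "tick" to the monitored executor; false means it did not run in time. */
    static boolean isResponsive(ExecutorService monitored, long timeoutMillis)
            throws InterruptedException {
        CountDownLatch ticked = new CountDownLatch(1);
        monitored.execute(ticked::countDown);
        return ticked.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Formats a thread's current stack, i.e. where it is busy or stuck right now. */
    static String describe(Thread thread) {
        StringBuilder sb = new StringBuilder(thread.getName())
                .append(" (").append(thread.getState()).append(")\n");
        for (StackTraceElement frame : thread.getStackTrace()) {
            sb.append("  at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical stand-in for the UI thread: one thread draining a task queue.
        ExecutorService loop = Executors.newSingleThreadExecutor();
        AtomicReference<Thread> loopThread = new AtomicReference<>();
        loop.execute(() -> loopThread.set(Thread.currentThread()));
        System.out.println("idle loop responsive: " + isResponsive(loop, 1000));

        // Simulate a long operation on the monitored thread.
        loop.execute(() -> {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        if (!isResponsive(loop, 200)) {
            System.out.println("stall detected:\n" + describe(loopThread.get()));
        }
        loop.shutdownNow();
    }
}
```

Because the stack is sampled while the thread is still stalled, it shows the offending frames, unlike a trace written after the thread has recovered.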
