How to make sense of Android crash logs

Jamie Lynch

Android Crash Log Errors: Explained

Crashes on Android can be immensely frustrating for users, so much so that after experiencing as little as two crashes, the typical user will uninstall your app. Fortunately, the Android Framework provides some great tools for debugging crashes, and provides several useful crash logs that developers can read to determine what caused that critical issue. In this blog post we’ll cover the three most important crash logs used by the system: exception stack traces, ANR traces, and NDK tombstones.

Exception Stack trace

JVM stack traces are the most common type of crash that typical Android applications will encounter, as the majority of apps are written in either Kotlin or Java. In JVM languages, an Exception is thrown in exceptional circumstances, and contains debug information about the error condition that went wrong, such as a stack trace with file/line number information, and an error message.

If an app hasn’t got a crash reporting SDK installed, then the next best method for retrieving crash logs is to use adb to view logcat. This is a convenient method if physical access to the device is an option, because the default UncaughtExceptionHandler in Android apps prints out the entire stack trace to Logcat before terminating the process, meaning the crash is effectively logged out to an accessible location for developers.

ANR Trace

ANRs (Application Not Responding) occur when an application does not respond to user input for a noticeable period of time. The visible effect of this is that an app has ‘frozen’ from a user’s perspective, which can be immensely frustrating. Common causes include performing disk reads/writes on the main thread, and other long-running tasks, which prevents the User Interface from updating in response to user input.

If the app is in the foreground, after approximately 5 seconds a dialog will be shown which allows the user to kill the app. At this point a trace including details of the ANR will be written to disk, from which valuable information for debugging can be retrieved. Again, this requires physical access to the device unless you have a crash reporting SDK installed that supports ANR detection.

Tombstone

Tombstone crash logs are written when a native crash in C/C++ code occurs in an Android application. The Android platform writes a trace of all the running threads at the time of the crash to /data/tombstones, along with additional information for debugging, such as information about memory and open files. Tombstones are the closest to the metal in terms of information, as they will record details such as raw memory addresses, and as such can be a bit trickier to understand unless you’re familiar with debugging native code. Again, tombstones require physical access to a rooted device in order to be read.

How to get Crash Logs from an Android device

As a prerequisite to all these steps, you should have installed Android Studio and added the command line tools to your path. These local methods make use of the adb tool. You should also have a device or emulator connected which has had developer options enabled.

Note: if you are comfortable using the Device File Explorer in Android Studio directly, you can open device files directly from there rather than using adb pull.

Exception Stack trace

By default, exception stack traces are printed out to the Logcat tool on Android devices. It is possible to retrieve crash logs via the following steps:

  1. Run the following command
adb logcat AndroidRuntime:E *:S
  1. Trigger a crash on the device. The stack trace will show up as new text in the terminal.
  2. Save the terminal output to a file of your choice for inspection later

If a crash has occurred recently on the device, you can skip step 2. This is because Logcat retains a buffer of recent logs which should include the exception. This is time sensitive however - so if you’re looking for a crash from a day ago, that information may be gone forever unless you use a crash reporting tool such as Bugsnag.

ANR Trace

  1. Trigger an ANR on the device.
  2. Run the following command, replacing the destination with a file of your choice
adb pull /data/anr/traces.txt <destination>
  1. Inspect the information in the ANR crash log by opening the saved file

Alternatively, you can inspect summary ANR information by running the following command

adb logcat ActivityManager:E *:S

Tombstone

  1. Root your device or emulator so that you can access the tombstone directory. (Beware when rooting devices, as this step can potentially brick your phone)
  2. Trigger a native crash on the device.
  3. Run the following command to determine what tombstone crash logs are present on the device
adb ls /data/tombstones
  1. Run the following command, replacing the destination with a file of your choice. tombstone_01 is shown as an example filename here, that would be obtained in the previous step
adb pull /data/tombstones/tombstone_01 <destination>
  1. Inspect the information in the Tombstone crash log by opening the saved file

Making sense of Android Crash Log Data

Exception Stack trace

Reading a JVM stack trace can be intimidating at first, but by breaking it down into its constituent parts the task becomes fairly easy. Let’s walk through it step by step, with the following RuntimeException that has been thrown in an example application:

2019-08-27 16:10:28.303 10773-10773/com.bugsnag.android.example E/AndroidRuntime: FATAL EXCEPTION: main
   Process: com.bugsnag.android.example, PID: 10773
   java.lang.RuntimeException: Fatal Crash
       at com.example.foo.CrashyClass.sendMessage(CrashyClass.java:10)
       at com.example.foo.CrashyClass.crash(CrashyClass.java:6)
       at com.bugsnag.android.example.ExampleActivity.crashUnhandled(ExampleActivity.kt:55)
       at com.bugsnag.android.example.ExampleActivity$onCreate$1.invoke(ExampleActivity.kt:33)
       at com.bugsnag.android.example.ExampleActivity$onCreate$1.invoke(ExampleActivity.kt:14)
       at com.bugsnag.android.example.ExampleActivity$sam$android_view_View_OnClickListener$0.onClick(ExampleActivity.kt)
       at android.view.View.performClick(View.java:5637)
       at android.view.View$PerformClick.run(View.java:22429)
       at android.os.Handler.handleCallback(Handler.java:751)
       at android.os.Handler.dispatchMessage(Handler.java:95)
       at android.os.Looper.loop(Looper.java:154)
       at android.app.ActivityThread.main(ActivityThread.java:6119)
       at java.lang.reflect.Method.invoke(Native Method)
       at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:886)
       at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:776)

The first place to start is towards the top of our crash log. Here we can see the process ID that the system assigned the executing app, along with the package name, which can be useful when correlating against other information obtained via logcat. In our example app, the package name is “com.bugsnag.android.example”:

Process: com.bugsnag.android.example, PID: 10773

The next useful piece of information is the exception class. In Java/Kotlin, all exceptions and errors are classes which extend Throwable or one of Throwable’s subclasses, and each exception class can have a different semantic meaning. For example, a developer may wish to throw an IllegalStateException if the program entered an unexpected state, or an IllegalArgumentException if a user attempted to save null as their name. In our case, we’ve thrown a RuntimeException, whose fully qualified class name is displayed below:

java.lang.RuntimeException: Fatal Crash

The error message is also printed to the crash log, which can be very useful for providing additional debug information. In our case, we have just supplied the text “Fatal Crash”, but we could equally pass the values of our variables at the time of the crash if we wanted further information for debugging.

The next thing in our crash log is the juicy part of the information - a stack trace of the thread where the exception occurred. Looking at the top stackframe will usually allow us to find exactly where the error was thrown from, and the frames below it will allow us to observe what the program state was at the time of the crash. The top stackframe is the following:

com.example.foo.CrashyClass.sendMessage(CrashyClass.java:10)

We can immediately see that this contains some very useful information. We are given the class, “com.example.foo.CrashyClass”, so we can open the source file and hunt for a bug there. We’re also given the method name, “sendMessage”, and the line number of the crash, 10, so we can pinpoint exactly in our source where the exception was thrown from.

Understanding the crash log from this point onwards is a case of reading the stack frames below, which are in the order in which the methods were originally called. Consider the example below:

at com.example.foo.CrashyClass.sendMessage(CrashyClass.java:10)
       at com.example.foo.CrashyClass.crash(CrashyClass.java:6)
       at com.bugsnag.android.example.ExampleActivity.crashUnhandled(ExampleActivity.kt:55)

Starting at the top, we can tell that “sendMessage” was invoked by “crash”, which was in turn invoked by a method named “crashUnhandled”, which appears to be the origin of our crash. Of course, the full exception stack trace was somewhat more complex, and also involved method calls in the Android framework, but the basic principle remains the same.

An important note is that in production, apps are often obfuscated by tools such as Proguard which can mean the original symbols are not present, and the stack trace becomes unintelligible. Fortunately, most crash reporting services provide plugins that automatically upload a mapping file that contains the information necessary to symbolicate crash reports from your production apps.

ANR Trace

ANR traces contain a very large amount of information. As an ANR can potentially be caused by multiple apps contending for a limited number of resources, the full crash log contains stack traces for multiple different processes. This full information can be very useful for debugging when all else fails, but most of the time the summary ANR information printed into Logcat is sufficient to debug the error. Consider the following:

2019-08-27 16:12:57.301 1717-1733/system_process E/ActivityManager: ANR in com.bugsnag.android.example (com.bugsnag.android.example/.ExampleActivity)
   PID: 10859
   Reason: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 33.  Wait queue head age: 6178.1ms.)
   Load: 0.32 / 0.62 / 0.37
   CPU usage from 323086ms to -1ms ago (2019-08-27 13:16:43.467 to 2019-08-27 16:12:54.131):
     5.7% 1717/system_server: 2.6% user + 3% kernel / faults: 21251 minor
     4.7% 10392/com.bugsnag.android.example: 1.3% user + 3.3% kernel / faults: 587 minor
     3.9% 2375/com.google.android.gms: 2.9% user + 0.9% kernel / faults: 71377 minor 51 major
     3.1% 16/ksoftirqd/2: 0% user + 3.1% kernel
     2.5% 2254/com.google.android.googlequicksearchbox:search: 1.1% user + 1.4% kernel / faults: 10193 minor
     1.2% 10427/kworker/u8:0: 0% user + 1.2% kernel
     0.9% 8990/kworker/u8:2: 0% user + 0.9% kernel
     0.8% 1342/surfaceflinger: 0.1% user + 0.7% kernel / faults: 35 minor
     0.5% 1344/adbd: 0% user + 0.4% kernel / faults: 8471 minor
     0.4% 1896/com.google.android.gms.persistent: 0.3% user + 0% kernel / faults: 1106 minor
     0.4% 1288/logd: 0.1% user + 0.3% kernel / faults: 43 minor
     0.3% 1806/com.android.systemui: 0.2% user + 0% kernel / faults: 404 minor
     0.2% 1916/com.android.phone: 0.1% user + 0% kernel / faults: 203 minor
     0.2% 1410/audioserver: 0% user + 0.1% kernel / faults: 119 minor
     0.1% 10429/kworker/u8:3: 0% user + 0.1% kernel
     0.1% 10378/com.google.android.apps.photos: 0.1% user + 0% kernel / faults: 426 minor
     0.1% 8/rcu_preempt: 0% user + 0.1% kernel
     0% 1396/jbd2/dm-0-8: 0% user + 0% kernel
     0% 2179/com.google.android.apps.nexuslauncher: 0% user + 0% kernel / faults: 802 minor 1 major
     0% 1409/zygote: 0% user + 0% kernel / faults: 857 minor
     0% 3951/com.android.defcontainer: 0% user + 0% kernel / faults: 265 minor
     0% 10137/kworker/u9:0: 0% user + 0% kernel
     0% 1987/wpa_supplicant: 0% user + 0% kernel
     0% 10205/com.google.android.apps.docs: 0% user + 0% kernel / faults: 50 minor
     0% 1378/dmcrypt_write: 0% user + 0% kernel
     0% 2111/com.google.process.gapps: 0% user + 0% kernel / faults: 356 minor
     0% 3882/com.android.printspooler: 0% user + 0% kernel / faults: 241 minor
     0% 8829/kworker/u9:2: 0% user + 0% kernel
     0% 9808/kworker/u9:4: 0% user + 0% kernel
     0% 19/migration/3: 0% user + 0% kernel
     0% 1420/rild: 0% user + 0% kernel
     0% 10138/kworker/u9:1: 0% user + 0% kernel
     0% 1339/lmkd: 0% user + 0% kernel
     0% 1419/netd: 0% user + 0% kernel / faults: 59 minor
     0% 1793/com.android.inputmethod.latin: 0% user + 0% kernel / faults: 12 minor
     0% 10146/com.android.gallery3d: 0% user + 0% kernel / faults: 95 minor
     0% 10181/android.process.acore: 0% user + 0% kernel / faults: 52 minor
     0% 1281/kworker/0:1H: 0% user + 0% kernel
     0% 10162/kworker/2:1: 0% user + 0% kernel
     0% 10348/com.google.android.partnersetup: 0% user + 0% kernel / faults: 92 minor
     0% 20/ksoftirqd/3: 0% user + 0% kernel
     0% 10308/android.process.media: 0% user + 0% kernel / faults: 16 minor
     0% 1336/healthd: 0% user + 0% kernel
     0% 1354/logcat: 0% user + 0% kernel
     0% 1709/hostapd: 0% user + 0% kernel
     0% 3/ksoftirqd/0: 0% user + 0% kernel
     0% 1341/servicemanager: 0% user + 0% kernel
     0% 2091/com.google.android.ext.services: 0% user + 0% kernel / faults: 10 minor
     0% 10475/com.google.android.apps.photos:CameraShortcut: 0% user + 0% kernel / faults: 29 minor
     0% 4/kworker/0:0: 0% user + 0% kernel
     0% 12/ksoftirqd/1: 0% user + 0% kernel
     0% 1422/fingerprintd: 0% user + 0% kernel
     0% 1591/dhcpclient: 0% user + 0% kernel
     0% 1706/ipv6proxy: 0% user + 0% kernel
     0% 1913/sdcard: 0% user + 0% kernel
     0% 2137/com.google.android.googlequicksearchbox:interactor: 0% user + 0% kernel / faults: 3 minor
     0% 687/kworker/1:1: 0% user + 0% kernel
     0% 1297/vold: 0% user + 0% kernel / faults: 10 minor
     0% 1413/installd: 0% user + 0% kernel / faults: 35 minor
     0% 1//init: 0% user + 0% kernel
     0% 11/migration/1: 0% user + 0% kernel
     0% 466

First off, the crash log contains information about which process on the system suffered an ANR, and gives the process ID and package name, which comes in useful when finding the appropriate stack traces in the more detailed ANR trace:

E/ActivityManager: ANR in com.bugsnag.android.example (com.bugsnag.android.example/.ExampleActivity)
   PID: 10859

The Android framework gives us a reason for the ANR. In this case, the user touched the screen several times, and the dispatch queue waited for over 6 seconds without showing a visible response to these touch events. This represents a very bad user experience that would be noticeable as the whole app would appear to freeze from a user’s perspective:

Reason: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 33. Wait queue head age: 6178.1ms.)

Finally, we’re given some CPU load information. While some ANRs have simple causes such as performing IO on the main thread, this is not always the case. Sometimes an ANR can occur on a low-end device due to a lot of resource-hungry apps competing for CPU, so determining whether there are other apps using lots of resources at the same time as our application can be very helpful:

CPU usage from 323086ms to -1ms ago (2019-08-27 13:16:43.467 to 2019-08-27 16:12:54.131):
     5.7% 1717/system_server: 2.6% user + 3% kernel / faults: 21251 minor
     4.7% 10392/com.bugsnag.android.example: 1.3% user + 3.3% kernel / faults: 587 minor
     3.9% 2375/com.google.android.gms: 2.9% user + 0.9% kernel / faults: 71377 minor 51 major
     3.1% 16/ksoftirqd/2: 0% user + 3.1% kernel
     2.5% 2254/com.google.android.googlequicksearchbox:search: 1.1% user + 1.4% kernel / faults: 10193 minor

Tombstone

Like ANR traces, tombstones also contain a very large amount of information that wouldn’t be possible to walk through entirely. We’ll consider a truncated example, which shows the most important information near the top:

ABI: 'x86'
pid: 15300, tid: 15300, name: android.example  >>> com.bugsnag.android.example <<<
signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0xb19aa650
  eax b19aa650  ebx b19abfd8  ecx 00000005  edx b32a1230
  esi 99365d7e  edi bf9c2338
  xcs 00000073  xds 0000007b  xes 0000007b  xfs 0000003b  xss 0000007b
  eip b19aa688  ebp bf9c2148  esp bf9c2140  flags 00010296

backtrace:
  #00 pc 00000688  /data/app/com.bugsnag.android.example-2/lib/x86/libentrypoint.so (crash_write_read_only+40)
  #01 pc 000006ca  /data/app/com.bugsnag.android.example-2/lib/x86/libentrypoint.so (Java_com_bugsnag_android_example_ExampleActivity_doCrash+42)
  #02 pc 003e9d3c  /data/app/com.bugsnag.android.example-2/oat/x86/base.odex (offset 0x399000)

stack:
       bf9c2100  b32dc140  [anon:libc_malloc]
       bf9c2104  b3159f5c  /system/lib/libart.so
       bf9c2108  b315abb3  /system/lib/libart.so
       bf9c210c  b315ab82  /system/lib/libart.so
       bf9c2110  b315ab69  /system/lib/libart.so
       bf9c2114  b716ced4  /dev/ashmem/dalvik-LinearAlloc (deleted)
       bf9c2118  b328b400  [anon:libc_malloc]
       bf9c211c  b312b721  /system/lib/libart.so (_ZN3art14JniMethodStartEPNS_6ThreadE+17)
       bf9c2120  00430000
       bf9c2124  00590000
       bf9c2128  b328b400  [anon:libc_malloc]
       bf9c212c  b32a1230  [anon:libc_malloc]
       bf9c2130  b32b00c0  [anon:libc_malloc]
       bf9c2134  b328b400  [anon:libc_malloc]
       bf9c2138  00000043
       bf9c213c  b19aa66e  /data/app/com.bugsnag.android.example-2/lib/x86/libentrypoint.so (crash_write_read_only+14)
  #00  bf9c2140  bf9c2200
       bf9c2144  b19aa650  /data/app/com.bugsnag.android.example-2/lib/x86/libentrypoint.so
       bf9c2148  bf9c2178
       bf9c214c  b19aa6cb  /data/app/com.bugsnag.android.example-2/lib/x86/libentrypoint.so (Java_com_bugsnag_android_example_ExampleActivity_doCrash+43)
  #01  bf9c2150  00430000
       bf9c2154  00000013
       bf9c2158  05980a40
       bf9c215c  bf9c219c
       bf9c2160  b32a1230  [anon:libc_malloc]
       bf9c2164  0000000c
       bf9c2168  bf9c21cc
       bf9c216c  b2bc803f  /system/lib/libart.so (art_jni_dlsym_lookup_stub+15)
       bf9c2170  b328b400  [anon:libc_malloc]
       bf9c2174  0000000c
       bf9c2178  bf9c21cc
       bf9c217c  994aed3d  /data/app/com.bugsnag.android.example-2/oat/x86/base.odex
  #02  bf9c2180  b32a1230  [anon:libc_malloc]
       bf9c2184  bf9c219c
       bf9c2188  05980a40
       bf9c218c  00000001
       bf9c2190  b716ced4  /dev/ashmem/dalvik-LinearAlloc (deleted)
       bf9c2194  bf9c2c24
       bf9c2198  00000001
       bf9c219c  12c7b450  /dev/ashmem/dalvik-main space (deleted)
       bf9c21a0  00000006
       bf9c21a4  b31fbb74  /system/lib/libart.so
       bf9c21a8  bf9c2238
       bf9c21ac  b2f3a0a4  /system/lib/libart.so (_ZNK3art7OatFile8OatClass19GetOatMethodOffsetsEj+100)
       bf9c21b0  bf9c21cc
       bf9c21b4  99365d7e  /data/app/com.bugsnag.android.example-2/oat/x86/base.odex
       bf9c21b8  bf9c2338
       bf9c21bc  b2bc9263  /system/lib/libart.so (art_quick_invoke_stub+339)

Breaking it down, we’re given essential crash log information towards the top, such as the package name, and the process ID. Crucially because this is a native error and could be affected by the CPU architecture, the tombstone also contains the ABI of the device, which in this case is x86 as the crash was triggered on an emulator:

ABI: 'x86'
pid: 15300, tid: 15300, name: android.example  >>> com.bugsnag.android.example <<<

We're given information about the native error, which in this case was due to a SIGSEGV signal being raised. This contains the address where the fault was triggered, along with assembly instructions:

signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0xb19aa650
  eax b19aa650  ebx b19abfd8  ecx 00000005  edx b32a1230
  esi 99365d7e  edi bf9c2338
  xcs 00000073  xds 0000007b  xes 0000007b  xfs 0000003b  xss 0000007b
  eip b19aa688  ebp bf9c2148  esp bf9c2140  flags 00010296

The rest of the trace contains symbols which were being executed at the time of the crash, which can be symbolicated using debug information in your application’s shared object files, a feature that Bugsnag achieves automatically with a gradle plugin integration.

Next Steps

While getting crash logs of your Android device manually can be very useful in certain situations, it’s possible to automate collection of crash logs by installing a crash reporting SDK such as Bugsnag’s Android SDK in your application. Crash reporting SDKs automatically detect JVM crashes, NDK crashes, and ANRs, and automatically deliver a diagnostic report about the error to a web dashboard.

There are several advantages to collecting crash logs automatically. You might not always have access to a device, particularly if a crash has occurred on an end-user’s device, or if you are employing a 3rd party Quality Assurance firm, they may not have the necessary tools installed to obtain crash logs manually from a test device. Crash reporting SDKs offer automatic reporting of errors from your production application, so that you can gain an immediate insight into how many users are affected by a bug and determine how to fix it with diagnostic information.

A great advantage of crash reporting SDKs is that it’s possible to attach custom metadata to an error report. On-device crash logs are limited by the amount of information that the Android framework can collect, but there is no such limitation when it comes to 3rd party tools. For example, Bugsnag will automatically capture breadcrumbs on Android of lifecycle events and common system broadcasts, which can help track down those tricky bugs that are related to unexpected state in a lifecycle event.

Crash reporting services also offer powerful search and segmentation. Perhaps a bug is only occurring on certain OS versions, or a critical error with In-App-Purchases is affecting the conversion rate of your paid customers. By adding custom metadata to error reports, it’s possible to segment and search for the important bugs, so that you can prioritise what delivers value for your business, rather than wasting time manually collating crash logs from dozens of devices yourself.

Share