JENKINS-28067

android emulator fails to start "error: device offline"


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None
    • Jenkins 1.609
      android-emulator-plugin 2.12

      More evidence in the long-running saga of the Android emulator failing to start properly. I have opened a new issue, but it is closely related to JENKINS-11952.

      Notes:

      • The information below was gathered using plugin 2.12, which failed often enough for me to collect some debug information, albeit slowly.
      • I'm pretty sure that what I've found also applies to 2.13 and onwards.
      • Given my previous investigations for JENKINS-12821, I'm personally not convinced that switching back to the localhost:nnnn method of connection is 100% safe, but that is a different matter.

      For those who do not want to wade through the logs below, the short story is this: I suspect that the android-emulator-plugin causes the emulator<->ADB connection to lock up when it connects to the emulator ADB port to check whether the emulator has started. It may be that the lockup would happen anyway, but I think we can remove the waitForSocket call (and the subsequent single adb connect, if using the emulator-nnnn connection) without any adverse effect on the startup procedure.
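The kind of liveness probe described above can be sketched as follows. This is not the plugin's code: `probe_port` is a hypothetical stand-in for what a waitForSocket-style check appears to do (open a TCP connection, send nothing, close at once), and the local listener merely plays the role of the emulator ADB port.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical stand-in for a waitForSocket-style check: connect,
    send nothing, close immediately.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Stand-in listener playing the role of the emulator ADB port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    port = listener.getsockname()[1]

    print("probe succeeded:", probe_port("127.0.0.1", port))

    # The listener still has to accept and service the connection,
    # even though the prober never sent a byte.
    conn, _ = listener.accept()
    conn.settimeout(1.0)
    print("bytes received before EOF:", len(conn.recv(64)))
    conn.close()
    listener.close()
```

The point of the sketch is that a successful probe is not free for the listening side: it is left with an accepted connection that carries no data and that it must notice and close itself.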

      I do not have a pull request yet, but I have a modified version of 2.12 running locally and will see whether my once- or twice-daily lockups continue.

      Only the brave need read below...

      Below is the TCP traffic to the emulator ADB port (5561 in this instance) on a startup that fails. I have a number of examples of this with the same symptoms.

      richm@bishop:~$ tcpdump -r /tmp/150423_android1.pcap port 5561
      reading from file /tmp/150423_android1.pcap, link-type LINUX_SLL (Linux cooked)
      => ADB server connects here (port 60975)
      10:44:18.487165 IP localhost.60975 > localhost.5561: Flags [S], seq 3774551783, win 43690, options [mss 65495,sackOK,TS val 791658429 ecr 0,nop,wscale 7], length 0
      10:44:18.487183 IP localhost.5561 > localhost.60975: Flags [R.], seq 0, ack 3774551784, win 0, length 0
      10:44:25.247907 IP localhost.32808 > localhost.5561: Flags [S], seq 1102610480, win 43690, options [mss 65495,sackOK,TS val 791660119 ecr 0,nop,wscale 7], length 0
      10:44:25.247950 IP localhost.5561 > localhost.32808: Flags [S.], seq 869467149, ack 1102610481, win 43690, options [mss 65495,sackOK,TS val 791660119 ecr 791660119,nop,wscale 7], length 0
      10:44:25.247987 IP localhost.32808 > localhost.5561: Flags [.], ack 1, win 342, options [nop,nop,TS val 791660119 ecr 791660119], length 0
      => ADB server sends start of CNXN handshake
      10:44:25.252799 IP localhost.32808 > localhost.5561: Flags [P.], seq 1:32, ack 1, win 342, options [nop,nop,TS val 791660121 ecr 791660119], length 31
      10:44:25.252886 IP localhost.5561 > localhost.32808: Flags [.], ack 32, win 342, options [nop,nop,TS val 791660121 ecr 791660121], length 0
      => Jenkins android plugin tests for emulator adb port alive
      => When things go wrong the emulator seems to ignore this for a while
      10:44:26.651163 IP localhost.32810 > localhost.5561: Flags [S], seq 571765708, win 43690, options [mss 65495,sackOK,TS val 791660470 ecr 0,nop,wscale 7], length 0
      10:44:26.651200 IP localhost.5561 > localhost.32810: Flags [S.], seq 2069788705, ack 571765709, win 43690, options [mss 65495,sackOK,TS val 791660470 ecr 791660470,nop,wscale 7], length 0
      10:44:26.651237 IP localhost.32810 > localhost.5561: Flags [.], ack 1, win 342, options [nop,nop,TS val 791660470 ecr 791660470], length 0
      => Jenkins android plugin immediately closes connection
      => Emulator looks to be sitting waiting for CNXN handshake
      10:44:26.651407 IP localhost.32810 > localhost.5561: Flags [F.], seq 1, ack 1, win 342, options [nop,nop,TS val 791660470 ecr 791660470], length 0
      10:44:26.652528 IP localhost.5561 > localhost.32810: Flags [.], ack 2, win 342, options [nop,nop,TS val 791660471 ecr 791660470], length 0
      10:45:41.012188 IP localhost.5561 > localhost.32808: Flags [F.], seq 1, ack 32, win 342, options [nop,nop,TS val 791679060 ecr 791660121], length 0
      => ADB server gives up waiting for CNXN handshake
      10:45:41.012442 IP localhost.32808 > localhost.5561: Flags [.], ack 2, win 342, options [nop,nop,TS val 791679061 ecr 791679060], length 0
      10:45:41.013081 IP localhost.32808 > localhost.5561: Flags [F.], seq 32, ack 2, win 342, options [nop,nop,TS val 791679061 ecr 791679060], length 0
      10:45:41.013127 IP localhost.5561 > localhost.32808: Flags [.], ack 33, win 342, options [nop,nop,TS val 791679061 ecr 791679061], length 0
      10:45:42.010478 IP localhost.5561 > localhost.32810: Flags [F.], seq 1, ack 2, win 342, options [nop,nop,TS val 791679310 ecr 791660470], length 0
      => Emulator finally closes the Jenkins android plugin connection
      10:45:42.010602 IP localhost.32810 > localhost.5561: Flags [R], seq 571765710, win 0, length 0
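To put numbers on how long each connection lingers, the trace can be reduced to a per-connection lifetime. The following is a minimal sketch, assuming the `tcpdump -r` text format shown above; `connection_lifetimes` and `SAMPLE` are illustrative names, and the sample holds only four packets of the Jenkins probe connection (client port 32810) plus one annotation line.

```python
import re

# Timestamp, then "IP host.srcport > host.dstport:" as printed by tcpdump.
LINE = re.compile(r"^\s*(\d\d):(\d\d):(\d\d\.\d+) IP \S+\.(\d+) > \S+\.(\d+):")


def connection_lifetimes(dump: str, server_port: int) -> dict[int, float]:
    """Seconds between the first and last packet seen for each client port."""
    first: dict[int, float] = {}
    last: dict[int, float] = {}
    for line in dump.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # annotation lines ("=> ...") and headers are skipped
        hh, mm, ss, src, dst = m.groups()
        src_port, dst_port = int(src), int(dst)
        if server_port not in (src_port, dst_port):
            continue
        client = dst_port if src_port == server_port else src_port
        t = int(hh) * 3600 + int(mm) * 60 + float(ss)
        first.setdefault(client, t)
        last[client] = t
    return {port: last[port] - first[port] for port in first}


SAMPLE = """\
10:44:26.651163 IP localhost.32810 > localhost.5561: Flags [S], seq 571765708, win 43690, options [mss 65495,sackOK,TS val 791660470 ecr 0,nop,wscale 7], length 0
10:44:26.651407 IP localhost.32810 > localhost.5561: Flags [F.], seq 1, ack 1, win 342, options [nop,nop,TS val 791660470 ecr 791660470], length 0
10:45:42.010478 IP localhost.5561 > localhost.32810: Flags [F.], seq 1, ack 2, win 342, options [nop,nop,TS val 791679310 ecr 791660470], length 0
=> Emulator finally closes the Jenkins android plugin connection
10:45:42.010602 IP localhost.32810 > localhost.5561: Flags [R], seq 571765710, win 0, length 0
"""

if __name__ == "__main__":
    for port, seconds in connection_lifetimes(SAMPLE, 5561).items():
        print(f"client port {port}: {seconds:.3f} s")
```

For the probe connection this comes to roughly 75 seconds between the SYN and the final RST, even though Jenkins sent its FIN within a millisecond of connecting.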
      

      At 10:44:25, netstat -nap shows the following connections to the emulator ADB port:

      Active Internet connections (servers and established)
      Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
      tcp        0      0 127.0.0.1:5561          0.0.0.0:*               LISTEN      1693/emulator64-arm
      tcp       31      0 127.0.0.1:5561          127.0.0.1:32808         ESTABLISHED 1693/emulator64-arm
      tcp        0      0 127.0.0.1:32808         127.0.0.1:5561          ESTABLISHED 1666/adb
      

      The second connection to port 5561 has not been made yet, but the 31 bytes of the connection handshake are clearly visible in the Recv-Q column, waiting to be read from the TCP stream.
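One plausible reading of the 31-byte figure: an ADB message is a 24-byte header of six little-endian 32-bit words followed by its payload, and a host CNXN message with the identity string "host::" plus a terminating NUL has a 7-byte payload. The sketch below builds such a message. The payload was not captured in the trace, so the identity string, the version word and the maxdata value are assumptions based on the ADB protocol description as I understand it, not something confirmed here.

```python
import struct

A_CNXN = 0x4E584E43     # the ASCII bytes "CNXN" when packed little-endian
A_VERSION = 0x01000000  # protocol version (assumed for adb of this era)
MAX_PAYLOAD = 4096      # maxdata advertised by the host (assumed)


def build_cnxn(identity: bytes = b"host::\x00") -> bytes:
    """Build an ADB CONNECT message: 24-byte header plus identity payload."""
    checksum = sum(identity) & 0xFFFFFFFF  # plain byte sum, not a real CRC
    magic = A_CNXN ^ 0xFFFFFFFF
    header = struct.pack(
        "<6I", A_CNXN, A_VERSION, MAX_PAYLOAD, len(identity), checksum, magic
    )
    return header + identity


if __name__ == "__main__":
    packet = build_cnxn()
    print(len(packet), packet[:4])
```

Under these assumptions the message is 24 + 7 = 31 bytes long, which matches the unread Recv-Q count on the emulator side of the ADB server connection.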

      Two seconds later, at 10:44:27, Jenkins has executed waitForSocket and opened the short-lived connection to the emulator ADB port.

      The emulator side of this connection is in the CLOSE_WAIT state because the emulator has not yet handled the new connection, on which no data was ever sent. The Jenkins side is sitting in FIN_WAIT2 because Jenkins has closed the file descriptor associated with the socket, and the kernel is waiting for the emulator to do the same at its end.

      Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
      tcp        0      0 127.0.0.1:5561          0.0.0.0:*               LISTEN      1693/emulator64-arm
      tcp       31      0 127.0.0.1:5561          127.0.0.1:32808         ESTABLISHED 1693/emulator64-arm
      tcp        1      0 127.0.0.1:5561          127.0.0.1:32810         CLOSE_WAIT  1693/emulator64-arm
      tcp        0      0 127.0.0.1:32808         127.0.0.1:5561          ESTABLISHED 1666/adb
      tcp6       0      0 127.0.0.1:32810         127.0.0.1:5561          FIN_WAIT2   -
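Spotting this condition by eye in netstat output gets tedious across many runs. The following is a minimal sketch of a filter, assuming the `netstat -nap` column layout shown above; `parse_netstat` and `stuck_on_port` are illustrative names, and the rule "unread data or CLOSE_WAIT on the emulator port" is simply the symptom described in this report, not a general diagnostic.

```python
from typing import NamedTuple


class Conn(NamedTuple):
    recv_q: int
    local: str
    remote: str
    state: str
    program: str


def parse_netstat(text: str) -> list[Conn]:
    """Parse the TCP rows of `netstat -nap` output, skipping header lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        rows.append(Conn(int(fields[1]), fields[3], fields[4], fields[5], fields[6]))
    return rows


def stuck_on_port(rows: list[Conn], port: int) -> list[Conn]:
    """Rows whose local end is `port` with unread data or in CLOSE_WAIT."""
    suffix = f":{port}"
    return [
        r for r in rows
        if r.local.endswith(suffix)
        and r.state != "LISTEN"
        and (r.recv_q > 0 or r.state == "CLOSE_WAIT")
    ]


SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:5561          0.0.0.0:*               LISTEN      1693/emulator64-arm
tcp       31      0 127.0.0.1:5561          127.0.0.1:32808         ESTABLISHED 1693/emulator64-arm
tcp        1      0 127.0.0.1:5561          127.0.0.1:32810         CLOSE_WAIT  1693/emulator64-arm
tcp        0      0 127.0.0.1:32808         127.0.0.1:5561          ESTABLISHED 1666/adb
tcp6       0      0 127.0.0.1:32810         127.0.0.1:5561          FIN_WAIT2   -
"""

if __name__ == "__main__":
    for row in stuck_on_port(parse_netstat(SAMPLE), 5561):
        print(row.remote, row.state, row.recv_q)
```

On the listing above this picks out the two emulator-side sockets of interest: the ADB server connection with 31 unread bytes and the probe connection in CLOSE_WAIT.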
      

      The FIN_WAIT2 state hangs around until 10:45:23, as per normal TCP stack behaviour.

      At 10:45:40 we are left with the proper ADB server to emulator connection, with its 31 bytes of data still unread, and the rogue Jenkins connection in CLOSE_WAIT:

      Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
      tcp        0      0 127.0.0.1:5561          0.0.0.0:*               LISTEN      1693/emulator64-arm
      tcp       31      0 127.0.0.1:5561          127.0.0.1:32808         ESTABLISHED 1693/emulator64-arm
      tcp        1      0 127.0.0.1:5561          127.0.0.1:32810         CLOSE_WAIT  1693/emulator64-arm
      tcp        0      0 127.0.0.1:32808         127.0.0.1:5561          ESTABLISHED 1666/adb
      

      At 10:45:41 the ADB server connection has been closed and is in the TIME_WAIT state:

      Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
      tcp        0      0 127.0.0.1:5561          0.0.0.0:*               LISTEN      1693/emulator64-arm
      tcp        0      0 127.0.0.1:5561          127.0.0.1:32808         TIME_WAIT   -
      tcp        1      0 127.0.0.1:5561          127.0.0.1:32810         CLOSE_WAIT  1693/emulator64-arm
      

      Then, one second later at 10:45:42, the rogue connection has finally been closed and no longer appears in the listing, but we are left with no connection to the ADB server.

      Active Internet connections (servers and established)
      Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
      tcp        0      0 127.0.0.1:5561          0.0.0.0:*               LISTEN      1693/emulator64-arm
      tcp        0      0 127.0.0.1:5561          127.0.0.1:32808         TIME_WAIT   -
      

      In the working case, similar connections are made, but the Jenkins connection gets closed much earlier, within about 10 seconds. So it looks to me like the emulator ADB infrastructure gets clogged up. I have no evidence yet that the Jenkins connection is what causes that, but it seems a very likely candidate.

      I did a quick test with both the waitForSocket call and the single adb connect command removed. That did not break the android-emulator-plugin startup process, although the test covered only a couple of jobs. The log below shows that the device was offline at the start of the process (i.e. before the emulator had opened the ADB port), but after a few seconds it progressed to the standard wait for boot completion.

      [android] Using Android SDK: /opt/android/android-sdk-linux_x86
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb start-server
      * daemon not running. starting it now on port 6563 *
      * daemon started successfully *
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb start-server
      [android] Starting Android emulator
      $ /opt/android/android-sdk-linux_x86/tools/emulator -no-boot-anim -ports 5562,5563 -prop persist.sys.language=en -prop persist.sys.country=GB -avd hudson_en-GB_160_WVGA_android-7 -no-snapshot-load -no-snapshot-save -no-window
      Failed to Initialize backend EGL display
      Failed to create secure directory (/run/user/1000/pulse): Permission denied
      emulator: WARNING: Could not initialize OpenglES emulation, using software renderer.
      emulator: warning: opening audio output failed
      
      [android] Waiting for emulator to finish booting...
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 shell getprop dev.bootcomplete
      error: device offline
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 shell getprop dev.bootcomplete
      error: device offline
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 shell getprop dev.bootcomplete
      error: device offline
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 shell getprop dev.bootcomplete
      error: device offline
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 shell getprop dev.bootcomplete
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 shell getprop dev.bootcomplete
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 logcat -v time
      [android] Attempting to unlock emulator screen
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 shell input keyevent 82
      $ /opt/android/android-sdk-linux_x86/platform-tools/adb -s emulator-5562 shell input keyevent 4
      [android] Emulator is ready for use (took 77 seconds)
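The boot wait seen in the log above can be sketched as a simple polling loop. This is not the plugin's implementation: `adb_getprop` and `wait_for_boot` are hypothetical names, the attempt count and delay are arbitrary assumptions, and the loop assumes `dev.bootcomplete` reads "1" once the emulator has booted. Any other reply, including "error: device offline" and empty output, is treated as "not ready yet", so no separate socket probe is needed.

```python
import subprocess
import time
from typing import Callable


def adb_getprop(serial: str, prop: str, adb: str = "adb") -> str:
    """Run `adb -s <serial> shell getprop <prop>`; return stdout plus stderr."""
    try:
        result = subprocess.run(
            [adb, "-s", serial, "shell", "getprop", prop],
            capture_output=True, text=True, timeout=30,
        )
    except subprocess.TimeoutExpired:
        return ""  # a hung adb call counts as "not ready"
    return (result.stdout + result.stderr).strip()


def wait_for_boot(
    serial: str,
    query: Callable[[str, str], str] = adb_getprop,
    attempts: int = 60,
    delay: float = 5.0,
) -> bool:
    """Poll dev.bootcomplete until it reads "1"; anything else means not ready."""
    for _ in range(attempts):
        if query(serial, "dev.bootcomplete") == "1":
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    # Requires adb on PATH and a running emulator with this serial.
    print("booted:", wait_for_boot("emulator-5562"))
```

The `query` parameter exists only so the loop can be exercised without a real emulator, for example by passing a function that replays the replies seen in the log.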
      

            oldelvet Richard Mortimer