JENKINS-67097

Cannot cleanly delete namespace in which controller & agents run


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Component: kubernetes-plugin

      Suppose a Jenkins controller using the kubernetes plugin is installed in a namespace and is actively running some builds. If you then run kubectl delete ns xxx, the command hangs indefinitely. The controller pod shuts down and is removed, as expected, but the agent pods do not die: they retry every ten seconds to connect back to the controller, which no longer exists.

      Nov 10, 2021 2:53:39 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to controller.xxx.svc.cluster.local:50001
      Nov 10, 2021 2:53:39 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Trying protocol: JNLP4-connect
      Nov 10, 2021 2:53:39 AM org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader run
      INFO: Waiting for ProtocolStack to start.
      Nov 10, 2021 2:53:39 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Remote identity confirmed: ...
      Nov 10, 2021 2:53:40 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Nov 10, 2021 2:57:08 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      Nov 10, 2021 2:57:18 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again
      java.net.UnknownHostException: controller.xxx.svc.cluster.local
      	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
      	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
      	at java.net.Socket.connect(Socket.java:607)
      	at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
      	at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
      	at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
      	at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
      	at sun.net.www.http.HttpClient.New(HttpClient.java:339)
      	at sun.net.www.http.HttpClient.New(HttpClient.java:357)
      	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1223)
      	at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
      	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
      	at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
      	at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.waitForReady(JnlpAgentEndpointResolver.java:407)
      	at hudson.remoting.Engine.innerRun(Engine.java:830)
      	at hudson.remoting.Engine.run(Engine.java:540)
      
      Nov 10, 2021 2:57:28 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again
      java.net.UnknownHostException: controller.xxx.svc.cluster.local
      	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
      	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
      	at java.net.Socket.connect(Socket.java:607)
      	at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
      	at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
      	at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
      	at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
      	at sun.net.www.http.HttpClient.New(HttpClient.java:339)
      	at sun.net.www.http.HttpClient.New(HttpClient.java:357)
      	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1223)
      	at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
      	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
      	at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
      	at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.waitForReady(JnlpAgentEndpointResolver.java:407)
      	at hudson.remoting.Engine.innerRun(Engine.java:830)
      	at hudson.remoting.Engine.run(Engine.java:540)
      
      ...
      

      Did the agent pod receive SIGTERM but ignore it?
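
      One way to narrow this down might be to check whether the API server has already marked the agent pods for deletion. The sketch below is only a diagnostic aid, not part of the plugin: the function name terminating_pods is hypothetical, and it assumes its input is the JSON printed by kubectl get pods -n xxx -o json. It lists pods that carry metadata.deletionTimestamp, the field Kubernetes sets once deletion has been requested. A pod that shows up here while still in phase Running would suggest that termination was requested but the agent process has not exited.

```python
import json


def terminating_pods(pod_list_json):
    """List pods that are marked for deletion in a `kubectl get pods -o json` dump.

    Hypothetical helper for illustration only. It reads the standard
    PodList JSON and does not contact the cluster.
    """
    pods = json.loads(pod_list_json).get("items", [])
    marked = []
    for pod in pods:
        meta = pod.get("metadata", {})
        # deletionTimestamp is only present once deletion has been requested.
        if "deletionTimestamp" not in meta:
            continue
        marked.append({
            "name": meta.get("name"),
            "deletionTimestamp": meta["deletionTimestamp"],
            "gracePeriodSeconds": meta.get("deletionGracePeriodSeconds"),
            # A phase of "Running" here means the containers have not exited yet.
            "phase": pod.get("status", {}).get("phase"),
        })
    return marked
```

      To use it, save the output of kubectl get pods -n xxx -o json to a file and pass the file contents to terminating_pods. This only shows what the API server recorded; whether the JVM inside the container actually saw SIGTERM would still need to be confirmed on the agent side.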

            Assignee: Unassigned
            Reporter: Jesse Glick (jglick)