JENKINS-63014

Agent connection error using nginx proxy with WebSocket

Details

    • Jenkins 2.248

    Description

      I'm using nginx in a proxy setup: one public ingress with a single upstream Jenkins instance. The attached docker-compose file can be used to recreate the exact environment. My test Docker installation is running on Windows 10. The domain "ingress" was added to the hosts file as "127.0.0.1 ingress" for convenience.

      Basic Jenkins setup steps were followed; the root URL is set to http://ingress/jenkins/

      The agent connects via JNLP with the WebSocket option enabled. The following error message is reported:

      C:\workdir\jenkins\http\agent-1>java -jar agent.jar -jnlpUrl http://ingress/jenkins/computer/agent-1/slave-agent.jnlp -secret 4068cc653d7d0ca16f72404ac6ad62d5fe19f5798f5b3f0807c6ecf50fba4353 -workDir "c:\workdir\jenkins\http\agent-1"
      Jul 08, 2020 8:33:25 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
      INFO: Using c:\workdir\jenkins\http\agent-1\remoting as a remoting work directory
      Jul 08, 2020 8:33:26 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
      INFO: Both error and output logs will be printed to c:\workdir\jenkins\http\agent-1\remoting
      JNLP file http://ingress/jenkins/computer/agent-1/slave-agent.jnlp?encrypt=true has invalid arguments: [4068cc653d7d0ca16f72404ac6ad62d5fe19f5798f5b3f0807c6ecf50fba4353, agent-1, -webSocket, -workDir, c:\workdir\jenkins\http\agent-1, -internalDir, remoting, -url, http://ingress/jenkins/, -url, http://jenkins:8080/jenkins/, -headless, -workDir, c:\workdir\jenkins\http\agent-1, -internalDir, remoting]
      Most likely a configuration error in the master
      -webSocket supports only a single -url
      

      The parameter list contains two -url arguments, which is rejected.
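
The rejection above can be illustrated with a small sketch. This is not the Remoting source; `check_url_args` is a hypothetical helper that only mirrors the rule stated in the error message, applied to an argument list shaped like the one in the log (the secret is replaced by a placeholder).

```python
def check_url_args(args):
    """Collect the values following each -url flag.

    Raises ValueError when -webSocket is combined with more than one -url,
    mirroring the message printed by the agent in the log above.
    """
    urls = [args[i + 1] for i, arg in enumerate(args[:-1]) if arg == "-url"]
    if "-webSocket" in args and len(urls) > 1:
        raise ValueError("-webSocket supports only a single -url")
    return urls


# Argument list shaped like the one reported in the log (secret shortened).
reported_args = [
    "SECRET", "agent-1", "-webSocket",
    "-url", "http://ingress/jenkins/",
    "-url", "http://jenkins:8080/jenkins/",
    "-headless",
]

try:
    check_url_args(reported_args)
except ValueError as error:
    print(error)
```

With a single `-url` entry, or without `-webSocket`, the same helper simply returns the collected URLs.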

      I cannot rule out an nginx configuration issue, but I followed all the guidelines.

       

          Activity

            jglick Jesse Glick added a comment -

            The non-trivial-lts-backporting label is misleading: the patch merely removes a couple of lines, so it should certainly be trivial to backport. Whether it should be backported is of course up for discussion, as would be true for any patch.

            I would not say that the deleted code was unused. If your reverse proxy configuration was wrong, or you were otherwise accessing Jenkins via a nonstandard URL for some reason, it would allow inbound TCP agents to connect using the nonstandard URL in case the standard URL were broken. The point is that this was a dubious decision when written, and became even less advisable after subsequent improvements in Jenkins to: guide you to define the root URL in the setup wizard; show an administrative monitor if you had not; and display an administrative monitor if it could be detected that the root URL was configured yet incorrect.

            multani Jonathan Ballet added a comment -

            I think this fix introduced a regression in our setup, and I'm not sure how to properly solve that.

            Our Jenkins master is currently exposed via 2 URLs:

            • One is "public", where our users connect and authenticate against an authentication proxy before reaching Jenkins itself.
            • The other is "internal", where only a subset of URLs is authorized, so that our dynamically created Jenkins agents can connect back to the master using JNLP.

            In both cases, Jenkins is behind an nginx proxy and is not reachable directly.

            Up until 2.248, the agents received both URLs for connecting back to the master:

            Picked up _JAVA_OPTIONS: -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main createEngine
            INFO: Setting up agent: jenkins-ops-docs-bpxmmgsmkz
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main$CuiListener <init>
            INFO: Jenkins agent is running in headless mode.
            Jul 26, 2020 1:03:39 PM hudson.remoting.Engine startEngine
            INFO: Using Remoting version: 4.3
            Jul 26, 2020 1:03:39 PM hudson.remoting.Engine startEngine
            WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main$CuiListener status
            INFO: Locating server among [https://EXTERNAL/jenkins/, http://INTERNAL/jenkins/]
            Jul 26, 2020 1:03:39 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
            INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
            Jul 26, 2020 1:03:39 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver isPortVisible
            WARNING: Connection refused (Connection refused)
            Jul 26, 2020 1:03:39 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
            INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main$CuiListener status
            INFO: Agent discovery successful
              Agent address: INTERNAL
              Agent port:    36921
              Identity:      58:e8:9a:bd:ce:d2:c3:7f:d4:33:e3:cc:35:7d:15:a4
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main$CuiListener status
            INFO: Handshaking
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main$CuiListener status
            INFO: Connecting to INTERNAL:36921
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main$CuiListener status
            INFO: Trying protocol: JNLP4-connect
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main$CuiListener status
            INFO: Remote identity confirmed: 58:e8:9a:bd:ce:d2:c3:7f:d4:33:e3:cc:35:7d:15:a4
            Jul 26, 2020 1:03:39 PM hudson.remoting.jnlp.Main$CuiListener status
            INFO: Connected
            

            The first URL (EXTERNAL) fails with WARNING: Connection refused (Connection refused), but the agent can then fall back to the second URL, which works.
            I believe the first one (EXTERNAL) doesn't work because the domain name resolves to an IP address where the firewall doesn't open arbitrary ports. The INTERNAL domain is a different address which doesn't have the same settings.

            After upgrading to 2.248, only the first address (EXTERNAL) is sent to the agent, so the connection always fails and there is no fallback anymore.
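
The pre-2.248 behaviour described above, where the agent tries each advertised URL in turn and uses the first one whose agent port accepts a connection, can be sketched roughly as follows. This is a simplified illustration, not the Remoting implementation: `tcp_probe` and `pick_first_reachable` are hypothetical names, the host names are placeholders, and a single fixed port is assumed for all candidates.

```python
import socket
from urllib.parse import urlsplit


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_first_reachable(candidate_urls, agent_port, probe=tcp_probe):
    """Return the host of the first URL whose agent port is reachable, else None."""
    for url in candidate_urls:
        host = urlsplit(url).hostname  # note: urlsplit lowercases the host name
        if host and probe(host, agent_port):
            return host
    return None


# Stand-in for the firewall situation in the comment: only the internal
# address accepts connections on the agent port.
def only_internal(host, port):
    return host == "internal.example"


candidates = ["https://external.example/jenkins/", "http://internal.example/jenkins/"]
print(pick_first_reachable(candidates, 36921, probe=only_internal))
```

With only the external URL in the candidate list, the same function returns `None`, which corresponds to the "no fallback" situation after the upgrade.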

            I'd be happy to update our configuration for the new behavior introduced with this change, but I'm not sure how I should proceed:

            • AFAIK, the "root URL" which is sent to the agent is the "Jenkins URL" as configured in the main configuration panel. I think it's also used for other purposes, such as building URLs which are sent externally (such as build results on GitHub, for example?), so we don't really want to change it, and we would like to keep EXTERNAL as the "official" URL used by our users.
            • BUT we would like our agents (only) to connect using the INTERNAL URL, and I'm not sure how (or if) that can be configured.

            In any case, I'm not sure this should be backported as is to an LTS version, as it may break some setups.

            oleg_nenashev Oleg Nenashev added a comment -

            According to JENKINS-63222, this feature is in fact used by at least one plugin as an additional option.

            integer Kanstantsin Shautsou added a comment - - edited

            Having multiple URLs is one of the things needed for an HA scenario, even if the master is a singleton. After the first instance fails, a second one could be run on the same JENKINS_HOME (hello CBE features?), and slaves should be able to reconnect to the master directly (no need to proxy via nginx or other internal isolated infrastructure). In general, the inability to control which URLs are sent is a problem, but since the "first external" URL is usually filtered at the firewall with a fast TCP reject, it wasn't a problem.

            jglick Jesse Glick added a comment -

            Kubernetes handles failover automatically (and CloudBees CI uses that ability), but there is no need for multiple URLs—the Service routes requests to the active pod. You can set up the same manually. You can use an alternate cluster-internal URL to bypass ingress, but this does not mean multiple -url arguments, just a different one, as described in JENKINS-63222 w.r.t. the kubernetes plugin.

            Again, if your architecture specifically requires multiple URLs with dynamic fallback, you can still do that with TCP agents, using the lower-level and more explicit launch mode. This change (JENKINS-63014) merely removes a heuristic from the higher-level *.jnlp launch mode. Many Jenkins web features will not work correctly if you actually access them via nonstandard URLs, but TCP agents are a special case in that the HTTP request is used solely to grab a host name and port from response headers.
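
As a rough illustration of the last point: for inbound TCP agents, the HTTP response only needs to yield a host name and a port. The sketch below assumes the header names `X-Jenkins-JNLP-Host` and `X-Jenkins-JNLP-Port`, which are not quoted in this thread; `endpoint_from_headers` is a hypothetical helper, not part of Jenkins or Remoting.

```python
def endpoint_from_headers(headers, fallback_host):
    """Extract (host, port) of the TCP agent listener from HTTP response headers.

    Header names are an assumption for this sketch. If no host header is
    present, the host of the URL that was queried (fallback_host) is used.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    port_value = lowered.get("x-jenkins-jnlp-port")
    if port_value is None:
        raise ValueError("no agent port advertised in response headers")
    host = lowered.get("x-jenkins-jnlp-host", fallback_host)
    return host, int(port_value)


# Values taken from the agent log earlier in the thread (address INTERNAL, port 36921).
print(endpoint_from_headers({"X-Jenkins-JNLP-Port": "36921"}, "INTERNAL"))
```

Because nothing else from the HTTP exchange is used, the URL the agent queries does not have to be the configured root URL in this launch mode.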

            People

              jglick Jesse Glick
              balazs_varnai Balazs Varnai