JENKINS-39150

Improve remoting channel diagnostics in Support Core

      Description

      To diagnose a Jenkins master that has developed a remoting-related problem (such as channel clogging), we want remoting to be able to provide detailed statistics for a channel, and the Support Core plugin to be able to pull this information into a support bundle.

          Activity

          kohsuke Kohsuke Kawaguchi added a comment -

          For context, this need came up while analyzing a situation that developed in a CloudBees customer's Jenkins instance.

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/hudson/remoting/Channel.java
          src/test/java/hudson/remoting/ChannelTest.java
          http://jenkins-ci.org/commit/remoting/522a022ae961d31b7b4346f90d04c3d08115e7f7
          Log:
          JENKINS-39150 expose diagnostics across all the channels

          To be used by support-core, we need to be able to enumerate all active
          channels. We do this via WeakHashMap so that references get
          automatically garbage collected.

          Unclosed channel will remain in memory forever, which also helps us find
          those leaks.
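
The registry idea in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual `hudson.remoting.Channel` code: `DemoChannel`, `ACTIVE`, and `activeChannels()` are hypothetical names chosen for the example. Each instance registers itself as a weak key, so an entry can vanish once nothing else strongly references the channel, while a channel that is still referenced (for example, one that was never closed) keeps showing up in the enumeration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical stand-in for a channel; NOT the real hudson.remoting.Channel. */
final class DemoChannel {
    // Weak keys: the map alone does not keep a channel alive. The value must not
    // reference the key strongly, so a shared Boolean is used as a placeholder.
    private static final Map<DemoChannel, Boolean> ACTIVE =
            Collections.synchronizedMap(new WeakHashMap<>());

    private final String name;

    DemoChannel(String name) {
        this.name = name;
        // Register last, once the object is fully initialized.
        ACTIVE.put(this, Boolean.TRUE);
    }

    String getName() {
        return name;
    }

    /** Returns a snapshot of every channel that is still strongly reachable. */
    static List<DemoChannel> activeChannels() {
        // Iterating a synchronizedMap view requires holding the map's lock.
        synchronized (ACTIVE) {
            return new ArrayList<>(ACTIVE.keySet());
        }
    }
}
```

A diagnostics dump could then iterate over `DemoChannel.activeChannels()` and print per-channel statistics. The snapshot holds strong references only for as long as the caller keeps the returned list.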

          kohsuke Kohsuke Kawaguchi added a comment -

          Fix ready

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/hudson/remoting/Channel.java
          src/main/java/hudson/remoting/Command.java
          src/test/java/hudson/remoting/ChannelTest.java
          http://jenkins-ci.org/commit/remoting/f9115111174fdac609ce16bb93fb1503b541a918
          Log:
          Merge pull request #122 from jenkinsci/JENKINS-39150

          JENKINS-39150 expose diagnostics across all the channels

          Compare: https://github.com/jenkinsci/remoting/compare/0c7df253043c...f9115111174f

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          pom.xml
          src/main/java/com/cloudbees/jenkins/support/impl/RemotingDiagnostics.java
          http://jenkins-ci.org/commit/support-core-plugin/d5b009c20f80839b49641d0419f6475e1cce513f
          Log:
          JENKINS-39150 report remoting diagnostics when it's available

          Since this plugin cannot assume the version of core, access the method
          in question via reflection (and also report the failure to find that
          method, so that we have evidence either way.)
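
The reflection approach described in the commit message above might look roughly like the sketch below. It is an illustration under stated assumptions, not the plugin's actual `RemotingDiagnostics` code: `OptionalDiagnostics`, `DemoTarget`, `report`, and `dump` are hypothetical names, and the target is assumed to be a public static method taking a `PrintWriter`. The point is that a missing class or method is written into the output rather than swallowed, so the bundle carries evidence either way.

```java
import java.io.PrintWriter;
import java.lang.reflect.Method;

final class OptionalDiagnostics {

    /** Hypothetical target, standing in for an API that may or may not exist at runtime. */
    public static final class DemoTarget {
        public static void dump(PrintWriter w) {
            w.println("channel stats");
        }
    }

    /**
     * Looks up a static method (assumed signature: methodName(PrintWriter)) by name
     * and invokes it, writing a note to the output if it cannot be found or fails.
     */
    static void report(String className, String methodName, PrintWriter out) {
        try {
            Class<?> type = Class.forName(className);
            Method m = type.getMethod(methodName, PrintWriter.class);
            m.invoke(null, out); // null receiver: the method is assumed to be static
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            // Older versions without the API end up here; record that fact.
            out.println("Diagnostics not available: " + e);
        } catch (ReflectiveOperationException e) {
            // The method exists but could not be called or threw an exception.
            out.println("Diagnostics call failed: " + e);
        }
        out.flush();
    }
}
```

Calling `OptionalDiagnostics.report(OptionalDiagnostics.DemoTarget.class.getName(), "dump", writer)` writes the diagnostics, while passing a method name that does not exist writes a "Diagnostics not available" line instead.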

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/com/cloudbees/jenkins/support/impl/RemotingDiagnostics.java
          http://jenkins-ci.org/commit/support-core-plugin/1f43ffb1e0cdf7de4ab1090aedf2b623f151e006
          Log:
          JENKINS-39150 report remoting diagnostics when it's available

          Since this plugin cannot assume the version of core, access the method
          in question via reflection (and also report the failure to find that
          method, so that we have evidence either way.)

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oleg Nenashev
          Path:
          pom.xml
          src/main/java/hudson/remoting/Channel.java
          http://jenkins-ci.org/commit/remoting/9dc931703eb89ceb7608b88c0dd04da36f528a3b
          Log:
          JENKINS-39150 - API stabilization && compliance with the compatibility policy (#125)

          • JENKINS-39150 - Restrict the newly introduced APIs in stable-2.x branch
          • JENKINS-39150 - Fix error processing in the diagnostics dump API
          • JENKINS-39150 - Fix Typos
          • JENKINS-39150 - Even more diagnostics according to comments from @stephenc
          • JENKINS-39150 - Statistic counters are long volatile now

          Addresses comments from @oleg-nenashev and @rsandell

          • JENKINS-39150 - Address comment from @oleg-nenashev about conflict with #109
          • JENKINS-39150 - Fix another typo

          oleg_nenashev Oleg Nenashev added a comment -

          Remoting fix has been integrated towards 2.62.3 and 3.1

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oleg Nenashev
          Path:
          pom.xml
          http://jenkins-ci.org/commit/jenkins/7a948d399585d201c4132597aed5723a495acf69
          Log:
          Update remoting to 2.31 in the Jenkins core. (#2628)

          The change introduces one serious bugfix (JENKINS-39596) and a bunch of various diagnostics improvements.

          Bugfixes:

          • JENKINS-39596 (https://issues.jenkins-ci.org/browse/JENKINS-39596) - Jenkins URL in `hudson.remoting.Engine` was always `null` since `3.0`. It was causing connection failures of Jenkins JNLP agents when using Java Web Start. (PR #131 (https://github.com/jenkinsci/remoting/pull/131))
          • JENKINS-39617 (https://issues.jenkins-ci.org/browse/JENKINS-39617) - `hudson.remoting.Engine` was failing to establish connection if one of the URLs parameter in parameters was malformed. (PR #131 (https://github.com/jenkinsci/remoting/pull/131))

          Improvements:

          • JENKINS-39150 (https://issues.jenkins-ci.org/browse/JENKINS-39150) - Add logic for dumping diagnostics across all the channels. (PR #122 (https://github.com/jenkinsci/remoting/pull/122), PR #125 (https://github.com/jenkinsci/remoting/pull/125))
          • JENKINS-39543 (https://issues.jenkins-ci.org/browse/JENKINS-39543) - Improve the caller/callee correlation diagnostics in thread dumps. (PR #119 (https://github.com/jenkinsci/remoting/pull/119))
          • JENKINS-39290 (https://issues.jenkins-ci.org/browse/JENKINS-39290) - Add the `org.jenkinsci.remoting.nio.NioChannelHub.disabled` flag for disabling NIO (mostly for debugging purposes). (PR #123 (https://github.com/jenkinsci/remoting/pull/123))
          • JENKINS-38692 (https://issues.jenkins-ci.org/browse/JENKINS-38692) - Add extra logging to help diagnosing `IOHub` Thread spikes. (PR #116 (https://github.com/jenkinsci/remoting/pull/116))
          • JENKINS-39289 (https://issues.jenkins-ci.org/browse/JENKINS-39289) - When a proxy fails, report what caused the channel to go down. (PR #128 (https://github.com/jenkinsci/remoting/pull/128))

          oleg_nenashev Oleg Nenashev added a comment -

          The fix on the remoting side has been integrated towards 2.31

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/com/cloudbees/jenkins/support/impl/RemotingDiagnostics.java
          http://jenkins-ci.org/commit/support-core-plugin/5f2f23de4e674a6ca22a1a0578b9c0ffa7a58c7a
          Log:
          Merge pull request #78 from jenkinsci/JENKINS-39150

          [FIXED JENKINS-39150] report remoting diagnostics when it's available

          Compare: https://github.com/jenkinsci/support-core-plugin/compare/c05e6e1026fa...5f2f23de4e67


            People

            Assignee:
            kohsuke Kohsuke Kawaguchi
            Reporter:
            kohsuke Kohsuke Kawaguchi
            Votes:
            0
            Watchers:
            3
