[JENKINS-39547] Corrupt node jar cache causes node to malfunction

      Remoting does not check whether the cached file's checksum matches before using it[1]. A user with build permission can exploit this to inject new classes, or to alter class implementations and resource content, by manipulating the cache and waiting for the agent to restart.

      This can be used for a denial-of-service attack against nodes: they will connect to Jenkins correctly but fail to fulfill remoting requests, throwing strange exceptions.

      A more sophisticated attack can trick the agent into executing custom code and thereby produce incorrect results, try to abuse existing {{SlaveToMasterCallable}}s, or generate malicious files/reports to be published on the master.

      There does not seem to be a way to load those classes into the master JVM or to send a new callable there if slave2master security is on, AFAIK.

      [1] https://github.com/jenkinsci/remoting/blob/0d8a2af7cffa6aa5a6f2675e810b286a984af04e/src/main/java/hudson/remoting/FileSystemJarCache.java#L65-70
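
For illustration, a minimal sketch of the kind of check the description says is missing. It is not the actual remoting code: the class name `CachedJarVerifier`, the method names, and the choice of a SHA-256 hex digest are assumptions made for this example only. The idea is to recompute the digest of the cached file and refuse (and discard) it when it differs from the checksum the caller expects.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of hudson.remoting.
public class CachedJarVerifier {

    /** Computes the SHA-256 digest of a file as a lowercase hex string. */
    static String sha256Hex(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // Reading through the stream feeds the digest.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Returns true only if the cached file exists and its content matches the
     * expected checksum. A mismatching file is deleted so that the caller can
     * fetch a fresh copy instead of loading corrupt or tampered classes.
     */
    static boolean isUsable(Path cached, String expectedHex) throws IOException {
        if (!Files.isRegularFile(cached)) {
            return false;
        }
        if (sha256Hex(cached).equalsIgnoreCase(expectedHex)) {
            return true;
        }
        Files.delete(cached);
        return false;
    }
}
```

A caller would ask `isUsable(cachedFile, expected)` before handing the file to a class loader and fall back to retrieving the jar again when it returns false. As the comments on this issue point out, a check like this guards against accidental corruption; it does not make an untrusted agent trustworthy.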

          Daniel Beck added a comment -

          AFAIU this is a problem iff we somehow verify the integrity of the agent jar. And I don't think this currently happens, otherwise s2m security wouldn't be such a big deal.

          stephenconnolly WDYT?

          Jesse Glick added a comment -

          Exactly, we do not trust agents generally: SECURITY-144. The only way to trust them is to force actual build steps to run inside a container or at least another user account.

          Stephen Connolly added a comment -

          Well, the only way to verify the integrity of the agent jar would be to send some code over to the agent jar and run a calculation... but if somebody can replace a cache jar, they can replace the agent jar, or even run the agent jar in a sandbox while intercepting commands.

          Yes, we can rely on things like code signing certificates to make it harder, but if you are running on an untrusted agent you have to assume that any response from that agent could have been faked.

          I would thus view this not as a security issue but as a user safety issue: protecting against corrupted jars in the cache rather than malicious jars in the cache (the agent jar would be a much more powerful, and easier to use, target for that).

          Oleg Nenashev added a comment -

          olivergondza Please submit a PR against stable-2.x branch. IMO it deserves backporting

          Oliver Gondža added a comment -

          oleg_nenashev, I prefer not to. The fix is in remoting and it is far from critical enough to squeeze it in this late in the cycle.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oliver Gondža
          Path:
          src/main/java/hudson/remoting/Checksum.java
          src/main/java/hudson/remoting/FileSystemJarCache.java
          src/main/java/hudson/remoting/JarLoaderImpl.java
          src/test/java/hudson/remoting/ChecksumTest.java
          src/test/java/hudson/remoting/FileSystemJarCacheTest.java
          http://jenkins-ci.org/commit/remoting/cdd5bce5725d338b129c63c22b8e6a132e77865c
          Log:
          JENKINS-39547 - Corrupt jar cache (#130)

          • Do not cache by URL
          • Do not perform Checksum caching in Checksum class

          We need a way to calculate reliable checksums and as implemented it causes
          temporary write-then-move files to have checksums cached never to be used.

          • [FIXED JENKINS-39547] Verify cached slave jars before using them.

          This moves checksum caching to FileSystemJarCache.

          • Address review comments

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oleg Nenashev
          Path:
          pom.xml
          http://jenkins-ci.org/commit/jenkins/ef588be4f264b5ba285110f472f031e2bd771c71
          Log:
          Update Jenkins remoting to 3.3 (#2671)

          • JENKINS-25218 - Hardening of FifoBuffer operation logic. The change improves the original fix in `remoting-2.54`.
          • JENKINS-39547 - Corrupt agent JAR cache causes agents to malfunction.

          Improvements:

          • JENKINS-40491 - Improve diagnostics of the preliminary FifoBuffer termination.
          • ProxyException now retains any suppressed exceptions.

          Oleg Nenashev added a comment -

          Released in jenkins-2.37, marking as LTS candidate

          Oliver Gondža added a comment -

          I did not have time to get the tests to work (https://github.com/jenkinsci/jenkins/pull/2620).
          Postponing for now.

          Oliver Gondža added a comment -

          Expedited to 2.32.2 together with security fixes.
