JENKINS-62146

Cache does not work with slave nodes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Component/s: adoptopenjdk-plugin
    • Labels:
      None
    • Environment:
      Jenkins 2.222.3
      multiple Jenkins slaves on Windows
      multiple Jenkins slaves on Linux
      Description

      We are testing our code against JDK 11. I set up a Tool to use JDK 11.

      We ran a build on Jenkins nodeA for the first time. It installed JDK 11 as expected.

      We then ran a build on Jenkins nodeB and got this error:

      Installing AdoptOpenJDK to /var/lib/jenkins/tools/hudson.model.JDK/JDK_11
      ERROR: Failed to download file:/var/lib/jenkins/caches/adoptopenjdk/LINUX/amd64/jdk-11.0.7+10.zip from agent; will retry from master
      Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from prd-cm-as-06.fx.lan/10.1.3.105:58658
      		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1788)
      		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
      		at hudson.remoting.Channel.call(Channel.java:998)
      		at hudson.FilePath.act(FilePath.java:1069)
      		at hudson.FilePath.act(FilePath.java:1058)
      		at hudson.FilePath.installIfNecessaryFrom(FilePath.java:914)
      		at hudson.FilePath.installIfNecessaryFrom(FilePath.java:850)
      		at io.jenkins.plugins.adoptopenjdk.AdoptOpenJDKInstaller.performInstallation(AdoptOpenJDKInstaller.java:121)
      		at hudson.tools.InstallerTranslator.getToolHome(InstallerTranslator.java:69)
      		at hudson.tools.ToolLocationNodeProperty.getToolHome(ToolLocationNodeProperty.java:109)
      		at hudson.tools.ToolInstallation.translateFor(ToolInstallation.java:206)
      		at hudson.model.JDK.forNode(JDK.java:148)
      		at hudson.model.JDK.forNode(JDK.java:60)
      		at org.jenkinsci.plugins.workflow.steps.ToolStep$Execution.run(ToolStep.java:152)
      		at org.jenkinsci.plugins.workflow.steps.ToolStep$Execution.run(ToolStep.java:133)
      		at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
      		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      		at java.lang.Thread.run(Thread.java:745)
      java.io.FileNotFoundException: /var/lib/jenkins/caches/adoptopenjdk/LINUX/amd64/jdk-11.0.7+10.zip (No such file or directory)
      	at java.io.FileInputStream.open0(Native Method)
      	at java.io.FileInputStream.open(FileInputStream.java:195)
      	at java.io.FileInputStream.<init>(FileInputStream.java:138)
      	at java.io.FileInputStream.<init>(FileInputStream.java:93)
      	at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
      	at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
      	at java.net.URL.openStream(URL.java:1045)
      	at hudson.FilePath$Unpack.invoke(FilePath.java:948)
      	at hudson.FilePath$Unpack.invoke(FilePath.java:942)
      	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3069)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:212)
      	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
      	at hudson.remoting.Request$2.run(Request.java:369)
      	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
      	at java.lang.Thread.run(Thread.java:748)

      The problem seems to be at line 121 of AdoptOpenJDKInstaller.java:

      expected.getParent().installIfNecessaryFrom(cache.toURI().toURL(), log, ..)
      

      cache is a file on the master node, and the installer performs a MasterToSlave callable that uses a file:// URL to fetch a resource from master to slave. This cannot work: the URL is opened on the slave, where the file does not exist. It should be something like http://jenkins-master/context/.... or else the file content should be streamed by the callable.
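
      To illustrate the point outside Jenkins, here is a minimal sketch in plain Java. It is not the plugin's code: the class name StreamCacheSketch, both helper methods and the file name jdk.zip are made up for illustration, and two temporary directories stand in for the master and slave filesystems. It shows that a file:// URL can only be opened where the file actually lives, while streaming the bytes from the side that owns the file works.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: two temp directories stand in for the master and slave filesystems.
public class StreamCacheSketch {

    // Runs on the side that owns the cache file and pushes its bytes into a
    // stream whose other end belongs to the slave. Returns the bytes copied.
    public static long streamCache(Path cacheOnMaster, OutputStream toSlave) throws IOException {
        return Files.copy(cacheOnMaster, toSlave);
    }

    // What the slave effectively does with a file:// URL: open it locally.
    public static boolean canOpenLocally(URL url) {
        try (InputStream in = url.openStream()) {
            return true;
        } catch (IOException e) {
            return false; // e.g. FileNotFoundException, as in the log above
        }
    }

    public static void main(String[] args) throws IOException {
        Path master = Files.createTempDirectory("master");
        Path slave = Files.createTempDirectory("slave");
        Path cache = master.resolve("jdk.zip"); // hypothetical file name
        Files.write(cache, "fake archive".getBytes(StandardCharsets.UTF_8));

        // The same file name looked up on the slave's own disk: nothing is there.
        URL seenFromSlave = slave.resolve("jdk.zip").toUri().toURL();
        System.out.println("file:// URL readable on slave: " + canOpenLocally(seenFromSlave));

        // Streaming instead: the master side reads, the slave side writes.
        Path target = slave.resolve("jdk.zip");
        try (OutputStream out = Files.newOutputStream(target)) {
            streamCache(cache, out);
        }
        System.out.println("bytes on slave after streaming: " + Files.size(target));
    }
}
```

      In a real fix the output stream would have to be one that the remoting channel can carry across to the slave; that part is left out here.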

      I also see a concurrency issue at line 137:

      Path tmp = new File( cache.getPath()+".tmp").toPath();
      

      Two different Jenkins slaves on 64-bit Linux that run a build for the first time will use the same tmp filename.

      I remember that Jenkins guards installers with a semaphore, but I do not remember whether that semaphore is per node or shared across all nodes. In the latter case the installer does not need to handle concurrency itself.
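
      If the installer does have to handle concurrency itself, one common approach is to give every download its own temporary file and then move it into place. The following is only a sketch under that assumption, not the plugin's code: the class UniqueTempDownload and the method storeInCache are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch only: each caller downloads into its own temp file, so two
// first-time builds never write to the same "<cache>.tmp" path.
public class UniqueTempDownload {

    public static Path storeInCache(InputStream download, Path cache) throws IOException {
        Path dir = cache.toAbsolutePath().getParent();
        Files.createDirectories(dir);
        // Unique name in the same directory as the cache file, so the final
        // move stays on one filesystem.
        Path tmp = Files.createTempFile(dir, cache.getFileName() + ".", ".tmp");
        try {
            // createTempFile already created the file, hence REPLACE_EXISTING.
            Files.copy(download, tmp, StandardCopyOption.REPLACE_EXISTING);
            // Whether an existing target is replaced by an atomic move is
            // implementation specific; on typical Linux filesystems it is.
            Files.move(tmp, cache, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // cleans up only if the move did not happen
        }
        return cache;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cache");
        Path cache = dir.resolve("jdk.zip"); // hypothetical file name
        storeInCache(new ByteArrayInputStream(new byte[] {1, 2, 3}), cache);
        System.out.println("cached bytes: " + Files.size(cache));
    }
}
```

      With this shape, readers of the cache either see no file or a complete one, never a half-written download from another node's build.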

       

      Setting the JVM property that disables the cache requires opening an IT ticket. Since the default Oracle JDK installer does not use a cache, and cached files must be cleaned up manually over an SSH session on the master node to free space, I think it would be better to change the default to disabled.

            Activity

            nfalco Nikolas Falco added a comment (edited)

            Another side effect is that when the installation fails, it retries the download twice, but before doing so it deletes the entire content of the "/var/lib/jenkins/tools/hudson.model.JDK" folder.

            This causes other builds to stop and forces the other JDKs to be downloaded again.

            mmchr Mads Mohr Christensen added a comment -

            I have tried to replicate the error and for me the cache seems to work. The error message in this issue looks like the same noise as reported in JENKINS-61913 and not actually a hard error.

            What I did to test the cache:

            1. Started local master
            2. Booted a centos/7 VM using Vagrant
            3. Set up a slave over SSH
            4. Started a job that used a JDK tool to set up the initial cache on master
            5. Disabled network on host computer
            6. Deleted tool installation on slave
            7. Started job using JDK tool to download cache from master

            This downloaded the JDK tool from the cache on the master for me. 

             

            nfalco Nikolas Falco added a comment -

            In our environment it always happens. The implementation is similar to (but still quite different from) the Oracle plugin. With the Oracle plugin the FileNotFound issue never happens. I will debug both to find the difference in behaviour.

            nfalco Nikolas Falco added a comment -

            In our case the slaves are independent and not managed by the master (we cannot have the master manage them).

            mmchr Mads Mohr Christensen added a comment -

            The issue reported in https://issues.jenkins-ci.org/browse/JENKINS-62146?focusedCommentId=390042&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-390042 now has its own Jira issue: JENKINS-63191

            The solution to that issue might solve this issue as well. Perhaps you could test it? It's https://github.com/jenkinsci/adoptopenjdk-plugin/pull/11

            nfalco Nikolas Falco added a comment -

            PR tested locally. It now works as expected: no more error log, and it does not wipe out the other installed JDK tools.

            mmchr Mads Mohr Christensen added a comment -

            Fix included in v1.3


              People

              Assignee:
              mmchr Mads Mohr Christensen
              Reporter:
              nfalco Nikolas Falco
              Votes:
              0
              Watchers:
              3
