JENKINS-17566: Severe polling error when using script trigger

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Component: scripttrigger-plugin
    • Labels: None
    • Environment: Scientific Linux 5.5 and Ubuntu 10.10
      Arch: x86_64
      Jenkins 1.480.3 running standalone with Winstone

    Description

      I find the following in the script trigger logs for the attached test job configuration:
      [ERROR] - SEVERE - Polling error null

      The error disappears when the following tags are removed from the file TestJob/builds/2013-04-10_12-34-47/build.xml:
      <org.jenkinsci.plugins.envinject.EnvInjectPluginAction>
      <build class="build" reference="../../.."/>
      </org.jenkinsci.plugins.envinject.EnvInjectPluginAction>

        Activity

          aszostak Artur Szostak created issue -

          gbois Gregory Boissinot added a comment -

          Please let me know your ScriptTrigger plugin version and your EnvInject plugin version.
          Thanks
          aszostak Artur Szostak added a comment -

          Scripttrigger version is 0.22
          EnvInject version is 0.43

          eldada Eldad Assis added a comment - - edited

          @Artur - I also have this problem, but I don't have the lines you mentioned in the last build's build.xml.

          I found that manually triggering the build clears the scripttrigger error until the next time I restart or reload Jenkins.

          BTW - my environment is Jenkins 1.510 with ScriptTrigger 0.22, running on Windows Server 2008 (trigger on a CentOS 6 slave), plus another instance running on Windows Server 2003 (trigger on master).


          gbois Gregory Boissinot added a comment -

          @Artur
          I don't think it is tied directly to the EnvInject plugin.
          Could you try to reproduce the issue in a simple Jenkins instance?
          eldada Eldad Assis added a comment -

          @Artur,
          With Gregory's help, I was able to build a working version of the plugin.
          He will publish an official release, but you might want to try it yourself by downloading the sources, setting <xtrigger.lib.version>0.19</xtrigger.lib.version> in the pom.xml and building.
          It worked for me.

          gbois Gregory Boissinot made changes -
          Status: Open [ 1 ] -> In Progress [ 3 ]

          gbois Gregory Boissinot added a comment -

          Please try version 0.23.
          thetaphi Uwe Schindler added a comment - - edited

          No change here with 0.23:

          Polling started on Apr 21, 2013 2:42:07 PM
          Polling for the job Lucene-Solr-trunk-Windows
          Looking nodes where the poll can be run.
          Looking for a node to the restricted label master.
          Can't find any eligible nodes.
          Trying to poll on master node.
          
          Polling on master.
          The expected script execution code is 1
          [ERROR] - SEVERE - Polling error null
          

          It partially works. EnvInject is also installed on this Jenkins instance.

          One strange thing: The script trigger should run on node "master". The master node already has the label "master", so it is not obvious what the preceding "Can't find any eligible nodes" message means. The script is a shell script that uses "pgrep" to look for already running VirtualBox VMs.

          aszostak Artur Szostak added a comment -

          I have been trying to recreate the problem on a small Jenkins setup with little success. So I went back to my full setup to check whether the Java version has any effect, i.e. 1.7 versus 1.6, but the same problem persisted. I then upgraded the script trigger plugin to 0.23 and saw a change but not a fix. I had to upgrade the Jenkins.war to 1.512 with scripttrigger 0.23 before the problem went away.

          thetaphi Uwe Schindler added a comment -

          I have ScriptTrigger 0.23 running with Jenkins 1.512. Still the same issue. Sometimes it works (it seems to work after Jenkins starts up), but at some point it stops working and the above message appears in the log. Unfortunately there is nothing else in the main Jenkins logs, so I have no further information (such as a stack trace).


          gbois Gregory Boissinot added a comment -

          Please test with ScriptTrigger 0.24 and check the error stack trace (on master and slave).
          I tried to improve error logging.
          thetaphi Uwe Schindler added a comment -

          Will do once installed! Thanks, Uwe

          thetaphi Uwe Schindler added a comment - - edited

          The plugin was updated to 0.24 and Jenkins to 1.513. The message changed a little, but there is no additional information:

          [ScriptTrigger] - Poll with a shell or batch script
          
          Polling started on Apr 29, 2013 7:12:13 AM
          Polling for the job Lucene-Solr-trunk-Windows
          Looking nodes where the poll can be run.
          Looking for a node to the restricted label master.
          Can't find any eligible nodes.
          Trying to poll on master node.
          
          Polling on master.
          The expected script execution code is 1
          [ERROR] - Polling error...
          

          The main Jenkins log file contains no message at all about an error; it appears only in the "Script Trigger Log" of the job.

          thetaphi Uwe Schindler added a comment -

          One detail about this configuration:

          • At the time of the polling, the slave node is not yet running (it is a VirtualBox VM started only when jobs are running), so when scripttrigger triggers the build, the job should be queued and the slave node should start up because of this triggering. Maybe this is the reason for the "Can't find any eligible nodes" error. But scripttrigger is configured to run on the node with label "master", so the slave node should not be involved.
          • The shell script behind the script trigger is just a check ("pgrep ...") for whether other VirtualBox VMs are running at the same time, so scripttrigger will not trigger the job if another VM is running concurrently (because the server it runs on cannot handle too many VMs).

          The whole setup works sometimes, but in most cases you have to trigger the job manually.
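
The check described above can be sketched roughly as follows. This is only an illustration, not the reporter's script or the plugin's code: the process name "VBoxHeadless" is an assumption (the actual pgrep pattern is not given in the comment), and the class and method names are hypothetical. It relies on pgrep exiting with 1 when no process matches, which fits the "expected script execution code is 1" line in the polling log.

```java
import java.io.IOException;
import java.util.List;

public class VmPollSketch {

    // The build is triggered only when the script's exit code equals the expected code.
    public static boolean shouldTrigger(int actualExitCode, int expectedExitCode) {
        return actualExitCode == expectedExitCode;
    }

    // Runs a command on the local machine, discards its output, and returns its exit code.
    public static int runAndGetExitCode(List<String> command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // "VBoxHeadless" is an assumed process name, used here only for illustration.
        int exitCode = runAndGetExitCode(List.of("pgrep", "VBoxHeadless"));
        // pgrep exits with 1 when nothing matched, i.e. no other VM is running.
        System.out.println("trigger build: " + shouldTrigger(exitCode, 1));
    }
}
```

With an expected code of 1, a running VM (pgrep exit code 0) suppresses the trigger, and no running VM (exit code 1) lets the job start.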

          thetaphi Uwe Schindler added a comment -

          FYI: I changed the polling to use a Groovy "system script"; it seems to work better. With "system script" checked, it always runs on master. My checks can also be done with Groovy (it just needs to check whether other virtual machine slaves are running). I will report back if this works.

          Otherwise: There should be a setting for shell scripts to always run on master (like system script). I think the null pointer exception may be caused by the node not running when script trigger wants to run the shell script.

          thetaphi Uwe Schindler added a comment -

          The problem also occurs with Groovy triggers:

          [ScriptTrigger] - Poll with a Groovy script
          
          Polling started on Apr 29, 2013 1:08:13 PM
          Polling for the job Lucene-Solr-trunk-Windows
          Looking nodes where the poll can be run.
          Looking for a candidate node to run the poll.
          Trying to find an eligible node with the assigned project label windows.
          Can't find any eligible nodes.
          Trying to poll on master node.
          
          Polling on master.
          [ERROR] - Polling error...
          

          So it does not matter whether it is a shell script or a Groovy script.


          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Gregory Boissinot
          Path:
          src/main/java/org/jenkinsci/plugins/scripttrigger/ScriptTriggerExecutor.java
          src/main/java/org/jenkinsci/plugins/scripttrigger/groovy/GroovyScriptTrigger.java
          http://jenkins-ci.org/commit/scripttrigger-plugin/2a270bb56f2be489b2729229c7d66a41b2e5c14a
          Log:
          Add check if node is offline - Try to fix JENKINS-17566
          gbois Gregory Boissinot made changes -
          Resolution: (none) -> Fixed [ 1 ]
          Status: In Progress [ 3 ] -> Resolved [ 5 ]
          thetaphi Uwe Schindler added a comment -

          Still same problem with latest version.

          I have seen your changes, but I am not sure what those intend to do (I have no properties file set...).

          [ScriptTrigger] - Poll with a Groovy script
          
          Polling started on May 4, 2013 9:38:01 AM
          Polling for the job Lucene-Solr-trunk-Windows
          Looking nodes where the poll can be run.
          Looking for a candidate node to run the poll.
          Trying to find an eligible node with the assigned project label windows.
          Can't find any eligible nodes.
          Trying to poll on master node.
          
          Polling on master.
          [ERROR] - Polling error...
          

          Is there no way in scripttrigger to make the plugin always execute the polling script on the master (something like a checkbox: [X] always run script on master node)?

          thetaphi Uwe Schindler made changes -
          Resolution: Fixed [ 1 ] -> (none)
          Status: Resolved [ 5 ] -> Reopened [ 4 ]
          thetaphi Uwe Schindler added a comment -

          Hi,
          I think the main problem here is:

          • the plugin should have an option to run the script on the master node only
          • the script should be completely independent of the status of nodes: it should run on the master node with complete access to the job, and should trigger a build of the job based only on the script execution result on the Jenkins master

          If this is not an intended use case of this plugin, then it is not the right plugin for my requirement, and I will need to uninstall it and find another way to trigger jobs.


          gbois Gregory Boissinot added a comment -

          From 0.26, you can set the label 'master' to restrict polling to the master.
          In addition, for polling on a slave node, the node needs to be active to do the work.

          Please test version 0.26 and see whether it suits you.
          Thanks
          gbois Gregory Boissinot made changes -
          Resolution: (none) -> Fixed [ 1 ]
          Status: Reopened [ 4 ] -> Resolved [ 5 ]
          thetaphi Uwe Schindler added a comment -

          Hi,

          I checked the new version. The trigger now definitely runs on the master; the corresponding messages are printed. But it still ends with "[ERROR] - Polling error". Unfortunately it prints neither the exception nor the stack trace to the log.

          I then reviewed the xtrigger source code and noticed that the method that prints the stack trace is only printing the Exception message and the stack trace goes to System.err.
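
A minimal sketch of the logging behaviour asked for here, writing both the exception and its full stack trace to the job's polling log instead of System.err. This is not the xtrigger-lib code; the class and method names are hypothetical, and the log is modelled as a plain PrintStream. Logging only getMessage() for a NullPointerException created without a message yields "null", which would be consistent with the earlier "Polling error null" line.

```java
import java.io.PrintStream;
import java.io.PrintWriter;
import java.io.StringWriter;

public class PollingErrorLogSketch {

    // Renders the full stack trace of a throwable as a string.
    public static String stackTraceOf(Throwable error) {
        StringWriter buffer = new StringWriter();
        error.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    // Writes the exception (class name and message) and its stack trace to the given log.
    public static void logPollingError(PrintStream jobLog, Throwable error) {
        jobLog.println("[ERROR] - Polling error: " + error);
        jobLog.print(stackTraceOf(error));
    }

    public static void main(String[] args) {
        // System.out stands in for the job's polling log in this sketch.
        logPollingError(System.out, new NullPointerException());
    }
}
```

Using the throwable's toString() rather than getMessage() keeps the exception class name visible even when the message is null.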

          I then opened the output of the servlet container (Winstone) and found the root cause:

          java.lang.NullPointerException
                  at org.jenkinsci.lib.envinject.service.EnvVarsResolver.gatherEnvVarsNode(EnvVarsResolver.java:160)
                  at org.jenkinsci.lib.envinject.service.EnvVarsResolver.getDefaultEnvVarsJob(EnvVarsResolver.java:89)
                  at org.jenkinsci.lib.envinject.service.EnvVarsResolver.getEnVars(EnvVarsResolver.java:68)
                  at org.jenkinsci.lib.envinject.service.EnvVarsResolver.getPollingEnvVars(EnvVarsResolver.java:36)
                  at org.jenkinsci.plugins.scripttrigger.groovy.GroovyScriptTrigger.checkIfModified(GroovyScriptTrigger.java:140)
                  at org.jenkinsci.lib.xtrigger.AbstractTrigger$Runner.run(AbstractTrigger.java:198)
                  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                  at java.lang.Thread.run(Thread.java:722)
          

          So the following things have to be checked:

          • What causes the NPE (maybe the EnvVarsResolver also uses the Node, which is offline). I run the script as system script - why is any node environment involved?
          • Fix the logging in the XTrigger lib to be more informative on unexpected Exceptions.

          Uwe
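
One way to picture the suspected failure mode, an offline node whose environment is unavailable, is the null guard sketched below. It is purely illustrative and makes no claim about what EnvVarsResolver actually does or how it was later fixed: NodeInfo and gatherEnvVars are hypothetical stand-ins, and the assumption is that an offline node simply has no environment map.

```java
import java.util.HashMap;
import java.util.Map;

public class OfflineNodeEnvSketch {

    // Hypothetical stand-in for a build node; envVars is null while the node is offline.
    public static final class NodeInfo {
        final String name;
        final Map<String, String> envVars;

        public NodeInfo(String name, Map<String, String> envVars) {
            this.name = name;
            this.envVars = envVars;
        }
    }

    // Merges node variables over the defaults, skipping nodes that are missing or offline.
    public static Map<String, String> gatherEnvVars(NodeInfo node, Map<String, String> defaults) {
        Map<String, String> result = new HashMap<>(defaults);
        if (node == null || node.envVars == null) {
            // Offline or unknown node: fall back to the defaults instead of dereferencing null.
            return result;
        }
        result.putAll(node.envVars);
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = Map.of("JOB_NAME", "Lucene-Solr-trunk-Windows");
        NodeInfo offline = new NodeInfo("windows", null);
        System.out.println(gatherEnvVars(offline, defaults));
    }
}
```

Falling back to the job-level defaults lets polling continue on the master even while the job's assigned node is down.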

          thetaphi Uwe Schindler made changes -
          Resolution: Fixed [ 1 ] -> (none)
          Status: Resolved [ 5 ] -> Reopened [ 4 ]

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Gregory Boissinot
          Path:
          src/main/java/org/jenkinsci/lib/envinject/service/EnvVarsResolver.java
          http://jenkins-ci.org/commit/envinject-lib/c6ce1fe1d0f1d00eb55e9c349f591a2c4ef764f6
          Log:
          Fix JENKINS-17566

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Gregory Boissinot
          Path:
          pom.xml
          http://jenkins-ci.org/commit/scripttrigger-plugin/304d50587c698f157438ae5d497875b17eb4c03e
          Log:
          Upgrade to xtrigger-lib 0.23 for JENKINS-17566

          gbois Gregory Boissinot added a comment -

          Please test with scripttrigger 0.27.
          gbois Gregory Boissinot made changes -
          Resolution: (none) -> Fixed [ 1 ]
          Status: Reopened [ 4 ] -> Resolved [ 5 ]
          thetaphi Uwe Schindler added a comment -

          Hi,
          I built the release tag 0.27 from GitHub locally and uploaded the hpi file to Jenkins. It seems to work: The node was down and the trigger was able to start it. I will report back if anything else goes wrong.

          Thank you very much, Uwe

          P.S.: (I think next time you don't need to build a new release for every change; testing with a local build from GitHub is perfectly fine with me!)

          aszostak Artur Szostak added a comment -

          Script trigger 0.28 has been working fine for a while, so I am closing this ticket.

          aszostak Artur Szostak made changes -
          Status: Resolved [ 5 ] -> Closed [ 6 ]
          rtyler R. Tyler Croy made changes -
          Workflow: JNJira [ 148745 ] -> JNJira + In-Review [ 206574 ]

          People

            Assignee: gbois Gregory Boissinot
            Reporter: aszostak Artur Szostak
            Votes: 1
            Watchers: 5
