the EXECUTOR_NUMBER environment variable is not unique

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • ant-plugin
    • None
    • Platform: All, OS: Linux

      When two Ant tasks run in parallel, they sometimes get the same
      EXECUTOR_NUMBER. The EXECUTOR_NUMBER variable should be unique for each
      task. The EXECUTOR_NUMBER environment variable is documented here:

      http://wiki.jenkins-ci.org/display/JENKINS/Building+a+software+project

      Below are the steps to reproduce this issue.

      Note: This defect seems to be non-deterministic. I will list the steps that,
      when followed, will often produce the described defect.

      1. Create four new Hudson jobs.
      2. Configure each job. Under each job's "Build" section, add a build step
      and select "Invoke Ant".
      3. Click the "Advanced" button under the "Invoke Ant" section. Paste the
      following into the "Properties" text box.

      EXECUTOR_NUMBER=$EXECUTOR_NUMBER

      4. Configure the Ant build step of each job to run an Ant task that prints out
      the ${EXECUTOR_NUMBER} variable.
      5. Set up Hudson to be able to run four executors simultaneously.
      6. Launch each of the four jobs in parallel. Try to launch them at the same
      time if possible.
      7. Look at the console output of each of the jobs. On some runs, two separate
      jobs will use the same executor number.
      8. When step 7 has finished, if the defect has not manifested itself, repeat
      steps 6 and 7.

        1. task4.png (492 kB)
        2. task3.png (551 kB)
        3. task2.png (500 kB)
        4. task1.png (519 kB)
        5. fix-HUDSON-4756+7357.patch (5 kB)
        6. executor_list.png (22 kB)
        7. configuring_tasks.png (348 kB)
        8. build.xml (0.2 kB)

          [JENKINS-4756] the EXECUTOR_NUMBER environment variable is not unique

          Alan Harder added a comment -

          Yes, they should be unique within one node.
          I followed these steps and was not able to reproduce the issue.

          Alan Harder added a comment -

          What webserver/container did you use? Do you see the same behavior if you run "java -jar hudson.war"? If you can list the steps to reproduce this issue in a new Hudson install in even more detail, please let us know. Otherwise, I'm not sure what more we can do.

          mdemmitt added a comment -

          I will try to set up a bare-bones test that you can run to reproduce this issue. Expect to hear back with detailed instructions on how to reproduce it in 3 weeks or so.

          mdemmitt added a comment -

          Attaching a build.xml file that can be used by a Hudson task to echo out the EXECUTOR_NUMBER

          mdemmitt added a comment -

          Using Hudson 1.336, I've noticed that the EXECUTOR_NUMBER uniqueness degrades over time. When Hudson is first started, each task will have a unique EXECUTOR_NUMBER. However, at some point in time the EXECUTOR_NUMBER uniqueness ends and a restart of Hudson will fix the issue. I have not figured out what is causing EXECUTOR_NUMBER to be non-unique while Hudson is running, but this might give a clue to some of you folks that know the code well.

          I've attached a build.xml file (use the "default" target) that a Hudson task can use to echo out the EXECUTOR_NUMBER.

          TimoTM added a comment -

          sounds similar

          Alan Harder added a comment -

          See the linked duplicate issue for another possible way to reproduce the issue (adding+removing executors).

          janick reynders added a comment -

          I think the problem is that when an Executor is added to a Computer, its number is set to Computer.executors.size(). This is not correct: when the number of executors has decreased for some reason (executor crash, decrease by config, ...), it is not always the last Executor that is removed from the Computer. If the number of executors is then increased again (thread restart, increase by config), the new executor should not just take Computer.executors.size() as its number; it should take a number in the range [0,numExecutors) that is not yet taken by another executor.

          I added a patch that fixes this issue.
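
          The numbering problem described in this comment can be illustrated with a small standalone sketch. This is not the attached patch and not the real hudson.model.Computer code: the class ExecutorNumbering and the methods numberBySize and lowestFreeNumber are hypothetical names, and a plain list of integers stands in for the executors of one node. It only shows why taking the list size can collide with a number still in use, and how picking a free number in [0,numExecutors) avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified stand-in for one node's executors (hypothetical, not Jenkins code). */
public class ExecutorNumbering {

    /** The numbering described as buggy: a new executor takes the current list size. */
    public static int numberBySize(List<Integer> numbersInUse) {
        return numbersInUse.size();
    }

    /** The numbering described as the fix: the lowest number in [0, numExecutors) not in use. */
    public static int lowestFreeNumber(List<Integer> numbersInUse, int numExecutors) {
        Set<Integer> taken = new HashSet<>(numbersInUse);
        for (int n = 0; n < numExecutors; n++) {
            if (!taken.contains(n)) {
                return n;
            }
        }
        throw new IllegalStateException("all " + numExecutors + " executor numbers are in use");
    }

    public static void main(String[] args) {
        // Four executors numbered 0..3, then executor 1 goes away
        // (for example a crash or a configuration change).
        List<Integer> inUse = new ArrayList<>(Arrays.asList(0, 1, 2, 3));
        inUse.remove(Integer.valueOf(1)); // remaining numbers: 0, 2, 3

        // The list size is now 3, which is already held by another executor.
        System.out.println("by size:     " + numberBySize(inUse));
        // The only free slot in [0, 4) is 1, so no number is shared.
        System.out.println("lowest free: " + lowestFreeNumber(inUse, 4));
    }
}
```

          With executor 1 removed from a node of four, the size-based number is 3 (a duplicate), while the free-slot search returns 1. This also matches the observation above that numbers start out unique after a restart and only collide once executors have been removed and re-added.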

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          core/src/main/java/hudson/model/Computer.java
          test/src/test/java/hudson/model/ExecutorTest.java
          http://jenkins-ci.org/commit/core/f88a2ba3336714a7d03b1d7b0f5a32b3c106cffd
          Log:
          Merge branch 'JENKINS-4756'

          • JENKINS-4756:
            minor touch up
            FIXED JENKINS-4756 the executor number was not always unique. Computer.executors.size() does not provide a unique number within one Computer.

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          http://jenkins-ci.org/commit/core/85a8dd009a35377ca64d067d6cc4254e8ec29a16
          Log:
          [FIXED JENKINS-4756] recording the pull request #33 from Janick Reynders

            Assignee: Unassigned
            Reporter: mdemmitt