  2. JENKINS-60213

P4_CLIENT environment variable non-reentrant

Details

    Description

      Version 1.10.6 of the p4-plugin causes issues when multiple jobs are run in parallel.

      The result is that the P4 environment variables created for one job can corrupt those of another job that is running at the same time.

      This bug was not present in 1.10.5.

      Steps to reproduce issue:

      • Create a new Jenkins test job with the following settings
        • General:
          • Execute concurrent builds if necessary
          • Use custom workspace
            • Directory
              /tmp/${NODE_NAME}-${JOB_NAME}-${EXECUTOR_NUMBER}
        • Source Code Management
          • Perforce Software:
            • Workspace behaviour: Template (view generated for each node)
              • Template workspace: Your test template
              • Workspace Name Format being
                 jenkins-${NODE_NAME}-${JOB_NAME}-${EXECUTOR_NUMBER}
      • Add a build step

      To see the issue:

      • Kick off two runs of the newly created job

      Result:

      • In 1.10.5 and earlier, both runs work and both set the expected P4 environment variables.
      • 1.10.6 gets confused. So in my case:

        Job1 Environment variables

        P4_CLIENT	_jenkins-master-ClientClashTest-0
        WORKSPACE	/tmp/master-ClientClashTest-0
        

        Job2 Environment variables (THIS IS THE ISSUE – Wrong P4_CLIENT value)

        P4_CLIENT	_jenkins-master-ClientClashTest-0
        WORKSPACE	/tmp/master-ClientClashTest-1
        

        If I reinstall the 1.10.5 (or earlier) plugin then for job 2 I would see:

        P4_CLIENT	_jenkins-master-ClientClashTest-1
        WORKSPACE	/tmp/master-ClientClashTest-1
        

          Activity

            p4karl Karl Wirth added a comment -

            Hi jbateman - Thanks for letting us know. I'll go back through the test harness to see if there's anything we have assumed that could be the difference.

            p4karl Karl Wirth added a comment -

            Verified: this is still sporadically broken in 1.10.7. P4_CLIENT is wrong, while the full expansion is correct for me and matches the rest of the job:

            P4: saving built changes.
            Found last change 2240 on syncID jenkins-NODE_NAME-JENKINS-60213-ConcurrentVariables-EXECUTOR_NUMBER
            ... p4 login -s 
            ... p4 client -o jenkins-master-JENKINS-60213-ConcurrentVariables-3 
            ... p4 info 
            ... p4 info 
            ... p4 client -o jenkins-master-JENKINS-60213-ConcurrentVariables-3 
            ... p4 login -s 
            ... p4 client -o jenkins-master-JENKINS-60213-ConcurrentVariables-3 
            ... p4 info 
            ... p4 info 
            ... p4 client -o jenkins-master-JENKINS-60213-ConcurrentVariables-3 
            ... done
            
            [JENKINS-60213-ConcurrentVariables@2] $ /bin/bash /tmp/jenkins6561639748343680378.sh
            ==========================================
            WORKSPACE:            /var/lib/jenkins/workspace/JENKINS-60213-ConcurrentVariables@2
            P4_CLIENT:            jenkins-master-JENKINS-60213-ConcurrentVariables-4
            EXECUTOR NUMBER:      3
            CLIENT_FROM_VARS:     jenkins-master-JENKINS-60213-ConcurrentVariables-3
            ==========================================
            Finished: SUCCESS
            

            Note for dev: In the output above, ${P4_CLIENT} is wrong but the expansion from the individual variables (CLIENT_FROM_VARS) is correct.
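The script that produced the report above is not included in this issue. A minimal sketch of a diagnostic build step that prints the same four fields might look like the following. It assumes the Workspace Name Format `jenkins-${NODE_NAME}-${JOB_NAME}-${EXECUTOR_NUMBER}`; the function names `client_from_vars` and `print_p4_env_report` are hypothetical.

```shell
#!/bin/bash
# Hypothetical diagnostic build step; NOT the script used in this issue.

# Rebuild the client name from the same variables used in the assumed
# Workspace Name Format: jenkins-${NODE_NAME}-${JOB_NAME}-${EXECUTOR_NUMBER}
client_from_vars() {
    printf 'jenkins-%s-%s-%s\n' "${NODE_NAME:-}" "${JOB_NAME:-}" "${EXECUTOR_NUMBER:-}"
}

# Print the values the build step actually sees, next to the rebuilt name.
print_p4_env_report() {
    echo "=========================================="
    printf '%-22s%s\n' "WORKSPACE:" "${WORKSPACE:-<unset>}"
    printf '%-22s%s\n' "P4_CLIENT:" "${P4_CLIENT:-<unset>}"
    printf '%-22s%s\n' "EXECUTOR NUMBER:" "${EXECUTOR_NUMBER:-<unset>}"
    printf '%-22s%s\n' "CLIENT_FROM_VARS:" "$(client_from_vars)"
    echo "=========================================="
}

print_p4_env_report
```

If P4_CLIENT and CLIENT_FROM_VARS differ, as in the report above, then either the variable exported by the plugin or one of the Jenkins variables seen by the step is stale; this report alone cannot say which.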

             

             

             

            msmeeth Matthew Smeeth added a comment - - edited

            Hi, I've written an automated test to reproduce this issue. Looking into it further, it looks to me like the P4_CLIENT variable is actually correct, and it's the EXECUTOR NUMBER that is wrong in your script.

             

            I can demonstrate this by putting an extra bit of debugging into the p4 plugin when running jobs. I have modified my build so that it outputs the executor number as the job is executing and this is what I get:
             

            Obtained concurrentBuildsTest/jenkinsfile from p4-brunoCred-//depot/... //jenkins-${NODE_NAME}${JOB_NAME}${EXECUTOR_NUMBER}/...
            Running in Durability level: MAX_SURVIVABILITY
            [Pipeline] Start of Pipeline
            [Pipeline] node
            Running on Jenkins in /var/jenkins_home/workspace/concurrentBuildsTestProject
            [Pipeline] {
            [Pipeline] stage
            [Pipeline] { (testStage)
            [Pipeline] script
            [Pipeline] {
            [Pipeline] dir
            Running in /var/jenkins_home/workspace/concurrentBuildsTestProject/test
            [Pipeline] {
            [Pipeline] checkout
            Executor no at runtime:0 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Extra debugging I've added to the p4 plugin
             ......
             ......
             ......
             + concurrentBuildsTest/script.sh
             ================================= 
             WORKSPACE: /var/jenkins_home/workspace/concurrentBuildsTestProject <<<<<<<<<<<<<<<<<<<<<<<<<<<<< CORRECT
             P4_CLIENT: jenkins-master-concurrentBuildsTestProject-0 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< CORRECT
             EXECUTOR NUMBER: 1 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Wrong executor number
            =================================
            ERROR - Bad Workspace 
            ERROR - Bad P4 Client
            ......
            ......
            Finished: FAILURE
            

             

            I have run this several times, and every time the job fails the result is the same: the workspace and client are correct, and the executor number output by the script is incorrect.

            I'm theorising that this is because when builds are running concurrently, in between the sync and the running of the script, the executor number gets overridden by the other jobs, hence it ends up incorrect when the script runs. Therefore I don't think the script we're using is suitable for testing this issue.
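If EXECUTOR_NUMBER cannot be trusted at script time, one option is a check that does not read it at all. The sketch below only verifies that a client name has the shape `jenkins-<node>-<job>-<digits>`. It assumes the hyphenated format seen in the P4_CLIENT values in this thread, and `client_matches_format` is a hypothetical helper, not part of the plugin or of either test script.

```shell
# Hypothetical check: does the client name look like
# jenkins-<node>-<job>-<digits>, without consulting EXECUTOR_NUMBER?
client_matches_format() {
    cmf_client=$1
    cmf_node=$2
    cmf_job=$3
    cmf_prefix="jenkins-${cmf_node}-${cmf_job}-"

    # The name must start with the expected prefix.
    case "$cmf_client" in
        "$cmf_prefix"*) cmf_suffix=${cmf_client#"$cmf_prefix"} ;;
        *) return 1 ;;
    esac

    # What remains must be a non-empty run of digits (the executor slot).
    case "$cmf_suffix" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

A build step could call it as `client_matches_format "$P4_CLIENT" "$NODE_NAME" "$JOB_NAME" || echo "ERROR - Bad P4 Client"`. This validates the shape of the name only; it cannot tell whether the numeric suffix belongs to this build's executor.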

             

            It's worth noting that I am using the latest changes, which contain https://github.com/jenkinsci/p4-plugin/pull/113. That pull request fixes an issue very similar to this one, where the workspace was being set wrong. I believe this is why you were getting a different p4 client and workspace.

             

            jbateman, when available, please can you retest your original scenario using a build that includes https://github.com/jenkinsci/p4-plugin/pull/113, as I suspect this may fix your issue.

             


            msmeeth Matthew Smeeth added a comment -

            There seem to be underlying issues with the executor number in Jenkins core:

            JENKINS-48882, JENKINS-24679, JENKINS-7357, JENKINS-4756.

            Our testing confirms that we calculate the correct executor number for the client; however, the environment variable EXECUTOR_NUMBER set by Jenkins (hudson.model.Computer) always seems to be incorrect with concurrent execution.

            Closing this issue as we have resolved the client name part of the issue. Please raise this against Jenkins core if you require the EXECUTOR_NUMBER to be fixed.
            tomreed81 Tom Reed added a comment - - edited

            I have been battling a similar issue.

            My situation is that the P4_CLIENT I build uses the variables "jenkins-${JOB_NAME}-${NODE_NAME}"

            The NODE_NAME is wrong a few times a night with 6 parallel builds on 6 different nodes. About 40-50 builds are queued a night, and the wrong P4_CLIENT is causing the wrong P4_CHANGELIST to be reported.
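One way to keep a wrong P4_CLIENT from silently producing a wrong P4_CHANGELIST is to fail the build early when the client name does not match what the step expects. The guard below is a sketch under two assumptions: the client format is `jenkins-${JOB_NAME}-${NODE_NAME}`, and the JOB_NAME and NODE_NAME visible to the build step are themselves correct, which this thread suggests may not always hold. `expected_client` and `check_client` are hypothetical names.

```shell
# Hypothetical guard; run it before any step that consumes P4_CHANGELIST.

# Build the expected client name from a job name and a node name.
expected_client() {
    printf 'jenkins-%s-%s\n' "$1" "$2"
}

# Return non-zero (and log to stderr) when P4_CLIENT is not the expected name.
check_client() {
    cc_expected=$(expected_client "${JOB_NAME:-}" "${NODE_NAME:-}")
    if [ "${P4_CLIENT:-}" != "$cc_expected" ]; then
        echo "ERROR: P4_CLIENT='${P4_CLIENT:-}' but expected '$cc_expected'" >&2
        return 1
    fi
    return 0
}
```

In a build step this could be used as `check_client || exit 1`, so that a mismatched build fails visibly instead of reporting a changelist from another node's client.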


            People

              msmeeth Matthew Smeeth
              jbateman James Bateman