  1. Jenkins
  2. JENKINS-65018

syncing files at change: can fail silently

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Component/s: p4-plugin
    • Environment:
      P4 plugin version 1.11.0
      node OS Centos 7
      Jenkins version 2.257
      Description

      During a checkout perforce(…) step, something outside the Jenkins process ended the user session. It might have been a server failover, a human or script issuing a p4 logout, or a bug somewhere in the p4 stack. It appears to have happened immediately before the plugin issued the p4 sync to the desired checkout revision.  

      The result is that the crucial "P4 Task: syncing files at change: 831729" failed, but the plugin did not catch the error and therefore the step as a whole succeeded, leaving the build with an unexpected (old) revision. It would be better to fail. I think this is difficult to reproduce precisely, but getting the correct revision is clearly critical.  

      Here is some annotated detail from the checkout output.  

      The early checks for the remaining life of the login session are good:  

       

      Executor number at runtime: 0  
      
      (p4):cmd:... p4 login -s  
      
      p4 login -s  
      User p4builduser ticket expires in 11 hours 25 minutes.
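
The pre-flight check above can be sketched as a small parser for the status line. This is only an illustration, not the plugin's code: the function names and the one-hour threshold are hypothetical, and the pattern covers only the exact message format shown in this log. As this issue shows, a healthy ticket lifetime at check time does not guarantee that the session survives until the sync.

```python
import re
from datetime import timedelta
from typing import Optional

# Matches only the status-line format shown in the log above, e.g.
# "User p4builduser ticket expires in 11 hours 25 minutes."
_TICKET_RE = re.compile(r"ticket expires in (\d+) hours? (\d+) minutes?")


def ticket_time_left(login_status: str) -> Optional[timedelta]:
    """Return the remaining ticket lifetime, or None if the line is not recognised."""
    match = _TICKET_RE.search(login_status)
    if match is None:
        return None
    return timedelta(hours=int(match.group(1)), minutes=int(match.group(2)))


def ticket_is_fresh(login_status: str,
                    minimum: timedelta = timedelta(hours=1)) -> bool:
    """True only when a lifetime was parsed and it is at least `minimum` (hypothetical threshold)."""
    left = ticket_time_left(login_status)
    return left is not None and left >= minimum
```

Treating an unrecognised line as "not fresh" is a deliberate choice here: an error message in place of the status line should stop the build rather than pass the check.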
      

      There are a bunch of successful calls including for example getting details of changes: 

       

      P4 Task: reverting all pending and shelved revisions. 
      (p4):cmd:... p4 revert /home/build/jenkins/workspace/Full/... 
      /home/build/jenkins/workspace/Full/... - file(s) not opened on this client. 
      P4 Task: cleaning workspace to match have list. 
      (p4):cmd:... p4 reconcile -f -w /home/build/jenkins/workspace/Full/... 
      P4: saving built changes. 
      Found last change 831703 on syncID tw-jenkins-noarch-Full 
      (p4):cmd:... p4 login -s 
      p4 login -s
       
      User p4builduser ticket expires in 11 hours 58 minutes.  
      

      When the plugin reaches the sync command that should bring the workspace to the correct revision, the command fails with an access error. More importantly, the plugin does not notice the failure, so we are left with the checkout from the previous build, which has NOT been updated and is therefore wrong. 

       

      (p4):stop:9  
      
      duration: 1m 52s  
      P4 Task: syncing files at change: 831729  
      
      (p4):cmd:... p4 sync --parallel=threads=4,min=1,minsize=1024 -q /home/build/jenkins/works___  
      
      Perforce password (P4PASSWD) invalid or unset.  
       
      (p4):stop:10  
      
      duration: (458ms) 
      

       Now we have a checkout that does not match the other checkouts involved in a co-ordinated build. But the plugin continues, and later Perforce commands succeed. I am not sure whether the plugin logged back in, whether whatever caused the logout logged back in, or whether the message was misleading, but further calls worked, e.g. this changes command: 

       

       

      (p4):cmd:... p4 changes -l -m1 @831710  
      
      p4 changes -l -m1 @831710  
      
      Change 831710 on 2021/03/03 by XXXXXXX@XXXXXX 'commit message redacted'  
      
      (p4):stop:37  
      

       

       

      After this, the build continues but with the wrong revision checked out, and the checkout step is marked as successful.  
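
One way a build could guard against this independently of the plugin is to verify the workspace after the checkout step. The sketch below is an assumption-laden illustration, not plugin behaviour: it assumes the caller has captured two outputs in the "Change NNN on ..." format shown above, one from a query such as p4 changes -m1 <path>@<target> (the newest change in the view at the target) and one from p4 changes -m1 <path>#have (the newest change actually synced). The names verify_synced, parse_change and RevisionMismatch are hypothetical, and edge cases such as a newest change that only deletes files are not handled.

```python
import re

# Matches the first "Change <number> on ..." line of `p4 changes` output.
_CHANGE_RE = re.compile(r"^Change (\d+) on ", re.MULTILINE)


class RevisionMismatch(Exception):
    """Raised when the workspace is not at the change the build asked for."""


def parse_change(changes_output: str) -> int:
    """Extract the change number, failing loudly if none is present."""
    match = _CHANGE_RE.search(changes_output)
    if match is None:
        raise RevisionMismatch(f"no change number found in: {changes_output!r}")
    return int(match.group(1))


def verify_synced(target_output: str, have_output: str) -> int:
    """Compare the wanted change with the synced change; raise on any difference."""
    wanted = parse_change(target_output)
    actual = parse_change(have_output)
    if actual != wanted:
        raise RevisionMismatch(
            f"workspace is at change {actual}, expected {wanted}"
        )
    return actual
```

Because an error message such as the P4PASSWD line contains no change number, a failed query also raises instead of passing silently.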

          Activity

          callmewilko Mark Wilkinson created issue -
          callmewilko Mark Wilkinson made changes -
          Field Original Value New Value
          Attachment jenkins_65018_failed_sync.txt [ 54118 ]
          callmewilko Mark Wilkinson made changes -
          Attachment jenkins_65018_failed_sync.txt [ 54118 ]
          callmewilko Mark Wilkinson made changes -
          Attachment jenkins_65018_failed_sync.txt [ 54119 ]
          p4karl Karl Wirth made changes -
          Assignee Karl Wirth [ p4karl ]
          p4karl Karl Wirth added a comment -

          Hi Mark Wilkinson - Which version of P4D are you using and how is your credential set up (ticket, password or ticket file)? I have seen a problem in the past where the 'p4 sync' command (at the command line) worked but one or more of the parallel transfer threads failed with a P4PASSWD error while the P4 command reported success. So this may be a P4Java bug but may also be P4D related.

          Also, do you have a way of reproducing this? If you do, can you try it with parallel sync disabled?

          p4karl Karl Wirth made changes -
          Labels P4_SUPPORT
          callmewilko Mark Wilkinson added a comment -

          Thanks for looking, and hello. 

          Server version: P4D/LINUX26X86_64/2019.1/1918131 (2020/02/12)  

          The credential is a username/password/port via the Jenkins vault. We also write a P4CONFIG file (using the plugin checkout values) for humans and our builds to be able to query Perforce.

          Unfortunately I don't think I can reproduce this, especially at that precise moment – I'm not aware of any way to hook in, and this is a regular running job that normally works.  I had a look back at build history and for the recent history this is a one off event.

          I probably should have titled this a little clearer, maybe I could change it, but for the sake of disambiguation, I raised this about catching the failure (and maybe just exiting), rather than trying to prevent the session loss in the first place. I hope that makes sense.

          In the meantime I will disable parallel threads, just in case it happens again.

          p4karl Karl Wirth added a comment -

          Mark Wilkinson - Thanks again for highlighting this. I confirm this is a known bug with P4D 2019.1 that caused the sync to report success even when it had failed. As you demonstrated, the password timeout has to occur at just the right time to trigger this. Upgrading to 2019.2 or later should fix the problem.

          At the moment the Jenkins code usually trusts the return values from the server. However, if we get more cases where they cannot be trusted, we can add text parsing to look for errors. The downside is that if the server team changes the messaging, we will still miss the failures.
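
The text-parsing idea could look roughly like the sketch below. It is not the plugin's implementation: find_sync_errors and KNOWN_ERROR_SNIPPETS are hypothetical names, and the only message listed is the one that appears in this issue's log. It also makes the downside concrete, since any message that is reworded or not in the list goes undetected.

```python
from typing import Iterable, List

# Only the message seen in this issue's log; a real list would need
# maintaining as server messages change (the downside noted above).
KNOWN_ERROR_SNIPPETS = (
    "Perforce password (P4PASSWD) invalid or unset.",
)


def find_sync_errors(output: str,
                     snippets: Iterable[str] = KNOWN_ERROR_SNIPPETS) -> List[str]:
    """Return every output line that contains one of the known error snippets."""
    known = tuple(snippets)
    return [
        line.strip()
        for line in output.splitlines()
        if any(snippet in line for snippet in known)
    ]
```

A caller would fail the step when the returned list is non-empty, even if the command itself reported success.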

           

           

          p4karl Karl Wirth made changes -
          Labels P4_SUPPORT P4_SUPPORT P4_VERIFY
          p4karl Karl Wirth added a comment -

          Caused by P4D 2019.1 behavior. A P4D upgrade should fix it. If this recurs on later versions, please reopen and we will investigate defensive code and raise it with the P4D server team again.

          p4karl Karl Wirth made changes -
          Resolution Done [ 10000 ]
          Status Open [ 1 ] Resolved [ 5 ]

            People

            Assignee:
            p4karl Karl Wirth
            Reporter:
            callmewilko Mark Wilkinson