  1. Jenkins
  2. JENKINS-28400

Not timing out when launch pid fails to appear


Details

    Description

      In ShellController.exitStatus, if more than a minute has elapsed since doLaunch and _pid is still 0, return -2.

      Also consider uncommenting ps.stdout(listener) under some conditions.
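
The check proposed in the description could look roughly like the following. This is a minimal sketch under stated assumptions, not the plugin's code: the class LaunchTimeoutSketch, its fields, and its helper methods are hypothetical. Only the one-minute limit, the -2 status, and the idea of a PID that stays at 0 come from the description. Returning null to mean "still running" is also an assumption.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the controller described above; not the plugin's actual class. */
class LaunchTimeoutSketch {
    /** Limit suggested in the description: one minute. */
    static final long LAUNCH_TIMEOUT_MS = TimeUnit.MINUTES.toMillis(1);
    /** Status suggested in the description for a launch that never produced a PID. */
    static final int LAUNCH_FAILED = -2;

    private final long launchedAtMs; // when doLaunch was called (assumed to be recorded)
    private int pid;                 // stays 0 until the wrapper reports a PID
    private Integer result;          // exit code once the process has finished, else null

    LaunchTimeoutSketch(long launchedAtMs) {
        this.launchedAtMs = launchedAtMs;
    }

    void pidAppeared(int pid) {
        this.pid = pid;
    }

    void finished(int code) {
        this.result = code;
    }

    /** Returns the exit code, -2 if the launch timed out, or null while still running. */
    Integer exitStatus(long nowMs) {
        if (result != null) {
            return result;
        }
        if (pid == 0 && nowMs - launchedAtMs > LAUNCH_TIMEOUT_MS) {
            return LAUNCH_FAILED;
        }
        return null;
    }
}
```

Passing the current time in as a parameter keeps the check easy to test; a real implementation would more likely read the clock itself.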


          Activity

            tfennelly Tom FENNELLY created issue -
            tfennelly Tom FENNELLY made changes -
            Field | Original Value | New Value
            Summary | Not timing when launch pid fails to appear | Not timing out when launch pid fails to appear
            jglick Jesse Glick made changes -
            Labels workflow
            oleg_nenashev Oleg Nenashev added a comment - - edited

            We ran into this issue in JENKINS-28821. The build may hang indefinitely, because durable-task's callback may be unable to write the pidfile and other metadata from within the docker run. In that case the server side will never get a termination event.

            Sample code:

            // The temporary variable is to ensure JENKINS_SERVER_COOKIE=durable-… does not appear even in argv[], lest it be confused with the environment.
            String cmd = String.format("echo $$ > '%s'; jsc=%s; %s=$jsc '%s' > '%s' 2>&1; echo $? > '%s'",
                    c.pidFile(ws),
                    cookieValue,
                    cookieVariable,
                    shf,
                    c.getLogFile(ws),
                    c.getResultFile(ws)
                    ).replace("$", "$$"); // escape against EnvVars jobEnv in LocalLauncher.launch
            

            The script is error-prone. If it is never launched, durable-task still thinks the task is running, so we need an additional flag as a workaround.
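
To make the failure mode easier to see, here is a small sketch that expands the format string from the sample with made-up arguments. The class WrapperCommandSketch, its method names, and all paths and cookie values are hypothetical and chosen only for illustration; the format string and the `$` doubling are copied from the sample.

```java
/** Illustrative only: reproduces the format string from the sample with made-up arguments. */
class WrapperCommandSketch {
    /** Builds the wrapper command before any escaping is applied. */
    static String wrapperCommand(String pidFile, String cookieValue, String cookieVariable,
                                 String script, String logFile, String resultFile) {
        return String.format("echo $$ > '%s'; jsc=%s; %s=$jsc '%s' > '%s' 2>&1; echo $? > '%s'",
                pidFile, cookieValue, cookieVariable, script, logFile, resultFile);
    }

    /** Doubles every '$', as the sample does before handing the command to the launcher. */
    static String escapeDollars(String cmd) {
        return cmd.replace("$", "$$");
    }

    public static void main(String[] args) {
        // All values below are placeholders, not real plugin paths.
        String cmd = wrapperCommand("/ws/.d/pid", "durable-abc", "JENKINS_SERVER_COOKIE",
                "/ws/.d/script.sh", "/ws/.d/log.txt", "/ws/.d/result.txt");
        System.out.println(cmd);
        System.out.println(escapeDollars(cmd));
    }
}
```

With these placeholder arguments the unescaped command is `echo $$ > '/ws/.d/pid'; jsc=durable-abc; JENKINS_SERVER_COOKIE=$jsc '/ws/.d/script.sh' > '/ws/.d/log.txt' 2>&1; echo $? > '/ws/.d/result.txt'`. The PID file, the log, and the result file are all written by the shell itself, so if the shell never starts, none of them appears and the server side has nothing to observe.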

            oleg_nenashev Oleg Nenashev made changes -
            Link This issue is related to JENKINS-28821 [ JENKINS-28821 ]
            jglick Jesse Glick made changes -
            Link This issue is related to JENKINS-31769 [ JENKINS-31769 ]
            jglick Jesse Glick made changes -
            Labels workflow diagnostics workflow
            jglick Jesse Glick made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            jglick Jesse Glick made changes -
            Remote Link This issue links to "PR 17 (Web Link)" [ 13905 ]
            jglick Jesse Glick made changes -
            Link This issue is related to JENKINS-26105 [ JENKINS-26105 ]

            Code changed in jenkins
            User: Jesse Glick
            Path:
            src/main/java/org/jenkinsci/plugins/durabletask/BourneShellScript.java
            http://jenkins-ci.org/commit/durable-task-plugin/27ed9917ef9a54b2dc6777aceae627384fcfeeb1
            Log:
            [FIXED JENKINS-28400] If the PID does not appear after 15s, assume the launch failed.

            scm_issue_link SCM/JIRA link daemon made changes -
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]

            Code changed in jenkins
            User: Jesse Glick
            Path:
            src/main/java/org/jenkinsci/plugins/durabletask/BourneShellScript.java
            http://jenkins-ci.org/commit/durable-task-plugin/45ce6e4ff069baa0b33073e716fdc759c2b8914e
            Log:
            JENKINS-28400 Display diagnostics the first time a process is launched in a given workspace.


            Code changed in jenkins
            User: Jesse Glick
            Path:
            src/main/java/org/jenkinsci/plugins/durabletask/BourneShellScript.java
            src/main/java/org/jenkinsci/plugins/durabletask/WindowsBatchScript.java
            http://jenkins-ci.org/commit/durable-task-plugin/fa1959dec3984127afe02ca9f65339500d2e0512
            Log:
            Merge pull request #17 from jglick/launch-failure-JENKINS-28400

            JENKINS-28400 Better handle failure to start wrapper sh script

            Compare: https://github.com/jenkinsci/durable-task-plugin/compare/7f14ad2fab13...fa1959dec398

            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 163239 ] JNJira + In-Review [ 197143 ]
            abayer Andrew Bayer made changes -
            Labels diagnostics workflow diagnostics pipeline workflow
            abayer Andrew Bayer made changes -
            Labels diagnostics pipeline workflow diagnostics pipeline
            csanchez Carlos Sanchez made changes -
            Link This issue is related to JENKINS-41810 [ JENKINS-41810 ]

            People

              jglick Jesse Glick
              tfennelly Tom FENNELLY
              Votes: 1
              Watchers: 4
