Jenkins / JENKINS-9104

Visual studio builds started by Jenkins fail with "Fatal error C1090" because mspdbsrv.exe gets killed

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: core
    • Labels: None
    • Environment: Windows XP, Windows 7 using MSBuild or devenv.exe to build MS Visual Studio projects

      I run into errors when using a customized build system which uses Visual Studio's devenv.exe under the hood to compile Visual Studio 2005 projects (with the VC++ compiler). When starting two parallel builds with Jenkins (on different code bases), the second job always fails with "Fatal error C1090: PDB API call failed, error code '23' : '(" at exactly the second the first job finishes processing. Running both jobs outside Jenkins does not produce the error.
      This has also been reported for builds executed by MSBuild on the Jenkins user mailing list [1].

      I analysed this issue thoroughly and traced the problem to the use of mspdbsrv.exe. This program is automatically spawned when building a Visual Studio project. All Visual Studio instances normally share one common pdb server, which shuts itself down after an idle period (the default is 10 minutes). "It ensures access to .pdb files is properly serialized in parallel builds when multiple instances of the compiler try to access the same .pdb file" [2].
      I assume that Jenkins cleans up its build environment when an automatically started job finishes (as described at http://wiki.jenkins-ci.org/display/JENKINS/Aborting+a+build). I checked mspdbsrv.exe with Process Explorer, and the process indeed has a JENKINS_COOKIE/HUDSON_COOKIE variable set in its environment when started through Jenkins. Killing mspdbsrv.exe while projects are still connected breaks their compilation.

      Jenkins must not kill mspdbsrv.exe if it is to build more than one Visual Studio project at the same time.


      [1] http://jenkins.361315.n4.nabble.com/MSBuild-fatal-errors-when-build-triggered-by-timer-td385181.html
      [2] http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/b1d1bceb-06b6-47ef-a0ea-23ea752e0c4f/
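
      For reference, a minimal setup that should reproduce this, based on the description above (solution names and paths are hypothetical): two Jenkins jobs allowed to run concurrently, each with a single "Execute Windows Batch Command" build step along these lines:

          rem Job A (Job B is identical but builds a different solution)
          call "%VS80COMNTOOLS%\vsvars32.bat"
          devenv.exe C:\work\SolutionA.sln /Build "Release|Win32"
          rem When Job A finishes first, Jenkins kills every process carrying
          rem Job A's cookie, including the shared mspdbsrv.exe, and Job B
          rem then dies with fatal error C1090 mid-compile.

      (VS80COMNTOOLS is the standard Visual Studio 2005 environment variable; adjust for other VS versions.)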


          Kevin Phillips added a comment - edited

          I have a few follow up questions:
          1. If I understand correctly, this "process tree killer" feature was pre-existing in earlier Jenkins releases, but only in the latest update was it "changed" to add recursive killing of processes, correct?

          2. That being the case, does setting "BUILD_ID=dontKillMe" disable termination of all processes, or just this new "recursive" behavior? If it disables all process terminations, I'd say this proposal is not a viable workaround, since it risks leaving other rogue processes orphaned on a build machine, which has many adverse side effects (which I'm guessing you already know, since I suspect this feature was implemented to resolve those exact problems).

          3. Won't setting "BUILD_ID=dontKillMe" affect other parts of the build? The BUILD_ID env var is used as a unique identifier throughout the job, after all. Changing it from the unique identifier it is meant to be to a statically defined character string seems fragile at best.


          Kevin Phillips added a comment - edited

          So far, based on the recent comment threads, my admittedly superficial understanding of the root cause, and some quick Googling, it seems there are only a few viable options to resolve this issue:

          1. Use the Python script written by an earlier commenter, which leverages the BUILD_ID env var to strategically control the lifetime of the pdbsrv process itself without affecting other parts of the build.

          • This seems like a pretty harsh workaround to what is obviously a problem introduced by changes made in the latest Jenkins LTS update.

          2. Roll back the version of this "process tree killer" used by Jenkins LTS to v1.16, before this new "recursive" behavior was added, according to an earlier comment.

          • I assume LTS releases are expected to maintain a certain level of stability and consistency in their behaviors. That being the case, this change obviously caused critical, debilitating side effects to Visual Studio users and thus should not have been included in an update release.

          3. Provide some kind of workaround within the "process tree killer" or the Jenkins core libraries to compensate for this newly discovered problem.

          • From what I gather from the earlier comments, this may be a non-trivial task. However, if this new recursive logic in the process tree killer is absolutely required in Jenkins LTS for some reason, I think this work must be done. Anything else (scripting, documentation notes, etc.) would just be trying to hide the fact that this is an underlying architectural problem - imo.

          4. Accept the fact that Visual Studio users will likely never use Jenkins versions that include this "new feature", forcing them to use versions of Jenkins that predate this change.

          • Currently this is the solution that my team and I have chosen to adopt until a more reasonable solution can be found.
          • Just to clarify our rationale for this decision: Using v1.532.x works just fine with Visual Studio. Upgrading to v1.554.x does not work - at all. Period. Working around the problem would require extra time (and, hence, money) on our part, for little to no benefit.


          Aside
          I probably should say that I truly believe the real root cause of this problem is an underlying architectural issue with Visual Studio and its use of this pdbsrv process in the newer compilers, but numerous forum posts and bug reports to Microsoft appear to fall on deaf ears (i.e., they claim it works this way by design). Given that this has been a problem in Visual Studio for several releases spread across many years, it's unlikely to change any time soon, so you may be forced to compensate for it here in your tool. To do otherwise will simply make it more difficult (and, by extension, less likely) for Visual Studio users to adopt or continue using your tool.


          Daniel Beck added a comment -

          Does this also happen with MSBuild, or only Devenv? Can you switch to the former? What about systems without Visual Studio installed, instead using only MSBuild/Windows SDK?

          (I'm not too familiar with Visual Studio projects beyond pressing an F-key to build them, so this might well be a stupid question)


          Kevin Phillips added a comment -

          From what I understand, this is a problem with the compiler, which I believe is the same compiler used under the hood by both MSBuild and devenv; however, I have not confirmed first-hand that the same problems arise in both situations. I'd be surprised if they didn't.

          As for building our projects without Visual Studio, with just MSBuild / the Windows SDK: we have so far been unable to do so. We have heavy dependencies on MFC, which hasn't, until recently, been available outside of Visual Studio. Plus we have had numerous technical issues migrating to the newer versions of the SDK / MSBuild that do include it. Regardless, I'd again be surprised if any of this made a difference, unless the compiler that ships with the SDK is fundamentally different from the one that ships with VS.

          If I can spare some time to confirm a few of these details I'll let you know, even if just for curiosity's sake.


          Steve Carter added a comment -

          Solution 5: Don't run MsBuild projects in parallel.

          Before I built the Python workaround, that's what I did, using a throttling plugin. It works fine: pdbsrv gets killed at the end of each build and started afresh by the Microsoft toolchain on the next job. But if you are trying to do continuous builds on development branches, this won't have the capacity to keep up.

          Solution 6: Set BUILD_ID to hide pdbsrv from the processtreekiller. Live with the chance that once in a while pdbsrv might time out mid-build.
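
          For anyone landing here first, the BUILD_ID override from the ProcessTreeKiller wiki page (linked later in this thread) is just a batch build step along these lines; this is a sketch, not a quote from the wiki, and note it hides everything the step spawns, not only pdbsrv (MySolution.sln is a placeholder):

              set BUILD_ID=dontKillMe
              rem Everything started from here on carries the overridden BUILD_ID,
              rem so the process tree killer will not match it at job end.
              msbuild MySolution.sln /m /p:Configuration=Release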


          Kevin Phillips added a comment -

          Solution 5: Don't run MsBuild projects in parallel.

          That may be fine for small projects but not for larger ones. For example, our main codebase is configured with about 40 jobs per configuration, building each "tier" or "module" separately and running jobs in parallel whenever possible. Doing so reduced our "clean" build times from 14 hours to 3. Numbers like that are hard to argue against.

          Are there other ways we could achieve similar results? Possibly, but they all require time and effort (aka: money) which we do not have.

          Solution 6: Set BUILD_ID to hide pdbsrv from the processtreekiller. Live with the chance that once in a while pdbsrv might time out mid-build.

          Could you clarify what you are referring to here? I assume you mean something other than using your python script since that was the very first "potential fix" I had mentioned above.

          It has been my experience that as long as you leave Visual Studio to its own internal mechanisms to manage pdbsrv, it works reliably for extended periods, keeping the service alive when needed and terminating it safely when it isn't, even if you run multiple builds in parallel via Jenkins. In fact, that is what we do now, and it never causes problems with our builds. This is saying something considering the size and scale of our build farm, with hundreds of jobs spread across nearly a dozen servers, all running 24/7!


          Daniel Beck added a comment -

          Maybe the following workaround would work: If mspdbsrv.exe runs as the user launching devenv, you could create a whole bunch of slaves all running on the same machine, but as different users, each having a single executor.


          Kevin Phillips added a comment -

          Seems a bit heavy. The extra overhead of running multiple agents alone seems like it would be significant, let alone the complexity of maintaining multiple user profiles with consistent configurations so the agents all behave the same, not to mention managing security and permissions. Given that each of our agents currently runs between 4 and 6 executors, this would increase our agent count by the same factor.

          Also, this would make managing overall load on a given system more complex. Consider jobs that are configured to use 100% of an agent's resources to prevent parallel build problems, as an example. These would need to be configured to work across agents somehow. I'm not even sure that is possible....


          Ben Rog-Wilhelm added a comment -

          I looked into the difficulty of adding a "process whitelist" for processes that must not be killed. It would require some changes to winp, but it's the only workable solution besides "disable process killing for this entire task", which can itself cause build failures.

          Unfortunately, because the necessary changes have to span two projects, it'll be a bit of a large task without cooperation from everyone involved.

          > It has been my experience that so long as you leave Visual Studio to it's own internal details to manage pdbsrv it works reliably for extended periods, keeping the service alive when needed and terminating it safely when it isn't, even if you run multiple builds in parallel via Jenkins. In fact that is what we do now and it never causes problems with our builds. This is saying something considering the size and scale of our build farm, with hundreds of jobs spread across nearly a dozen servers, all running 24/7!

          Unfortunately I've found this isn't the case - there seem to be situations where mspdbsrv times out mid-build and is restarted cleanly, and if that restart doesn't happen within a BUILD_ID replacement block, then when the build that restarted it finishes, Jenkins will happily kill mspdbsrv and break other builds.

          I suspect "running 24/7" is why you're not seeing this - it's happening somewhat frequently on a much smaller farm of mine with much fewer jobs.


          I suspect "running 24/7" is why you're not seeing this - it's happening somewhat frequently on a much smaller farm of mine with much fewer jobs.

          That is totally possible. Running so many jobs in parallel so often it is probably a rare condition that no jobs are running at all on any given server on our farm, and this may be preventing the service from timing out.

          Thanks for pointing that out.


          Tony Jomaa added a comment -

          I am very new to the Jenkins world, and I am running into this issue a lot. This would be a show stopper for us when it comes to adopting Jenkins for our build processes. Our builds get manually triggered by many users at random times; we could have 20 or more builds running at the same time, all in parallel. I tried the Python script given by Steve Carter in an Execute Shell command box, but I get an error that "sh" -ex was not found. What gives? I thought I was running a Python script, not Linux. Or do they both need to run on Linux?

          In short, if I do not get this resolved, we will have to go back to our previous way of building.
          Has anyone solved this issue yet?

          Thank you,


          Daniel Beck added a comment -

          Tony: Please address requests for assistance to the jenkinsci-users mailing list, or #jenkins IRC channel on Freenode.


          Shannon Kerr added a comment -

          I just ran into this one for the first time as far as I can tell. I did a quick look back and see no other instances and I don't recall seeing this before. For now, I'll take no action. danielbeck or anyone else, please let me know if I can provide you with any information that could help in resolving this. Build env where we saw this error: MS Win 7 x64, VS2010


          Shannon Kerr added a comment -

          I hit three more instances of this. Two yesterday and one other a week ago.


          Daniel Beck added a comment -

          How are you starting these builds? Batch? MSBuild plugin? What exact commands? If batch, did you try setting BUILD_ID as described on https://wiki.jenkins-ci.org/display/JENKINS/ProcessTreeKiller ?


          Shannon Kerr added a comment -

          Batch. In the Jenkins project, we use "Execute Windows Batch Command" to call a batch script that automates a bunch of pre-build work and ends up calling the builds via devenv.

          I did not try the BUILD_ID suggestion, as I saw that there were still issues mentioned in this ticket with this workaround. I was trying to hang in there until the final solution was provided, but the failures seem to be picking up for us lately. I guess we'll use this workaround for now.


          Shannon Kerr added a comment -

          I am trying the BUILD_ID suggestion now, but this is a hack (right?), not a final solution? The final solution is to have Jenkins not kill specified processes like mspdbsrv.exe. Whether that is in a whitelist managed by the user or hardcoded by Jenkins for now doesn't matter to me. Hopefully there will be a long-term fix to Jenkins for this.


          Del Hyman-Jones added a comment - edited

          Does anyone know if there is a global option to stop Jenkins from killing processes completely, instead of having to add the BUILD_ID override to every single job? I have tried adding this as an env variable at the node level, but it doesn't appear to give the desired results (the same PDB errors were still occurring); maybe I'm doing something wrong or misunderstanding how this works under the hood?

          We were running CruiseControl for years and never had this problem, but we did have issues where processes were not terminating properly and builds would run forever until someone intervened. Sometimes we still get this with Jenkins, so from my point of view one problem is better than two: I'd rather just have an option to tell Jenkins not to force-terminate anything, ever. If this cannot be done with the current version (ours is 1.566), can we at least add a checkbox that says "Do not auto-terminate processes" as an option in a future release and let the user decide?


          Shannon Kerr added a comment -

          How are you trying to run this, Del? At first I didn't have success getting it going, but now I seem to have it working fine. The BUILD_ID override does seem to be an effective solution (I do worry about the memory leak, though). I'm using the simple batch solution in comment 6, not the Python solution. You just have to make sure that the mspdbsrv executable is on your PATH and it should work fine. We use a batch wrapper, which is under version control, for our builds, and I added code that says "if this is a Jenkins build, execute this block". To decide if this is a Jenkins build, I just check whether JENKINS_URL is defined. Since I added that, we've not seen this issue return. Let me know if I can help in some way.
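
          A sketch of what such a wrapper block might look like; the mspdbsrv flags and the setlocal scoping here are assumptions on my part, and the actual script from comment 6 is not shown in this ticket:

              rem Inside the version-controlled batch wrapper
              if defined JENKINS_URL (
                  rem Running under Jenkins: pre-start mspdbsrv under a neutral
                  rem BUILD_ID so it does not inherit this job's cookie.
                  rem mspdbsrv.exe must be on PATH for this to work.
                  setlocal
                  set BUILD_ID=dontKillMe
                  start "" mspdbsrv.exe -start -shutdowntime -1
                  endlocal
              )
              rem ... the normal devenv/msbuild build commands follow ...

          Because the override is wrapped in setlocal/endlocal, the rest of the build still sees the real BUILD_ID, so anything that embeds it keeps working.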


          Del Hyman-Jones added a comment - edited

          I've added the block from above into the Jenkins command for the job, but yesterday I got this error while only one build was running, so it is likely a different issue.

          33>X509Helper.h(118): fatal error C1090: PDB API call failed, error code '23' : '(

          I've even tried setting BUILD_ID=dontKillMe under the node configuration in Environment variables, but I have been getting the original problem with that setting also. I even tried restarting the Jenkins client service on the build server, just in case that was needed for the env variable to be set for all child processes, but it doesn't seem to help. If this is working for you (@Shannon), I have to be doing something stupid.

          It seems that putting BUILD_ID under the node settings will be overridden when the build runs, which sets BUILD_ID back to the build timestamp. That rules out having a global setting allowing me to turn this off.


          Kevin Phillips added a comment -

          One thing I feel needs to be expressed here: the fact that this defect arose in an update to the LTS edition at all worries me. That it has been open and under active discussion for months now without any 'real' resolution - other than some hacks and workarounds - is even more concerning. According to the Jenkins website, LTS editions should "...change(s) less often and only for important bug fixes...". That policy seems to have been completely negated here. Given the severity / impact of this change, I would have expected whatever "improvement" caused this problem to have been reserved for the "latest" release, or at the very least reverted from the LTS edition after the problem was discovered.

          Perhaps someone with more knowledge about the cause of this error could elaborate on why neither of these approaches has been taken here.


          Shannon Kerr added a comment -

          @Del, yes, you cannot set BUILD_ID as a slave-level setting. It is set by Jenkins on a per-build basis. You'd either have to set it in the batch section of the job itself (we did this for our most frequently used builds) or, if you call a batch script or some other script, you can put it there.


          Christian Bremer added a comment -

          $200 is up for grabs for solving this issue at: https://freedomsponsors.org/issue/596/visual-studio-builds-started-by-jenkins-fail-with-fatal-error-c1090-because-mspdbsrvexe-gets-killed

          Daniel Weber added a comment -

          I implemented a whitelist solution, see pull request: https://github.com/jenkinsci/jenkins/pull/1562


          Steve Carter added a comment - edited

          Nice work Daniel. Will be interesting to see whether that solves the problem.

          For the good of the thread, I'm going to try to summarize this from the top down as there's a lot of talk on here that seems to miss the key points.

          1) BUILD_ID is an environment variable, set by Jenkins when it starts a job.

          2) Environment variables are inherited when processes start other processes, except where overwritten. For example, in a bash script you can write

          MYVAR=myvalue myscript.sh

          and myscript.sh will run with MYVAR set to myvalue.

          3) Therefore, all processes started by a Jenkins job have the same BUILD_ID. This is recursive.

          4) Jenkins, in order to catch rogue processes at job end (i.e. those that have broken ties with their parent process) scans the whole process space for those with the particular BUILD_ID in their environment, and kills them.

          This is correct and good behavior by Jenkins.

          5) When you start an MSBUILD job, pdbsrv is started, which catches requests from parallel compilations and serializes their writes to pdb files. When started from Jenkins, that pdbsrv process inherits BUILD_ID from the job.

          6) If you run two MSBUILD builds at once, then they share the same pdbsrv process.

          7) When the first job ends, it kills the pdbsrv process – because its BUILD_ID matches the first job's build id. The second job then fails.

          8) Solution 1: start pdbsrv with a BUILD_ID that doesn't match the build jobs. Then pdbsrv will not be killed at the end of the job.

          9) Solution 2: use Daniel's whitelist feature to not kill pdbsrv at the end of the job.

          Casual readers stop here.
          =========================

          10) The problem with Solutions 1 and 2 is this: pdbsrv still has a timeout, so you will get sporadic failures when the server goes away.

          11) My "heavyweight" python fix is trying to deal with that. Basically wrapping pdbsrv with a proper timeout and reference counting so that pdbsrv is present exactly when needed.

          12) pdbsrv's timeout doesn't get a new lease every time you use pdbsrv. I regard this as a bug in pdbsrv.

          13) You can't leave pdbsrv running forever because it (allegedly) has memory leaks. I regard this as a bug in pdbsrv.

          I really think rolling back Jenkins' ProcessTreeKiller is NOT a solution. The use of BUILD_ID brings the Jenkins machine under better control against rogue processes, and the workaround (for well-behaved servers) is easy: set BUILD_ID before starting the server, or use Daniel's whitelist.

          14) Solution 3: restart pdbsrv periodically, e.g. every day with a day-long timeout. That will mitigate the memory leaks. If you use some concurrency control, e.g. the Job Weight plugin, you can make sure this "kill and restart pdbsrv" job does not fire during a build. (A sketch of such a job follows at the end of this comment.)

          =========================

          Solution 0: Finally, it would be remiss of me not to mention again my python workaround, which has been happily keeping parallel builds working for 54 weeks now without trouble.
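
          Following up on Solution 3 (item 14 above), a sketch of what the daily recycle job's batch step could look like; the mspdbsrv flags are assumptions based on behavior reported in this thread, so verify them against your toolchain:

              rem Gate this job with your concurrency control so no build is running.
              rem Hide what we spawn from the process tree killer, kill any existing
              rem instance (ignoring the error if none is running), then restart it
              rem with a day-long shutdown timeout (in seconds).
              set BUILD_ID=dontKillMe
              taskkill /F /IM mspdbsrv.exe 2>nul
              start "" mspdbsrv.exe -start -shutdowntime 86400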


          Steve Carter added a comment -

          Penny drops - I've just seen how whitelisting differs from the BUILD_ID solution. Subtle, but it might just work...


          Kevin Phillips added a comment -

          Just a quick ping-back on this issue. Outstanding for something like 4 years, no comments for months now, and all for a debilitating, crippling problem in the system! I did notice the pull request Daniel Weber created, which does seem to have some more recent activity on it, but there is still no complete resolution to the issue, even in the latest LTS release.

          Are there plans for finishing this work any time soon? We are still stuck on an LTS version from a year or two ago because we cannot accept this bug into our production environment. If there is any way to get this fix in sooner rather than later, I know I'd appreciate it, and I'm sure many others would as well.


          Kevin Phillips added a comment -

          @steve carter
          First, let me thank you for summarizing the earlier comment threads. That does help bring everything into focus.

          4) Jenkins, in order to catch rogue processes at job end (i.e. those that have broken ties with their parent process) scans the whole process space for those with the particular BUILD_ID in their environment, and kills them. This is correct and good behavior by Jenkins.

          Agreed. This is a perfectly valid and useful enhancement for the majority of cases. However, the debilitating effect it has on this specific use case, combined with the fact that the change was included in an LTS release which is expected to be kept as stable as possible, is where I take issue. I see this problem as a bug - albeit a difficult-to-detect bug, and admittedly one really caused by some questionable behavior in the Microsoft build tools - but a bug nonetheless. Critical, production-halting bugs like this should be fixed immediately, or the change reverted until an appropriate fix can be made. Doing otherwise reduces users' confidence in the stability of the tool. There is a reason shops like ours choose LTS editions for production work - to avoid problems like this that may be found on the latest, cutting-edge versions.

          8) Solution 1: start pdbsrv with a BUILD_ID that doesn't match the build jobs. Then pdbsrv will not be killed at the end of the job.

          This should be called a workaround or hack rather than a solution. That point aside, this workaround won't work for our particular build environment. We use the BUILD_ID throughout our build processes to embed metadata in the binary files we generate; if we reset that environment variable as part of our build, this metadata will essentially be corrupted. Changing our tooling to use an alternative environment variable would require significant effort as well, having to be propagated out to dozens of products across several release branches each.

          9) Solution 2: use Daniel's whitelist feature to not kill pdbsrv at the end of the job.

          Based on my review of his pull request, Daniel's feature has not yet been completed, nor has it been included in any actual LTS release. I do believe this would be a reasonable and appropriate solution to this defect, though, so hopefully the work can be completed sooner rather than later.

          10) The problem with Solutions 1 and 2 are this: pdbsrv still has a timeout, so you will get sporadic failures when the server goes away.

          I know some earlier posters did indicate that this was an issue for them, but I have not been able to reproduce the problem as described. When a compile begins and this process is already running, the compiler makes use of the existing process; if it is not already running, the compiler starts it. I have never had a compile running and seen the mspdbsrv process terminate mid-compile without some other background process or system event occurring. Also, I work with many development teams, including many dozens of developers, and have never once had a report of this bug outside of the reproducible use cases I've stated before.

          Conversely, I have shown the problem is reproducible outside of Jenkins in very hard-to-detect ways, which I suspect may appear to some to be an intermittent timeout. For example, if you are logged in to a system which is performing a compile in a background process running under the same user profile as your local session, simply logging out of the system terminates the service. The reason is that the pdbsrv process is shared by the background process and your local user session, and when you log out of the local session all processes in that session are terminated, including pdbsrv. This was a very difficult use case to isolate, not at all obvious to users of the target systems, and it went undiagnosed at my place of work for months under the assumption that the failure was unpredictable and intermittent.

          I know that my argument doesn't prove that this particular problem could never happen, but I am extremely skeptical, to say the least. If someone does believe that this problem does in fact exist, I would greatly appreciate a detailed description of how to reproduce it. Maybe we're using a slightly older or slightly newer version of the compiler that doesn't exhibit the problem, or something like that. Either way, if these individuals were willing to compare notes, maybe we can help further isolate the root of this discrepancy.

          12) pdbsrv's timeout doesn't get a new lease every time you use pdbsrv. I regard this as a bug in pdbsrv.

          As I've stated in earlier posts, my team manages a build farm with close to a dozen agents now, running over 1000 build jobs, and never once has this error occurred on any of those systems, nor have any of the development teams we support reported this problem on any of their local development machines. I would have to say that if this were in fact a core issue with the Microsoft toolset, we would have discovered it by now. Again, if anyone can give me a reproducible use case that proves otherwise, I would be happy to hear from them. Maybe we are doing something they aren't, or vice versa.

          13) You can't leave pdbsrv running forever because it (allegedly) has memory leaks. I regard this as a bug in pdbsrv.

          Again, this is something we have not been able to reproduce. For example, I have watched some of our agents that are under the most considerable load with respect to build operations - machines which essentially run 24/7, compiling one or more projects in parallel nearly all the time - and these systems continue to run stably day after day, week after week, without requiring any outside intervention from me or my team. The pdbsrv process is nearly always active, its memory consumption increases and decreases with the load on the machines, and it never causes any fatal errors in our build processes.

          If anyone can provide specific, reproducible criteria for this problem, I would be interested to hear them. If there is something we have overlooked that may be causing us grief elsewhere, I would definitely want to know about it.

          I really think to roll back Jenkins' ProcessTreeKiller is NOT a solution.

          Agreed. I don't think 'just' rolling back this change is the best solution; I think fixing this bug is. However, in the absence of an appropriate fix for this bug, and given the severity of its impact, rolling back the change until an appropriate fix was in place would have been better than stranding users of your tool on an old, out-of-date release as we have been.

          Just my 2 cents.

          The use of BUILD_ID brings the Jenkins machine under better control against rogue processes...

          Totally agree that the improvement is well worth the effort. My concern is that the change includes a relatively significant bug.

          ...and the workaround (for well-behaved servers) is easy, set BUILD_ID before starting the server, or use Daniel's whitelist.

          Again, 'easy' is a relative term. As just mentioned, we would need to rework our build tools, roll that change out to many teams for many products, and backport those changes to many branches for this to work, after which we'd need to go through all 1000+ jobs on our farm and update them with the hack to the environment variable. Obviously significant effort in our case. Also, the whitelist solution has yet to be completed from what I can tell, so it is not a usable solution yet.

          14) Solution 3: start pdbsrv periodically, e.g. every day with a day-long timeout. That will mitigate against the memory leaks. If you use some concurrency control, e.g. Job Weight plugin, you can make sure this "kill and restart pdbsrv" job does not fire during a build.

          Again, just to be clear: this is a workaround, not a solution.

          This hack may work for us in the interim until an appropriate fix can be made. I will test it out as soon as I can and report back. In our case we'll likely just set up a scheduled task that runs on boot, forces the service to start, and keeps it running indefinitely, as we have seen no need for it ever to shut down.

          However, for those individuals who find that the service does need periodic resetting, a solution like this would likely be more complex. Assuming they too need to ensure the utmost stability of their build farm, as we do, they would need to ensure the pdbsrv service gets started before any compilation operation runs, including after reboots, power outages, crashes and the like. I don't believe there is any way to achieve this using a Jenkins operation, which means an external process would be needed, like the Scheduled Task idea I just mentioned. But then the external process would be running independently of the Jenkins agent, making it even more difficult to coordinate the two. For example, I suspect it would be difficult at best to make sure the scheduled task restarts the service at an opportune moment when no compilation operations are happening on the agent. Just something else for those users to keep in mind.
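
          For what it's worth, a sketch of the boot-time scheduled task described above; the install path and the mspdbsrv flags are assumptions, and /RU may need to be the build user rather than SYSTEM, since mspdbsrv instances appear to be per-user:

              rem Start mspdbsrv at boot and leave it running indefinitely
              rem (no shutdown timeout), outside any Jenkins job's cookie.
              schtasks /Create /TN "StartMspdbsrv" /SC ONSTART /RU SYSTEM ^
                  /TR "\"C:\Program Files (x86)\Microsoft Visual Studio 8\Common7\IDE\mspdbsrv.exe\" -start -shutdowntime -1"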

Kevin Phillips added a comment -

PS: Sorry for the rant. My team and I have been aggravated for some time now, hoping this bug would be fixed so we can move off the old version of Jenkins we're currently stuck on and thus be able to pick up some new bug fixes, both in the core as well as in the numerous plugins which only support newer versions. Hopefully I don't come across as overly adversarial.

          Lars Rosenboom added a comment - - edited

Maybe there is a way to shut down mspdbsrv.exe softly, so it stops only after all active requests (from parallel builds) are done. Then it should simply restart on the next request.

Another solution would be to allow the user to give a list of process names not to kill (or maybe hardcode not killing mspdbsrv.exe).

          Gavin Swanson added a comment -

Stopping after a timeout period once all active requests are done, and continuing to run when it gets a new request, is the way mspdbsrv runs normally when something doesn't go around killing it (a la Jenkins).

          I believe the correct solution is a whitelist.

Kevin Phillips added a comment -

Update
So, it turns out setting up some kind of background process to spawn a copy of the pdbsrv process isn't going to work as expected. From what I can tell, Windows seems to be able to tell when a process has been launched from a system service, and it will prevent processes spawned elsewhere from using it. The particulars of my test case are as follows:

1. Set up a small Python script that launches a copy of mspdbsrv.exe when called
2. Set up a scheduled task in Windows to run the Python script on boot
3. Reboot the agent - confirm the mspdbsrv.exe process is running
4. Trigger a compilation operation via the Jenkins dashboard
5. A new, secondary copy of mspdbsrv.exe is spawned to serve the Jenkins agent. This sub-process is then terminated as per usual once the Jenkins build is complete.

          I have confirmed that both the service that runs the Jenkins agent and the scheduled task use the same user profile and credentials and that both environments are using the same version of mspdbsrv.exe with the same set of command line parameters (ie: -start -spawn).
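
For reference, such a launcher boils down to something like the following (a PowerShell sketch of the Python launcher described in step 1; the install path is an assumption and should match the Visual Studio version in use):

# Stand-in for the small launcher script described in step 1, run by the
# scheduled task at boot. The install path is an assumption; adjust it to
# the Visual Studio version in use.
$mspdbsrv = 'C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\mspdbsrv.exe'
Start-Process -FilePath $mspdbsrv -ArgumentList '-start','-spawn'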

          Looks like I have to head back to the drawing board.

Kevin Phillips added a comment -

Update
As a quick sanity check I decided to throw together a quick ad-hoc test configuration whereby I override the BUILD_ID in the environment for one of my compilation jobs, just to see if one of the hacks proposed earlier will potentially work. Unfortunately it looks like this is not a robust solution either. I have confirmed that the solution does work in the trivial case, as in:

1. Set up a job with a single shell operation as a build step, configured as follows:
  • override the BUILD_ID env var with some arbitrary value
  • call into MSBuild to perform the compilation
2. Run a build of the given job
3. Upon completion, confirm that the mspdbsrv.exe process is still running - TEST SUCCESSFUL

However, unfortunately I've found another case where this solution doesn't work. Apparently, if you manually kill the build while it is running, Jenkins still somehow manages to locate the orphaned pdbsrv process and kill it, despite the changes described above. So, to put it more clearly:

1. Set up a job with a single shell operation as a build step, configured as follows:
  • override the BUILD_ID env var with some arbitrary value
  • call into MSBuild to perform the compilation
2. Run a build of the given job
3. While the compilation operation is running, and you have confirmed the mspdbsrv.exe process has been launched, manually force the running build to terminate (i.e. by clicking the X icon next to the running build on the Jenkins dashboard)
4. FAILURE - Jenkins still terminates the pdbsrv process

I have confirmed that the pdbsrv process does correctly inherit the overridden BUILD_ID, so Jenkins is somehow still able to locate and terminate the process in this case. I suspect what may be happening in my test env is that, at the point at which I manually kill the build, Jenkins is still running one or more Visual Studio operations which have a direct link to the mspdbsrv.exe process, and thus it detects and kills the process by recursively traversing the process tree, killing all running processes/threads that are tied to the agent at the time.

Either way, this example shows that even this 'hack' of overriding the BUILD_ID is fragile at best. It looks like we may have no choice but to wait for that 'whitelist' solution to be completed before we can consider upgrading our Jenkins instance.

Kevin Phillips added a comment -

Update
While reporting the issue in my last comment I had an idea for a slight variation of the configuration described there, which does appear to work in both use cases. The main modification I made was to separate the build into two separate build steps:

• the first is a simple Windows command line call which overrides BUILD_ID and then launches mspdbsrv.exe. Once this first step completes, Jenkins terminates the shell session that is linked to the pdbsrv process, thus decoupling it from the agent. Combined with the overridden BUILD_ID env var, Jenkins can no longer track the process (see the sketch after this list).
• the second is just another Windows shell session that then calls into msbuild to proceed with the build.
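
In build-step form, the pair might look something like this sketch (assuming mspdbsrv.exe is on the PATH; the solution name is a placeholder):

# Build step 1 (PowerShell sketch): launch mspdbsrv under a BUILD_ID that
# Jenkins will not match, so the process tree killer loses track of it.
$Env:BUILD_ID = 'DoNotKillMe'
Start-Process mspdbsrv -ArgumentList '-start','-spawn' -NoNewWindow

# Build step 2 (a separate build step): run the compilation as usual.
& msbuild .\MySolution.sln /m    # 'MySolution.sln' is a placeholder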

Theoretically even this solution could fall prey to the same problem I described in my previous comment; however, the execution time of this initial build step is negligible, so the window is highly unlikely to be hit in practice (i.e. a user would need to hit the kill button on the build in just that small fraction of a second it takes Jenkins to launch mspdbsrv.exe).

          I'm not sure how easy this hack will be for us to roll out into production at the scale we need, but just in case others find this tidbit of information helpful I thought I'd provide it here.

SCM/JIRA link daemon added a comment -

Code changed in jenkins
          User: Daniel Weber
          Path:
          core/src/main/java/hudson/util/ProcessKillingVeto.java
          core/src/main/java/hudson/util/ProcessTree.java
          test/src/test/java/hudson/util/ProcessTreeKillerTest.java
          http://jenkins-ci.org/commit/jenkins/a220431770cfe716e4f69fd76a4a59bbb27aa045
          Log:
          JENKINS-9104 Add ProcessKillingVeto extension point

          This allows extensions to veto killing of certain processes.

          Issue 9104 is not yet solved by this, it is only part of the solution. The
          rest should be taken care of in plugins.

SCM/JIRA link daemon added a comment -

Code changed in jenkins
          User: Daniel Beck
          Path:
          core/src/main/java/hudson/util/ProcessKillingVeto.java
          core/src/main/java/hudson/util/ProcessTree.java
          test/src/test/java/hudson/util/ProcessTreeKillerTest.java
          http://jenkins-ci.org/commit/jenkins/9a047acd4b5a4e805cee7260f3d091405dc7b930
          Log:
          Merge pull request #1684 from DanielWeber/JENKINS-9104

          JENKINS-9104 Add extension point that allows extensions to veto killing...

          Compare: https://github.com/jenkinsci/jenkins/compare/3c785d5af0ad...9a047acd4b5a

          dogfood added a comment -

          Integrated in jenkins_main_trunk #4205
          JENKINS-9104 Add ProcessKillingVeto extension point (Revision a220431770cfe716e4f69fd76a4a59bbb27aa045)

          Result = UNSTABLE
          daniel.weber.dev : a220431770cfe716e4f69fd76a4a59bbb27aa045
          Files :

          • core/src/main/java/hudson/util/ProcessKillingVeto.java
          • core/src/main/java/hudson/util/ProcessTree.java
          • test/src/test/java/hudson/util/ProcessTreeKillerTest.java

          MiFoe added a comment -

When you use the command-line switch /Z7, the debug info is stored in the object files and no server process is needed. This should also solve the problem.
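
For illustration, a /Z7 compile might look like the following sketch (assuming cl.exe is on the PATH; the source file name is a placeholder):

# /Z7 embeds C7-compatible debug info in the .obj file, so no mspdbsrv
# instance is needed at compile time; the PDB is produced at link time.
& cl /nologo /Z7 /c .\example.cpp    # 'example.cpp' is a placeholder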

          Gavin Swanson added a comment -

          How does the /Z7 flag affect performance? My impression is that the point of mspdbsrv.exe is to keep the data around for other builds to use, thus decreasing build times for subsequent builds.

          MiFoe added a comment -

It does not affect performance, but it does affect the size of the object files. With this option the debug information is stored in each object file instead of in one PDB. At link time, the debug information is written to a PDB file.

          Kevin Navero added a comment -

Just wanted to note that this also occurs on my slave nodes, and each slave node only has one executor. So at first glance, since I'm not running concurrent builds on any individual slave node, it seems like this error occurring on my slave nodes doesn't make any sense.

SCM/JIRA link daemon added a comment -

Code changed in jenkins
          User: Daniel Weber
          Path:
          pom.xml
          src/main/java/hudson/plugins/msbuild/MsBuildKillingVeto.java
          src/test/java/hudson/plugins/msbuild/MsBuildKillingVetoTest.java
          http://jenkins-ci.org/commit/msbuild-plugin/855a84479b64f32ceb30f73433858dfe2efb5e9f
          Log:
          [FIXED JENKINS-9104] Veto killing mspdbsrv.exe

          Making use of the newly introduced ProcessKillingVeto extension point,
          we now make sure that mspdbsrv.exe survives process killing during build
          cleanup.

          This requires a Jenkins version >= 1.625, the new extension point was
          added there. I marked the extension as optional, so that the msbuild
          plugin should still work with older Jenkins releases.

SCM/JIRA link daemon added a comment -

Code changed in jenkins
          User: Gregory Boissinot
          Path:
          pom.xml
          src/main/java/hudson/plugins/msbuild/MsBuildKillingVeto.java
          src/test/java/hudson/plugins/msbuild/MsBuildKillingVetoTest.java
          http://jenkins-ci.org/commit/msbuild-plugin/48084be76d434195c9e8b2ddc66f1fb5255a78de
          Log:
          Merge pull request #19 from DanielWeber/master

          [FIXED JENKINS-9104] Veto killing mspdbsrv.exe

          Compare: https://github.com/jenkinsci/msbuild-plugin/compare/98f71956d897...48084be76d43

SCM/JIRA link daemon added a comment -

Code changed in jenkins
          User: Gregory Boissinot
          Path:
          pom.xml
          src/main/java/hudson/plugins/msbuild/MsBuildKillingVeto.java
          src/test/java/hudson/plugins/msbuild/MsBuildKillingVetoTest.java
          http://jenkins-ci.org/commit/msbuild-plugin/b9a5b02117e0ee097aaf030ab2574daa3dcd217d
          Log:
          Revert "[FIXED JENKINS-9104] Veto killing mspdbsrv.exe"

SCM/JIRA link daemon added a comment -

Code changed in jenkins
          User: Gregory Boissinot
          Path:
          pom.xml
          src/main/java/hudson/plugins/msbuild/MsBuildKillingVeto.java
          src/test/java/hudson/plugins/msbuild/MsBuildKillingVetoTest.java
          http://jenkins-ci.org/commit/msbuild-plugin/031a05982b16e42cba5544c4ba9511515941c62f
          Log:
          Merge pull request #20 from jenkinsci/revert-19-master

          Revert "[FIXED JENKINS-9104] Veto killing mspdbsrv.exe"

          Compare: https://github.com/jenkinsci/msbuild-plugin/compare/48084be76d43...031a05982b16

          damian dixon added a comment - - edited

          > Revert "[FIXED JENKINS-9104] Veto killing mspdbsrv.exe"

I'm confused - why has the code fix been reverted?

The reason I am looking at this again is that the BUILD_ID workaround is no longer working for me.

Neither is the 1.25 msbuild plugin, which is meant to have the fix in it.

          I upgraded from 1.595 to 1.645.

          Daniel Beck added a comment -

damiandixon https://github.com/jenkinsci/msbuild-plugin/pull/20

          Daniel Weber added a comment -

damiandixon: My changes have been reverted by accident; the msbuild plugin release 1.25 does not contain the change required to fix this issue.
There is a new PR reverting the revert: https://github.com/jenkinsci/msbuild-plugin/pull/21

          Daniel Weber added a comment -

This is still not resolved. We need an update of the msbuild-plugin; see PR https://github.com/jenkinsci/msbuild-plugin/pull/21

          Daniel Beck added a comment -

danielweber This issue is filed against the core component, and that change was included a long time ago.

Antony Bartlett added a comment -

Is there a plan for Visual Studio builds not started by the msbuild-plugin, please?

I'm asking because our job configurations use an "Execute Windows batch command" build step rather than the "Build a Visual Studio project or solution using MSBuild" build step (and our batch process is non-trivial).

          Daniel Beck added a comment -

          akb The proposed MSBuild Plugin change only requires the plugin to be installed to be effective (assuming mspdbsrv.exe is what you don't want killed).

Antony Bartlett added a comment -

That's great - thank you very much for clarifying this, and for your efforts to fix the wider issue. I'm looking forward to having more projects and configurations built automatically in a timely fashion through judicious use of parallelization.

          Daniel Beck added a comment -

akb Forwarding the praise to my (first-)namesake danielweber, who did all the work.

          Daniel Weber added a comment -

          danielbeck: Well, the core stuff is done. But from a user's perspective the issue still exists.

          How can I get someone to merge the pending PR and create a release of the msbuild plugin?

          Pete W added a comment -

What's happened to this fix? It sounds like it's ready to go. How can we get a new release of the plugin?

Yannick Kamezac added a comment -

I tried parallel builds with MSBuild plugin 1.25 on top of Jenkins 1.580.1, but unfortunately I still get this error (fatal error C1090: PDB API call failed, error code '23'). Did I miss something?

Aleksander Stojanowski added a comment -

When will you publish a new version of the plugin with the fix? It's been a month since you released the version with(out) the fix...

          Jaime Ramos added a comment - - edited

I'm in need of a fix for this too; it's consistently failing numerous jobs for me. Is there an old version of Jenkins to revert to that avoids this particular problem? I'm willing to go that route as a workaround.
So far this has been the cause of a pretty bad first impression for a team I set up a CI build for, who had never seen Jenkins before.
I'm using VS2010 devenv.exe to build the solution files.

          Olexandr Maltsev added a comment - - edited

          Hello Jaime,
          I found a solution.
          I think it is a workaround, but it works for me.
I set an additional String parameter for every project.
Go to the Jenkins project and set "This build is parameterized", "Name" - "BUILD_ID", "Default Value" - "DoNotKillMe".

          Edgars Batna added a comment - - edited

Stumbled upon this issue immediately after trying parallel builds. It's been open for 5 years now, so I guess you can simply check for 'mspdbsrv.exe' and leave it alone? Please free us of our pain.

          Ilya I. added a comment -

          Somebody, publish the new version please. Apparently, the fix is already in the source code on GitHub. Can someone else (other than the maintainer) release the new version?

          James Telfer added a comment -

          FWIW, we implemented a workaround to this issue that doesn't involve wiping out the BUILD_ID variable (as we need to use it). Having a release with the Veto would be better, but this avoids random crashes in the meantime.

          Instead of allowing the MSBuild process to start the daemon itself, you cause the daemon to start using an environment that you choose. MSBuild then just uses the instance you started rather than starting its own.

The PowerShell we use is as follows. Use the PowerShell plugin to run this as a step before the MSBuild plugin step (it could be translated to Windows batch too if you like).

# https://wiki.jenkins-ci.org/display/JENKINS/ProcessTreeKiller

# Save the real BUILD_ID, hide this step from the process tree killer,
# then start the shared mspdbsrv instance ourselves.
$originalBuildID = $Env:BUILD_ID
$Env:BUILD_ID = "DoNotKillMe"
try
{
    Start-Process mspdbsrv -ArgumentList '-start','-spawn' -NoNewWindow
}
catch {}
$Env:BUILD_ID = $originalBuildID

          Daniel Beck added a comment -

          msbuild-1.26 should contain the fix. Can we finally resolve this, or is something missing?

          James Telfer added a comment -

          IMO, as soon as 1.26 is released.

          Daniel Beck added a comment -

          *sigh*

          1.26 is tagged in GitHub but no artifacts are uploaded. Looks like a failed release. Sorry about that.

Note that the MSBuild Plugin is almost certainly not currently maintained, as Gregory stopped working on his plugins, so if someone here wants to take over (danielweber perhaps?) that should be possible.

          James Telfer added a comment -

          danielbeck no need to apologise, I appreciate you looking at it.

Johannes Schmieder added a comment -

As a workaround I have created a Jenkins job that executes a Windows batch command on the Jenkins node where Visual Studio is installed.
The Jenkins job triggers the batch command once a day and has worked in my environment for several years now.
The batch command looks like this:

          set MSPDBSRV_EXE=mspdbsrv.exe
          set MSPDBSRV_PATH=C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE
          
          set PATH=%MSPDBSRV_PATH%;%PATH%
          set ORIG_BUILD_ID=%BUILD_ID%
          set BUILD_ID=DoNotKillMe
          
          echo stop mspdbsrv.exe
          %MSPDBSRV_EXE% -stop
          
          echo wait 7 sec
          %windir%\system32\ping.exe -n 7 localhost> nul
          
          echo restart mspdbsrv.exe with a shutdowntime of 25 hours
          start /b %MSPDBSRV_EXE% -start -spawn -shutdowntime 90000
          
          set BUILD_ID=%ORIG_BUILD_ID%
          set ORIG_BUILD_ID=
          exit 0
          

What the batch command does is:
• stop mspdbsrv.exe to free up resources
• start mspdbsrv.exe with BUILD_ID=DoNotKillMe and a shutdown time of 25 hours; this leaks the mspdbsrv process without it getting killed, and it runs for 25 hours so that other build jobs can use the already running process

What you may have to do is change the path to mspdbsrv -> set MSPDBSRV_PATH=C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE

          Michael Brock added a comment -

Updating the msbuild plugin won't work in our situation. We run into this issue, but we don't have the plugin installed. Rather, the issue arises for us in the Final Builder scripts we run via Jenkins that call msbuild.

          Daniel Beck added a comment -

Then install it. The MSBuild plugin will veto all mspdbsrv killing.

          Markus Winter added a comment - - edited

Set the environment variable
_MSPDBSRV_ENDPOINT_=$JENKINS_COOKIE
(the variable name starts and ends with a single '_').
This will lead to a separate instance of mspdbsrv being started.
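
In a PowerShell build step, that could look like the following sketch (the MSBuild invocation is a placeholder):

# Scope the PDB server endpoint to this build, so the compiler talks to
# its own mspdbsrv instance instead of the shared global one.
$Env:_MSPDBSRV_ENDPOINT_ = $Env:JENKINS_COOKIE
& msbuild .\MySolution.sln /m    # placeholder solution name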

          Mark Grills added a comment - - edited

          mwinter69, thanks for the pointer.

We couldn't get it working with $JENKINS_COOKIE, but managed to correct it by adding the following property via EnvInject prior to kicking off the build:

_MSPDBSRV_ENDPOINT_=$BUILD_TAG

This resulted in a separate process being initiated for each build and no conflicts/errors.

          Edit: Correction due to formatting. Refer below

          Daniel Fischer added a comment - - edited

It is

_MSPDBSRV_ENDPOINT_

(with underscores), not MSPDBSRV_ENDPOINT.

Just realized myself that it's a formatting issue: if you enclose the word in underscores it gets italicised and the underscores disappear.

          Mark Grills added a comment -

Apologies, yes - an underscore at each end.

          Andy Neebel added a comment -

We recently re-encountered this on our build network and did some investigation; here's what I found:

• On the master node, the veto from the MSBuild plugin works properly; I was able to confirm that the log messages show it.
• On a slave node, I do not see the log message from the veto. Instead I see a message that my process is being killed recursively (I was watching the process list to get the id during the build).

          It appears that the veto logic doesn't execute on the slave nodes. Is there something special that has to be done in order for it to be detected and executed there? I don't understand enough about how the remoting logic in Jenkins operates to know the answer to this.

          Most of the other work-arounds for this are ones that we cannot easily deploy in our environment. If this is truly the issue, does anyone have an idea what it would take to fix it and how long that would take to carry out?

          Andy Neebel added a comment -

I spent some more time chasing code and I have a suspicion as to the cause of the issue. In ProcessTree.java, there are two different functions that appear to need information from the master, yet they operate in different manners:

• getVeto() is how the whitelist extension is accessed to block the killing of the process. This function just gets the list as it exists, with no attempt to ask the master for any information.
• getKillers() is used to access the list of ProcessKillers if there are any classes implementing that extension point. This function gets the channel back to the master so it can ask for the master's list of classes implementing this extension.

I think that getVeto() needs to have part of it implemented more like getKillers(), so that it will go to the master for the list. It may also be that the accessor belongs in ProcessTree instead, so that it caches the data and doesn't go back to the master quite as much. Then, I think, the veto logic would work properly on both a master and a slave. Unfortunately, this means a change to Jenkins core and upgrading the full instance to fix the issue, instead of just a fix to the plugin itself.

          Stefan Walter added a comment -

Is there any workaround for this issue? It completely breaks our usage of Jenkins.

          Mark Grills added a comment -

          Hi Stefan, refer my comments above. This fixed it for us. Cheers

          Stefan Walter added a comment -

          Hi grillba, thanks a lot for your suggestion. It seems that this solved our issues.

Andreas Ringlstetter added a comment -

Little side note: it might not be sufficient to just set the _MSPDBSRV_ENDPOINT_ env variable in order to avoid conflicts. I recommend additionally setting TMP, TEMP and TEMPDIR to an isolated folder if you plan on invoking MSBuild in parallel, as various plugins for MSBuild, as well as MSBuild itself, will place files there.
A further catch of using _MSPDBSRV_ENDPOINT_ is that serialization of parallel builds in the same working directory will now break in return, unless you make sure that the temporary files for the different architectures are completely isolated as well (e.g. the temporary program database created with the individual object files, commonly named just "Debug\vc120.pdb"; notice the lack of a prefix for the architecture). Otherwise the different mspdbsrv instances will collide accessing the same file.
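
A sketch of what that isolation could look like in a PowerShell build step (WORKSPACE and BUILD_TAG are standard Jenkins build variables; the folder name is arbitrary):

# Give this build its own temp directory and its own PDB server endpoint
# before invoking MSBuild in parallel with other builds.
$tmp = Join-Path $Env:WORKSPACE 'build-tmp'
New-Item -ItemType Directory -Force -Path $tmp | Out-Null
$Env:TMP = $tmp
$Env:TEMP = $tmp
$Env:TEMPDIR = $tmp
$Env:_MSPDBSRV_ENDPOINT_ = $Env:BUILD_TAG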

          Bill Hoo added a comment - - edited

grillba, walteste Hi there, we've got this issue too, and we followed your suggestions to configure the master Jenkins node like this:

Configure System > Environment variables > Add new key-value pair below:

KEY: _MSPDBSRV_ENDPOINT_

VALUE: $BUILD_TAG

But we got nothing; the error is still raised on the Windows slave. Could you please explain the solution in detail? Should we set this key-value on the slave node? Thanks in advance

          Mark Grills added a comment -

          @billhoo,

You need to do it at the job level, not the system level. Use EnvInject to add the environment variable.

Have a look here for how to use EnvInject: https://wiki.jenkins.io/display/JENKINS/EnvInject+Plugin

          Make sure you follow the "Inject variables as a build step" topic

          Regards

          Mark

          Bill Hoo added a comment -

          grillba,

Thanks for the timely reply. We've followed your guide and found that there were already 3 separate mspdbsrv.exe processes running in the background (for test purposes, we ran 3 jobs on one Windows slave concurrently), so it seems to have worked. But unfortunately, one of our jobs still failed due to the C1090 error.

This is the screenshot of EnvInject in each of our 3 Pipeline jobs' configuration pages.

I don't think there's anything wrong here; do I miss something?

          Thanks,

          Bill.

          Adam Cornwell added a comment - - edited

          Just in case this helps anyone, I was able to fix all problems mentioned so far in this issue and comments by following the recommendations on this blog post:
          http://blog.peter-b.co.uk/2017/02/stop-mspdbsrv-from-breaking-ci-build.html

          The solution involves
1. Installing the MSBuild plugin ver. 1.26 or higher in Jenkins. Configuring it for use on the server is optional; it only needs to be installed. This stops Jenkins from killing the mspdbsrv process automatically.

          2. Using the _MSPDBSRV_ENDPOINT_ environment variable as done in the comment above.

          3. Spawning and killing a new specific mspdbsrv instance of the right Visual Studio version at the beginning and end of each job which uses it.

PowerShell implementation of the Python solution in the blog (change VS140COMNTOOLS to match the version of Visual Studio being used):

# Manually start mspdbsrv so a parallel job's instance isn't used; this works because _MSPDBSRV_ENDPOINT_ is set to a unique value
# (otherwise one build completing results in "Fatal error C1090: PDB API call failed, error code '23'" in the others).
$mspdbsrv_proc = Start-Process -FilePath "${env:VS140COMNTOOLS}\..\IDE\mspdbsrv.exe" -ArgumentList ('-start','-shutdowntime','-1') -PassThru

.\{PowershellBuildScriptName}.ps1

# Manually kill mspdbsrv once the build completes, using the previously saved process id
Stop-Process -Id $mspdbsrv_proc.Id

          Jakub Orava added a comment -

          I had the same problem with parallel builds (e.g. running job A from trunk and job A from a branch in parallel). I tried the _MSPDBSRV_ENDPOINT_ solution with the value of BUILD_TAG, and it worked for almost all jobs; in one situation I still got the error. So I replaced BUILD_TAG with the JOB_NAME environment variable and suddenly it was fine - for now we are free of problems. If you still hit the error with the endpoint solution, try changing BUILD_TAG to something else. If you do not allow concurrent builds of a single job, JOB_NAME should be enough; otherwise you can try a JOB_NAME + BUILD_NUMBER combination, as sketched below.

          Maybe the endpoint value has some restrictions, but I did not have time to inspect this more deeply. What I do know is that the problematic job has the longest name in my Jenkins instance - approx. 48 characters.
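
          For illustration only, a minimal scripted-pipeline sketch of that last suggestion; the 'windows' agent label and MySolution.sln are placeholders, and it assumes JOB_NAME plus BUILD_NUMBER is unique across the builds that can run concurrently:

          // Sketch: build a per-build endpoint name from JOB_NAME and BUILD_NUMBER
          // rather than BUILD_TAG, per the comment above.
          node('windows') {
              withEnv(["_MSPDBSRV_ENDPOINT_=${env.JOB_NAME}-${env.BUILD_NUMBER}"]) {
                  // Each build now talks to its own mspdbsrv instance instead of a shared one.
                  bat 'msbuild MySolution.sln'   // placeholder solution file
              }
          }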


          David Aldrich added a comment -

          Please can anyone advise me how to set _MSPDBSRV_ENDPOINT_ to the value of BUILD_TAG in a declarative pipeline script?

          I don’t really understand the difference between defining and injecting an environment variable. I could do:

          stage('build_VisualStudio') {
                  environment { _MSPDBSRV_ENDPOINT_ = "${BUILD_TAG}" }
          etc.
          

          Would that be sufficient or must environment variable injection be done in a different way?
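
          Not authoritative, but a minimal declarative sketch of that idea (the agent label and build step are placeholders): in declarative syntax the environment block takes KEY = "value" pairs, and anything defined there is exported to every step in that stage, which is all the injection mspdbsrv needs.

          pipeline {
              agent { label 'windows' }   // placeholder agent label
              stages {
                  stage('build_VisualStudio') {
                      environment {
                          // BUILD_TAG expands to jenkins-${JOB_NAME}-${BUILD_NUMBER}, so it is unique per build
                          _MSPDBSRV_ENDPOINT_ = "${BUILD_TAG}"
                      }
                      steps {
                          bat 'build.bat'   // placeholder build step
                      }
                  }
              }
          }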


          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Daniel Beck
          Path:
          content/_data/changelogs/weekly.yml
          http://jenkins-ci.org/commit/jenkins.io/0391fcb9b4c957e9e41fde03409de330a3de571d
          Log:
          Remove JENKINS-9104 fix from release to unblock it

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Daniel Beck
          Path:
          content/_data/changelogs/weekly.yml
          http://jenkins-ci.org/commit/jenkins.io/62409d42a5769cac66337cbd4b5df5754f0e2384
          Log:
          Merge pull request #1522 from daniel-beck/changelog-2.119-amended

          Remove JENKINS-9104 fix from release to unblock it

          Compare: https://github.com/jenkins-infra/jenkins.io/compare/58f029c79331...62409d42a576

          SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Jesse Glick
          Path:
          core/src/main/java/hudson/util/ProcessTree.java
          test/src/test/java/hudson/util/ProcessTreeKillerTest.java
          http://jenkins-ci.org/commit/jenkins/3465da4764c322baf4fb5b90651ef6b9bcd409fb
          Log:
          Merge pull request #3419 from dwnusbaum/JENKINS-9104-test-fix

          Fix test failure by cleaning up static state after tests

          Compare: https://github.com/jenkinsci/jenkins/compare/ddbc4bbce7d3...3465da4764c3
          NOTE: This service has been marked for deprecation: https://developer.github.com/changes/2018-04-25-github-services-deprecation/

          Functionality will be removed from GitHub.com on January 31st, 2019.

          Daniel Beck added a comment -

          Jenkins 2.120 contains a fix for the previous problem of the ProcessKillingVeto extension point not working on agents.


          John Doe added a comment -

          I'm occasionally getting this error with the latest versions of Jenkins and all the plugins. It started in recent months; it hadn't been a problem for the year before that. The problem seems NOT to have been resolved, or has possibly re-emerged.

          What can I do - is there a workaround? Sporadic build failures for no reason are super annoying.


          Bill Hoo added a comment -

          Same error with the latest Jenkins, ver. 2.150.3.

          The error always occurs when running two jobs concurrently on the same agent with VS2015:
          fatal error C1090: PDB API


          John Doe added a comment -

          billhoo, thanks for the tip! I was running VS 2017 (v141 toolset), but there were indeed two simultaneous jobs! So the workaround is to limit this agent to one job at a time. Pity, as it's a pretty powerful multicore server, but it's better than flaky builds.


          Bill Hoo added a comment -

          vuiletgiraffe, totally the same: we have many different jobs that use MSVC14 as the toolchain, but now we can only perform one build at a time - it's a huge waste of machine resources ;(

          Hope it can be truly solved.


          Andreas Ringlstetter added a comment - - edited

          The solution is still the same: before invoking `msbuild`, set the following environment variables to something unique (see the sketch below):

          _MSPDBSRV_ENDPOINT_=<UUID>
          TMP=<Unique Tempdir>
          TEMP=$TMP
          TMPDIR=$TMP

          Once you have done that, you can launch as many parallel MSBuild instances as you like, even mixing different MSBuild versions. They will not interfere in any way. We have been doing this on a regular basis with mixed MSVC12, MSVC14 and MSVC15 toolchains on the same machine, and haven't had any issues since.

          The "official" fix for this problem (trying not to kill the job scheduler) is plain wrong, and causes massive issues. Mostly because MSBuild itself isn't exactly stable either when using the same job server for multiple parallel builds. And if the builds are using different toolchains, a crash is ensured.


          Roman Pickl added a comment - - edited

          I used ext3h's solution:

          https://issues.jenkins-ci.org/browse/JENKINS-9104?focusedCommentId=360603&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-360603

          We solved it like this in a Jenkins GitHub multibranch setup with Jenkinsfiles:

                          bat """
                              mkdir tmp
                              set _MSPDBSRV_ENDPOINT_= ${BUILD_TAG}
                              set TMP=${Workspace}\\tmp
                              set TEMP=${Workspace}\\tmp
                              set TMPDIR=${Workspace}\\tmp
                              build.bat
                          """ 

           

