  Jenkins / JENKINS-52408

Multiple jobs on same pipeline: java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob


Details

    Description

      Hi guys,

      I have a complex workflow with multiple stages and steps (some of which run in parallel).

      Everything worked fine until last month.

      Now when I launch my job on different GitHub branches (with a multibranch pipeline job), I get the following error (not every time):

      java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob 

      Here are steps of the stage which fails:

      Stage : Start - (13 sec in block) Configuration Success

      Configuration - (11 sec in block) Success

      Shell Script - (1.8 sec in self) mkdir -p logs cache/container cache/dev cache/test Console Output Success

      Shell Script - (1.7 sec in self) cp alk_service_auth/application/config/parameters_test.yml.dist alk_service_auth/application/config/parameters_test.yml ||: Console Output Success

      Bind credentials to variables : Start - (6.6 sec in block) Console Output Success

      Bind credentials to variables : Body : Start - (4.2 sec in block) Success

      Shell Script - (1.3 sec in self) Console Output Success

      Shell Script - (1.7 sec in self) sed -i s/DB_SCHEMA/alk_auth_b1267c9d/g alk_service_auth/application/config/parameters_test.yml Console Output Success

      Stage : Start - (4 min 13 sec in block) Build & Test Success

      Build & Test - (4 min 11 sec in block) Success

      Verify if file exists in workspace - (1.7 sec in self) /var/www/service-auth.alkemics.com/shared/env Success

      Shell Script - (1.7 sec in self) Console Output Success

      Shell Script - (14 sec in self) /var/www/service-auth.alkemics.com/shared/env/bin/pip install -e .[test] --process-dependency-links Console Output Success

      Shell Script - (3.3 sec in self) /var/www/service-auth.alkemics.com/shared/env/bin/pip check Console Output Success

      Shell Script - (2.1 sec in self) Console Output Success

      Shell Script - (2.8 sec in self) Console Output Success

      Shell Script - (3.6 sec in self) Console Output Success

      Shell Script - (4.4 sec in self) /var/www/service-auth.alkemics.com/shared/env/bin/flake8 alk_service_auth Console Output Success

      Shell Script - (3 sec in self) Console Output Success

      Shell Script - (3 min 21 sec in self) /var/www/service-auth.alkemics.com/shared/env/bin/nosetests --nologcapture --verbose --with-xunit --xunit-file=xunit.xml alk_service_auth Console Output Success

      Shell Script - (6.2 sec in self) Console Output Success

      Shell Script - (2.2 sec in self) Console Output Success

      Shell Script - (1.7 sec in self) Console Output Success

      Error signal - (0.39 sec in self) Python unit tests failed: java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob Failed

      Print Message - (1.6 sec in self) hudson.AbortException: Python unit tests failed: java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob

       

      It aborts the unit tests step for no obvious reason.

       

      Do you have any idea about it?

       

      Attachments

        Activity

          oleg_nenashev Oleg Nenashev added a comment -

          kiva I think that it is something in your Pipeline library. "Python unit tests failed" ... there is no such text pattern in the Jenkins codebase. My guess is that datadog.endStep() is a Pipeline library method which accesses the WorkflowJob class through a whitelisted Java API and saves it to a local variable. This logic happens in a method without the @NonCPS annotation, so Pipeline tries to save the context and legitimately fails, since the class must not be serialized that way.

          Please check your Pipeline lib to verify my theory 
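          Oleg's theory — a CPS-transformed library method capturing a WorkflowJob in a local variable — can be sketched as follows. This is a hypothetical illustration, not the reporter's actual code: `brokenEndStep`, `fixedEndStep`, and `jobFullName` are invented names, and `currentBuild.rawBuild` needs script approval on a sandboxed controller.

          ```groovy
          // currentBuild.rawBuild.parent is a WorkflowJob. Holding it in a local
          // variable inside a CPS-transformed method means Pipeline will try to
          // serialize it at the next step boundary and throw NotSerializableException.
          def brokenEndStep(step) {
              def job = currentBuild.rawBuild.parent        // WorkflowJob, not Serializable
              sh "echo finished ${step} of ${job.fullName}" // CPS step -> state (incl. 'job') is persisted
          }

          // Fix: do the non-serializable work in a @NonCPS method and return
          // only a serializable value (a String) to the CPS world.
          @NonCPS
          String jobFullName() {
              return currentBuild.rawBuild.parent.fullName
          }

          def fixedEndStep(step) {
              def name = jobFullName()   // only a String crosses back into CPS code
              sh "echo finished ${step} of ${name}"
          }
          ```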

          kiva Romain N added a comment -

          Hi,

          datadog is not an external library, it's a class we wrote ourselves to send metrics to Datadog.

          Here is the code if you want to see it:

           

          // Datadog.groovy
          
          import groovy.transform.Field
          
          import java.time.Instant
          import java.time.temporal.ChronoUnit
          
          @Field def configFileDatadog = "/etc/datadog-cli.conf"
          @Field def commandDatadog = "/opt/datadog-agent/bin/dog --config ${configFileDatadog} metric post "
          @Field def datadog_enabled = true
          
          def initStep(step, repo_slug = "") {
              metricsStep(step, repo_slug)
              return Instant.now()
          }
          
          def getTags(repo_slug, tag_separator=',', value_separator=':') {
              if (repo_slug == '') {
                  return ''
              }
              tags = "role${value_separator}${repo_slug}${tag_separator}workflow${value_separator}"
              if (common.is_integration_branch()) {
                  tags = tags + "PR${tag_separator}branch${value_separator}"
                  if (common.is_pr_staging_master()) {
                      tags = tags + "master"
                  } else {
                      tags = tags + "staging"
                  }
              } else {
                  tags = tags + "build${tag_separator}branch${value_separator}${env.BRANCH_NAME}"
              }
              return tags
          }
          
          def metricsStep(step, repo_slug = "", unit = "total") {
              postCountMetric("${step}.${unit}", 1, getTags(repo_slug))
          }
          
          def endStep(step, repo_slug, date, success = true) {
              state_name = success ? 'success' : 'fail'
              metricsStep(step, repo_slug, state_name)
              if (date != null) {
                  def duration = ChronoUnit.MILLIS.between(date, Instant.now())
                  postGaugeMetric("${step}.${state_name}.duration", duration, getTags(repo_slug))
                  // Keep the old one for retrocompat
                  postGaugeMetric("${step}.duration", duration, getTags(repo_slug))
              }
          }
          
          def postCountMetric(name, value = 1, tags = "") {
              postMetric(name, value, "rate", tags)
          }
          
          def postGaugeMetric(name, value, tags = "") {
              postMetric(name, value, "gauge", tags)
          }
          
          def postMetric(name, value, type, tags) {
              if (!(name ==~ /^ci\..*$/))
                  name = "ci.${name}"
              if (datadog_enabled) {
                  sh returnStatus: true, script:"${commandDatadog} --no_host --type ${type} --tags 'jenkins,${tags}' ${name} ${value}"
              }
          }
          
          return this
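
          For context, this helper might be driven from a Jenkinsfile roughly like this — a sketch assuming Datadog.groovy is brought in with the `load` step; the stage and metric names are invented:

          ```groovy
          // Hypothetical Jenkinsfile fragment driving the Datadog.groovy helper.
          node {
              def datadog = load 'Datadog.groovy'

              // initStep() posts a count metric and returns a java.time.Instant,
              // which is Serializable and therefore safe to keep across CPS steps.
              def started = datadog.initStep('unit_tests', 'service-auth')
              try {
                  sh 'run-the-tests'   // placeholder for the real test command
                  datadog.endStep('unit_tests', 'service-auth', started, true)
              } catch (e) {
                  datadog.endStep('unit_tests', 'service-auth', started, false)
                  throw e
              }
          }
          ```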
          
          oleg_nenashev Oleg Nenashev added a comment -

          Hard to say what exactly goes wrong, but it is not JEP-200 from what I can see.

          WorkflowJob is likely referenced from "step". Although a String is passed as the first parameter to endStep() in the original Jenkinsfile, the provided library method clearly references an object like "${step.duration}". I am not sure what happens, but it is something you first need to check on your side. If you want to pass non-serializable objects as method arguments, the receiving methods should be @NonCPS.

          Since it is not JEP-200 so far, I will leave the investigation (if needed) to the plugin maintainers.

          kiva Romain N added a comment -

          You misread the code.
          It’s not « ${step.duration} » but « ${step}.duration »
          Step is indeed a string as you said

          kiva Romain N added a comment -

          After much investigation, I found that I was calling Jenkins.getInstance().getJobs() in a Groovy file, and its return value is not serializable.

          Adding a @NonCPS annotation resolved this problem.
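
          The fix can be sketched like this — a minimal illustration, where `allJobNames` is an invented helper name and the call mirrors the reporter's Jenkins.getInstance().getJobs() usage:

          ```groovy
          import jenkins.model.Jenkins
          import com.cloudbees.groovy.cps.NonCPS

          // The job objects returned by getJobs() are live Jenkins model objects
          // (including WorkflowJob) and must never be held in CPS-transformed code.
          // A @NonCPS method runs outside the CPS interpreter, so nothing in its
          // body is serialized; return only serializable data (Strings) to the caller.
          @NonCPS
          List<String> allJobNames() {
              Jenkins.getInstance().getJobs().collect { it.fullName }
          }

          def names = allJobNames()   // plain Strings are safe in Pipeline state
          echo "Jobs on this controller: ${names.join(', ')}"
          ```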


          People

            Assignee: Unassigned
            Reporter: kiva Romain N
            Votes: 0
            Watchers: 3
