  1. Jenkins
  2. JENKINS-66566

Job DSL / Job Configuration History plugin creates tons of xml files


Details

    Description

      We have a number of jobs described via DSL files and a seed job that turns them into actual Jenkins items. The seed job is triggered on each commit or PR that changes .groovy files (90+ across two repos) and essentially does the following:

      findAllDslFiles().each { jobDsl(scriptText: readFile(it)) }
      

      Recently, we noticed that there is a tremendous number of obscure XML files in `${JENKINS_HOME}/config-history/javaposse.jobdsl.plugin.ExecuteDslScripts`:

      # ls 2021-09-06_15-42*
      2021-09-06_15-42-06:
       history.xml javaposse.jobdsl.plugin.ExecuteDslScripts.xml
      2021-09-06_15-42-09:
       history.xml javaposse.jobdsl.plugin.ExecuteDslScripts.xml
      ...
      # ls -1 | wc -l
      160316

      These piled up over a little more than a year and currently take about 85GB of space. The number of folders correlates with the number of DSL files: for example, today (2021.09.06) the seed job was triggered at 10.38, and there are 94 folders with names from `2021-09-06_10-31-01` to `2021-09-06_10-33-56`, while we have 93 DSL files in the repo. The files are almost identical:

      --- 2021-09-06_10-31-01/javaposse.jobdsl.plugin.ExecuteDslScripts.xml	2021-09-06 10:31:01.118703955 +0300
      +++ 2021-09-06_10-31-02/javaposse.jobdsl.plugin.ExecuteDslScripts.xml	2021-09-06 10:31:02.615671917 +0300
      @@ -457,6 +457,13 @@
             </javaposse.jobdsl.plugin.SeedReference>
           </entry>
           <entry>
      +      <string>staging/D10811/test</string>
      +      <javaposse.jobdsl.plugin.SeedReference>
      +        <seedJobName>maintenance/dsl-deployer</seedJobName>
      +        <digest>cf5ff13ea2e9fa70733bdaf08281fd14</digest>
      +      </javaposse.jobdsl.plugin.SeedReference>
      +    </entry>
      +    <entry>
             <string>staging/D9320_dsl_check/maintenance/sync-ubase</string>
             <javaposse.jobdsl.plugin.SeedReference>
               <seedJobName>staging/D9320/maintenance/dsl-deployer</seedJobName>
      @@ -2585,13 +2592,6 @@
             </javaposse.jobdsl.plugin.SeedReference>
           </entry>
           <entry>
      -      <string>staging/D10811</string>
      -      <javaposse.jobdsl.plugin.SeedReference>
      -        <seedJobName>maintenance/dsl-deployer</seedJobName>
      -        <digest>d267a35b960390d61d110ab098c875cf</digest>
      -      </javaposse.jobdsl.plugin.SeedReference>
      -    </entry>
      -    <entry>
             <string>staging/D10506/build/custom/lnvr/diff</string>
             <javaposse.jobdsl.plugin.SeedReference>
               <seedJobName>maintenance/dsl-deployer</seedJobName>
      

      It seems very weird to have so many of them.
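One way to check the folder-per-DSL-file correlation on more than a single day is to count history folders per calendar day. The following is a minimal Python sketch, assuming the folder names follow the `YYYY-MM-DD_HH-MM-SS` pattern seen in the listing above; `folders_per_day` is a hypothetical helper name, not something provided by either plugin.

```python
from collections import Counter
from datetime import datetime
from pathlib import Path

# Assumed folder name format, taken from the listing above (e.g. 2021-09-06_15-42-06).
STAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def folders_per_day(root):
    """Count timestamped history folders under root, grouped by calendar day."""
    counts = Counter()
    for entry in Path(root).iterdir():
        if not entry.is_dir():
            continue
        try:
            stamp = datetime.strptime(entry.name, STAMP_FORMAT)
        except ValueError:
            continue  # skip anything that is not a timestamped folder
        counts[stamp.strftime("%Y-%m-%d")] += 1
    return counts
```

Comparing the per-day counts with the number of seed job runs multiplied by the number of DSL files should show whether each `jobDsl` call really produces one history entry.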

      Since our seed job recreates the whole folder+job tree on each trigger, is it safe to assume that we can simply remove everything older than N days to preserve disk space?
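Whether deleting old entries is actually safe is the open question here, so the following is only a minimal Python sketch of what "remove everything older than N days" could look like, not a recommendation. It assumes the folder names follow the `YYYY-MM-DD_HH-MM-SS` pattern from the listing above; the function names and the dry-run default are illustrative choices and are not part of either plugin.

```python
import shutil
from datetime import datetime, timedelta
from pathlib import Path

# Assumed folder name format, taken from the listing above (e.g. 2021-09-06_15-42-06).
STAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def stale_history_dirs(root, max_age_days, now=None):
    """Return history folders under root whose name timestamp is older than the cutoff."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            continue
        try:
            stamp = datetime.strptime(entry.name, STAMP_FORMAT)
        except ValueError:
            continue  # leave anything that is not a timestamped folder alone
        if stamp < cutoff:
            stale.append(entry)
    return stale


def prune(root, max_age_days, dry_run=True, now=None):
    """List stale folders; delete them only when dry_run is False."""
    stale = stale_history_dirs(root, max_age_days, now)
    if not dry_run:
        for entry in stale:
            shutil.rmtree(entry)
    return stale
```

Calling `prune(path, 30)` only reports the candidates; passing `dry_run=False` removes them. The age is derived from the folder name rather than the file modification time, so a backup restore or copy that resets mtimes does not change the result.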

          Activity

            pjdarton added a comment -

            I encountered this issue too - every time the JobDSL plugin (re-)generates a Jenkins job that's "different", the hash of the job changes and the DSL plugin records the new hash, resulting in a new version of the file for every build of a JobDSL-using job.  It's unnecessary churn.

            I worked around it by adding to the "Exclusion pattern" (that's concealed within the "Advanced" bit of the plugin's config) and told it to exclude:

            |/jobs/|/javaposse\.jobdsl\.plugin\.ExecuteDslScripts\.xml
            

            in addition to the default stuff. I excluded /jobs/ because I'm generating the config.xml that defines my jobs and hence don't need backups of those (and there are gazillions of changes, as a generation timestamp is put in the job's description, so it changes every time the DSL job runs even if the job hasn't changed in any meaningful manner; the Job Configuration History plugin can't tell that the changes aren't important). I excluded javaposse.jobdsl.plugin.ExecuteDslScripts.xml because that's the file that churns.
            That seems to have made the problem manageable so, for anyone else hitting this issue, try this as a workaround.
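Before saving a changed exclusion pattern, it can help to check which paths it would catch. The following is a rough Python sketch of such a check. `ASSUMED_DEFAULT` is a stand-in and not the plugin's real default pattern, the sample paths are made up, and it assumes the pattern is searched for anywhere in the path (rather than having to match the whole path) and that Python's regex syntax behaves like Java's for this simple expression.

```python
import re

# Stand-in for the plugin's default exclusion pattern; NOT its real default.
ASSUMED_DEFAULT = r"queue\.xml|nodeMonitors\.xml"
# The addition quoted in the comment above, appended verbatim.
ADDITION = r"|/jobs/|/javaposse\.jobdsl\.plugin\.ExecuteDslScripts\.xml"

EXCLUDE = re.compile(ASSUMED_DEFAULT + ADDITION)


def is_excluded(path):
    """Return True if the exclusion pattern is found anywhere in the path (assumed semantics)."""
    return EXCLUDE.search(path) is not None


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    samples = [
        "/var/jenkins_home/javaposse.jobdsl.plugin.ExecuteDslScripts.xml",
        "/var/jenkins_home/jobs/maintenance/config.xml",
        "/var/jenkins_home/config.xml",
    ]
    for path in samples:
        print(path, "excluded" if is_excluded(path) else "kept")
```

With these assumptions, the first two sample paths are excluded and the global config.xml is still tracked.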


            People

              jamietanna Jamie Tanna
              artalus Artalus S.