
Lots of changes in source repo results in malformed XML Plastic->Jenkins which fails build job

    • Bug
    • Resolution: Fixed
    • Blocker
    • plasticscm-plugin
    • None
    • Jenkins 2.238, Plastic SCM plugin 3.3, Plastic SCM client 9.0.16.4255

      We recently performed a significant update of a sub-repository (about 150k changed files). Now, when Jenkins attempts to build a project related to the parent repository, something goes wrong when Plastic provides Jenkins with the list of changes.

       

      It begins normally, like so:

       

      [JenkinsBuildScripts] $ C:\PlasticSCM\client\cm.exe unco --all --silent c:\Jenkins\workspace\MyJob@libs\JenkinsBuildScripts
      [JenkinsBuildScripts] $ C:\PlasticSCM\client\cm.exe ss wk:shl-2143265103
      Selector for workspace shl-2143265103:
      repository "JenkinsBuildScripts@MyOrg@cloud"
        path "/"
          smartbranch "/main"
      [JenkinsBuildScripts] $ C:\PlasticSCM\client\cm.exe update c:\Jenkins\workspace\MyJob@libs\JenkinsBuildScripts
      Searching for changed items in the workspace...
      The workspace c:\Jenkins\workspace\MyJob@libs\JenkinsBuildScripts is up-to-date (cset:119@JenkinsBuildScripts@MyOrg@cloud)
      [JenkinsBuildScripts] $ C:\PlasticSCM\client\cm.exe status --cset C:\Jenkins\workspace\MyJob@libs\JenkinsBuildScripts
      cs:119@rep:JenkinsBuildScripts@repserver:MyOrg@cloud
      [JenkinsBuildScripts] $ C:\PlasticSCM\client\cm.exe log cs:119 --xml --encoding=utf-8
      

      That command begins to emit XML...

       

       

      <?xml version="1.0" encoding="utf-8"?>
      <LogList>
        <Changeset>
          <ObjId>3026</ObjId>
          <ChangesetId>119</ChangesetId>
          <Branch>/main</Branch>
          <Comment>Added "sendSlackBuildSucceededNotification_custom" command, which will allow us to customize the info part (arbitrary download links etc)</Comment>
          <Owner>one@email.address</Owner>
          <GUID>34134429-84ea-4667-8084-d80eb4e9eadd</GUID>
          <Changes>
            <Item>
              <Branch>/main</Branch>
              <RevNo>119</RevNo>
              <Owner>kalms@falldamagestudio.com</Owner>
              <RevId>3002</RevId>
              <ParentRevId>789</ParentRevId>
              <SrcCmPath>/src/com/falldamagestudio/FormatSlackNotification.groovy</SrcCmPath>
              <SrcParentItemId>159</SrcParentItemId>
              <DstCmPath>/src/com/falldamagestudio/FormatSlackNotification.groovy</DstCmPath>
              <DstParentItemId>159</DstParentItemId>
              <Date>2020-03-23T08:15:49.0000000+01:00</Date>
              <Type>Changed</Type>
            </Item>
            <Item>
              <Branch>/main</Branch>
              <RevNo>119</RevNo>
              <Owner>kalms@falldamagestudio.com</Owner>
              <RevId>3004</RevId>
              <ParentRevId>791</ParentRevId>
              <SrcCmPath>/test/groovy/FormatSlackNotification/FormatSlackNotificationTest.groovy</SrcCmPath>
              <SrcParentItemId>631</SrcParentItemId>
              <DstCmPath>/test/groovy/FormatSlackNotification/FormatSlackNotificationTest.groovy</DstCmPath>
              <DstParentItemId>631</DstParentItemId>
              <Date>2020-03-23T08:15:49.0000000+01:00</Date>
              <Type>Changed</Type>
            </Item>
      ...

      ... snip 55MB ...

       

      ... and toward the end, it looks like this:

       

      ...
            <Item>
              <Branch />
              <RevNo>12</RevNo>
              <Owner>kalms@falldamagestudio.com</Owner>
              <RevId>636454</RevId>
              <ParentRevId>-1</ParentRevId>
              <SrcCmPath>/UE4/Engine/Source/ThirdParty/ICU/icu4c-53_1/source/data/coll/zh_CN.txt</SrcCmPath>
              <SrcParentItemId>758792</SrcParentItemId>
              <DstCmPath>/UE4/Engine/Source/ThirdParty/ICU/icu4c-53_1/source/data/coll/zh_CN.txt</DstCmPath>
              <DstParentItemId>758792</DstParentItemId>
              <Date>2020-05-20T22:54:20.0000000+02:00</Date>
              <Type>Added</Type>
            </Item>
            <Item>
              <Branch />
              <RevNo>12</RevNo>
              <Owner>kalms@falldamagestudio.com</Owner>
              <RevId>636455</RevId>
              <ParentRevId>-1</ParentRevId>
              <SrcCmPath>/UE4/Engine/Source/ThirdParty/ICU/icu4c-53_1/source/data/coll/zh_Hans.txt</SrcCmPath>
              <SrcParentItemId>758792</SrcParentItemId>
              <DstCmPath>/UE4/Engine/Source/ThirdParty/ICU/icu4c-53_1/source/data/coll/zh_Hans.txt</DstCmPath>
              <DstParentItemId>758792</DstParentItemId>FATAL: Parse error: XML document structures must start and end within the same entity.

      Notice how the XML tags aren't closed properly.

       

      We have tried kicking off the same build job 3-4 times, with the cutoff happening after 44-65 MB of output. On one occasion the cutoff was in the middle of an XML tag:

      ...
            <Item>
              <Branch />
              <RevNo>12</RevNo>
              <Owner>kalms@falldamagestudio.com</Owner>
              <RevFATAL: Parse error: XML document structures must start and end within the same entity.

      Given the above, it seems that either cm.exe is truncating its output while writing, the plasticscm plugin is truncating the stream while reading, or Jenkins is truncating the stream during transmission/ingestion. I'm not sure where exactly the problem is located.
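One way to narrow down which of the three stages truncates the stream is to save the command output to a file and check it independently of Jenkins. The following is a minimal sketch, not part of the plugin: it assumes the output of the `cm log ... --xml` command shown earlier has already been redirected to a file, and the function name `check_log_xml` is made up for illustration. It stream-parses the file and reports whether the XML is well-formed and how many `<Item>` elements were read before any failure.

```python
import xml.etree.ElementTree as ET


def check_log_xml(path):
    """Stream-parse a saved 'cm log --xml' output file.

    Returns (ok, item_count, error_message). The file is read in chunks,
    so a 50+ MB log is never held in memory as one string.
    """
    item_count = 0
    try:
        for _event, elem in ET.iterparse(path, events=("end",)):
            if elem.tag == "Item":
                item_count += 1
                elem.clear()  # drop the children of the finished <Item>
    except ET.ParseError as exc:
        # Raised when the document ends before all open tags are closed,
        # or when the data is cut off in the middle of a tag.
        return False, item_count, str(exc)
    return True, item_count, None
```

If a file saved directly from cm.exe parses cleanly while the Jenkins job still fails, that would suggest the truncation happens after cm.exe has written its output, although a single clean run does not rule out an intermittent problem.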

       

      After this error, the pipeline fails. This stops our build jobs dead. We do not need the change info, but we do not see a way to work around this blocker.

          [JENKINS-62442] Lots of changes in source repo results in malformed XML Plastic->Jenkins which fails build job

          Mikael Kalms added a comment -

          Looking a bit closer:

          The Plastic SCM plugin calls "cm log cs:<changeset>". This retrieves all the changes in that specific changeset.

          We can work around this and get our build system un-stuck by performing a dummy check-in that changes just a single file. The next time that Jenkins attempts to build the product, the "cm log" command will only list a single file change. Jenkins is then able to proceed with the build job.

          This makes me wonder: why does the Plastic SCM plugin need to list the changes that are part of the current changeset? I would understand if it retrieved all changesets between the previous and the current build, but ... just the changeset at head?


          Miguel González added a comment -

          Hi Mikael,

           

          We concluded that this issue was caused by the command output being too long (44-65 MB, as you said), and we fixed it in release 3.4. The 'cm log' and 'cm find' commands now write their XML output to a file, which the plugin code then parses. This is a better approach than loading the whole command output into a memory buffer, as before.
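The general idea of writing the output to a file and parsing it from there can be sketched as follows. This is an illustration only, not the plugin's actual code (the plugin is written in Java, and the comment does not say how it tells cm to write to a file). The sketch simply redirects the child process's stdout to a temporary file. The function name `run_to_file_and_parse` is hypothetical, and the element names `Item` and `DstCmPath` are taken from the XML shown in the issue description.

```python
import subprocess
import tempfile
import xml.etree.ElementTree as ET


def run_to_file_and_parse(command):
    """Run `command`, spool its stdout to a temp file, then stream-parse it.

    Returns the list of <DstCmPath> values found in the XML output.
    """
    paths = []
    with tempfile.TemporaryFile() as spool:
        # The child process writes straight into the file, so the output
        # never passes through an in-memory buffer in this process.
        subprocess.run(command, stdout=spool, check=True)
        spool.seek(0)
        # Parse incrementally instead of loading the whole document.
        for _event, elem in ET.iterparse(spool, events=("end",)):
            if elem.tag == "Item":
                paths.append(elem.findtext("DstCmPath"))
                elem.clear()  # drop the children of the finished <Item>
    return paths
```

With the command from the log above, a call might look like `run_to_file_and_parse(["cm", "log", "cs:119", "--xml", "--encoding=utf-8"])`. Here the XML only ever exists in full on disk, and the parser reads it in chunks.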

           

          Regarding your last question, the plugin will retrieve the list of changes in all changesets involved in the build. I'm not certain about the specifics of your job, but the plugin normally detects all changesets in the configured branch that appeared between the last build and the current one.

           

          Cheers,

          Miguel


          Miguel González added a comment -

          We fixed this issue in release 3.4. We changed the 'cm log' and 'cm find' commands to make them output their XML directly to a file.

          This will avoid issues with in-memory buffers if the XML output is too long.

            mig42 Miguel González
            kalms Mikael Kalms