Jenkins / JENKINS-21670

Option to anonymize customer labels

    Details

    • Epic Name:
      Bundle anonymization
      Description

      For sites with stringent security policies, there should be an option when generating a support bundle (or perhaps just a global setting applicable also to auto-generated bundles) that would search for mentions in all files of labels created by the customer which might reflect proprietary processes: job, folder, view, slave, and template names, slave labels, etc.

      The plugin would gather a list of all such labels, create randomized tokens, and produce a mapping so that a job AppBuild becomes Job_ayrzw. For labels with spaces or other special characters, which could have triggered bugs, the mapping should preserve those characters, so App → Build should become Job_ayrzw → X, and the mapping should also include encoded variants such as App%20%E2%86%92%20Build to Job_ayrzw%20%E2%86%92%20X and App%20%e2%86%92%20Build to Job_ayrzw%20%e2%86%92%20X.
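
As a rough illustration of the mapping described above, the following Python sketch builds the raw and percent-encoded variants for one label. It is not the plugin's implementation: the helper names (`random_token`, `anonymize_label`, `build_mapping`) are hypothetical, and replacing the first alphanumeric run with the token and later runs with `X` is only one reading of the `App → Build` example.

```python
import re
import secrets
import string
from urllib.parse import quote


def random_token(prefix: str, length: int = 5) -> str:
    """Return a prefix plus random lowercase letters, e.g. 'Job_ayrzw'."""
    letters = "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))
    return f"{prefix}_{letters}"


def anonymize_label(label: str, token: str) -> str:
    """Keep separators; replace the first alphanumeric run with the token, later runs with 'X'."""
    seen = 0

    def repl(match):
        nonlocal seen
        seen += 1
        return token if seen == 1 else "X"

    return re.sub(r"[A-Za-z0-9]+", repl, label)


def lower_hex(encoded: str) -> str:
    """Lower-case only the hex digits of percent escapes (%E2 -> %e2)."""
    return re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).lower(), encoded)


def build_mapping(label: str, token: str) -> dict:
    """Map the raw label and both percent-encoded spellings to their anonymized forms."""
    alias = anonymize_label(label, token)
    enc_label = quote(label, safe="")  # UTF-8 percent-encoding, upper-case hex
    enc_alias = quote(alias, safe="")
    return {
        label: alias,
        enc_label: enc_alias,
        lower_hex(enc_label): lower_hex(enc_alias),
    }
```

For a label without special characters, such as `AppBuild`, all three keys coincide and the mapping holds a single entry.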

      Then these substitutions would be applied to all files included in the support bundle, particularly log files and thread dumps.
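
Applying the substitutions to file contents could look roughly like the Python sketch below. It assumes the mapping is a plain dictionary of original text to replacement; `compile_substitutions` is a hypothetical helper, not an API of the plugin. Matching the longest originals first and replacing in a single pass keeps a short name from clobbering part of a longer one and keeps replacements from being rewritten again.

```python
import re


def compile_substitutions(mapping: dict):
    """Return a function that rewrites text using the mapping in one pass."""
    if not mapping:
        return lambda text: text  # nothing to replace
    # Longest originals first, so 'AppBuild-nightly' wins over 'AppBuild'.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)


# Usage: sanitize one line of a log file or thread dump.
sanitize = compile_substitutions(
    {"AppBuild": "Job_ayrzw", "AppBuild-nightly": "Job_qkdpt"}
)
line = sanitize("Started AppBuild-nightly after AppBuild #12")
```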

      It is impossible to guarantee that customer text does not appear in some unusual context, e.g. an exception quoting a syntactically incorrect Groovy script, but these substitutions would sanitize the great majority of what the support bundle produces, and make it feasible for the customer to do a final inspection without needing to do much or any manual editing.

            Activity

            aheritier Arnaud Héritier added a comment -

            IP addresses and network settings also need to be masked.

            minudika Minudika Malshan added a comment -

            Please help me clarify the following:

            1) How can the plugin find and keep track of labels created by the customer?
            2) What is the purpose of creating randomized tokens, producing a mapping, and applying substitutions?

            aheritier Arnaud Héritier added a comment -

            Hi Minudika Malshan

            From my POV (though I hope many others will comment), the bundle generation form should offer a set of new options for choosing which kinds of information to anonymize (everything checked by default). These kinds of information could include URLs, IPs, ...
            Based on these settings, we should find every matching entry in the bundle and replace each distinct entry with a unique value. This is what Jesse Glick is describing.
            It is critical within a bundle to always replace the same entry with the same value, so that the relationships across files remain understandable.
            Currently we don't allow exporting job configuration files or build information ( JENKINS-30468 ), but I hope we will one day, and in that case we'll have to use the same mechanism.
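
A minimal Python sketch of the "same entry, same value" rule, using IPv4 addresses as the example category. The class name `ConsistentReplacer` and the `ip_N` alias format are assumptions made for illustration. The pattern is deliberately naive and would also match other dotted quads, such as four-part version numbers.

```python
import re

# Naive IPv4 pattern: four dot-separated groups of 1-3 digits (no range check).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class ConsistentReplacer:
    """Hands out one alias per distinct original and reuses it on every later sighting."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self.aliases = {}  # original value -> alias

    def alias_for(self, original: str) -> str:
        if original not in self.aliases:
            self.aliases[original] = f"{self.prefix}_{len(self.aliases) + 1}"
        return self.aliases[original]

    def scrub(self, text: str) -> str:
        return IPV4.sub(lambda m: self.alias_for(m.group(0)), text)


# The same replacer instance must be shared across all files in one bundle.
replacer = ConsistentReplacer("ip")
first = replacer.scrub("connect 10.0.0.5 -> 192.168.1.20")
second = replacer.scrub("timeout from 10.0.0.5")
```

Because both calls share one `ConsistentReplacer`, the address seen in the first text receives the same alias when it reappears in the second.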

            jglick Jesse Glick added a comment -

            By the way I would suggest using something like this library instead of unreadable tokens. Easier for humans to remember and match.
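
To illustrate the idea of readable replacements, here is a small Python sketch that draws aliases from word lists instead of random letters. It does not use the library mentioned above. The word lists and the `ReadableAliases` class are hypothetical stand-ins, and a real word source would need to be far larger.

```python
import random

# Tiny illustrative word lists; a real implementation would need many more words.
ADJECTIVES = ["amber", "brisk", "calm", "dusty", "eager"]
NOUNS = ["falcon", "harbor", "lantern", "meadow", "otter"]


class ReadableAliases:
    """Assigns each original name a unique, stable adjective_noun alias."""

    def __init__(self, seed=None):
        rng = random.Random(seed)
        # Pre-build every combination and shuffle, so aliases never repeat.
        self._pool = [f"{a}_{n}" for a in ADJECTIVES for n in NOUNS]
        rng.shuffle(self._pool)
        self._aliases = {}

    def alias_for(self, original: str) -> str:
        if original not in self._aliases:
            if not self._pool:
                raise RuntimeError("word lists exhausted; use larger lists")
            self._aliases[original] = self._pool.pop()
        return self._aliases[original]
```

Repeated calls with the same original return the same alias, so the mapping stays consistent across a bundle.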

            aheritier Arnaud Héritier added a comment -

            +1 with Jesse Glick

            aheritier Arnaud Héritier added a comment -

            Here is more feedback on what could be considered “Sensitive/Non-Public”:

            1/ System and network information: processes, accounts, IPs, hostnames, and everything related to the hosting system and its network configuration.
            -> Current status: This information is provided by several bundle options that can already be deactivated (Environment variables, File descriptors (Unix only), Networking Interface, Root CAs, System configuration (Linux only), System properties), but there are probably other places where it should be anonymized.

            2/ All kinds of credentials
            -> Current status: Never stored in bundles. Should never appear in Jenkins logs (but ...). Should never appear in job logs (but ... they can; there are some known issues), though for now no job information or logs are included in bundles.

            3/ All kinds of audit logs recording who did what
            -> Current status: The audit plugin logs should not be bundled if the plugin is installed (I don't know it very well, so I need to investigate).

            4/ User information (name, email, login, ...)
            -> Provided by the option "About user (basic authentication details only)", but we will probably have to anonymize it in more locations.

            minudika Minudika Malshan added a comment -

            Arnaud Héritier Jesse Glick Since entities like job names do not have a specific format the way network addresses do, how can we identify a string as a name that needs to be anonymized?

            aheritier Arnaud Héritier added a comment -

            Really good question. cc Steven Christou
            Since we are running inside Jenkins, I think the best approach is to use its APIs to find all kinds of jobs (Jenkins.instance.getAllItems...) and then create a map to replace each name.
            We probably need to do the same for views, users/accounts, ...
            After that, we need a filtering mechanism that replaces all original names with the "protected" value when we collect logs, config files, ...

            minudika Minudika Malshan added a comment - - edited

            Arnaud Héritier What did you mean by "views"? Jenkins.getInstance().getViewActions() ?
            Also, could you please tell me whether there is a way to get user/account information through the API?

            Jesse Glick Is the random word generator at https://github.com/kohsuke/wordnet-random-name available as a Maven dependency? Or do we have to add the jar or classes to the Support Core plugin project manually?
            Thanks a lot!

            sag47 Sam Gleske added a comment -

            A user-defined list of expressions to search and replace would help with job names that follow no convention. An enterprise may also have keywords that it doesn't want leaked.
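
One way such a user-defined list might work, sketched in Python. The rule format (pairs of regular expression and replacement text), the function names, and the sample keywords are all assumptions made for illustration.

```python
import re


def compile_user_rules(rules):
    """Compile administrator-supplied (regex, replacement) pairs."""
    return [(re.compile(pattern), replacement) for pattern, replacement in rules]


def apply_rules(text: str, compiled) -> str:
    """Apply each rule in order; replacements are inserted literally."""
    for pattern, replacement in compiled:
        # A function replacement avoids interpreting backslashes or group references.
        text = pattern.sub(lambda m, r=replacement: r, text)
    return text


# Hypothetical keywords an enterprise might not want leaked.
rules = compile_user_rules([
    (r"(?i)project[- ]?phoenix", "REDACTED_KEYWORD"),
    (r"\bACME\w*", "REDACTED_ORG"),
])
cleaned = apply_rules("Building Project-Phoenix for ACMECorp", rules)
```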

            dnusbaum Devin Nusbaum added a comment -

            To make sure that users understand that this feature is not guaranteed to anonymize all uses of confidential information, we should add a warning immediately before the "Generate Bundle" button that explains that the anonymization and encrypted secret masking are best-effort and that users should double-check and redact any confidential information before sending the bundle to a third party. We should also change the description for configuration file components from

            ... (Encrypted secrets are redacted)

            to

            ... (Encrypted secrets are redacted. See the <a href=#warning>warning</a> for details)

            jvz Matt Sicker added a comment -

            Here's a proposed admin console for this feature so far:

            jglick Jesse Glick added a comment -

            PR 144 seems to be the current link.

            dnusbaum Devin Nusbaum added a comment -

            Released in Support Core 2.48. Thanks Matt Sicker!

            jvz Matt Sicker added a comment -

            Updated documentation in wiki to reflect released feature.

            jvz Matt Sicker added a comment -

            Feature is released now, marking this as closed.


              People

              Assignee:
              jvz Matt Sicker
              Reporter:
              jglick Jesse Glick
              Votes:
              2
              Watchers:
              11
