Bug
Resolution: Not A Defect
Major
None
Jenkins LTS 2.176.3
Jenkins LTS 2.176.3 incorporated commit ace596, which factors the Session ID into the computation of CSRF crumbs; since a new Session ID is generated if none is provided, previously issued crumbs are rendered useless in the absence of a reusable Session ID. This currently prevents Swarm clients from connecting to Jenkins masters secured with the DefaultCrumbIssuer, since the generated crumb is immediately rendered useless.
I think a fix would involve the Swarm plugin using a persistent session ID on the client-side. I labeled this issue as "minor", because an easy workaround exists (setting hudson.security.csrf.DefaultCrumbIssuer.EXCLUDE_SESSION_ID to true on the Jenkins master). It should be noted, however, that this reduces the efficacy of the fixes to SECURITY-626 and SECURITY-1491.
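To make the failure mode concrete, here is a minimal sketch of a crumb that depends on the session ID. It is a simplified illustration and not the actual DefaultCrumbIssuer code: the class and method names, the SHA-256 digest, the ";" separator, and the salt handling are all assumptions. Only the idea that the session ID feeds into the crumb unless EXCLUDE_SESSION_ID is set comes from the description above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class CrumbSketch {
    // Hypothetical helper: derives a crumb from the user, the session ID, and a server-side salt.
    // The excludeSessionId flag stands in for the EXCLUDE_SESSION_ID workaround.
    static String crumb(String user, String sessionId, String salt, boolean excludeSessionId) {
        StringBuilder input = new StringBuilder(user).append(';');
        if (!excludeSessionId) {
            input.append(sessionId).append(';');
        }
        input.append(salt);
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String issued = crumb("worker_node", "session-A", "salt", false);
        // Same session: the crumb presented later matches the one issued.
        System.out.println(issued.equals(crumb("worker_node", "session-A", "salt", false))); // true
        // New session (no session cookie returned): the old crumb no longer matches.
        System.out.println(issued.equals(crumb("worker_node", "session-B", "salt", false))); // false
        // With the session ID excluded, the crumb survives a session change.
        System.out.println(crumb("worker_node", "session-A", "salt", true)
                .equals(crumb("worker_node", "session-B", "salt", true))); // true
    }
}
```

The third case is why the workaround helps, and also why it weakens the protection: the crumb is no longer tied to a particular session.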
[JENKINS-59193] Session-ID missing alongside CSRF tokens
Hey Basil - Thanks so much for following up on this, and for not giving up on the issue (despite your frustrations trying to reproduce it). I've unfortunately not had time to revisit this myself, but after your recent comment, the least I could do was to try your recommended solution.
Although you were correct in guessing that my Swarm user (i.e. worker_node) did not previously have permission Overall/Read, granting this permission unfortunately did not solve the problem. The permissions currently granted to worker_node are: Overall/Read, Credentials/UseItem, Agent/Build, Agent/Configure, Agent/Connect (through another permission), Agent/Create, Agent/Delete, Agent/Disconnect, Job/Build.
I'll try to revisit this in the next week or so - Thanks again for your continued attention to it!
Thanks for the reply, katzdm. Can you provide more information about your Authorization Strategy? Are you using matrix-based security, project-based Matrix Authorization Strategy, or Role-Based Strategy? Are you using a Jenkins API token or a password? Also please verify you are using the latest release of the Swarm plugin and Swarm client, as it contains some security related fixes relating to authorization. If you have some time and don't mind writing some Java code, see if you can reproduce the problem with some variant of your configuration programmatically in AuthorizationStrategyTest. Otherwise please tell me as much as you can about your authorization setup, obviously without revealing any sensitive information.
We're using a project-based Matrix Authorization Strategy, and I don't believe we're using an API token/password (AFAIK). We're currently on version 1.21 of the plugin, but version 1.18 of the client - Have there been relevant fixes since then? If so, I'll get the client upgraded and we can try again.
I'll try to dig into the Java when I've exhausted every other idea, which won't be long now.
Yes, please ensure you are running the latest version of both the plugin and the client, as both contain multiple fixes regarding CSRF and authorization.
Hey Basil - Tried today with the latest versions of both the plugin and client; no dice - still seeing the issue, when I disable EXCLUDE_SESSION_ID.
I'll try to dig more and look into implementing an integration test, but I still think this makes some sense: The commit I cited in the original bug incorporates the "Session ID" (which as far as I can tell derives from the JSESSIONID cookie) into the crumb-generation logic. If we don't pass back the JSESSIONID (which we don't appear to), then our crumb will be invalidated. Example of the Set-Cookie header from the Jenkins master:
'Set-Cookie': 'ACEGI_SECURITY_HASHED_REMEMBER_ME_COOKIE=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Secure; HttpOnly,JSESSIONID.f443e328=node01s03czg5u9gagotkcld9tm98225.node0; Path=/; Secure; HttpOnly'
Note the JSESSIONID.xyz component, which we don't seem to return to the master in subsequent requests.
Anyway, as I said - I'll try to dig more on my side.
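For anyone inspecting a similar header, the attributes of the session cookie can be checked with the JDK's own java.net.HttpCookie parser. This is only a sketch: the class name SetCookieSketch and the helper parseSingle are made up for illustration, and it parses just the JSESSIONID part of the header above, since the logged value is two cookies joined with a comma.

```java
import java.net.HttpCookie;
import java.util.List;

public class SetCookieSketch {
    // Hypothetical helper: parses a single Set-Cookie value and returns the cookie it describes.
    static HttpCookie parseSingle(String setCookieValue) {
        List<HttpCookie> parsed = HttpCookie.parse(setCookieValue);
        return parsed.get(0);
    }

    public static void main(String[] args) {
        HttpCookie cookie = parseSingle(
                "JSESSIONID.f443e328=node01s03czg5u9gagotkcld9tm98225.node0; Path=/; Secure; HttpOnly");
        System.out.println("name:     " + cookie.getName());
        System.out.println("path:     " + cookie.getPath());
        System.out.println("secure:   " + cookie.getSecure());
        System.out.println("httpOnly: " + cookie.isHttpOnly());
    }
}
```

The Secure and HttpOnly flags reported here are the attributes that determine when a client is allowed to send the cookie back.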
Thanks for following up with the latest versions. I stepped through the code in the Java debugger against my production Jenkins instance, and as far as I can tell I am correctly returning the JSESSIONID in Swarm. Here is a walkthrough of the code:
- First, hudson.plugins.swarm.SwarmClient#createSwarmSlave creates an instance of HttpClientContext.
- Next, hudson.plugins.swarm.SwarmClient#getCsrfCrumb calls the crumbIssuer API, successfully obtaining a crumb request field and a crumb. Note that this requires Overall/Read permission for the user. The server's response sets a JSESSIONID cookie, which is visible in the HttpClientContext's BasicCookieStore (e.g., JSESSIONID.472fe218).
- Next, hudson.plugins.swarm.SwarmClient#createSwarmSlave adds the CSRF header returned by hudson.plugins.swarm.SwarmClient#getCsrfCrumb to the POST request.
- Next, hudson.plugins.swarm.SwarmClient#createSwarmSlave executes the POST request. During this process, org.apache.http.client.protocol.RequestAddCookies examines the HttpClientContext's BasicCookieStore, finds the JSESSIONID cookie, and adds it to the set of headers before making the HTTP POST request.
The key invariants for all this to work are:
- The HTTP GET in SwarmClient#getCsrfCrumb must always be called immediately before making any POST request.
- The HttpClientContext (and therefore the BasicCookieStore) must always be shared between the CSRF crumb retrieval GET and the subsequent POST.
- The POST must contain both a CSRF header (obtained from the previous CSRF crumb retrieval GET) and a JSESSIONID cookie (which must be the same JSESSIONID from the previous CSRF crumb retrieval GET).
As far as I can tell all of these invariants are met in the code.
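The invariants can be tried out in isolation with a toy server and client built only from JDK classes. This is a sketch under loud assumptions and is not the Swarm client's code: Swarm uses Apache HttpClient with an HttpClientContext, whereas this uses java.net.http.HttpClient with a java.net.CookieManager as the shared cookie store. The endpoint paths, the X-Toy-Crumb header name, and the "crumb-for-" format are all invented for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.CookieManager;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class CrumbFlowSketch {
    static final String CRUMB_HEADER = "X-Toy-Crumb"; // hypothetical header name
    static final String PREFIX = "crumb-for-";        // hypothetical crumb format

    // Runs GET /crumb followed by POST /create and returns the POST status code.
    static int postStatus(boolean shareCookies) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            // GET: start a session and issue a crumb bound to that session.
            server.createContext("/crumb", ex -> {
                String session = UUID.randomUUID().toString();
                ex.getResponseHeaders().add("Set-Cookie", "JSESSIONID=" + session + "; Path=/; HttpOnly");
                byte[] body = (PREFIX + session).getBytes(StandardCharsets.UTF_8);
                ex.sendResponseHeaders(200, body.length);
                ex.getResponseBody().write(body);
                ex.close();
            });
            // POST: accept only if the crumb matches the session cookie that came with it.
            server.createContext("/create", ex -> {
                String cookie = ex.getRequestHeaders().getFirst("Cookie");
                String crumb = ex.getRequestHeaders().getFirst(CRUMB_HEADER);
                boolean ok = cookie != null && crumb != null && crumb.startsWith(PREFIX)
                        && cookie.contains("JSESSIONID=" + crumb.substring(PREFIX.length()));
                ex.sendResponseHeaders(ok ? 200 : 403, -1);
                ex.close();
            });
            server.start();
            try {
                String base = "http://127.0.0.1:" + server.getAddress().getPort();
                HttpClient.Builder builder = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1);
                if (shareCookies) {
                    builder.cookieHandler(new CookieManager()); // one cookie store for GET and POST
                }
                HttpClient client = builder.build();
                String crumb = client.send(
                        HttpRequest.newBuilder(URI.create(base + "/crumb")).GET().build(),
                        HttpResponse.BodyHandlers.ofString()).body();
                return client.send(
                        HttpRequest.newBuilder(URI.create(base + "/create"))
                                .header(CRUMB_HEADER, crumb)
                                .POST(HttpRequest.BodyPublishers.noBody()).build(),
                        HttpResponse.BodyHandlers.discarding()).statusCode();
            } finally {
                server.stop(0);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shared cookie store: " + postStatus(true));
        System.out.println("no cookie store:     " + postStatus(false));
    }
}
```

With the shared cookie store the POST carries both the crumb and the matching session cookie, so the toy server accepts it. Without the store the crumb arrives alone and is rejected, which is the symptom described in this issue.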
To see where things are going off the rails you might want to start by turning up logging to a higher level as described here. Then see if there are any errors logged by SwarmClient#getCsrfCrumb or related methods. If you can, sanitize the log and post it here.
If you can, try to step through the above process in an IDE's debugger and see where things are going off the rails, or write a test in AuthorizationStrategyTest that reproduces the problem. If I could attach a debugger to an instance that is exhibiting this problem I'm sure I could figure it out quickly, but it is difficult to guess at the cause from far away.
Hey Basil - Increasing logging verbosity turned out to be the key here - And my apologies! As you suspected, this was on my end.
Turns out that our Swarm workers reach the Jenkins master through a minimal and very poorly built proxy served from localhost (this was implemented well before my time working on our stack, and I only discovered the existence of the proxy yesterday). The only purpose of the proxy is to append an "X-Forwarded-User: worker_node" header to identify itself to the Jenkins master; this header is typically appended to human-users' requests by our OAuth2 reverse-proxy, but Swarm agents access the master directly.
Turns out that the Jenkins master sends JSESSIONID cookies with the Secure attribute, which (correctly) forces the Swarm client to omit the cookie from http://localhost-bound requests. That solves that mystery.
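The same behavior can be reproduced with the JDK's java.net.CookieManager, which also refuses to return Secure cookies for plain-HTTP URIs. This is only an analogy for what the Apache HttpClient inside Swarm does; the class name, the helper, the shortened cookie value, and the proxy.example.test host are invented for the sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.List;
import java.util.Map;

public class SecureCookieSketch {
    // Hypothetical helper: stores a Set-Cookie value received from one URI and
    // returns the Cookie header values the manager would send to another URI.
    static List<String> cookiesFor(String receivedFrom, String setCookie, String sendTo) {
        try {
            CookieManager manager = new CookieManager();
            manager.put(URI.create(receivedFrom), Map.of("Set-Cookie", List.of(setCookie)));
            Map<String, List<String>> headers = manager.get(URI.create(sendTo), Map.of());
            return headers.getOrDefault("Cookie", List.of());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String setCookie = "JSESSIONID.f443e328=abc123; Path=/; Secure; HttpOnly";
        String plain = "http://proxy.example.test:8080/";
        String tls = "https://proxy.example.test:8443/";

        // Plain HTTP: the Secure cookie is withheld, so the server sees a new session.
        System.out.println(cookiesFor(plain, setCookie, plain)); // []
        // HTTPS: the cookie is returned and the session (and crumb) stays valid.
        System.out.println(cookiesFor(plain, setCookie, tls));
    }
}
```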
If I could poke you with one more question, I am now left with the question of how to move forward. I would still like to remove that EXCLUDE_SESSION_ID override from our configuration, after all. Is there an easier way for me to inject an X-Forwarded-User header into Swarm-issued requests? If not, would you entertain adding a command-line option (--include-header or something?) to support such a use case (or maybe something to that effect already exists within the underlying Apache machinery)? Or perhaps we're going about this wrong, and you would recommend a different setup configuration altogether.
Thanks again for your time, and thanks in advance for any guidance you could give on best practices here!
---Dan
Great, I'm glad you were able to figure this out! There is some documentation for using Swarm with proxies here. I'm not very familiar with proxies or cookies with the Secure attribute, but from a quick Google search I gather that
A cookie with the Secure attribute is sent to the server only with an encrypted request over the HTTPS protocol, never with unsecured HTTP, and therefore can't easily be accessed by a man-in-the-middle attacker.
I am reluctant to add support for arbitrary headers. It seems fairly outside the scope of what the Swarm client is meant to do. It is not meant to be a general purpose utility like curl(1) but rather just a wrapper around the Remoting JAR, and its proxy support matches that of Remoting. I would be willing to reconsider this position if other users express an interest and can demonstrate a common use case that requires this. As far as your specific issue is concerned, a proxy seems like the right approach to add custom headers, but it seems like your particular proxy configuration isn't compatible with how Jenkins sets the JSESSIONID cookie. So in my opinion the most appropriate solution would be to make your proxy configuration compatible by enabling TLS on your proxy server and setting -Dhttps.proxyHost and -Dhttps.proxyPort (note the "s") per the Swarm proxy documentation. Please let me know if I've misunderstood any of this.
Hey Basil - After reading up on Jenkins API tokens, I was able to make this work for our use case! I think we can safely close this - Thanks for your help with debugging!
Hey katzdm, ohzaki, and bruce, I think I have figured this out. This doesn't have anything to do with the CSRF configuration but rather the Authorization Strategy configuration. Your Swarm user needs the Overall/Read permission in order to obtain a CSRF token. I just recently documented the recommended configuration for Swarm with examples and screenshots for matrix-based security, project-based Matrix Authorization Strategy, and Role-Based Strategy. Please ensure that you have configured your permissions appropriately following the above documentation.