[JENKINS-34923] One-Shot-Executor requirements

Type: Improvement
Resolution: Fixed
Priority: Minor
Component/s: core
Labels:
None

Similar Issues:
Powered by SuggestiMate

Show

one-shot-executor is an alternate approach to slave allocation, relying on a 1:1 lifecycle between build and slave running it. Current implementation relies on hacks and implementation details.

expose Computer #removeExecutor to subclasses so they can remove the slave just as the executor has completed the build.
Let Run create the log's BuildListener before the Computer has been started, as same log will host slave launcher log and build log.

Point 2 require to change the way StreamTaskListener do handle encoding. Current design is to rely on build slave default encoding, which require access to computer to get this information. Forcing conversion to UTF-8 as the log is collected for storage on master would avoid this and let us release this lifecycle dependency between BuildListener and Computer.

links to

Original attempt to fix by ndeloof

Nicolas De Loof created issue - 2016-05-18 17:17

Nicolas De Loof made changes - 2016-05-18 17:18

Description

Original: one-shot-executor is an alternate approach to slave allocation, relying on a 1:1 lifecycle between build and slave running it. Current implementation relies on hacks and implementation details.

# expose Computer #removeExecutor to subclasses so they can remove the slave just as the executor has completed the build.
# Let Run create the log's BuildListener before the Computer has been started.

Point 2 require to change the way StreamTaskListener do handle encoding. Current design is to rely on build slave default encoding, which require access to computer to get this information. Forcing conversion to UTF-8 as the log is collected for storage on master would avoid this and let us release this lifecycle dependency between BuildListener and Computer.

New: one-shot-executor is an alternate approach to slave allocation, relying on a 1:1 lifecycle between build and slave running it. Current implementation relies on hacks and implementation details.

# expose Computer #removeExecutor to subclasses so they can remove the slave just as the executor has completed the build.
# Let Run create the log's BuildListener before the Computer has been started, as same log will host slave launcher log and build log.

Point 2 require to change the way StreamTaskListener do handle encoding. Current design is to rely on build slave default encoding, which require access to computer to get this information. Forcing conversion to UTF-8 as the log is collected for storage on master would avoid this and let us release this lifecycle dependency between BuildListener and Computer.

Kohsuke Kawaguchi made changes - 2016-06-06 23:09

Remote Link

New: This issue links to "Original attempt to fix by ndeloof (Web Link)" [ 14418 ]

Kohsuke Kawaguchi added a comment - 2016-06-07 00:52

I took a look at this more closely.

Change under consideration

The original PR #2302 attempted by ndeloof tries to change the encoding of the console log file that the master keeps. It is currently written in the encoding of the build agent that runs the build, and the PR attempted to change this to UTF-8.

Motivation

One-off-executor plugin would like to launch a new build agent synchronously from within Executor that's carrying out a build (such as creating a new container on Kubernetes.) In general, such an activity might produce some logs that logically should go to Run.getLogFile().

Run.execute() currently determines (A) the platform encoding of the build agent, then creates (B) a log file and starts writing. This ordering is problematic for the OOE plugin because (C) the the launch activitiy needs to happen to be able to determine (A), and (C) requires (B) to be able to log any progress/failure during the launch. So there's cyclic dependency between (A)->(B)->(C)->(A).

Analysis

Upon closer look, I don't think such encoding change is feasible without various regressions, and here is why; Run.charset has an intimate relationship with the encoding of Run.getLogFile(). (D) Too many code makes an assumption that Run.getLogFile() is encoded in Run.charset (correctly as of today), such as ConsoleCommand.run(). And likewise, (E) too many code makes an assumption that TaskListener.getLogger() receives text encoded in Run.charset (also correctly as of today), such as Maven.perform() where MavenConsoleAnnotator filters log output. In the world where logs are stored in UTF-8, the encoding in which Run.getLogFile() is written vs the encoding of bytes that stream through TaskListener.getLogger() will be different, and this will break (D) or (E). The only way out of (D) vs (E) dilemma is to use UTF-8 for everything from TaskListener.getLogger() to Run.getLogFile() by pushing re-encoding (from the platform encoding of the build agent to UTF-8) into Launcher (for example by tweaking the contract that ProcStarter.stdout deals with UTF-8), but doing this now creates IMO undeisrable interaction between Launcher and Executor, and also breaks an use case where Launcher is used to fork off a process whose I/O is not text-based (such as java -jar slave.jar)

Suggested solution

Given the difficulty in the original attempted fix, my suggestion to this problem is as follows:

(F) Define RunListener.onInitialize(Run) that is fired in Run.execute() before Computer.getDefaultCharset() is called to provide the OOE plugin an opportunity to perform a launch and avoid relying on a fragile assumption that Computer.getDefaultCharset() is the point in which provisioning has to happen.

(G) Run.execute() will open log file with the append flag, so that the activity in (F) can open and write to Run.getLogFile() without getting overwritten. Because of the (A)->(B)->(C)->(A) cycle above, in general we cannot be certain that (H) the encoding in which the launch activity is written and (A) are identical, but I think this is the least of the evil because:

In a homogeneous environment where the platform encoding of the master & agents are the same, this is not a problem
OOE plugin can stick to ASCII charset to reduce the likelihood of this problem
There might be out-of-the-band information to help plugins like OOE determine the eventual platform encoding of the build agent. For example, in CJK countries, Linux servers have been traditionally run with UTF-8. Docker containers appear to be mostly packaged with the "POSIX" locale (meaning ASCII), and UTF-8 at best (source), so for Docker slave plugin, it can write the log file in UTF-8 to minimize the encoding mismatch problem between (A) and (H).

I'm going to write a PR based on this suggested solution.

Kohsuke Kawaguchi added a comment - 2016-06-07 00:52 I took a look at this more closely. Change under consideration The original PR #2302 attempted by ndeloof tries to change the encoding of the console log file that the master keeps. It is currently written in the encoding of the build agent that runs the build, and the PR attempted to change this to UTF-8. Motivation One-off-executor plugin would like to launch a new build agent synchronously from within Executor that's carrying out a build (such as creating a new container on Kubernetes.) In general, such an activity might produce some logs that logically should go to Run.getLogFile() . Run.execute() currently determines (A) the platform encoding of the build agent, then creates (B) a log file and starts writing. This ordering is problematic for the OOE plugin because (C) the the launch activitiy needs to happen to be able to determine (A), and (C) requires (B) to be able to log any progress/failure during the launch. So there's cyclic dependency between (A)->(B)->(C)->(A). Analysis Upon closer look, I don't think such encoding change is feasible without various regressions, and here is why; Run.charset has an intimate relationship with the encoding of Run.getLogFile() . (D) Too many code makes an assumption that Run.getLogFile() is encoded in Run.charset (correctly as of today), such as ConsoleCommand.run() . And likewise, (E) too many code makes an assumption that TaskListener.getLogger() receives text encoded in Run.charset (also correctly as of today), such as Maven.perform() where MavenConsoleAnnotator filters log output. In the world where logs are stored in UTF-8, the encoding in which Run.getLogFile() is written vs the encoding of bytes that stream through TaskListener.getLogger() will be different, and this will break (D) or (E). The only way out of (D) vs (E) dilemma is to use UTF-8 for everything from TaskListener.getLogger() to Run.getLogFile() by pushing re-encoding (from the platform encoding of the build agent to UTF-8) into Launcher (for example by tweaking the contract that ProcStarter.stdout deals with UTF-8), but doing this now creates IMO undeisrable interaction between Launcher and Executor , and also breaks an use case where Launcher is used to fork off a process whose I/O is not text-based (such as java -jar slave.jar ) Suggested solution Given the difficulty in the original attempted fix, my suggestion to this problem is as follows: (F) Define RunListener.onInitialize(Run) that is fired in Run.execute() before Computer.getDefaultCharset() is called to provide the OOE plugin an opportunity to perform a launch and avoid relying on a fragile assumption that Computer.getDefaultCharset() is the point in which provisioning has to happen. (G) Run.execute() will open log file with the append flag, so that the activity in (F) can open and write to Run.getLogFile() without getting overwritten. Because of the (A)->(B)->(C)->(A) cycle above, in general we cannot be certain that (H) the encoding in which the launch activity is written and (A) are identical, but I think this is the least of the evil because: In a homogeneous environment where the platform encoding of the master & agents are the same, this is not a problem OOE plugin can stick to ASCII charset to reduce the likelihood of this problem There might be out-of-the-band information to help plugins like OOE determine the eventual platform encoding of the build agent. For example, in CJK countries, Linux servers have been traditionally run with UTF-8. Docker containers appear to be mostly packaged with the "POSIX" locale (meaning ASCII), and UTF-8 at best ( source ), so for Docker slave plugin, it can write the log file in UTF-8 to minimize the encoding mismatch problem between (A) and (H). I'm going to write a PR based on this suggested solution.

Nicolas De Loof added a comment - 2016-06-07 07:16 - edited

Thanks for detailed analysis.
see https://github.com/jenkinsci/jenkins/pull/2395
and https://github.com/jenkinsci/one-shot-executor-plugin/tree/JENKINS-34923 for plugin to use it

My main concern here is that doing this we throw the encoding responsibility to the OOE implementation, and ComputerLauncher it uses.
I'm not as confident as you are with low risk in assuming launcher log encoding will be compatible with build log. As you said, considering Docker-slaves we made the assumption of UTF-8 everywhere, which is probably fine for most usages. Sounds to me we will need to give the end-user some option to declare the slave encoding to prevent charset mismatch in build log.

Nicolas De Loof added a comment - 2016-06-07 07:16 - edited Thanks for detailed analysis. see https://github.com/jenkinsci/jenkins/pull/2395 and https://github.com/jenkinsci/one-shot-executor-plugin/tree/JENKINS-34923 for plugin to use it My main concern here is that doing this we throw the encoding responsibility to the OOE implementation, and ComputerLauncher it uses. I'm not as confident as you are with low risk in assuming launcher log encoding will be compatible with build log. As you said, considering Docker-slaves we made the assumption of UTF-8 everywhere, which is probably fine for most usages. Sounds to me we will need to give the end-user some option to declare the slave encoding to prevent charset mismatch in build log.

SCM/JIRA link daemon added a comment - 2016-06-09 15:57

Code changed in jenkins
User: Nicolas De Loof
Path:
core/src/main/java/hudson/model/Computer.java
core/src/main/java/hudson/model/Run.java
core/src/main/java/hudson/model/listeners/RunListener.java
core/src/main/java/hudson/slaves/SlaveComputer.java
http://jenkins-ci.org/commit/jenkins/310c952602f7241cb74454b7e7fb2be0bf3d2980
Log:
~~JENKINS-34923~~

Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>

SCM/JIRA link daemon added a comment - 2016-06-09 15:57 Code changed in jenkins User: Nicolas De Loof Path: core/src/main/java/hudson/model/Computer.java core/src/main/java/hudson/model/Run.java core/src/main/java/hudson/model/listeners/RunListener.java core/src/main/java/hudson/slaves/SlaveComputer.java http://jenkins-ci.org/commit/jenkins/310c952602f7241cb74454b7e7fb2be0bf3d2980 Log: JENKINS-34923 Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>

SCM/JIRA link daemon added a comment - 2016-06-09 15:57

Code changed in jenkins
User: Nicolas De loof
Path:
core/src/main/java/hudson/model/Computer.java
core/src/main/java/hudson/model/Run.java
core/src/main/java/hudson/model/listeners/RunListener.java
core/src/main/java/hudson/slaves/SlaveComputer.java
http://jenkins-ci.org/commit/jenkins/ceb36b54b5783c36ff15b41a6ea9816154b46ec1
Log:
Merge pull request #2395 from ndeloof/one-shot-6

~~JENKINS-34923~~

Compare: https://github.com/jenkinsci/jenkins/compare/057791ca34c8...ceb36b54b578

SCM/JIRA link daemon added a comment - 2016-06-09 15:57 Code changed in jenkins User: Nicolas De loof Path: core/src/main/java/hudson/model/Computer.java core/src/main/java/hudson/model/Run.java core/src/main/java/hudson/model/listeners/RunListener.java core/src/main/java/hudson/slaves/SlaveComputer.java http://jenkins-ci.org/commit/jenkins/ceb36b54b5783c36ff15b41a6ea9816154b46ec1 Log: Merge pull request #2395 from ndeloof/one-shot-6 JENKINS-34923 Compare: https://github.com/jenkinsci/jenkins/compare/057791ca34c8...ceb36b54b578

Nicolas De Loof made changes - 2016-06-09 16:05

Resolution		New: Fixed [ 1 ]
Status	Original: Open [ 1 ]	New: Closed [ 6 ]

R. Tyler Croy made changes - 2016-07-26 00:01

Workflow

Original: JNJira [ 171124 ]

New: JNJira + In-Review [ 210054 ]

Nicolas De Loof made changes - 2016-09-14 15:06

Assignee

New: Nicolas De Loof [ ndeloof ]

Assignee:: Nicolas De Loof

Reporter:: Nicolas De Loof

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Created:: 2016-05-18 17:17

Updated:: 2016-09-14 15:06

Resolved:: 2016-06-09 16:05

Jenkins

Details

Description

Attachments

Issue Links

Activity

Collapse comment: Kohsuke Kawaguchi added a comment - 2016-06-07 00:52

Change under consideration

Motivation

Analysis

Suggested solution

Expand comment: Kohsuke Kawaguchi added a comment - 2016-06-07 00:52

Collapse comment: Nicolas De Loof added a comment - 2016-06-07 07:16, Edited by Nicolas De Loof - 2016-06-07 08:29

Expand comment: Nicolas De Loof added a comment - 2016-06-07 07:16, Edited by Nicolas De Loof - 2016-06-07 08:29

Collapse comment: SCM/JIRA link daemon added a comment - 2016-06-09 15:57

Expand comment: SCM/JIRA link daemon added a comment - 2016-06-09 15:57

Collapse comment: SCM/JIRA link daemon added a comment - 2016-06-09 15:57

Expand comment: SCM/JIRA link daemon added a comment - 2016-06-09 15:57

People

Dates