[JENKINS-15315] Slave polling hungup

Rob Petti added a comment - 2012-09-27 02:23

Can you provide a threaddump when the hang occurs?
http://jenkinsurl/threadDump

Alexey Larsky added a comment - 2012-09-28 13:45 - edited

Thread dump is attached.

Perforce Polling Log

Started on Sep 27, 2012 9:31:26 PM
Looking for changes...
Using node: Builder
Using remote perforce client: Alexey.Larsky_NM_v01_builder
...nothing here...

Rob Petti added a comment - 2012-09-28 15:03

It looks like your p4 client executable is hanging for some reason. Double check all your settings, and make sure you are using a client version that matches your server version. Also make sure that you aren't using any special characters in your workspace name.

Alexey Larsky added a comment - 2012-10-01 05:49

Thanks Rob.
I have updated p4 from 2010.1 to P4/NTX64/2012.1/490371 (2012/07/02) on both server and client. The workspace names use only dots and underscores.
I will check whether the issue persists on the updated version.

Jesper Hansen added a comment - 2012-10-04 14:23

I have also experienced this problem a few times. The log only contains:

Perforce Polling Log

Started on Oct 4, 2012 12:18:23 PM
Looking for changes...
Using node: cphwrk0249
Using remote perforce client: jenkins_bsp-trunk--379981060

I can't find any p4.exe processes running on cphwrk0249.

Two of the times I experienced this were on days when I know that the network connection on cphwrk0249 had been disconnected for several minutes.

perforce client is version P4/NTX64/2012.1/442152 (2012/04/06)
server is P4D/LINUX26X86/2012.1/518826 (2012/08/30)

Alexey Larsky added a comment - 2012-10-12 10:07 - edited

I again got a hang on the slave. P4 versions: the latest, 2012.2.
Attaching a new thread dump.

Alexey Larsky added a comment - 2012-10-12 10:09 - edited

Uploading "Thread dump [Jenkins] (2012-10-12 14-00-59).htm"
Thread dump for the p4 polling hang on slaves.

Rob Petti added a comment - 2012-10-12 14:15

It's hanging in IO, which to me suggests a network problem. I'm not sure what else I can do aside from adding some kind of timeout. :/

Alexey Larsky added a comment - 2012-10-12 14:26 - edited

But a network problem is an ordinary situation between two computers (master and slaves).
It must not affect future builds. In other words, the task should be able to recover after a network error.

Thank you

Rob Petti added a comment - 2012-10-15 16:25

Yeah, I agree, but the remoting API is supposed to take care of those details. If there is a connection issue, it should be failing outright instead of hanging.

Rob Petti added a comment - 2012-10-15 18:39

Actually, can you post your java, jenkins, and perforce plugin versions?

Oleg Nenashev added a comment - 2013-06-18 11:28

Hello Robert,

This issue is quite painful for our users, so I would like to fix it. It has been reproduced on the latest plugin version (Jenkins version 1.480.3, java version "1.7.0_19", OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)) and on several previous versions as well.

According to the stack traces, P4 hangs at BufferedReader::readLine(), which waits indefinitely for a new line or EOF. I suppose that the P4 command-line client finishes before the call to readLine() or somehow enters interactive mode and waits for user input. However, I can’t reproduce the issue on my test stand with slaves under a debugger. Restarting the slave fixes the problem.

There are 84 BufferedReader::readLine() calls in the p4 plugin, so we can’t just fix the getPerforceResponse() function.

Possible solutions:
• Add something like timeouts to checkout() and the other top-level overrides. BTW, this is just a workaround that gives prompt notification, because only a restart can fix the origin of the issue.
• Add a timeout to getPerforceResponse() only and wait for other errors (yeah, just fix the known issue).
• Replace the BufferedReaders with a wrapper that knows how to handle the issue.

I’m going to implement the second approach. It will be possible to configure the timeout via the Perforce global configuration.

Best regards,
Oleg Nenashev
R&D Engineer, Synopsys Inc.
www.synopsys.com

Rob Petti added a comment - 2013-06-18 16:28

The timeout needs to be reset every time a line comes through. As I've mentioned, some users have a very large amount of data that can take hours to sync, so it shouldn't time out an operation if it's clearly still doing something.
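
A minimal sketch of the per-line reset described above, not the plugin's actual code. The IdleTimeout class and its method names are hypothetical; the clock is injected only so that the logic can be exercised without sleeping.

```java
import java.util.function.LongSupplier;

/** Tracks how long it has been since the last line of output arrived (hypothetical helper). */
final class IdleTimeout {
    private final long timeoutMillis;
    private final LongSupplier clock;   // injected time source, e.g. System::currentTimeMillis
    private long lastActivity;

    IdleTimeout(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.lastActivity = clock.getAsLong();
    }

    /** Call once for every line received; this pushes the deadline forward. */
    void lineReceived() {
        lastActivity = clock.getAsLong();
    }

    /** True only if no line has arrived within the timeout window. */
    boolean isExpired() {
        return clock.getAsLong() - lastActivity > timeoutMillis;
    }
}
```

With this shape, a sync that runs for hours never expires as long as each gap between lines stays under the timeout; only a silent stream does.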

Oleg Nenashev added a comment - 2013-06-18 17:50

Yes, I agree with you.
I hope to finish testing and create a pull request today.

Oleg Nenashev added a comment - 2013-06-18 19:19 - edited

Pull request with the discussed workaround: https://github.com/jenkinsci/perforce-plugin/pull/32

SCM/JIRA link daemon added a comment - 2013-06-27 17:59

Code changed in jenkins
User: Oleg Nenashev
Path:
src/main/java/com/tek42/perforce/parse/AbstractPerforceTemplate.java
src/main/java/hudson/plugins/perforce/PerforceSCM.java
http://jenkins-ci.org/commit/perforce-plugin/b8b0115c5b630566ea2473ad6ced2f0769cc0c7b
Log:
Added optional timeout to com.tek42.perforce.parse.AbstractPerforceTemplate::getPerforceResponse()
Should prevent hanging of p4 checkout in case of https://issues.jenkins-ci.org/browse/JENKINS-15315

Signed-off-by: Oleg Nenashev <nenashev@synopsys.com>

SCM/JIRA link daemon added a comment - 2013-06-27 17:59

Code changed in jenkins
User: Rob Petti
Path:
src/main/java/com/tek42/perforce/parse/AbstractPerforceTemplate.java
src/main/java/com/tek42/perforce/process/CmdLineExecutor.java
src/main/java/com/tek42/perforce/process/Executor.java
src/main/java/hudson/plugins/perforce/HudsonP4DefaultExecutor.java
src/main/java/hudson/plugins/perforce/HudsonP4RemoteExecutor.java
src/main/java/hudson/plugins/perforce/PerforceSCM.java
src/main/resources/hudson/plugins/perforce/PerforceSCM/global.jelly
src/main/webapp/help/p4ReadLineTimeout.html
http://jenkins-ci.org/commit/perforce-plugin/043a336b1afc14c1e3c1ce6e29e570d3ae09f592
Log:
Merge pull request #32 from synopsys-arc-oss/p4-hangs-issue-workaround

"Slave polling hangup" issue workaround (JENKINS-15315)

Compare: https://github.com/jenkinsci/perforce-plugin/compare/1fc190959170...043a336b1afc

Jesse Glick added a comment - 2013-09-13 13:52

This change is in 1.3.25; should the JIRA ticket be resolved, or are you still planning some further fixes?

Rob Petti added a comment - 2013-09-13 14:09

I've had a colleague report intermittent issues with functionality related to this, so I wouldn't call it resolved just yet. It also seems like the timeout code can still deadlock, since ready() does not guarantee that the next readLine() won't block.

Oleg Nenashev added a comment - 2013-09-13 14:34

Yes, ready() guarantees only that the next read() will not block; it says nothing about readLine().
I haven't experienced unterminated hangs since the PR, but I agree with Robert. The issue has not been fixed.

What do you mean by "intermittent issues", Rob? Could you update the issue?

Rob Petti added a comment - 2013-09-13 14:39

Not sure if they are related, but retrieving the perforce response sometimes results in no data being received since 1.3.25:

Caught exception communicating with perforce. Problem getting user information for <USER>
com.tek42.perforce.PerforceException: Problem getting user information for <USER>
	at hudson.plugins.perforce.PerforceSCM.retrieveUserInformation(PerforceSCM.java:711)
	at hudson.plugins.perforce.PerforceSCM.checkout(PerforceSCM.java:994)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1369)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:676)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:581)
	at hudson.model.Run.execute(Run.java:1576)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:241)
Caused by: com.tek42.perforce.PerforceException: No output for: /usr/local/bin/p4 user -o <USER> 
	at com.tek42.perforce.parse.AbstractPerforceTemplate.getPerforceResponse(AbstractPerforceTemplate.java:434)
	at com.tek42.perforce.parse.AbstractPerforceTemplate.getPerforceResponse(AbstractPerforceTemplate.java:298)
	at com.tek42.perforce.parse.Users.getUser(Users.java:56)
	at hudson.plugins.perforce.PerforceSCM.retrieveUserInformation(PerforceSCM.java:709)
	... 9 more
ERROR: Unable to communicate with perforce. Problem getting user information for <USER>

Rob Petti added a comment - 2013-09-13 14:47

Ah, yeah, the read loop starts with:

while (reader.ready() || p4.isAlive())

It's entirely possible that when the loop starts, no data has been sent back by the remote slave yet, so reader.ready() returns false, no data is read, and the plugin throws that error.

Jesse Glick added a comment - 2013-09-13 15:17

FYI, a "No output for" issue is filed as JENKINS-15904; unsure if there is any relation.

Oleg Nenashev added a comment - 2013-09-13 15:27

Which executor do you use in that case?

In the case of HudsonP4RemoteExecutor, isAlive() runs after exec(). currentProcess should be available => p4.isAlive() should return true from the start.

Therefore, only one case is possible:

The process has already completed
... but no data has been received yet

I've tried remote hosts with ~1 second delays, but I have not managed to reproduce your case.
Anyway, I should start with automated tests for Perforce checkout operations that can accept various global configurations.

HudsonP4RemoteExecutor:
@Override
public boolean isAlive() throws IOException, InterruptedException {
   return currentProcess != null ? currentProcess.isAlive() : false;
}

RemoteProc:
@Override
public boolean isAlive() throws IOException, InterruptedException {
   return !process.isDone();
}

Rob Petti added a comment - 2013-09-13 15:55

@Jesse That one is different. It's about the remote executor not passing OS-level exceptions back to Jenkins, and instead just closing the pipe as if nothing is wrong. The plugin sees that no data comes back, and throws that error instead of the actual exception (Cannot run program).

@Oleg all remote operations are using the remote executor. I'm not sure if this happens on the master, but it definitely occurs on the slaves. The only thing I can think of is that it may be possible for a remote process to register as being terminated before the data actually becomes available on the pipe, assuming the buffer is large enough for all the data being returned by the command. That would explain why it's failing on relatively small operations, such as p4 user and p4 users, since they terminate quite quickly compared to things such as syncs, and return only a small amount of data.

It may be necessary to remove the loop condition, and just break once we know that the pipe is closed or the timeout has been reached.
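
A rough sketch of a loop with no ready()/isAlive() condition, as suggested above. It is not taken from the plugin; the ResponseReader class and readUntilEof name are made up for illustration. It relies only on readLine() returning null at end of stream, so output that arrives after the process has already exited is still collected. Note that it blocks for as long as the pipe stays open and silent, so any timeout would have to come from somewhere else.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class ResponseReader {
    /** Reads every line until end of stream instead of polling ready() or process liveness. */
    static List<String> readUntilEof(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        // readLine() returns null only once the stream has been closed by the other side.
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }
}
```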

Rob Petti added a comment - 2013-09-25 21:02

It seems like the problem is with the reader.ready() call. Apparently this never becomes true, even when there is data on the pipe. I tried to just check for this before reading, but it hangs indefinitely. Apart from reading from the raw InputStream, I can't see any other way of handling this. :/

Oleg Nenashev added a comment - 2013-09-26 05:39

Another approach: we could add a wrapper around the launcher's IO streams and monitor its activity from an external thread, which can interrupt the launcher and close the stream. An external thread is a significant overhead, but it could be a general approach for all external calls in the P4 plugin.

BTW, reader.ready() works for me (local and remote Windows slave).

Rob Petti added a comment - 2013-09-26 15:01

I'm testing on a Linux master, and all small operations fail to return ready as true. I already tried using a watchdog thread, but read cannot be interrupted at all. We would need to write our own reader so we can manipulate the stream directly.

Rob Petti added a comment - 2013-09-26 15:57

It seems like InputStream.available() is always returning 0 as well... I'm at a total loss now. We might not have any choice but to back out the changes.

Oleg Nenashev added a comment - 2013-09-26 18:49

InputStream.available() returns 0 by default. Several subclasses, such as BufferedInputStream, override this method.

What about using a Future wrapper? StackOverflow has several samples: http://stackoverflow.com/questions/804951/is-it-possible-to-read-from-a-inputstream-with-a-timeout

P.S. I suppose that moving to the newest P4Java versions could be the best real solution for this issue (rather than a workaround), but it almost means rewriting from scratch.
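
To illustrate the point about available(): a small sketch, with the hypothetical AvailableDemo helper, of a stream that has data to return yet reports nothing available because it inherits the base InputStream implementation. Whether the streams used by the remoting layer behave this way is an assumption, not something confirmed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class AvailableDemo {
    /** Wraps data in a stream that implements read() but does not override available(). */
    static InputStream streamWithoutAvailable(byte[] data) {
        InputStream source = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() throws IOException {
                return source.read();
            }
            // available() is inherited from InputStream, which always returns 0.
        };
    }
}
```

A caller that polls available() (or ready() on a reader built on top of such a stream) before reading could therefore wait forever even though read() would succeed.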

Rob Petti added a comment - 2013-09-26 19:07

Look at the comments on the answer that suggests Futures. If we used this, we'd leak threads every time Perforce hangs until Jenkins is restarted, since there's absolutely no way to interrupt a read operation in Java apart from killing the JVM entirely... I don't think this is an option here.
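
For reference, a sketch of what the Future-based read looks like and where the leak comes from. FutureRead and readLineWithTimeout are illustrative names, not plugin code. The caller gets control back after the timeout, but the worker thread stays inside readLine() unless the underlying stream happens to react to interruption or is closed.

```java
import java.io.BufferedReader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class FutureRead {
    /**
     * Returns the next line, or null if none arrives within timeoutMillis.
     * On timeout the worker thread may remain blocked inside readLine().
     */
    static String readLineWithTimeout(BufferedReader reader, ExecutorService pool, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        Callable<String> task = reader::readLine;
        Future<String> pending = pool.submit(task);
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the worker; a read that ignores
            // interruption keeps the thread occupied until the stream ends.
            pending.cancel(true);
            return null;
        }
    }
}
```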

Rob Petti added a comment - 2013-09-26 20:06

I rewrote the timeout functionality to spawn a thread that waits, then closes the underlying InputStream if no lines have been received for a while. This seems to work fine, at least on my system.

Also, there's still no timeout on several of the perforce response methods being used by the plugin. Only one of them currently has a timeout.
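
A minimal sketch of the watchdog idea described above, assuming that closing the stream makes a pending read fail with an IOException (this depends on the stream implementation and platform). It is not the code that went into the plugin; WatchdogReader and readLines are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

final class WatchdogReader {
    /**
     * Reads lines until EOF. A daemon thread closes the stream if no line
     * arrives for idleMillis, which surfaces to the caller as an IOException.
     */
    static List<String> readLines(InputStream in, long idleMillis) throws IOException {
        AtomicLong lastActivity = new AtomicLong(System.nanoTime());
        Thread watchdog = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(Math.max(1, idleMillis / 4));
                    long idleNanos = System.nanoTime() - lastActivity.get();
                    if (idleNanos > idleMillis * 1_000_000L) {
                        in.close();   // give up: no output within the idle window
                        return;
                    }
                }
            } catch (InterruptedException | IOException e) {
                // Interrupted because reading finished, or close() failed: stop watching.
            }
        });
        watchdog.setDaemon(true);
        watchdog.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
                lastActivity.set(System.nanoTime());   // reset the idle timer on every line
            }
        } finally {
            watchdog.interrupt();
        }
        return lines;
    }
}
```

Because the timer is reset on every line, a long sync that keeps producing output is not cut off; only a stream that goes quiet for the whole window is closed.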

James Howe added a comment - 2014-06-25 08:59

I'm still having problems.
Slave polling p4 command hangs (on OSX slave), perforce plugin timeout feature isn't killing it.
Eventually every fork on the slave failed with EAGAIN.
dtruss shows nothing happening, netstat shows p4's sockets in SYN_SENT

Jenkins 1.569, plugin 1.3.27, p4 2013.3
Will update to 2014.1 and see if anything changes.

Oleg Nenashev added a comment - 2014-06-25 09:44 - edited

The issue has not been solved yet; the timeouts are not reliable.
BTW, an update to the new client version may work around the issue.

James Howe added a comment - 2014-06-25 09:53

>Will update to 2014.1 and see if anything changes.
It didn't.

James Howe added a comment - 2014-06-25 10:31

Running the same commands manually that are hanging, with the same credentials, shows no issues.

Brantone added a comment - 2014-07-23 08:37

In my case, after leaving it polling for a week, netstat shows several thousand entries sitting in FIN_WAIT_2.

Alexey Larsky added a comment - 2015-03-21 19:21

Version 1.31 is the most stable. 1.33 and 1.34 hang on slaves periodically.

Oleg Nenashev added a comment - 2017-06-16 17:19

I no longer work on the plugin. Unassigned.
