Type: Improvement
Resolution: Fixed
Priority: Minor
The plugin is robust: most network outages and difficulties do not affect uploads, or only increase their duration. Downloads are somewhat weaker, and network outages can break a download.
Network outages test
Scenario: we have a Jenkins instance with the S3 Artifact Manager Plugin installed, and a ToxiProxy service connected to a Squid HTTP proxy. We configured a rule on ToxiProxy that redirects port 8888 to the Squid service port 3128, and we configured the Jenkins instance to use port 8888 as its proxy (with the Java properties -Dhttp.proxyHost=127.0.0.1 -Dhttp.proxyPort=8888 -Dhttps.proxyHost=127.0.0.1 -Dhttps.proxyPort=8888).
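For reference, a minimal sketch of how a Jenkins controller could be started with those proxy properties (the jenkins.war path and HTTP port are assumptions, not taken from this report):
# Hypothetical launch command; adjust the jenkins.war path and port to your installation.
# The proxy properties are the ones used in this test (ToxiProxy published on 127.0.0.1:8888).
java -Dhttp.proxyHost=127.0.0.1 -Dhttp.proxyPort=8888 \
     -Dhttps.proxyHost=127.0.0.1 -Dhttps.proxyPort=8888 \
     -jar jenkins.war --httpPort=8080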
Test Scripts
Big-file test
def file = "test.bin" timestamps { node() { stage('Generating ${file}') { sh "[ -f ${file} ] || dd if=/dev/urandom of=${file} bs=10240 count=102400" } stage('Archive') { archiveArtifacts file } stage('Unarchive') { unarchive mapping: ["${file}": 'test.bin'] } } }
Small-files Test
timestamps {
    node() {
        stage('Setup') {
            // Write a batch of small test files
            for (def i = 1; i < 100; i++) {
                writeFile file: "test/test-${i}.txt", text: "test ${i}"
            }
        }
        stage('Archive') {
            archiveArtifacts "test/*"
        }
        stage('Unarchive') {
            dir('unarch') {
                deleteDir()
                unarchive mapping: ["test/": '.']
            }
        }
    }
}
Stash Test
timestamps {
    node() {
        stage('Setup') {
            // Write a batch of small test files
            for (def i = 1; i < 100; i++) {
                writeFile file: "test/test-${i}.txt", text: "test ${i}"
            }
        }
        stage('Archive') {
            stash name: 'stuff', includes: 'test/'
        }
        stage('Unarchive') {
            dir('unarch') {
                deleteDir()
                unstash name: 'stuff'
            }
        }
    }
}
Prepare the environment
export NET=172.18.5.0/24
export squid=172.18.5.10
export toxiproxy=172.18.5.11
export HOSTS="--add-host squid.example.com:${squid} \
  --add-host toxiproxy.example.com:${toxiproxy}"
docker network create --subnet=${NET} toxiNetwork
start the toxiproxy docker container
docker pull shopify/toxiproxy
docker run -it -p 8474:8474 -p 8888:8888 --rm --ip ${toxiproxy} --net toxiNetwork ${HOSTS} \
  --name toxiproxy shopify/toxiproxy
start squid
docker run -d -p 3128:3128 --rm --ip ${squid} --net toxiNetwork ${HOSTS} --name squid minimum2scp/squid
create a configuration file with two redirection rules
cat <<EOF > toxiproxy.json
[{
  "name": "squid",
  "listen": "${toxiproxy}:8888",
  "upstream": "${squid}:3128"
}]
EOF
load the configuration in toxiproxy
curl -X POST http://127.0.0.1:8474/populate -d"@toxiproxy.json" && echo
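To confirm that the proxy was registered, you can list the configured proxies through ToxiProxy's admin API on port 8474 (an optional sanity check, not part of the original procedure):
# List the proxies currently registered in ToxiProxy
curl http://127.0.0.1:8474/proxies && echo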
This command enables or disables the proxy; we will use it to simulate network outages.
curl -X POST http://127.0.0.1:8474/proxies/squid -d '{"enabled": true}' && echo
Simulate a TIME-second continuous network outage
Launch the jobs, then execute this script; change the TIME variable to test 1, 5, 10, and 30 seconds of outage per 1 second of connectivity.
export TIME=1
while true; do
  curl -X POST http://127.0.0.1:8474/proxies/squid -d '{"enabled": true}' && echo
  sleep 1
  curl -X POST http://127.0.0.1:8474/proxies/squid -d '{"enabled": false}' && echo
  sleep ${TIME}
done
Simulate a TIME-second isolated network outage
Launch the jobs, then execute this script once during each job stage; change the TIME variable to test 1, 5, 10, and 30 seconds of outage.
export TIME=1
curl -X POST http://127.0.0.1:8474/proxies/squid -d '{"enabled": true}' && echo
sleep 1
curl -X POST http://127.0.0.1:8474/proxies/squid -d '{"enabled": false}' && echo
sleep ${TIME}
curl -X POST http://127.0.0.1:8474/proxies/squid -d '{"enabled": true}' && echo
sleep 1
curl -X POST http://127.0.0.1:8474/proxies/squid -d '{"enabled": false}' && echo
sleep ${TIME}
curl -X POST http://127.0.0.1:8474/proxies/squid -d '{"enabled": true}' && echo
Simulate latency
Create the toxic and execute the test jobs.
curl -X POST http://127.0.0.1:8474/proxies/squid/toxics -d '{"name": "latency_squid", "type": "latency", "stream": "upstream", "toxicity": 1.0, "attributes": {"latency": 1000, "jitter": 1000} }' && echo
When you have finished, delete the toxic.
curl -X DELETE http://127.0.0.1:8474/proxies/squid/toxics/latency_squid
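The latency results below also cover 10000 ms and 30000 ms. A minimal sketch of testing another latency value by recreating the toxic with the same endpoints shown above (the jitter value for the larger latencies is an assumption; the report only states the latency):
# Assumed example: remove the old toxic and recreate it with 10000 ms latency
curl -X DELETE http://127.0.0.1:8474/proxies/squid/toxics/latency_squid
curl -X POST http://127.0.0.1:8474/proxies/squid/toxics -d '{"name": "latency_squid", "type": "latency", "stream": "upstream", "toxicity": 1.0, "attributes": {"latency": 10000, "jitter": 1000} }' && echo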
Simulate bandwidth limitations
Create the toxic and execute the test jobs.
curl -X POST http://127.0.0.1:8474/proxies/squid/toxics -d '{"name": "bandwidth_squid", "type": "bandwidth", "stream": "upstream", "toxicity": 1.0, "attributes": {"rate": 1024} }' && echo
When you have finished, delete the toxic.
curl -X DELETE http://127.0.0.1:8474/proxies/squid/toxics/bandwidth_squid
Simulate slow_close
Create the toxic and execute the test jobs.
curl -X POST http://127.0.0.1:8474/proxies/squid/toxics -d '{"name": "slowclose_squid", "type": "slow_close", "stream": "upstream", "toxicity": 1.0, "attributes": {"delay": 1000} }' && echo
When you have finished, delete the toxic.
curl -X DELETE http://127.0.0.1:8474/proxies/squid/toxics/slowclose_squid
Test Results
Continuous network outages
We simulate a network outage of N seconds, then restore the network for one second, and start a new outage; we repeat this process until the job finishes.
Big files
We run a test job that archives a 1GB file, and we try different network outage times:
- 1 second - it fails consistently
- 5 seconds - it fails consistently
- 10 seconds - it fails consistently
- 30 seconds - it fails consistently
Small files
We run a test job that archives and unarchives a few files, and we try different network outage times:
- 1 second - archive is not affected, unarchive fails 90% of the time
- 5 seconds - archive is not affected, unarchive fails 90% of the time
- 10 seconds - archive is not affected, unarchive fails 90% of the time
- 30 seconds - archive is not affected, unarchive fails 90% of the time
Stash
We run a test job that stashes and unstashes a few files, and we try different network outage times:
- 1 second - stash is not affected, unstash fails 70% of the time
- 5 seconds - stash is not affected, unstash fails 70% of the time
- 10 seconds - stash is not affected, unstash fails 80% of the time
- 30 seconds - stash is not affected, unstash fails 90% of the time
Isolated network outages
We simulate a network outage of N seconds, then restore the network for one second, start a new outage of N seconds, and finally restore the network again.
Big files
We run a test job that archives a 1GB file, and we try different network outage times:
- 1 second - it is not affected
- 5 seconds - it is not affected
- 10 seconds - it is not affected; we can see retry messages in the logs
- 30 seconds - it is not affected; we can see retry messages in the logs
Small files
We run a test job that archives and unarchives a few files, and we try different network outage times:
- 1 second - it is not affected
- 5 seconds - archive is not affected and we can see retry messages in the logs; unarchive fails consistently
- 10 seconds - archive is not affected and we can see retry messages in the logs; unarchive fails consistently
- 30 seconds - archive is not affected and we can see retry messages in the logs; unarchive fails consistently
Stash
We run a test job that stashes and unstashes a few files, and we try different network outage times:
- 1 second - stash is not affected and we can see retry messages in the logs; unstash fails consistently
- 5 seconds - stash is not affected and we can see retry messages in the logs; unstash fails consistently
- 10 seconds - stash is not affected and we can see retry messages in the logs; unstash fails consistently
- 30 seconds - stash is not affected and we can see retry messages in the logs; unstash fails consistently
Latency and jitter
We create a toxic with latency and jitter:
- 1000 ms - not affected
- 10000 ms - increases job times
- 30000 ms - fails consistently
Limited bandwidth
We create a toxic to limit the bandwidth (the rate is in KB/s):
- 1 KB/s - increases job times
- 100 KB/s - increases job times
- 1024 KB/s - increases job times
- 10240 KB/s - increases job times
Slow close
We create a toxic that delays the TCP socket from closing until the configured delay has elapsed:
- 1000 ms - not affected
- 10000 ms - increases job times
- 30000 ms - increases job times
shanexpert28: Just to make sure, are you really working on this? If not, please avoid changing issues. This JIRA instance is not a playground.