Hello, during migration from NFS to CephFS file storage we faced a performance degradation of server startup caused by RunIdMigrator.
After trace analysis we figured out the following:
AtomicFileWriter creates its FileChannelWriter with only one OpenOption: StandardOpenOption.WRITE.
When AtomicFileWriter is used to create a new empty file (e.g. jenkins.model.RunIdMigrator#save), this leads to a full filesystem sync instead of an fsync on the dirty inodes only.
As a result, this operation took up to 5 seconds on CephFS.
As a fix, we added the StandardOpenOption.CREATE OpenOption. PR: https://github.com/jenkinsci/jenkins/pull/4357
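For illustration, here is a minimal standalone sketch of the two ways of opening the channel. It is not the actual Jenkins code: the class OpenOptionSketch and the method syncEmptyFile are hypothetical names, and it assumes the writer works on a temp file that already exists and forces the channel without writing any data. On a local filesystem both variants will probably behave the same; the difference described above was observed on CephFS.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only, not part of Jenkins.
final class OpenOptionSketch {

    /**
     * Creates a temp file, opens it with the given options, forces it to
     * storage without writing anything (like saving an empty file),
     * and returns the resulting file size.
     */
    static long syncEmptyFile(OpenOption... options) {
        try {
            // The temp file exists before the channel is opened (assumption).
            Path tmp = Files.createTempFile("atomic", ".tmp");
            try (FileChannel channel = FileChannel.open(tmp, options)) {
                channel.force(true); // flush content and metadata
                return channel.size();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        syncEmptyFile(StandardOpenOption.WRITE); // before the fix
        long t1 = System.nanoTime();
        syncEmptyFile(StandardOpenOption.CREATE, StandardOpenOption.WRITE); // after the fix
        long t2 = System.nanoTime();
        System.out.printf("WRITE only:     %d us%n", (t1 - t0) / 1_000);
        System.out.printf("CREATE + WRITE: %d us%n", (t2 - t1) / 1_000);
    }
}
```

Running main with java.io.tmpdir pointing at a CephFS mount would be one way to compare the two variants on the affected storage.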
Ceph logs Before Fix:
Ceph logs After Fix:
Server startup with 2k jobs that required migration:
- before the fix, startup took ~30 min
- after the fix, startup took 2 min