Bug 1249210 - Install files using multiple threads on Windows; r=glandium

As previous measurements have shown, creating/appending files
on Windows/NTFS is slow because the CloseHandle() Win32 API takes
1-3ms to complete. This is apparently due to a fundamental issue
with NTFS extents. A way to work around this slowness is to use
multiple threads for I/O so file closing doesn't block execution
as much.

This commit updates the file copier to use a thread pool of 4
threads when processing file copies. Additional threads appear
to have diminishing returns.

On my i7-6700K, this reduces the time for processing the tests install
manifest (24,572 files) on Windows from ~22.0s to ~12.5s in the best
case.

Using the thread pool globally resulted in a performance regression
on Linux. Given the performance sensitivity of manifest copying,
I thought it best to implement a slightly redundant non-Windows
branch to preserve performance. For the record, that same machine
running Linux is capable of processing nearly the same install
manifest (24,616 files) in ~2.2s in the best case.

MozReview-Commit-ID: B9LbKaOoO1u
Gregory Szorc 2016-02-19 18:28:26 -08:00
parent 67ee1c7010
commit 2a5babf470

@@ -6,6 +6,7 @@ from __future__ import absolute_import
 import os
 import stat
+import sys
 from mozpack.errors import errors
 from mozpack.files import (
@@ -18,6 +19,7 @@ from collections import (
     Counter,
     OrderedDict,
 )
+import concurrent.futures as futures
 class FileRegistry(object):
@@ -375,10 +377,30 @@ class FileCopier(FileRegistry):
         dest_files = set()
         # Install files.
-        for p, f in self:
-            destfile = os.path.normpath(os.path.join(destination, p))
+        #
+        # Creating/appending new files on Windows/NTFS is slow. So we use a
+        # thread pool to speed it up significantly. The performance of this
+        # loop is so critical to common build operations on Linux that the
+        # overhead of the thread pool is worth avoiding, so we have 2 code
+        # paths. We also employ a low water mark to prevent thread pool
+        # creation if number of files is too small to benefit.
+        copy_results = []
+        if sys.platform == 'win32' and len(self) > 100:
+            with futures.ThreadPoolExecutor(4) as e:
+                fs = []
+                for p, f in self:
+                    destfile = os.path.normpath(os.path.join(destination, p))
+                    fs.append((destfile, e.submit(f.copy, destfile, skip_if_older)))
+            copy_results = [(destfile, f.result()) for destfile, f in fs]
+        else:
+            for p, f in self:
+                destfile = os.path.normpath(os.path.join(destination, p))
+                copy_results.append((destfile, f.copy(destfile, skip_if_older)))
+        for destfile, copy_result in copy_results:
             dest_files.add(destfile)
-            if f.copy(destfile, skip_if_older):
+            if copy_result:
                 result.updated_files.add(destfile)
             else:
                 result.existing_files.add(destfile)
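
The two code paths in this change can be tried outside the build system with a small standalone sketch. It is not the mozpack implementation: `copy_one`, `copy_all`, the `use_pool` override, and the two constants are hypothetical names introduced here for illustration, and `copy_one` only approximates the "updated vs. existing" return value of `f.copy()` by checking whether the destination already exists. The pool size of 4 and the threshold of 100 files are taken from the commit.

```python
import concurrent.futures as futures
import os
import shutil
import sys

# Values mirroring the commit; the constant names are hypothetical.
POOL_SIZE = 4
LOW_WATER_MARK = 100


def copy_one(src, dest):
    """Copy src to dest. Return True if dest was written, False if it existed."""
    if os.path.exists(dest):
        return False
    shutil.copyfile(src, dest)
    return True


def copy_all(pairs, use_pool=None):
    """Copy (src, dest) pairs; return (dest, copied) tuples in input order."""
    if use_pool is None:
        # Same decision as the commit: pool only on Windows, and only when
        # there are enough files to amortize the pool's overhead.
        use_pool = sys.platform == 'win32' and len(pairs) > LOW_WATER_MARK

    if not use_pool:
        # Serial path: no executor overhead.
        return [(dest, copy_one(src, dest)) for src, dest in pairs]

    with futures.ThreadPoolExecutor(POOL_SIZE) as executor:
        pending = [(dest, executor.submit(copy_one, src, dest))
                   for src, dest in pairs]
    # Leaving the with block waits for every submitted copy to finish.
    # result() has to be called: the bare attribute is a bound method,
    # which is always truthy and would mark every file as updated.
    return [(dest, future.result()) for dest, future in pending]
```

Passing `use_pool=True` forces the threaded path on any platform, which makes it possible to compare the two branches on the same machine before deciding where the threshold belongs.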