UnrealEngineUWP/Engine/Source/ThirdParty/mimalloc/docs/overrides.html
danny couture e6f54c17c4 Update mimalloc to version 2.0.0-2762784 for HUGE memory usage improvement in editor workloads
This version of mimalloc is very efficient at distributing threaded allocations in a way that maintains
    locality, which in turn improves the amount of memory that we're able to send back to the system
    after heavily multithreaded workloads. This also improves performance, as fewer page faults and cache misses
    are expected from more densely packed allocations.

    mimalloc v1 seemed to waste more memory because its commit size is larger than TBB's.
    However, its allocation patterns were already much tighter than TBB's; for that to become apparent, you had
    to activate the "page_reset" and "reset_decommits" options, which came at a performance loss.

    mimalloc v2 offers better locality and, by default, decommits memory more aggressively, with
    only a minor performance loss in some cases and a performance gain in many.

    Given the advantages of mimalloc v2 compared to Intel TBB, we should probably consider it
    as our next default allocator for the editor.

 - All tests performed on AMD TR 3970X with 256GB RAM
   - Loading FramingCameraTest map on special project with -ddc=cold and waiting until every asset is built
     - 699s @ 32GB for tbb malloc
     - 655s @ 37GB for mimalloc v1
     - 757s @ 12GB for mimalloc v1 + page_reset and reset_decommits
     - 604s @ 15GB for mimalloc v2
   - Loading P_World on Reverb -ddc=cold and waiting until every asset is built
     - 2372s @ 71GB for tbb malloc
     - 2587s @ 75GB for mimalloc v1
     - 3212s @ 34GB for mimalloc v1 + page_reset and reset_decommits
     - 2503s @ 37GB for mimalloc v2
   - Loading P_Construct_WP on special project with -ddc=cold and waiting until every asset is built
     - 6404s @ 56GB for tbb malloc
     - 6640s @ 37GB for mimalloc v2
   - Loading Apollo_Terrain on FortniteGame with -ddc=cold and waiting until every asset is built
     - 751s @ 33GB for tbb malloc
     - 744s @ 25GB for mimalloc v2
   - Cooking FramingCameraTest map on special project with a warmed-up DDC
     - 379s @ 34GB for tbb malloc
     - 367s @ 29GB for mimalloc v2

#rb Brandon.Dawson, Yuriy.ODonnell, Stefan.Boberg

[CL 15859558 by danny couture in ue5-main branch]
2021-03-30 06:38:15 -04:00

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "https://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen 1.8.15"/>
<meta name="viewport" content="width=device-width, initial-scale=1"/>
<title>mi-malloc: Overriding Malloc</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtreedata.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
$(document).ready(initResizable);
/* @license-end */</script>
<link href="search/search.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="search/searchdata.js"></script>
<script type="text/javascript" src="search/search.js"></script>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
$(document).ready(function() { init_search(); });
/* @license-end */
</script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
<link href="mimalloc-doxygen.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 56px;">
<td id="projectlogo"><img alt="Logo" src="mimalloc-logo.svg"/></td>
<td id="projectalign" style="padding-left: 0.5em;">
<div id="projectname">mi-malloc
&#160;<span id="projectnumber">1.6</span>
</div>
</td>
<td> <div id="MSearchBox" class="MSearchBoxInactive">
<span class="left">
<img id="MSearchSelect" src="search/mag_sel.png"
onmouseover="return searchBox.OnSearchSelectShow()"
onmouseout="return searchBox.OnSearchSelectHide()"
alt=""/>
<input type="text" id="MSearchField" value="Search" accesskey="S"
onfocus="searchBox.OnSearchFieldFocus(true)"
onblur="searchBox.OnSearchFieldFocus(false)"
onkeyup="searchBox.OnSearchFieldChange(event)"/>
</span><span class="right">
<a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.png" alt=""/></a>
</span>
</div>
</td>
</tr>
</tbody>
</table>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.8.15 -->
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
var searchBox = new SearchBox("searchBox", "search",false,'Search');
/* @license-end */
</script>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
<div id="nav-tree">
<div id="nav-tree-contents">
<div id="nav-sync" class="sync"></div>
</div>
</div>
<div id="splitbar" style="-moz-user-select:none;"
class="ui-resizable-handle">
</div>
</div>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
$(document).ready(function(){initNavTree('overrides.html','');});
/* @license-end */
</script>
<div id="doc-content">
<!-- window showing the filter options -->
<div id="MSearchSelectWindow"
onmouseover="return searchBox.OnSearchSelectShow()"
onmouseout="return searchBox.OnSearchSelectHide()"
onkeydown="return searchBox.OnSearchSelectKey(event)">
</div>
<!-- iframe showing the search results (closed by default) -->
<div id="MSearchResultsWindow">
<iframe src="javascript:void(0)" frameborder="0"
name="MSearchResults" id="MSearchResults">
</iframe>
</div>
<div class="PageDoc"><div class="header">
<div class="headertitle">
<div class="title">Overriding Malloc </div> </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>Overriding the standard <code>malloc</code> can be done either <em>dynamically</em> or <em>statically</em>.</p>
<h2>Dynamic override</h2>
<p>This is the recommended way to override the standard malloc interface.</p>
<h3>Linux, BSD</h3>
<p>On these systems we preload the mimalloc shared library so all calls to the standard <code>malloc</code> interface are resolved to the <em>mimalloc</em> library.</p>
<ul>
<li><code>env LD_PRELOAD=/usr/lib/libmimalloc.so myprogram</code></li>
</ul>
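<p>As a rough sketch of what this means in practice, the hypothetical test program below (the file name <code>myprogram.c</code> and its contents are illustrative assumptions, not part of mimalloc) uses only the standard <code>malloc</code> interface and needs no source changes to run under the preload. It then uses <code>dlsym</code> to look for the <code>mi_version</code> symbol of the mimalloc API, on the assumption that a preloaded mimalloc shared library exports it into the global symbol scope:</p>
<div class="fragment"><div class="line"><span class="comment">// myprogram.c (hypothetical test program; not part of mimalloc)</span></div>
<div class="line"><span class="preprocessor">#define _GNU_SOURCE   </span><span class="comment">// glibc only declares RTLD_DEFAULT when this is defined</span></div>
<div class="line"><span class="preprocessor">#include &lt;dlfcn.h&gt;</span></div>
<div class="line"><span class="preprocessor">#include &lt;stdio.h&gt;</span></div>
<div class="line"><span class="preprocessor">#include &lt;stdlib.h&gt;</span></div>
<div class="line"></div>
<div class="line"><span class="keywordtype">int</span> main(<span class="keywordtype">void</span>) {</div>
<div class="line">  <span class="comment">// Plain standard-interface calls; under the preload these should resolve to mimalloc.</span></div>
<div class="line">  <span class="keywordtype">void</span>* p = malloc(64);</div>
<div class="line">  <span class="keywordflow">if</span> (p == NULL) <span class="keywordflow">return</span> 1;</div>
<div class="line">  free(p);</div>
<div class="line"></div>
<div class="line">  <span class="comment">// Assumption: a preloaded libmimalloc makes mi_version visible in the global scope.</span></div>
<div class="line">  <span class="keywordtype">void</span>* sym = dlsym(RTLD_DEFAULT, <span class="stringliteral">"mi_version"</span>);</div>
<div class="line">  printf(<span class="stringliteral">"mi_version symbol found: %s\n"</span>, sym != NULL ? <span class="stringliteral">"yes"</span> : <span class="stringliteral">"no"</span>);</div>
<div class="line">  <span class="keywordflow">return</span> 0;</div>
<div class="line">}</div></div><!-- fragment -->
<p>Built with, for example, <code>gcc -o myprogram myprogram.c -ldl</code>, such a program should report that the symbol is missing when started directly and report it as found when started with the <code>LD_PRELOAD</code> command above.</p>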
<p>You can set extra environment variables to check that mimalloc is running, for example: </p><div class="fragment"><div class="line">env MIMALLOC_VERBOSE=1 LD_PRELOAD=/usr/lib/libmimalloc.so myprogram</div></div><!-- fragment --><p> or run with the debug version to get detailed statistics: </p><div class="fragment"><div class="line">env MIMALLOC_SHOW_STATS=1 LD_PRELOAD=/usr/lib/libmimalloc-debug.so myprogram</div></div><!-- fragment --><h3>macOS</h3>
<p>On macOS we can also preload the mimalloc shared library so all calls to the standard <code>malloc</code> interface are resolved to the <em>mimalloc</em> library.</p>
<ul>
<li><code>env DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=/usr/lib/libmimalloc.dylib myprogram</code></li>
</ul>
<p>Note that certain security restrictions may apply when doing this from the <a href="https://stackoverflow.com/questions/43941322/dyld-insert-libraries-ignored-when-calling-application-through-bash">shell</a>.</p>
<p>(Note: macOS support for dynamic overriding is recent; please report any issues.)</p>
<h3>Windows</h3>
<p>Overriding on Windows is robust and has the particular advantage of being able to redirect all malloc/free calls that go through the (dynamic) C runtime allocator, including those from other DLLs or libraries.</p>
<p>The overriding on Windows requires that you link your program explicitly with the mimalloc DLL and use the C-runtime library as a DLL (using the <code>/MD</code> or <code>/MDd</code> switch). Also, the <code>mimalloc-redirect.dll</code> (or <code>mimalloc-redirect32.dll</code>) must be available in the same folder as the main <code>mimalloc-override.dll</code> at runtime (as it is a dependency). The redirection DLL ensures that all calls to the C runtime malloc API get redirected to mimalloc (in <code>mimalloc-override.dll</code>).</p>
<p>To ensure the mimalloc DLL is loaded at run-time, it is easiest to insert a call to the mimalloc API in the <code>main</code> function, like <code>mi_version()</code> (or use the <code>/INCLUDE:mi_version</code> switch on the linker). See the <code>mimalloc-override-test</code> project for an example of how to use this. For best performance on Windows with C++, it is also recommended to override the <code>new</code>/<code>delete</code> operations (by including <a href="https://github.com/microsoft/mimalloc/blob/master/include/mimalloc-new-delete.h"><code>mimalloc-new-delete.h</code></a> in a single(!) source file in your project).</p>
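<p>A minimal sketch of both suggestions, assuming a C++ program whose entry point lives in a file called <code>main.cpp</code> (a hypothetical name) and a project that has the mimalloc <code>include</code> directory on its include path and links against the mimalloc DLL:</p>
<div class="fragment"><div class="line"><span class="comment">// main.cpp (hypothetical): the single source file that pulls in the new/delete overrides</span></div>
<div class="line"><span class="preprocessor">#include &lt;mimalloc.h&gt;</span></div>
<div class="line"><span class="preprocessor">#include &lt;mimalloc-new-delete.h&gt;</span></div>
<div class="line"><span class="preprocessor">#include &lt;cstdio&gt;</span></div>
<div class="line"></div>
<div class="line"><span class="keywordtype">int</span> main() {</div>
<div class="line">  <span class="comment">// A call into the mimalloc API, so that the mimalloc DLL is loaded at run-time.</span></div>
<div class="line">  std::printf(<span class="stringliteral">"mimalloc version: %d\n"</span>, mi_version());</div>
<div class="line"></div>
<div class="line">  <span class="comment">// With the header above included, this goes through the overridden operators.</span></div>
<div class="line">  <span class="keywordtype">int</span>* values = <span class="keyword">new</span> <span class="keywordtype">int</span>[16];</div>
<div class="line">  <span class="keyword">delete</span>[] values;</div>
<div class="line">  <span class="keywordflow">return</span> 0;</div>
<div class="line">}</div></div><!-- fragment -->
<p>Including the header in more than one source file would define the operators more than once, which is presumably why a single file is stressed above.</p>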
<p>The environment variable <code>MIMALLOC_DISABLE_REDIRECT=1</code> can be used to disable dynamic overriding at run-time. Use <code>MIMALLOC_VERBOSE=1</code> to check if mimalloc was successfully redirected.</p>
<p>(Note: in principle, it is even possible to patch existing executables without any recompilation if they are linked with the dynamic C runtime (<code>ucrtbase.dll</code>) &ndash; just put the <code>mimalloc-override.dll</code> into the import table (and put <code>mimalloc-redirect.dll</code> in the same folder). Such patching can be done, for example, with <a href="https://ntcore.com/?page_id=388">CFF Explorer</a>.)</p>
<h2>Static override</h2>
<p>On Unix systems, you can also statically link with <em>mimalloc</em> to override the standard malloc interface. The recommended way is to link the final program with the <em>mimalloc</em> single object file (<code>mimalloc-override.o</code>). We use an object file instead of a library file because linkers give preference to object files over archives when resolving symbols. To ensure that the standard malloc interface resolves to the <em>mimalloc</em> library, link it as the first object file. For example:</p>
<div class="fragment"><div class="line">gcc -o myprogram mimalloc-<span class="keyword">override</span>.o myfile1.c ...</div></div><!-- fragment --><h2>List of Overrides:</h2>
<p>The specific functions that get redirected to the <em>mimalloc</em> library are:</p>
<div class="fragment"><div class="line"><span class="comment">// C</span></div><div class="line"><span class="keywordtype">void</span>* malloc(<span class="keywordtype">size_t</span> size);</div><div class="line"><span class="keywordtype">void</span>* calloc(<span class="keywordtype">size_t</span> size, <span class="keywordtype">size_t</span> n);</div><div class="line"><span class="keywordtype">void</span>* realloc(<span class="keywordtype">void</span>* p, <span class="keywordtype">size_t</span> newsize);</div><div class="line"><span class="keywordtype">void</span> free(<span class="keywordtype">void</span>* p);</div><div class="line"></div><div class="line"><span class="comment">// C++</span></div><div class="line"><span class="keywordtype">void</span> <span class="keyword">operator</span> <span class="keyword">delete</span>(<span class="keywordtype">void</span>* p);</div><div class="line"><span class="keywordtype">void</span> <span class="keyword">operator</span> <span class="keyword">delete</span>[](<span class="keywordtype">void</span>* p);</div><div class="line"></div><div class="line"><span class="keywordtype">void</span>* <span class="keyword">operator</span> <span class="keyword">new</span>(std::size_t n) noexcept(<span class="keyword">false</span>);</div><div class="line"><span class="keywordtype">void</span>* <span class="keyword">operator</span> <span class="keyword">new</span>[](std::size_t n) noexcept(<span class="keyword">false</span>);</div><div class="line"><span class="keywordtype">void</span>* <span class="keyword">operator</span> <span class="keyword">new</span>( std::size_t n, std::align_val_t align) noexcept(<span class="keyword">false</span>);</div><div class="line"><span class="keywordtype">void</span>* <span class="keyword">operator</span> <span class="keyword">new</span>[]( std::size_t n, std::align_val_t align) noexcept(<span class="keyword">false</span>);</div><div class="line"></div><div class="line"><span 
class="keywordtype">void</span>* <span class="keyword">operator</span> <span class="keyword">new</span> ( std::size_t count, <span class="keyword">const</span> std::nothrow_t&amp; tag);</div><div class="line"><span class="keywordtype">void</span>* <span class="keyword">operator</span> <span class="keyword">new</span>[]( std::size_t count, <span class="keyword">const</span> std::nothrow_t&amp; tag);</div><div class="line"><span class="keywordtype">void</span>* <span class="keyword">operator</span> <span class="keyword">new</span> ( std::size_t count, std::align_val_t al, <span class="keyword">const</span> std::nothrow_t&amp;);</div><div class="line"><span class="keywordtype">void</span>* <span class="keyword">operator</span> <span class="keyword">new</span>[]( std::size_t count, std::align_val_t al, <span class="keyword">const</span> std::nothrow_t&amp;);</div><div class="line"></div><div class="line"><span class="comment">// Posix</span></div><div class="line"><span class="keywordtype">int</span> posix_memalign(<span class="keywordtype">void</span>** p, <span class="keywordtype">size_t</span> alignment, <span class="keywordtype">size_t</span> size);</div><div class="line"></div><div class="line"><span class="comment">// Linux</span></div><div class="line"><span class="keywordtype">void</span>* memalign(<span class="keywordtype">size_t</span> alignment, <span class="keywordtype">size_t</span> size);</div><div class="line"><span class="keywordtype">void</span>* aligned_alloc(<span class="keywordtype">size_t</span> alignment, <span class="keywordtype">size_t</span> size);</div><div class="line"><span class="keywordtype">void</span>* valloc(<span class="keywordtype">size_t</span> size);</div><div class="line"><span class="keywordtype">void</span>* pvalloc(<span class="keywordtype">size_t</span> size);</div><div class="line"><span class="keywordtype">size_t</span> malloc_usable_size(<span class="keywordtype">void</span> *p);</div><div class="line"></div><div 
class="line"><span class="comment">// BSD</span></div><div class="line"><span class="keywordtype">void</span>* reallocarray( <span class="keywordtype">void</span>* p, <span class="keywordtype">size_t</span> count, <span class="keywordtype">size_t</span> size );</div><div class="line"><span class="keywordtype">void</span>* reallocf(<span class="keywordtype">void</span>* p, <span class="keywordtype">size_t</span> newsize);</div><div class="line"><span class="keywordtype">void</span> cfree(<span class="keywordtype">void</span>* p);</div><div class="line"></div><div class="line"><span class="comment">// Windows</span></div><div class="line"><span class="keywordtype">void</span>* _expand(<span class="keywordtype">void</span>* p, <span class="keywordtype">size_t</span> newsize);</div><div class="line"><span class="keywordtype">size_t</span> _msize(<span class="keywordtype">void</span>* p);</div><div class="line"></div><div class="line"><span class="keywordtype">void</span>* _malloc_dbg(<span class="keywordtype">size_t</span> size, <span class="keywordtype">int</span> block_type, <span class="keyword">const</span> <span class="keywordtype">char</span>* fname, <span class="keywordtype">int</span> line);</div><div class="line"><span class="keywordtype">void</span>* _realloc_dbg(<span class="keywordtype">void</span>* p, <span class="keywordtype">size_t</span> newsize, <span class="keywordtype">int</span> block_type, <span class="keyword">const</span> <span class="keywordtype">char</span>* fname, <span class="keywordtype">int</span> line);</div><div class="line"><span class="keywordtype">void</span>* _calloc_dbg(<span class="keywordtype">size_t</span> count, <span class="keywordtype">size_t</span> size, <span class="keywordtype">int</span> block_type, <span class="keyword">const</span> <span class="keywordtype">char</span>* fname, <span class="keywordtype">int</span> line);</div><div class="line"><span class="keywordtype">void</span>* _expand_dbg(<span 
class="keywordtype">void</span>* p, <span class="keywordtype">size_t</span> size, <span class="keywordtype">int</span> block_type, <span class="keyword">const</span> <span class="keywordtype">char</span>* fname, <span class="keywordtype">int</span> line);</div><div class="line"><span class="keywordtype">size_t</span> _msize_dbg(<span class="keywordtype">void</span>* p, <span class="keywordtype">int</span> block_type);</div><div class="line"><span class="keywordtype">void</span> _free_dbg(<span class="keywordtype">void</span>* p, <span class="keywordtype">int</span> block_type);</div></div><!-- fragment --> </div></div><!-- PageDoc -->
</div><!-- contents -->
</div><!-- doc-content -->
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
<ul>
<li class="footer">Generated by
<a href="http://www.doxygen.org/index.html">
<img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.15 </li>
</ul>
</div>
</body>
</html>