Commit Graph

58 Commits

Author SHA1 Message Date
Chandler Carruth aa58b624e7 Tweak the core loop in StringRef::find to avoid calling memcmp on every
iteration.

Instead, load the byte at the needle length, compare it directly, and
save it to use in the lookup table of lengths we can skip forward.

I also added an annotation to expect that the comparison fails so that
the loop gets laid out contiguously without the call to memcpy (and the
substantial register shuffling that the ABI requires of that call).

Finally, because this behaves especially badly with a needle length of
one (by calling memcmp with a zero length) special case that to directly
call memchr, which is what we should have been doing anyways.

This was motivated by the fact that there are a large number of test
cases in 'check-llvm' where FileCheck's performance is dominated by
calls to StringRef::find (in a release, no-asserts build). I'm working
on patches to generally improve matters there, but this alone was worth
a 12.5% improvement in one test case where FileCheck spent 92% of its
time in this routine.

I experimented a bunch with different minor variations on this theme,
for example setting the pointer *at* the last byte and indexing
backwards for the call to memcmp. That didn't improve anything on this
version and seemed more complex. I also tried other things to make the
loop flow more nicely and none worked. =/ It is a bit unfortunate, the
generated code here remains pretty gross, but I don't see any obvious
ways to improve it. At this point, most of my ideas would be really
elaborate:

1) While the remainder of the string is long enough, we could load
   a 16-byte or 32-byte vector at the address of the last byte and use
   palignr to rotate that and check the first 15- or 31-bytes at the
   front of the next segment, essentially pre-loading the first several
   bytes of the next iteration so we could quickly detect a mismatch in
   those bytes without an additional memory access. Down side would be
   the code complexity, having a fallback loop, and likely misaligned
   vector load. Plus it would make the common case of the last byte not
   matching somewhat slower (need some extraction from a vector).
2) While we have space, we could do an aligned load of a 16- or 32-byte
   vector that *contains* the end byte, and use any peceding bytes to
   have a more precise "no" test, and any subsequent bytes could be
   saved for the next iteration. This remove any unaligned load penalty,
   but still requires us to pay the overhead of vector extraction for
   the cases where we didn't need to do anything other than load and
   compare the last byte.
3) Try to walk from the last byte in a way that is more friendly to
   cache and/or memory pre-fetcher considering we have to poke the last
   byte anyways.

No idea if any of these are really worth pursuing though. They all seem
somewhat unlikely to yield big wins in practice and to be a lot of work
and complexity. So I settled here, which at least seems like a strict
improvement over the previous version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289373 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-11 07:46:21 +00:00
Zachary Turner fd3428261b [Support] Add StringRef::find_lower and contains_lower.
Differential Revision: https://reviews.llvm.org/D25299

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286724 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-12 17:17:12 +00:00
Zachary Turner 20bb0322fd Speculative fix for build failures due to consumeInteger.
A recent patch added support for consumeInteger() and made
getAsInteger delegate to this function.  A few buildbots are
failing as a result with an assertion failure.  On a hunch,
I tested what happens if I call getAsInteger() on an empty
string, and sure enough it crashes the same way that the
buildbots are crashing.

I confirmed that getAsInteger() on an empty string did not
crash before my patch, so I suspect this to be the cause.

I also added a unit test for the empty string.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282170 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-22 15:55:05 +00:00
Zachary Turner fb27f55135 [Support] Add StringRef::consumeInteger.
StringRef::getInteger() exists and treats the entire string as
an integer of the specified radix, failing if any invalid characters
are encountered or the number overflows.

Sometimes you might have something like "123456foo" and you want
to get the number 123456 and leave the string "foo" remaining.
This is similar to what would be possible by using the standard
runtime library functions strtoul et al and specifying an end
pointer.

This patch adds consumeInteger(), which does exactly that.  It
consumes as much as possible until an invalid character is found,
and modifies the StringRef in place so that upon return only
the portion of the StringRef after the number remains.

Differential Revision: https://reviews.llvm.org/D24778

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282164 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-22 15:05:19 +00:00
Colin LeMahieu 8b96439e72 [MCParser] Accept uppercase radix variants 0X and 0B
Differential Revision: http://reviews.llvm.org/D14781

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263802 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-18 18:22:07 +00:00
Chandler Carruth 32e95cb90f [ADT] Rewrite the StringRef::find implementation to be simpler, clearer,
and tremendously less reliant on the optimizer to fix things.

The code is always necessarily looking for the entire length of the
string when doing the equality tests in this find implementation, but it
previously was needlessly re-checking the size each time among other
annoyances.

By writing this so simply an ddirectly in terms of memcmp, it also is
about 8x faster in a debug build, which in turn makes FileCheck about 2x
faster in 'ninja check-llvm'. This saves about 8% of the time for
FileCheck-heavy parts of the test suite like the x86 backend tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247269 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 11:17:49 +00:00
Chandler Carruth f41971f6e7 [ADT] Fix a confusing interface spec and some annoying peculiarities
with the StringRef::split method when used with a MaxSplit argument
other than '-1' (which nobody really does today, but which should
actually work).

The spec claimed both to split up to MaxSplit times, but also to append
<= MaxSplit strings to the vector. One of these doesn't make sense.
Given the name "MaxSplit", let's go with it being a max over how many
*splits* occur, which means the max on how many strings get appended is
MaxSplit+1. I'm not actually sure the implementation correctly provided
this logic either, as it used a really opaque loop structure.

The implementation was also playing weird games with nullptr in the data
field to try to rely on a totally opaque hidden property of the split
method that returns a pair. Nasty IMO.

Replace all of this with what is (IMO) simpler code that doesn't use the
pair returning split method, and instead just finds each separator and
appends directly. I think this is a lot easier to read, and it most
definitely matches the spec. Added some tests that exercise the corner
cases around StringRef() and StringRef("") that all now pass.

I'll start using this in code in the next commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 07:51:37 +00:00
Chandler Carruth 82c835626f [ADT] Add a single-character version of the small vector split routine
on StringRef. Finding and splitting on a single character is
substantially faster than doing it on even a single character StringRef
-- we immediately get to a *very* tuned memchr call this way.

Even nicer, we get to this even in a debug build, shaving 18% off the
runtime of TripleTest.Normalization, helping PR23676 some more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247244 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 06:07:03 +00:00
Craig Topper 3512034554 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216525 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 05:25:25 +00:00
Craig Topper e88b796285 Remove custom implementations of max/min in StringRef that was originally added to work an old gcc bug. I believe its been fixed by now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216156 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 04:31:10 +00:00
Craig Topper 34bc6b6e78 [C++11] Make use of 'nullptr' in the Support library.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205697 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-07 04:17:22 +00:00
Ahmed Charles f4ccd11075 Replace OwningPtr<T> with std::unique_ptr<T>.
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203083 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-06 05:51:42 +00:00
Rui Ueyama f34c3ca304 Add {start,end}with_lower methods to StringRef.
startswith_lower is ocassionally useful and I think worth adding.
endwith_lower is added for completeness.

Differential Revision: http://llvm-reviews.chandlerc.com/D2041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193706 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-30 18:32:26 +00:00
Dmitri Gribenko 4a48389b27 Added const qualifier to StringRef::edit_distance member function
Patch by Ismail Pazarbasi.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189162 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-24 01:50:41 +00:00
Manman Ren 6afede522e Revert r185852.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185861 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-08 20:27:34 +00:00
Manman Ren f856249d49 StringRef: add DenseMapInfo for StringRef.
Remove the implementation in include/llvm/Support/YAMLTraits.h.
Added a DenseMap type DITypeHashMap in DebugInfo.h:
  DenseMap<std::pair<StringRef, unsigned>, MDNode*>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185852 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-08 19:17:48 +00:00
Chandler Carruth d04a8d4b33 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169131 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-03 16:50:05 +00:00
Nick Kledzik 6e033d6dcd Improve overflow detection in StringRef::getAsUnsignedInteger().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165038 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-02 20:01:48 +00:00
Michael J. Spencer b0940b46ed [Support/StringRef] Add find_last_not_of and {r,l,}trim.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156652 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 22:08:50 +00:00
Chris Lattner a9963c648e Don't die with an assertion if the Result bitwidth is already correct. This
fixes an assert reading "1239123123123123" when the result is already 64-bit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155329 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-23 00:27:54 +00:00
Chris Lattner 2dbd7844e8 No need for "else if" after a return. Autosense "0o123" as octal in
StringRef::getAsInteger


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155298 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-21 22:03:05 +00:00
Michael J. Spencer 9130b42a85 Make StringRef::getAsInteger work with all integer types. Before this change
it would fail with {,u}int64_t on x86-64 Linux.

This also removes code duplication.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152517 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-10 23:02:54 +00:00
Chandler Carruth 528f0bbe19 Add generic support for hashing StringRef objects using the new hashing library.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152003 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-04 10:55:27 +00:00
Duncan Sands 37b6e5ae7d Workaround a miscompilation by gcc-4.3 that showed up as a failure
of the StringRef.Split2 unittest on 32 bit machines.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151358 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-24 09:01:34 +00:00
Duncan Sands bf8653ff3b Move the implementation of StringRef::split out of StringExtras.cpp
and into StringRef.cpp, which is where the other StringRef stuff is.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151054 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-21 12:00:25 +00:00