From f4c3fd0272f8da8eb9290fe4d59fb239f5954533 Mon Sep 17 00:00:00 2001 From: Tyler Downer Date: Fri, 6 May 2011 12:43:35 -0700 Subject: [PATCH] Bug 471588 - Remove String docs. r=dbaron DONTBUILD --- xpcom/string/doc/README.html | 44 - xpcom/string/doc/string-guide.html | 2508 ---------------------------- 2 files changed, 2552 deletions(-) delete mode 100644 xpcom/string/doc/README.html delete mode 100644 xpcom/string/doc/string-guide.html diff --git a/xpcom/string/doc/README.html b/xpcom/string/doc/README.html deleted file mode 100644 index 154b7969096..00000000000 --- a/xpcom/string/doc/README.html +++ /dev/null @@ -1,44 +0,0 @@ - - - -

documentation aimed at programmers who are clients of the string library

-

- -

- - diff --git a/xpcom/string/doc/string-guide.html b/xpcom/string/doc/string-guide.html deleted file mode 100644 index d954fe67a07..00000000000 --- a/xpcom/string/doc/string-guide.html +++ /dev/null @@ -1,2508 +0,0 @@ - - - - an incomplete guide to mozilla/string - - - - - - - -

an incomplete guide to mozilla/string

-

This document is now deprecated in favor of The new string guide.

-
-

by Scott Collins -

last modified 8 April 2001 -

- -
-

-

Abstract

- This document provides - an introduction to the design and use of the string classes in mozilla, - detailed information on their implementation and how one may extend them, - and answers to frequently asked questions about strings. -

-
- - - -

contents

- -
- -
- -

- Please direct all comments, requests, and contributions to, - in order of preference, - the tracking bug #70076 for this document, - the author scc@mozilla.org, and/or - the newsgroup news:netscape.public.mozilla.xpcom - (should there be a strings newsgroup?) -

- -
-

- A note to potential editors: - don't even consider modifying this document with an HTML editor. - That would destroy the internal formatting, - and make patches unmanageable. -

-
- - - - - -
-

user's guide

- -
-

- Strings in mozilla are a world apart from char*s. - If you don't know why they are different, - this section is the place for you to start. - If you're already familiar with the hierarchy of string classes in mozilla, - then you might want to skip ahead to the implementor's guide - or the FAQ. -

-
- -
- -
- -

introduction

-

what is and what isn't a string?

-

- A string is an opaque container holding a, possibly zero length, linear sequence of characters. - Understanding the implications of this statement is the foundation for understanding all mozilla's string classes. -

- -

readable and writable

-

dependent strings

-

flat strings

-

encoding

-

sharing

- -

using the string classes correctly; using the correct string class

-

basic string operations

-

comparison

-

concatenation

-

substrings

-

find and replace

-

conversions

-

calling a function that expects a different kind of string

-

converting between string classes

-

converting between encodings

-

selecting the right string class

-

user string classes

-

selecting the right string class for a parameter

-

selecting the right string class for a local variable

-

selecting the right string class for a member variable

-

selecting the right string class for a return value

-

selecting the right string class in IDL

-

don'ts

- -

using string iterators

-

what is an iterator?

-

reading iterators and writing iterators

-

`chunky' iterating for efficiency

-

copy_string, character sources and sinks

-

encoding conversion iterators

- -

summary

- - - -
-

implementor's guide

- -
-

- -

-
- -
- -
- - - - -
-

frequently asked questions

- -
-
- -
-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
you have some chars
you want'x'char c"foo"char* cpnsACString& cs
char. [] [] extract a character
PRUnichar PRUnichar('x') PRUnichar(c)convert encoding, extract a character
char* & & & . get a pointer
PRUnichar*convert encoding, get a pointer
nsACString NS_LITERAL_CSTRING("x") make a string NS_LITERAL_CSTRING("foo") make a string .
nsAString NS_LITERAL_STRING("x") convert encoding NS_LITERAL_STRING("foo")convert encoding
to call printf. call printf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
you have some PRUnichars
you wantPRUnichar wPRUnichar* wpnsAString& s
char
PRUnichar [] extract a character
char*
PRUnichar* & get a pointer
nsACString
nsAString
to call printf call printf
- -
-
-
- is there any string doc? -
-
- Yes, you're soaking in it! -
- - - - -
- I have a string, how do I get a pointer to the characters? -
-
- You want to avoid this situation. - In your own interfaces, prefer string types over raw pointers. - Any interface that wants to process a string using a single pointer is making two expensive assumptions. - First, that the string is stored in one contiguous hunk; and - second, that the string is zero-terminated. - If this isn't the case, - then to get a pointer, storage must be allocated and the entire string must be copied to it and zero-terminated. - You may not be able to avoid needing a pointer when interacting with system calls. -
-
- Some string classes guarantee that they are `flat'. - That is, that their data is stored in one contiguous zero-terminated hunk. - This does not imply that there are no embedded nulls. Caveat emptor. - All strings that explicitly promise flatness - inherit from the class nsAFlatString - or nsAFlatCString - and can produce a constant pointer to their data with the get() member function. - Even strings that don't explicitly promise to be flat - may happen to be flat. - The helper function PromiseFlatString will produce - a const dependent string that is guaranteed to be flat. - If you use this on a string that already happens to be flat, - the result is simply a reference through to that string. - Otherwise, - PromiseFlatString does the work to allocate, copy, terminate, and manage - a temporary flat string. - Since the result of PromiseFlatString is a temporary, - you must be careful not to get and hold a pointer to its data for longer than the temporary itself lives. -
-
-
-
-  /* I have a string, how do I get a pointer to the characters? */
-
-extern void EvilNarrowOSFunction( const char* );    // evil OS routines that want a pointer
-extern void EvilWideOSFunction( const PRUnichar* );
-
-void func( const nsAString& aString, const nsACString& aCString )
-  {
-    EvilWideOSFunction( NS_LITERAL_STRING("Hello, World!").get() );
-      // literal strings are flat already (as are |nsString|s, et al), just use |.get()|
-
-    EvilWideOSFunction( PromiseFlatString(aString).get() );
-      // for strings that don't explicitly guarantee flatness, use |PromiseFlatString|
-
-
-      // beware holding the pointer for longer than the life of the promise
-    const PRUnichar* wp = PromiseFlatString(aString).get(); // BAD! |wp| dangles
-    EvilWideOSFunction(wp);
-
-      // if you really need to use the pointer from |PromiseFlatString| in more than one expression...
-    const nsAFlatString& flat = PromiseFlatString(aString);
-    EvilWideOSFunction(flat.get());
-    SomeOtherFunction(flat.get());
-
-      // similarly for |char| strings
-    EvilNarrowOSFunction( PromiseFlatCString(aCString).get() );
-  }
-
-
-
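The lifetime rule above is ordinary C++ temporary lifetime, so it can be sketched without the Mozilla classes. In the sketch below, `std::string` stands in for a flat string, and `MakeFlat` and `NarrowOSFunction` are hypothetical stand-ins for `PromiseFlatCString` and an OS routine; none of these names come from the string library itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for |PromiseFlatCString|: returns a temporary,
// contiguous, zero-terminated copy of the given characters.
std::string MakeFlat(const char* data, std::size_t length) {
  return std::string(data, length);
}

// Hypothetical stand-in for an OS routine that wants a zero-terminated pointer.
std::size_t NarrowOSFunction(const char* p) { return std::strlen(p); }

std::size_t UseFlatPointerSafely(const char* data, std::size_t length) {
  // OK: the temporary lives until the end of this full expression.
  std::size_t once = NarrowOSFunction(MakeFlat(data, length).c_str());

  // OK: binding the temporary to a const reference extends its lifetime
  // to the end of the enclosing scope, so the pointer can be reused.
  const std::string& flat = MakeFlat(data, length);
  std::size_t twice = NarrowOSFunction(flat.c_str());
  assert(once == twice);

  // NOT OK (left commented out): the temporary dies at the semicolon,
  // so |p| would dangle on the next line.
  // const char* p = MakeFlat(data, length).c_str();
  return twice;
}
```

The const-reference form mirrors the `const nsAFlatString& flat = PromiseFlatString(aString);` pattern shown earlier.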
- - - - -
- How do I get a particular character out of a string? -
-
- Flat strings provide operator[] and CharAt(). - All strings provide First(), Last(), and access with iterators. - Don't promise a string flat just to do character indexing. - Prefer, instead, to get an iterator and advance it to the position you care about. -
-
-
-
-  /* How do I get a particular character out of a string? */
-
-PRUnichar Get5thCharacterOf( const nsAString& aString )
-  {
-    if ( aString.Length() >= 5 )
-      {
-        nsAString::const_iterator iter;
-        aString.BeginReading(iter); // make |iter| point to the beginning of |aString|
-        iter.advance(4); // the 5th character is at index 4
-        return *iter;
-      }
-
-    return PRUnichar(0);
-  }
-
-
-
-
- Using iterators isn't as bad as the example above makes it feel. - The typical use is for advancing through a string, examining many characters. -
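As a rough illustration of that typical use, here is a sketch that walks a string once and examines every character. It assumes `std::u16string` as a stand-in for a wide string; `CountChar` and `FifthCharacterOf` are hypothetical names, not part of the string library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts how many times |aChar| appears, advancing one iterator
// through the whole string.
std::size_t CountChar(const std::u16string& aString, char16_t aChar) {
  std::size_t count = 0;
  for (std::u16string::const_iterator iter = aString.begin();
       iter != aString.end(); ++iter) {
    if (*iter == aChar) ++count;
  }
  return count;
}

// Returns the 5th character (index 4), or 0 if the string is too short.
char16_t FifthCharacterOf(const std::u16string& aString) {
  if (aString.size() < 5) return char16_t(0);
  std::u16string::const_iterator iter = aString.begin();
  iter += 4;  // four steps from the first character reach the fifth
  assert(iter != aString.end());
  return *iter;
}
```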
- - - - -
- How do I convert from one encoding to another? -
-
-
- - - - -
- How do I create a string? -
-
-
- - - -
- What is the best way to return a string? -
-
-

- There are several reasonable ways to produce a string result from a function. - If you are already holding the answer as a sharable string, - you can simply return that string (pass-by-value). - Otherwise, - the most efficient and flexible way to return a string is - to assign your result into a non-const reference parameter. - Don't bother to create a sharable string from scratch with your generated result. -

-

- Why? - The two things you want to minimize in string manipulation are, - in order of importance, - heap allocation, and - moving characters around. -

-
-
-
-
-  /* What is the best way to return a string? */
-
-class foo
-  {
-    public:
-      // ...
-      void GetShortName( nsAString& aResult ) const;
-      nsCommonString GetFullName() const;
-      
-    private:
-      nsCommonString    mFullName;
-
-      const PRUnichar*  mShortName;
-      PRUint32          mShortNameLength;
-      
-  };
-
-nsCommonString
-foo::GetFullName() const
-  {
-    return mFullName;
-  }
-
-void
-foo::GetShortName( nsAString& aResult ) const
-  {
-    aResult = DependentString(mShortName, mShortNameLength);
-  }
-
-
-
- - -
- How do I printf a string, e.g., for debugging? -
-
- If your string is already narrow, you just have to worry about making it flat, and then getting a pointer. -
-
- If your string happens to be wide, - you'll need to convert it before you can printf something reasonable. - If it's just for debugging, - you probably wouldn't care if something odd was printed in the case of a Unicode character that didn't have - an ASCII equivalent. (If you have a UTF-8 terminal, the result is - perfectly legible and nothing odd is printed.) - The simplest thing in this case is to make a temporary conversion using NS_ConvertUTF16toUTF8. - The result is conveniently flat already, so getting the pointer is simple. - Remember not to hold onto the pointer you get out of this beyond the lifetime of the temporary. -
-
-
-
-  /* How do I |printf| a string? */
-
-
-void PrintSomeStrings( const nsAString& aString, const PRUnichar* aKey, const nsACString& aCString )
-  {
-      // |printf|ing a narrow string is easy
-    printf("%s\n", PromiseFlatCString(aCString).get());     // GOOD
-
-      // the simplest way to get a |printf|-able |const char*| out of a string
-    printf("%s\n", NS_ConvertUTF16toUTF8(aKey).get());       // GOOD
-
-      // works just as well with a formal wide string type...
-    printf("%s\n", NS_ConvertUTF16toUTF8(aString).get());
-
-
-      // But don't hold onto the pointer longer than the lifetime of the temporary!
-    const char* cstring = NS_ConvertUTF16toUTF8(aKey).get(); // BAD! |cstring| is dangling
-    printf("%s\n", cstring);
-  }
-
-
-
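To make the conversion step concrete, here is a minimal sketch of what a UTF-16 to UTF-8 conversion has to do before `printf` can be used. `Utf16ToUtf8` and `PrintWide` are hypothetical stand-ins written against `std::u16string`; this is not the implementation of `NS_ConvertUTF16toUTF8`, and replacing unpaired surrogates with U+FFFD is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a UTF-16 to UTF-8 conversion.
std::string Utf16ToUtf8(const std::u16string& in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    char32_t cp = in[i];
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
        in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
      // A valid surrogate pair encodes one non-BMP code point.
      cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
      ++i;
    } else if (cp >= 0xD800 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // unpaired surrogate: substitute U+FFFD (assumption)
    }
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}

// The temporary result lives until the end of the |printf| expression.
void PrintWide(const std::u16string& s) {
  std::printf("%s\n", Utf16ToUtf8(s).c_str());
}
```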
- -
- -

- Here are the email answers I have yet to format into the FAQ. - Some of the URLs may be out-dated or moved. - The messages are in order from oldest to newest. -

-

[Note : In June, 2003, these emails were modified -to better reflect what is stored in 'wide' string -classes (UTF-16 string instead of UCS-2) and what -related methods do as a part of the patch for bug 183156. -Therefore, they're a little different from the original emails -written by Scott Collins] -

-
-
-Date: Thu, 13 Apr 2000 19:41:47 -0400
-
- -

Encoding Wars - -

This message is all about strings and the various encodings that might -be used to interpret their contents, the ramifications of that, and -where we're heading. The point of this message is to say what we're -currently thinking, and get feedback. I apologize in advance for the -rambling, and for the fact that this message may accidentally mix -discussion of how things are and how they will be. - -

There are many different possible encodings. Three in common use in -the Mozilla source base are: ASCII, UTF-16, and UTF-8. In ASCII, every -character fits in 7 bits and is typically stored in an 8-bit byte. We -usually represent ASCII strings with nsCStrings, nsXPIDLCStrings, -or char string literals. In UTF-16, characters occupy one 16-bit code unit -(BMP characters) or two 16-bit code units (non-BMP characters). -We usually represent UTF-16 strings as nsStrings, etc., i.e., two-byte -or `wide' strings. UTF-8 is a multi-byte encoding. A character might -occupy one, two, three, or four bytes. It is easiest to store and -manipulate such a string within a single-byte or `narrow' string -implementation. - -
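The size rules just described can be written down directly. The two helpers below are hypothetical names used only for illustration; they compute how many storage units a single Unicode code point needs in each encoding.

```cpp
#include <cassert>

// Number of 16-bit code units a code point occupies in UTF-16:
// one for BMP characters, two (a surrogate pair) for non-BMP characters.
constexpr int Utf16Units(char32_t cp) { return cp < 0x10000 ? 1 : 2; }

// Number of bytes a code point occupies in UTF-8: one to four.
constexpr int Utf8Bytes(char32_t cp) {
  return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// ASCII characters take one unit in either encoding.
static_assert(Utf16Units(U'A') == 1 && Utf8Bytes(U'A') == 1, "ASCII");
// U+1F600 lies outside the BMP.
static_assert(Utf16Units(0x1F600) == 2 && Utf8Bytes(0x1F600) == 4, "non-BMP");
```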

None of our current string implementations know the encoding of the -data they hold at any given moment. An nsCString might legitimately -hold data encoded in ASCII, UTF-8 or even EBCDIC for that matter. - -

Operations that convert from one encoding to another, or operations -that are encoding sensitive (e.g., to_upper), rightly belong in -i18n. The fact that our current string interfaces automatically and -implicitly convert between wide and narrow strings is actually the -source of many errors in two particular categories: (1) unintended -extra work, (2) mistaken re-encoding, e.g., accidentally `converting' -a UTF-8 string to UTF-16 by pretending the UTF-8 string is ASCII and then -padding with '\0's. - -
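The second category of error can be reproduced in a few lines. The sketch below assumes `std::string` and `std::u16string` as stand-ins for narrow and wide strings; `WidenByPadding` is a hypothetical name for what an implicit narrow-to-wide conversion effectively does when it treats every byte as ASCII.

```cpp
#include <cassert>
#include <string>

// Widens each byte to a 16-bit unit, as if the input were ASCII.
// Correct for ASCII data, wrong for multi-byte UTF-8 sequences.
std::u16string WidenByPadding(const std::string& bytes) {
  std::u16string out;
  for (unsigned char b : bytes) {
    out += static_cast<char16_t>(b);
  }
  return out;
}
```

For ASCII input the result is what the caller expects. For the UTF-8 bytes `C3 A9` (U+00E9), the result is the two characters U+00C3 U+00A9 rather than the single character U+00E9.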

We've known these were bad for a long time, and have been trying to -find the right way to fix them. The current thinking is to just byte -the bullet and eliminate implicit conversions. That has interesting -ramifications. - -

-
-void foo( const nsString&  aUTF16string );
-
-foo("hello"); // works!  constructs a temporary |nsString| by
-              // converting the ASCII literal with padding.
-              // Note: this requires an allocation
-
-
- -

Though we've always hated this form since it requires a heap -allocation. In current code, we recommend - -

-
-foo( nsAutoString("hello") );
-
-
- -

which still copy/converts, but at least it probably doesn't need to do -a heap allocation. In the best of all worlds, no conversion, copying, -or allocation would be necessary. To do that, you would need to be -able to directly specify a UTF-16 string, e.g., with the L"hello" -notation, and wrap that in an interface that just held a pointer. -E.g., something like - -

-
-void foo( const nsAReadableString&  aUTF16string );
-
-foo( nsLiteralString(L"hello") );
-
-
- -

There are problems with this example, however. The L notation -specifically makes objects that are arrays of wchar_t, which under -GCC is a 4-byte element. This leads to incompatibility with JS, and -the annoyance of possibly bloated storage (I'm sort of minimizing the -situation here. It's worse than I make it sound). More about tricks -to get around this in a bit, but first, let me talk about what to do -in the meantime while we're just getting rid of implicit constructors. - Initially to get around this problem (what problem? The problem that -foo("hello") stopped compiling on my machine when I threw the -switch) I made a routine called NS_ConvertToString which looked like -this - -

-
-inline
-nsAutoString
-NS_ConvertToString( const char* anASCIIstring )
-  {
-    nsAutoString aUCS2string;
-    aUCS2string.AssignWithConversion(anASCIIstring);
-    return aUCS2string;
-  }
-
-
- -

Which lets me write - -

-
-foo( NS_ConvertToString("hello") );
-
-
- -

This was OK, but in discussion there were concerns about performance -on machines that didn't inline well, and issues about naming. In -that meeting we came up with an alternate naming strategy that we -think has room for growth and an implementation more likely to be -efficient on every platform. The implementation is to define a new -class that derives from nsAutoString, but allows construction from a -char* - -

-
-class NS_ConvertASCIItoUTF16 : public nsAutoString
-  {
-    public:
-      NS_ConvertASCIItoUTF16( const char* );
-      // ...
-  };
-
-
- -

Which gives identical (though renamed) notation for calling foo: - -

-
-foo( NS_ConvertASCIItoUTF16("hello") );
-
-
- -

It looks like a function call to an explicit encoding conversion. It -acts like a function call to an explicit encoding conversion. It is -a function call to an explicit encoding conversion. We think that -this naming pattern has room for growth. In the meeting, we concluded -that the best representation for encoding conversions is a family of -functions, and NS_ConvertASCIItoUTF16 fits right in. We think that -XPCOM probably can't live without the ASCII to UTF-16 conversion (though -as explicit as possible) but that all others rightly belong in i18n -land. - -

You can probably deduce from the clues in NS_ConvertToString, above, -that constructors weren't the only thing that became explicit. -Assignment, appending, comparison, et al, got renamed so that when -assigning, appending, or comparing to a value in a different encoding -the `WithConversion' form must be used. E.g., - -

-
-nsString aUTF16string;
-nsCString anASCIIstring;
-// ...
-
-aUTF16string += anASCIIstring;  // Currently legal, but not for long
-aUTF16string.Append(anASCIIstring); // same
-
-aUTF16string.AppendWithConversion(anASCIIstring); // the new way
-
-if ( aUTF16string == anASCIIstring ) // Sorry, this is going away too
-  // ...
-
-if ( aUTF16string.EqualsWithConversion(anASCIIstring) )
-  // ...
-
-
- -

Yes, it's long and annoying. Just like the extra work you were -implicitly asking to have done, perhaps incorrectly. There are other -reasons to rename these functions. When nsString and nsCString -defined a ton of, e.g., Appends each there was no problem, because -nobody wanted to override Append. Now, with strings inheriting from -abstract base classes we immediately run into the problem that -overriding and overloading don't mix very well in C++. Because of a -feature of C++ called name hiding, it is problematic to override only -a single signature of a name overloaded in a base class. The base -nsAWritableString provides several Appends, all for objects of -(hopefully) the same encoding. nsString can't easily add a bunch of -new Appends (the converting ones) without running face first into -the name hiding problem. The discussion of the fix for this is mostly -unrelated to encoding issues, so I'll defer it to another post. - -
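The name hiding problem is easy to reproduce in isolation. The classes below are hypothetical and use `std::u16string` for storage; they are not the real `nsAWritableString` or `nsString`. The `using` declaration is one standard C++ remedy, shown only for comparison, and is not necessarily the fix the string classes adopted.

```cpp
#include <cassert>
#include <string>

// Hypothetical base with several same-encoding |Append| overloads.
struct WritableBase {
  std::u16string data;
  void Append(char16_t c) { data += c; }
  void Append(const std::u16string& s) { data += s; }
};

// Declaring any |Append| here hides both base overloads:
// |h.Append(u'x')| would not compile on a |HidingString h|.
struct HidingString : WritableBase {
  void Append(const char* ascii) {
    for (; *ascii; ++ascii)
      data += static_cast<char16_t>(static_cast<unsigned char>(*ascii));
  }
};

// Remedy 1: re-expose the base overloads with a using-declaration.
struct UsingString : WritableBase {
  using WritableBase::Append;
  void Append(const char* ascii) {
    for (; *ascii; ++ascii)
      data += static_cast<char16_t>(static_cast<unsigned char>(*ascii));
  }
};

// Remedy 2: give the converting form its own name, so nothing is hidden.
struct RenamedString : WritableBase {
  void AppendWithConversion(const char* ascii) {
    for (; *ascii; ++ascii)
      data += static_cast<char16_t>(static_cast<unsigned char>(*ascii));
  }
};
```

The second remedy has the side effect the message argues for: the conversion becomes visible at every call site.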

In hindsight, after the meeting, it seemed clear that all the -`WithConversion' forms would be better named - -

-
-xxxConvertingASCIItoUTF16
-xxxConvertingUTF16toASCII
-
-
- -

however, the real goal (probably) is to move most such conversions -into i18n. Just bringing attention to the previously implicit -conversions is a good first step. Renaming these conversions as just -suggested is probably the right thing to do, though it sort of -validates them, which I'm not sure we really want. This is a decision -we need to discuss further. - -

Now, back to the string literal problem above. One possible solution -is to use a macro. Imagine - -

-
-NS_LITERAL_STRING("Hello")
-
-
- -

which on a machine where the L trick works, turns into - -

-
-nsLiteralString(L"Hello")
-
-
- -

but on a machine where there is trouble, turns into something less -appealing, but more likely to work, like - -

-
-NS_ConvertASCIItoUTF16("Hello")
-
-
- -

Another solution is to add a compilation step that fixes L strings -on bad platforms to be non-L strings, but padded with \0s. E.g., -L"Hello" gets preprocessed into "\000H\000e\000l\000l\000o\000". -This solution is more annoying to the developer, where the prior -solution is more annoying during the runtime. - -

Before we go to too much trouble on this specific feature, we will -probably want to do more measurement to see just how much and how -often we are converting constant literal strings, and why. - - -

I'm currently ripping through the tree fixing things to use the -`WithConversion' forms where appropriate. I was also converting -things to use NS_ConvertToString where appropriate; unless I get -talked out of it, I want to switch midstream to -NS_ConvertASCIItoUTF16, then go back and fix up the -NS_ConvertToString instances later. I've set things up so I can -check in as I go. After all these conversions have been done, I'll be -able to throw the switch (what switch? NEW_STRING_APIS) which will -make nsString inherit from nsAWritableString, etc. and allow us to -start exploiting these other opportunities (e.g., for literal strings, -shared strings, etc. See -http://bugzilla.mozilla.org/show_bug.cgi?id=28221 for details and -reasoning.) - -

I guess I'm expecting comments on: - -

- -

So as not to jumble the discussion, I'll be separately posting other -requests for comments about specific features of the design of the new -string hierarchy. - -

I hope this helps keep everybody filled in on what we're thinking and -able to point out what we're forgetting or screwing up :-) - - - - - -


-
-Date: Wed, 19 Apr 2000 21:12:47 -0400
-Subject: more string info
-
- -

news://news.mozilla.org/scc-705460.16423913042000@news.mozilla.org - - - - - -


-
-Date: Fri, 26 May 2000 15:31:37 -0400
-Subject: Re: Question on ==
-
- -

I would prefer you compare with Equals (which should really be named -IsEqualTo) rather than operator==() because of this: - -

-
-char* a;
-char* b;
-
-// ...
-
-if ( a == b )
-  // ...
-
-
- -

Comparing two raw `string' pointers doesn't compare the characters -they point to, but instead compares the bits of the pointers. For -this reason, I may eventually make comparison of a string with a -pointer using operators just go away. - - - - - -
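The difference is easy to see with two separate buffers holding the same characters. `SamePointer` and `SameCharacters` are hypothetical helper names used only for this sketch.

```cpp
#include <cassert>
#include <cstring>

// Compares the pointer values only, which is what |a == b| does.
bool SamePointer(const char* a, const char* b) { return a == b; }

// Compares the characters the pointers refer to.
bool SameCharacters(const char* a, const char* b) {
  return std::strcmp(a, b) == 0;
}
```

Two distinct arrays that both contain `"abc"` satisfy `SameCharacters` but not `SamePointer`.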


-
-Date: Wed, 14 Jun 2000 14:38:55 -0400
-Subject: Re: Fix to XprtDefs.h
-
- -

Yes, we're aware that turning off wchar_t support makes wchar_t be -a synonym for unsigned short under Metrowerks. We know that the -current version of VC++ also makes these types equivalent. In theory, -though, the types are distinct even when they are the same size and -shape. By using real wchar_t support, we are forced to recognize -the distinction and navigate it appropriately with reinterpret_cast. -The win here is that we aren't caught by -compiler changes that suddenly make some set of compilers compliant -and therefore break our code. We will add an autoconf test that lets -UNIX compilers opt in to our string scheme when they have an -appropriately shaped wchar_t. If these happen to be compliant -compilers, all will be well. If they don't, the casts don't hurt, -because they are type correct. We are writing our code to meet the -standard as we move forward. - -

The win for us is realized by the following macros - -

-
-#ifdef HAVE_CPP_2BYTE_WCHAR_T
-  #define NS_LITERAL_STRING(s)  nsLiteralString(L##s, \
-                      (sizeof(L##s)/sizeof(wchar_t))-1)
-#else
-  #define NS_LITERAL_STRING(s)  NS_ConvertASCIItoUTF16(s, \
-                       sizeof(s)-1)
-#endif
-
-
- -

An nsLiteralString points directly to the literal characters. No -copying, no conversion, and the length calculation happens at compile -time. This has turned out to be as large a savings as 15% of code -space and 8% of data space, net, in our string test harness. It's -faster as well, again by eliminating the copying, conversion, and -length calculation. We don't know yet what those numbers translate -into in our real code base, but we have high hopes. - -

I don't want to be in the position to ask you to change your code. I -don't think it's appropriate for me to do so. The AIM application -that is your client is our client as well. They need to resolve this -difference between us in whatever way they think best. That may mean -asking you if changing your apis is the right thing to do. Or it may -mean applying the casts. Our code-base and yours, Justin, are more -like cousins. I don't think you should have to change just to conform -to us. You may think my arguments for using real wchar_t have -merit, and adopt similar usage just because you agree; but I think the -only obligation you have is to follow the technical solution you think -is right for your code. - -

If you decide to make this api change, it will mean shipping a new -binary (on Mac) for your library to clients who want to switch over to -the new api (since the name mangling will be different, and therefore, -the link requirements will change). - -

Hope this helps, - - - - - -


-
-Date: Thu, 15 Jun 2000 19:36:55 -0400
-Subject: Re: Checkin approval for bug 32336
-
- -
-
-S.Equals(NS_LITERAL_STRING("bar"), PR_TRUE, 3)
-
-
- -

doesn't compile because there is no three parameter form for Equals. - For all definitions of Equals on strings, see "nsAReadableString.h" - -

http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h - -

There is an EqualsWithConversion that takes three parameters. - -

http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsString2.h#731 - -

It is ``EqualsWithConversion'' because it admits the possibility of an -encoding specific transformation, in this case to provide -case-insensitive comparison. This also wouldn't compile, however, -since, at the moment, an nsLiteralString doesn't provide an operator -to produce a const PRUnichar* (though perhaps it should), and it -doesn't satisfy the other interfaces that match this call, e.g., a -const nsString&. - -

Perhaps I need to move case-insensitive comparison up out of -nsString into a global encoding specific transformations and -algorithms file (which was on its way anyway, as Waterson knows); this -use is one bit of evidence to support this. In the short term, this -can be fixed (if we think the current behavior is wrong) by providing -operator const CharT*() const on literal string. - -

If you can live without case-folding, the earlier form is preferred - -

-
-S == NS_LITERAL_STRING("bar")
-
-
- -

if you can't, then one of the fixes I mentioned is in order. - - - - - -


-
-Date: Thu, 15 Jun 2000 19:47:12 -0400
-Subject: Re: [Fwd: how to use nsString ?]
-
- - - -

Apologies. Documentation mentioning strings is getting out of date. -Here are some specific answers. - - -

- -

...is now perhaps best expressed as - - nsString URLString( NS_LITERAL_STRING("http://www.mozilla.org") ); - -

since an nsString is a sequence of 2-byte wide characters, and the -routines that implicitly convert 1-byte sequences (like the literal -sequence you specified, "http:...") are now gone. - -

Up until not too long ago, one would have had to say - -

-
-nsString URLString;
-URLString.AssignWithConversion("http://www.mozilla.org");
-
-
- -

The NS_LITERAL_STRING construction is new machinery that has the -potential to make many operations much more efficient. - -

- -

SetString was a synonym for Assign or assignment with -operator=(), it too went away. The equivalent is the second -example I gave above, that is, the one with AssignWithConversion. - -

Assign still exists. AssignWithConversion takes on that -functionality for assignments that require encoding transformations -(e.g., from ASCII to UTF16). SetString is gone, since it was always -a synonym for Assign. - -

Learn more about the general APIs for strings that we are trying to -move to by examining - -http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h -http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h - -

Hope this helps, - - - - - -


-
-Date: Thu, 15 Jun 2000 21:26:51 -0400
-Subject: Re: Checkin approval for bug 32336
-
- - - -

This is what substrings are for. In that case, you could use - -

-
-Substring(S, 0, 3) == NS_LITERAL_STRING("bar")
-
-
- -

As for case-folding, it's best if you can case-fold everything up -front, instead of doing it repeatedly. I'll have to get back to you -on a general solution to that problem, or what my schedule for getting -it checked in would be. I'm sorry, I know that's not what you needed -to hear. If the source string is an nsString, you can continue to -exploit its implementation of these routines, e.g., ToLower all -up-front. - -

Hope this helps, - - - - - -


-
-Date: Mon, 19 Jun 2000 14:23:47 -0400
-Subject: Re: string fu
-
- - - -

What would you prefer? That extracting a character not in the string -always return CharT(0)? Can't do it for two reasons: (1) 0 may be -a valid character in a particular encoding, so it can't be used in -general as a ``no character at that position'' marker; and (2) I can't -control what an individual string implementation does when asked to -get an out-of-bounds fragment, it's explicitly undefined. That means -the result of CharAt is explicitly undefined for indexes outside the -defined contents of the string. As a debugging convenience, I have -made this assert, but it has always been the case that retrieving such -a character had undefined results ... even in [the old] code. - -

OK, you might say, well at least let me ask for a character that is -only off the end by one. E.g., Last of an empty string. Reason (1) -from above still applies. How bad is it to say, for the case you gave - -

-
-PRBool needsDelim = PR_FALSE;
-if ( !path.IsEmpty() )
-  {
-    PRUnichar last = path.Last();
-    needsDelim = !(last == '/' || last == '\\');
-  }
-
-
- -

In general, you probably want to opt out of a whole lot of work when -the source string is empty. It is slightly less convenient, but it -doesn't tie us to a bunch of implementation specific mojo. - - -
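A short sketch of why a zero return value cannot serve as a "no character" marker, next to the explicit empty check. It assumes `std::u16string` as a stand-in for a wide string; `LastOrZero` is a hypothetical variant that exists here only to show the ambiguity.

```cpp
#include <cassert>
#include <string>

// Hypothetical "convenient" variant: returns 0 when there is no last
// character. A string that really ends in a 0 character returns the
// same value, so the caller cannot tell the two cases apart.
char16_t LastOrZero(const std::u16string& s) {
  return s.empty() ? char16_t(0) : s.back();
}

// The explicit form: handle the empty case first, then look at the
// last character, which is now known to exist.
bool NeedsDelimiter(const std::u16string& path) {
  if (path.empty()) return false;
  const char16_t last = path.back();
  return !(last == u'/' || last == u'\\');
}
```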

- -

This is an annoying property of auto strings, e.g., that they always -have an allocated buffer. I'm happy to fix this bug, however, be -aware that GetUnicode and GetBuffer are artifacts of [the old] -implementation that we don't want to support. They are not part of -the abstract interface. We will keep them no longer than we have to. -They don't support our multi-fragment paradigm. People who require a -contiguous hunk of characters in the future, and are unwilling to -switch over to chunky-iterators, may be forced to copy the string to -their own buffer. There will be an implementation of narrow character -string that guarantees contiguous allocation and a zero-terminator, -much as nsCString does now, for compatibility with platform uses, -but this won't be the default string class. - - - - - -


-
-Date: Mon, 19 Jun 2000 17:22:31 -0400
-
- -

Clarifying String Semantics - -

Recently, I added an assert to the string operations that extract -characters, namely First(), Last(), CharAt(), and -operator[](). This assert fires when any of these routines are used -to access a character outside the defined contents of the string. For -First() and Last() that means whenever they are applied to an -empty string. For CharAt() and operator[](), that means whenever -they are used to access an index outside the range of -0..Length()-1. There have been some complaints; however, the -result was always undefined. What follows is extracted from an email -exchange between me and warren on this topic. I hope it clarifies -string semantics. - -

Warren writes: -

- -

I replied: -

- -

Warren also asks: -

- -

And I reply: -

- -

In a later message, Chris Waterson asks a related question -

- -

And I reply: -

- -

Hope this makes sense, - - - - -


-
-Date: Tue, 20 Jun 2000 04:05:31 -0400
-Subject: Re: NS_LITERAL_STRING is broken
-
- -

The behavior you describe sounds exactly like when you say - -

-
-const char* foobar = "foobar";
-
-... NS_LITERAL_STRING(foobar).get() ...
-
-
- -

because in this case, the thing passed in is a const char*. -NS_LITERAL_STRING is not meant to be used in this way. It is only -meant to be used around a " delimited string. The type of such is -const char[N] where N is the number of characters in the string + 1 -for the zero terminator it helpfully adds. sizeof such a type is -N. - -
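The `sizeof` behavior described above can be checked at compile time. The two macros below are simplified, hypothetical versions of the length computation, not the real `NS_LITERAL_STRING`.

```cpp
#include <cassert>
#include <cstddef>

// Length of a narrow literal: the array size minus the zero terminator.
#define NARROW_LITERAL_LENGTH(s) (sizeof(s) - 1)

// Length of the corresponding wide literal, in wchar_t units.
#define WIDE_LITERAL_LENGTH(s) (sizeof(L##s) / sizeof(wchar_t) - 1)

// A quoted literal has type const char[7], so the length is 6.
static_assert(NARROW_LITERAL_LENGTH("foobar") == 6, "array type");
static_assert(WIDE_LITERAL_LENGTH("foobar") == 6, "wide array type");

// Applied to a pointer, |sizeof| measures the pointer, not the string,
// so the computed length is unrelated to the characters pointed at.
std::size_t LengthViaPointer() {
  const char* foobar = "foobar";
  return sizeof(foobar) - 1;
}
```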

Are you sure you had the actual string as an argument, as in your -example to me? Or could the actual code have been like my sample, -above? - - - - - -


-
-Date: Thu, 29 Jun 2000 13:35:10 -0400
-Subject: Re: a fix
-
- - - - -

Dave, - -

please read - - news://news.mozilla.org/scc-314ABF.14261619062000@news.mozilla.org - -

It's just plain wrong to let people try to index into a string outside -its defined contents. I can't just return '\0' or PRUnichar('\0') -there as that could be a legal value to have somewhere in your -string for some encodings ... and the encoding is not specified. So -your patch has the basic problem of defeating my plan to stop people -from doing this bad thing. - -
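To illustrate the point about '\0' being a legal character, here is a minimal sketch using the standard library in place of the Mozilla classes. CheckedString is a hypothetical name, and the asserts only approximate the behavior described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical string with asserting accessors.
class CheckedString
  {
    public:
      explicit CheckedString( std::u16string contents )
          : contents_(std::move(contents)) { }

      std::size_t Length() const { return contents_.size(); }

        // asserts rather than returning a sentinel, because every
        // character value, including 0, may be real string data
      char16_t CharAt( std::size_t i ) const
        {
          assert(i < Length() && "index outside the defined contents");
          return contents_[i];
        }

      char16_t First() const { return CharAt(0); }

      char16_t Last() const
        {
          assert(Length() > 0 && "Last() applied to an empty string");
          return contents_[Length() - 1];
        }

    private:
      std::u16string contents_;
  };
```

A string holding the three characters 'a', '\0', 'b' returns '\0' from CharAt(1) legitimately, so a caller could not tell that value apart from an out-of-range sentinel.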

The second problem with your patch is that you use the symbolic -constant nsnull, which is ostensibly a pointer value; Last returns -a character. nsnull is not appropriate for that purpose. In fact, -C++ gurus pretty much eschew the use of symbolic constants for 0. -NULL is to be avoided. nsnull is wrong-headed in that it presumes -we could have some other application specific value for NULL. We -can't, it would never work. It's just wasted brain-print. Always use -0 for these situations, and if you want to communicate the fact that -something is a pointer type, either use a comment or a -(construction-style) cast, like so (graded examples from worst to -best:) - -

- -

Don't let this discourage you; keep up the good work :-) - - - - - -


-
-Date: Tue, 8 Aug 2000 23:47:16 -0400
-Subject: Re: nsWritingIterator?
-
- - - - http://ScottCollins.net/Journal/discussion/string_iterators.html - -

does this help? - -

I can personally walk you through any specific scenario you need. - - - - - -


-
-Date: Wed, 9 Aug 2000 02:35:03 -0400
-Subject: Re: nsWritingIterator?
-
- -

You got it right... it's nsWritingIterator for whichever -character type you care about, either char or PRUnichar. You -_can_ use this iterator like a character pointer ... that is, you can -dereference it, assign into its dereference, etc. It is more -efficient, though, to directly address a particular range of -characters around where it points by asking it for its actual -character pointer with get, and knowing that there are -size_forward() characters available ahead of that pointer and -size_backward() characters available behind it. After examining -those characters by hand, you can advance the iterator beyond the -characters you have examined (and possibly into the next chunk, should -one exist) by adding into it (with +=) the count of the characters you -have processed. - -

Here are three examples of running through a string and modifying some -of the characters in it. All use nsWritingIterators. - - -

-
-  // inefficient, but works in a pinch:
-  //  iterators can hide all details of chunks by acting like
-  //  a raw character pointer
-
-nsWritingIterator<PRUnichar> s = S.BeginWriting();
-nsWritingIterator<PRUnichar> done_with_string = S.EndWriting();
-
-  // for each character in the string |S|
-while ( s != done_with_string )
-  {
-      // if the character is lower case, capitalize it
-    if ( 'a' <= *s && *s <= 'z' )
-      *s = *s - 'a' + 'A';
-
-      // advance to the next character; without this the loop never ends
-    ++s;
-  }
-
-
-
-
-  // efficient
-  //  iterators provide a mechanism by which you can process
-  //  a chunk-at-a-time
-
-nsWritingIterator<PRUnichar> iter = S.BeginWriting();
-nsWritingIterator<PRUnichar> done_with_string = S.EndWriting();
-
-  // for each chunk of the string
-while ( iter != done_with_string )
-  {
-    size_t N = iter.size_forward();  // # of chars in this chunk
-    PRUnichar* s = iter.get();
-    PRUnichar* done_with_chunk = s + N;
-
-      // for each character in this chunk
-    for ( ; s < done_with_chunk; ++s )
-      {
-         // if the character is lower case, capitalize it
-       if ( 'a' <= *s && *s <= 'z' )
-          *s = *s - 'a' + 'A';
-      } 
-
-      // advance the iterator past characters
-      //  we examined (and into the next chunk, if any)
-    iter += N;
-  }
-
-
-
-  // elegant
-  //  pull your transformation into a `sink', and |copy_string|
-  //  will efficiently pump any kind of string into it
-
-struct Capitalize
-  {
-      // inline
-    PRUint32
-    write( PRUnichar* s, PRUint32 N )
-        // processes one chunk, called repeatedly by |copy_string|
-      {
-        PRUnichar* done_with_chunk = s + N;
-
-         // for each character in this chunk
-        for ( ; s < done_with_chunk; ++s )
-          {
-              // if the character is lower case, capitalize it
-            if ( 'a' <= *s && *s <= 'z' )
-              *s = *s - 'a' + 'A';
-          }
-
-          // report how many characters this call processed
-        return N;
-      }
-  };
-
-copy_string(S.BeginWriting(), S.EndWriting(), Capitalize());
-
-
- - - -
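The chunk-at-a-time idea can also be tried outside the tree. The following is only a sketch under stated assumptions: FragmentedString, CapitalizeSink and pump_chunks are hypothetical names, a vector of std::u16string stands in for a multi-fragment string, and pump_chunks is a much simplified analogue of copy_string, not its real implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a multi-fragment string:
// each element is one contiguous chunk of characters.
typedef std::vector<std::u16string> FragmentedString;

// A sink in the style of |Capitalize|: handles one chunk per call and
// returns the number of characters it processed.
struct CapitalizeSink
  {
    std::size_t write( char16_t* s, std::size_t n )
      {
        for ( char16_t* done_with_chunk = s + n; s < done_with_chunk; ++s )
          if ( u'a' <= *s && *s <= u'z' )
            *s = static_cast<char16_t>(*s - u'a' + u'A');
        return n;
      }
  };

// Hands every chunk to the sink, advancing by whatever the sink reports.
// Assumes the sink always makes progress (returns a non-zero count).
template <class Sink>
void pump_chunks( FragmentedString& str, Sink& sink )
  {
    for ( std::size_t i = 0; i < str.size(); ++i )
      {
        std::u16string& chunk = str[i];
        std::size_t done = 0;
        while ( done < chunk.size() )
          done += sink.write(&chunk[done], chunk.size() - done);
      }
  }
```

The caller never sees fragment boundaries; only the pumping routine does, which is the property the three examples above are working toward.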

Does this show it better? - - - - - -


-
-Date: Thu, 17 Aug 2000 18:23:22 -0400
-
- - - -

I'll explain things in a little more detail than you need, so -that some of the stuff you see in these headers will make more sense. -I'll also answer your questions out of order. - -

First: the string hierarchy looks like this - -http://ScottCollins.net/Journal/discussion/string_hierarchy.gif - -

The two most important headers are: - -http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h -http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h - -

These abstract classes, nsAReadable[C]String, and -nsAWritable[C]String are typically what you will want to use in the -interfaces of new code. If you write a piece of code that takes a -string for input, consider, e.g., - -

-
-void consumes_a_string( const nsAReadableString&  aInput );
-
-
- -

If you write a piece of code that modifies a string, consider - -

-
-void modifies_a_string( nsAWritableString&  aResult );
-
-
- - -

When creating your own classes, member strings will typically be -nsStrings. When you can't avoid creating a short string that you -need only temporarily during a function, you will typically use -nsAutoString. When someone passes you a raw pointer, or a raw -pointer and a length, representing a buffer of characters that you may -examine, but won't own, you can treat it like a string by wrapping it -in an nsLiteralString, e.g., - -

-
-void
-reads_a_buffer( const PRUnichar* aInput, PRUint32 aInputLength )
-  {
-    nsLiteralString input(aInput, aInputLength);
-      // doesn't allocate or copy
-
-    // ...
-  }
-
-
- -
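For readers without the tree at hand, the "examine, but don't own" wrapper can be approximated in a few lines. This is a sketch only: BorrowedString and contains_colon are hypothetical names, and the class models just the non-owning aspect of nsLiteralString, not its real interface.

```cpp
#include <cstddef>

// Hypothetical non-owning wrapper: it records a pointer and a length,
// never allocates, never copies, and never frees the buffer.
class BorrowedString
  {
    public:
      BorrowedString( const char16_t* data, std::size_t length )
          : data_(data), length_(length) { }

      std::size_t Length() const { return length_; }
      const char16_t* get() const { return data_; }

        // returns the index of the first |c|, or -1 if there is none
      long FindChar( char16_t c ) const
        {
          for ( std::size_t i = 0; i < length_; ++i )
            if ( data_[i] == c )
              return static_cast<long>(i);
          return -1;
        }

    private:
      const char16_t* data_;
      std::size_t     length_;
  };

// Treats a caller-owned buffer as a string for the duration of the call.
inline bool contains_colon( const char16_t* aInput, std::size_t aInputLength )
  {
    BorrowedString input(aInput, aInputLength);
    return input.FindChar(u':') != -1;
  }
```

Because the wrapper owns nothing, it must not outlive the buffer it was built over.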

You will use nsLiteralString around quoted constant strings as well, -though typically through the NS_LITERAL_STRING macro, to avoid doing -a length calculation - -

-
-NS_LITERAL_STRING("x")
-
-
- -

expands to - -

-
-nsLiteralString(L"x", (sizeof(L"x")/sizeof(PRUnichar) - 1))
-
-
- -

if L notation works as needed on your platform. - -Those are the basics. Now onto your questions: - - -

- - -

L"abc " makes an object that is a const wchar_t[5], and none of -the string code knows about wchar_t. The main reason is that -wchar_t is not necessarily the right size (it can be 4 bytes under -gcc). If you wrap these constant expressions in NS_LITERAL_STRING, -as described above, you should get the right thing, e.g., - -

-
-str1 += NS_LITERAL_STRING("abc ") + str2 + NS_LITERAL_STRING("def");
-
-
- - - - -

This one, I have a quick and easy explanation for. If function was -declared like this - -

-
-function( const nsAReadableString&  )
-
-
- -

then, no problem, since a nsPromiseConcatenation (which was the -result of adding those two things together) is a readable string. -No other objects need to be created; no copying needs to be performed. - -

In all cases, we want the creation of nsStrings et al, to be -explicit, since creation is unbelievably expensive, requiring heap -allocation, locks, copying, etc. - -
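The "no copying until someone asks for it" idea behind a promise can be sketched with standard types. ConcatPromise and count_char are hypothetical names, and this models only the concept, not the real nsPromiseConcatenation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical promise: refers to both operands and answers read
// requests by forwarding to whichever operand holds the character.
class ConcatPromise
  {
    public:
      ConcatPromise( const std::u16string& left, const std::u16string& right )
          : left_(left), right_(right) { }

      std::size_t Length() const { return left_.size() + right_.size(); }

      char16_t CharAt( std::size_t i ) const
        {
          return i < left_.size() ? left_[i] : right_[i - left_.size()];
        }

        // the only place where allocation and copying happen,
        // and only because the caller explicitly asked
      std::u16string Flatten() const
        {
          std::u16string result;
          result.reserve(Length());
          result += left_;
          result += right_;
          return result;
        }

    private:
      const std::u16string& left_;
      const std::u16string& right_;
  };

// A consumer that only reads never forces a copy.
inline std::size_t count_char( const ConcatPromise& s, char16_t c )
  {
    std::size_t count = 0;
    for ( std::size_t i = 0; i < s.Length(); ++i )
      if ( s.CharAt(i) == c )
        ++count;
    return count;
  }
```

Note that the promise holds references, so both operands must stay alive for as long as it is in use.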

I hope this answers both your posts, - - - - - -


-
-Date: Thu, 17 Aug 2000 20:57:08 -0400
-Subject: re our conversation
-
- - return ToNewUnicode( nsLiteralCString(buffer) ); - - - - - - -
-
-Date: Fri, 18 Aug 2000 02:52:45 -0400
-Subject: Re: More questions and new string API
-
- - - -

Unfortunately, NS_LITERAL_STRING's definition is not particularly -amenable to this use. Instead, you would have to say something like -this: - -

-
-const nsAReadableString&
-foo()
-  {
-#ifdef HAVE_CPP_2BYTE_WCHAR_T
-    static nsLiteralString static_foo(L"x", 1);
-#else
-    static nsLiteralString static_foo;
-    static PRBool initialized = PR_FALSE;
-    if ( !initialized )
-      {
-        static_foo.AssignWithConversion("x", 1);
-        initialized = PR_TRUE;
-      }
-#endif
-    return static_foo;
-  }
-
-
- - - - -

I don't know what errors you are getting; but it probably doesn't work -because a reference isn't an assignable type. This is just a guess. -You may need to use - -

-
-map
-
-
- -

If you actually want the map to manage ownership of the keys, then -you'll want to use a concrete type, e.g., - -

-
-map
-
-
- -

or perhaps - -

-
-map
-
-
- -

Or maybe there's something else wrong. Send me the error messages. -If you end up using a pointer, then of course you'll have to supply a -comparison function to the map template. You won't be satisfied -with the default comparison of pointers :-) Sorry I couldn't answer -this one more completely. - - -

- -

The problem with this scenario is that an nsAReadableString doesn't -promise that all its data is contiguous, nor that it is -zero-terminated, which is what I suspect you want in this case. If -the function you want to call can take {pointer, length} tuples, and -can consume the string in hunks without zero termination ... then you -can use copy_string to pump the string into your function, see - - http://ScottCollins.net/Journal/discussion/string_iterators.html - -

If not, and you absolutely have to have a contiguous zero-terminated -buffer, then there is a new facility (part of the DOMAPI branch) that -does what you need. It's not checked in on the trunk; it should -be in early next week. It is nsPromiseFlatString. This class -promises a contiguous zero-terminated buffer; and has an operator -PRUnichar* to produce a pointer to that buffer automatically. If the -underlying class is one that happens to be a single fragment and -zero-terminated, then, like nsPromiseSubstring and -nsPromiseConcatenation, this class merely holds a reference into the -original data. If, however, the underlying string is multi-fragment -or not zero-terminated, then nsPromiseFlatString allocates a -contiguous buffer of appropriate size and copies the fragmented string -data to it. So given - -

-
-void ReadBuffer( PRUnichar* );
-
-
- -

You can call this as efficiently as possible with an arbitrary string -like so - -

-
-ReadBuffer( nsPromiseFlatString(aString) );
-
-
- - -
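The "share when already flat, copy only when necessary" behavior can be sketched as follows. FragmentedString and FlatPromise are hypothetical names, and a vector of std::u16string merely stands in for a possibly multi-fragment string; this is not how nsPromiseFlatString is actually implemented.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a string made of one or more fragments.
typedef std::vector<std::u16string> FragmentedString;

class FlatPromise
  {
    public:
      explicit FlatPromise( const FragmentedString& source )
          : copied_(false), data_(0)
        {
          if ( source.size() == 1 )
            {
                // already contiguous and zero-terminated: just point at it
              data_ = source[0].c_str();
            }
          else
            {
                // fragmented (or empty): build a contiguous copy
              for ( std::size_t i = 0; i < source.size(); ++i )
                owned_ += source[i];
              data_ = owned_.c_str();
              copied_ = true;
            }
        }

      const char16_t* get() const { return data_; }
      bool copied() const { return copied_; }

    private:
        // copying the promise would leave |data_| pointing at the wrong buffer
      FlatPromise( const FlatPromise& );
      FlatPromise& operator=( const FlatPromise& );

      std::u16string  owned_;
      bool            copied_;
      const char16_t* data_;
  };
```

In the single-fragment case the pointer returned by get() is the source's own buffer, so the source must outlive the promise.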

If the function you are calling needs to take ownership of the buffer -you hand it, then you will probably call ToNewUnicode like so - -

-
-void ConsumeBuffer( PRUnichar* );
-
-ConsumeBuffer( ToNewUnicode(aString) );
-
-
- -

The global function ToNewUnicode is declared in "nsReadableUtils.h", -and was only recently added to the build. It is currently being used -in the DOMAPI branch. It is part of the build, but the file -"dlldeps.c" in XPCOM may need to be modified to ensure it is exported -on your platform if you are building the tip. - -Needless to say, you want to avoid functions that require bare -pointers for several reasons: (a) they typically assume -zero-termination, which is not guaranteed by the normal encodings; (b) -they require contiguous allocation, which may not be possible; (c) -they scan for the end of the string, at linear cost (if the encoding -makes it possible at all), when the length could be known in advance. -If you have to do it, the above mechanisms work, but be aware of the -cost and the potential need to copy. - - -

- -

nsAReadableString is an abstract type. So you can't have a concrete -instance of it. All strings in the hierarchy are readable strings. -If you just want a reference to a readable string, you can say, e.g., - -

-
-struct foo
-  {
-    const nsAReadableString&  mString;
-    // ...
-
-    foo( const nsAReadableString&  aString ) : mString(aString) { }
-  };
-
-
- -

...similarly with pointers; but I suspect you are looking for -something more concrete. An nsString is a nsAReadableString, and -is the typical thing you want as a member variable. An nsAutoString -is also an nsAReadableString and is typically what you would use for -a short (in length) temporary (in lifetime) local variable, as I -mentioned in my previous post. - - -

- -

Yes, though remember, an nsLiteralString assumes the lifetime of the -underlying data is under someone else's control. If the called -function gives you a buffer that you need to delete, you will have -to manage that yourself. Currently, people often use nsXPIDLString -to handle that. XPIDL strings are not part of the hierarchy. They -are only used as a sort of string-auto_ptr. However, I'm -integrating their functionality into nsString. There is no problem -in wrapping the same pointer in both as two separate local variables, -one to give you the readable interface, and one to manage the -lifetime. - -

If it's OK with you, I'd like to post this reply (including your -quoted questions) to n.p.m.xpcom and also put a copy near the string -iterator discussion I provided a link to above, so that other people -with similar questions can see these answers. - -

Hope this helps, - - - - - -


-
-Date: Sun, 3 Sep 2000 03:52:17 -0400
-
- -

In article <8nu9m2$eo14@secnews.netscape.com>, "Jon Smirl" - wrote: - -> I have the new strings up and running in my app. They work as -> advertised and -> I haven't found any bugs. Thanks for the good job in designing and -> implementing them. Here's a summary of issues I've encountered -> so far... - -

Thanks, and I appreciate your comments and insights. - - -> -> 1) Should there be a nsSegmentedString derived from nsString instead -> of building segment support into nsString? None of my strings are -> segmented but -> I keep executing code that supports it. nsPromiseFlatString would -> be trivial in the non-segmented case. - -

The general case is that a string does not promise to have contiguous -data. A specific case is that, for some implementations, it does. -You couldn't do it the other way around, because a segmented string -couldn't satisfy all the promises of a flat string. However, through -the use of chunky iterators, operating on strings that happen to be -flat is very efficient. In fact, nsPromiseFlatString is trivial in -the non-segmented case. In addition, I'll be adding an abstract flat -class into the hierarchy, which will present additional interface ... -in your local routines where you actually have declared a concrete -string instance that happens to be flat, the compiler will give you -the benefit of using the flat-specific routines (e.g., a substring -object over a flat string is simpler than the general purpose -substring). I need to be cautious about this, though, since I don't -automatically want people propagating the flat type through their -interfaces. That would put us in the same boat we're in right now ... -where routines only work on a specific kind of string, which denies -other parts of the code the opportunity to use an implementation -beneficial to its specific needs, and typically for no good reason. - -> -> 2) Should nsAWritableString have a way to get the buffer and then -> return it? -> I need to get the buffer to pass it to OS calls. I'm doing this now -> by passing around nsStrings instead of the interface. If I just use -> the interface I incur an extra copy since I have to use a temporary -> buffer. - -

A specific string implementation could promise this, but in general, a -writable could not. After all, a writable doesn't even guarantee -contiguous storage. To some degree, this is what -nsPromiseFlatString is for. However, this is a readable promise -only. It will also be the case that ns[C]Strings, in the very near -future will be able to just assume ownership of an arbitrary buffer -allocated on the free store with the XPCOM allocators ... getting one -to give up its buffer, on the other hand, presents some problems. Do -you have a lot of places where the system writes into your string -buffer space? Or do you have a lot of system routines that return you -new buffers? I can imagine using nsPromiseFlatString for this, but -what happens when the OS alters the underlying data? If the promise -had generated that flat data on behalf of a multi-fragment string, -should it now put the changes back? It's possible to do, I just want -to know if it's correct to allow this situation to happen. - - - -> -> 3) There needs to be a NS_LITERAL_CHAR() to go along with -> NS_LITERAL_STRING(). - -

OK. - - - -> Having NS_LITERAL_STRING() all over the code clutters -> it up and makes it hard to tell what the code is doing, could we -> have a standard short alias for this? - -

Yes, I'll try to think of something ... perhaps NS_LSTR? - - -> 4) nsLiteralString should support n.ToInteger(&error); - -

ToInteger is actually a bad interface. It's only good if your -entire string is the number; this encourages you to edit your string -until it is one, or perhaps copy the numeric part to another string. -Better if you just sscanf a string (don't know if I can provide -that in the general case, but I'm thinking about it), or else use -regular C++ extractors (which wouldn't be too hard for me to -provide), or else I could give you a ToInteger that works on a pair -of iterators, extracting the integer from the digits between them. - -> -> 5) There should be a global define for an interface to a readonly -> empty string. - -

Yes, there will be. - - -> -> 6) Something is wrong with concatenation.... - -

Hopefully I've fixed this now. - - - -> 8) A forward definition is missing in the h files - -

I'll check it out. - - - -

My understanding is that you have already found the answers to your -other questions. - -

I hope this helps, - - - - -


-
-Date: Wed, 20 Sep 2000 17:32:13 -0400
-Subject: Re: how to free an nsString::ToNewCString
-
- - - -

nsMemory::Free - - - - - -


- -

You use several NS_ConvertASCIItoUTF16("...").get(), these should be - - NS_LITERAL_STRING("...").get() - -

Don't do this to the very first case where you aren't wrapping an actual literal string. -The first instance should exploit NS_LITERAL_STRING technology as well, -around the initial declarations of the strings ... you probably want to do this with -NS_NAMED_LITERAL_STRING. - - - -


-
-Date: Thu, 12 Oct 2000 00:57:28 -0400
-Subject: string answers
-
- -
-
-nsresult
-DoSomething( nsAWritableString&  answer )
-  {
-    nsresult rv = NS_OK;  // initialized so the |else| path returns a defined value
-
-    nsXPIDLString registry_data;
-    Fetch("key", getter_Shares(registry_data));
-
-    nsLiteralString path(not_my_string);
-
-    PRInt32 first_colon = path.FindChar(PRUnichar(':'));
-    if ( first_colon != -1 )
-      {
-        // convert ... extract path from |path|
-        nsCOMPtr localFile( do_CreateInstance(CID, &rv) );
-        if ( localFile )
-          {
-            localFile->SetPersistentDescriptor(NS_ConvertUTF16toUTF8(path));
-
-            nsXPIDLString converted_path;
-            localFile->GetUnicodePath(getter_Copies(converted_path));
-            answer = converted_path.get();
-          }
-      }
-    else
-      {
-        answer = path;
-      }
-
-
-    return rv;
-  }
-
-
- - - - - -
-
-Date: Thu, 12 Oct 2000 02:03:49 -0400
-Subject: Re: and the answer is ...
-
- -

You can see from the line of code that you're on, that this should -have been fine. nsMemory::Alloc would be asked to allocate a 1 byte -object. But it failed trying to allocate that. Which suggests that -the allocator was busy and non-reentrant and the debugger tried to -misuse it. Yes? - -

Of course, this doesn't solve your problem. Perhaps we need to go -back to the idea of a function that returns a pointer to the first -hunk of the string. - -

-
-const char*
-debug_string( const nsAReadableCString& aCString )
-  {
-    nsReadingIterator<char> iter;
-    aCString.BeginReading(iter);
-    return aCString.IsEmpty() ? "" : iter.get();
-  }
-
-
- -

This code should work regardless of what the allocator is doing. The -downsides are (a) it only returns the first hunk of the string, in the -case of a multi-fragment string; and (b) that hunk might not be -zero-terminated. - -

Hope this helps, - - - - - -


-
-Date: Thu, 12 Oct 2000 08:30:32 -0400
-Subject: Re: Self healing the cache :-)
-
- -

At 3:04 PM -0400 10/11/00, Mike Shaver wrote: -

- -

Macro ugliness makes NS_LITERAL_STRING inappropriate for use over -other macros. In other words: - -

-
-NS_LITERAL_STRING("foo")
-
-
- -

is good. - -

-
-#define FOO "foo"
-NS_LITERAL_STRING(FOO)
-
-
- -

is bad. Why? Because it turns into - -

-
-nsLiteralString(LFOO, sizeof(LFOO)...
-
-
- -

and there is no LFOO. Sorry. If you have to do this to a -macro-ized string, do the magic by hand, e.g., - -

-
-nsLiteralString(FOO, sizeof(FOO)/sizeof(PRUnichar) - 1)
-
-
- -

or else if you don't care that nsLiteralString will scan for the -length, just say - -

-
-nsLiteralString(FOO)
-
-
- -
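For what it's worth, the missing-LFOO effect and one common preprocessor workaround can be reproduced in isolation. The workaround is a general C/C++ idiom, not something this message proposes, and the sketch uses the standard u prefix and char16_t instead of L and PRUnichar to sidestep the wchar_t size question. All names here (SKETCH_WIDEN_DIRECT, SKETCH_WIDEN, length_of, kFooLength) are hypothetical.

```cpp
#include <cstddef>

// Pastes the prefix onto the argument exactly as spelled.
// SKETCH_WIDEN_DIRECT(FOO) would yield the single token uFOO,
// which is the analogue of the LFOO problem described above.
#define SKETCH_WIDEN_DIRECT(s) u##s

// One extra level of macro lets the argument expand first, because
// it is no longer an operand of ## in this macro.
#define SKETCH_WIDEN(s) SKETCH_WIDEN_DIRECT(s)

#define FOO "foo"

// Length of a char16_t array, not counting the zero terminator.
template <std::size_t N>
constexpr std::size_t length_of( const char16_t (&)[N] )
  {
    return N - 1;
  }

// FOO expands to "foo" before the paste, giving u"foo".
constexpr std::size_t kFooLength = length_of(SKETCH_WIDEN(FOO));

static_assert(kFooLength == 3, "macro argument was expanded before pasting");
```

Whether the real macro could adopt this indirection on every supported compiler is a separate question that this sketch does not answer.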

Hope this helps, - - - - - -


-
-Date: Thu, 12 Oct 2000 08:36:14 -0400
-Subject: Re: Self healing the cache :-)
-
- -

Actually, I'm not even sure you can do it by hand, since you didn't - -

-
-#define FOO L"foo"
-
-
- -

and can't do that cross-platform. The other way around this is to -define a global instead of a macro, that is, instead of saying - -

-
-#define FOO "foo"
-
-
- -

at the top of your file, say - -

-
-NS_NAMED_LITERAL_STRING(FOO, "foo")
-
-
- -

or else, if the macro was used only in one spot ... perhaps you could -just eliminate the macro in favor of NS_NAMED_LITERAL in situ. - -

Arghh. In this case, you may be stuck with the extra work of -AssignWithConversion. - - - - - -


-
-Date: Sun, 3 Dec 2000 16:38:07 -0400
-Subject: Re: another copy_string question
-
- - - -

No, there isn't. But you could move such special processing into the -destructor of the sink. Remember, the sink is passed by reference, so -you can exactly control its lifetime. - -

-
-{
-  MySink sink;
-  nsReadingIterator<PRUnichar> sourceStart = aStr.BeginReading();
-  nsReadingIterator<PRUnichar> sourceEnd = aStr.EndReading();
-  copy_string(sourceStart, sourceEnd, sink);
-    // |sink| destructor executed here
-}
-
-
- -
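A self-contained sketch of the destructor approach, using standard types instead of the Mozilla iterators. FinishingSink and collect are hypothetical names, and appending a newline stands in for whatever end-of-string processing the sink needs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sink: gathers chunks as they arrive and does its
// "after the last chunk" work in the destructor.
class FinishingSink
  {
    public:
      explicit FinishingSink( std::u16string& out ) : out_(out) { }

        // end-of-string processing happens here
      ~FinishingSink()
        {
          out_ = pending_;
          out_ += u'\n';
        }

      std::size_t write( const char16_t* s, std::size_t n )
        {
          pending_.append(s, n);
          return n;
        }

    private:
      std::u16string& out_;
      std::u16string  pending_;
  };

inline std::u16string collect( const std::vector<std::u16string>& chunks )
  {
    std::u16string result;
    {
      FinishingSink sink(result);
      for ( std::size_t i = 0; i < chunks.size(); ++i )
        sink.write(chunks[i].data(), chunks[i].size());
        // |sink| destructor executed here, finishing |result|
    }
    return result;
  }
```

The inner braces are what pin down the moment of the final step: the result is complete as soon as the scope closes.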

Hope this helps, - - - - - -


-
-Date: Fri, 15 Dec 2000 20:02:08 -0400
-Subject: fragment of code
-
- -
-
-nsPromiseFlatString flatKey(aReadable);
-
-flatKey.get()
-
-
- - - - - - -
-
-Date: Tue, 16 Jan 2001 16:47:37 -0400
-Subject: Re: a few string questions...
-
- ->I've accumulated a few questions I've been wanting to ask you, mostly ->about string stuff. Nothing urgent, but I want to ask them before I ->forget. So here goes...: -> ->1) Is it acceptable to use nsLiteralCString or nsLiteralString on ->something that's not a literal? This can be useful in some places, ->for example, to convert a char* to PRUnichar*: -> ->PRUnichar* new = ToNewUnicode(nsLiteralCString(myCharPtr)); - -

This is explicitly allowed. That's why I'm proposing to change the -names of those classes to nsLocal[C]String. - - ->2) Should nsString2x.h and nsString2x.cpp go away? They look like a ->never-completed rewrite or something... - -

Yes. They should go away. They are uncompleted [old] bullshit, -exactly as you diagnosed. - -

I'll look into the other two questions. - - - - - -


-
-Date: Thu, 1 Feb 2001 15:12:41 -0400
-Subject: Re: [Fwd: bad string, bad string]
-
- -

We've been removing implicit conversion operators because they -_always_ lead to trouble. Usually they make it harder to pick the -right function when overloading is involved and in the past they have -led to huge performance suckage because we ended up doing conversions -when we didn't need to because the implicit operator made us pick the -wrong function. - -

It's borderline when the class implements something that is so -close, as with a guaranteed flat string or an nsCOMPtr ... but the -general recommendation is to avoid implicit conversions. - -

See bug #53057. - - - - - -


-
-Date: Tue, 6 Feb 2001 18:52:23 -0400
-Subject: seeking review for bug #57087
-
- -

bug: - http://bugzilla.mozilla.org/show_bug.cgi?id=57087 - - patch: - http://bugzilla.mozilla.org/showattachment.cgi?attach_id=24576 - -

This patch is supposed to add the ability to define very long literal -strings more easily by breaking lines, e.g., - -

-
-NS_MULTILINE_LITERAL( NS_L("This is the start of a very long line")
-                      NS_L(" which actually continues across")
-                      NS_L(" a couple more.") )
-
-
- -
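The scheme leans on the compiler concatenating adjacent string literals. A standalone sketch of that mechanism, assuming a hypothetical SKETCH_L macro in place of NS_L and char16_t in place of PRUnichar (the patch's actual macros are not reproduced here):

```cpp
#include <cstddef>

// Hypothetical stand-in for the inner wrapper: prefixes one literal.
#define SKETCH_L(s) u##s

// Adjacent literals are joined by the compiler into a single array,
// so the line breaks cost nothing at run time.
static const char16_t kLongLine[] =
    SKETCH_L("This is the start of a very long line")
    SKETCH_L(" which actually continues across")
    SKETCH_L(" a couple more.");

// Length of the joined literal, not counting the zero terminator.
static const std::size_t kLongLineLength =
    sizeof(kLongLine) / sizeof(char16_t) - 1;
```

Since the pieces form one array, a sizeof-based length covers the whole text rather than just the first piece.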

The main danger in this scheme is callers who omit the inner NS_L -wrapping. Though I believe this will be caught at compile time as the -wrong type initializer. - -

Seeking input from everybody, and waterson in particular. - - - - - -


-
-Date: Wed, 14 Feb 2001 16:09:10 -0400
-Subject: Re: Question...
-
- -

There are some utilities in "xpcom/ds/nsReadableUtils.h". In -particular, if you want to get back a new heap-allocated ASCII string -with the minimal work, you would say - -

-
-PRUnichar* sourceChars = ...;
-
-char* destChars = ToNewCString(nsLiteralString(sourceChars));
-
-
- - -

It's more efficient if you happen to already know the length. If you -don't, don't bother counting, that's what I'll do in the constructor -for nsLiteralString. If you do, then call like this - -

-
-destChars = ToNewCString( nsLiteralString(sourceChars, length) );
-
-
- -

Other routines in that file will help you if, for instance, you wanted -to translate into a buffer you had already allocated. - -

Hope this helps, - - - - - -


-
-Date: Fri, 23 Feb 2001 03:12:58 -0400
-Subject: string snippet
-
- -
-
-nsCString aInput;
-
-
-
-nsReadingIterator<char> search_start;
-aInput.BeginReading(search_start);
-
-nsReadingIterator<char> search_end;
-aInput.EndReading(search_end);
-
-if ( FindCharInReadable(':', search_start, search_end) )
-  {
-    ++search_start;
-    return ToNewCString( Substring(aInput, search_start, search_end) );
-  }
-
-
- - - - - - -
-
-Date: Wed, 7 Mar 2001 19:44:08 -0400
-Subject: string help
-
- -

Here you go, Mike: - - http://scottcollins.net/journal/discussion/mjudge-scratch.cpp - - - - - - -


-
-Date: Fri, 9 Mar 2001 20:56:07 -0400
-Subject: Re: string assertions
-
- -

If you get an iterator into a string and you advance it all the way to -the end of the string, and then keep trying to advance it, you hit -this assert. This could happen, for example if you tried to copy 10 -characters out of a 9 character string. I've tried to make this -impossible to get to. As far as I know, all my routines trim requests -in advance of manipulating iterators. When you see this, you should -get the stack. That will take you right to the bad spot. - - - - - -


-
-Date: Sat, 31 Mar 2001 11:04:03 -0400
-Subject: Re: Sun bustage and string advice
-
- -

You do know you are comparing two pointers now? It seems unlikely -those two pointers would ever be the same pointer. You probably want -to say something like - -

-
-NS_LITERAL_STRING("foo").Equals(aTopic) // or
-
-NS_LITERAL_STRING("foo") == nsLiteralString(aTopic)
-
-
- -

...so that you compare the contents of two strings. Right now, -you're just testing to see if two pointers both point to the same -location in memory. A lot of people make this mistake. I would like -to make it obvious to people that comparing two pointers does not -compare strings. Can you tell me what gave you that impression so -that I can figure out how to better educate people not to do this? By -the way, it's not that I don't want to make this compare two -strings; it's that in C++, you can't overload operators for built-in -types. And pointers are built-in types. So I can't make -operator==(const PRUnichar*, const PRUnichar*) do anything different -than it already does, which is the same thing it does for any other -pointer. - - - - - - -
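The difference is easy to demonstrate with nothing but the standard library. same_contents is a hypothetical helper, and char16_t stands in for PRUnichar.

```cpp
#include <cstddef>
#include <string>

// Compares the characters two zero-terminated buffers point at,
// which is what people usually mean when they write |a == b|.
inline bool same_contents( const char16_t* a, const char16_t* b )
  {
    typedef std::char_traits<char16_t> traits;
    const std::size_t length = traits::length(a);
    return length == traits::length(b)
        && traits::compare(a, b, length) == 0;
  }
```

Two separate buffers that both hold "foo" compare unequal as pointers, because they live at different addresses, while same_contents reports them equal.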

- - - - - - - - -