

Fast CString concatenation

Ed Ball -- ed@logos.com
Wednesday, September 25, 1996

Environment: Win95, VC++ 4.2

I am a frequent user of the CString class. Naturally, the most common
operation I perform on CStrings (other than assignment) is
concatenation. Unfortunately, concatenation is one of the few things
CString does poorly, IMHO because there is no "grow by" attribute as
there is with the MFC array classes. So every time I concatenate a
character or two to a CString, a new buffer is allocated, the string is
copied into it, and the old buffer is deleted. This causes some serious
memory fragmentation, because the space left by the old buffer is always
too small to hold the next, longer concatenation.

To avoid this, you could go back to using arrays of characters and
strcat... or you could use GetBuffer. This function is normally used to
give you direct access to the character array of the string, but it can
also be used to pre-allocate the string to a given length. If you can
make a reasonable guess as to the maximum size of the final string, you
can call GetBuffer with that value, call ReleaseBuffer right after that
(it won't deallocate the unused characters of the buffer), and then do
the concatenations. After they are done, you can call FreeExtra if you
don't want the unused characters hanging around.

So instead of this:

  CString str;
  str += str1;
  // ... do more concatenations ...
  return str;

You could do this to improve speed and reduce fragmentation:

  CString str;
  str.GetBuffer(1024);
  str.ReleaseBuffer();
  str += str1;
  // ... do more concatenations ...
  str.FreeExtra();
  return str;

The former will do one allocation and one deallocation for each
concatenation, while the latter will do exactly two allocations and two
deallocations, assuming that 'str' never exceeds 1024 bytes.
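
The effect of pre-allocating can be illustrated without MFC. The
following is only a sketch: std::string stands in for CString, and
reserve() plays the role of the GetBuffer/ReleaseBuffer pair. The helper
name count_capacity_changes is hypothetical. Note that std::string grows
its buffer geometrically, so the count without pre-allocation will be
much lower than the one-allocation-per-concatenation behaviour described
above for CString.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: appends `n` characters one at a time and returns
// how many times the string's capacity changed along the way (a proxy
// for the number of reallocations).
// Assumption: std::string stands in for CString, and reserve(n) is the
// analogue of GetBuffer(n) followed by ReleaseBuffer().
std::size_t count_capacity_changes(std::size_t n, bool preallocate)
{
    std::string str;
    if (preallocate)
        str.reserve(n);                 // single up-front allocation

    std::size_t changes = 0;
    std::size_t last_capacity = str.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        str += 'a';
        if (str.capacity() != last_capacity) {
            ++changes;                  // the buffer was reallocated
            last_capacity = str.capacity();
        }
    }
    assert(str.size() == n);
    return changes;
}
```

Calling count_capacity_changes(10000, true) should report no capacity
changes inside the loop, while passing false reports at least one.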

Just to test my theory, I wrote a simple function that concatenated a
digit to a string 10000 times. The former method took 3.13 seconds on my
120MHz Pentium, while the latter (using 10001 in the GetBuffer call) was
"instantaneous".
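
A test of this kind might look roughly like the sketch below. It is not
the original test function: std::string again stands in for CString,
std::chrono::steady_clock replaces the stopwatch, and the names
AppendResult and timed_digit_appends are hypothetical. Absolute timings
on current hardware will be nowhere near the figures quoted above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical result type: the string that was built and the time taken.
struct AppendResult {
    std::string text;
    double milliseconds;
};

// Appends `count` digits one at a time, optionally reserving space first.
// Assumption: reserve(count + 1) mirrors the GetBuffer(10001) call.
AppendResult timed_digit_appends(std::size_t count, bool preallocate)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();

    std::string str;
    if (preallocate)
        str.reserve(count + 1);
    for (std::size_t i = 0; i < count; ++i)
        str += static_cast<char>('0' + i % 10);   // cycles through 0..9

    const std::chrono::duration<double, std::milli> elapsed =
        clock::now() - start;
    assert(str.size() == count);
    return {std::move(str), elapsed.count()};
}
```

Comparing the milliseconds field of timed_digit_appends(10000, false)
with that of timed_digit_appends(10000, true) gives the same kind of
before/after measurement.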

Just a tip that seems obvious in retrospect but is very useful for
rampant CString-users such as myself...

- Ed




MHENRY.UMI.COM -- MHENRY@umi.com
Friday, September 27, 1996

Ed, 
 
Thanks for passing the information along.  I have to admit that I have always
considered CString just one step short of useless because of its slow speed
and the fact that it's not all that tough to use arrays of chars (yes, I AM
aware that CStrings can be fast if you understand how they are implemented and
code to that, but if you have to take into account how a class is implemented
it kind of defeats the whole point of object-oriented programming).  Anyway I
wouldn't care to start any holy wars, but it would be remiss not to mention
that it is still much faster to concatenate using a buffer of chars.
 
With a buffer size of 200K:

Using CStrings

  CString Buf;
  for (int inx = 0; inx < 200000; inx++) Buf += 'a';

This function takes so long it's not even worth discussing.  Like 10 minutes
or something.

Using CString and GetBuffer()

  CString Buf;
  Buf.GetBuffer(200001); Buf.ReleaseBuffer();
  for (int inx = 0; inx < 200000; inx++) Buf += 'a';

Much faster.  It takes about a second by my watch.  525 milliseconds according
to the profiler.

Using char array and keeping a pointer to the last char

  char *pBuf = new char[200001];
  for (int inx = 0; inx < 200000; inx++) pBuf[inx] += 'a';
  pBuf[200000] = NULL;
  delete [] pBuf;

Instantaneous by my watch.  39 milliseconds by the profiler.
 
For non time-intensive tasks the GetBuffer() trick is pretty useful (but then 
what task isn't time-intensive?). 
 
--matt 
/~~~~~~~~~~~~~~~~~~~~~~~~~~~  
  Matthew Henry  -- UMI            
  mhenry@umi.com     (Work)              
  mhenry1384@aol.com (Home)  
~~~~~~~~~~~~~~~~~~~~~~~~~~/  



Bradley V. Pohl -- brad.pohl@pobox.com
Sunday, September 29, 1996

[Mini-digest: 4 responses]


>it kind of defeats the whole point of object-oriented programming).  Anyway I 
>wouldn't care to start any holy wars, but it would be remiss not to mention 
>that it is still much faster to concatenate using a buffer of chars. 
 
>With a buffer size of 200K 
>Using CStrings 
>CString Buf; 
>for (int inx = 0; inx < 200000; inx++) Buf += 'a'; 

>Using char array and keeping a pointer to the last char 
>char *pBuf = new char[200001]; 
>for (int inx = 0; inx < 200000; inx++) pBuf[inx] += 'a'; 
>pBuf[200000] = NULL; 
>delete [] pBuf; 
>Instantaneous by my watch.  39 milliseconds by the profiler. 

>--matt 
>/~~~~~~~~~~~~~~~~~~~~~~~~~~~  
>  Matthew Henry  -- UMI            
>  mhenry@umi.com     (Work)              
>  mhenry1384@aol.com (Home)  
>~~~~~~~~~~~~~~~~~~~~~~~~~~/  
 
You did mean 
	pBuf[ inx ] = 'a';
and not
	pBuf[ inx ] += 'a';
right?

The latter, of course, will give you an array of garbage
(although very quickly) :-)

--Brad
brad.pohl@pobox.com
-----From: Mike Blaszczak 

At 09:52 9/27/96 -0400, "MHENRY.UMI.COM"  wrote:

>if you have to take into account how a class is implemented 
>it kind of defeats the whole point of object-oriented programming). 

There's more than one "point" to object-oriented programming.  In the
real, applied world, some of those points are more important than others.

>Anyway I wouldn't care to start any holy wars,

Then you should compare apples to apples, shouldn't you?

It doesn't make sense to compare:

>Using CString and GetBuffer() 
>CString Buf; 
>Buf.GetBuffer(200001); Buf.ReleaseBuffer(); 
>for (int inx = 0; inx < 200000; inx++) Buf += 'a'; 
>Much faster.  It takes about a second by my watch.  525 milliseconds according 
>to the profiler. 

with:
 
>Using char array and keeping a pointer to the last char 
>char *pBuf = new char[200001]; 
>for (int inx = 0; inx < 200000; inx++) pBuf[inx] += 'a'; 
>pBuf[200000] = NULL; 
>delete [] pBuf; 
>Instantaneous by my watch.  39 milliseconds by the profiler. 

You probably meant to code:

for (int inx = 0; inx < 200000; inx++) pBuf[inx] = 'a'; 

Because pBuf[inx] isn't initialized, it doesn't make sense to use += instead
of =.

Your watch is far too blunt of an instrument for these comparisons.  You'd
need to use a profiler, for example, to realize that coding this:

CString Buf('x', 200001);

is almost perfectly equivalent (in execution time and effect, after the
optimizer has had its say) to your array-based pBuf[] code fragment.
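
To make the equivalence concrete, here is a minimal sketch using
std::string in place of CString (an assumption, so that no MFC is
needed). The function names fill_by_loop and fill_by_constructor are
hypothetical. The loop uses the corrected = assignment, and the fill
constructor takes (count, character), the reverse of the
CString(character, count) order used above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Array-based approach: allocate a raw buffer, fill it with plain
// assignment, terminate it, and copy the result out.
std::string fill_by_loop(std::size_t n, char ch)
{
    std::unique_ptr<char[]> buf(new char[n + 1]);
    for (std::size_t i = 0; i < n; ++i)
        buf[i] = ch;                 // '=' rather than '+='
    buf[n] = '\0';                   // character terminator, not NULL
    return std::string(buf.get(), n);
}

// Fill-constructor approach: one call, no hand-written loop.
std::string fill_by_constructor(std::size_t n, char ch)
{
    return std::string(n, ch);       // std::string order is (count, char)
}
```

Both functions should produce the same string for the same arguments, so
a fair timing comparison would pit these two against each other rather
than a loop of single-character concatenations against a raw array.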

> (but then what task isn't time-intensive?). 

In many, many applications, speed of development and maintainability of code
are far more important than runtime speed.  (If they weren't, we'd all be
coding in assembler.)  Maybe that's not strictly true in your application, or
even in your experience, but it's true for a great many of the people who use
MFC. Since you're not the only person who uses MFC, you should be respectful
of those other applications; they might not see things as "useless" as you do
because they're interested in other "points".



.B ekiM
http://www.nwlink.com/~mikeblas/
Don't look at my hands: look at my _shoulders_!
These words are my own. I do not speak on behalf of Microsoft.

-----From: Raja Segar 

What else can I say ... a superb tip ... keep it up, man.
By the way, can I ask you something, if you don't mind?

How do you profile the time it takes to execute line by line?
I can only get it to work by function.
Any help will be much appreciated.
Thanks in advance.
BYE.

Keep the Tips coming please !
 (  _ \/ __)(_   )
  )   /\__ \ / /_ 
 (_)\_)(___/(____)@pc.jaring.my

-----From: bop@gandalf.se

This is close to comparing apples and oranges!

What about REAL "object oriented" programming, like:

CString Buf(_T('a'),200000);

Probably "instantaneous" too!


The CString class buys you a VARIABLE length string,
with very little overhead. It's also much easier to use
than a (fixed size) char array.


In your char array approach you definitely program
for the implementation, just what you didn't like about
CString in the first place...


Bo Persson
bop@gandalf.se



