Fast CString concatenation
Ed Ball -- ed@logos.com
Wednesday, September 25, 1996

Environment: Win95, VC++ 4.2

I am a frequent user of the CString class. Naturally, the most common operation I perform on CStrings (other than assignment) is concatenation. Unfortunately, concatenation is one of the few areas that CString does poorly, IMHO due to the fact that there is no "grow by" attribute as there is with the MFC array classes. So every time I concatenate a character or two to a CString, a new buffer is allocated, the string copied into it, and the old buffer deleted, causing some serious memory fragmentation, as the space left by the old buffer will not be large enough to allocate the next concatenation into, for obvious reasons.

To avoid this, you could go back to using arrays of characters and strcat... or you could use GetBuffer. This function is normally used to give you direct access to the character array of the string, but it can also be used to pre-allocate the string to a given length. If you can make a reasonable guess as to the maximum size of the final string, you can call GetBuffer with that value, call ReleaseBuffer right after that (it won't deallocate the unused characters of the buffer), and then do the concatenations. After they are done, you can call FreeExtra if you don't want the unused characters hanging around.

So instead of this:

    CString str;
    str += str1;
    // ... do more concatenations ...
    return str;

You could do this to improve speed and reduce fragmentation:

    CString str;
    str.GetBuffer(1024);
    str.ReleaseBuffer();
    str += str1;
    // ... do more concatenations ...
    str.FreeExtra();
    return str;

The former will do one allocation and one deallocation for each concatenation, while the latter will do exactly two allocations and two deallocations, assuming that 'str' never exceeds 1024 bytes.

Just to test my theory, I wrote a simple function that concatenated a digit to a string 10000 times. The former method took 3.13 seconds on my 120MHz Pentium, while the latter (using 10001 in the GetBuffer call) was "instantaneous".

Just a tip that seems obvious in retrospect but is very useful for rampant CString-users such as myself...

- Ed
MHENRY.UMI.COM -- MHENRY@umi.com
Friday, September 27, 1996

Ed,

Thanks for passing the information. I have to admit that I have always considered CStrings just one step short of useless because of their slow speed and the fact that it's not all that tough to use arrays of chars (yes, I AM aware that CStrings can be fast if you understand how they are implemented and code to that, but if you have to take into account how a class is implemented it kind of defeats the whole point of object-oriented programming). Anyway I wouldn't care to start any holy wars, but it would be remiss not to mention that it is still much faster to concatenate using a buffer of chars.

With a buffer size of 200K:

Using CStrings

    CString Buf;
    for (int inx = 0; inx < 200000; inx++) Buf += 'a';

This function takes so long it's not even worth discussing. Like 10 minutes or something.

Using CString and GetBuffer()

    CString Buf;
    Buf.GetBuffer(200001);
    Buf.ReleaseBuffer();
    for (int inx = 0; inx < 200000; inx++) Buf += 'a';

Much faster. It takes about a second by my watch. 525 milliseconds according to the profiler.

Using char array and keeping a pointer to the last char

    char *pBuf = new char[200001];
    for (int inx = 0; inx < 200000; inx++) pBuf[inx] += 'a';
    pBuf[200000] = NULL;
    delete [] pBuf;

Instantaneous by my watch. 39 milliseconds by the profiler.

For non time-intensive tasks the GetBuffer() trick is pretty useful (but then what task isn't time-intensive?).

--matt
/~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Matthew Henry -- UMI
 mhenry@umi.com (Work)
 mhenry1384@aol.com (Home)
~~~~~~~~~~~~~~~~~~~~~~~~~~/
Bradley V. Pohl -- brad.pohl@pobox.com
Sunday, September 29, 1996

[Mini-digest: 4 responses]

>it kind of defeats the whole point of object-oriented programming). Anyway I
>wouldn't care to start any holy wars, but it would be remiss not to mention
>that it is still much faster to concatenate using a buffer of chars.
>With a buffer size of 200K
>Using CStrings
>CString Buf;
>for (int inx = 0; inx < 200000; inx++) Buf += 'a';
>Using char array and keeping a pointer to the last char
>char *pBuf = new char[200001];
>for (int inx = 0; inx < 200000; inx++) pBuf[inx] += 'a';
>pBuf[200000] = NULL;
>delete [] pBuf;
>Instantaneous by my watch. 39 milliseconds by the profiler.
>--matt
>/~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Matthew Henry -- UMI
> mhenry@umi.com (Work)
> mhenry1384@aol.com (Home)
>~~~~~~~~~~~~~~~~~~~~~~~~~~/

You did mean

    pBuf[ inx ] = 'a';

and not

    pBuf[ inx ] += 'a';

right? The latter, of course, will give you an array of garbage (although very quickly) :-)

--Brad
brad.pohl@pobox.com

-----From: Mike Blaszczak

At 09:52 9/27/96 -0400, "MHENRY.UMI.COM" wrote:

>if you have to take into account how a class is implemented
>it kind of defeats the whole point of object-oriented programming).

There's more than one "point" to object-oriented programming. In the real, applied world, some of those points are more important than others.

>Anyway I wouldn't care to start any holy wars,

Then you should compare apples to apples, shouldn't you? It doesn't make sense to compare:

>Using CString and GetBuffer()
>CString Buf;
>Buf.GetBuffer(200001); Buf.ReleaseBuffer();
>for (int inx = 0; inx < 200000; inx++) Buf += 'a';
>Much faster. It takes about a second by my watch. 525 milliseconds according
>to the profiler.

with:

>Using char array and keeping a pointer to the last char
>char *pBuf = new char[200001];
>for (int inx = 0; inx < 200000; inx++) pBuf[inx] += 'a';
>pBuf[200000] = NULL;
>delete [] pBuf;
>Instantaneous by my watch. 39 milliseconds by the profiler.

You probably meant to code:

    for (int inx = 0; inx < 200000; inx++) pBuf[inx] = 'a';

because pBuf[inx] isn't initialized, it doesn't make sense to use += instead of =.

Your watch is far too blunt an instrument for these comparisons. You'd need to use a profiler, for example, to realize that coding this:

    CString Buf('x', 200001);

is almost perfectly equivalent (in execution time and effect, after the optimizer has had its say) to your array-based pBuf[] code fragment.

> (but then what task isn't time-intensive?).

In many, many applications, speed of development and maintainability of code are far more important than runtime speed. (If they weren't, we'd all be coding in assembler.) Maybe that's not strictly true in your application, or even in your experience, but it's true for a great many of the people who use MFC. Since you're not the only person who uses MFC, you should be respectful of those other applications; they might not see things as "useless" as you do because they're interested in other "points".

.B ekiM
http://www.nwlink.com/~mikeblas/
Don't look at my hands: look at my _shoulders_!
These words are my own. I do not speak on behalf of Microsoft.

-----From: Raja Segar

What else can I say... a superb tip. Keep it up, man.

By the way, can I ask you something, if you don't mind? How do you profile the time it takes to execute line by line? I can only get it to work by functions. Any help will be much appreciated.

Thanks in advance. Bye. Keep the tips coming, please!

( _ \/ __)(_ ) ) /\__ \ / /_ (_)\_)(___/(____)@pc.jaring.my

-----From: bop@gandalf.se

This is close to comparing apples and oranges!

What about REAL "object oriented" programming, like:

    CString Buf(_T('a'),200000);

Probably "instantaneous" too!

The CString class buys you a VARIABLE length string, with very little overhead. It's also much easier to use than a (fixed size) char array. In your char array approach you definitely program for the implementation, just what you didn't like about CString in the first place...

Bo Persson
bop@gandalf.se