StyledText

I'd like to welcome BKeeney Software engineer Seth Verrinder to the BKeeney Briefs blog. Seth recently discovered a few issues relating to the StyledText class in REALbasic. Here are his observations:
A few days ago I needed to read a file in rich text format (RTF), starting with support for basic text formatting only. It seemed like a simple task with REALbasic’s built-in RTF support, but turned into a lesson on why you should test your software with real data. It didn’t take long to create a program that worked pretty well for a couple small files, so I went ahead and opened the client’s 200KB file. At first I thought it was going to be unacceptably slow, so I went and made some tea. When I got back the screen was still blank. Turns out RB’s RTF reader is basically useless.

To find out what was going on I created a series of sample files and timed how long it took to read each one on both Windows and a Mac. The times are similar between the two platforms, but the interesting thing is that, after a certain point, each time you double the size of the input file the amount of time it takes to read increases by a factor of more than eight. Extrapolating from these numbers it would take more than 10 hours to read a 200KB file on an iMac with a 2 GHz Core 2 Duo processor and 2GB of RAM.
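To make the extrapolation concrete, here is a minimal Python sketch of the arithmetic. The function name and the sample numbers are hypothetical and are not the measurements behind this post; the only thing taken from the text is the growth factor of eight per doubling, which corresponds to roughly cubic growth.

```python
import math

def extrapolate_read_time(known_seconds, known_kb, target_kb, factor_per_doubling=8.0):
    """Estimate read time at target_kb, assuming the time is multiplied by
    factor_per_doubling every time the input size doubles."""
    doublings = math.log2(target_kb / known_kb)
    return known_seconds * factor_per_doubling ** doublings

# A factor of 8 per doubling means time grows like size ** 3.
exponent = math.log2(8.0)

# Hypothetical example: going from 25KB to 200KB is three doublings,
# so the read time is multiplied by 8 * 8 * 8.
multiplier = extrapolate_read_time(1.0, 25, 200)
```

With growth like this, a file that reads in about a minute turns into hours of waiting after only a few doublings, which is why the small test files gave no warning.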

[Chart: time to read the sample RTF files as file size increases]

Using Apple’s Shark profiler revealed that 72.9% of the program’s time was spent in StyledTextParagraphCountGetter and 25.0% in StyledTextParagraphGetter. It’s difficult to imagine the code that would produce these results, but somewhere between the decision to add RTF support and the delivery of the current release to the public, it’s clear that something went terribly wrong. There’s nothing in the RTF format that requires eight times more processing for twice as much data.
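I can't see REALbasic's source, so the following is only a guess at the general kind of mistake that produces a profile like this, not a description of the actual code. In this hypothetical Python sketch, both the paragraph count and the paragraph lookup rescan the whole text on every call, and an innocent-looking loop calls each of them once per paragraph. The class and its counter are invented for illustration.

```python
class NaiveStyledText:
    """Hypothetical stand-in for a styled text object; not REALbasic's implementation."""

    def __init__(self, text):
        self.text = text
        self.chars_scanned = 0  # total characters touched by the getters

    def paragraph_count(self):
        # Rescans the entire text on every call.
        self.chars_scanned += len(self.text)
        return self.text.count("\n") + 1

    def paragraph(self, index):
        # Also rescans the entire text to find a single paragraph.
        self.chars_scanned += len(self.text)
        return self.text.split("\n")[index]


def scan_cost(paragraphs):
    """Walk every paragraph the obvious way and report how many characters were scanned."""
    doc = NaiveStyledText("\n".join(["x" * 10] * paragraphs))
    i = 0
    while i < doc.paragraph_count():  # count is recomputed on every pass
        doc.paragraph(i)
        i += 1
    return doc.chars_scanned


ratio = scan_cost(200) / scan_cost(100)  # about 4: doubling the input quadruples the work
```

This sketch is only quadratic. One more layer of the same pattern, such as a getter that itself loops over paragraphs, would be enough to reach the factor of eight per doubling that I measured.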

There are other ways to read an RTF file. The one I settled on was to use the RtfParser classes from http://www.belle-nuit.com/realbasic/rtfparser.html which are licensed under the LGPL. Other options include True North Software’s formatted text control and, for Windows only, RB’s own FolderItem.OpenStyledEditField, which works very nicely. Each of these solutions reads the test file in a few seconds.

REAL Software will (hopefully) fix this issue, so the lasting message to anyone who writes code for a living is to test your software with a wide range of inputs and review code to make sure that the way it’s solving a particular problem isn’t fundamentally flawed. Every developer screws up from time to time, so there’s no excuse for not planning ahead.
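A test like the one I ran by hand is easy to automate. The Python sketch below is one possible shape for it, with hypothetical helper names: it times a reader on inputs that double in size and reports the ratio between consecutive timings. The `reader` and `make_input` arguments are placeholders for whatever you are testing and however you generate sample data.

```python
import time

def doubling_sizes(start, steps):
    """Return input sizes that double at each step, e.g. 1024, 2048, 4096."""
    return [start * 2 ** k for k in range(steps)]

def time_reader(reader, make_input, sizes):
    """Time reader(make_input(size)) for each size; returns (size, seconds) pairs."""
    results = []
    for size in sizes:
        data = make_input(size)  # build the input outside the timed region
        started = time.perf_counter()
        reader(data)
        results.append((size, time.perf_counter() - started))
    return results

def growth_ratios(results):
    """Ratio of each timing to the one before it, skipping zero-length timings."""
    return [
        later / earlier
        for (_, earlier), (_, later) in zip(results, results[1:])
        if earlier > 0
    ]
```

When the sizes double, ratios that settle near 2 suggest linear behavior, near 4 quadratic, and near 8 cubic. Very small inputs give noisy timings, so the ratios only mean something once each run takes a measurable amount of time.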