by Rupert Walsh - GUI Computing
Normally, I use a bubble sort to order data. It's reasonably simple, and unless you're sorting a large amount of data, it's fast enough for most applications. The crux of sorting data is comparing two items in an out-of-order list, and deciding which comes first. For sorting numbers, you can simply use the >, < or = operators. For sorting text, Visual Basic has the built in function StrComp, short for String Compare. Its syntax is as follows:
StrComp(string1, string2[, compare_type])
It can return any of four values:
string1 is less than string2 -1 string1 is equal to string2 0 string1 is greater than string2 1 string1 or string2 is Null Null
The compare_type can be 0 (vbBinaryCompare), 1 (vbTextCompare) or 2 (vbDatabaseCompare for MS Access databases only). The only difference between Binary and Text comparison is that Binary comparison is case sensitive, whereas text isn't.
If your data is numeric, or simple text strings, the built in functions and a simple sorting algorithm are all you need. However, imagine that you need to sort the titles from a book on programming. It may be divided into sections like this:
1 Introduction 2 Programming Basics 2.1 Input 2.2 Output … 2.9 Vectors 2.10 Sorting Algorithms 3 Advanced Programming 3.1 Graphics 4. Glossary
This sort of text is common in manuals, textbooks and other documents. And VB's inbuilt functions just don't cover it - time for a simple addition to my programmer's toolkit.
My routine has three sort type options: Binary, Text and what I've called Index for want of a better term. If we sort the example above, here are the results using either the binary or text comparisons:
It's mostly OK, except that the author probably expects section 2.10 to come after section 2.9, rather than between 2.1 and 2.2. To sort this type of data correctly, I've written a function called StrCompEx (short for Extended String Compare) that has a third comparison method, which uses a simple bubble sort. It is defined as follows:
Public Function StrCompEx(ByVal sFirst As String, _ ByVal sSecond As String, _ Optional ByVal nCompareType As eCompareType = ctBinary) As eCompareResult
Note that I have enumerated both the compare and the return types, so that they pop up for auto-completion when you use the function in VB5 or VB6 (as described in detail in another article).
When the user clicks on the Sort button, each item in the list is added to an array which is passed (by reference) to the function SortStrings, along with the sort type:
Public Sub SortStrings(ByRef aText As Variant, ByVal nOrder As eCompareType) 'Bubble Sort the strings in aText Dim nGap As Integer Dim i As Integer Dim j As Integer Dim nCount As Long Dim nRowIndex As Long Dim tmpText As String nCount = UBound(aText) nGap = nCount / 2 If nGap = 0 Then nGap = 1 Do While nGap > 0 For i = nGap To nCount j = i - nGap Do While (j >= 0) If StrCompEx(aText(j), aText(j + nGap), nOrder) <= 0 Then Exit Do tmpText = aText(j) aText(j) = aText(j + nGap) aText(j + nGap) = tmpText j = j - nGap Loop Next nGap = nGap / 2 Loop End Sub
If you don't understand how a bubble sort works, I suggest you read about it in more detail in almost any programming or algorithm textbook. The essence of the bubble sort is that it compares pairs of values, swapping them if they are out of order, until all values are in order.
This is the line that performs the comparison:
If StrCompEx(aText(j), aText(j + nGap), nOrder) <= 0 Then Exit Do
It is here that I call my StrCompEx function, and if it returns a value greater than zero the string positions will be swapped.
As you can see, StrCompEx itself is really just a wrapper function for the standard StrComp, and another function called CompareByIndex, which is only called for our special 'Index' sort type:
Public Function StrCompEx(ByVal sFirst As String, ByVal sSecond As String, Optional ByVal nCompareType As eCompareType = ctBinary) As eCompareResult 'Extended string compare 'Purpose: Adds a new compare type for sorting numbered documents 'Catch null strings first so that null won't be returned by strcomp If IsNull(sFirst) Or IsNull(sSecond) Then StrCompEx = crError Else Select Case nCompareType Case ctBinary StrCompEx = StrComp(sFirst, sSecond, vbBinaryCompare) Case ctString StrCompEx = StrComp(sFirst, sSecond, vbTextCompare) Case ctIndexNumeric StrCompEx = CompareByIndex(sFirst, sSecond) End Select End If End Function
CompareByIndex uses a support function called GetIndexArray to split each of the strings we want to compare into an array containing the index part, and a string containing the rest of the text. For example, section "2.10 Sorting Algorithms" is split into an array containing the (string) values "2" and "10", and a string containing " Sorting Algrorithms":
Private Function GetIndexArray(ByVal sString As String, ByRef aIndexString As Variant, ByRef sTextPart As String) 'Split the string into an array containing each level of the index 'and a string containing the rest of the title. Dim nPos As Long Dim sChar As String Dim sReturn As String Dim nIndex As Integer For nPos = 1 To Len(sString) sChar = Mid(sString, nPos, 1) Select Case sChar Case "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" 'If numeric, concatenate If nPos = 1 Then 'We need to redim the first time we add a level. ReDim aIndexString(0 To 0) End If aIndexString(UBound(aIndexString)) = aIndexString(UBound(aIndexString)) & sChar Case "." If nPos = 1 Then 'We need to redim the first time we add a level. 'Will only occur here if the first character is a period '(unlikely but possible) ReDim aIndexString(0 To 0) End If 'Index level seprator - add a level to the array ReDim Preserve aIndexString(0 To UBound(aIndexString) + 1) Case Else 'Any other character means we are no longer within the index number 'so the rest of the string is the text part. sTextPart = Mid(sString, nPos + 1) Exit For End Select Next End Function
Once the strings have been split into their index array and text parts, we can proceed to compare them. We need to check whether each string started with an Index number at all. If it didn't, the variant that is was passed to GetArrayIndex won't have been redimmed, and won't be an array at all. We test this with IsArray(). If both strings had an index number, we compare each value in the array, starting with what was the leftmost part of the index.
As an example the "2" part of "2.10" is compared with the "2" of "2.3". We continue testing each of the values in the arrays, so the next the "10" is compared with the "3". These values are stored as strings in our array, but we want to compare them as (long) integers, so they are converted with CLng and tested using the >, < and = operators.
There are quite a few more checks we need to do to handle index numbers that are of different depths (say "2.1" vs "2.1.1") or with unusual syntax (say "2.1." or ".3"), as explained with comments in the code.
If the index numbers turn out to be equal (or both missing) we need to base our comparison on the rest of the strings (stored in sFirstText and sSecondText), and can use the standard StrComp() function.
Here's the function in full:
Private Function CompareByIndex(ByVal sFirst As String, sSecond As String) As eCompareResult 'Compare two strings using document numbering method Dim sFirstText As String Dim sSecondText As String Dim aFirst As Variant Dim aSecond As Variant Dim nPos As Long Dim nMaxBound As Integer Dim nIndex As Integer Dim nReturn As eCompareResult 'Split the strings into index and text parts GetIndexArray sFirst, aFirst, sFirstText GetIndexArray sSecond, aSecond, sSecondText 'There are four possible results after splitting: '1. Both had an index '2. 1st only had an index '3. 2nd only had an index '4. Neither had an index If IsArray(aFirst) And IsArray(aSecond) Then 'Both had an index - compare on index 'Only compare up to the lowest depth: If UBound(aFirst) < UBound(aSecond) Then nMaxBound = UBound(aFirst) Else nMaxBound = UBound(aSecond) End If 'If first or second is greater at any level, we have a winner For nIndex = 0 To nMaxBound If CLng(aFirst(nIndex)) = CLng(aSecond(nIndex)) Then 'If they were equal when converted to longs, need to 'test for case when one was empty and other was zero 'The empty one wins. If IsEmpty(aFirst(nIndex)) Then nReturn = crFirst Exit For ElseIf IsEmpty(aSecond(nIndex)) Then nReturn = crSecond Exit For End If ElseIf CLng(aFirst(nIndex)) < CLng(aSecond(nIndex)) Then nReturn = crFirst Exit For ElseIf CLng(aFirst(nIndex)) > CLng(aSecond(nIndex)) Then nReturn = crSecond Exit For End If Next 'If at this stage we still have no winner, check whether one 'is indexed to a deeper level. It will lose. If nReturn = crEqual Then If UBound(aFirst) > UBound(aSecond) Then nReturn = crSecond ElseIf UBound(aFirst) < UBound(aSecond) Then nReturn = crFirst Else 'If they are still the same, compare the rest of the strings nReturn = StrComp(sFirstText, sSecondText, vbTextCompare) End If End If ElseIf IsArray(aFirst) And Not IsArray(aSecond) Then '1st only had an index - return first nReturn = crFirst ElseIf Not IsArray(aFirst) And IsArray(aSecond) Then '2nd only had an index - return second nReturn = crSecond ElseIf Not IsArray(aFirst) And Not IsArray(aSecond) Then 'Neither had an index - do a traditional text compare nReturn = StrComp(sFirst, sSecond, vbTextCompare) End If CompareByIndex = nReturn End Function
Here's the result of our new sort, just as the author intended:
Before I'm bombarded with email, let me say that I don't think this sort will be particularly quick. Redimming the index arrays at each level is a definite no-no, as is checking the upper bound more times than is absolutely necessary, as I have done in the GetIndexArray function. However, for the type of data that would probably use this type of sort, it probably doesn't need to be all that fast. I almost always code for readability and maintainability, in any case. Speed freaks, feel free to tune this code!
The example program is available in full for download.