Summer Explorations

by Dan Appleman - Desaware

Author's Introduction

I write a regular column for Pinnacle's VB newsletter, and many of my columns are inspired by readers' questions. Recently I wrote one of these columns and, as is my practice, sent a preview copy to the person whose questions inspired the article, along with a request that he not spread it further pending publication.

Through ignorance and a bit of excess enthusiasm, the individual posted the entire article to a list server with over 4000 subscribers.

I am pleased to say that many of the subscribers protested this act, both in public and by notifying me directly. I'm also grateful to the owner of the listserver who promptly deleted the offending file and helped me to protect my copyright.

But sigh, the damage had been done. The publisher (rightly) decided that they wanted "fresh" material. So there I was, out hundreds of dollars of lost income and hours of research - with a really cool article and no way to sell it.

Then I recalled the old saying to the effect of: "If it's raining lemons, make lemonade".

I had previously experimented with writing an online white paper on OLE structured storage, and I received a fair amount of positive feedback about the approach. Since this article could no longer be sold, I decided to post it as another online article in the hope that readers will find it of interest. If you find value here, all that I ask is that you take a few minutes and visit Desaware's web site at www.desaware.com. You'll find quite a bit of additional technical information in our newsletters as well, and a selection of products that are unique in the industry.

Thank you

Daniel Appleman

Summertime Explorations

It's mid-summer as I write this. The season where kids are out of school, and spend their time playing, watching TV, and pestering adults with questions like: "Why is the sky blue?".

Ah yes, those simple, childish, beginner's questions - the kind of questions that deserve a three hour lecture from a local physicist, but are often blown off with something along the lines of: "the tidy bowl man in the sky dropped his bucket".

It's the season where my inbox seems stuffed with messages such as this one:

"I have been working with VBA for some time now, and have recently discovered that VBA and VB are basically the same. It is a little transference to VB 5.0 from Access 95, but it seems to be going very well. And now that I have Dan Appleman's guide to the Win32 API for VB 5.0 I seem to be basically unstoppable. I would like to see a place that I can get basic BAS files. I know that there are some very simple routines and functions that beginners are always looking for, and I wanted to know if there was a place that we might be able to download some of the basics.

Example: how to get cleanly in and out of the registry to return the registered name Windows 95 for an installation (as well as the company name). Also, how to "deltree" a directory, and how I can empty the recycle bin from a sub or function."

It's always nice to hear that a reader has been able to put one of my books to good use. And make no mistake, these are great questions that he asks. But they aren't really what I would call beginner's questions. In fact, by no stretch of the imagination would I classify my correspondent as a beginner.

And while I don't have a Ph.D. in physics, I've written a line or two of code in my time, and the three examples that he suggests do indeed sound intriguing.

The Registered Name

You've probably noticed that nowadays when you install new software, likely as not it will prompt you for a user and organization name, using as default values the settings that you specified when you installed Windows. How are new applications smart enough to figure out those settings?

The logical place to look for any system settings is the system registry. The tool to use to view or edit the registry is regedit.exe or regedt32.exe (depending on your system and version). The registry is a hierarchical database which has a number of top level keys depending on the operating system in use. Each key can have a default value and one or more named values. Each key may also have zero or more subkeys.

How can you find out the meanings of various keys and values in the registry?

One approach is to search through the Windows NT or Windows 95 resource kit, which contains documentation on various registry entries. Another approach is to use your Microsoft Developer Network (MSDN) library CD to search for registry (you do have MSDN, right?).

But there is so much documentation to search through that you might find it easier to search through the registry itself - especially if you are already somewhat familiar with its contents.

Since the registered name and organization relate to a particular installation on a particular machine, a good starting point is the HKEY_LOCAL_MACHINE key. Since the information that we are looking for relates to an installation of Windows, the next logical place to look is the Software key. Subkeys under the Software key are sorted by company - so naturally we look in the Microsoft subkey.

If your system is typical, it has a great many subkeys under the Microsoft key. Since we're concerned with the person and organization that registered the operating system itself, the natural place to look is the Windows NT or Windows subkey (Windows 95 does not have a Windows NT subkey, for obvious reasons).

The only key that I found under the Windows NT subkey is the CurrentVersion subkey (I wonder if they will ever create a subkey called ObsoleteVersion to keep track of previous versions of software. Probably not: with the rate of change in software these days, your disk would rapidly become overloaded with obsolete junk). Sure enough, under the CurrentVersion subkey there are values named RegisteredOwner and RegisteredOrganization that, according to the registry editor, contain the information that we are looking for.

As you read this you may be wondering if perhaps I actually spent hours searching for these entries, and am just demonstrating excellent hindsight in describing the logic of where these entries are located. Sorry - it really was that easy. It took about 10 seconds to figure out where they were. But it is true that I have spent my share of time exploring the registry - and I encourage you to do the same. It's quite interesting and educational. Don't hesitate to use your MSDN CD to look up registry entries that you don't understand. And don't be surprised if some registry entries aren't listed there. I truly believe that there are registry entries in my registration database that no human being understands (my theory is that they spontaneously evolve as a system becomes more complex).

Which brings us to the implementation code.

The keys that we are looking for are HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion under NT, and HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion under Windows 95. This means that the routine will need to know if it is running under Windows NT or Windows 95. The GetVersionEx function makes this easy. Place the following code in the declaration section of a form or module:

Private Type OSVERSIONINFO
   dwOSVersionInfoSize As Long
   dwMajorVersion As Long
   dwMinorVersion As Long
   dwBuildNumber As Long
   dwPlatformId As Long
   szCSDVersion As String * 128
End Type

Private Const VER_PLATFORM_WIN32_NT = 2

Private Declare Function GetVersionEx Lib "kernel32" Alias _
"GetVersionExA" (lpVersionInformation As OSVERSIONINFO) As Long

Private IsWindows95 As Boolean

The code that does the work can be placed in the Load routine of a form, or the Initialize routine of a class.

Dim os As OSVERSIONINFO
os.dwOSVersionInfoSize = Len(os)
Call GetVersionEx(os)
If os.dwPlatformId <> VER_PLATFORM_WIN32_NT Then
   IsWindows95 = True
End If

Don't forget to set the os.dwOSVersionInfoSize field to the length of the OSVERSIONINFO structure. GetVersionEx will fail if you forget to do this - a very common error (and one which I make almost every time I use the function).
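As a minimal sketch of the point above, the version test can be wrapped in a function that also checks the value returned by GetVersionEx. The function name RunningUnderNT and the error number it raises are hypothetical and chosen only for illustration. The declarations are repeated so that the block can be pasted into an empty module; drop them if the module already contains them.

```vb
Private Type OSVERSIONINFO
   dwOSVersionInfoSize As Long
   dwMajorVersion As Long
   dwMinorVersion As Long
   dwBuildNumber As Long
   dwPlatformId As Long
   szCSDVersion As String * 128
End Type

Private Const VER_PLATFORM_WIN32_NT = 2

Private Declare Function GetVersionEx Lib "kernel32" Alias _
"GetVersionExA" (lpVersionInformation As OSVERSIONINFO) As Long

' Hypothetical helper: returns True when running under Windows NT.
' Raises an error if GetVersionEx reports failure (a zero result).
Private Function RunningUnderNT() As Boolean
   Dim os As OSVERSIONINFO
   ' The size field must be filled in before the call
   os.dwOSVersionInfoSize = Len(os)
   If GetVersionEx(os) = 0 Then
      Err.Raise vbObjectError + 1, "RunningUnderNT", "GetVersionEx failed"
   End If
   RunningUnderNT = (os.dwPlatformId = VER_PLATFORM_WIN32_NT)
End Function
```

With this in place, the form's Load routine could simply contain IsWindows95 = Not RunningUnderNT().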

The registry code requires the following declarations:

Private Const HKEY_LOCAL_MACHINE = &H80000002
Private Const REG_SZ = 1

Private Declare Function RegOpenKey Lib "advapi32.dll" Alias _
"RegOpenKeyA" (ByVal hKey As Long, ByVal lpSubKey As String, _
phkResult As Long) As Long

Private Declare Function RegQueryValueEx Lib "advapi32.dll" Alias _
"RegQueryValueExA" (ByVal hKey As Long, ByVal lpValueName As String, _
ByVal lpReserved As Long, lpType As Long, lpData As Any, lpcbData As Long) _
 As Long

Private Declare Function RegCloseKey Lib "advapi32.dll" (ByVal hKey As Long) As Long

HKEY_LOCAL_MACHINE is actually a constant long value that represents one of the top level keys of the registry. It is passed to the RegOpenKey function to open the desired key. The following code retrieves the information that we are looking for.

Dim subkey$
Dim key&, res&
Dim owner$, company$
Dim datalen&
Dim datatype&

subkey = "SOFTWARE\Microsoft\"

' The key is different under Windows 95 and Windows NT
If IsWindows95 Then
   subkey = subkey & "Windows\CurrentVersion"
Else
   subkey = subkey & "Windows NT\CurrentVersion"
End If

' Open the key
res = RegOpenKey(HKEY_LOCAL_MACHINE, subkey$, key)

If res <> 0 Then
   MsgBox "Can't open key"
   Exit Sub
End If

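' Now read the owner
datalen = 100 ' Pick some large number
owner$ = String$(datalen, 0)
res = RegQueryValueEx(key, "RegisteredOwner", 0, datatype, ByVal owner$, datalen)
' On success datalen includes the terminating null, so drop it
If res = 0 And datalen > 0 Then
   owner$ = Left$(owner$, datalen - 1)
Else
   owner$ = ""
End If

' Now read the organization the same way
datalen = 100 ' Pick some large number
company$ = String$(datalen, 0)
res = RegQueryValueEx(key, "RegisteredOrganization", 0, datatype, _
ByVal company$, datalen)
If res = 0 And datalen > 0 Then
   company$ = Left$(company$, datalen - 1)
Else
   company$ = ""
End If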

RegCloseKey key

MsgBox owner$ & vbCrLf & company$, vbOKOnly, "User Info"

Most of this code is straightforward, but there are a few subtleties to watch for when using the RegQueryValueEx function. As with any API function that fills in a string, it is important to initialize the string first. In this case choose a length that you know will be longer than any reasonable value that the registry entry will take. The datalen variable is a long variable that initially contains the length of the string. This value is interpreted by the RegQueryValueEx function as the maximum size of the data buffer that you are providing. If the data is longer than this, the function does not truncate it: it returns ERROR_MORE_DATA (234) and sets datalen to the size that is required. When the function succeeds, the datalen variable is set to the actual number of bytes loaded into the buffer.

The owner and company strings must be passed ByVal. This is because the lpData parameter of the function is defined "As Any". If you forget to use ByVal here, you'll be passing a pointer to the location containing the BSTR (OLE string) handle of the string, rather than a pointer to the string itself. When the function overwrites that location (and those that follow) it will probably corrupt your Visual Basic memory and lead to an error or memory exception.

Finally, the strings are truncated using the Left$ function to contain only the string data. The length is one less than the value of the datalen variable when RegQueryValueEx returns. This is because the datalen value includes the null terminating character, which you definitely do not want as part of the string (as an experiment, change the parameter of the Left$ functions from datalen-1 to datalen and look at the message box that is displayed. The company name vanishes. Can you see why?).

Even though this code is implemented in Visual Basic, it's pure VBA and should be usable from a code module in any VBA based application including Excel and Word.
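The registry steps above can be gathered into one reusable routine. The following is only a sketch under stated assumptions: the names TrimAtNull and ReadRegString are hypothetical, the routine reads only from HKEY_LOCAL_MACHINE, and it returns an empty string on any failure. It retries once with a larger buffer when RegQueryValueEx reports ERROR_MORE_DATA, and it cuts the result at the first null character rather than relying on datalen - 1.

```vb
Private Const HKEY_LOCAL_MACHINE = &H80000002
Private Const REG_SZ = 1
Private Const ERROR_SUCCESS = 0
Private Const ERROR_MORE_DATA = 234

Private Declare Function RegOpenKey Lib "advapi32.dll" Alias _
"RegOpenKeyA" (ByVal hKey As Long, ByVal lpSubKey As String, _
phkResult As Long) As Long

Private Declare Function RegQueryValueEx Lib "advapi32.dll" Alias _
"RegQueryValueExA" (ByVal hKey As Long, ByVal lpValueName As String, _
ByVal lpReserved As Long, lpType As Long, lpData As Any, lpcbData As Long) _
 As Long

Private Declare Function RegCloseKey Lib "advapi32.dll" (ByVal hKey As Long) As Long

' Hypothetical helper: keep only the text before the first null character
Private Function TrimAtNull(ByVal s As String) As String
   Dim pos As Long
   pos = InStr(s, vbNullChar)
   If pos > 0 Then
      TrimAtNull = Left$(s, pos - 1)
   Else
      TrimAtNull = s
   End If
End Function

' Hypothetical helper: read a REG_SZ value under HKEY_LOCAL_MACHINE.
' Returns "" if the key or value cannot be read.
Private Function ReadRegString(ByVal subkey As String, _
ByVal valueName As String) As String
   Dim key As Long, res As Long
   Dim datatype As Long, datalen As Long
   Dim buf As String

   If RegOpenKey(HKEY_LOCAL_MACHINE, subkey, key) <> ERROR_SUCCESS Then
      Exit Function
   End If

   datalen = 100
   buf = String$(datalen, 0)
   res = RegQueryValueEx(key, valueName, 0, datatype, ByVal buf, datalen)
   If res = ERROR_MORE_DATA Then
      ' datalen now holds the required size in bytes, so try again
      buf = String$(datalen, 0)
      res = RegQueryValueEx(key, valueName, 0, datatype, ByVal buf, datalen)
   End If
   RegCloseKey key

   If res = ERROR_SUCCESS And datatype = REG_SZ Then
      ReadRegString = TrimAtNull(buf)
   End If
End Function
```

A call such as ReadRegString("SOFTWARE\Microsoft\Windows NT\CurrentVersion", "RegisteredOwner") would then return the owner name under NT, or an empty string if the value is missing.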

Deleting a Directory Tree

The next "beginners" example has to do with deleting a directory tree. Like many algorithms, the trick is in how you state the problem. What does it mean to delete a directory tree? It means that for each directory you must do the following:

  1. Delete all of the regular files in the directory.
  2. Delete the directory tree under each subdirectory.
  3. Delete each (now empty) subdirectory.

Step 2 is the key - the function that deletes a directory tree must also delete subdirectory trees. In other words, it must call itself. This is a technique called recursion.

The function can be implemented as follows:

Private Sub KillRecursive(thispath$)
   Dim nextdir$, nextfile$

   ' Delete any normal files
   Do
      nextfile$ = Dir$(thispath & "\*.*", vbNormal)
      If nextfile$ <> "" Then Kill thispath & "\" & nextfile$
   Loop While nextfile$ <> ""

   ' Recursively delete any subdirectories
   Do
      nextdir$ = Dir$(thispath & "\*.*", vbDirectory)
      ' Skip past . and ..
      Do While Left$(nextdir$, 1) = "."
         nextdir$ = Dir$()
      Loop

      ' Any real subdirectories?
      If nextdir$ <> "" Then
         KillRecursive thispath & "\" & nextdir$
      End If
   Loop While nextdir$ <> ""

   ' And delete the directory itself
   RmDir thispath
End Sub

The function follows the sequence described earlier. There are a couple of subtleties to the code. One common way to use the Dir$ function is to call it first with a path, then call it repeatedly with no parameters to retrieve the remaining entries. However, the Dir$ function is not, itself, recursive. You can't rely on the use of multiple calls to Dir$ without parameters because the code calls Dir$ again with a path parameter during the recursion process. It is also not clear if the function will correctly enumerate files and directories if they are deleted during the enumeration process. This algorithm avoids both problems by restarting the enumeration with the full path name each time.

The subdirectory deletion code skips past the directories named "." and "..", which refer to the current and parent directories respectively.

Note that this function will fail with an error if there are any hidden, system or read-only files in the directories. So you'll want to add error checking, or code to change the attributes of these files in order to delete them. You'll also want a clean way to exit the routine, as simply ignoring the error will lead to an infinite loop.
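One possible way to handle the attribute problem is sketched below. The routine name KillAllFiles is hypothetical, and the sketch assumes that clearing the attributes with SetAttr is acceptable for every file in the directory. It collects the file names before deleting anything, so nothing is removed while Dir$ is still enumerating, and it lets any remaining error propagate to the caller instead of looping forever.

```vb
' Hypothetical helper: delete every file in one directory, including
' hidden, system and read-only files. Subdirectories are left alone.
Private Sub KillAllFiles(ByVal thispath As String)
   Dim names() As String
   Dim count As Long, idx As Long
   Dim nextfile As String

   ' Collect the names first so nothing is deleted during enumeration
   nextfile = Dir$(thispath & "\*.*", vbReadOnly Or vbHidden Or vbSystem)
   Do While nextfile <> ""
      count = count + 1
      ReDim Preserve names(1 To count)
      names(count) = nextfile
      nextfile = Dir$()
   Loop

   For idx = 1 To count
      ' Clear the attributes so that Kill does not fail on the file
      SetAttr thispath & "\" & names(idx), vbNormal
      Kill thispath & "\" & names(idx)
   Next idx
End Sub
```

A call to KillAllFiles thispath could stand in for the first loop of KillRecursive. The caller would still want an On Error handler around the whole operation, since a file that is open in another program cannot be deleted this way either.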

At this point you may be wondering: "what about the Win32 API solution?". True, it is also possible to implement a directory tree deletion algorithm using API functions, but why bother? The pure Visual Basic solution is easy and reliable. A directory tree deletion routine is, by definition, very disk and operating system intensive, so it's unlikely that API calls will lead to any performance improvement. If you want your deleted files to be placed in the recycled bin, you will need to use the SHFileOperation API function, but the use of that function is the subject for another article.

Deleting Files in the Recycle Bin

Logic suggests that deleting files in the recycle bin should be easy - just apply the file deletion algorithm described above to the directory containing the recycled files and presto: the recycling bin is empty.

But we're talking about Windows here. And while Windows is a very logical operating system, that logic is occasionally twisted, convoluted, hidden, lost, undocumented, folded, spindled and mutilated - to put it mildly.

The problem is simple - where is the recycled directory (called recycler under NT 4)? Where are the files kept?

Let's say you have more than one logical hard drive on your system (I have eight, myself). Look in the recycling bin on one drive using Explorer, then delete a file and send it to the recycling bin using another Explorer window. (Note: this refers to the Windows Explorer, not Internet Explorer.) The file will appear in the recycled bin that you displayed earlier. In other words: the recycled bin under Windows 95 and Windows NT displays all deleted files, not just those for a particular drive.

How can this be? How can you navigate to what looks like the recycle bin on one drive, and see the contents of all of the recycled bins at once? Shouldn't explorer display the recycled directory only for the drive that you've selected?

What you're seeing here is a feature of explorer called a "name space". When you navigate to the recycled bin, you aren't really looking at a directory at all. You're looking at a COM object that is built into the explorer and is designed specifically to manage file recycling. This object uses a couple of standard interfaces to display the recycle bin name space in the explorer window, and that name space includes deleted files from all of your drives. The most familiar name space that most people see is the control panel - it looks like a directory in the explorer tree, but once you click on it you are looking at control panel applets instead of files.

Name spaces are managed in part using API functions from the Shell32.dll API. My initial thought was to use the SHGetSpecialFolderLocation API function to retrieve information about the recycled bin, then use the SHGetPathFromIDList function to obtain the appropriate directory path. This failed, I presume because there is no direct relationship between the recycled bin name space and a particular directory.

My next thought, which probably should have been my first thought, was to search MSDN for help. The results were disappointing - the recycled bin seems to be extremely poorly documented. I did find one article in the knowledge base, Q136517 dated August 1996, which discusses how the recycle bin stores files. Unfortunately, it discussed it mostly from the perspective of users trying to fix a corrupted system. It did provide some hints for programmers though. When combined with the shell name space documentation, it provided enough information to figure out what was going on. Keep in mind that what I'm about to describe is more of a theory than fact. It's based on experimentation and interpretation of some vague documentation. So I won't promise that it's 100% correct. It is, however, correct enough for the code that I'll show you later to work.

First, each drive does, in fact, have its own recycled directory. The directory seems to take two different forms depending on whether it is a FAT drive or an NTFS drive.

On FAT drives the directory is named "Recycled". On NTFS drives it is named "Recycler". Of course, this could also be a difference between Windows 95 and NT - I didn't pursue the research far enough to see whether the operating system or file system was the determining factor. The code takes the easy route of trying it each way. In either case the directory has both the system and hidden attributes set.

To understand the next part, you need to know that every shell namespace has a unique identifier which is a 16 byte GUID (globally unique identifier). When the windows explorer is browsing through directories, it checks to see if the directory contains a file named desktop.ini. This file can specify the GUID of a namespace, in which case the directory is considered to be a "junction point". Explorer does not display the directory using its usual file display mechanism. Instead, it passes responsibility for the display to the component that manages the namespace identified by the GUID. It even uses that component to obtain an icon for that directory, which is why the recycling bin appears in explorer as a garbage can instead of a file folder.

On FAT drives, the Recycled directory itself contains the recycled files. On NTFS drives (or NT), the Recycler directory seems to be an ordinary directory that contains one or more subdirectories, each with a very long and cryptic name. Those directories contain the desktop.ini file and thus form the junction points to the recycled bins.

There is one other file of importance to consider. The recycled directory on each drive contains a file named INFO which contains information about the recycled files. You'll find that the actual file names in the recycled directory have no relationship to the original names of the deleted files. This INFO file contains the original names and locations of the files, and is used by the namespace component to display the original file names in the explorer window instead of the name of the file in the directory. In other words: if you delete file xyz.doc, it will be stored in the recycled directory with a name like DC01.DOC. The INFO file will contain the original name, xyz.doc. A file browser (such as the one that follows) will see the disk name DC01.DOC, but xyz.doc would be displayed in the explorer window. The INFO file is only important for this application in that you don't want to delete it under any circumstances. Doing so can screw up your recycled bin. Naturally, you don't want to delete desktop.ini either.

This was enough information to create a recycled bin cleanup routine. The ShowRecycledRecursive routine actually displays the files in a recycled directory in a list box named List1, though if you uncomment the Kill statement it will delete them as well. It is called from the following code:

List1.Clear
Dim RecycledPath$
RecycledPath$ = m_Drive & "\"
If Dir$(RecycledPath & "recycler", vbSystem Or vbHidden Or vbDirectory) _
<> "" Then
   RecycledPath = RecycledPath & "recycler"
Else
   If Dir$(RecycledPath & "recycled", vbSystem Or vbHidden _
   Or vbDirectory) <> "" Then
      RecycledPath = RecycledPath & "recycled"
   Else
      MsgBox "No recycled directory found on the specified drive"
      Exit Sub
   End If
End If
ShowRecycledRecursive RecycledPath, 0

The code clears the list box and builds two paths to try. m_Drive is the drive that you want to examine. The routine tries both "Recycled" and "Recycler". If you've changed the name of the recycled bin, the function will fail. Note the use of the vbSystem, vbHidden and vbDirectory attributes necessary to find a hidden system directory. A complete recycled bin deletion routine would have to check each logical drive. Look at the LogDrvs.vbp example from chapter 13 of my Visual Basic 5.0 Programmer's Guide to the Win32 API for VB code that demonstrates how to retrieve a list of logical drives on your system.
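For readers without the book at hand, here is a rough sketch of one way to obtain the drive list using the GetLogicalDriveStrings API function. It is not the LogDrvs.vbp sample. The names ParseDriveList and GetDriveList are hypothetical, and the 255 character buffer is an assumption that comfortably covers 26 drive letters. The function fills a buffer with entries such as "C:\", each followed by a null character, with an extra null at the end of the list.

```vb
Private Declare Function GetLogicalDriveStrings Lib "kernel32" Alias _
"GetLogicalDriveStringsA" (ByVal nBufferLength As Long, _
ByVal lpBuffer As String) As Long

' Hypothetical helper: split a null separated list of root paths into
' an array of drive names without the trailing backslash ("C:").
' Returns the number of drives found.
Private Function ParseDriveList(ByVal buffer As String, _
drives() As String) As Long
   Dim pos As Long, start As Long, count As Long
   Dim item As String

   start = 1
   Do
      pos = InStr(start, buffer, vbNullChar)
      ' Stop at the end of the buffer or at the empty final entry
      If pos = 0 Or pos = start Then Exit Do
      item = Mid$(buffer, start, pos - start)
      If Right$(item, 1) = "\" Then item = Left$(item, Len(item) - 1)
      count = count + 1
      ReDim Preserve drives(1 To count)
      drives(count) = item
      start = pos + 1
   Loop
   ParseDriveList = count
End Function

' Hypothetical helper: returns the number of logical drives and
' fills the drives array with names such as "A:" and "C:"
Private Function GetDriveList(drives() As String) As Long
   Dim buffer As String, n As Long
   buffer = String$(255, 0)
   n = GetLogicalDriveStrings(Len(buffer), buffer)
   ' Zero means failure; a larger value means the buffer was too small
   If n = 0 Or n > Len(buffer) Then Exit Function
   GetDriveList = ParseDriveList(buffer, drives)
End Function
```

Each entry returned has the same form as m_Drive in the code above. The list includes floppy and CD-ROM drives, so a complete routine would probably want to skip removable drives (the GetDriveType API function can identify them) before calling Dir$ on each one.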

The ShowRecycledRecursive function starts with the recycled directory. The level parameter is used to track how many levels down you have recursed so that the file display in the listbox can be indented to indicate the depth of the directory structure.

Private Sub ShowRecycledRecursive(ByVal thisPath$, ByVal level%)
   Dim dirlist() As String
   Dim filelist() As String
   Dim r$
   Dim arraylen As Long
   Dim idx&
   ReDim dirlist(0)
   ReDim filelist(0)

   ' Get a list of files
   r$ = Dir$(thisPath & "\*.*", vbSystem Or vbHidden)

   Do
      If r$ = "" Then Exit Do
      If LCase$(r$) <> "desktop.ini" And LCase$(r$) <> "info" Then
         arraylen = arraylen + 1
         ReDim Preserve filelist(arraylen)
         filelist(arraylen) = r$
      End If

      r$ = Dir$()

   Loop While True

   ' Get a list of directories

   arraylen = 0

   r$ = Dir$(thisPath & "\*.*", vbSystem Or vbHidden Or vbDirectory)

   Do
      If r$ = "" Then Exit Do
      If Left$(r$, 1) <> "." Then
         If GetAttr(thisPath & "\" & r$) And vbDirectory Then
            arraylen = arraylen + 1
            ReDim Preserve dirlist(arraylen)
            dirlist(arraylen) = r$
         End If
      End If

      r$ = Dir$()

   Loop While True

' Now show files
   For idx = 1 To UBound(filelist)
      List1.AddItem String$(level * 2, " ") & filelist(idx)
      ' Uncomment this line to delete files
      ' Kill thisPath & "\" & filelist(idx)
   Next idx

' Now show directories (recursively) 
   For idx = 1 To UBound(dirlist)
      List1.AddItem String$(level * 2, " ") & dirlist(idx) & "(d)"
      Call ShowRecycledRecursive(thisPath & "\" & dirlist(idx), _
      level + 1)
   Next idx

End Sub

Unlike the earlier directory deletion routine, this routine is designed to display files as well as delete them. This means that you'll want to use the feature that allows the Dir$ function to be called multiple times to retrieve a list of files or directories. The information is stored in two arrays, one for files and one for directories. A slightly more efficient implementation would create both arrays in a single pass, but I wanted to keep the operations separate in this example for the sake of clarity.
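The single pass variation mentioned above might look something like the following sketch. The routine name ListEntries is hypothetical, and it keeps the same convention as the routine above of leaving element 0 of each array unused. One Dir$ enumeration with the vbDirectory attribute returns both files and directories, and GetAttr decides which array each entry belongs in.

```vb
' Hypothetical single pass variant: fills both arrays with one
' Dir$ enumeration. Element 0 of each array is left unused.
Private Sub ListEntries(ByVal thisPath As String, _
filelist() As String, dirlist() As String)
   Dim r As String
   Dim nfiles As Long, ndirs As Long

   ReDim filelist(0)
   ReDim dirlist(0)

   r = Dir$(thisPath & "\*.*", vbSystem Or vbHidden Or vbDirectory)
   Do While r <> ""
      ' Skip the current and parent directory entries
      If r <> "." And r <> ".." Then
         If GetAttr(thisPath & "\" & r) And vbDirectory Then
            ndirs = ndirs + 1
            ReDim Preserve dirlist(ndirs)
            dirlist(ndirs) = r
         ElseIf LCase$(r) <> "desktop.ini" And LCase$(r) <> "info" Then
            ' Ordinary file that is safe to show or delete
            nfiles = nfiles + 1
            ReDim Preserve filelist(nfiles)
            filelist(nfiles) = r
         End If
      End If
      r = Dir$()
   Loop
End Sub
```

The display and recursion loops from ShowRecycledRecursive could then run unchanged over the two arrays.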

Note that the desktop.ini and INFO file are intentionally ignored to prevent corruption of the recycled bin.

Once the arrays are built, they are processed in order. First the files are displayed or deleted. Next, each directory is recursively processed.

Experimentation shows that deleting files in this manner does, in fact, remove them from the recycled bin. But what about the information for that file in the INFO file? What happens when the file referenced by the INFO file no longer exists?

The truth is, I don't know. I haven't seen any problems with this approach. I have to hope that the recycled bin manager detects that the file is missing and ultimately removes the information from the INFO file.

Conclusion

It's one of the curious things about complex systems like Windows that a few simple questions can take you in all sorts of unforeseen directions. In this case they've led us from differences between operating systems, to the system registry and API functions, to file operations where API functions are a waste of time, to some of the least documented features of the Windows explorer where you're left to depend on experimentation and guesswork.



Written and Copyright by: Dan Appleman
October '97

Copyright 1997 by Daniel Appleman. All Rights Reserved.

