Persistence of Data
The Missing Technology
All programmers have to deal with the problem of persisting data, and there is a tendency to jump straight to a solution without considering the problem in the context of the requirements of a particular software project. It is true, as you have seen, that application configuration information is usually best handled using either a private initialization file or the system registry. It is true that record storage is usually best handled by a database. It is true that simple documents are usually best handled by reading and writing a disk file directly. But you have also seen that each of the four technologies that we've discussed can be applied to any of these persistence tasks, and that the obvious choice is not always the best one.
It is also clear from the previous discussion that there is one type of persistence problem for which none of the technologies is ideal. What is the best way to store complex documents that contain different types of information? Implementing a complex file format is a great deal of work. And databases are designed for records that are similar to each other - using them to store arbitrary types of information of arbitrary length is possible in many cases, but can be even more work.
Until recently, there has been no obvious solution to this problem available to Visual Basic programmers.
Before looking at OLE structured storage in the context of this problem, let's pause for a moment and consider the characteristics of the ideal complex document storage system - a system able to manage a complex document file. Assume for the moment that the term "object" refers to any block of data that you care to define.
The complexity of implementing such a system on your own is substantial, and is one of the reasons that it is so tempting to look for alternatives to creating a private file format. At the same time, this description might sound familiar - in fact, a storage system that meets these criteria is present on every Windows- and DOS-based system.
It is the file system.
Think for a moment of files as objects. A file system can handle many thousands of objects. Each file can range from zero bytes to gigabytes in size. You can certainly add and remove files without fear of one file interfering with the next. And writing into one file cannot corrupt any other. Files are named, and can be organized in a hierarchy of directories.
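To make the analogy concrete, here is a minimal sketch in Python that treats a directory tree as a store of named objects. The function names (put_object, get_object, remove_object, list_objects) and the object names used in the demo are my own assumptions for illustration; they are not part of any product or API discussed here.

```python
import tempfile
from pathlib import Path
from typing import List


def put_object(root: Path, name: str, data: bytes) -> None:
    """Store one named object as a file; subdirectories provide the hierarchy."""
    path = root / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)  # replaces this object only; others are untouched


def get_object(root: Path, name: str) -> bytes:
    """Read an object back by name."""
    return (root / name).read_bytes()


def remove_object(root: Path, name: str) -> None:
    """Delete one object without affecting its neighbors."""
    (root / name).unlink()


def list_objects(root: Path) -> List[str]:
    """Return every object name under root, in sorted order."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def demo() -> List[str]:
    # Hypothetical object names, chosen only for this example.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        put_object(root, "chapter1/text", b"Once upon a time")
        put_object(root, "chapter1/figure", b"\x00\x01\x02")
        put_object(root, "summary", b"")  # a zero-byte object is fine
        put_object(root, "chapter1/text", b"Rewritten")  # resize one object
        assert get_object(root, "chapter1/figure") == b"\x00\x01\x02"
        remove_object(root, "summary")
        return list_objects(root)
```

Calling demo() rewrites one object, confirms that its neighbor is unchanged, removes another, and returns the names that remain. Everything the sketch relies on (naming, hierarchy, independent sizing, isolation between objects) is supplied by the file system itself rather than by the code.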
In fact, you will find some applications that solve the problem of complex document storage by using the file system - dividing their documents into multiple disk files. An example that I worked with recently is Corel Ventura Publisher, in which one publication can consist of multiple chapters each of which contains links to documents of different types. A single publication can thus be made up of hundreds of files, which are typically organized in one or more directories.
The disadvantages of this approach become clear when you try to copy a document to a floppy disk or to another directory, or to send it via email. Copying the publication requires a special utility that can search out all of the linked files and copy them to the correct destination. If even one file is missing, at worst the entire publication fails to load; at best it loads with information missing.
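The kind of copy utility described above can be sketched roughly as follows. This is an illustration under a loud assumption: the "publication" here is a hypothetical plain-text manifest that lists one linked file per line, relative to its own directory. That is not the actual Ventura Publisher format, and copy_publication is a name invented for this example.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Tuple


def copy_publication(manifest: Path, dest_dir: Path) -> List[str]:
    """Copy a manifest and every file it links to; return the missing links."""
    src_dir = manifest.parent
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(manifest, dest_dir / manifest.name)
    missing = []
    for line in manifest.read_text().splitlines():
        link = line.strip()
        if not link:
            continue  # skip blank lines in the manifest
        source = src_dir / link
        if not source.is_file():
            missing.append(link)  # the copied publication will be incomplete
            continue
        target = dest_dir / link
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
    return missing


def demo() -> Tuple[List[str], List[str]]:
    # Hypothetical file names; chap2.txt is deliberately never created.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        (src / "art").mkdir(parents=True)
        (src / "chap1.txt").write_text("chapter one")
        (src / "art" / "fig1.img").write_bytes(b"\x00")
        (src / "book.pub").write_text("chap1.txt\nart/fig1.img\nchap2.txt\n")
        dest = Path(tmp) / "dest"
        missing = copy_publication(src / "book.pub", dest)
        copied = sorted(
            p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
        )
        return missing, copied
```

In the demo, the manifest refers to three files but only two exist, so the utility copies what it can and reports chap2.txt as missing. Even this toy version has to understand the document format well enough to find every link, which is exactly the burden a single-file document avoids.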
Which brings us to OLE Structured Storage.