Quattro Pro for DOS revisited: an obsolete format no more?

in 2014 I wrote a post on the Quattro Pro for DOS spreadsheet formats. This documents my attempts at reading a few old Quattro Pro for DOS spreadsheets from my personal archives with modern (at the time) software. Back then, neither Microsoft Excel nor LibreOffice Calc supported these formats. Only the then-current Quattro Pro X7 was able to read the files, but there were several issues related to formatting, rendering of charts, and the handling of external references. Based on these tests, I argued that Quattro Pro for DOS could be a case of that rarest species in the world of digital preservation: a file format that had truly become obsolete!
While working on my previous post about Microsoft Multiplan spreadsheets, I was surprised to see that LibreOffice actually supported this ancient format. A quick glance at LibreOffice’s feature matrix showed that LibreOffice has added support for many legacy formats over the past years, including Quattro Pro for DOS1. This made me wonder if the conclusions of my 2014 post would still hold up. This folow-up post puts this to the test.
Quattro Pro
Quattro Pro is a spreadsheet program that was first released in 1988. It’s still around today as part of the WordPerfect Office suite. A number of file formats are associated with the software. As with my 2014 post, the scope of this follow-up is restricted to the old Quattro Pro for DOS formats:
Test files
I used the same test files as in my 2014 post (which are all part of the OPF Format Corpus):
These are all files from my personal archives, and I originally created them in 1996.
Emulation of original software
To evaluate LibreOffice Calc’s handling of my test files, I emulated the original software, and used the emulation as a reference. WinWorld has installers for most old Quattro Pro versions. From what I recall I created most of my files with Quattro Pro 5.0 for DOS, so I downloaded the installers of that version. I then installed the software in DOSBox-X. My initial attempts at this failed, because the installer wouldn’t recognise the second installation disk. As a workaround, I simply copied the files from both installation disks to a temporary folder, and then ran the installer from there. This resulted in a successful installation.
Simple numbers and text
I started out with this WQ1 spreadsheet, which only contains simple numerical and text data. The DOSBox-X emulation rendered this file without any apparent problems:

You may notice that the first column appears as shaded. Apparently the file was originally saved with the “locked titles” option enabled, which can be used to lock the first row or column during scrolling. This can easily be disabled.
Next I opened the same file in LibreOffice Calc2:

Calc reads the file without any problems. There are some minor differences compared to the emulation: Calc doesn’t “lock” the first column, the date format is slightly different, and Calc shows more decimal places for the numbers in the J, K and L columns. All of these are mostly cosmetic differences, and for practical purposes Calc’s rendering looks perfectly usable. The results were largely identical for the WQ2 version of the same spreadsheet.
Simple formulas, charts
Next up was KS4001.WQ2, which contains some formulas and a chart. Here are two screenshots of the DOSBox-X emulation of this file:


And here’s a screenshot that shows what LibreOffice Calc makes of this:

Compared to the emulated version, the most notable difference is the formatting of the chart, but overall the Calc rendering gives a good representation of the data in the original spreadsheet. Interestingly, in my 2014 analysis the then-current Quattro Pro X7 version failed to render the chart correctly:

It’s reassuring to see that LibreOffice Calc now does a much better job!
External references
Finally there’s this WQ2 spreadsheet, which contains references to another spreadsheet (KSBASE.WQ2) that is located in the same directory. This file exposed some interesting issues, and the following sub-sections are an attempt to demonstrate these as best as I can. Be warned though that what follows is slightly convoluted, which is due to a combination of the nature of the spreadsheet, and the fact that both issues are interrelated.
Quattro Pro emulation
First I opened this file in the emulator. On loading, Quattro Pro detects that it contains external references, and prompts the user what to do with them:

Here I chose “Load Supporting”, after which the file loaded like this:

The external references, which are in cells E4, G4, B5, E5 and H5, are all resolved correctly. To double-check this, I also tried to open the file after removing the referenced file. This resulted in “NA” values for all cells with references to this file, as well as for all cells that depend on it.
I initially assumed that the shown “ERR” values were related to the external references, but on closer inspection this doesn’t seem to be the case at all. Instead, they simply result from the fact that the C column doesn’t contain any data.
Initial behaviour in LibreOffice Calc
Opening the file in LibreOffice Calc also results in a notification message that “Automatic update of external links has been disabled”:

Note how the values in cells E4 and G4 (which both contain external references that return a text string in the emulation) are shown as “#REF”. Interestingly though, cells B5, E5 and H5 also contain external references, but the (correct!) numerical values are shown nevertheless. This could mean either of the following:
- The numbers shown are “cached” values that were saved alongside the formulas at the time of the file’s creation.
- The numbers are imported from the linked spreadsheet (even though this shouldn’t really happen).
To test which of these is true, I created a copy of the spreadsheet in an empty directory. When I opened it in Calc, it showed the exact same behaviour. This means the numbers really are cached values. This does raise the question why Calc only shows cached numbers, and not cached text strings. I will return to this in the final sections of this post.
Restoring the external references
First let’s go back to our original spreadsheet. After clicking on “Allow updating”, Calc showed a notification that the external file that is referenced could not be loaded:

The reason for this is, that all references to the external file are defined as a file name without a file extension (i.e. KSBASE instead of KSBASE.WQ2). A quick look at the LibreOffice documentation shows how to fix this:
- In Calc’s “Edit” menu, go to “Links to External Files…”. This brings up a link editor dialog.
- Select the link you want to change, and click on “Modify …”
- In the file dialog that appears, select the file you want to link to (in this case KSBASE.WQ2).
- Click on the “Update” button, and then close the dialog.
Behaviour after recalculate
With the link restored, I opened the “Data” menu, and selected “Calculate/Recalculate Hard”. This forces all cells to be re-calculated. The results of this operation were not quite what I expected though:

Even though the “#REF” values are now replaced by actual values from the external spreadsheet, these values are different from those in the original Quattro Pro rendering. For example:
- The value of cell B5 is supposed to be a number (variable DEPTH, 30), but instead it is now a text string (“sal”).
- The value of cell G4 is supposed to be a text string (variable SOIL, “c”), but instead it is now a numerical value (3133.03).
So what’s going on here?
VLOOKUP function behaviour in Quattro Pro and Calc
To understand why this happens, it is first important to know that the external references in rows 4 and 5 of the spreadsheet are all table lookup operations. As an example, the Quattro Pro formula in cell B5 (copied here from the emulated version) is:
@VLOOKUP($B$4,[KSBASE]A:$A$2..$I$83,6)
This function has 3 arguments:
- A lookup value.
- A data block in the external spreadsheet, where the first column is used as an index column.
- The column number in the data block that contains the requested value.
The equivalent function in LibreOffice Calc is VLOOKUP. Here’s a portion of the relevant data in the external spreadsheet (which is used as the data block):

Note how column G contains the expected value (30) for the DEPTH variable (highlighted here in green), but this is the 7th column in the data block, not the 6th! The 6th column (variable SOIL) contains the text string “sal” (highlighted here in red), and this is also what is returned by Calc’s VLOOKUP function. Other variables that are based on this function are similarly offset by exactly one column.
As it turns out, Quattro Pro’s @VLOOKUP function and Calc’s VLOOKUP function each treat the data block geometry slightly differently: in Quattro Pro, the index of the first column is defined as 0, whereas it is 1 in Calc. This means that when you use Calc to open a Quattro Pro for DOS spreadsheet that uses the @VLOOKUP function, the values that are returned by Calc’s VLOOKUP will be offset by exactly one column3.
I created a small, self-contained test file that demonstrates this issue, and wrote some accompanying documentation. Both are available here in the OPF Format Corpus.
Handling of cached formula values
As I explained before, my initial attempt at opening the spreadsheet (before recalculation) showed that Calc loads “cached” values for externally referenced cells, but only if these values are numbers. In case of strings, a “#REF” value was shown in my tests. There are two plausible explanations for this. Either:
- Quattro Pro stores a cached value if the formula result is a number, but not if it is a text string, or,
- Quattro Pro stores a cached value irrespective of the formula result’s data type, but Calc doesn’t correctly handle it if it is a string.
Again, this is something we can easily test by isolating a copy of the spreadsheet in an empty directory, and then opening that copy in the Quatro Pro emulation:

Here we see that Quattro Pro shows the (correct) cached value for each cell with an external reference, irrespective of whether it is a number or a text string. This leads to the conclusion that Quattro Pro does store cached string values, but that Calc doesn’t correctly handle them. I was able to confirm this with some additional tests on another self-contained test file. For brevity I won’t go into details here, but the documentation of these tests and the test file can be found here.
Combined effect of VLOOKUP and cached values issues
The cached values issue does not only affect cells with external references, but any cell that contains a formula. We can see this in the previously created VLOOKUP demo file, which renders like this in the Quattro Pro emulation:

The top 11 rows here contain a 5-column block of data, which is queried with the @VLOOKUP function in row 15. When I opened this file in Calc, this initialy produced:

Note how the values in cells D15 and E15 (which both result in a text string in the emulation) are different from the original rendering, while the values in B15 and C15 (both numbers) are correct. After re-calculating (“Data/Calculate/Recalculate Hard”), I got this:

This is the familiar column shift pattern that results from the @VLOOKUP compatibility issue. However, by itself this doesn’t explain why cells B15 and C15 showed the correct values before the recalculation!
My best guess is that on opening a Quattro Pro for DOS spreadsheet, Calc’s intended behaviour is to load the cached cell values, instead of recalculating the underlying formulas. This would be entirely sensible for legacy spreadsheet formats, since it is highly likely that not all functions in their original creation software are completely compatible with Calc. Since the tests in the previous section showed that Calc is unable to read cached values that are text strings, I suspect that in this particular case, Calc recalculates those cells as a fallback. This then results in a rendering that contains a mix of both cached and recalculated values. In most situations this would go unnoticed by the user, but not here, since the recalculated values are affected by the @VLOOKUP compatability issue.
Conclusions
In the concluding section of my 2014 post I wrote:
[N]o modern-day software is able to correctly handle the Quattro Pro for DOS formats. Add to this that the Quattro Pro for DOS formats are proprietary with (as far as I’m aware) no publicly available specifications, and I think we have a pretty strong candidate for a format that may be (nearly) obsolete.
More than ten years onward, things are looking much better now. LibreOffice Calc, which is both free and open-source, was able to read all my old WQ1 and WQ2 files, and it did a better job at this than the (proprietary/closed) Quattro Pro X7 version I used in 2014. The tests with my external reference spreadsheet highlighted two issues though:
- An incompatability between a legacy Quattro Pro for DOS function and its modern equivalent in Calc.
- A lack of support for cached formula values that are text strings, which in turn triggers inconsistent recalculation behaviour.
Realistically, I doubt the compatibility issue itself can (or even should!) be fixed in Calc. Given the large number of legacy spreadsheet formats that Calc supports, and the wide array of functions within each of these formats, there may be many other, similar compatability issues lurking beneath the surface. That being said, the effects of such issues could be mitigated by addressing the second issue, specifically:
- By adding read support for cached formula values that are not numbers (text strings, possibly other types as well).
- By reconsidering the current recalculation behaviour on opening a file. If no cached value can be retrieved, I think Calc shouldn’t automatically recalculate the formula as a fallback, but instead just let the user know that it is not available. (Of course, after this it’s up to the user if they want to manually force a recalculation or not.)
With these changes, Calc’s rendering of a Quattro Pro for DOS file would provide the best approximation of its state when it was last saved.
I’ve submitted a bug report about both issues, where I also mentioned the above suggestions.
The results of my tests also underline -once again- the risks of migrating or “normalizing” legacy spreadsheets to some modern format. On their own, each of the reported issues can already lead to lost or altered data in a migration action. The combined effect of both issues can result in more complex changes that would be nearly impossible to trace back to the source data. At the very minimum, the original source file should always be kept, as well as a full audit trail of the migration process. Both should also be made available to the user, so they can use it to make an informed judgment of the accuracy of the migrated data.
Just like Microsoft Multiplan, the Quattro Pro for DOS case confirms David Rosenthal’s 2009 assertion:
Clearly, formats with open source renderers are, for all practical purposes, immune from format obsolescence.
So would I still call the Quattro Pro for DOS format(s) obsolete? Probably not. This would make it the first example I’m aware of, of a file format that was effectively obsolete 10 years ago, but isn’t anymore!
Further resources
-
Quattro Pro for DOS: an obsolete format at last? (original 2014 post)
-
Opening Johan’s Quattro Pro files in Quattro Pro 6 for Win 3.11 by Euan Cochrane
-
Legacy spreadsheet sample files in the OPF Format Corpus (this includes the WQ1 and WQ2 files used here)
-
VLOOKUP compatibility demo (showcases VLOOKUP compatibility issue)
-
External reference demo (showcases cached values issue)
-
LibreOffice bug report on inconsistent handling of cached values in Quattro Pro for DOS (WQ2) spreadsheets
-
Here’s a snapshot of the same feature matrix from January 2015. ↩
-
The screenshots in this post are all from the (rather old) 6.4.7.2 version; I later re-did some of my tests in LibreOffice 24.2.7.2, which gave identical results. ↩
-
This most likely applies to quattro Pro’s @HLOOKUP function as well. ↩
Comments