It is perhaps indicative of the modern trend towards transient information – no doubt born of the Internet’s most persistent by-product, the notion that all information should be free (and therefore of no value) – that one of last month’s big news stories is now all but forgotten.
What is worse is that it seems to have been deliberately forgotten by most of the world’s largest and most influential software companies. That possibility suggests they might just be showing far more interest in turning a dollar now than in the future of information retention for the education and entertainment of generations to come.
That story, of course, was Google Vice President Vint Cerf stating that we were in serious danger of losing vast amounts of valuable data because of the dramatic pace of technological advance. In other words, the race to come up with ever-better ways to format and store data will not just obsolete old storage technologies and file formats. It risks losing them – and the data held in them – forever.
Two thoughts can be found lurking around this particular story. One is the fact that Cerf felt the need to make the point in the first place. If you think about it, data storage is a pointless exercise if every technology change risks making all previous storage unreadable and inaccessible. It should be a no-brainer to ensure that every development comes with not just the necessary file-reading capabilities, but also a routine that can search out the to-be-obsoleted files on any storage device the user has and reformat them without data loss or corruption.
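For what it is worth, such a routine need not be complicated. Here is a minimal sketch of what that kind of migration sweep might look like – purely illustrative, not any vendor’s actual tool. The format registry is a hypothetical assumption, and the one sample converter shown simply re-encodes a legacy Windows-1252 text file as UTF-8; the original file is only retired once the new copy demonstrably exists.

```python
from pathlib import Path


def cp1252_text_to_utf8(src: Path, dst: Path) -> None:
    """Sample converter: re-encode a legacy Windows-1252 text file as UTF-8."""
    dst.write_text(src.read_text(encoding="cp1252"), encoding="utf-8")


# Hypothetical registry: obsolete extension -> (replacement extension, converter).
CONVERTERS = {
    ".txt_legacy": (".txt", cp1252_text_to_utf8),
}


def migrate(root: Path) -> None:
    """Walk a storage location and rewrite files held in flagged formats."""
    for src in root.rglob("*"):
        entry = CONVERTERS.get(src.suffix)
        if not (src.is_file() and entry):
            continue
        new_ext, convert = entry
        dst = src.with_suffix(new_ext)
        convert(src, dst)
        # Keep the original until the new copy exists and has content.
        if dst.exists() and dst.stat().st_size > 0:
            src.rename(src.with_name(src.name + ".migrated"))


if __name__ == "__main__":
    migrate(Path.home() / "Documents")
```

The point of the sketch is simply that the pieces already exist: the hard part is not the code, it is getting vendors to ship and maintain the converters.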
The other is the deafening silence that seems to have come from the industry… no howls of anguish that yer man is well wide of the mark, no protestations that the means to solve this exist, no moves towards a lingua franca for each type of data.
Increasingly, it seems to me that they would rather take the money from getting (obliging?) everyone to change to the latest and greatest ‘cool’ storage formats, and in the process obsolete not only all the old formats but also the data they hold. It’s only old data, after all: bring on the sexy, cool new data.
But it is not ‘data’, it is history. The smart-ass response to this, of course, is that human history shows that humans find it impossible to learn anything from history, so it is all irrelevant junk that can just be erased.
But sometimes I wonder just how necessary it is – to us as users – that technology marches on. To the vast majority of users, has the development road map between, say, Microsoft’s Word 3 and Word 2010 created that many genuinely new ways of writing down information? Yet there has been a new version to buy every couple of years.
For the software vendors this approach has the obvious advantage of keeping revenue-earning potential up, and most of them have spent large amounts of money keeping users in the expectation that new technologies they ‘simply must have’ will always be coming along.
To be sure, the thought has crossed my mind that, as a Google VP, Cerf may even be on a marketing wind-up campaign for a near-future service announcement from his employer.
But does any of that answer the basic question of ‘why?’ Why do users want to do anything with the technology? In terms of data storage that question becomes: ‘why does anyone want to save this stuff anyway?’
There are as many answers to that as there are people saving data, but I suspect none of them features words such as “because the storage technology is so damned cool”. And if anyone does have a transitory flash of such a thought, you can guarantee they will be thinking it about some other storage technology next week.
In practice they want to save it because they know they will want to refer back to it at some time, either for the education or re-education of themselves and future generations, or just for the entertainment value of good memories. I can still read extracts of my family history from the pages of my father’s old family Bible, yet I can’t always find a way of reading files I created ten years ago. Only this morning my PC asked me how I wanted to view a plain text file – and then mangled its on-screen presentation anyway.
So maybe there needs to be a rule – and certainly a rule that all customers need to apply to their future software purchases: no new storage-format technology gets bought unless it offers clear, simple and non-obsoletable backward compatibility with all relevant existing formats. If one can go to the British Library reading room and work with books and documents of n-hundred years ago, then the technology vendors have to understand and accept their responsibilities to the future histories of their millions of customers.
One answer – and a leaf they could take from the open source community’s book – is to form an industry-wide format-compatibility standards body, an organisation which not only recognises the problem but develops the appropriate technologies and ‘persuades’ vendors to offer and adhere to them.
Indeed, if they don’t do this they could drastically impede the development of – and the need for – new applications and new versions of established applications. Once users start ‘losing’ data by having it in now-unreadable formats, they may well start sticking with the old applications… just like n-million businesses did with data written for XP-based applications.
Bet they don’t, however.