Michael Horowitz

Defensive Computing

Linux file integrity: the same as Windows

My previous posting described how Windows 7 lets two programs update the same file at the same time, with the inevitable result: data corruption. It's an accident waiting to happen. And the same is true of Linux.

KUBUNTU

Today, I tested with Kubuntu version 8.10 running KDE 4.1.2, the Kate text editor version 3.1.2 and OpenOffice.org Writer version 2.4.1.

I tested two instances of Kate running concurrently and editing the same file. I also tested Kate and Writer simultaneously editing the same file.

The results were the same as with Windows 7, file corruption. Not once was I warned about editing a file that was already being edited by another application.

For example, I opened a test file in Kate, then in Writer. I updated the file in Writer, saved it and exited Writer. Then I updated the file in Kate, saved it and exited Kate. Opening the file again, in either program, shows just the updates from Kate, the updates made in Writer were lost.
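The sequence above can be reproduced without either editor. What follows is only a minimal sketch, assuming each program reads the whole file into its own memory buffer and rewrites the whole file on save; the buffer names are hypothetical stand-ins, not code from Kate or Writer.

```python
import os
import tempfile


def simulate_lost_update():
    """Replay the open/open/save/save sequence with two in-memory buffers."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("original\n")

        # Both "editors" load the file into their own buffer.
        with open(path) as f:
            kate_buffer = f.read()
        with open(path) as f:
            writer_buffer = f.read()

        # The Writer stand-in edits its buffer and saves first.
        writer_buffer += "line added in Writer\n"
        with open(path, "w") as f:
            f.write(writer_buffer)

        # The Kate stand-in edits its stale buffer and saves second,
        # replacing the whole file with its own copy.
        kate_buffer += "line added in Kate\n"
        with open(path, "w") as f:
            f.write(kate_buffer)

        with open(path) as f:
            return f.read()
    finally:
        os.remove(path)


final_text = simulate_lost_update()
print(final_text)
```

The file that remains is exactly what the second saver wrote, so the first saver's line is gone.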

To further show how the operating system fails to protect its files, I opened a file in Writer, updated it, saved the updates, left Writer running, switched to the Dolphin file manager and deleted the file being edited. I also started Writer, created a new file, saved the file, switched to Dolphin and was able to delete the file while Writer was still running. This repeats the results I had earlier found in both Ubuntu 7.04 and Windows XP.

Windows 7 also lets you delete a file while it's in use. I opened WordPad, created a new file, saved it, then was able to delete it using Windows Explorer while WordPad was running. The same thing happens with IrfanView - Windows 7 deleted a file that IrfanView had just saved while IrfanView was still running.
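On Linux, deleting a file removes its name from the directory; a program that still holds the file open can keep reading and writing it. The sketch below assumes a POSIX system and a hypothetical program that keeps its file handle open. Whether Writer or gedit actually hold the file open while editing is not something these tests establish.

```python
import os
import tempfile


def delete_while_open():
    """Remove a file's name while a handle to it is still open (POSIX only)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    with open(path, "w+") as f:
        f.write("saved by the editor\n")
        f.flush()

        # This is roughly what a file manager does when asked to delete the file.
        os.remove(path)
        name_gone = not os.path.exists(path)

        # The open handle still reaches the data until it is closed.
        f.seek(0)
        still_readable = f.read()
    return name_gone, still_readable


name_gone, still_readable = delete_while_open()
print(name_gone, repr(still_readable))
```

Once the last open handle is closed, the data is gone for good, which is why the deletion goes unnoticed until the next attempt to open the file.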

UBUNTU

Kubuntu is my least favorite distro, so I also ran these tests on Ubuntu 8.10 running GNOME 2.24.1, gedit 2.24.2, OpenOffice.org Writer 2.4.1 and version 2.24.1 of the Nautilus file browser.

While gedit was editing a file, Nautilus let me delete it. Likewise, a file could be deleted while Writer was editing it.

To test concurrent updates, I opened a file with Writer, then opened it with gedit, updated the file in gedit, saved it and exited gedit. Then I updated the file in Writer, saved it and shut down Writer. As expected, the updates made in Writer clobbered the changes made in gedit.

HIGHER END APPLICATIONS

Higher end applications can offer the integrity that the operating system lacks.

On Windows, for example, my favorite text editor, Notepad++, will not create two instances of itself each editing the same file. The same goes for gedit under Ubuntu.

Both applications also detect changes made by another application. Below is the warning issued by gedit when it detected that Writer had updated the file it was also updating.

Notepad++ issues a similar warning under the same circumstances (it's far too wide, however, to fit in this web page and still be readable).
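One way an application could detect that another program has changed a file is to remember the file's modification time and size when it loads the file, then compare them before saving. This is only a sketch under that assumption; `TrackedFile` is a hypothetical helper, and neither gedit nor Notepad++ is claimed to work exactly this way.

```python
import os
import tempfile


class TrackedFile:
    """Hypothetical helper: remembers how the file looked on disk at load time."""

    def __init__(self, path):
        self.path = path
        with open(path) as f:
            self.text = f.read()
        self.loaded_stamp = self._stamp()

    def _stamp(self):
        st = os.stat(self.path)
        return (st.st_mtime_ns, st.st_size)

    def changed_on_disk(self):
        return self._stamp() != self.loaded_stamp

    def save(self, force=False):
        # Refuse to overwrite a file that changed since it was loaded,
        # unless the caller explicitly asks for it.
        if self.changed_on_disk() and not force:
            return False
        with open(self.path, "w") as f:
            f.write(self.text)
        self.loaded_stamp = self._stamp()
        return True


fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as f:
    f.write("original\n")

editor = TrackedFile(path)
editor.text += "edit made in this editor\n"

# Another application appends to the file behind the editor's back.
with open(path, "a") as f:
    f.write("edit made elsewhere\n")

refused = not editor.save()
with open(path) as f:
    on_disk = f.read()
os.remove(path)
print(refused)
```

A real editor would show a warning at the point where `save()` returns `False` and let the user choose between reloading and overwriting.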

FINAL THOUGHTS

Commenters to the prior Windows 7 posting argued in favor of the current approach, data integrity be damned. Many people look at it as an application issue. I see it as an operating system issue. The operating system should protect its files from mistakes made by users (deleting a file while it's in use) and applications (editing a file that another application is already editing).

Still, I seem to be in the minority. Perhaps we get the operating systems we deserve.  

Backup, backup, backup.



NOTE: Linux supports multiple file systems, and another file system might offer better integrity; I don't know. I'm not a Mac person, so I can't run these tests on OS X.

 

What People Are Saying

Petition Denied

There is dispute here over what an operating system should do, and what should be under the sole jurisdiction of applications. I think reasonable people can disagree about that. I wish some of these comments were in a more civil tone, but otherwise I see nothing terribly wrong here. Carry on -- nicely please.

Mandatory file locking would cripple modern computing

If, as you claim, operating systems are flawed because they don't implement mandatory file locking then how do you explain the fact that all modern OS distributions don't do this? Is there some kind of mass delusion going on here on the part of OS manufacturers? I think not.

If modern operating systems were to enforce mandatory file locking for every app, just think of the unacceptable performance hit on any multi-threaded application that needed to share info on disk between its threads... The whole thing would grind to a halt.

File locking ain't so great even for the end user. I remember many years ago when I first started my computer science education on an old ICL mini running George 3 OS. That enforced file locking. We were forever running around asking people what files they had open just so we could edit our source code. It was ridiculous.

There are much more elegant ways to avoid the trivial examples you're citing in your articles. The apps you've chosen to use in your examples don't do file locking because they were not designed to be used in collaborative environments. To say that design choice is the fault of the OS is just ridiculous.

Databases

File locks have their place. Databases are for serious concurrent updating.

This is not a bug...

The fact that the file descriptors/handles are being opened up by more than one application signifies that the applications utilized are not opening up the file in blocking mode. You can "man 2 open" for more details.

There are situations where this may be ideal but to set it as default can really impact and limit functionality; especially when it comes to clustering, database access, etc.

That is why I say that this is normal and if a single user was foolish enough to open up the same file through two applications or delete it while editing then they need to immediately stop using a PC.

not a bug

Actually Linux provides the necessary means. It is used for critical system files. In Ubuntu, for example, /var/lib/dpkg/lock can only be opened by one process, which makes sense, since you would want your installed package list to match the actual content of your system and not have two package managers manipulate it at the same time.
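A lock file like this is usually protected by an advisory lock: any process may still open the file, but only one at a time can hold the exclusive lock, and only programs that ask for the lock are held back. Below is a minimal sketch using `flock` on a temporary file, assuming a Linux or other POSIX system. `try_exclusive_lock` is a hypothetical helper, and dpkg's own locking mechanism may differ in detail.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Return an open file holding an exclusive advisory lock, or None if taken."""
    f = open(path, "a")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f


fd, lock_path = tempfile.mkstemp(suffix=".lock")
os.close(fd)

# Each call opens the file separately, much as two package managers would.
first = try_exclusive_lock(lock_path)
second = try_exclusive_lock(lock_path)
first_got_lock = first is not None
second_got_lock = second is not None

# Closing the file releases the lock, so a later attempt succeeds.
first.close()
third = try_exclusive_lock(lock_path)
third_got_lock = third is not None
third.close()
os.remove(lock_path)

print(first_got_lock, second_got_lock, third_got_lock)
```

The lock only constrains programs that cooperate by requesting it; a text editor that never calls `flock` could still overwrite the file.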

For editing purposes, you assume the user is wise enough not to edit files he has the right to write to in two different programs at the same time, unless he has a purpose for doing so.

There is also no file corruption as several other users have also commented upon. There would be corruption if the content would be mixed or unreadable. It's just the content saved latest, which is exactly as it should be.

That's funny: When I run

That's funny:

When I run more than three applications at the same time on my DOS 3.0 box, the exact same problem occurs. The only thing that helps is upgrading the sound speed.

(NOTE: The video jumpers were totally unaffected by this experiment.)

This is a complete fabrication ...

"The results were the same as with Windows 7, file corruption. Not once was I warned about editing a file that was already being edited by another application.

For example, I opened a test file in Kate, then in Writer. I updated the file in Writer, saved it and exited Writer. Then I updated the file in Kate, saved it and exited Kate. Opening the file again, in either program, shows just the updates from Kate, the updates made in Writer were lost."

This is a complete fabrication. When you save the file in oowriter a dialog box pops up in Kate stating "File Changed on Disk" which gives you several options including "View Difference", "Overwrite", "Reload", "Ignore", or "Cancel".

If you open the file in both programs and save the file first in Kate, then in oowriter at that point only the edits in oowriter would be saved. This is the EXACT OPPOSITE of your little example.

Since Kate is part of the KDE desktop it performed exactly how it should with regards to file changes from multiple applications. Even using Kwrite, another KDE application, produced the same "File Changed on Disk" dialog box that Kate produced.

The lack of warning from oowriter on file saving is not the fault of Linux or KDE, but the fault of Open Office. If you downloaded the Windows or Mac version of Open Office the same problem would exist. It has absolutely NOTHING to do with the operating system.

Michael did you think that nobody would test if your little example wasn't valid? You're completely incompetent. How CW continues this blog of yours is a mystery.

He's right in that kate in

He's right in that kate in KDE 4.2.1 and Writer by default don't ask whether to replace the file on disk, if it has changed since opening. Even in this case though, there's no file corruption, the file on disk is the file you requested to save.

Something completely different ...

"He's right in that kate in KDE 4.2.1 and Writer by default don't ask whether to replace the file on disk, if it has changed since opening."

The file is opened as a tmp in cache so no changes to the original have been done. Changes only take effect when the file has been written to disk. This is not a bug, but how the system was intended to work.

His specific example stated that no warnings were given that the file had changed, which is false. Kate and Kwrite will give a warning that the file has changed when committed to disk.

If you want applications to be aware of changes made by other programs that have not yet been written to disk, you need something like Google Docs, which allows you to share changes in real time; otherwise you use revision control:

http://en.wikipedia.org/wiki/Revision_control

Most popular word processor and spreadsheet applications already have this embedded. You could also use an application that does file locking, such as with package managers, or version merging that's used in systems such as CVS and Subversion.

What do you mean by file

What do you mean by file corruption?

If you open a file twice in Linux and edit & save in one editor, the other notices that the file on disk has changed and offers to remedy the situation as it should, i.e. no data corruption occurs (data on disk in the end is that of editor #1 or #2), and you're offered the option of which version to keep.