Tuesday, January 13, 2009

File corruption on Windows XP freezes the whole OS

I just spent 8 hours tracking down a nasty problem with IE 8 that kept freezing my entire OS (Win XP SP3) after a few minutes of using IE. As a former Microsoft employee of 9 years, I care a lot about Microsoft and want to help out whenever I can. I submitted the issue to Microsoft, and I am posting it here to help anyone who may run into the same problem.

Basically, over the last few days I found that my computer would gradually start to freeze after I had used IE 8 for a while. Typically, IE 8 would stop responding first, then Outlook and Firefox would stop responding. The Start menu would also freeze. Task Manager would continue to work for a little longer, and Cmd windows would mostly keep working longer still. When that happened, I had no choice but to force a power-off. In the last few days, I hit this problem multiple times, with increasing frequency. Sometimes during reboot the OS would run chkdsk and correct some bad file indexes, but the problem persisted.

Today I had enough of it and decided to track it down. I tried removing many of my programs and stopping various services and shell hooks, but none of that worked. Finally, I reluctantly decided to uninstall the IE 8 beta. However, uninstalling IE 8 made things worse, because now IE 7 would freeze the OS within 5 seconds. Eventually I found that if I logged in as a different user and used IE (now IE 7), things worked fine. So I ran "chkdsk /R" and it finally found bad clusters in the file \Document and settings\\Locals~1\TEMPOR~1\Content.IE5\index.dat. Note that this fix was found in stage 4 of 5 of chkdsk. It should have been caught by the earlier chkdsk runs that the system performed automatically after the crashes, but apparently the system only performs "chkdsk /f", which does not run stages 4 and 5.

Anyway, I now recall that I had a similar problem back when XP first shipped, while I worked at Microsoft. I think this problem could be handled better if the system performed chkdsk /r instead of chkdsk /f when it crashes repeatedly. In addition, the OS should handle such file corruption better when it happens. That should be possible because, apart from the file subsystem, programs that aren't performing I/O can still function and could detect and monitor such issues.
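
To make the suggestion above a little more concrete, here is a minimal Python sketch of the kind of policy I have in mind: use the quick "/f" pass after an isolated crash, but escalate to the full "/r" scan once crashes start to cluster. This is only an illustration and not how Windows actually decides anything. The function name choose_chkdsk_switch, the 3-day window, and the threshold of 3 crashes are all assumptions I made up for the example.

```python
from datetime import datetime, timedelta


def choose_chkdsk_switch(crash_times, now, window=timedelta(days=3), threshold=3):
    """Pick a chkdsk switch from recent crash history (hypothetical policy).

    Returns "/r" (full scan, including the file-data stages) when at least
    `threshold` crashes fall inside `window` before `now`, otherwise "/f".
    """
    # Count only crashes that happened within the look-back window.
    recent = [t for t in crash_times if timedelta(0) <= now - t <= window]
    return "/r" if len(recent) >= threshold else "/f"


# Example: three crashes within the last three days escalate to the full scan.
now = datetime(2009, 1, 13, 12, 0)
crashes = [now - timedelta(hours=h) for h in (2, 20, 50)]
print(choose_chkdsk_switch(crashes, now))  # prints "/r"
```

The point of the threshold is that a full "/r" scan is slow, so it would only kick in when repeated crashes suggest that the quick check is missing something.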

Cheers,

Wei Zhu