From bash@blaze.csci.csusb.edu Sat Dec 3 11:59 PST 1994
Return-Path:
Received: from blaze.csci.csusb.edu by silicon.csci.csusb.edu (5.0/SMI-SVR4) id AA23731; Sat, 3 Dec 94 11:59:00 PST
Received: by blaze.csci.csusb.edu (AIX 3.2/UCB 5.64/4.03) id AA19283; Sat, 3 Dec 1994 11:52:20 -0800
Date: Sat, 3 Dec 1994 11:52:20 -0800
From: bash@blaze.csci.csusb.edu (Bashir Kassab)
Message-Id: <9412031952.AA19283@blaze.csci.csusb.edu>
To: dick@blaze.csci.csusb.edu
Subject: Pentium_FPU_Bugmailx!
Content-Type: text
Content-Length: 10457
Status: RO

Hello,

I got the following two e-mails through the Novell Interest Group server. I thought they would be of great interest to you. Please forward them to the other faculty and students if they are interested.

Thank you.
bash

*********************************************************
*********************************************************

Message-Id: <9411302006.AA16467@blaze.csci.csusb.edu>
Date: Wed, 30 Nov 1994 09:50:34 CST
Reply-To: Novell LAN Interest Group
Sender: Novell LAN Interest Group
From: Danny Owens
Subject: Pentium Bug
X-To: NOVELL@suvm.acs.syr.EDU
To: Multiple recipients of list NOVELL
Content-Type: text
Content-Length: 3896
Status: R

This message is cut from a posting to the Windows interest group I belong to. This should clear up some questions and information that's been floating around here lately.

Danny Owens

The message follows:

FYI. Some of you may already know of this. Got this from the Scripps Oceantech (Steve Roberts) mail group I belong to. Pass along to friends that would have a concern, or are thinking of buying a Pentium.

Subject: Pentium Floating Point Bug
Date: 15 Nov 1994
Summary: Divisions might give incorrect results on Pentium

Pentium Floating Point Division Bug

There has been a flurry of activity the last few days on the Internet news group, comp.sys.intel, that should interest MATLAB users. A serious design flaw has been discovered in the floating point unit on Intel's Pentium chip.
Double precision divisions involving operands with certain bit patterns can produce incorrect results.

The most dramatic example seen so far can be extracted from a posting last night by Tim Coe of Vitesse Semiconductor. In MATLAB, his example becomes

    x = 4195835
    y = 3145727
    z = x - (x/y)*y

With exact computation, z would be zero. In fact, we get zero on most machines, including those using Intel 286, 386 and 486 chips. Even with roundoff error, z should not be much larger than eps*x, which is about 9.3e-10. But, on the Pentium,

    z = 256

The relative error, z/x, is about 2^(-14) or 6.1e-5. The computed quotient, x/y, is accurate to only 14 bits.

An article in last week's edition of Electronic Engineering Times credits Prof. Thomas Nicely, a mathematics professor at Lynchburg College in Virginia, with the first public announcement of the Pentium division bug. One of Nicely's examples involves

    p = 824633702441

With exact computation

    q = 1 - (1/p)*p

would be zero. With floating point computation, q should be on the order of eps. On most machines, we find that

    q = eps/2 = 2^(-53) ~= 1.11e-16

But on the Pentium

    q = 2^(-28) ~= 3.72e-09

This is roughly single precision accuracy and is typical of most of the examples that had been posted before Coe's analysis.

The bit patterns of the operands involved in these examples are very special. The denominator in Coe's example is

    y = 3*2^20 - 1

Nicely's research involves a theorem about sums of reciprocals of prime numbers. His example involves a prime of the form

    p = 3*2^38 - 18391

We're not sure yet how many operands cause the Pentium's floating point division to fail, or even what operands produce the largest relative error. It is certainly true that failures are very rare. But, as far as we are concerned, the real difficulty is having to worry about this at all. There are so many other things that can go wrong with computer hardware, and software, that, at least, we ought to be able to rely on the basic arithmetic.
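
The two checks quoted above can be tried on any machine with a short script. What follows is only a minimal sketch, written in Python rather than MATLAB as an assumption for illustration. The function names (division_residual, reciprocal_residual, correct_bits) and the constant EPS are hypothetical and do not come from the postings. On hardware with a correct IEEE double precision divider, both residuals should stay within roundoff. The last two values simply convert the Pentium results reported in the message (z = 256 and q = 2^(-28)) into an approximate count of correct bits.

```python
import math

EPS = 2.0 ** -52  # double precision machine epsilon (what MATLAB calls eps)


def division_residual(x, y):
    """Return x - (x/y)*y, which is zero in exact arithmetic."""
    return x - (x / y) * y


def reciprocal_residual(p):
    """Return 1 - (1/p)*p, which is zero in exact arithmetic."""
    return 1.0 - (1.0 / p) * p


def correct_bits(relative_error):
    """Convert a relative error into an approximate number of correct bits."""
    return -math.log2(relative_error)


# Coe's operands, as quoted in the message.
x, y = 4195835.0, 3145727.0
z = division_residual(x, y)
coe_ok = abs(z) <= EPS * x  # a correct divider stays within roundoff

# Nicely's operand, as quoted in the message.
p = 824633702441.0
q = reciprocal_residual(p)
nicely_ok = abs(q) <= EPS  # should be on the order of eps

# Accuracy implied by the Pentium results reported in the message.
coe_bits = correct_bits(256.0 / x)      # from z = 256
nicely_bits = correct_bits(2.0 ** -28)  # from q = 2^(-28)

print("Coe residual z =", z, "| within roundoff:", coe_ok)
print("Nicely residual q =", q, "| within roundoff:", nicely_ok)
print("Bits implied by reported Pentium results:", coe_bits, nicely_bits)
```

A residual far larger than EPS*x in the first check, or far larger than EPS in the second, would point to a faulty divide on the machine running the script.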

The bug is definitely in the Pentium chip. It occurs at all clock rates. The bug does not affect other arithmetic operations, or the built-in transcendental functions. Intel has recently made changes to the on-chip Program Logic Array that fix the bug and is now believed to be producing error free CPUs. It remains to be seen how long it will take for these to reach users.

An unnamed Intel spokesman is quoted in the EE Times article as saying "If customers are concerned, they can call and we'll replace any of the parts that contain the bug." But, at the MathWorks, we have our own friends and contacts at Intel and we're unable to confirm this policy. We'll let you know when we hear anything more definite. In the meantime, the phone number for Customer Service at Intel is 800-628-8686.

-- Cleve Moler
moler@mathworks.com
Chairman and Chief Scientist, The MathWorks, Inc.

Danny Owens \ Beaten paths are
SMMI \ are for beaten people.
dowens@cbcn.cbcinc.com

Message-Id: <9412010021.AA29399@blaze.csci.csusb.edu>
Date: Wed, 30 Nov 1994 14:34:51 CST
Reply-To: Novell LAN Interest Group
Sender: Novell LAN Interest Group
From: Ron Neely
Subject: Re: Pentium bug--Intel's response
X-To: NOVELL@SUVM.ACS.SYR.EDU
To: Multiple recipients of list NOVELL
Content-Type: text
Content-Length: 5320
Status: R

A student who works for me received this memo from a friend of his that works at Intel.

==================================================================

From: INTEL9::PUBLIC_RELAT 29-NOV-1994 18:45:02.37
To: XXXXXXXXXXXXXXXXXXX
CC:
Subj: ANDREW GROVE LETTER ON INTERNET

EMPLOYEE MEMO

Andrew Grove Letter on Internet (11/27/94)

This is Andy Grove, president of Intel. I'd like to comment a bit on the conversations that have been taking place here.

First of all, I am truly sorry for the anxiety created among you by our floating point issue. I read thru some of the postings and it's clear that many of you have done a lot of work around it and that some of you are very angry at us.

Let me give you my perspective on what has happened here.

The Pentium processor was introduced into the market in May of '93 after the most extensive testing program we at Intel have ever embarked on. Because this chip is three times as complex as the 486, and because it includes a number of improved floating point algorithms, we geared up to do an array of tests, validation, and verification that far exceeded anything we had ever done. So did many of our OEM customers. We held the introduction of the chip several months in order to give them more time to check out the chip and their systems. We worked extensively with many software companies to this end as well.

We were very pleased with the result. We ramped the processor faster than any other in our history and encountered no significant problems in the user community. Not that the chip was perfect; no chip ever is. From time to time, we gathered up what problems we found and put into production a new "stepping" -- a new set of masks that incorporated whatever we corrected. Stepping N was better than stepping N minus 1, which was better than stepping N minus 2. After almost 25 years in the microprocessor business, I have come to the conclusion that no microprocessor is ever perfect; they just come closer to perfection with each stepping. In the life of a typical microprocessor, we go thru half a dozen or more such steppings.

Then, in the summer of '94, in the process of further testing (which continued thru all this time and continues today), we came upon the floating point error. We were puzzled as to why neither we nor anyone else had encountered this earlier. We started a separate project, including mathematicians and scientists who work for us in areas other than the Pentium processor group to examine the nature of the problem and its impact.

This group concluded after months of work that (1) an error is only likely to occur at a frequency of the order of once in nine billion random floating point divides, and that (2) this many divides in all the programs they evaluated (which included many scientific programs) would require elapsed times of use that would be longer than the mean time to failure of the physical computer subsystems. In other words, the error rate a user might see due to the floating point problem would be swamped by other known computer failure mechanisms. This explained why nobody -- not us, not our OEM customers, not the software vendors we worked with and not the many individual users -- had run into it.

As some of you may recall, we had encountered thornier problems with early versions of the 386 and 486, so we breathed a sigh of relief that with the Pentium processor we had found what turned out to be a problem of far lesser magnitude. We then incorporated the fix into the next stepping of both the 60 and 66 and the 75/90/100 MHz Pentium processor along with whatever else we were correcting in that next stepping.

Then, last month Professor Nicely posted his observations about this problem and the hubbub started. Interestingly, I understand from press reports that Prof. Nicely was attempting to show that Pentium-based computers can do the jobs of big time supercomputers in numbers analyses. Many of you who posted comments are evidently also involved in pretty heavy duty mathematical work.

That gets us to the present time and what we do about all this. We would like to find all users of the Pentium processor who are engaged in work involving heavy duty scientific/floating point calculations and resolve their problem in the most appropriate fashion including, if necessary, by replacing their chips with new ones. We don't know how to set precise rules on this so we decided to do it thru individual discussions between each of you and a technically trained Intel person.
We set up 800# lines for that purpose. It is going to take us time to work thru the calls we are getting, but we will work thru them. I would like to ask for your patience here. Meanwhile, please don't be concerned that the passing of time will deprive you of the opportunity to get your problem resolved -- we will stand behind these chips for the life of your computer.

Sorry to be so long-winded -- and again please accept my apologies for the situation. We appreciate your interest in the Pentium processor, and we remain dedicated to bringing it as close to perfection as possible. I will monitor your communications in the future -- forgive me if I can't answer each of you individually.

Andy Grove