Microsoft Binary Format

From Wikipedia, the free encyclopedia

In computing, Microsoft Binary Format (MBF) is a format for floating-point numbers which was used in Microsoft's BASIC languages, including MBASIC, GW-BASIC and QuickBASIC prior to version 4.00.[1][2][3][4][5][6][7]

There are two main versions of the format. The original version was designed for memory-constrained systems and stored numbers in 32 bits (4 bytes), with a 23-bit mantissa, 1-bit sign, and an 8-bit exponent. Extended (12k) BASIC included a double-precision type with 64 bits.

While Microsoft BASIC was being ported from the Intel 8080 platform to the MOS 6502 processor, computers were beginning to ship with more memory as a standard feature. The 6502 version was offered with the original 32-bit format or an optional expanded 40-bit (5-byte) format. The 40-bit format was used by most home computers of the 1970s and 1980s. These two versions are sometimes known as "6-digit" and "9-digit", respectively.[8]

On PCs with x86 processors, QuickBASIC, prior to version 4, reintroduced the double-precision format using a 55-bit mantissa in a 64-bit (8-byte) format. MBF was abandoned during the move to QuickBASIC 4, which used the standard IEEE 754 format, introduced a few years earlier.

History

Bill Gates and Paul Allen were working on Altair BASIC in 1975. They were developing the software at Harvard University on a DEC PDP-10 running their Altair emulator.[9] One thing they lacked was code to handle floating-point numbers, required to support calculations with very big and very small numbers,[9] which would be particularly useful for science and engineering.[10][11] One of the proposed uses of the Altair was as a scientific calculator.[12]

Altair 8800 front panel

At a dinner at Currier House, an undergraduate residential house at Harvard, Gates and Allen complained to their dinner companions that they had to write this code.[9] One of them, Monte Davidoff, told them that he had written floating-point routines before and convinced Gates and Allen that he was capable of writing the Altair BASIC floating-point code.[9] At the time, although IBM had introduced their own programs, there was no standard for floating-point numbers, so Davidoff had to come up with his own format. He decided that 32 bits would allow enough range and precision.[13] When Allen demonstrated the software to MITS, it was the first time it had run on an actual Altair.[14] It worked, and when he entered ‘PRINT 2+2’, Davidoff's adding routine gave the correct answer.[9]

A copy of the source code for Altair BASIC resurfaced in 1999. In the late 1970s, Gates's former tutor and dean Harry Lewis had found it behind some furniture in an office in Aiden, and put it in a file cabinet. After more or less forgetting about its existence for a long time, Lewis eventually came up with the idea of displaying the listing in the lobby. Instead, after librarian and conservator Janice Merrill-Oldham pointed out its importance, it was decided to preserve the original listing and to produce several copies for display and preservation.[15][16] A comment in the source credits Davidoff as the writer of Altair BASIC's math package.[15][16]

Radio Shack Tandy TRS-80 Model I System

Altair BASIC took off, and soon most early home computers ran some form of Microsoft BASIC.[17][18] The BASIC port for the 6502 CPU, such as used in the Commodore PET, took up more space due to the lower code density of the 6502. Because of this it would likely not fit in a single ROM chip together with the machine-specific input and output code. Since an extra chip was necessary, extra space was available, and this was used in part to extend the floating-point format from 32 to 40 bits.[8] This extended format was not only provided by Commodore BASIC 1 & 2, but was also supported by Applesoft BASIC I & II since version 1.1 (1977), KIM-1 BASIC since version 1.1a (1977), and MicroTAN BASIC since version 2b (1980).[8] Not long afterwards, the Z80 ports, such as Level II BASIC for the TRS-80 (1978), introduced the 64-bit, double-precision format as a separate data type from 32-bit, single-precision.[19][20][21] Microsoft used the same floating-point formats in their implementation of Fortran[22] and for their macro assembler MASM,[23] although their spreadsheet Multiplan[24][25] and their COBOL implementation used binary-coded decimal (BCD) floating point.[26] Even so, for a while MBF became the de facto floating-point format on home computers, to the point where people still occasionally encounter legacy files and file formats using it.[27][28][29][30][31][32]

VAX-11/780 minicomputer

In a parallel development, Intel had started the development of a floating-point coprocessor in 1976.[33][34] William Morton Kahan, as a consultant to Intel, suggested that Intel use the floating point of Digital Equipment Corporation's (DEC) VAX. The first VAX, the VAX-11/780, had just come out in late 1977, and its floating point was highly regarded. VAX's floating-point formats differed from MBF only in that they had the sign in the most significant bit.[35][36] However, Intel sought to market its chip to the broadest possible market, and Kahan was asked to draw up specifications.[33] When rumours of Intel's new chip reached its competitors, they started a standardization effort, called IEEE 754, to prevent Intel from gaining too much ground. As an 8-bit exponent was not wide enough for some operations desired for double-precision numbers, e.g. to store the product of two 32-bit numbers,[1] Intel's proposal and a counter-proposal from DEC used 11 bits, like the time-tested 60-bit floating-point format of the CDC 6600 from 1965.[34][37][38] Kahan's proposal also provided for infinities, which are useful when dealing with division-by-zero conditions; not-a-number values, which are useful when dealing with invalid operations; denormal numbers, which help mitigate problems caused by underflow;[37][39][40] and a better balanced exponent bias, which could help avoid overflow and underflow when taking the reciprocal of a number.[41][42]

By the time QuickBASIC 4.00 was released, the IEEE 754 standard had become widely adopted; for example, it was incorporated into Intel's 387 coprocessor and every x86 processor from the 486 on. QuickBASIC versions 4.0 and 4.5 use IEEE 754 floating-point variables by default, but (at least in version 4.5) there is a command-line option /MBF for the IDE and the compiler that switches from IEEE to MBF floating-point numbers, to support earlier-written programs that rely on details of the MBF data formats. Visual Basic also uses the IEEE 754 format instead of MBF.

Technical details

MBF numbers consist of an 8-bit base-2 exponent, a sign bit (positive mantissa: s = 0; negative mantissa: s = 1) and a 23-,[43][8] 31-[8] or 55-bit[43] mantissa. There is always a 1-bit implied to the left of the explicit mantissa, and the radix point is located before this assumed bit, so the significand has the form 0.1mmm… in binary. The exponent is encoded with a bias of 128, so that exponents −127…−1 are represented by x = 1…127 (01h…7Fh) and exponents 0…127 are represented by x = 128…255 (80h…FFh), with a special case for x = 0 (00h) representing the whole number being zero.
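The encoding rule above can be illustrated with a short Python sketch. It is only an illustration of the description in this section, not code taken from any Microsoft BASIC; the function name `mbf_value` and its parameters are hypothetical. It returns an exact `Fraction` so that no rounding is introduced by the host's own floating-point type.

```python
from fractions import Fraction


def mbf_value(x: int, s: int, m: int, width: int) -> Fraction:
    """Exact value of an MBF number given its separate fields.

    x     -- 8-bit biased exponent (0 means the whole number is zero)
    s     -- sign bit (0 = positive, 1 = negative)
    m     -- explicit mantissa bits as an integer
    width -- number of explicit mantissa bits (23, 31 or 55)
    """
    if not (0 <= x <= 0xFF and s in (0, 1) and 0 <= m < (1 << width)):
        raise ValueError("field out of range")
    if x == 0:
        # Special case: exponent byte 00h encodes zero, whatever the other bits are.
        return Fraction(0)
    # The implied 1 sits just to the right of the radix point,
    # so the significand is 0.1mmm... (binary), i.e. in [0.5, 1).
    significand = Fraction((1 << width) | m, 1 << (width + 1))
    magnitude = significand * Fraction(2) ** (x - 128)  # bias of 128
    return -magnitude if s else magnitude
```

For instance, `mbf_value(0x84, 0, 0x200000, 23)` evaluates to `Fraction(10, 1)`: the significand is 0.101 in binary (0.625) and the unbiased exponent is 4.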

The MBF double-precision format provides a smaller range than the IEEE 754 format, and although the format itself provides almost one extra decimal digit of precision, in practice the stored values are less accurate because IEEE calculations use 80-bit intermediate results, and MBF doesn't.[1][3][43][44] Unlike IEEE floating point, MBF doesn't support denormal numbers, infinities or NaNs.[45]
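The relationship between the MBF and IEEE 754 single-precision layouts can be sketched in Python, following the rule quoted in references [43] and [45]: subtract 2 from the exponent byte (one for the bias change from 128 to 127, one for the different radix-point position) and move the sign bit to the top, leaving the fraction bits unchanged. The function names are hypothetical, and the input is assumed to be a 32-bit integer with the exponent in bits 31...24, the sign in bit 23 and the mantissa in bits 22...0. Rejecting values that would need an IEEE denormal is a simplification of this sketch, not documented behaviour of any Microsoft routine.

```python
import struct


def mbf32_to_ieee32_bits(mbf: int) -> int:
    """Rearrange a 32-bit MBF pattern into an IEEE 754 single-precision pattern."""
    if not 0 <= mbf <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit value")
    x = (mbf >> 24) & 0xFF   # biased exponent (bias 128)
    s = (mbf >> 23) & 1      # sign bit
    m = mbf & 0x7FFFFF       # 23 explicit mantissa bits
    if x == 0:
        return 0             # MBF zero maps to +0.0
    if x <= 2:
        # The result would be an IEEE denormal; this sketch does not handle that.
        raise ValueError("value too small for a normalized IEEE single")
    # Exponent - 2, sign moved to bit 31, fraction bits copied as-is.
    return (s << 31) | ((x - 2) << 23) | m


def mbf32_to_float(mbf: int) -> float:
    """Interpret the converted bit pattern as a Python float."""
    return struct.unpack(">f", struct.pack(">I", mbf32_to_ieee32_bits(mbf)))[0]
```

With these definitions, `mbf32_to_float(0x84200000)` gives `10.0` and `mbf32_to_float(0x80800000)` gives `-0.5`.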

MBF single-precision format (32 bits, "6-digit BASIC"):[43][8]

Exponent | Sign | Significand
Bit 31...24 (8 bits) | Bit 23 (1 bit) | Bit 22...0 (23 bits)
xxxxxxxx | s | mmmmmmm mmmmmmmm mmmmmmmm

MBF extended-precision format (40 bits, "9-digit BASIC"):[8]

Exponent | Sign | Significand
Bit 39...32 (8 bits) | Bit 31 (1 bit) | Bit 30...0 (31 bits)
xxxxxxxx | s | mmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm

MBF double-precision format (64 bits):[43][1]

Exponent | Sign | Significand
Bit 63...56 (8 bits) | Bit 55 (1 bit) | Bit 54...0 (55 bits)
xxxxxxxx | s | mmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm

Examples

  • "10":
    • 32-bit format: 84h, 20h, 00h, 00h
    • 40-bit format: 84h, 20h, 00h, 00h, 00h
  • "2":
    • 32-bit format: 82h, 00h, 00h, 00h
    • 40-bit format: 82h, 00h, 00h, 00h, 00h
  • "1":
    • 32-bit format: 81h, 00h, 00h, 00h
    • 40-bit format: 81h, 00h, 00h, 00h, 00h
  • "0":
    • 32-bit format: 00h, 00h, 00h, 00h (or 00h, xxh, xxh, xxh)
    • 40-bit format: 00h, 00h, 00h, 00h, 00h (or 00h, xxh, xxh, xxh, xxh)
  • "0.5":
    • 32-bit format: 80h, 00h, 00h, 00h
    • 40-bit format: 80h, 00h, 00h, 00h, 00h
  • "0.25":
    • 32-bit format: 7Fh, 00h, 00h, 00h
    • 40-bit format: 7Fh, 00h, 00h, 00h, 00h
  • "−0.5":
    • 32-bit format: 80h, 80h, 00h, 00h
    • 40-bit format: 80h, 80h, 00h, 00h, 00h
  • "√½" (≈ 0.70711):
    • 32-bit format: 80h, 35h, 04h, F3h
    • 40-bit format: 80h, 35h, 04h, F3h, 34h
  • "√2" (≈ 1.41421):
    • 32-bit format: 81h, 35h, 04h, F3h
    • 40-bit format: 81h, 35h, 04h, F3h, 34h
  • "ln(2)" (≈ 0.69315):
    • 32-bit format: 80h, 31h, 72h, 18h
    • 40-bit format: 80h, 31h, 72h, 17h, F8h
  • "log2(e)" (≈ 1.44270):
    • 32-bit format: 81h, 38h, AAh, 3Bh
    • 40-bit format: 81h, 38h, AAh, 3Bh, 29h
  • "π/2" (≈ 1.57080):
    • 32-bit format: 81h, 49h, 0Fh, DBh
    • 40-bit format: 81h, 49h, 0Fh, DAh, A2h
  • "2π" (≈ 6.28319):
    • 32-bit format: 83h, 49h, 0Fh, DBh
    • 40-bit format: 83h, 49h, 0Fh, DAh, A2h
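
Byte sequences like the 32-bit ones listed above can be reproduced with a small Python sketch that encodes a number into the single-precision layout, with the exponent byte first as in the listing. The function name `mbf32_encode` is hypothetical. The rounding used here (Python's `round`, which rounds halves to even) and the choice to flush values that are too small to zero are assumptions of this sketch and are not claimed to match the behaviour of any particular Microsoft BASIC.

```python
import math


def mbf32_encode(value: float) -> bytes:
    """Encode a finite float as 4 MBF bytes: exponent, sign + mantissa high, mid, low."""
    if not math.isfinite(value):
        raise ValueError("MBF has no infinities or NaNs")
    if value == 0:
        return bytes(4)
    # frexp gives abs(value) = frac * 2**exp with 0.5 <= frac < 1,
    # which is the same 0.1mmm... normalization that MBF uses.
    frac, exp = math.frexp(abs(value))
    mant = round(frac * (1 << 24))   # 24 significant bits, including the implied 1
    if mant == 1 << 24:              # rounding carried into the next power of two
        mant >>= 1
        exp += 1
    x = exp + 128                    # apply the bias of 128
    if x > 0xFF:
        raise OverflowError("magnitude too large for MBF")
    if x < 1:
        return bytes(4)              # assumption: flush to zero, as MBF has no denormals
    sign = 0x80 if value < 0 else 0x00
    # The implied leading 1 is dropped and its bit position holds the sign.
    return bytes([x, sign | ((mant >> 16) & 0x7F), (mant >> 8) & 0xFF, mant & 0xFF])
```

For example, `mbf32_encode(10.0).hex()` returns `'84200000'` and `mbf32_encode(-0.5).hex()` returns `'80800000'`.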

References

  1. ^ a b c d "IEEE vs. Microsoft Binary Format; Rounding Issues (Complete)". Microsoft Support. Microsoft. 2006-11-21. Article ID KB35826, Q35826. Archived from the original on 2020-08-28. Retrieved 2010-02-24.
  2. ^ "(Complete) Tutorial to Understand IEEE Floating-Point Errors". Knowledge Base. Microsoft. 2005-08-16. Article ID KB42980, Q42980. Archived from the original on 2020-08-28. Retrieved 2016-06-02.
  3. ^ a b "Convert pre-IEEE-754 C++ floating-point numbers to/from C#". stackoverflow.com. 2010-04-21. Archived from the original on 2020-08-28. Retrieved 2016-06-02. (NB. The second reference could be mistaken to say that QB 4.0 could use MBF internally, but it only uses IEEE. It just has a few conversion functions to convert IEEE floating point numbers to strings containing MBF data, e.g. MKDMBF$ in addition to MKD$ which just copies the bytes of the IEEE value to a string.)
  4. ^ "The MASM 6.1 documentation notes that 5.1 was the last MASM version to support MBF" (PDF). people.sju.edu. Retrieved 2016-06-02.
  5. ^ GW-BASIC User's Manual, Appendix D.3 USR Function Calls.
  6. ^ BASIC Second edition (May 1982), IBM: Appendix C-15 (NB. This is the BASICA manual).
  7. ^ "ROM Routes (Integer Math)". Trs-80.com. Retrieved 2016-06-02.
  8. ^ a b c d e f g h i j k l m n o p q r Steil, Michael (2008-10-20). "Create your own Version of Microsoft BASIC for 6502". pagetable.com. Archived from the original on 2016-05-30. Retrieved 2016-05-30.
  9. ^ a b c d e Isaacson, Walter (2013-09-20). "Dawn of a revolution". Harvard Gazette. news.harvard.edu. Archived from the original on 2020-08-28. Retrieved 2016-05-30.
  10. ^ Rall, Louis B. (1987). "An introduction to the scientific computing language Pascal-SC". Computers & Mathematics with Applications. 14 (1). Mathematics Research Center, University of Wisconsin-Madison, Madison, Wisconsin: Pergamon Journals Ltd: 53–69. doi:10.1016/0898-1221(87)90181-7. (17 pages)
  11. ^ Leung, K. Ming (2005-02-03) [2000]. "Floating-Point Numbers in Digital Computers" (PDF). cis.poly.edu. Department of Computer and Information Science, Polytechnic University. Archived (PDF) from the original on 2018-12-14. Retrieved 2016-06-02.
  12. ^ Becraft, Michael B. (2014-08-26). Bill Gates: A Biography. Abc-Clio. ISBN 978-1-44083014-3. Retrieved 2016-05-30.
  13. ^ "The Math Package". altairbasic.org. 2014. Archived from the original on 2020-08-28. Retrieved 2016-05-30. (NB. Altair BASIC 3.2 (4K Edition).)
  14. ^ Orlowski, Andrew (2001-05-11). "Microsoft Altair BASIC legend talks about Linux, CPRM and that very frightening photo - A very rare interview with Monte Davidoff". The Register. Archived from the original on 2020-08-28. Retrieved 2016-05-30.
  15. ^ a b Orlowski, Andrew (2001-05-13). "Raiders of the Lost Altair BASIC Source Code - They came, they saw … they disassembled". The Register. Archived from the original on 2020-08-28. Retrieved 2016-05-30.
  16. ^ a b Griffiths, Ian (2000-05-08). "Quest for the Holy Source - Ian's trip to Harvard". Archived from the original on 2002-01-02. Retrieved 2016-05-30.
  17. ^ "Great people personally responsible for advancing the art of early computers". Oldcomputers.net. 2020-07-18. Archived from the original on 2020-08-28. Retrieved 2016-05-30.
  18. ^ "Basic 7.0 for Windows". comp.lang.basic.powerbasic.narkive.com. Archived from the original on 2020-08-28. Retrieved 2016-05-30.
  19. ^ Radio Shack Hardware Manual: Level II BASIC Reference Manual (1 ed.). Fort Worth, Texas: Radio Shack. 1978. Archived from the original on 2020-08-28. Retrieved 2016-05-30.
  20. ^ Level II BASIC Reference Manual (PDF). Radio Shack. 1979. Retrieved 2016-06-02.
  21. ^ BASIC-80 (MBASIC) Reference Manual (PDF). Retrieved 2016-05-30.
  22. ^ Microsoft FORTRAN-80 Version 3.4 Users Manual (PDF). November 1980. pp. 45, 55. Retrieved 2016-05-30.
  23. ^ Pätzold, Michael, ed. (April 1993). "Zettelsammlung MS-DOS und AT" (in German). Gruppe Datenverarbeitung am MPI für Strömungsforschung Göttingen, Max-Planck-Institut. Archived from the original on 2005-02-20. Retrieved 2015-10-07.
  24. ^ "Tandy 200 Multiplan Manual" (PDF). classiccmp.org. Retrieved 2016-06-02.
  25. ^ Microsoft C Pcode Specifications, page 13. (NB. Multiplan wasn't compiled to machine code, but to a kind of byte-code which was run by an interpreter, in order to make Multiplan portable across the widely varying hardware of the time. This byte-code distinguished between the machine-specific floating point format to calculate on, and an external (standard) format, which was binary-coded decimal (BCD). The PACK and UNPACK instructions converted between the two.)
  26. ^ Microsoft COBOL-80 (PDF). 1978. pp. 26, 32. Retrieved 2016-05-30.
  27. ^ Lee, Patrick Y. "QWK Mail Packet File Layout" (TXT). textfiles.com. Retrieved 2016-06-02.
  28. ^ "CSI Millennium (CSIM) format with CSI Y2K extensions". csidata.com. Boca Raton, Florida: Commodity Systems, Inc. 1998-11-17. Archived from the original (TXT) on 2016-03-05. Retrieved 2016-06-02. […] This document describes the abandoned CompuTrac data format, which until recently was actively used by Equis' MetaStock charting software. […]
  29. ^ Billard, Russ (2016-05-04) [2001-07-13]. "Converting Microsoft Binary Format to IEEE format Using VB 6". Archived from the original on 2020-08-28. Retrieved 2016-05-30.
  30. ^ JerMyster (2003-07-02). "Help !Anybody know how to convert old M/S MBF value from Qbasic to VB6". Tek-Tips. Visual Basic (Classic) Forum. Archived from the original on 2020-08-28. Retrieved 2016-05-30.
  31. ^ GL88. "Reading Binary Format (QBasic) with C#". Social.msdn.microsoft.com. Retrieved 2016-05-30.
  32. ^ "Rmetrics - Reading MetaStock data format in R". R.789695.n4.nabble.com. 2013-09-30. Retrieved 2016-05-30.
  33. ^ a b "Intel and Floating-Point - Updating One of the Industry's Most Successful Standards - The Technology Vision for the Floating-Point Standard" (PDF). Intel. 2016. Archived from the original (PDF) on 2016-03-04. Retrieved 2016-05-30. (11 pages)
  34. ^ a b "An Interview with the Old Man of Floating-Point". cs.berkeley.edu. 1998-02-20. Retrieved 2016-05-30.
  35. ^ "VAX Floating Point Numbers". nssdc.gsfc.nasa.gov. Archived from the original on 2020-08-28. Retrieved 2016-06-02. (NB. The VAX-11/780 did not implement the "G" format yet. Although this is not directly apparent from the tables because the structures have been cut up in two-byte words, the byte order is actually the same as on modern CPUs. There isn't enough room in the exponent range for NaNs, infinities or denormals.)
  36. ^ "VAX11 780" (PDF). Ece.cmu.edu. Retrieved 2016-06-02.
  37. ^ a b "IEEE 754: An Interview with William Kahan" (PDF). dr-chuck.com. Retrieved 2016-06-02.
  38. ^ Thornton, James E. (1970). Written at Advanced Design Laboratory, Control Data Corporation. Design of a Computer: The Control Data 6600 (PDF) (1 ed.). Glenview, Illinois: Scott, Foresman and Company. LCCN 74-96462. Archived (PDF) from the original on 2020-08-28. Retrieved 2016-06-02. (1+13+181+2+2 pages)
  39. ^ Kahan, William Morton. "Why do we need a floating-point arithmetic standard?" (PDF). cs.berkeley.edu. Retrieved 2016-06-02.
  40. ^ Kahan, William Morton; Darcy, Joseph D. "How Java's Floating-Point Hurts Everyone Everywhere" (PDF). cs.berkeley.edu. Retrieved 2016-06-02.
  41. ^ Turner, Peter R. (2013-12-21). Numerical Analysis and Parallel Processing: Lectures given at The Lancaster …. Springer. ISBN 978-3-66239812-8. Retrieved 2016-05-30.
  42. ^ "Names for Standardized Floating-Point Formats" (PDF). cs.berkeley.edu. Retrieved 2016-06-02.
  43. ^ a b c d e f Borland staff (1998-07-02) [1994-03-10]. "Converting between Microsoft Binary and IEEE formats". Technical Information Database (TI1431C.txt). Embarcadero USA / Inprise (originally: Borland). ID 1400. Archived from the original on 2019-02-20. Retrieved 2016-05-30. […] _fmsbintoieee(float *src4, float *dest4) […] MS Binary Format […] byte order => m3 | m2 | m1 | exponent […] m1 is most significant byte => sbbb|bbbb […] m3 is the least significant byte […] m = mantissa byte […] s = sign bit […] b = bit […] MBF is bias 128 and IEEE is bias 127. […] MBF places the decimal point before the assumed bit, while IEEE places the decimal point after the assumed bit. […] ieee_exp = msbin[3] - 2; /* actually, msbin[3]-1-128+127 */ […] _dmsbintoieee(double *src8, double *dest8) […] MS Binary Format […] byte order => m7 | m6 | m5 | m4 | m3 | m2 | m1 | exponent […] m1 is most significant byte => smmm|mmmm […] m7 is the least significant byte […] MBF is bias 128 and IEEE is bias 1023. […] MBF places the decimal point before the assumed bit, while IEEE places the decimal point after the assumed bit. […] ieee_exp = msbin[7] - 128 - 1 + 1023; […]
  44. ^ "Google Groups". Groups.google.com. Retrieved 2016-06-02.
  45. ^ Bucknall, Julian M. (2018-11-03) [2007-10-23]. "Understanding single precision MBF". boyet.com. Archived from the original on 2019-02-20. Retrieved 2016-05-30. […] IEEE 754 Single format […] The exponent is biased by 127. There is an assumed 1 bit before the radix point (so the assumed mantissa is 1.ffff… where f's are the fraction bits) […] Microsoft Binary Format (single precision) […] The exponent is biased by 128. There is an assumed 1 bit after the radix point (so the assumed mantissa is 0.1ffff… where f's are the fraction bits) […] the IEEE mantissa is twice the MBF mantissa. […] to convert from MBF to IEEE single […] subtract 2 from the exponent (one for the bias change, one for the mantissa factor), and then rearrange the sign and exponent bits. The fraction does not change. To convert from IEEE single to MBF, […] add 2 to the exponent (one for the bias change, one for the mantissa factor), and then rearrange the sign and exponent bits. The fraction does not change. […]
  46. ^ a b c d e f g h Steil, Michael, ed. (2008-10-20). "msbasic/float.s". MIST64. Archived from the original on 2020-08-28. Retrieved 2020-08-28 – via github.com. (NB. Commented 6502 disassembly listings, merged from several versions of Microsoft BASIC for 6502 between 1977 and 1982 to recreate byte-exact copies of the original ROMs for 10 different machines from different vendors.)
  47. ^ a b c Steil, Michael, ed. (2008-10-20). "msbasic/trig.s". MIST64. Archived from the original on 2020-08-28. Retrieved 2020-08-28 – via github.com. (NB. Commented 6502 disassembly listings, merged from several versions of Microsoft BASIC for 6502 between 1977 and 1982 to recreate byte-exact copies of the original ROMs for 10 different machines from different vendors.)

Further reading

  • Microsoft provides a dynamic link library for 16-bit Visual Basic containing functions to convert between MBF data and IEEE 754.
    • This library wraps the MBF conversion functions in the 16-bit Visual C(++) CRT.
    • These conversion functions will round an IEEE double-precision number like ¾ ⋅ 2^−128 to zero rather than to 2^−128.
    • They don't support denormals at all: the IEEE or MBF single-precision number 2^−128 will be converted to zero, even though it is representable in either format.
    • This library is only intended for use with Visual Basic; C(++) programs are expected to call the CRT functions directly.
  • https://github.com/option8/Altair-BASIC