Standards Compliance

FreeDOS-32 is brand-new software, so it is desirable for it to comply with the most recent standards. This document summarizes the standards that should be adopted in the software and documentation that are part of FreeDOS-32. It is only a quick reference and is not intended as a detailed or formal explanation of those standards.

Contents

  • IEEE 1541 - Units and prefixes for digital electronics
  • ISO 8601 - International date and time notation
  • ISO 10646 - Unicode™ character set

IEEE 1541 - Units and prefixes for digital electronics

A great deal of confusion is arising in the computer market (and not only there) about the meaning of units and prefixes referring to data storage and transmission. The main source of confusion is the historical use of powers of two for multiples of computer-related values, that is, the (in)famous 1024 versus 1000: binary multiples versus decimal multiples.

The IEEE 1541 standard is intended to remove that confusion. It provides unambiguous ways to represent binary multiples as well as decimal multiples. With the end of 2004, IEEE 1541-2002 should have finished its "trial-use" period and should become a full standard. See this site for official information. Unfortunately, this standard is not yet widely used.

According to the standard, the following units shall be used when referring to data storage and transmission:

Unit    Symbol  Description
bit     b       a binary digit (IEEE symbol)
bit     bit     a binary digit (IEC symbol)
byte    B       a group of adjacent bits, usually eight, operated on as a group
octet   o       a byte composed of eight bits

In FreeDOS-32, unless stated otherwise (for example, by saying "a 7-bit byte"), a byte is an octet.

The following are prefixes that shall be used for binary multiples and decimal multiples (the ones defined in the International System of Units, SI):

Binary multiples                         Decimal (SI) multiples
Prefix  Name   Factor                    Prefix  Name   Factor
Ki      kibi-  2^10 = 1,024              k       kilo-  10^3 = 1,000
Mi      mebi-  2^20 = 1,048,576          M       mega-  10^6 = 1,000,000
Gi      gibi-  2^30 = 1,073,741,824      G       giga-  10^9 = 1,000,000,000
Ti      tebi-  2^40                      T       tera-  10^12
Pi      pebi-  2^50                      P       peta-  10^15
Ei      exbi-  2^60                      E       exa-   10^18

For example, 6 GB is 6 gigabytes, that is 6,000,000,000 bytes, but 6 GiB is 6 gibibytes, which is roughly 6.4 GB. Similarly, a transfer rate of 64 kb/s means 64 kilobits per second, that is 64,000 bits per second, not 65,536. What we are used to calling a "1.44 MB" floppy actually contains 1,440 KiB, which is 1.41 MiB or 1.47 MB. Whoever invented the "1.44 MB" format name mixed decimal multiples and binary multiples.
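
The arithmetic in these examples can be checked with a few lines of C. The following is only a sketch: the names `KIB`, `MIB`, `GIB`, `KB_SI`, `MB_SI`, `GB_SI`, `floppy_bytes` and `bytes_in_unit` are hypothetical and are not part of FreeDOS-32.

```c
#include <stdint.h>

/* Binary multiples (IEEE 1541 prefixes). Hypothetical names. */
#define KIB UINT64_C(1024)
#define MIB (KIB * 1024)
#define GIB (MIB * 1024)

/* Decimal (SI) multiples. Hypothetical names. */
#define KB_SI UINT64_C(1000)
#define MB_SI (KB_SI * 1000)
#define GB_SI (MB_SI * 1000)

/* Size in bytes of a "1.44 MB" floppy: 1,440 KiB. */
static uint64_t floppy_bytes(void)
{
    return 1440 * KIB;
}

/* Express a byte count as a fractional number of the given unit. */
static double bytes_in_unit(uint64_t bytes, uint64_t unit)
{
    return (double)bytes / (double)unit;
}
```

With these definitions, `bytes_in_unit(floppy_bytes(), MIB)` evaluates to 1.40625 and `bytes_in_unit(floppy_bytes(), MB_SI)` to 1.47456, which round to the 1.41 MiB and 1.47 MB figures given above.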

Hard disk manufacturers have long used decimal multiples, while operating systems often use binary multiples and, unfortunately, often label them with the SI prefixes, which is wrong. The result is that if you buy a 100 GB hard disk, your operating system will likely report it as a 93 GB one, when the only thing wrong is the label: it should read 93 GiB. The same happens with recordable DVDs, which are sold as 4.7 GB but contain 4.38 GiB. This shall be avoided in FreeDOS-32 by using the correct prefixes and units.

The "-bi-" in the binary prefixes is pronounced "bee" and stands for "binary" (for example, "kibi" means "kilobinary"). Note that the 'K' of the "kibi-" prefix is a capital letter, while the 'k' of the "kilo-" prefix is lowercase (as in km, kg, etc.). Also note that the symbol for bit is a lowercase 'b', while the symbol for byte is a capital 'B'. Thus, writing 100 kb to mean 100 kilobytes is wrong, because it means 100 kilobits, or 100,000 bits. The 'o' (octet) symbol is also becoming fairly popular, so don't be surprised to read something like 10 Mo on a web site: it means 10 megaoctets, or 10,000,000 eight-bit bytes.

Finally, it should be noted that computer users are most accustomed to binary multiples. For this reason, FreeDOS-32 should continue to use them, just with the correct prefixes. The user will only notice an extra small 'i' (as in KiB instead of the non-conforming KB).
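
As an illustration of reporting sizes with binary prefixes, here is a minimal sketch of a formatting routine. The function name `format_size_binary` is hypothetical and does not refer to any existing FreeDOS-32 interface. It uses integer arithmetic only, and its rounding is approximate because it considers only the remainder of the last division.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a byte count with binary prefixes (KiB, MiB, ...), using two
 * decimals. Hypothetical helper, not a FreeDOS-32 API.
 * Returns the value of snprintf (characters needed, excluding '\0').
 * Values just below a power of 1024 may be shown as "1024.00" of the
 * smaller unit because of the rounding carry.
 */
static int format_size_binary(char *buf, size_t len, uint64_t bytes)
{
    static const char *const units[] = {
        "B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"
    };
    unsigned i = 0;
    uint64_t whole = bytes;  /* integer part in the current unit */
    uint64_t rem = 0;        /* remainder in 1/1024ths of that unit */
    uint64_t hundredths;

    while (whole >= 1024 && i < 6) {
        rem = whole % 1024;
        whole /= 1024;
        i++;
    }
    if (i == 0)
        return snprintf(buf, len, "%llu B", (unsigned long long)whole);

    hundredths = (rem * 100 + 512) / 1024;  /* round to nearest */
    if (hundredths == 100) {                /* carry into the integer part */
        whole++;
        hundredths = 0;
    }
    return snprintf(buf, len, "%llu.%02llu %s",
                    (unsigned long long)whole,
                    (unsigned long long)hundredths, units[i]);
}
```

For instance, a count of 100,000,000,000 bytes is formatted as "93.13 GiB" and the 1,474,560 bytes of a floppy as "1.41 MiB".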

ISO 8601 - International date and time notation

The ISO 8601 standard provides a format to avoid confusion between different date and time notations used in different countries. It also provides a way to sort dates and times using trivial string comparison. The full standard text in PDF format is available here, from the International Organization for Standardization (ISO).

The following are the date representations available:

Date formats
YYYY-MM-DD YYYYMMDD
YYYY is the four-digit year in the Gregorian calendar.
MM is the two-digit month between 01 (January) and 12 (December).
DD is the two-digit day of the month between 01 and 31.

For example, 2002-09-18 and 20020918 are both September 18, 2002.

This notation applies only to numeric dates, not to language-dependent dates written in words, which can still be used. Numeric notations such as MM-DD-YY, DD/MM/YY or YYYY-DD-MM shall be avoided as they are ambiguous. The standard also provides notations for year/week/day-of-week dates and year/day-of-year dates.
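
A date in the YYYY-MM-DD form can be produced with a single `snprintf` call. The following is a minimal sketch; `format_iso_date` is a hypothetical name, and the function only range-checks its arguments.

```c
#include <stdio.h>

/*
 * Write a date as YYYY-MM-DD into buf. Hypothetical helper: it only
 * range-checks the fields and does not check the day against the
 * actual length of the month. Returns 0 on success, -1 on error.
 */
static int format_iso_date(char *buf, size_t len, int year, int month, int day)
{
    if (year < 0 || year > 9999 || month < 1 || month > 12 ||
        day < 1 || day > 31)
        return -1;
    if (len < 11)  /* "YYYY-MM-DD" plus the terminating '\0' */
        return -1;
    snprintf(buf, len, "%04d-%02d-%02d", year, month, day);
    return 0;
}
```

Because the fields are zero-padded and ordered from most to least significant, comparing two such strings with `strcmp` gives the same result as comparing the dates chronologically: "2002-09-18" sorts before "2002-10-02".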

The following are the time-of-day representations available:

Time formats
HH:MM:SS HHMMSS
HH is the number of hours past midnight, between 00 and 23.
MM is the number of minutes since the start of the hour, between 00 and 59.
SS is the number of seconds since the start of the minute, between 00 and 59.

If the time is measured in Coordinated Universal Time (UTC, once called Greenwich Mean Time), append a capital Z to the time (pronounced "zulu"). 12-hour notations, such as "12:01 p.m.", shall be avoided. The standard actually allows HH to be 24, in which case MM and SS shall be 00. In FreeDOS-32 it is unlikely that we will need to distinguish between the two midnights, so 00:00:00 should be used. The standard also allows SS to be 60, for the periodic leap second correction that aligns the time with the Earth's rotation.
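
The field ranges described above, including the two special cases (24:00:00 and the leap second), can be sketched as a small validation routine. The name `iso_time_fields_valid` is hypothetical.

```c
#include <stdbool.h>

/*
 * Check the fields of a time of day against the ranges described
 * above. Hypothetical helper. It accepts 24:00:00 and a leap second
 * (SS == 60) in any minute; it does not check whether a leap second
 * was actually scheduled at that time.
 */
static bool iso_time_fields_valid(int hh, int mm, int ss)
{
    if (hh == 24)
        return mm == 0 && ss == 0;  /* end-of-day midnight only */
    if (hh < 0 || hh > 23)
        return false;
    if (mm < 0 || mm > 59)
        return false;
    return ss >= 0 && ss <= 60;     /* 60 allows for a leap second */
}
```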

Time zones can be indicated by appending one of the following notations to the time:

Time zone suffixes
±HH:MM ±HHMM
HH and MM are the number of hours and minutes ahead of (+ sign) or behind (- sign) UTC.

For example, 12:00Z is the same time as 13:00+01:00 (in Central European Time) and 07:00-05:00 (in U.S./Canadian Eastern Standard Time). The difference between local time and UTC is the only reliable way to indicate a time zone, because the offset of a named zone can change (for example, due to daylight saving time).
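
A time with its zone suffix could be written out as in the following sketch. The name `format_iso_time` and the choice of passing the offset as a signed number of minutes are assumptions made for this example, as is the decision to write a zero offset as "Z" rather than "+00:00".

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Write a time of day followed by its offset from UTC, for example
 * "13:00:00+01:00". offset_min is the number of minutes ahead of UTC
 * (negative when behind); an offset of 0 is written as "Z".
 * Hypothetical helper: the caller is expected to pass valid fields.
 * Returns the value of snprintf.
 */
static int format_iso_time(char *buf, size_t len,
                           int hh, int mm, int ss, int offset_min)
{
    if (offset_min == 0)
        return snprintf(buf, len, "%02d:%02d:%02dZ", hh, mm, ss);
    return snprintf(buf, len, "%02d:%02d:%02d%c%02d:%02d", hh, mm, ss,
                    offset_min < 0 ? '-' : '+',
                    abs(offset_min) / 60, abs(offset_min) % 60);
}
```

Called with 13:00:00 and an offset of 60 minutes it produces "13:00:00+01:00"; with 07:00:00 and an offset of -300 minutes it produces "07:00:00-05:00".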

ISO 10646 - Unicode™ character set

The standard ASCII code, also known as ISO 646, maps only 128 characters with a 7-bit encoding, and has proved clearly insufficient to represent even a very limited range of languages.

Many extensions of ASCII have been introduced, such as a rich set of OEM code pages, the ISO 8859 character sets, and multi-byte encodings like the Japanese Shift JIS. These extensions all differ and conflict with one another, so sharing text encoded with them around the world requires conversion procedures of varying difficulty and accuracy.

Two standards that are identical in practice, ISO 10646 and Unicode, solve this problem by providing a very large character set: the Unicode code space has room for more than one million characters (code points U+0000 through U+10FFFF). For more information, see the FD32 Unicode Support Library documentation and the Unicode Consortium web site.