Disk sector

From Wikipedia, the free encyclopedia

Figure 1: Disk structures:
  (A) Track
  (B) Geometrical sector
  (C) Disk sector
  (D) Cluster

In computer disk storage, a sector is a subdivision of a track on a magnetic disk or optical disc. For most disks, each sector stores a fixed amount of user-accessible data, traditionally 512 bytes for hard disk drives (HDDs) and 2048 bytes for CD-ROMs, DVD-ROMs and BD-ROMs.[1] Newer HDDs and SSDs use 4096-byte (4 KiB) sectors, a format known as Advanced Format (AF).

The sector is the minimum storage unit of a hard drive.[2] Most disk partitioning schemes are designed to have files occupy an integral number of sectors regardless of the file's actual size. Files that do not fill a whole sector will have the remainder of their last sector filled with zeroes. In practice, operating systems typically operate on blocks of data, which may span multiple sectors.[3]
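The rounding described above can be illustrated with a little arithmetic. The following Python sketch is only an illustration: the function names are hypothetical, and the 512-byte default is simply the traditional HDD sector size mentioned earlier.

```python
def sectors_needed(file_size: int, sector_size: int = 512) -> int:
    """Return the number of whole sectors needed to hold file_size bytes."""
    if file_size < 0 or sector_size <= 0:
        raise ValueError("file_size must be >= 0 and sector_size must be > 0")
    # Ceiling division: a partially used last sector still counts as one sector.
    return -(-file_size // sector_size)


def padding_bytes(file_size: int, sector_size: int = 512) -> int:
    """Return the number of unused bytes left in the file's last sector."""
    return sectors_needed(file_size, sector_size) * sector_size - file_size
```

For example, a 1000-byte file on 512-byte sectors occupies two sectors and leaves 24 bytes of the second sector unused.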

Geometrically, the word sector means a portion of a disk between a center, two radii and a corresponding arc (see Figure 1, item B), which is shaped like a slice of a pie. Thus, the disk sector (Figure 1, item C) refers to the intersection of a track and geometrical sector.

In modern disk drives, each physical sector is made up of two basic parts, the sector header area (typically called "ID") and the data area. The sector header contains information used by the drive and controller; this information includes sync bytes, address identification, flaw flag and error detection and correction information. The header may also include an alternate address to be used if the data area is undependable. The address identification is used to ensure that the mechanics of the drive have positioned the read/write head over the correct location. The data area contains the sync bytes, user data and an error-correcting code (ECC) that is used to check and possibly correct errors that may have been introduced into the data.

History

The first disk drive, the 1957 IBM 350 disk storage, had ten 100-character sectors per track; each character was six bits and included a parity bit. The number of sectors per track was identical on all recording surfaces. There was no recorded identifier field (ID) associated with each sector.[4]

The 1961 IBM 1301 disk storage introduced variable-length sectors, termed records or physical records by IBM, and added to each record a record address field separate from the data in a record.[5][6] All modern disk drives have sector address fields, called ID fields, separate from the data in a sector.

Also in 1961, Bryant introduced the concept of zoned recording (ZBR) with its 4000 series. ZBR allows the number of sectors per track to vary as a function of the track's diameter, so that there are more sectors on an outer track than on an inner track.[7] In the late 1980s, ZBR was used again in disk drives announced by Imprimis and Quantum,[8] and by 1997 its industry usage was ubiquitous.[9]

The disk drives and other DASDs announced with the IBM System/360 in 1964 used self-formatting variable-length sectors, termed records or physical records by IBM. They detected errors in all fields of their records with a cyclic redundancy check (CRC), replacing the per-character parity detection of prior generations. These IBM physical records have three basic parts: a Count field, which acts as an ID field; an optional Key field to aid in searching for data; and a Data field. In practice, most records had no Key field, indicated by a key length of zero. The structure of these three fields is called the CKD track format for a record.

The 1970 IBM 3330 disk storage replaced the CRC on the data field of each record with an error correcting code (ECC) to improve data integrity by detecting most errors and allowing correction of many errors.[10] Ultimately all fields of disk sectors had ECCs.

Prior to the 1980s, there was little standardization of sector sizes; disk drives had a maximum number of bits per track and various system manufacturers subdivided the track into different sector sizes to suit their OSes and applications. The popularity of the PC beginning in the 1980s and the advent of the IDE interface in the late 1980s led to a 512-byte sector becoming an industry standard sector size for HDDs and similar storage devices.[11]

In the 1970s, IBM added fixed-block architecture Direct Access Storage Devices (FBA DASDs) to its line of CKD DASDs. CKD DASDs supported multiple variable-length sectors, while the IBM FBA DASDs supported sector sizes of 512, 1024, 2048, or 4096 bytes.

In 2000, the industry trade organization International Disk Drive Equipment and Materials Association (IDEMA) started work to define the implementation and standards that would govern sector size formats exceeding 512 bytes, to accommodate future increases in data storage capacities.[11] By the end of 2007, in anticipation of a future IDEMA standard, Samsung and Toshiba began shipments of 1.8-inch hard disk drives with 4096-byte sectors. In 2010, IDEMA completed the Advanced Format standard for drives with 4096-byte sectors,[11] setting the date for the transition from 512-byte to 4096-byte sectors as January 2011 for all manufacturers,[12] and Advanced Format drives soon became prevalent.

Sectors versus blocks

While sector specifically means the physical disk area, the term block has been used loosely to refer to a small chunk of data. Block has multiple meanings depending on the context. In the context of data storage, a filesystem block is an abstraction over disk sectors possibly encompassing multiple sectors. In other contexts, it may be a unit of a data stream or a unit of operation for a utility.[13] For example, the Unix program dd allows one to set the block size to be used during execution with the parameter bs=bytes. This specifies the size of the chunks of data as delivered by dd, and is unrelated to sectors or filesystem blocks.

In Linux, disk sector size can be determined with sudo fdisk -l | grep "Sector size" and block size can be determined with sudo blockdev --getbsz /dev/sda.[14]
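As an alternative to the commands above, Linux also exposes sector sizes as text files under sysfs. The sketch below reads them from Python; it is a minimal illustration, not the method the cited source describes. The function name `read_sector_sizes` and its `sysfs_root` parameter are hypothetical, and the device name `sda` is only an example.

```python
from pathlib import Path


def read_sector_sizes(device: str = "sda", sysfs_root: str = "/sys/block") -> dict:
    """Read logical and physical sector sizes (in bytes) for a block device.

    Assumes the Linux sysfs layout <sysfs_root>/<device>/queue/, which holds
    logical_block_size and physical_block_size as plain-text integers.
    """
    queue = Path(sysfs_root) / device / "queue"
    return {
        "logical": int((queue / "logical_block_size").read_text().strip()),
        "physical": int((queue / "physical_block_size").read_text().strip()),
    }
```

Calling `read_sector_sizes("sda")` on a Linux system with a device of that name returns a dictionary with both sizes. On an Advanced Format drive that emulates 512-byte sectors, the two values would be expected to differ.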

Sectors versus clusters

In computer file systems, a cluster (sometimes also called allocation unit or block) is a unit of disk space allocation for files and directories. To reduce the overhead of managing on-disk data structures, the filesystem does not allocate individual disk sectors by default, but contiguous groups of sectors, called clusters.

On a disk that uses 512-byte sectors, a 512-byte cluster contains one sector, whereas a 4-kibibyte (KiB) cluster contains eight sectors.

A cluster is the smallest logical amount of disk space that can be allocated to hold a file. Storing small files on a filesystem with large clusters will therefore waste disk space; such wasted disk space is called slack space. For cluster sizes which are small versus the average file size, the wasted space per file will be statistically about half of the cluster size; for large cluster sizes, the wasted space will become greater. However, a larger cluster size reduces bookkeeping overhead and fragmentation, which may improve reading and writing speed overall. Typical cluster sizes range from 1 sector (512 B) to 128 sectors (64 KiB).
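The trade-off between cluster size and slack space can be made concrete with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, the file sizes in the usage note are made up, and it assumes that an empty file occupies no cluster, which may not hold for every file system.

```python
def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Return the unused bytes in the last cluster allocated to one file."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    remainder = file_size % cluster_size
    # A file that exactly fills its clusters (or is empty, by assumption)
    # wastes nothing; otherwise the rest of the last cluster is slack.
    return 0 if remainder == 0 else cluster_size - remainder


def total_slack(file_sizes, cluster_size: int) -> int:
    """Return the total slack space over a collection of file sizes."""
    return sum(slack_bytes(size, cluster_size) for size in file_sizes)
```

For three hypothetical files of 100, 5000 and 70000 bytes, `total_slack` gives 676 bytes with 512-byte clusters, 10916 bytes with 4 KiB clusters, and 187044 bytes with 64 KiB clusters.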

A cluster need not be physically contiguous on the disk; it may span more than one track or, if sector interleaving is used, may even be discontiguous within a track. This should not be confused with fragmentation, as the sectors are still logically contiguous.

A "lost cluster" occurs when a file is deleted from the directory listing, but the File Allocation Table (FAT) still shows the clusters allocated to the file.[15]

The term cluster was changed to allocation unit in DOS 4.0. However, the term cluster is still widely used.[16]

Zone bit recording

If a sector is defined as the intersection between a geometrical sector and a track, as was the case with early hard drives and most floppy disks, the sectors towards the outside of the disk are physically longer than those nearer the spindle. Because each sector still contains the same number of bytes, the outer sectors have lower bit density than the inner ones, which is an inefficient use of the magnetic surface. The solution is zone bit recording (also called zoned bit recording), wherein the disk is divided into zones, each encompassing a small number of contiguous tracks. Each zone is then divided into sectors such that each sector has a similar physical size. Because outer zones have a greater circumference than inner zones, they are allocated more sectors.[17]

A consequence of zone bit recording is that contiguous reads and writes are noticeably faster on outer tracks (corresponding to lower block addresses) than on inner tracks, as more bits pass under the head with each rotation; this difference can be 25% or more.
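The speed difference follows directly from the number of sectors that pass under the head per rotation. The Python sketch below estimates a sequential transfer rate from track geometry. It is a simplified model that ignores seek time, head switches and per-sector overhead, and the function name and the figures in the usage note are hypothetical, not taken from any real drive.

```python
def sequential_rate(sectors_per_track: int, sector_size: int, rpm: float) -> float:
    """Estimate the sequential transfer rate in bytes per second.

    Simplified model: one full track of user data passes under the head
    per rotation, with no seek, head-switch or formatting overhead.
    """
    rotations_per_second = rpm / 60
    return sectors_per_track * sector_size * rotations_per_second
```

With a hypothetical 7200 rpm drive using 512-byte sectors, an outer zone of 1000 sectors per track yields 61,440,000 bytes per second and an inner zone of 800 sectors per track yields 49,152,000 bytes per second, so the outer zone is 25% faster in this model.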

Advanced Format

In 1998, the traditional 512-byte sector size was identified as one impediment to increasing capacity, which at that time was growing at a rate exceeding Moore's Law. Increasing the length of the data field through the implementation of Advanced Format using 4096-byte sectors removed this impediment; it increased the efficiency of the data surface area by five to thirteen percent while increasing the strength of the ECC, which in turn allowed higher capacity. The format was standardized by an industry consortium in 2005 and by 2011 was incorporated in all new products of all hard drive manufacturers.

References

  1. "UDF - OSDev Wiki". wiki.osdev.org. Retrieved 2024-09-01.
  2. Hamington, Suzie (2004-01-01). Computer Science. Lotus Press. p. 42. ISBN 9788189093242.
  3. Tucker, Allen B. (2004-06-28). Computer Science Handbook, Second Edition. CRC Press. p. 86. ISBN 9780203494455.
  4. 305 RAMAC Random Access Method of Accounting and Control Manual of Operation (PDF). IBM. 1957.
  5. IBM 1301, Models 1 and 2, Disk Storage and IBM 1302, Models 1 and 2, Disk Storage with IBM 7090, 7094, and 7094 II Data Processing Systems (PDF). IBM. A22-6785.
  6. IBM 1301, Models 1 and 2, Disk Storage and IBM 1302, Models 1 and 2, Disk Storage with IBM 1410 and 7010 Data Processing Systems (PDF). IBM. A22-6788.
  7. Technical Data - Series 4000 Disk File (PDF). Bryant Computer Products. 1963.
  8. Porter, James (October 1988). "Rigid Magnetic Disk Drive Specifications". 1988 DISK/TREND REPORT, RIGID DISK DRIVES. DISK/TREND, Inc. pp. 63, 122.
  9. Porter, James (June 1997). "Rigid Magnetic Disk Drive Specifications". 1997 DISK/TREND REPORT, RIGID DISK DRIVES. DISK/TREND, Inc.
  10. Reference Manual for IBM 3330 Series Disk Storage (PDF). IBM. March 1974. GA26-1615-3.
  11. "The Advent of Advanced Format". IDEMA. Retrieved 2013-11-18.
  12. Skinner, Heather (29 June 2010). "IDEMA launches 'Are you ready?' campaign to prepare industry for Hard Disk Drive sector format change" (PDF). www.idema.org. Archived from the original on 14 December 2020. Retrieved 14 December 2020.
  13. "Difference between block size and cluster size". unix.stackexchange.com. Retrieved 2015-12-13.
  14. "Disk Sector and Block Allocation For File". stackoverflow.com. Retrieved 2015-12-13.
  15. "Errors Caused by Cross-Linked Files or Lost Clusters". Archived from the original on 2015-03-06. Retrieved 2020-08-03.
  16. Mueller, Scott (2002). Upgrading and repairing PCs, p. 1354. ISBN 0-7897-2745-5.
  17. Kern Wong (January 1989), DP8459 Zoned Bit Recording (PDF), National Semiconductor, archived from the original (PDF) on 2011-06-15, retrieved 2010-03-10