DBMS Topics
File Organization
Last Updated : 21 May, 2026
File organization refers to how data records are physically stored on disk. The choice of file organization affects the speed of data retrieval, insertion, deletion, and
Overview
File organization refers to how data records are physically stored on disk. The choice of file organization affects the speed of data retrieval, insertion, deletion, and update operations.
Basic Storage Concepts
Disk Storage Hierarchy
┌────────────────────────────────────────────────────┐
│ Database File │
│ ├── Segment (tablespace) │
│ │ ├── Extent (group of pages) │
│ │ │ ├── Page / Block (8KB–16KB) │
│ │ │ │ └── Records (rows/tuples) │
│ │ │ │ └── Fields (columns/attributes)│
└────────────────────────────────────────────────────┘
Disk I/O is the bottleneck
Reading one page from disk ≈ 5–10ms
Reading from RAM ≈ nanoseconds
Goal: Minimize disk I/O for all operations
1. Heap File (Unordered File)
Records are stored in no particular order — new records are appended wherever space is available.
| │ Page 1 | [R3][R7][R1][R9] │ |
| │ Page 2 | [R5][R2][R8][R4] │ |
| │ Page 3 | [R6][R10][ empty ][ empty ] │ |
| Operation | Cost |
|---|---|
| Insert | O(1) — append to last page |
| Search (no index) | O(N) — full scan |
| Delete | Mark as deleted; space reclaimed later |
| Update | Find record + modify |
Use case: Bulk loading data, temporary tables, when full-scan queries are common.
2. Sequential (Ordered) File
Records stored in sorted order based on a key attribute (the ordering field/key).
| │ Page 1 | [E101][E102][E103][E104] │ |
| │ Page 2 | [E105][E106][E107][E108] │ |
| │ Page 3 | [E109][E110] │ |
| Operation | Cost |
|---|---|
| Search by ordering key | O(log N) — binary search |
| Insert | O(N) — must maintain order; expensive |
| Range queries | Efficient — physically contiguous |
Use case: Range queries, reports, when sorted access is common.
3. Hashed File
Records stored based on a hash function applied to a key attribute. The hash function maps a key value to a bucket (page).
| Hash Function | bucket = hash(EmpID) % 4 |
| Bucket 0 | [E104][E108] |
| Bucket 1 | [E101][E105][E109] |
| Bucket 2 | [E102][E106] |
| Bucket 3 | [E103][E107][E110] |
| Operation | Cost |
|---|---|
| Search by hash key | O(1) — compute hash, go to bucket |
| Insert | O(1) — compute hash, insert in bucket |
| Range queries | Poor — records not in order |
Use case: Equality searches (WHERE emp_id = 105).
4. Clustered (Indexed Sequential) File
Records stored in sorted order AND an index (usually B+ tree) is built on the ordering key. This is how most RDBMS primary keys work.
| Operation | Cost |
|---|---|
| Search by key | O(log N) via index |
| Range query | Excellent — sorted + indexed |
| Insert | Moderate — may need page splits |
Use case: Most production OLTP tables; primary key access.
Variable-Length Records
Some records have variable-length fields (VARCHAR, TEXT, BLOB):
| Fixed-length | | EmpID(4) | Name(50) | Salary(8) | ← easy to navigate |
| Variable-length | | EmpID(4) | Name(??) | Salary(8) | ← need extra info |
| 1. Slotted Page | page header has array of (offset, length) per record |
| 2. Null bitmap | track which nullable fields contain null |
| 3. Overflow pages | long fields stored in separate overflow pages |
Record Formats
Fixed-Length Records
| EmpID | Name (padded to 50 chars) | Salary |
|---|---|---|
| (4 bytes) | (50 bytes) | (8 bytes) |
Comparison of File Organizations
| Organization | Insert | Search | Delete | Range Query |
|---|---|---|---|---|
| Heap | Fast | O(N) | Easy | Poor |
| Sequential | Slow | O(log N) | Moderate | Excellent |
| Hashed | Fast | O(1) | Fast | Poor |
| Clustered | Moderate | O(log N) | Moderate | Excellent |
Exam Focus
Revise definitions, diagrams, examples, and short-answer points for File Organization.
Interview Use
Prepare one clear explanation, one practical example, and one common mistake for this DBMS topic.
Search Terms
dbms, database management system, database notes, sql, unit, file, organization, file organization
Related DBMS Topics