In data structure, file is a collection of records one after another and having fixed length.
- Insertion – At the end;
- Deletion – Anywhere;
- Search – Record by record;
In Data Structure it is called as heap, in OS it is called as sequential file. OS visualizes file as a series of blocks as it is not concerned about the contents of the file. A block is generally of the size of sector defined on disk. Disk can be hard sector (predefined sectors generally 8) or soft sector (user defined sectors which we use). OS does everything (deleting or appending records) in terms of blocks.
If we consider a disk with 8 sectors then every track will have sectors numbered from 1 to 8. Contents of the file (block) is defines using track number - sector number combination. FAT maintains this entry indicating the ith block (bi) is stored at track n and sector j.
Space to a file cannot be allocated in advance as it may grow beyond expectations. Hence every time when a block is required OS looks for empty block (a mark is associated with the block indicating its availability). When an available block is encountered then it is given to the file, an entry is made in the FAT and the block’s entry is deleted from available list; and finally file contents are appended.
If the FAT is too long then locating a block with desired data will be an exhaustive search. Hence in UNIX, the first 10 entries of DAT are direct addresses (directly pointing to data blocks). The 11th block will store the address of the block holding addresses of other data blocks. The 12th block will point to a block, which is pointing to other blocks holding addresses of data blocks.
Direct
|
Single Indirect
|
Double Indirect
|
Triple Indirect
|
10
|
500
|
500 X 500 = 5002
|
500 X 500 X 500 =
5003
|
Note: The number 500 will differ from OS to OS, installation to installation, machine to machine. It is size of block/size of each address.
This system supports small size files and at the same time it can support large files too. It provides quick access to small files and at the same time supports big files with little reduced efficiency. The table with 10+3 entries has to be memory resident. The direct addressing blocks can be configured if average file size is smaller or larger than predefined 10 direct blocks. A 4th level of indirection can also be added using 14th block. Levels of indirection are added in incremental manner.
Hence the advantages of UNIX FAT over traditional FAT are
- FAT need to be memory resident and huge FAT can’t be accommodated.
- Searching through big size FAT – As it won’t fit on memory so you need to perform multiple disk accesses and at the end you may not be having the address you were looking for. We can’t spend too much time on just locating the address of a block.
Disk Access & Block Organization
As head passes through blocks, contents are read, loaded into disk buffer and then transferred to RAM. Once the disk buffer is full, it needs some time to get flushed before other data can be placed in it.
Track no. is the thing that expects physical movement of head which occurs in a jumpy manner. The track no. access requests need to be optimized with the intention of minimizing head movement on disk. Hence different disk scheduling techniques are used. As the disk scheduling algorithms optimize the head movement therefore a request placed fists may be fulfilled later.
Record Number -> Block Number -> (Track Number, Sector Number) -> Disk
If records are of fixed size then a fixed n records can be placed on a block. Never ever split record across blocks. Now translate record no. to block no. using simple calculation. E.g. if 2 records are on each block then 9th record will be on 4th block.
- B Tree: Key Value -> Record id (rid) -> Block id (bid) -> (Track number, Sector Number)
- Hashing: h (Key Value) - > Record id (rid) -> Block id (bid) -> (Track number, Sector Number)
No comments:
Post a Comment
Your comments are very much valuable for us. Thanks for giving your precious time.