E-Format discs
From: pcolmer@acorn.co.uk (Philip Colmer) Date: 26 Oct 90 12:55:54 GMT Time to put this one to rest, methinks ... When you format a hard drive, be it an ADFS drive or a SCSI drive, you can choose between "Old map" and "New map" formats. To see which format your hard drive is in, type
*ADFS:Map :4 If the output says "new map", you've got a new map format. Some facts about the new map format:
The larger the drive, the higher the LFAU needs to be in order for the drive to be able to be formatted. The following is a table of disc sizes versus possible LFAUs:
Capacity LFAU "Size" of a directory (approx) ======== ==== ============================== Up to 124Mbytes 256 4K Up to 249Mbytes 512 8K Up to 499Mbytes 1024 16K Up to 999Mbytes 2048 32K The "up to" values are inclusive. You can, of course, specify a large LFAU, but the side effect is that the larger the LFAU, the more disc space a directory takes, hence the "wasted disc space". This arrangement, therefore, allows files up to the LFAU to be stored on the disc without taking up any more disc space, and without fragmenting the disc too much.
Hope this clears everything up. --Fil.
From: nick@abccam.abcl.co.uk (Nick Reeves) Summary: E FORMAT DESIGN DOCUMENT Date: 26 Oct 90 16:53:50 GMT Organization: Active Book Company Limited, Cambridge, UK Briefly: The large directory size is the price you pay for a disc structure which almost always keeps files contiguously, but can use fragments if it has to. The rest of the space is not always wasted as small files in the directory can use it, and only have their length rounded up to the sector size rather than the cluster size. They are also quick to access as they are close to the directory and don't need to update the free space map.
NEW DISC FILE STRUCTURE FOR RISC OSMOTIVATIONThe aim of the new file structure is firstly to remove the following restrictions of the original file structure.
STRUCTURE OVERVIEWThe disc structure is made up of the following parts
NEW MAP STRUCTUREMost of the map is bitstream, with each bit in the map being associated with the allocation of a fixed number of bytes of disc space. This size (called the bit size) is always a power of two. The cluster size is the maximum of the bit size and the sector size. The map is divided into zones by the natural sector boundaries. Floppies are limited to one zone in the present implementation. The structure of a zone is as follows
start byte byte length use 0 1 this is a checksum on this zone calculated as below 1 2 offset in bits to first free space in zone, or 0 if none, with top bit always set 3 1 this byte when EORed with values for other zones should give FF 4 60 (only zone 0) the first 32 bytes are a disc record describing the disc, the other 28 bytes are 0, being reserved. The rest of the zone is the bitstream. All entries in the bitstream have the same format. The start of an entry is an id field (of width given in the disc record), the end is marked by a 1, and any other bits are set to 0. The bitstream contains the following three types of object with their respective interpretation of the id field.
OBJECT TYPE ID VALUE a) (fragments of) files file number (minimum value 2) b) unallocated fragments bit offset of next free fragment in zone or 0 c) disc defects always 1 When searching for the fragments of a file zone containing the first fragment can be calculated from zone = file number DIV ids per zone Where ids per zone = map bits in zone DIV (width of id field + 1) An exception is the file number 2. This is reserved for the structure which is placed on an empty disc (boot block, map and root directory). Searching for file number 2 should start at zone = total zones DIV 2. This is to allow the root directory and map of hard disc to be placed near the middle. The rest of the file's fragments (if any) can be found by searching through the zones in ascending order, wrapping back to zone 0 if necessary, looking for fragments with the correct file number. However because of the allocation strategies given below it is rare for a file to be in more than one piece unless it has been extended or is very large. The need for an id field places a minimum size on a fragment. For example possible fragment sizes are 2K, 3K, 4K, 5K ... for an 800K floppy, or 12K, 13K, 14K, 15K ... for a 20 Mb hard disc with a ten zone 2.5K map. Thus a compact map representation (down to 1 bit per cluster if necessary) has been achieved by making it impossible to represent inefficient allocations, (ie those with small fragments). In order to avoid wasting disc space for small files, when a file or directory does not use the whole of its allocated fragment and the fragment is too small too split sharing is allowed. Directories can share fragments with their sub files and files in the same directory can share a fragment. Sharing is at sector rather than cluster resolution, saving disc space. This also improves performance, firstly because head movement is reduced since small files are closer to each other and their parent directory on average, and secondly the map does not always need to be modified for small files. The SIN of a file is a 24 bit value. Bits 0-7 are the sector offset+1 in the fragment if the file may share a fragment or 0 if not. Bits 8-23 are the file number. ALLOCATION STRATEGIESDirectories are allocated from the middle zone outwards and files produced by whole file operations are allocated from the zone containing their parent outwards. For hard discs as well as keeping files in the same directory close to each other and their parent directory, it also tends to keep small files near the middle of the disc and large files further out. Both these reduce head movement. When choosing disc space on openout it is put at the start of a zone which balances distance from parent directory with amount of free space in the zone.As all allocation is indirected through the map it is possible to re-allocate to reduce fragmentation without having to read or write directories and a zone compaction routine is available for use by the allocation routines. This is highly optimised both in the choice of moves (eg it can spot fragments of the same file that can be joined) and in the execution of the moves. It builds up lists of moves which it cancels, combines, joins, splits, collects together in groups to be done together, and sorts into an order that reduces head movement for the scatter read/write primitives of the device drivers. Small compactions happen in response to particular allocation needs and opportunities rather than compacting a whole zone or disc at once so it not usually apparent to the user. Files produced by whole file operations (eg SAVE) tend to be longer lived than those produced by partial file operations (eg OPENOUT, GBPB). So if they cannot be allocated a single extent compaction continues until a free space of the correct size or data totalling twice the length of the file has been moved. If compaction fails the file is allocated either a pair of extents which totals the correct size or a set of adjacent free spaces (possibly with an extent removed or added to give a better fit). The only partial file operations that do compaction are open and close, and then only if a very good opportunity is found. As files produced by whole file ops are almost always allocated a single extent, and long lived files produced by partial file ops tend to re-assemble their fragments as compactions happen, seeking between extents of a file is greatly reduced.
MINOR DETAILS ------------- The checksum on a zone is calculated/checked as below ;entry ; R0 -> start ; R1 zone length ;exit ; LR check byte, Z=0 <=> good NewCheck ROUT Push "R1,R2,LR" MOV LR, #0 ADDS R1, R1, R0 ;C=0 loop LDR R2, [R1, #-4] ! ADCS LR, LR, R2 TEQS R1, R0 ;preserves C BNE loop AND R2, R2, #&FF ;ignore old sum SUB LR, LR, R2 EOR LR, LR, LR, LSR #16 EOR LR, LR, LR, LSR #8 AND LR, LR, #&FF CMPS R2, LR Pull "R1,R2,PC" The checksum on directories remains the same. But the validation string for a directory has changed from 'Hugo' to 'Nick'
; Directory Start ^ 0 StartMasSeq # 1 StartName # 4 DirFirstEntry # 0 ; Old Directory End ^ 0 # -1 DirCheckByte # 0 ;RETRO DEFINITION was reserved # -4 EndName # 0 # -1 EndMasSeq # 0 # -14 ;reserved DirTitleSz * 19 # -DirTitleSz OldDirTitle # 0 # -3 OldDirParent # 0 # -NameLen OldDirName # 0 # -1 OldDirLastMark # 0 ;dummy last entry marker ; New Directory End ^ 0 # -1 ASSERT DirCheckByte=@ # -4 ASSERT EndName=@ # -1 ASSERT EndMasSeq=@ # -NameLen NewDirName # 0 # -DirTitleSz NewDirTitle # 0 # -3 NewDirParent # 0 # -1 ;reserved # -1 ;reserved # -1 NewDirLastMark # 0 ;dummy last entry marker ; ================ ; TestDirCheckByte ; ================ ; entry ; R3 ind disc add of dir ; R5 -> dir start ; R6 -> dir end ; exit ; LR check byte ; Z clear if matches existing byte TestDirCheckByte Push "R0-R2,R5,R7,LR" BL EndDirEntries ;(R3,R5,R6->R0) BL TestDir ;(R3->LR,Z) ADDEQ R7, R6, #NewDirLastMark+1 ;dir tail ADDNE R7, R6, #OldDirLastMark+1 05 BIC R1, R0, #3 MOV R2, #0 10 ;whole words before end of entries LDR LR, [R5],#4 EOR R2, LR, R2,ROR #13 TEQS R5, R1 BNE %BT10 20 ;odd bytes before end of entries LDRNEB LR, [R5], #1 ;not first pass through loop EORNE R2, LR, R2,ROR #13 TEQS R5, R0 BNE %BT20 MOV R5, R7 30 ;odd bytes before at start of tail TSTS R5, #3 LDRNEB LR, [R5],#1 EORNE R2, LR, R2, ROR #13 BNE %BT30 ASSERT DirCheckByte=-1 ;dont include last word as it contains SUB R1, R6, #4 ;the check byte 40 ;whole words in tail LDR LR, [R5],#4 EOR R2, LR, R2,ROR #13 TEQS R5, R1 BNE %BT40 EOR R2, R2, R2, LSR #16 ;reduce to 8 bits EOR R2, R2, R2, LSR #8 AND R2, R2, #&FF LDRB LR, [R6,#DirCheckByte] ;compare with old check byte TEQS R2, LR MOV LR, R2 Pull "R0-R2,R5,R7,PC" ; ============= ; EndDirEntries ; ============= ; Find the end of the list of entries in a dir ; entry ; R3 ind disc add of dir ; R5 -> dir start ; R6 -> dir end ; exit ; R0 -> first empty entry EndDirEntries ROUT Push "R2,LR" BL TestDir ;(R3->LR,Z) ADDEQ R2, R6, #NewDirLastMark ADDNE R2, R6, #OldDirLastMark ADD R0, R5, #DirFirstEntry SUB R0, R0, #DirEntrySz 10 ;loop examining entries LDRB LR, [R0,#DirEntrySz] ! CMPS LR, #0 ;until null entry CMPNES R0, R2 ;or exhausted BLO %BT10 MOVHI R0, R2 Pull "R2,PC",,^ poppy@poppyfields.net |