CMSC 412 Project #5
File System (Part 1)
Due Tuesday, November 29, at 6:00pm
Introduction
The purpose of this project is to add a new filesystem to GeekOS, as well as
the standard operations for file management
GOSFS
- GeekOS FileSystem
The main part of this project is to develop a new filesystem for the GeekOS. This filesystem will reside on the second IDE disk drive in the QEMU
emulator. This will allow you to continue to use your existing PFAT drive to
load user programs while you test your filesystem.
The second IDE disk's image is called diskd.img.
GOSFS will provide a filesystem
that includes multiple directories and long file name support.
The Mount system call allows you to associate a
filesystem with a place in the file name hierarchy. The
Mount call is implemented as part of the VFS code we supply.
Then you can mount the GOSFS
file system on drive 1 onto /d, for
instance.
VFS
and file operations
Since GEEKOS will have two types of filesystems (PFAT and GOSFS), it will have a virtual filesystem layer (VFS) to handle sending requests to an
appropriate filesystem (see figure below). We have
provided an implementation of the VFS layer in the file vfs.c. The VFS layer will call the appropriate GOSFS
routines when a file operation refers a file in the GOSFS filesystem.
The System Call layer is already implemented in syscall.c and the PFAT in pfat.c. Thus the only component you need to take care of is
the GOSFS one.
Each user space process will have a file descriptor
table that keeps track of which files that process can currently read and
write. Any user process should be able to have up to 10 files open at once.
The file descriptors for a user process are kept in the files[MAX_OPEN_FILES] array in struct User_Context. Note
that not all the entries in the files are open files, since usually a process has less than
10 files open at once. If the field openFile.fsType
== FS_TYPE_NONE that represents a free slot (file descriptor not used).
But the good news is that file descriptor management is already implemented for
you (see Open() function in vfs.c).
Your filesystem should
support fixed length filenames (at most 64 bytes, including a null at the end
for a file/directory name). A full path to a file will be no more than 1024
characters.
You should keep track of free disk blocks using a bit
vector (as described in class). A library called bitset
is provided (see bitset.h and bitset.c) that manages a set of bits and provides functions to
find bits that are 0 (i.e. correspond to free disk blocks).
All disk allocations will be in units of 4KB (i.e. 8
physical disk blocks). Thus one bit in a bitset
corresponds to a 4KB block. A bitset that is 8192
bits (1024 bytes) large will obviously keep track of
8192 * 4KB = 32 MB of data.
Directory
Structure
See the recitation slides for
details on directory structure. Each directory in GOSFS takes up a single disk
block. The structure of the directory is defined in gosfs.h.
A directory is an array of GOSFSfileNodes (36
elements, since they have to fit in a single 4KB block). Each filenode can represent either a file in the directory or a
subdirectory.
The filenode for a directory
is distinguished by the isDirectory bit. The location of the block that holds the data
for the directory will be stored in the first entry in the blocks array of the directory's filenode
(hence entries blocks[1]..blocks[7] are unused).
Files
Unlike directories, that have
a fixed size of one blocks (irrespective of how many files the hold), files can
take up an arbitrary number of disk blocks. You will use a version of indexed
allocation to represent the data blocks of your filesystem.
The blocks field (GOSFSfileNode, gosfs.h) keeps track of data blocks for a file. The
first eight 4KB-blocks are direct blocks, the ninth points to a single indirect
block, the tenth to a double indirect block. See the recitation slides for a
detailed layout.
New
System Calls
You have to implement the sematics
of the new system calls as described below. As you see, the semantics is very
similar to the UNIX one.
·
All
of these functions vector through the VFS layer before you implement them at
the GOSFS level. So the functions names are all of the form GOSFS_<function>. So the Mount call you implement is GOSFS_Mount in gosfs.c
·
You
can look in pfat.c to see how a complete
implementation of a filesystem using the VFS layer
works. Be sure to look at the use of VFS
functionality such as Allocate_File, which will be
critical to use.
System Call User Function |
Return on success |
Return on failure |
Reasons for failure |
Comment |
SYS_MOUNT Mount(char *dev, char *prefix, char *fstype) |
0 |
-1 |
· a filesystem
already mounted under name · illegal value for one of the parameters |
Your Mount function should not "validate" the filesystem settings except for magic and version fields,
and that block size is support-able (a multiple of 512, or 512/1024/4096 at
least). Other items, e.g., the number and start location of inodes and the total number of blocks, can be arbitrary. |
SYS_OPEN Open(char *name, int permissions) |
new file descriptor number |
-1 |
· name does not exist (if permissions don't include O_CREATE ) · path to name does not exist (if permissions include O_CREATE ) · O_WRITE and O_CREATE not allowed for directories, use CreateDirectory instead |
· there's no create syscall,
so setting O_CREATE will create the file. If the file exists, the call
succeeds (return >= 0) but its data contents is not affected. · Should NOT create directories recursively if needed, e.g. Open("/d/d1/d2/d3/xFile",
O_CREATE), will NOT create d1 inside of
d, d2 inside of d1, etc. if they don't exist already. If the leading path
/d/d1/d2/d3 does not exist, the syscall fails,
returning -1 · The permissions values are flags and may be or'ed together in a call. For example:
|
SYS_OPEN_DIRECTORY Open_Directory(char *name) |
New file descriptor number |
-1 |
· name does not exist · name is not a directory |
|
SYS_CLOSE Close(int
fd) |
0 |
-1 |
· fd not within 0-9 · fd is not an open file |
|
SYS_DELETE Delete(char *name) |
0 |
-1 |
· name does not exist · name is a non-empty directory |
if Delete(file) is called and file is still open in other threads or even in the
thread that called Delete(),
all the subsequent operations on that file (except Close()) should fail |
SYS_READ Read(int
fd, char *buffer, int
length) |
number of bytes read |
-1 |
· fd not within 0-9 · fd is not an open file · fd was not open with O_READ flag |
· it's OK if return value <
length, for instance reading close to
end of file · increase the filePos, if successful There is special behavior when SYS_READ is called on
a directory:
dirEntry is defined in fileio.h |
SYS_READ_ENTRY Read_Entry(int fd,
struct VFS_Dir_Entry *dirent) |
1 |
-1 |
·
fd is not a directory ·
file pointer is at end of directory |
|
SYS_WRITE Write(int
fd, char *buffer, int
length) |
number of bytes written |
-1 |
· fd not within 0-9 · fd is not an open file · fd was not open with O_WRITE flag · fd is a directory |
· increases filePos is successful · "Grow on write"- allocate blocks
"on the fly" if past end of file |
SYS_STAT Stat(char *file, fileStat *stat) |
0 |
-1 |
· file is not found, readable |
|
SYS_FSTAT Stat(int
fd, fileStat *stat) |
0 |
-1 |
· fd not within 0-9 · fd is not an open file |
|
SYS_SEEK Seek(int
fd, int offset) |
0 |
-1 |
· fd not within 0-9 · fd is not an open file · offset > fileSize |
offset is an
absolute position; could be equal to fileSize, then Write appends, see above |
SYS_CREATEDIR CreateDirectory(char *name) |
0 |
-1 |
· name already exists, as file or directory · regular file encountered on the path to name |
Should create directories
recursively if needed, e.g. CreateDirectory("/d/d1/d2/d3/d4"), will create d1 inside of d, d2 inside of d1, etc.
if they don't exist already. This operation should
be atomic, in the sense that either the whole directory chain is created or
no directory is created. |
SYS_FORMAT Format(int
drive) |
0 |
-1 |
· illegal value for drive (it must work with 1, higher is optional) · drive is in use, i.e. mounted |
formats a drive with GOSFS;
don't need to support formatting with PFAT ; don't need to format in init
code; so you can save your data between sessions |
Disk
Layout
A guideline is provided above. First block (0) is
called SUPERBLOCK, and contains filesystem
housekeeping data. Blocks >= 1 contain files and directories.
· The Magic number at the very beginning should be 0xbeebee.
This tells you that the disk has a GOSFS filesystem
on it. If you try to mount a drive and you don't find the magic signature,
return error.
· Root
Dir Pointer holds the block number of the block containing the
root directory.
· Size is the size of the
disk, in 4KB blocks. (32M / 4K = 8K for the example above)
· Free
Blocks Bitmap is : Size bits large, that is Size/8 bytes large. ( 8K / 8 = 1K for the
example above). Every block has an associated bit.
When you do a Format() , you make a raw disk usable with GOSFS.
That is:
Keep in mind that the superblock and root directory have no associated GOSFSfileNode.
Notes
You do not need to consider situations where two processes have the
same file open. You do not need to consider situations where one process opens
the same file twice without closing it in between.
Too allow you to
cache information, the VFS layer includes a Sync function. When the Sync function is called, all changed
state needs to be saved to disk (i.e. the machine can be rebooted after it). You may choose to make all operations
synchronous, in that case sync will be a no-op.
If a read() is called
on a directory, the data returned should be in the form of an array of dirEntry structures. The length argument and the return
value will indicate the number of entries to read and the number of entries
that were read, rather than the number of bytes.
Requirements
· Make sure your Mount() works well, so that we can test your
project. If we cannot Mount() a GOSFS, we cannot
grade your project.
· You might also want to mount "/d" (dee) automatically in
· You should support disk sizes of at least
32 MB. More than 32 MB is optional. Following the procedure described in the
"How to create an arbitrary size big diskd.img" section above,
in your submitted project, when someone types gmake, a 32 MB file should
be created.
· You should support file sizes of at least
5 MB (double indirect threshold crossed, yes). More than 5 MB is optional.
Testing
As you saw at the top, in src/user there are some programs that can be used to test your
file management syscalls: cp,c, ls.c, mkdir.c,
mount.c, p5test.c.