Creating the Filesystems


Article

Scope

This article is part of the series Linux from Scratch. Once the partitions have been created, we need to organize them by giving each of them a filesystem (i.e. "formatting" them).

Author

Herbert Glarner

Published

2008-Jan-08


Introduction

In the previous article (Partitioning the Harddisk) we segmented our harddisk into a total of 5 partitions (and not 6, as the listing might suggest: /dev/sda4 is not a partition in its own right, but a container for logical partitions). For ease of reference, they are listed here once again:

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        1045     8393931   82  Linux swap / Solaris
/dev/sda2   *        1046        3134    16779892+  83  Linux
/dev/sda3            3135        4179     8393962+  83  Linux
/dev/sda4            4180       38913   279000855    5  Extended
/dev/sda5            4180        7312    25165791   83  Linux
/dev/sda6            7313       15667    67111506   83  Linux

Partitioning had the effect that the machine treats the single physical harddisk as if there were 5 storage media. This is fine and precisely what we wanted, but we still won't be able to meaningfully store any data on them, because they lack any internal organization.

An internal organization is required to tell Linux where on the media to look for data, and what information to expect about such data. In other words, an "internal organization" means having data about data (meta-data), and this is provided by a filesystem, which we are going to create in the course of this article.¹

Under Linux, we have the choice between many filesystems.² Among those, there is no "best" or "fastest" filesystem: everything depends on the planned usage. At the time of writing this article, the filesystem named ext3 (third extended filesystem) is the most widespread, and it is very stable. Development of ext4 has begun, but it is still considered experimental. There are filesystems which in some circumstances are a few percent faster than ext3, but in my opinion the advantages of ext3 outweigh this by far.

Each partition requires its own filesystem, and the filesystems can differ from each other. In this article, we are going to create the filesystem ext3 on 3 of our partitions, namely /dev/sda2 (also known as the root partition /), /dev/sda5 (aka /var) and /dev/sda6 (aka /home). Furthermore, we are going to initialize another partition to be used as the swap area, namely /dev/sda1 (aka swap).

Notice that /dev/sda3 is not part of our Linux setup, hence it will not get a filesystem at this point. We deliberately left that gap on the harddisk in anticipation of a future use.

________

1 If you are accustomed to Microsoft Windows: there the process of creating a filesystem is called Formatting.

2 Filesystems under Windows are VFAT and NTFS.


Organization

Apparent View

This section sketches the internal organization of Linux filesystems before we actually create one on our partitions.

The filesystem shows itself to the outside world as a collection of directories¹ and files. Directories can themselves contain directories and files. Directories within a directory are referred to as child directories (or just "children"), whereas the directory in which child directories are contained is called the parent directory (or just the "parent"). Both directories and files are identified by a name. This name need not be unique within the whole device, but it must be unique within its directory, since it is usually via this name that a user accesses the directory or file.

There is a special directory, the root directory, with the name / (just an ordinary slash). Ultimately, via a hierarchy of directories, each file can be traced back to this root directory. For example, /home/herbert/letter77.txt refers to a file named letter77.txt, contained within a directory herbert, which is itself contained in a directory named home, which is a directory directly under the root directory /.
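The way a path leads back to the root directory can be retraced at the prompt. The following is only a small sketch: it works on the path string alone (the file does not need to exist), and the variable names are chosen for illustration.

```shell
# Walk from the example file up to the root directory, one parent at a time.
path=/home/herbert/letter77.txt
chain=$path
while [ "$path" != "/" ]; do
    path=$(dirname "$path")      # strip the last component: go to the parent
    chain="$chain -> $path"
done
echo "$chain"
```

Each pass through the loop moves one level up in the hierarchy, and the loop ends once only / is left.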

Note, however, that under Linux "file name extensions" like .txt (somewhat arbitrarily taken in the above example to indicate a file) have no particular meaning. It is entirely possible to have a directory named letters.txt and a text file named letter77 (i.e. without any extension). The distinction between a directory and a non-directory is made by a variety of other means, which will be discussed in a later article.
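If you want to convince yourself of this, the following sketch creates exactly these two objects in a throwaway directory and asks the filesystem what they are. The temporary directory and the variable names are merely illustrative.

```shell
# Work in a throwaway directory so nothing on the real system is touched.
tmp=$(mktemp -d)

mkdir "$tmp/letters.txt"                   # a directory with a ".txt" name
printf 'Dear reader\n' > "$tmp/letter77"   # a text file without any extension

# The tests -d and -f ask the filesystem for the type; the name plays no role.
if [ -d "$tmp/letters.txt" ]; then kind_a=directory; else kind_a=other; fi
if [ -f "$tmp/letter77" ]; then kind_b=file; else kind_b=other; fi

echo "letters.txt is a $kind_a"
echo "letter77 is a regular $kind_b"

rm -r "$tmp"
```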

Now, this is how the filesystem appears to the outside world. Internally, though, things are slightly more complicated, as there is an additional layer in between the name and the actual data.

Internal Organization

It is important to realize that both a "directory" and a "file" are ultimately nothing but a bit collection, i.e. some bits in a specific order, placed somewhere on the device in question (in our case a partition of a harddisk).

To access such a bit collection, one must know its starting place on the disk. This is where one of the tasks of the filesystem comes into play.

Let's assume for a moment that we have already created a functioning ext3 filesystem. Then it is the filesystem which takes your input, say letter77.txt within the directory /home/herbert, and translates it to the start of the bit collection represented by this file. However, this translation requires not just one lookup (as is the case in Windows, for example), but two of them.

The first lookup translates the directory-file combination (the whole /home/herbert/letter77.txt) into a number, say 1608. This number is important in all Unix-like environments. Because it is so important, it was given a name of its own, and that name you will encounter often: it is called the inode number. (It is uncertain what the "i" in inode stands for. Some translate inode as "information node", but according to one of the original Unix pioneers this is not really correct.²)

Having this inode number (1608, as per the example), another lookup occurs, which eventually translates the request to the correct starting address of the bit collection, if the details are met.

These details are quite important. For example, they regulate access rights. In other words, it is checked whether you have legitimate access for the intended usage (read, write, execute) to the bit collection internally identified by the inode number 1608. Only if you do will the start of the bit collection be accessed and, depending on the intended usage, its content be given back, replaced with new content, or executed, respectively.
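Both pieces of information can be made visible on any existing Linux system. The sketch below uses a temporary file as a stand-in for letter77.txt; the inode number you get will of course differ from the 1608 used in the text.

```shell
tmp=$(mktemp -d)
printf 'Hello\n' > "$tmp/letter77.txt"

# First lookup: the directory entry maps the name to an inode number.
inode=$(ls -i "$tmp/letter77.txt" | awk '{print $1}')
echo "inode number: $inode"

# Second lookup: the inode holds the details, among them the access rights,
# which ls -l shows in its first column (e.g. -rw-r--r--).
ls -l "$tmp/letter77.txt"

rm -r "$tmp"
```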

This additional layer, on which all bit collections are managed by just a number, has some advantages. For instance, it is possible to have two different file names (possibly in different directories) which are both translated to the very same number. Because one number within one filesystem uniquely identifies a specific bit collection, both (and possibly many more) "references" ultimately identify the very same file (and its access rights and more). This situation, in which a file name references an already existing bit collection (i.e. without duplicating the bit collection's data), is called a link. More precisely, we have a hard link here.
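A hard link is easily tried out with the command ln. The following sketch assumes a throwaway directory with two subdirectories a and b; all names are made up for the illustration.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"
printf 'one bit collection\n' > "$tmp/a/letter77.txt"

# Create a second name, in another directory, for the same inode (a hard link).
ln "$tmp/a/letter77.txt" "$tmp/b/same-letter.txt"

inode_a=$(ls -i "$tmp/a/letter77.txt" | awk '{print $1}')
inode_b=$(ls -i "$tmp/b/same-letter.txt" | awk '{print $1}')
echo "inode of first name:  $inode_a"
echo "inode of second name: $inode_b"

# Writing through one name is visible through the other: there is only one file.
printf 'appended via the second name\n' >> "$tmp/b/same-letter.txt"
lines=$(wc -l < "$tmp/a/letter77.txt" | tr -d ' ')
echo "lines seen via the first name: $lines"

rm -r "$tmp"
```

Both names report the same inode number, and the line appended via the second name shows up when reading via the first.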

More Files

Under Linux practically everything appears to be realized via the filesystem depicted above. Harddisks, harddisk partitions, input and output devices, networking cards, even the machine's RAM — all these devices are integrated the same way into a special "directory" named /dev (see the initial table on this page).

However, since there is no filesystem on our harddisk yet, it is apparent that "directories" like /dev are not truly realized as files.
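On a running system you can ask what kind of object an entry in /dev is. The sketch below assumes that /dev/null and /dev/zero exist, which is the case on practically every Linux system; the helper variable null_kind is only there for illustration.

```shell
# -c tests for a character device, -b for a block device (e.g. a partition).
for node in /dev/null /dev/zero; do
    if [ -c "$node" ]; then
        echo "$node: character device"
    elif [ -b "$node" ]; then
        echo "$node: block device"
    else
        echo "$node: something else"
    fi
done

if [ -c /dev/null ]; then null_kind=char; else null_kind=other; fi
```

A partition such as /dev/sda2 would be reported as a block device by the same test.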

________

1 Folders in Windows parlance.

2 "In truth, I don't know either. It was just a term that we started to use. "Index" is my best guess, because of the slightly unusual file system structure that stored the access information of files as a flat array on the disk, with all the hierarchical directory information living aside from this. Thus the i-number is an index in this array, the i-node is the selected element of the array. (The "i-" notation was used in the 1st edition manual; its hyphen became gradually dropped)."—Dennis Ritchie


Implementation

Data Partitions

In this section, we are going to install the filesystem ext3 into the harddisk's partitions /dev/sda2, /dev/sda5 and /dev/sda6. The swap partition /dev/sda1 does not need a filesystem, as we will see later. Also, /dev/sda4 is just a container and needs no filesystem either. Finally, /dev/sda3 is left out intentionally, as it will not be a part of the Linux system.

The program to create the filesystem ext3 is mke2fs (make second extended filesystem). Note that despite its name, the command is also responsible for the third extended filesystem. mke2fs has quite a few options, of which we will use only some:

If we pass the option -c, then the partition is checked for hardware errors. If we pass that option twice (-cc), then an extended, quite time-consuming and destructive read-and-write test is performed. "Destructive" refers to the fact that each byte within the partition is overwritten, thus destroying all data on the partition. If you have followed this article series from scratch, this need not worry you, because the harddisk is empty anyway, and in this case it is strongly recommended to run that (very time-consuming) test. However, if you jumped directly to this page, be warned: all the data on the partition will be lost if you execute this program as indicated.

The option -j activates the journaling capability. It is mainly this feature which makes the filesystem an ext3 filesystem: without it, it would be an ext2 filesystem.

The option -v stands for "verbose" and gives us feedback about the progress of the filesystem creation. You are advised to enable this option, simply because of the duration of the program: this way you know that "it's still doing something meaningful".

As the last option we provide -L volumelabel, which we can use to name the partition. If you follow the article series, you are advised to name the partitions 2, 5 and 6 Root, Var and Home, respectively.

After all these options, we must tell the program on which partition we want to install the filesystem. Let's start with creating a filesystem on the smallest of the 3 partitions, namely /dev/sda2:

root [ ~ ]# mke2fs -ccjvL Root /dev/sda2

A quite verbose output stating the most important details follows almost instantly:

mke2fs 1.40.2 (12-Jul-2007)
Filesystem label=Root
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
2101152 inodes, 4194973 blocks
209748 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
129 block groups
32768 blocks per group, 32768 fragments per group
16288 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000
...

Block size=4096 means that 1 block has a size of 4 KiB (4,096 bytes). Together with the information 4194973 blocks, we see that we have just a bit over 16 GiB. Expressed in KiB, this is 16,779,892 KiB, which not surprisingly is exactly the number of 1 KiB blocks reported when we made the partition (see the article Partitioning the Harddisk).
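The arithmetic can be checked directly at the prompt. This is just a sketch using the two figures from the mke2fs output; the variable names are mine.

```shell
block_size=4096     # bytes per block, from "Block size=4096"
blocks=4194973      # from "2101152 inodes, 4194973 blocks"

# One 4096-byte block corresponds to 4 blocks of 1 KiB.
total_kib=$(( blocks * (block_size / 1024) ))
total_mib=$(( total_kib / 1024 ))

echo "$total_kib KiB"
echo "$total_mib MiB (exactly 16 GiB would be 16384 MiB)"
```

The first figure is the 16779892 from the partition listing, and the second one shows that the partition is a few MiB larger than 16 GiB.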

According to the above output, we have 2101152 inodes, and because each file has one inode, this is also the maximum number of files we can store in this partition, i.e. the partition is good for more than 2 million files.
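The inode count and the number of reserved blocks follow from other lines of the same output. The sketch below merely recomputes them; that the 5% figure is rounded down is my reading of the numbers, not something mke2fs states.

```shell
block_groups=129          # from "129 block groups"
inodes_per_group=16288    # from "16288 inodes per group"
blocks=4194973            # from "2101152 inodes, 4194973 blocks"
total_kib=16779892        # partition size in KiB

inodes=$(( block_groups * inodes_per_group ))
reserved=$(( blocks * 5 / 100 ))                  # 5%, rounded down
bytes_per_inode=$(( total_kib * 1024 / inodes ))  # average space per possible file

echo "inodes:          $inodes"
echo "reserved blocks: $reserved"
echo "bytes per inode: $bytes_per_inode"
```

With about 8 KiB of space per inode, the partition would have to be filled with very small files before it runs out of inodes rather than out of space.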

The super user, more commonly called root, is the user who is in charge of the system administration.¹

After this, a quite lengthy procedure is started in order to thoroughly check your partition's physical integrity. This is done indirectly, by implicitly running the program badblocks. Invocation of this program was requested by the option -cc. Although it is possible to run badblocks manually, you are discouraged from doing so, because faulty parameters may result in unusable partitions: it is better to let mke2fs provide the needed arguments (as is done by the -cc option).

...
Running command: badblocks -b 4096 -X -s -w /dev/sda2 4194972
Testing with pattern 0xaa: done
Reading and comparing: done
Testing with pattern 0x55: done
Reading and comparing: done
Testing with pattern 0xff: done
Reading and comparing: done
Testing with pattern 0x00: done 
Reading and comparing:          690816/       4194973

Each of the four tests first writes a specific bit pattern into the whole partition (during the phase Testing with pattern 0x...) and then checks whether the intended value was indeed written correctly (during the phase Reading and comparing).

Both phases of each of the four tests (i.e. 8 phases in total) access each byte within each block of your partition. The program gives you some feedback on the current progress (as seen above in the last line, which at the moment reads 690816/4194973).
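To see why these four patterns are a sensible choice, it helps to write them out bit by bit. The small helper to_bits below is made up for this illustration and is not part of badblocks.

```shell
# Print a value between 0 and 255 as eight bits, most significant bit first.
to_bits() {
    value=$1
    bits=""
    i=0
    while [ "$i" -lt 8 ]; do
        bits="$(( value % 2 ))$bits"   # prepend the lowest bit
        value=$(( value / 2 ))
        i=$(( i + 1 ))
    done
    echo "$bits"
}

for pattern in 0xaa 0x55 0xff 0x00; do
    echo "$pattern = $(to_bits $(( $pattern )))"
done
```

The patterns 0xaa and 0x55 alternate ones and zeros in opposite positions, while 0xff and 0x00 set all bits to one and to zero. Over the four tests, each bit of the partition has therefore held both values twice.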

Once all eight phases are done (which, depending on your partition size and your hardware speed, can take a long time), you will find yourself back at the Linux prompt.

root [ ~ ]# 

Now you need to create a corresponding filesystem for the partitions 5 and 6 as well. Note that if you are following this article series literally, you might want to consider the expected duration of this task: both partitions are larger than the Root partition just formatted. Partition 6 (Home) in particular is about four times as large as Root (67,111,506 versus 16,779,892 blocks in the listing at the top of this page), which means that it will take roughly four times as long. It is probably a good idea to do some shopping while the task runs.

These are the commands needed for the remaining two partitions (the output is similar to the above, hence the many lines are abbreviated with an ellipsis ...):

root [ ~ ]# mke2fs -ccjvL Var /dev/sda5
...
root [ ~ ]# 

and

root [ ~ ]# mke2fs -ccjvL Home /dev/sda6
...
root [ ~ ]# 
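For reference, the three invocations differ only in the label and the partition number. The following sketch is a dry run: it prints the commands without executing them, with the bundled options -ccjvL spelled out one by one. The list 2:Root 5:Var 6:Home is simply the naming recommended in this article.

```shell
# Dry run only: the commands are printed, not executed, so no data is touched.
for entry in 2:Root 5:Var 6:Home; do
    number=${entry%%:*}    # part before the colon: partition number
    label=${entry#*:}      # part after the colon: volume label
    cmd="mke2fs -c -c -j -v -L $label /dev/sda$number"
    echo "$cmd"
done
```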

Swap Partition

The swap partition does not need to be given a filesystem. However, we nevertheless must set up a Linux swap area.

This is done with the command mkswap at the Linux prompt. We need to tell the program which partition to use, which in our case is /dev/sda1.

As with the program mke2fs, there is an option -c, which instructs the program to check the partition for bad blocks before the creation of the swap area is attempted. Unlike with mke2fs, though, there is no -cc option, and no progress feedback is given during the check, so prepare for a lengthy wait before being returned to the prompt. Alternatively, if you fully trust your brand-new harddisk, omit the -c option altogether.

Another similarity with mke2fs is the possibility to label the partition with the -L name option. Because this will come in handy in the next task, we will not pass up the chance. As the name we naturally choose Swap. Note, however, that you are allowed to have several swap partitions (or better, swap devices, as you are not really restricted to disk partitions): if you plan to employ more than one swap space, you might want to name this swap partition differently, e.g. Swap1.

root [ ~ ]# mkswap -cL Swap /dev/sda1
Setting up swapspace version 1, size = 8595378 kB
no label, UUID=c7215317-2336-481c-afa8-9f784b80a52b
root [ ~ ]# 

Note that the k in kB is an SI prefix here, meaning 1,000 and not 1,024.²

________

1 Note that a Windows Administrator only very distantly reflects the powers and abilities of a Unix super user.

2 Thus mkswap refers to 1,000×8,595,378=8,595,378,000 Bytes. Comparing this with the data available via fdisk -l reveals that 1,045 units were reserved, at 8,225,280 Bytes each. This corresponds to a total of 8,595,417,600 Bytes for the partition. Thus 39,600 Bytes are actually missing, i.e. not included in the swap area. Maybe I will investigate this further some time, but for the time being I just accept it. (Which doesn't mean your mail with the reason is not welcome, of course.)
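One possible accounting for the missing bytes is sketched below. It rests on three assumptions which are not taken from this article: that 63 sectors of 512 bytes each precede the first partition, that the swap area is managed in pages of 4,096 bytes, and that one page is set aside for the swap header. Only the unit size, the Blocks figure for /dev/sda1 and the size printed by mkswap come from the listings on this page.

```shell
cylinder_bytes=8225280     # bytes per fdisk unit (see footnote 2)
partition_kib=8393931      # "Blocks" column for /dev/sda1 in the fdisk listing
reported_kb=8595378        # size printed by mkswap, in units of 1,000 bytes

# ASSUMPTIONS (not from the article): 63 leading sectors of 512 bytes,
# 4096-byte pages, and one page reserved for the swap header.
leading_bytes=$(( 63 * 512 ))
page=4096

by_cylinders=$(( 1045 * cylinder_bytes ))
by_blocks=$(( partition_kib * 1024 ))
echo "cylinders minus leading sectors: $(( by_cylinders - leading_bytes ))"
echo "blocks column in bytes:          $by_blocks"

# Whole pages only, minus the header page, expressed in units of 1,000 bytes.
usable=$(( (by_blocks / page - 1) * page ))
echo "usable swap in kB:               $(( usable / 1000 ))"
```

Under these assumptions the two byte counts agree, and the last line reproduces the figure reported by mkswap, so the difference would consist of the leading sectors, an incomplete page at the end of the partition, and the header page.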