The LEAN file system

LEAN is a new file system that...

  • is easy to understand and implement
  • suits very small and very large volumes
  • uses extent-based allocation
  • provides Unicode long file names
  • supports extended attributes
  • requires little memory and CPU power
  • is compatible with POSIX

LEAN is a free, simple, portable, personal file system created to provide an alternative to the proprietary (and partly patented) Microsoft FAT file system. It is primarily intended for media exchange and for use with embedded devices, and to expand the file system functionality of the FreeDOS-32 operating system.

The word LEAN is a recursive backronym for Lean yet Effective Allocation and Naming, chosen as the ironic opposite of FAT read as an English word. The initial inspiration for the file system came from the Linux Second Extended file system (ext2), followed by original research and the integration of comments and criticism. Although LEAN was developed independently, it shares some similarities with the QNX 4 file system.

LEAN aims to be the simplest file system that still provides the full set of the most common and useful file system features. Design choices have consistently favoured simplicity over performance. LEAN is designed for use even on platforms with limited memory and CPU power.

News

2011-04-09 - Version 0.6.1 of the reference implementation is available. This is a bug fix release.

2010-03-22 - The first public release of the reference implementation is available for download. Labelled 0.6.0, this is an implementation of the LEAN file system specification version 0.6.

Overview

[Figure: Structure of a LEAN file system volume]
[Figure: File allocation in the LEAN file system]

The basic allocation unit of a LEAN volume is a conventional sector of 512 bytes. Since all LEAN data structures require read-modify-write of sectors to operate properly, a conventional sector size that differs from the physical sector size is not an issue: it can be handled with appropriate sector buffering. Logical sector addresses and file sizes are represented by 64-bit numbers, and there is no intrinsic limit on the number of sectors per file, files per directory or files per volume. Thus, LEAN may be used on media as small as old floppy disks, up to very large disks with billions of terabytes of capacity.
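As an illustration of the sector buffering mentioned above, here is a minimal sketch that maps a 512-byte conventional sector onto a device with larger physical sectors and updates it with a read-modify-write cycle. The function names, the in-memory `bytearray` standing in for the block device, and the 4096-byte physical sector size are assumptions made for the example, not part of the LEAN specification. The sketch also assumes that the physical sector size is a multiple of 512.

```python
from typing import Tuple

SECTOR_SIZE = 512  # conventional LEAN sector size in bytes (from the text)


def locate(sector: int, physical_size: int) -> Tuple[int, int]:
    """Return (physical sector index, byte offset inside it) for a conventional sector."""
    byte_address = sector * SECTOR_SIZE
    return byte_address // physical_size, byte_address % physical_size


def write_sector(device: bytearray, sector: int, data: bytes,
                 physical_size: int = 4096) -> None:
    """Write one conventional sector using read-modify-write.

    `device` is a hypothetical in-memory stand-in for the block device.
    Assumes physical_size is a multiple of SECTOR_SIZE.
    """
    if len(data) != SECTOR_SIZE:
        raise ValueError("data must be exactly one conventional sector")
    phys, offset = locate(sector, physical_size)
    start = phys * physical_size
    buffer = bytearray(device[start:start + physical_size])  # read the whole physical sector
    buffer[offset:offset + SECTOR_SIZE] = data                # modify the 512-byte slice
    device[start:start + physical_size] = buffer              # write the physical sector back


device = bytearray(2 * 4096)           # two physical sectors = 16 conventional sectors
write_sector(device, 9, b"\x01" * SECTOR_SIZE)
print(locate(9, 4096))
```

With 4096-byte physical sectors, conventional sector 9 lives in physical sector 1 at byte offset 512, so only that slice of the buffered physical sector changes.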

The volume status is stored in a superblock at the beginning of the hosting block device, as well as in a backup superblock in another location. Free sectors are tracked by a bitmap that is evenly spread across the file system to take advantage of locality in space. This has the net effect of subdividing a LEAN volume into logical bands, each made of a contiguous run of sectors with its own bitmap. This layout lends itself to fragmentation-reducing strategies such as those used by the ext2 file system with its block groups.
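To make the idea of bands concrete, the following sketch models a banded free-sector bitmap in memory and allocates a sector as close as possible to a hint, trying the hint's own band first. It is only an illustration of the locality argument: the class name, the `sectors_per_band` parameter and the search order are assumptions, and the actual on-disk bitmap layout is defined by the LEAN specification, not by this code.

```python
from typing import List, Optional


class BandedBitmap:
    """Hypothetical in-memory model of a free-sector bitmap split into bands."""

    def __init__(self, total_sectors: int, sectors_per_band: int) -> None:
        self.sectors_per_band = sectors_per_band
        band_count = -(-total_sectors // sectors_per_band)  # ceiling division
        self.bands: List[List[bool]] = []
        for band in range(band_count):
            first = band * sectors_per_band
            length = min(sectors_per_band, total_sectors - first)
            self.bands.append([False] * length)  # False = free, True = in use

    def mark(self, sector: int, used: bool = True) -> None:
        band, bit = divmod(sector, self.sectors_per_band)
        self.bands[band][bit] = used

    def is_used(self, sector: int) -> bool:
        band, bit = divmod(sector, self.sectors_per_band)
        return self.bands[band][bit]

    def allocate_near(self, hint: int) -> Optional[int]:
        """Allocate a free sector, preferring the band that contains `hint`."""
        home = hint // self.sectors_per_band
        others = [b for b in range(len(self.bands)) if b != home]
        for band in [home] + others:
            bits = self.bands[band]
            if False in bits:
                bit = bits.index(False)
                bits[bit] = True
                return band * self.sectors_per_band + bit
        return None  # volume full


bitmap = BandedBitmap(total_sectors=20, sectors_per_band=8)
first = bitmap.allocate_near(10)   # sector 10 is in band 1, which starts at sector 8
```

Passing the address of a file's last sector as the hint would keep new sectors of that file in or near the same band, which is the kind of strategy the paragraph above refers to.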

File metadata are stored in inode structures, located in the first sector of each file, like a header. An inode structure stores the attributes of a file and the information needed to access any other sector of the file. Each file is uniquely identified by an inode number, which is the address of its first sector. File allocation is extent-based: each file is composed of several chunks, called extents, each made up of one or more contiguous sectors. Thus, an extent is specified by a starting sector and a count of sectors. A few extent specifications of the file, called the direct extents, are stored in the inode structure itself for maximum efficiency. If more extents are needed, they are addressed indirectly, using a doubly-linked list of indirect sectors, special sectors that contain extent specifications instead of file data.
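The extent model described above can be sketched as follows: given the index of a sector within a file, walk the direct extents and then the chain of indirect sectors until the extent containing that index is found. The type names, field names and the number of extents per structure are assumptions for illustration only; the real on-disk formats are those of the LEAN specification.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, NamedTuple, Optional


class Extent(NamedTuple):
    start: int  # address of the first sector of the extent
    count: int  # number of contiguous sectors


@dataclass
class IndirectSector:
    """Hypothetical model of an indirect sector in the doubly-linked list."""
    extents: List[Extent]
    prev: Optional["IndirectSector"] = None
    next: Optional["IndirectSector"] = None


@dataclass
class Inode:
    """Hypothetical model holding only the allocation information of an inode."""
    direct: List[Extent] = field(default_factory=list)
    first_indirect: Optional[IndirectSector] = None


def iter_extents(inode: Inode) -> Iterator[Extent]:
    """Yield the direct extents, then those of each indirect sector in order."""
    yield from inode.direct
    node = inode.first_indirect
    while node is not None:
        yield from node.extents
        node = node.next


def resolve(inode: Inode, index: int) -> int:
    """Map the index-th sector of a file to its address on the volume."""
    if index < 0:
        raise IndexError("negative sector index")
    remaining = index
    for extent in iter_extents(inode):
        if remaining < extent.count:
            return extent.start + remaining
        remaining -= extent.count
    raise IndexError("sector index beyond the end of the file")


inode = Inode(direct=[Extent(100, 4), Extent(300, 2)],
              first_indirect=IndirectSector([Extent(50, 3)]))
inode_number = resolve(inode, 0)  # the first sector of the file holds the inode
```

In this example the file occupies sectors 100 to 103, 300 to 301 and 50 to 52, so its inode number is 100 and its seventh sector (index 6) is found through the indirect sector at address 50.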

In order to simplify the implementation of file system drivers, LEAN defines only one type of file system object: the file. Other kinds of file system objects, such as directories, are implemented using the same model, that is, the allocation model described in the previous paragraph.

While regular files contain arbitrary data, directories are files which store an unordered list of directory entries, each specifying a file name and an inode number. This means that each file can be accessed under one or more names: the hard links to the file. The LEAN file system provides case-sensitive international long file names, encoded in Unicode UTF-8 and practically unlimited in length (up to 4068 bytes long). In addition, symbolic links are provided as an indirect means to locate a file: they are files which contain a single string that is the pathname of the pointed-to object.
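A minimal sketch of the directory model described above: a directory is treated as an unordered list of (name, inode number) pairs, lookups are a case-sensitive linear scan, and two entries pointing at the same inode number are hard links to the same file. The function names and the in-memory list are assumptions for illustration; only the 4068-byte UTF-8 name limit comes from the text.

```python
from typing import List, NamedTuple, Optional

MAX_NAME_BYTES = 4068  # maximum file name length in UTF-8 bytes (from the text)


class DirEntry(NamedTuple):
    name: str
    inode: int  # inode number = address of the file's first sector


def add_entry(directory: List[DirEntry], name: str, inode: int) -> None:
    """Append an entry, enforcing the name length limit and name uniqueness."""
    if len(name.encode("utf-8")) > MAX_NAME_BYTES:
        raise ValueError("file name longer than 4068 bytes in UTF-8")
    if any(entry.name == name for entry in directory):
        raise FileExistsError(name)
    directory.append(DirEntry(name, inode))


def lookup(directory: List[DirEntry], name: str) -> Optional[int]:
    """Case-sensitive linear scan; returns the inode number or None."""
    for entry in directory:
        if entry.name == name:
            return entry.inode
    return None


def link_count(directory: List[DirEntry], inode: int) -> int:
    """Number of names in this directory that refer to the given inode."""
    return sum(1 for entry in directory if entry.inode == inode)


root: List[DirEntry] = []
add_entry(root, "Readme.txt", 1200)
add_entry(root, "readme.txt", 1200)  # different name (case-sensitive), same file
add_entry(root, "naïve.txt", 1300)   # non-ASCII names are measured in UTF-8 bytes
```

Because the length limit is expressed in bytes rather than characters, a name made of two-byte characters such as "é" can hold at most 2034 of them.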

Finally, forks are files which store an unordered list of extended attributes. Extended attributes are optional name-value pairs of metadata associated with a file. Up to one fork can be associated with each non-fork file system object. An arbitrary number of extended attributes can be associated with a file in this way. For performance reasons, a limited number of extended attributes may also be embedded in the first sector of the non-fork file itself, right after the inode structure and before the actual file data. They are called the inline extended attributes.

Technical résumé

Allocation unit       512-byte conventional sectors
Max volume size       2^63-1 sectors = billions of terabytes
Max file size         2^64-1 bytes = billions of gigabytes
Number of files       arbitrary
Files per directory   arbitrary
Free space tracking   band-spread bitmap
Allocation type       mixed linked/indexed, extent-based
Directory format      plain unordered list
File names            up to 4068 bytes, case sensitive, in UTF-8
Links                 hard and symbolic
Extended attributes   name-value metadata, embedded in a file or in a separate fork
Forks                 plain unordered list of extended attributes associated with a file
Access control        optional POSIX permissions
Endianness            little endian
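
The two size limits above can be checked with a little arithmetic. The short sketch below converts them to binary terabytes and gigabytes (TiB and GiB); the constant names are chosen for the example only.

```python
SECTOR_SIZE = 512            # bytes per conventional sector
MAX_SECTORS = 2**63 - 1      # maximum volume size in sectors
MAX_FILE_BYTES = 2**64 - 1   # maximum file size in bytes

TIB = 2**40
GIB = 2**30

max_volume_tib = MAX_SECTORS * SECTOR_SIZE // TIB  # whole TiB in the largest volume
max_file_gib = MAX_FILE_BYTES // GIB               # whole GiB in the largest file

print(f"{max_volume_tib:,} TiB")
print(f"{max_file_gib:,} GiB")
```

The largest volume is just under 2^72 bytes, or about 4.3 billion TiB, and the largest file is just under 2^64 bytes, or about 17 billion GiB, which matches the "billions of terabytes" and "billions of gigabytes" figures.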

Where to go from here

For the details of the structure formats and algorithms of the file system, see the LEAN file system specification.

A reference implementation, written in C++, is available in the Downloads section. A file-manager-like GUI is included.