mirror of https://github.com/ARMmbed/mbed-os.git
Merge commit '0171b57a04c3eb6444fdf1163e0e21993445bfd8'
commit
26ade6236a
|
@ -290,31 +290,32 @@ The path to data block 0 is even more quick, requiring only two jumps:
|
|||
|
||||
We can find the runtime complexity by looking at the path to any block from
|
||||
the block containing the most pointers. Every step along the path divides
|
||||
the search space for the block in half. This gives us a runtime of O(log n).
|
||||
the search space for the block in half. This gives us a runtime of O(logn).
|
||||
To get to the block with the most pointers, we can perform the same steps
|
||||
backwards, which keeps the asymptotic runtime at O(log n). The interesting
|
||||
backwards, which puts the runtime at O(2logn) = O(logn). The interesting
|
||||
part about this data structure is that this optimal path occurs naturally
|
||||
if we greedily choose the pointer that covers the most distance without passing
|
||||
our target block.
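The greedy walk described above can be sketched in a few lines. This is only an illustration under stated assumptions: blocks are modeled as plain indices rather than on-disk pointers, block `i` is assumed to hold `ctz(i)+1` pointers where pointer `k` leads to block `i - 2^k`, and the names `ctz32` and `skiplist_hops` are hypothetical helpers, not part of littlefs.

```c
#include <assert.h>
#include <stdint.h>

// Number of trailing zero bits in x (x must be nonzero).
static unsigned ctz32(uint32_t x) {
    unsigned n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        n += 1;
    }
    return n;
}

// Walk from block `current` back to block `target` (target <= current),
// always taking the longest pointer that does not pass the target.
// Returns the number of hops taken.
static unsigned skiplist_hops(uint32_t current, uint32_t target) {
    unsigned hops = 0;
    while (current > target) {
        // block `current` has pointers 0..ctz(current)
        unsigned k = ctz32(current);
        // shrink the skip until it no longer overshoots the target;
        // k never drops below 0 because current - target >= 1
        while (((uint32_t)1 << k) > current - target) {
            k -= 1;
        }
        current -= (uint32_t)1 << k;
        hops += 1;
    }
    return hops;
}

// Small self-check: from block 5, block 0 is two hops away (5 -> 4 -> 0).
static void skiplist_self_check(void) {
    assert(skiplist_hops(5, 0) == 2);
}
```

For example, `skiplist_hops(5, 1)` takes three hops (5 -> 4 -> 2 -> 1), since the long pointer out of block 4 would pass the target.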
|
||||
|
||||
So now we have a representation of files that can be appended trivially with
|
||||
a runtime of O(1), and can be read with a worst case runtime of O(n logn).
|
||||
a runtime of O(1), and can be read with a worst case runtime of O(nlogn).
|
||||
Given that the runtime is also divided by the amount of data we can store
|
||||
in a block, this is pretty reasonable.
|
||||
|
||||
Unfortunately, the CTZ skip-list comes with a few questions that aren't
|
||||
straightforward to answer. What is the overhead? How do we handle more
|
||||
pointers than we can store in a block?
|
||||
pointers than we can store in a block? How do we store the skip-list in
|
||||
a directory entry?
|
||||
|
||||
One way to find the overhead per block is to look at the data structure as
|
||||
multiple layers of linked-lists. Each linked-list skips twice as many blocks
|
||||
as the previous linked-list. Or another way of looking at it is that each
|
||||
as the previous linked-list. Another way of looking at it is that each
|
||||
linked-list uses half as much storage per block as the previous linked-list.
|
||||
As we approach infinity, the number of pointers per block forms a geometric
|
||||
series. Solving this geometric series gives us an average of only 2 pointers
|
||||
per block.
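The layered view can be checked numerically. The sketch below assumes layer `k` is the linked-list that connects every block whose index is a multiple of `2^k`, so it contributes `floor(n / 2^k)` pointers across blocks 1 through n; `pointers_by_layers` is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

// Total pointers stored in blocks 1..n, counted layer by layer.
// Layer k holds one pointer in every block whose index is a
// multiple of 2^k, so each layer is half the size of the last.
static uint32_t pointers_by_layers(uint32_t n) {
    uint32_t total = 0;
    while (n > 0) {
        total += n;   // pointers contributed by the current layer
        n >>= 1;      // the next layer skips twice as many blocks
    }
    return total;
}

// Small self-check: 8 + 4 + 2 + 1 = 15 pointers over 8 blocks.
static void layers_self_check(void) {
    assert(pointers_by_layers(8) == 15);
}
```

With 1024 blocks this gives 2047 pointers, so the per-block average stays just under 2, as the geometric series predicts.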
|
||||
|
||||

|
||||
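$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \left( \text{ctz}(i) + 1 \right) = \sum_{i=0}^{\infty} \frac{1}{2^i} = 2$$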
|
||||
|
||||
Finding the maximum number of pointers in a block is a bit more complicated,
|
||||
but since our file size is limited by the integer width we use to store the
|
||||
|
@ -322,7 +323,7 @@ size, we can solve for it. Setting the overhead of the maximum pointers equal
|
|||
to the block size we get the following equation. Note that a smaller block size
|
||||
results in more pointers, and a larger word width results in larger pointers.
|
||||
|
||||

|
||||
$$B = \frac{w}{8} \left\lceil \log_2 \left( \frac{2^w}{B - 2\frac{w}{8}} \right) \right\rceil$$
|
||||
|
||||
where:
|
||||
B = block size in bytes
|
||||
|
@ -335,8 +336,83 @@ widths:
|
|||
|
||||
Since littlefs uses a 32 bit word size, we are limited to a minimum block
|
||||
size of 104 bytes. This is a perfectly reasonable minimum block size, with most
|
||||
block sizes starting around 512 bytes. So we can avoid the additional logic
|
||||
needed to avoid overflowing our block's capacity in the CTZ skip-list.
|
||||
block sizes starting around 512 bytes. So we can skip the additional logic to
|
||||
avoid overflowing our block's capacity in the CTZ skip-list.
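The 104 byte figure can be reproduced with a short search. This sketch assumes the constraint is that the worst-case pointer overhead, `(w/8) * ceil(log2(2^w / (B - 2*(w/8))))`, must fit within a block of `B` bytes; that form is an assumption inferred from the surrounding description. The helper names are hypothetical, and the integer arithmetic only covers word widths up to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

// Smallest k such that d * 2^k >= 2^w, i.e. ceil(log2(2^w / d)).
// Assumes w <= 32 and d >= 1 so the shifts stay within 64 bits.
static unsigned ceil_log2_ratio(unsigned w, uint64_t d) {
    uint64_t limit = (uint64_t)1 << w;
    unsigned k = 0;
    while ((d << k) < limit) {
        k += 1;
    }
    return k;
}

// Smallest block size B (in bytes) whose worst-case pointer overhead
// still fits inside a single block, for a word width of w bits.
static uint32_t min_block_size(unsigned w) {
    uint32_t p = w / 8;  // pointer size in bytes
    for (uint32_t B = 2 * p + 1; ; B++) {
        uint32_t overhead = p * ceil_log2_ratio(w, B - 2 * p);
        if (overhead <= B) {
            return B;
        }
    }
}

// Small self-check against the figure quoted in the text.
static void min_block_self_check(void) {
    assert(min_block_size(32) == 104);
}
```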
|
||||
|
||||
So, how do we store the skip-list in a directory entry? A naive approach would
|
||||
be to store a pointer to the head of the skip-list, the length of the file
|
||||
in bytes, the index of the head block in the skip-list, and the offset in the
|
||||
head block in bytes. However, this is a lot of information, and we can observe
|
||||
that a file size maps to only one block index + offset pair. So it should be
|
||||
sufficient to store only the pointer and file size.
|
||||
|
||||
But there is one problem: calculating the block index + offset pair from a
|
||||
file size doesn't have an obvious implementation.
|
||||
|
||||
We can start by just writing down an equation. The first idea that comes to
|
||||
mind is to just use a for loop to sum together blocks until we reach our
|
||||
file size. We can write this equation as a summation:
|
||||
|
||||
$$N = \sum_{i=1}^{n} \left[ B - \frac{w}{8} \left( \text{ctz}(i) + 1 \right) \right]$$
|
||||
|
||||
where:
|
||||
B = block size in bytes
|
||||
w = word width in bits
|
||||
n = block index in skip-list
|
||||
N = file size in bytes
|
||||
|
||||
And this works quite well, but is not trivial to calculate. This equation
|
||||
requires O(n) to compute, which brings the entire runtime of reading a file
|
||||
to O(n^2logn). Fortunately, the additional O(n) does not need to touch disk,
|
||||
so it is not completely unreasonable. But if we could solve this equation into
|
||||
a form that is easily computable, we can avoid a big slowdown.
|
||||
|
||||
Unfortunately, the summation of the CTZ instruction presents a big challenge.
|
||||
How would you even begin to reason about integrating a bitwise instruction?
|
||||
Fortunately, there is a powerful tool I've found useful in these situations:
|
||||
The [On-Line Encyclopedia of Integer Sequences (OEIS)](https://oeis.org/).
|
||||
If we work out the first couple of values in our summation, we find that CTZ
|
||||
maps to [A001511](https://oeis.org/A001511), and its partial summation maps
|
||||
to [A005187](https://oeis.org/A005187), and surprisingly, both of these
|
||||
sequences have relatively trivial equations! This leads us to the completely
|
||||
unintuitive property:
|
||||
|
||||
$$\sum_{i=1}^{n} \left( \text{ctz}(i) + 1 \right) = 2n - \text{popcount}(n)$$
|
||||
|
||||
where:
|
||||
ctz(i) = the number of trailing bits that are 0 in i
|
||||
popcount(i) = the number of bits that are 1 in i
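The relationship between the running sum of `ctz(i) + 1` and `popcount(n)` is easy to check by brute force. The sketch below uses portable bit loops rather than compiler builtins; `ctz32`, `popcount32`, and `identity_holds_up_to` are hypothetical helper names chosen for this illustration.

```c
#include <assert.h>
#include <stdint.h>

// Number of trailing zero bits in x (x must be nonzero).
static unsigned ctz32(uint32_t x) {
    unsigned n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        n += 1;
    }
    return n;
}

// Number of bits set to 1 in x.
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x != 0) {
        n += x & 1;
        x >>= 1;
    }
    return n;
}

// Returns 1 if sum_{i=1..n} (ctz(i) + 1) == 2n - popcount(n)
// for every n from 1 to limit, and 0 otherwise.
static int identity_holds_up_to(uint32_t limit) {
    uint32_t sum = 0;
    for (uint32_t n = 1; n <= limit; n++) {
        sum += ctz32(n) + 1;
        if (sum != 2 * n - popcount32(n)) {
            return 0;
        }
    }
    return 1;
}

// Small self-check over the first few thousand values.
static void identity_self_check(void) {
    assert(identity_holds_up_to(4096));
}
```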
|
||||
|
||||
I find it bewildering that these two seemingly unrelated bitwise instructions
|
||||
are related by this property. But if we start to dissect this equation we can
|
||||
see that it does hold. As n approaches infinity, we do end up with an average
|
||||
overhead of 2 pointers as we found earlier. And popcount seems to handle the
|
||||
error from this average as it accumulates in the CTZ skip-list.
|
||||
|
||||
Now we can substitute into the original equation to get a trivial equation
|
||||
for a file size:
|
||||
|
||||
$$N = Bn - \frac{w}{8} \left( 2n - \text{popcount}(n) \right)$$
|
||||
|
||||
Unfortunately, we're not quite done. The popcount function is non-injective,
|
||||
so we can only find the file size from the block index, not the other way
|
||||
around. However, we can solve for an n' block index that is greater than n
|
||||
with an error bounded by the range of the popcount function. We can then
|
||||
repeatedly substitute this n' into the original equation until the error
|
||||
is smaller than the integer division. As it turns out, we only need to
|
||||
perform this substitution once. Now we directly calculate our block index:
|
||||
|
||||
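$$n = \left\lfloor \frac{N - \frac{w}{8} \left( \text{popcount}\left( \frac{N}{B - 2\frac{w}{8}} - 1 \right) + 2 \right)}{B - 2\frac{w}{8}} \right\rfloor$$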
|
||||
|
||||
Now that we have our block index n, we can just plug it back into the above
|
||||
equation to find the offset. However, we do need to rearrange the equation
|
||||
a bit to avoid integer overflow:
|
||||
|
||||
$$\mathit{off} = N - \left( B - 2\frac{w}{8} \right) n - \frac{w}{8} \text{popcount}(n)$$
|
||||
|
||||
The solution involves quite a bit of math, but computers are very good at math.
|
||||
We can now solve for the block index + offset while only needing to store the
|
||||
file size in O(1).
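To see that the closed form agrees with the straightforward loop, both can be run side by side. This is a standalone sketch modeled on the `lfs_index` and `lfs_ctz_index` functions from this commit, not the library code itself. It assumes a 32 bit word (4-byte pointers) and takes the block size as a plain parameter; all names here are hypothetical. As in the loop version, the returned offset counts from the start of the block, pointers included.

```c
#include <assert.h>
#include <stdint.h>

#define PTR_SIZE 4u  // assumption: w = 32 bits, so w/8 = 4 bytes

static unsigned ctz32(uint32_t x) {
    unsigned n = 0;
    while ((x & 1) == 0) { x >>= 1; n += 1; }
    return n;
}

static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x != 0) { n += x & 1; x >>= 1; }
    return n;
}

// O(n) version: peel off one block at a time, accounting for the
// ctz(i)+1 pointers stored at the start of block i.
static uint32_t index_by_loop(uint32_t block_size, uint32_t *off) {
    uint32_t i = 0;
    while (*off >= block_size) {
        i += 1;
        *off -= block_size;
        *off += PTR_SIZE * (ctz32(i) + 1);
    }
    return i;
}

// O(1) version: solve for the block index with popcount, then
// rearrange the file size equation to recover the offset.
static uint32_t index_closed_form(uint32_t block_size, uint32_t *off) {
    uint32_t size = *off;
    uint32_t b = block_size - 2 * PTR_SIZE;
    uint32_t i = size / b;
    if (i == 0) {
        return 0;  // still inside block 0, offset unchanged
    }
    i = (size - PTR_SIZE * (popcount32(i - 1) + 2)) / b;
    *off = size - b * i - PTR_SIZE * popcount32(i);
    return i;
}

// Returns 1 if both versions agree for every position below limit.
static int index_forms_agree(uint32_t block_size, uint32_t limit) {
    for (uint32_t pos = 0; pos < limit; pos++) {
        uint32_t off_a = pos, off_b = pos;
        uint32_t a = index_by_loop(block_size, &off_a);
        uint32_t b = index_closed_form(block_size, &off_b);
        if (a != b || off_a != off_b) {
            return 0;
        }
    }
    return 1;
}

// Small self-check with a 512 byte block.
static void index_self_check(void) {
    assert(index_forms_agree(512, 100000));
}
```

With a 512 byte block, position 1020 is the first byte of data in block 2, and both versions report index 2 with offset 8 (just past the two pointers).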
|
||||
|
||||
Here is what it might look like to update a file stored with a CTZ skip-list:
|
||||
```
|
||||
|
@ -1129,7 +1205,7 @@ So, to summarize:
|
|||
metadata block is active
|
||||
4. Directory blocks contain either references to other directories or files
|
||||
5. Files are represented by copy-on-write CTZ skip-lists which support O(1)
|
||||
append and O(n logn) reading
|
||||
append and O(nlogn) reading
|
||||
6. Blocks are allocated by scanning the filesystem for used blocks in a
|
||||
fixed-size lookahead region that is stored in a bit-vector
|
||||
7. To facilitate scanning the filesystem, all directories are part of a
|
||||
|
|
|
@ -1004,19 +1004,20 @@ int lfs_dir_rewind(lfs_t *lfs, lfs_dir_t *dir) {
|
|||
|
||||
|
||||
/// File index list operations ///
|
||||
static int lfs_index(lfs_t *lfs, lfs_off_t *off) {
|
||||
lfs_off_t i = 0;
|
||||
|
||||
while (*off >= lfs->cfg->block_size) {
|
||||
i += 1;
|
||||
*off -= lfs->cfg->block_size;
|
||||
*off += 4*(lfs_ctz(i) + 1);
|
||||
static int lfs_ctz_index(lfs_t *lfs, lfs_off_t *off) {
|
||||
lfs_off_t size = *off;
|
||||
lfs_off_t b = lfs->cfg->block_size - 2*4;
|
||||
lfs_off_t i = size / b;
|
||||
if (i == 0) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
i = (size - 4*(lfs_popc(i-1)+2)) / b;
|
||||
*off = size - b*i - 4*lfs_popc(i);
|
||||
return i;
|
||||
}
|
||||
|
||||
static int lfs_index_find(lfs_t *lfs,
|
||||
static int lfs_ctz_find(lfs_t *lfs,
|
||||
lfs_cache_t *rcache, const lfs_cache_t *pcache,
|
||||
lfs_block_t head, lfs_size_t size,
|
||||
lfs_size_t pos, lfs_block_t *block, lfs_off_t *off) {
|
||||
|
@ -1026,8 +1027,8 @@ static int lfs_index_find(lfs_t *lfs,
|
|||
return 0;
|
||||
}
|
||||
|
||||
lfs_off_t current = lfs_index(lfs, &(lfs_off_t){size-1});
|
||||
lfs_off_t target = lfs_index(lfs, &pos);
|
||||
lfs_off_t current = lfs_ctz_index(lfs, &(lfs_off_t){size-1});
|
||||
lfs_off_t target = lfs_ctz_index(lfs, &pos);
|
||||
|
||||
while (current > target) {
|
||||
lfs_size_t skip = lfs_min(
|
||||
|
@ -1048,7 +1049,7 @@ static int lfs_index_find(lfs_t *lfs,
|
|||
return 0;
|
||||
}
|
||||
|
||||
static int lfs_index_extend(lfs_t *lfs,
|
||||
static int lfs_ctz_extend(lfs_t *lfs,
|
||||
lfs_cache_t *rcache, lfs_cache_t *pcache,
|
||||
lfs_block_t head, lfs_size_t size,
|
||||
lfs_off_t *block, lfs_block_t *off) {
|
||||
|
@ -1075,7 +1076,7 @@ static int lfs_index_extend(lfs_t *lfs,
|
|||
}
|
||||
|
||||
size -= 1;
|
||||
lfs_off_t index = lfs_index(lfs, &size);
|
||||
lfs_off_t index = lfs_ctz_index(lfs, &size);
|
||||
size += 1;
|
||||
|
||||
// just copy out the last block if it is incomplete
|
||||
|
@ -1139,7 +1140,7 @@ relocate:
|
|||
}
|
||||
}
|
||||
|
||||
static int lfs_index_traverse(lfs_t *lfs,
|
||||
static int lfs_ctz_traverse(lfs_t *lfs,
|
||||
lfs_cache_t *rcache, const lfs_cache_t *pcache,
|
||||
lfs_block_t head, lfs_size_t size,
|
||||
int (*cb)(void*, lfs_block_t), void *data) {
|
||||
|
@ -1147,7 +1148,7 @@ static int lfs_index_traverse(lfs_t *lfs,
|
|||
return 0;
|
||||
}
|
||||
|
||||
lfs_off_t index = lfs_index(lfs, &(lfs_off_t){size-1});
|
||||
lfs_off_t index = lfs_ctz_index(lfs, &(lfs_off_t){size-1});
|
||||
|
||||
while (true) {
|
||||
int err = cb(data, head);
|
||||
|
@ -1459,7 +1460,7 @@ lfs_ssize_t lfs_file_read(lfs_t *lfs, lfs_file_t *file,
|
|||
// check if we need a new block
|
||||
if (!(file->flags & LFS_F_READING) ||
|
||||
file->off == lfs->cfg->block_size) {
|
||||
int err = lfs_index_find(lfs, &file->cache, NULL,
|
||||
int err = lfs_ctz_find(lfs, &file->cache, NULL,
|
||||
file->head, file->size,
|
||||
file->pos, &file->block, &file->off);
|
||||
if (err) {
|
||||
|
@ -1526,7 +1527,7 @@ lfs_ssize_t lfs_file_write(lfs_t *lfs, lfs_file_t *file,
|
|||
file->off == lfs->cfg->block_size) {
|
||||
if (!(file->flags & LFS_F_WRITING) && file->pos > 0) {
|
||||
// find out which block we're extending from
|
||||
int err = lfs_index_find(lfs, &file->cache, NULL,
|
||||
int err = lfs_ctz_find(lfs, &file->cache, NULL,
|
||||
file->head, file->size,
|
||||
file->pos-1, &file->block, &file->off);
|
||||
if (err) {
|
||||
|
@ -1539,7 +1540,7 @@ lfs_ssize_t lfs_file_write(lfs_t *lfs, lfs_file_t *file,
|
|||
|
||||
// extend file with new blocks
|
||||
lfs_alloc_ack(lfs);
|
||||
int err = lfs_index_extend(lfs, &lfs->rcache, &file->cache,
|
||||
int err = lfs_ctz_extend(lfs, &lfs->rcache, &file->cache,
|
||||
file->block, file->pos,
|
||||
&file->block, &file->off);
|
||||
if (err) {
|
||||
|
@ -2074,7 +2075,7 @@ int lfs_traverse(lfs_t *lfs, int (*cb)(void*, lfs_block_t), void *data) {
|
|||
|
||||
dir.off += lfs_entry_size(&entry);
|
||||
if ((0x70 & entry.d.type) == (0x70 & LFS_TYPE_REG)) {
|
||||
int err = lfs_index_traverse(lfs, &lfs->rcache, NULL,
|
||||
int err = lfs_ctz_traverse(lfs, &lfs->rcache, NULL,
|
||||
entry.d.u.file.head, entry.d.u.file.size, cb, data);
|
||||
if (err) {
|
||||
return err;
|
||||
|
@ -2093,7 +2094,7 @@ int lfs_traverse(lfs_t *lfs, int (*cb)(void*, lfs_block_t), void *data) {
|
|||
// iterate over any open files
|
||||
for (lfs_file_t *f = lfs->files; f; f = f->next) {
|
||||
if (f->flags & LFS_F_DIRTY) {
|
||||
int err = lfs_index_traverse(lfs, &lfs->rcache, &f->cache,
|
||||
int err = lfs_ctz_traverse(lfs, &lfs->rcache, &f->cache,
|
||||
f->head, f->size, cb, data);
|
||||
if (err) {
|
||||
return err;
|
||||
|
@ -2101,7 +2102,7 @@ int lfs_traverse(lfs_t *lfs, int (*cb)(void*, lfs_block_t), void *data) {
|
|||
}
|
||||
|
||||
if (f->flags & LFS_F_WRITING) {
|
||||
int err = lfs_index_traverse(lfs, &lfs->rcache, &f->cache,
|
||||
int err = lfs_ctz_traverse(lfs, &lfs->rcache, &f->cache,
|
||||
f->block, f->pos, cb, data);
|
||||
if (err) {
|
||||
return err;
|
||||
|
|
|
@ -52,6 +52,10 @@ static inline uint32_t lfs_npw2(uint32_t a) {
|
|||
#endif
|
||||
}
|
||||
|
||||
static inline uint32_t lfs_popc(uint32_t a) {
|
||||
return __builtin_popcount(a);
|
||||
}
|
||||
|
||||
static inline int lfs_scmp(uint32_t a, uint32_t b) {
|
||||
return (int)(unsigned)(a - b);
|
||||
}
|
||||
|
|