HashFS is a content-addressable file management system. What does that mean? Simply, that HashFS manages a directory where files are saved based on the file’s hash.
Typical use cases for this kind of system are ones where:
- Files are written once and never change (e.g. image storage).
- It’s desirable to have no duplicate files (e.g. user uploads).
- File metadata is stored elsewhere (e.g. in a database).
HashFS(root, depth=4, width=1, algorithm='sha256', fmode=436, dmode=493)
Content addressable file manager.
Parameters:
- root (str) – Directory path used as root of storage space.
- depth (int, optional) – Depth of subfolders to create when saving a file. Defaults to 4.
- width (int, optional) – Width of each subfolder to create when saving a file. Defaults to 1.
- algorithm (str, optional) – Hash algorithm to use when computing file hash. Algorithm should be available in the hashlib module. Defaults to 'sha256'.
- fmode (int, optional) – File mode permission to set when adding files to directory. Defaults to 0o664 (436 in decimal), which allows owner/group to read/write and everyone else to read.
- dmode (int, optional) – Directory mode permission to set for subdirectories. Defaults to 0o755 (493 in decimal), which allows the owner to read/write/execute and everyone else to read and execute.
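The signature prints the permission defaults in decimal (436 and 493), while the descriptions use octal (0o664 and 0o755). The short check below uses only the Python standard library, not HashFS itself, to show that the two notations agree and what the modes look like in `ls -l` form.

```python
import stat

# The decimal defaults in the signature are the octal modes in the descriptions.
assert 436 == 0o664
assert 493 == 0o755

# stat.filemode renders a mode the way `ls -l` does.
file_mode = stat.filemode(stat.S_IFREG | 0o664)  # regular file
dir_mode = stat.filemode(stat.S_IFDIR | 0o755)   # directory

print(file_mode)
print(dir_mode)
```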
Return generator that yields corrupted files as (path, address), where path is the path of the corrupted file and address is the HashAddress of the expected location.
Delete file using id or path. Remove any empty directories after deleting. No exception is raised if file doesn’t exist.
Parameters: file (str) – Address ID or path of file.
Check whether a given file id or path exists on disk.
Return HashAddress from given id or path. If file does not refer to a valid file, then None is returned.
Parameters: file (str) – Address ID or path of file. Returns: File’s hash address. Return type: HashAddress
Build the file path for a given hash id. Optionally, append a file extension.
Physically create the folder path on disk.
Return open buffer object from given id or path.
- file (str) – Address ID or path of file.
- mode (str, optional) – Mode to open file in. Defaults to 'rb'.
Returns: An io buffer dependent on the mode.
Raises: IOError – If file doesn’t exist.
Store contents of file on disk using its content hash for the address.
- file (mixed) – Readable object or path to file.
- extension (str, optional) – Optional extension to append to file when saving.
Returns: File’s hash address.
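As a rough illustration of storing content under its hash, the sketch below hashes a readable object, derives a nested path from the digest, and reports whether the same content was already stored. It is not HashFS's implementation: `put_stream` is a hypothetical helper, the path layout (the first `depth` chunks of `width` characters become folders) is an assumption, and `extension` is expected to include its leading dot.

```python
import hashlib
import io
import os
import shutil
import tempfile


def put_stream(root, stream, depth=4, width=1, extension=""):
    """Hypothetical helper: store `stream` under `root`, addressed by SHA-256.

    Returns (hexdigest, absolute_path, is_duplicate).
    """
    # Spool to a temporary file while hashing so the input is read only once.
    hasher = hashlib.sha256()
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for chunk in iter(lambda: stream.read(8192), b""):
            hasher.update(chunk)
            tmp.write(chunk)
    digest = hasher.hexdigest()

    # Assumed layout: `depth` folders of `width` characters, then the remainder.
    parts = [digest[i * width:(i + 1) * width] for i in range(depth)]
    parts.append(digest[depth * width:] + extension)
    path = os.path.join(os.path.abspath(root), *parts)

    if os.path.exists(path):
        os.remove(tmp.name)  # identical content is already stored
        return digest, path, True

    os.makedirs(os.path.dirname(path), exist_ok=True)
    shutil.move(tmp.name, path)
    return digest, path, False


root = tempfile.mkdtemp()
first = put_stream(root, io.BytesIO(b"hello"))
second = put_stream(root, io.BytesIO(b"hello"))  # same bytes, same address
```

Because the address depends only on the bytes, the second call finds the existing file and flags it as a duplicate instead of writing a new copy.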
Attempt to determine the real path of a file id or path through successive checking of candidate paths. If the real path is stored with an extension, the path is considered a match if the basename matches the expected file path of the id.
Successively remove all empty folders starting with subpath and proceeding “up” through directory tree until reaching the root folder.
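The upward cleanup can be pictured as a loop that deletes a folder only while it is empty and stops at the storage root. This is a minimal sketch under that reading, not the library's code; `remove_empty_upward` is a hypothetical name.

```python
import os
import tempfile


def remove_empty_upward(root, subpath):
    """Hypothetical helper: delete empty folders from `subpath` up to, but not including, `root`."""
    root = os.path.realpath(root)
    current = os.path.realpath(subpath)

    # Refuse to touch anything outside the storage root.
    if os.path.commonpath([root, current]) != root:
        return

    while current != root:
        if not os.path.isdir(current) or os.listdir(current):
            break  # stop at the first missing or non-empty folder
        os.rmdir(current)
        current = os.path.dirname(current)


root = tempfile.mkdtemp()
leaf = os.path.join(root, "a", "b", "c")
os.makedirs(leaf)
keep = os.path.join(root, "a", "keep.txt")
open(keep, "w").close()

remove_empty_upward(root, leaf)  # removes c and b, stops at a (it holds keep.txt)
```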
Repair any file locations whose content address doesn’t match its file path.
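One way to picture a repair pass: walk the storage root, rehash each file, and move any file whose location differs from the path its hash implies. The sketch below assumes SHA-256, a depth/width folder layout, and no file extensions; `expected_path` and `repair_tree` are hypothetical helpers, not HashFS functions, and each file is read fully into memory for brevity.

```python
import hashlib
import os
import shutil
import tempfile


def expected_path(root, digest, depth=4, width=1):
    # Assumed layout: `depth` folders of `width` characters, then the remainder.
    parts = [digest[i * width:(i + 1) * width] for i in range(depth)]
    parts.append(digest[depth * width:])
    return os.path.join(root, *parts)


def repair_tree(root, depth=4, width=1):
    """Hypothetical helper: move misplaced files to the path their SHA-256 implies."""
    root = os.path.abspath(root)
    # Snapshot the file list first so moves do not disturb the walk.
    files = [os.path.join(d, name) for d, _, names in os.walk(root) for name in names]
    moved = []
    for path in files:
        with open(path, "rb") as fh:
            digest = hashlib.sha256(fh.read()).hexdigest()
        target = expected_path(root, digest, depth, width)
        if path == target:
            continue
        if os.path.exists(target):
            os.remove(path)  # a correctly placed copy already exists
        else:
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.move(path, target)
        moved.append((path, target))
    return moved


root = os.path.abspath(tempfile.mkdtemp())
stray = os.path.join(root, "stray.bin")
with open(stray, "wb") as fh:
    fh.write(b"data")

moved = repair_tree(root)
target = expected_path(root, hashlib.sha256(b"data").hexdigest())
```

Running the pass a second time should find nothing to move, since every file now sits at the path derived from its content.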
Shard content ID into subfolders.
Unshard path to determine hash value.
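Sharding and unsharding are inverses: one splits a hash ID into folder names, the other joins the path components back into the ID. The sketch below illustrates that round trip with stand-alone functions; the splitting rule (the first `depth` chunks of `width` characters, then the remainder) and the handling of a trailing extension are assumptions, and `/data/store` is a made-up root.

```python
import os


def shard(digest, depth=4, width=1):
    """Split a hex digest into `depth` folder names of `width` characters plus a remainder."""
    tokens = [digest[i * width:(i + 1) * width] for i in range(depth)]
    tokens.append(digest[depth * width:])
    return [t for t in tokens if t]  # drop empty tokens when the digest is short


def unshard(root, path):
    """Recover the digest from a path under `root` by joining its components."""
    relative = os.path.relpath(path, root)
    stem = os.path.splitext(relative)[0]  # ignore a trailing file extension
    return stem.replace(os.sep, "")


digest = "a1b2c3d4e5f6"  # shortened, made-up ID for readability
path = os.path.join("/data/store", *shard(digest)) + ".txt"
recovered = unshard("/data/store", path)
```

With the defaults, `digest` is spread over four single-character folders and the remaining characters form the file name; a larger `width` yields fewer, wider folder names.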
File address containing file’s path on disk and its content hash ID.
Hash ID (hexdigest) of file contents.
Absolute path location of file on disk.
Whether the hash address created was a duplicate of a previously existing file. Can only be True after a put operation. Defaults to False.