API Reference
HashFS is a content-addressable file management system. Put simply, HashFS manages a directory in which each file is saved at a location derived from the hash of its contents.
Typical use cases for this kind of system are ones where:
- Files are written once and never change (e.g. image storage).
- It’s desirable to have no duplicate files (e.g. user uploads).
- File metadata is stored elsewhere (e.g. in a database).
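To make the idea of content addressing concrete, here is a minimal sketch that uses only Python's standard `hashlib` and does not call HashFS itself. The helper name `content_id` is hypothetical. It shows that the address is a digest of the file's bytes, so identical content always maps to the same ID.

```python
import hashlib


def content_id(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest that would serve as the content address."""
    return hashlib.new(algorithm, data).hexdigest()


first = content_id(b"some content")
second = content_id(b"some content")  # the same bytes uploaded again
other = content_id(b"other content")

print(first == second)  # identical content shares one address
print(first == other)   # different content gets a different address
```

Because the ID depends only on the bytes, a second upload of the same content needs no new file, which is why this kind of store avoids duplicates.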
class hashfs.HashFS(root, depth=4, width=1, algorithm='sha256', fmode=436, dmode=493)
Content addressable file manager.
- root (str) – Directory path used as root of storage space.
- depth (int, optional) – Depth of subfolders to create when saving a file.
- width (int, optional) – Width of each subfolder to create when saving a file.
- algorithm (str) – Hash algorithm to use when computing file hash. Algorithm should be available in the hashlib module. Defaults to 'sha256'.
- fmode (int, optional) – File mode permission to set when adding files to directory. Defaults to 0o664, which allows owner/group to read/write and everyone else to read.
- dmode (int, optional) – Directory mode permission to set for subdirectories. Defaults to 0o755, which allows the owner to read/write, everyone else to read, and everyone to execute.
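The signature shows the mode defaults as the decimal integers 436 and 493, while the attribute descriptions use octal. The short check below, using only the standard `stat` module, shows that they are the same values and prints the permission strings they correspond to. The constant names `FMODE` and `DMODE` are just labels for this example.

```python
import stat

FMODE = 436  # default fmode as written in the signature
DMODE = 493  # default dmode as written in the signature

# oct() shows the familiar octal form; stat.filemode() renders the
# permission string, given a file-type bit to go with the mode.
print(oct(FMODE), stat.filemode(stat.S_IFREG | FMODE))
print(oct(DMODE), stat.filemode(stat.S_IFDIR | DMODE))
```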
- corrupted(extensions=True)
Return generator that yields corrupted files as (path, address), where path is the path of the corrupted file and address is the HashAddress of the expected location.
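One way to think about such a check is sketched below with the standard library only. It is not the HashFS implementation: the function name `find_mismatched` is hypothetical, it assumes SHA-256, and it yields the recomputed digest rather than a HashAddress. The assumption is that a file is in the right place when its relative path, with separators and extension removed, equals the hash of its contents.

```python
import hashlib
import os
import tempfile


def find_mismatched(root):
    """Yield (path, digest) for files whose location does not match their hash."""
    for folder, _, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                # Reads the whole file at once; fine for a small sketch.
                digest = hashlib.sha256(handle.read()).hexdigest()
            # Drop the extension, then join the folder names back into one ID.
            stem = os.path.splitext(os.path.relpath(path, root))[0]
            if stem.replace(os.sep, "") != digest:
                yield path, digest


root = tempfile.mkdtemp()
data = b"some content"
digest = hashlib.sha256(data).hexdigest()

# A correctly placed file: four one-character folders, then the remainder.
good = os.path.join(root, digest[0], digest[1], digest[2], digest[3], digest[4:])
os.makedirs(os.path.dirname(good))
with open(good, "wb") as handle:
    handle.write(data)

# A file whose name has nothing to do with its contents.
bad = os.path.join(root, "misplaced.txt")
with open(bad, "wb") as handle:
    handle.write(b"tampered")

report = list(find_mismatched(root))
print(len(report))
```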
- delete(file)
Delete file using id or path. Remove any empty directories after deleting. No exception is raised if file doesn’t exist.
Parameters: file (str) – Address ID or path of file.
- folders()
Return generator that yields all folders in the root directory that contain files.
- get(file)
Return HashAddress from given id or path. If file does not refer to a valid file, then None is returned.
Parameters: file (str) – Address ID or path of file.
Returns: File’s hash address.
Return type: HashAddress
- idpath(id, extension='')
Build the file path for a given hash id. Optionally, append a file extension.
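As an illustration of how depth and width could shape such a path, the sketch below splits an ID into `depth` folder names of `width` characters each and uses the remainder as the file name. This is an assumed layout written with the standard library, not a copy of the HashFS code, and `sketch_idpath` and the sample ID are hypothetical.

```python
import os


def sketch_idpath(root, digest, depth=4, width=1, extension=""):
    """Build root/<depth folders of width chars>/<rest of digest><extension>."""
    parts = [digest[i * width:(i + 1) * width] for i in range(depth)]
    parts.append(digest[depth * width:])
    # Accept both "txt" and ".txt" as the extension.
    if extension and not extension.startswith("."):
        extension = "." + extension
    # Skip empty pieces, which occur when the digest is shorter than depth * width.
    return os.path.join(root, *[p for p in parts if p]) + extension


sample_id = "abcdef0123456789"  # shortened, made-up ID for readability

default_path = sketch_idpath("store", sample_id)
wide_path = sketch_idpath("store", sample_id, depth=2, width=2, extension="txt")

print(default_path)
print(wide_path)
```

Spreading files over nested folders this way keeps any single directory from accumulating a very large number of entries.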
- open(file, mode='rb')
Return open buffer object from given id or path.
Parameters:
- file (str) – Address ID or path of file.
- mode (str, optional) – Mode to open file in. Defaults to 'rb'.
Returns: An io buffer dependent on the mode.
Return type: Buffer
Raises: IOError – If file doesn’t exist.
- put(file, extension=None)
Store contents of file on disk using its content hash for the address.
Parameters:
- file (mixed) – Readable object or path to file.
- extension (str, optional) – Optional extension to append to file when saving.
Returns: File’s hash address.
Return type: HashAddress
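The general shape of a content-addressed store operation can be sketched with the standard library as follows. This is not the HashFS implementation: `sketch_put` and the `Address` tuple are hypothetical stand-ins, SHA-256 is assumed, and the layout is simplified to a single two-character folder. The stream is hashed while it is copied to a temporary file, which is then moved to its final location unless that content is already present.

```python
import hashlib
import io
import os
import shutil
import tempfile
from collections import namedtuple

# Hypothetical stand-in for the address fields described in this reference.
Address = namedtuple("Address", ["id", "abspath", "is_duplicate"])


def sketch_put(root, stream, extension=""):
    """Store a binary stream under a path derived from its SHA-256 digest."""
    hasher = hashlib.sha256()
    # Copy in chunks so large inputs are never held in memory all at once.
    with tempfile.NamedTemporaryFile(dir=root, delete=False) as tmp:
        for chunk in iter(lambda: stream.read(8192), b""):
            hasher.update(chunk)
            tmp.write(chunk)
    digest = hasher.hexdigest()
    target = os.path.join(root, digest[:2], digest[2:]) + extension
    if os.path.exists(target):
        os.remove(tmp.name)  # same content already stored
        return Address(digest, target, True)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.move(tmp.name, target)
    return Address(digest, target, False)


root = tempfile.mkdtemp()
first = sketch_put(root, io.BytesIO(b"some content"), ".txt")
second = sketch_put(root, io.BytesIO(b"some content"), ".txt")

print(first.is_duplicate, second.is_duplicate)
```

The second call finds the content already stored and reports it as a duplicate instead of writing a second copy.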
- realpath(file)
Attempt to determine the real path of a file id or path through successive checking of candidate paths. If the real path is stored with an extension, the path is considered a match if the basename matches the expected file path of the id.
- remove_empty(subpath)
Successively remove all empty folders starting with subpath and proceeding “up” through directory tree until reaching the root folder.
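The upward walk described here can be sketched with the standard library as shown below. The function `sketch_remove_empty` is a hypothetical illustration, not the HashFS code. It assumes that the walk should stop at the first non-empty folder and should never remove the root itself or anything outside it.

```python
import os
import tempfile


def sketch_remove_empty(root, subpath):
    """Remove empty folders from subpath upward, stopping before root."""
    root = os.path.realpath(root)
    current = os.path.realpath(subpath)
    # Refuse to touch anything that is not inside root.
    if os.path.commonpath([root, current]) != root:
        return
    while current != root:
        # Stop at a missing folder or at the first folder that still has entries.
        if not os.path.isdir(current) or os.listdir(current):
            break
        os.rmdir(current)
        current = os.path.dirname(current)


root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b", "c"))  # fully empty chain
os.makedirs(os.path.join(root, "x", "y"))       # empty leaf under a used folder
keep = os.path.join(root, "x", "keep.bin")
with open(keep, "wb") as handle:
    handle.write(b"data")

sketch_remove_empty(root, os.path.join(root, "a", "b", "c"))
sketch_remove_empty(root, os.path.join(root, "x", "y"))

print(sorted(os.listdir(root)))
```

After both calls the empty chain is gone, while the folder that still holds a file is left in place.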
class hashfs.HashAddress
File address containing file’s path on disk and its content hash ID.
- id (str) – Hash ID (hexdigest) of file contents.
- relpath (str) – Relative path location to HashFS.root.
- abspath (str) – Absolute path location of file on disk.
- is_duplicate (boolean, optional) – Whether the hash address created was a duplicate of a previously existing file. Can only be True after a put operation. Defaults to False.