Storage interfaces

dsch storage representation.

The data node classes provided by dsch.data form an abstraction layer over the different backends' specific storage mechanisms for the individual node types (i.e. data types) modeled by dsch. In addition, the storage entity itself, e.g. a file or database, must be made available to the user, with data nodes consistently used to model its data fields. The structure and hierarchy of these nodes are determined by the schema, using the classes from dsch.schema.

This module provides base classes for backends to derive from, so that common functionality may be implemented in a single place without unnecessary repetition.

class dsch.storage.FileStorage(storage_path, schema_node=None)

Storage interface base class for file-based storage.

FileStorage extends Storage with common functionality that is shared by all file-based storage mechanisms. It also provides a common interface to the user, independent of the specific file format (i.e. backend) in use.

Variables:
  • storage_path (str) – Path to the current storage file.
  • schema_node – Top-level schema node used for the stored data.
  • data – Top-level data node, providing access to all managed data.
save(force=False)

Save the current data to the file in storage_path.

Before the file is saved, the data is automatically validated via validate(). Setting force to True skips this step, although doing so is discouraged.

Parameters: force (bool) – If True, automatic data validation is skipped.
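
The validate-then-write behaviour described for save() can be sketched roughly as follows. This is a minimal, self-contained illustration and not dsch's implementation: the SketchFileStorage class, its dict-based data, the required_fields argument and the JSON file format are all assumptions made for this example.

```python
import json
import os
import tempfile


class SketchFileStorage:
    """Hypothetical stand-in for a file-based storage; not part of dsch."""

    def __init__(self, storage_path, required_fields):
        self.storage_path = storage_path
        self.required_fields = required_fields  # assumed, simplified "schema"
        self.data = {}

    def validate(self):
        # Raise if a required field is missing; return silently otherwise.
        missing = [name for name in self.required_fields if name not in self.data]
        if missing:
            raise ValueError("missing required fields: {}".format(missing))

    def save(self, force=False):
        # Validation runs by default and is skipped only when force is True.
        if not force:
            self.validate()
        with open(self.storage_path, "w") as f:
            json.dump(self.data, f)


path = os.path.join(tempfile.mkdtemp(), "example.json")
storage = SketchFileStorage(path, required_fields=["name"])

try:
    storage.save()  # validation fails, so nothing is written
except ValueError:
    saved_without_force = os.path.exists(path)

storage.save(force=True)  # skips validation and writes the file anyway
saved_with_force = os.path.exists(path)
```

In this sketch, force=True writes incomplete data to disk without complaint, which illustrates why skipping validation is discouraged.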
class dsch.storage.Storage(storage_path, schema_node=None)

Generic storage interface base class.

Storage interfaces provide access to a specific data storage that is managed by dsch. Depending on the specific backend, this can for example be a file, a directory or a database.

Once created, the Storage provides access to all contained data via data. Internally, this maps to the top-level data node in the hierarchy.

Warning

After a Storage object has been created, changes to schema_node are not automatically propagated through the data node tree, so schema_node should not be modified while the Storage object is in use.

Variables:
  • storage_path (str) – Path to the current storage (backend-specific).
  • schema_node – Top-level schema node used for the stored data.
  • data – Top-level data node, providing access to all managed data.
complete

Check whether the stored data is currently complete.

The data held by this storage interface is considered complete when the top-level data node is complete. In most cases, this will be a dsch.data.Compilation or dsch.data.List, which recursively check their sub-nodes for completeness. A complete Storage therefore means that all required data fields are filled in. Compilation fields marked as optional via schema.Compilation.optionals are not considered in this check.

Returns: True if the stored data is complete, False otherwise.
Return type: bool
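
The recursive completeness check, including the skipping of optional fields, might look like the sketch below. SketchValue and SketchCompilation are hypothetical stand-ins written for this example; they are not the dsch.data classes, and their constructors are assumptions.

```python
class SketchValue:
    """Hypothetical leaf node: complete once a value has been set."""

    def __init__(self, value=None):
        self.value = value

    @property
    def complete(self):
        return self.value is not None


class SketchCompilation:
    """Hypothetical container node holding named sub-nodes."""

    def __init__(self, subnodes, optionals=()):
        self.subnodes = subnodes
        self.optionals = set(optionals)

    @property
    def complete(self):
        # Optional fields are ignored; every other sub-node must be complete.
        return all(
            node.complete
            for name, node in self.subnodes.items()
            if name not in self.optionals
        )


inner = SketchCompilation({"x": SketchValue(1.0)})
root = SketchCompilation(
    {"name": SketchValue("probe"), "comment": SketchValue(), "inner": inner},
    optionals=["comment"],
)

complete_before = root.complete  # "comment" is empty but optional
inner.subnodes["x"].value = None  # clear a required, nested field
complete_after = root.complete
```

Because each container delegates to its sub-nodes, a single empty required field anywhere in the tree makes the top-level node incomplete.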
save_as(storage_path, backend=None)

Create a new storage by copying schema and data.

Note

Creating a copy can be useful to migrate existing data to a different storage backend, or to persistently store data that was collected in an in-memory storage.

This is a convenience method, effectively wrapping create_from().

Parameters:
  • storage_path (str) – Path to the new dsch storage (backend-specific).
  • backend (str) – Backend to use for the new dsch storage. If omitted, the backend will be selected based on the storage_path, e.g. file extension.
Returns:

Newly created dsch storage.
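
Selecting a backend from the storage path when backend is omitted could work along the lines of the following sketch. The function sketch_select_backend and the extension-to-backend table are assumptions made for illustration; the names and extensions that dsch actually recognizes are not stated here.

```python
import os

# Assumed mapping from file extension to backend name, for illustration only.
SKETCH_BACKENDS = {".h5": "hdf5", ".mat": "mat", ".npz": "npz"}


def sketch_select_backend(storage_path, backend=None):
    """Return the explicit backend if given, else guess from the extension."""
    if backend is not None:
        return backend
    extension = os.path.splitext(storage_path)[1].lower()
    if extension not in SKETCH_BACKENDS:
        raise ValueError(
            "cannot determine backend for {!r}".format(storage_path)
        )
    return SKETCH_BACKENDS[extension]


guessed = sketch_select_backend("measurement.h5")
explicit = sketch_select_backend("measurement.h5", backend="npz")
```

An explicitly passed backend always takes precedence in this sketch, so the path's extension only matters when no backend is given.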

schema_hash()

Calculate the SHA256 hash of the (serialized) schema.

This uses the JSON-serialized schema specification that is typically stored along with the respective dsch storage.

Returns: SHA256 hash (hex) of the schema.
Return type: str
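
Hashing a JSON-serialized schema with SHA256 can be sketched using only the Python standard library. The dict-based schema specification and the use of sort_keys are assumptions made for this example; how dsch actually serializes its schema nodes is not shown here.

```python
import hashlib
import json


def sketch_schema_hash(schema_spec):
    """Return the hex SHA256 digest of a JSON-serialized schema spec."""
    # sort_keys makes the serialization independent of dict key order
    # (an assumption of this sketch).
    serialized = json.dumps(schema_spec, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Hypothetical schema specifications, differing only in key order or content.
spec_a = {"node_type": "Compilation", "subnodes": {"name": "String"}}
spec_b = {"subnodes": {"name": "String"}, "node_type": "Compilation"}
spec_c = {"node_type": "Compilation", "subnodes": {"name": "Bool"}}

hash_a = sketch_schema_hash(spec_a)
hash_b = sketch_schema_hash(spec_b)
hash_c = sketch_schema_hash(spec_c)
```

Comparing such hashes gives a cheap way to check whether two storages were created from the same schema without comparing the schema trees node by node.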
validate()

Validate the entire data storage.

This recursively validates all individual data nodes inside the storage.

If validation succeeds, the method terminates silently. Otherwise, an exception is raised.

Raises: