User frontend

dsch user frontend.

When using dsch, users normally start with a top-level object representing the dsch storage (e.g. a file) and work through that object’s attributes. Although they work with a variety of different objects along the way, they usually do not create any of these manually. Creating a new storage requires a schema specification, which is built with the classes from dsch.schema. The dsch storage is then created from that schema.

The user frontend in this module provides a convenient, backend-independent interface for loading existing dsch storages and creating new ones.

class dsch.frontend.PseudoStorage(data_storage, schema_node, defer_open=False, schema_alternatives=None)

Provide abstraction between storages and data nodes.

PseudoStorage provides an easy way to manage data with dsch when it is located either in a Storage or in a data node. Being able to handle both variants with the same code base is especially relevant for libraries that support schema extension.

Schema extension means that when a library defines its dsch schema, other code (higher-level libraries or applications) may incorporate it into a broader schema. That way, the application can use a single schema for its specific purposes while retaining compatibility with the library’s schema. To support this, the library must be able to handle its own schema being at either the top level or a subordinate level inside the storage. In the former case, tasks like the creation of the storage are part of the library’s responsibility, while in the latter, they are not.

PseudoStorage can be initialized with either a str or a data node object (from dsch.data). If a string is given, it is interpreted as a storage_path, just as in create() and load(). The corresponding storage is made available through storage and its data through data. Alternatively, if a data node is given during initialization, it is directly made available through data. In that case, storage is set to None, indicating that the PseudoStorage is not managing the actual storage object.

In addition, the object can be used as a context manager using the with statement. This causes the data to be made available (e.g. by opening a file) when the context is entered and to be cleared (i.e. data set to None) when the context is left. When an entire storage is managed (i.e. the object was initialized with a string), the storage is also saved (see dsch.storage.FileStorage.save()) if applicable. If usage as a context manager is not desired, the same functionality is also exposed as open() and close().
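The lifecycle described above can be illustrated with a minimal, self-contained sketch. It does not use dsch itself: StandInStorage and PseudoStorageSketch are hypothetical names, plain dicts play the role of data nodes, and details such as defer_open and schema checking are deliberately left out.

```python
class StandInStorage:
    """Hypothetical stand-in for a dsch storage; not part of dsch."""

    def __init__(self, path):
        self.path = path
        self.data = {"value": None}  # a plain dict plays the role of a data node
        self.save_count = 0

    def save(self):
        self.save_count += 1


class PseudoStorageSketch:
    """Mimics the str-versus-data-node dispatch and the open/close lifecycle."""

    def __init__(self, data_storage):
        self._source = data_storage
        self.storage = None
        self.data = None

    def open(self):
        if isinstance(self._source, str):
            # A string is treated as a storage path: manage the whole storage.
            self.storage = StandInStorage(self._source)
            self.data = self.storage.data
        else:
            # Anything else is treated as a data node and exposed directly.
            self.storage = None
            self.data = self._source

    def close(self):
        # Only save when this object manages the storage itself.
        if self.storage is not None:
            self.storage.save()
        self.data = None

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not suppress exceptions


# Case 1: initialized with a string, so the storage is managed and saved.
with PseudoStorageSketch("example.h5") as managed:
    managed.data["value"] = 42

# Case 2: initialized with a data node, so no storage is managed.
node = {"value": 1}
with PseudoStorageSketch(node) as borrowed:
    borrowed.data["value"] = 2
```

In the second case the caller that owns the surrounding storage remains responsible for saving it, which is the situation a library faces when its schema has been embedded into a broader one.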

To support older versions of a schema once schema changes cannot be avoided, schema_alternatives can be passed on object creation. If an existing storage is loaded (or a data node is passed) based on an alternative schema, no exception is raised and the process continues normally. Note that new storages are always created with the given schema_node. The actual top-level schema node of the available data is made available through schema_node, so it can be easily inspected by subsequent code. schema_alternatives is an iterable containing either the SchemaNode object or the corresponding hash for every supported schema.
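As a rough illustration of how a mixed list of schema objects and hashes could be matched, consider the following sketch. SchemaStandIn and resolve_schema are hypothetical helpers, not dsch APIs, and hashing the raw JSON text with SHA256 is an assumption about how a schema hash might be derived.

```python
import hashlib


class SchemaStandIn:
    """Hypothetical stand-in for a schema node; not part of dsch."""

    def __init__(self, json_text):
        self.json_text = json_text

    def hash(self):
        # Assumption: the hash is the SHA256 of the schema's JSON text.
        return hashlib.sha256(self.json_text.encode("utf-8")).hexdigest()


def resolve_schema(loaded, schema_node, schema_alternatives=()):
    """Return `loaded` if it matches an accepted schema, else raise ValueError."""
    accepted_hashes = set()
    for candidate in [schema_node, *schema_alternatives]:
        # Each entry may be a schema object or an already computed hash string.
        if isinstance(candidate, str):
            accepted_hashes.add(candidate)
        else:
            accepted_hashes.add(candidate.hash())
    if loaded.hash() not in accepted_hashes:
        raise ValueError("loaded schema matches none of the accepted schemas")
    return loaded


current = SchemaStandIn('{"node_type": "String"}')
legacy = SchemaStandIn('{"node_type": "Bool"}')
unrelated = SchemaStandIn('{"node_type": "List"}')

# The legacy schema is accepted whether it is listed as an object or as a hash.
by_object = resolve_schema(legacy, current, [legacy])
by_hash = resolve_schema(legacy, current, [legacy.hash()])
```

Returning the schema that was actually found mirrors the role of the schema_node attribute: subsequent code can inspect it to decide how to treat data written with an older schema.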

Variables:
  • data – Data node, corresponding to the top-level node of the schema.
  • storage – A dsch storage object, if the PseudoStorage was initialized with a string.
  • schema_node – A SchemaNode representing the top-level schema node of the actual data loaded. This is either the same as the schema_node specified on object creation, or one of the schemas listed in schema_alternatives upon object creation.

close()

Finalize data access.

This saves and closes the corresponding storage, if applicable. Afterwards, the data is no longer available through data.

open()

Make the desired data available as data.

If necessary, this loads or creates a new storage.

dsch.frontend.create(storage_path, schema_node, backend=None)

Create a new dsch storage.

Creates a new dsch storage in the location given by storage_path, using the desired backend. If no backend is specified, it is detected automatically by interpreting the storage_path, e.g. via a file extension. Note that the format of storage_path depends on the backend, so the path and the backend must be compatible.

Currently, the following backends are available:

Name   Description        Path format
-----  -----------------  ------------------------
hdf5   HDF5 file          Path to regular file
inmem  In-memory storage  Fixed string “::inmem::”
mat    MATLAB data file   Path to regular file
npz    NumPy .npz file    Path to regular file
Parameters:
  • storage_path (str) – Path to the new dsch storage (backend-specific).
  • schema_node – Top-level schema node for the dsch storage. See dsch.schema for details.
  • backend (str) – Backend to use for the new dsch storage.
Returns:

Storage object.
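The automatic backend detection mentioned for create() can be pictured with a small sketch. The function detect_backend and the extension mapping below are assumptions made for illustration; the extensions dsch actually recognizes are not listed in this section.

```python
from pathlib import PurePath

# Hypothetical extension-to-backend mapping; dsch's real rules may differ.
EXTENSION_BACKENDS = {
    ".h5": "hdf5",
    ".hdf5": "hdf5",
    ".mat": "mat",
    ".npz": "npz",
}
INMEM_PATH = "::inmem::"  # fixed string taken from the backend table


def detect_backend(storage_path, backend=None):
    """Return the backend name for storage_path, honoring an explicit choice."""
    if backend is not None:
        # An explicitly requested backend always wins over auto-detection.
        return backend
    if storage_path == INMEM_PATH:
        return "inmem"
    suffix = PurePath(storage_path).suffix.lower()
    if suffix not in EXTENSION_BACKENDS:
        raise ValueError(f"cannot detect backend for {storage_path!r}")
    return EXTENSION_BACKENDS[suffix]
```

Under these assumptions, a path such as "results/run1.npz" selects the npz backend, while passing backend explicitly bypasses the lookup entirely.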

dsch.frontend.create_from(storage_path, source_storage, backend=None)

Create a new dsch storage by copying from an existing one.

The new storage is created, just like with create(), but the schema is automatically copied from the given source_storage. In addition, all data currently stored in source_storage is also copied.

If no backend is specified, it is detected automatically by interpreting the storage_path, e.g. via a file extension. For details, see create().

Parameters:
  • storage_path (str) – Path to the new dsch storage (backend-specific).
  • source_storage – dsch storage to copy schema and data from.
  • backend (str) – Backend to use for the new dsch storage.
Returns:

Newly created dsch storage.

dsch.frontend.load(storage_path, backend=None, required_schema=None, required_schema_hash=None, force=False)

Load a dsch storage from the given path.

Normally, the correct backend is detected automatically by interpreting the storage_path, e.g. via a file extension. Alternatively, the backend can be forced to a desired value by additionally passing a backend argument.

The required_schema or required_schema_hash arguments can be used to ensure that the loaded storage uses a specific schema. The former is used to supply a schema object while the latter must be the SHA256 hash of the required schema JSON, as can be determined by storage.Storage.schema_hash(). If the loaded storage uses a different schema, an exception is raised.

In addition, the loaded storage is validated automatically, unless force is set to True. This ensures that the loaded data really conforms to the desired schema, so that subsequent code, e.g. for data evaluation, can safely rely on the structure, the data types, and all constraints being met.
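The two checks described above can be sketched as a standalone function. This is not dsch's implementation: check_loaded_storage is a hypothetical name, the schema is represented by its JSON text, validation is modeled as a caller-supplied callable, and applying the schema check even when force is True is an assumption.

```python
import hashlib


def check_loaded_storage(schema_json, validate, required_schema_hash=None, force=False):
    """Run the schema hash check and, unless forced, the validation step."""
    actual_hash = hashlib.sha256(schema_json.encode("utf-8")).hexdigest()
    if required_schema_hash is not None and actual_hash != required_schema_hash:
        raise ValueError("loaded storage uses a different schema")
    if not force:
        # Validation is skipped only when the caller explicitly forces loading.
        validate()
    return actual_hash


schema_json = '{"node_type": "Bool"}'
expected_hash = hashlib.sha256(schema_json.encode("utf-8")).hexdigest()

validation_calls = []


def record_validation():
    validation_calls.append("validated")


# Normal load: the hash matches and validation runs once.
check_loaded_storage(schema_json, record_validation, required_schema_hash=expected_hash)

# Forced load: the hash is still compared, but validation is skipped.
check_loaded_storage(schema_json, record_validation, required_schema_hash=expected_hash, force=True)
```

Storing the expected hash as a constant in application code is one way to pin a schema version without shipping the full schema definition alongside it.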

Parameters:
  • storage_path (str) – Path to the dsch storage (backend-specific).
  • backend (str) – Backend to be used. By default, perform auto-detection.
  • required_schema (dsch.schema.SchemaNode) – Top-level schema node of the required schema.
  • required_schema_hash (str) – SHA256 hash of the required schema.
  • force (bool) – If True, the automatic validation step is skipped.
Returns:

Storage object.

Raises: