db.importer

Importer workflow for metadata JSON + ZIP package pairs.

Functions

load_metadata_json(metadata_path)

Load metadata JSON and enforce object-root structure.

read_txt_rows(archive, entry_name)

Read one txt entry and return (line_number, value) tuples.

import_package_pair(conn, *, metadata_path, zip_path)

Import one metadata+zip pair and create one SQLite table per *.txt.

Module Contents

db.importer.load_metadata_json(metadata_path)

Load metadata JSON and enforce object-root structure.

Parameters:

metadata_path (pathlib.Path) – Path to metadata JSON file.

Returns:

Parsed JSON object.

Raises:

ValueError – If the root JSON value is not an object.

Return type:

dict[str, Any]
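
The object-root check described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the name load_metadata_json_sketch, the UTF-8 encoding, and the error message wording are all assumptions.

```python
import json
from pathlib import Path
from typing import Any


def load_metadata_json_sketch(metadata_path: Path) -> dict[str, Any]:
    """Hypothetical stand-in for db.importer.load_metadata_json."""
    # Assumption: the metadata file is UTF-8 encoded; the source does not say.
    with metadata_path.open("r", encoding="utf-8") as handle:
        payload = json.load(handle)

    # Enforce object-root structure: a list, string, or number at the
    # top level is rejected with ValueError, as documented above.
    if not isinstance(payload, dict):
        raise ValueError(
            f"metadata root must be a JSON object, got {type(payload).__name__}"
        )
    return payload
```

With this sketch, a file containing [1, 2] raises ValueError, while a file containing {"files_included": ["a.txt"]} is returned as a dict.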

db.importer.read_txt_rows(archive, entry_name)

Read one txt entry and return (line_number, value) tuples.

Empty and whitespace-only lines are skipped during import, so database tables store only non-blank values.
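
The blank-line filtering can be illustrated with a short sketch. Several details here are assumptions rather than documented behavior: that archive is a zipfile.ZipFile, that entries are UTF-8 text, that line numbers are 1-based positions in the original file (so skipped lines still count), and that stored values are stripped of surrounding whitespace. The name read_txt_rows_sketch is hypothetical.

```python
import io
import zipfile


def read_txt_rows_sketch(
    archive: zipfile.ZipFile, entry_name: str
) -> list[tuple[int, str]]:
    """Hypothetical stand-in for db.importer.read_txt_rows."""
    rows: list[tuple[int, str]] = []
    with archive.open(entry_name) as raw:
        # Assumption: entries are UTF-8 encoded text.
        text = io.TextIOWrapper(raw, encoding="utf-8")
        # Assumption: numbering starts at 1 and counts skipped lines too.
        for line_number, line in enumerate(text, start=1):
            value = line.strip()
            if value:  # skip empty and whitespace-only lines
                rows.append((line_number, value))
    return rows


# Usage with an in-memory archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("names.txt", "alpha\n\n   \nbeta\n")
with zipfile.ZipFile(buffer) as zf:
    example_rows = read_txt_rows_sketch(zf, "names.txt")
```

Under these assumptions, example_rows holds the entries for lines 1 and 4 only; the two blank lines in between are dropped.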

db.importer.import_package_pair(conn, *, metadata_path, zip_path)

Import one metadata+zip pair and create one SQLite table per *.txt.

The importer ignores JSON files inside the archive. When the metadata provides files_included, only the *.txt entries it lists are imported.

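Parameters:
  • conn – Open SQLite connection in which the tables are created.

  • metadata_path – Path to metadata JSON file (keyword-only).

  • zip_path – Path to ZIP package file (keyword-only).

Returns: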

API-style summary payload describing imported package and created tables.

Raises:
  • FileNotFoundError – If metadata or zip path does not exist.

  • ValueError – For invalid metadata, duplicate imports, or zip format/data issues.

Return type:

dict[str, Any]
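
The entry selection and per-file table creation described for import_package_pair might look roughly like the sketch below. Both helpers are hypothetical and do not appear in db.importer. The sketch assumes that files_included is a list of entry names matched exactly against archive names, that table names derive from the entry's file stem, and that each table has line_number and value columns. None of these details is stated in the source.

```python
import sqlite3
from pathlib import PurePosixPath


def select_txt_entries(archive_names, files_included=None):
    """Hypothetical helper: pick the archive entries to import."""
    # Only *.txt entries are candidates, so JSON files are ignored.
    txt_names = [name for name in archive_names if name.lower().endswith(".txt")]
    if files_included is None:
        return txt_names
    # Assumption: files_included holds exact archive entry names.
    allowed = set(files_included)
    return [name for name in txt_names if name in allowed]


def create_table_for_entry(conn, entry_name, rows):
    """Hypothetical helper: create and fill one table for one *.txt entry."""
    # Assumption: the table name is the sanitized file stem.
    stem = PurePosixPath(entry_name).stem
    table = "".join(ch if ch.isalnum() else "_" for ch in stem)
    conn.execute(
        f'CREATE TABLE "{table}" '
        "(line_number INTEGER PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.executemany(
        f'INSERT INTO "{table}" (line_number, value) VALUES (?, ?)', rows
    )
    return table


# Usage against an in-memory database.
names = ["data/a.txt", "b.txt", "manifest.json"]
conn = sqlite3.connect(":memory:")
created = create_table_for_entry(conn, "data/a.txt", [(1, "x"), (3, "y")])
```

Deriving table names from file stems would collide if two entries share a stem in different directories. A real importer needs a rule for that case, and the source does not document one.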