build_tools.syllable_walk_web.run_discovery
Run directory discovery for the syllable-walk web pipeline history.
History discovery is manifest-driven: a run is discoverable only when
manifest.json exists and is parseable. This keeps the run directory itself as
the single source of truth and avoids legacy text-file parsing heuristics.
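The manifest gate described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper name has_parseable_manifest is hypothetical, and the only facts taken from the documentation are the manifest.json file name and the "exists and is parseable" rule.

```python
import json
from pathlib import Path


def has_parseable_manifest(run_dir: Path) -> bool:
    """Return True only if run_dir/manifest.json exists and parses as JSON.

    Hypothetical helper: the real module may apply further checks.
    """
    manifest_path = run_dir / "manifest.json"
    try:
        with manifest_path.open(encoding="utf-8") as fh:
            json.load(fh)
    except (OSError, ValueError):
        # OSError covers a missing or unreadable file; ValueError covers
        # json.JSONDecodeError and text that cannot be decoded as UTF-8.
        return False
    return True
```

A directory that fails this check would simply be skipped, so no fallback parsing of older text files is needed.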
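Classes
RunInfo | Metadata about one manifest-backed pipeline run directory.
Functions
discover_runs(base_path=None) | Discover all pipeline run directories.
get_selection_data(selection_path) | Load selection data from a JSON file.
get_run_by_id(run_id, base_path=None) | Get a specific run by its directory name.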
Module Contents
- class build_tools.syllable_walk_web.run_discovery.RunInfo
Metadata about one manifest-backed pipeline run directory.
- path: pathlib.Path
Absolute path to the run directory
- run_id
Canonical run identifier (matches the directory name)
- extractor_type
Type of extractor (“nltk” or “pyphen”)
- timestamp
Run timestamp in YYYYMMDD_HHMMSS format
- display_name
Human-readable display name
- corpus_db_path: pathlib.Path | None
Path to the corpus.db artifact if it exists, otherwise None
- annotated_json_path: pathlib.Path | None
Path to the annotated JSON artifact if it exists, otherwise None
- syllable_count
Number of unique syllables, taken from the manifest metrics
- selections: dict[str, pathlib.Path]
Dict mapping each name class to its selection file path
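For orientation, the attributes above could be modelled with a dataclass along these lines. This is a sketch under assumptions: the class name RunInfoSketch is hypothetical, the types of run_id, extractor_type, timestamp, display_name, and syllable_count are guesses (only the four pathlib-based annotations appear in the documentation), and the default values are illustrative.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class RunInfoSketch:
    """Illustrative stand-in for RunInfo; not the module's real definition."""

    path: Path                      # absolute path to the run directory
    run_id: str                     # assumed str; matches the directory name
    extractor_type: str             # assumed str; "nltk" or "pyphen"
    timestamp: str                  # assumed str in YYYYMMDD_HHMMSS format
    display_name: str               # assumed str
    corpus_db_path: Optional[Path] = None       # None when the artifact is absent
    annotated_json_path: Optional[Path] = None  # None when the artifact is absent
    syllable_count: Optional[int] = None        # assumed int from manifest metrics
    selections: dict[str, Path] = field(default_factory=dict)
```

Using default_factory for selections gives each instance its own dict rather than one shared mutable default.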
- build_tools.syllable_walk_web.run_discovery.discover_runs(base_path=None)
Discover all pipeline run directories.
Scans _working/output/ (or specified base path) for directories matching the pattern YYYYMMDD_HHMMSS_{extractor}. Returns metadata for all valid runs found, sorted by timestamp (newest first).
- Parameters:
base_path (pathlib.Path | None) – Directory to scan. Default: _working/output/
- Returns:
List of RunInfo objects, sorted by timestamp (newest first)
- Return type:
list[RunInfo]
Examples
>>> runs = discover_runs()
>>> len(runs)
2
>>> runs[0].extractor_type
'nltk'
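The directory-name scan and newest-first ordering can be sketched as below. The names RUN_DIR_PATTERN and find_run_dirs are hypothetical, and the character set allowed for the extractor suffix is an assumption. The sketch returns plain paths rather than RunInfo objects and leaves out the manifest check.

```python
import re
from pathlib import Path

# Assumed shape of YYYYMMDD_HHMMSS_{extractor}; the real pattern may differ.
RUN_DIR_PATTERN = re.compile(
    r"^(?P<timestamp>\d{8}_\d{6})_(?P<extractor>[A-Za-z0-9]+)$"
)


def find_run_dirs(base_path: Path) -> list[Path]:
    """Return matching run directories under base_path, newest first."""
    if not base_path.is_dir():
        return []
    runs = []
    for entry in base_path.iterdir():
        match = RUN_DIR_PATTERN.match(entry.name)
        if entry.is_dir() and match:
            runs.append((match.group("timestamp"), entry))
    # YYYYMMDD_HHMMSS strings sort chronologically when compared as text,
    # so a reverse string sort puts the newest run first.
    runs.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in runs]
```

Because the timestamp is fixed-width and ordered from most to least significant field, no date parsing is needed for the sort.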
- build_tools.syllable_walk_web.run_discovery.get_selection_data(selection_path)
Load selection data from a JSON file.
- Parameters:
selection_path (pathlib.Path) – Path to selection JSON file
- Returns:
Dictionary with metadata and selections list
- Raises:
FileNotFoundError – If file doesn’t exist
json.JSONDecodeError – If file is not valid JSON
- Return type:
dict
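The documented error behaviour matches what a plain open-and-parse would produce, as this sketch shows. The function name load_selection is hypothetical, and nothing here is taken from the module's implementation. Both exceptions are raised by the standard library and left to propagate.

```python
import json
from pathlib import Path
from typing import Any


def load_selection(selection_path: Path) -> dict[str, Any]:
    """Read and parse a selection JSON file (illustrative sketch).

    Raises FileNotFoundError if the file is missing and
    json.JSONDecodeError if its contents are not valid JSON.
    """
    with selection_path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

A caller that wants to tolerate a missing or corrupt selection file can catch these two exceptions separately and report which one occurred.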
- build_tools.syllable_walk_web.run_discovery.get_run_by_id(run_id, base_path=None)
Get a specific run by its directory name.
- Parameters:
run_id (str) – Run directory name (e.g., “20260121_084017_nltk”)
base_path (pathlib.Path | None) – Base path to search. Default: _working/output/
- Returns:
RunInfo for the run, or None if not found
- Return type:
RunInfo | None
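Because the run identifier is the directory name, a lookup can reduce to a single path join plus the manifest check. The sketch below assumes exactly that; lookup_run_dir is a hypothetical name, it returns a path instead of a RunInfo, and the rejection of identifiers containing path separators is an added precaution, not documented behaviour.

```python
from pathlib import Path
from typing import Optional


def lookup_run_dir(run_id: str, base_path: Path) -> Optional[Path]:
    """Return the run directory for run_id, or None if it is not found."""
    # Reject empty identifiers and anything that is not a bare directory
    # name, such as "../other" (a precaution assumed for this sketch).
    if not run_id or Path(run_id).name != run_id:
        return None
    candidate = base_path / run_id
    if candidate.is_dir() and (candidate / "manifest.json").is_file():
        return candidate
    return None
```

Callers should test the result against None before using it, mirroring the RunInfo | None return type documented above.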