build_tools.syllable_analysis.common.paths
Path management and default path configuration for analysis tools.
This module provides centralized path management for all analysis tools in the syllable feature annotator. It eliminates code duplication by providing a single source of truth for project structure and default paths.
Key Features
Automatic project root detection
Standard default paths for inputs and outputs
Per-tool output directory management
Platform-independent path handling
Usage
Using the module-level singleton (recommended):
from build_tools.syllable_analysis.common import default_paths
# Access default input path
input_path = default_paths.annotated_syllables
# Get tool-specific output directory
output_dir = default_paths.analysis_output_dir("tsne")
Creating a custom instance:
from build_tools.syllable_analysis.common.paths import AnalysisPathConfig
from pathlib import Path
# Use custom root
custom_paths = AnalysisPathConfig(root=Path("/custom/project/root"))
input_path = custom_paths.annotated_syllables
Module Contents
AnalysisPathConfig: Main path configuration class
default_paths: Module-level singleton instance for convenience
Attributes
Classes
Centralized path configuration for analysis tools. |
Module Contents
- class build_tools.syllable_analysis.common.paths.AnalysisPathConfig(root=None)[source]
Centralized path configuration for analysis tools.
This class manages all default paths used by analysis tools, including: - Project root detection - Input file paths (annotated syllables, frequencies) - Output directory paths (per-tool subdirectories)
The class automatically detects the project root based on this file’s location in the directory structure, but can also accept a custom root path for testing or alternative project layouts.
Attributes
- rootPath
Project root directory (auto-detected or explicitly set)
Examples
Using default (auto-detected) root:
>>> config = AnalysisPathConfig() >>> config.root PosixPath('/path/to/pipeworks_name_generation') >>> config.annotated_syllables PosixPath('/path/to/pipeworks_name_generation/data/annotated/syllables_annotated.json')
Using custom root:
>>> from pathlib import Path >>> config = AnalysisPathConfig(root=Path("/custom/root")) >>> config.annotated_syllables PosixPath('/custom/root/data/annotated/syllables_annotated.json')
Getting tool-specific output directories:
>>> config = AnalysisPathConfig() >>> config.analysis_output_dir("tsne") PosixPath('/path/to/pipeworks_name_generation/_working/analysis/tsne') >>> config.analysis_output_dir("feature_signatures") PosixPath('/path/to/pipeworks_name_generation/_working/analysis/feature_signatures')
Notes
This class is designed to be instantiated once per process (typically via the module-level default_paths singleton). Multiple instances are supported for testing purposes.
The auto-detection assumes this file is located at:
build_tools/syllable_analysis/common/paths.pyIf the directory structure changes, the
_detect_project_root()method must be updated accordingly.Initialize path configuration.
Args
- rootPath, optional
Project root path. If None (default), auto-detects based on this file’s location.
Examples
Default auto-detection:
>>> config = AnalysisPathConfig()
Custom root path:
>>> from pathlib import Path >>> config = AnalysisPathConfig(root=Path("/my/project"))
- root
- property annotated_syllables: pathlib.Path
Default path to syllables_annotated.json.
This is the primary input file for most analysis tools, containing syllables with their frequencies and feature annotations.
Returns
- Path
Path to
data/annotated/syllables_annotated.json
Examples
>>> config = AnalysisPathConfig() >>> config.annotated_syllables PosixPath('.../data/annotated/syllables_annotated.json')
Use in argument parser:
parser.add_argument( "--input", type=Path, default=default_paths.annotated_syllables, help="Path to annotated syllables" )
Notes
This file is produced by the syllable feature annotator pipeline and contains a JSON array of syllable records with structure:
[ { "syllable": "ka", "frequency": 187, "features": { "starts_with_vowel": false, "contains_plosive": true, ... } }, ... ]
- property syllables_frequencies: pathlib.Path
Default path to syllables_frequencies.json.
This file contains frequency counts for each syllable from the normalizer, useful for weighted analysis or filtering.
Returns
- Path
Path to
data/normalized/syllables_frequencies.json
Examples
>>> config = AnalysisPathConfig() >>> config.syllables_frequencies PosixPath('.../data/normalized/syllables_frequencies.json')
Notes
This file is produced by the syllable normalizer and contains a JSON object mapping syllables to their occurrence counts:
{ "ka": 187, "ra": 162, "mi": 145, ... }
The frequencies represent pre-deduplication counts, capturing how often each canonical syllable appeared in the raw corpus.
- analysis_output_dir(tool_name)[source]
Get output directory for a specific analysis tool.
Each analysis tool should have its own subdirectory under
_working/analysis/to keep outputs organized and avoid naming conflicts.Args
- tool_namestr
Name of the analysis tool (e.g., ‘tsne’, ‘feature_signatures’, ‘random_sampler’). This will be used as the subdirectory name.
Returns
- Path
Path to
_working/analysis/{tool_name}/
Examples
>>> config = AnalysisPathConfig() >>> config.analysis_output_dir("tsne") PosixPath('.../pipeworks_name_generation/_working/analysis/tsne') >>> config.analysis_output_dir("feature_signatures") PosixPath('.../pipeworks_name_generation/_working/analysis/feature_signatures')
Use in argument parser:
parser.add_argument( "--output", type=Path, default=default_paths.analysis_output_dir("tsne"), help="Output directory" )
Notes
The directory is not created by this method - it only returns the path. Use
common.output.ensure_output_dir()to create the directory if needed.The
_working/directory is typically git-ignored and used for build-time artifacts that don’t need to be committed.
- build_tools.syllable_analysis.common.paths.default_paths