build_tools.pyphen_syllable_extractor
=====================================

.. py:module:: build_tools.pyphen_syllable_extractor

.. autoapi-nested-parse::

   Syllable Extractor - Dictionary-Based Syllable Extraction

   The syllable extractor uses dictionary-based hyphenation to extract syllables from text files.
   This is a **build-time tool only** - not used during runtime name generation.

   The tool supports two modes:

   - **Interactive Mode** - Guided prompts for single-file processing
   - **Batch Mode** - Automated processing of multiple files via command-line arguments

   Features:

   - Dictionary-based hyphenation using pyphen (LibreOffice dictionaries)
   - Support for 40+ languages
   - Automatic language detection (optional, via langdetect)
   - Configurable syllable length constraints
   - Deterministic extraction (same input = same output)
   - Unicode support for accented characters
   - Comprehensive metadata and statistics
   - Automatic provenance tracking via corpus_db ledger (batch mode)

   Main Components:

   - SyllableExtractor: Core extraction class
   - ExtractionResult: Data model for extraction results
   - FileProcessingResult: Result for single file in batch mode
   - BatchResult: Aggregate results for batch processing
   - SUPPORTED_LANGUAGES: Dictionary of supported language codes

   Usage:

       >>> from pathlib import Path
       >>> from build_tools.pyphen_syllable_extractor import SyllableExtractor
       >>>
       >>> # Initialize extractor for English (US)
       >>> extractor = SyllableExtractor('en_US', min_syllable_length=2, max_syllable_length=8)
       >>>
       >>> # Extract syllables from text
       >>> syllables = extractor.extract_syllables_from_text("Hello wonderful world")
       >>> print(sorted(syllables))
       ['der', 'ful', 'hel', 'lo', 'won', 'world']
       >>>
       >>> # Extract from a file
       >>> syllables = extractor.extract_syllables_from_file(Path('input.txt'))
       >>>
       >>> # Save results
       >>> extractor.save_syllables(syllables, Path('output.txt'))

   CLI Usage:

   .. code-block:: bash

      # Interactive mode
      python -m build_tools.pyphen_syllable_extractor

      # Single file with specific language
      python -m build_tools.pyphen_syllable_extractor --file input.txt --lang en_US

      # Batch processing with auto-detection
      python -m build_tools.pyphen_syllable_extractor --source ~/docs/ --recursive --auto



Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/build_tools/pyphen_syllable_extractor/batch/index
   /autoapi/build_tools/pyphen_syllable_extractor/cli/index
   /autoapi/build_tools/pyphen_syllable_extractor/extractor/index
   /autoapi/build_tools/pyphen_syllable_extractor/file_io/index
   /autoapi/build_tools/pyphen_syllable_extractor/interactive/index
   /autoapi/build_tools/pyphen_syllable_extractor/language_detection/index
   /autoapi/build_tools/pyphen_syllable_extractor/languages/index
   /autoapi/build_tools/pyphen_syllable_extractor/models/index


Attributes
----------

.. autoapisummary::

   build_tools.pyphen_syllable_extractor.main_interactive
   build_tools.pyphen_syllable_extractor.main_batch
   build_tools.pyphen_syllable_extractor.process_single_file_batch


Package Contents
----------------

.. py:data:: main_interactive

.. py:data:: main_batch

.. py:data:: process_single_file_batch
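
The length constraints and deterministic extraction listed in the module overview can be illustrated with a small, self-contained sketch. This is not the package's actual implementation: ``extract_syllables``, ``toy_hyphenate`` and ``TOY_HYPHENATION`` are hypothetical names, and the hard-coded dictionary stands in for a real hyphenation dictionary so that the example runs without third-party packages.

```python
import re
from typing import Callable, Set

# Hypothetical stand-in for a real hyphenation dictionary. With pyphen
# installed, pyphen.Pyphen(lang="en_US").inserted could be passed instead
# (an assumption about how the real extractor is wired, not a confirmed detail).
TOY_HYPHENATION = {
    "hello": "hel-lo",
    "wonderful": "won-der-ful",
    "world": "world",
}


def toy_hyphenate(word: str) -> str:
    """Return the word with '-' at break points (unknown words unchanged)."""
    return TOY_HYPHENATION.get(word, word)


def extract_syllables(
    text: str,
    hyphenate: Callable[[str], str],
    min_len: int = 2,
    max_len: int = 8,
) -> Set[str]:
    """Collect unique lowercase syllables whose length is within bounds."""
    syllables: Set[str] = set()
    # [^\W\d_]+ matches runs of Unicode letters, so accented words survive.
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        for part in hyphenate(word).split("-"):
            if min_len <= len(part) <= max_len:
                syllables.add(part)
    return syllables


# Sorting the set gives a stable, reproducible ordering for display.
result = sorted(extract_syllables("Hello wonderful world", toy_hyphenate))
print(result)
```

Because the result is a set built purely from the input text and the hyphenation callable, running the function twice on the same input yields the same syllables. Raising ``min_len`` to 3 in this sketch would drop ``lo`` from the output.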