build_tools.pyphen_syllable_extractor.models
Data models for syllable extraction results.
This module defines the data structures used to represent extraction results and their associated metadata.
Classes
ExtractionResult | Container for syllable extraction results and associated metadata.
FileProcessingResult | Result of processing a single file in batch mode.
BatchResult | Aggregate results from a batch processing operation.
Module Contents
- class build_tools.pyphen_syllable_extractor.models.ExtractionResult[source]
Container for syllable extraction results and associated metadata.
This dataclass stores both the extracted syllables and all relevant metadata about the extraction process for reporting and persistence.
- syllables
Set of unique syllables extracted from the input text
- language_code
Pyphen language/locale code used for hyphenation
- min_syllable_length
Minimum syllable length constraint
- max_syllable_length
Maximum syllable length constraint
- input_path
Path to the input text file
- timestamp
When the extraction was performed
- only_hyphenated
Whether whole words were excluded
- length_distribution
Map of syllable length to count
- sample_syllables
Representative sample of extracted syllables
- total_words
Total words found in source text
- skipped_unhyphenated
Words skipped because they couldn’t be hyphenated
- rejected_syllables
Syllables rejected due to length constraints
- processed_words
Words that were successfully processed
- input_path: pathlib.Path
- timestamp: datetime.datetime
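As an illustration of how syllables, length_distribution, and sample_syllables relate to one another, here is a minimal sketch. It is not the module's implementation: the helper name summarize_syllables and its sample_size parameter are hypothetical, and the "sample" is assumed to be just the first few syllables in sorted order.

```python
from collections import Counter


def summarize_syllables(syllables, sample_size=3):
    """Hypothetical helper: derive a length histogram and a small sample."""
    # Map each syllable length to the number of syllables of that length.
    length_distribution = dict(Counter(len(s) for s in syllables))
    # Deterministic sample: the first few syllables in sorted order.
    sample_syllables = sorted(syllables)[:sample_size]
    return length_distribution, sample_syllables


dist, sample = summarize_syllables({"ta", "ble", "syl", "la"})
```

With this input, two syllables have length 2 and two have length 3, so the distribution maps each of those lengths to a count of 2.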
- class build_tools.pyphen_syllable_extractor.models.FileProcessingResult[source]
Result of processing a single file in batch mode.
This dataclass stores the outcome of processing one file during batch operations, including success status, extracted syllables count, and any error information if processing failed.
- input_path
Path to the input file that was processed
- success
Whether processing completed successfully
- syllables_count
Number of unique syllables extracted (0 if failed)
- language_code
Detected or specified language code used
- syllables_output_path
Path where syllables were saved (None if failed)
- metadata_output_path
Path where metadata was saved (None if failed)
- error_message
Error message if processing failed (None if success)
- processing_time
Time taken to process this file in seconds
Example
>>> from pathlib import Path
>>> from build_tools.pyphen_syllable_extractor.models import FileProcessingResult
>>> result = FileProcessingResult(
...     input_path=Path("book.txt"),
...     success=True,
...     syllables_count=245,
...     language_code="en_US",
...     syllables_output_path=Path("output.syllables.en_US.txt"),
...     metadata_output_path=Path("output.meta.en_US.txt"),
...     processing_time=2.45
... )
>>> print(f"Processed {result.syllables_count} syllables")
Processed 245 syllables
- input_path: pathlib.Path
- syllables_output_path: pathlib.Path | None = None
- metadata_output_path: pathlib.Path | None = None
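The success and failure conventions described above (a count of 0 and a populated error_message on failure) can be sketched with a small wrapper. Everything here is an assumption for illustration: FileResultSketch is a stand-in that mirrors only some of the documented fields, and process_one and its extract callback are hypothetical names, not part of the module.

```python
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional, Set


@dataclass
class FileResultSketch:
    """Stand-in mirroring some documented fields; not the real class."""
    input_path: Path
    success: bool
    syllables_count: int
    language_code: str
    error_message: Optional[str] = None
    processing_time: float = 0.0


def process_one(path: Path, language_code: str,
                extract: Callable[[Path], Set[str]]) -> FileResultSketch:
    """Hypothetical wrapper: run `extract` and record the outcome."""
    start = time.perf_counter()
    try:
        syllables = extract(path)
    except Exception as exc:
        # Record the failure instead of letting it abort the whole batch.
        return FileResultSketch(path, False, 0, language_code,
                                error_message=str(exc),
                                processing_time=time.perf_counter() - start)
    return FileResultSketch(path, True, len(syllables), language_code,
                            processing_time=time.perf_counter() - start)


def failing_extract(path: Path) -> Set[str]:
    """Simulates an unreadable input file."""
    raise OSError("cannot read file")


ok = process_one(Path("book.txt"), "en_US", lambda p: {"syl", "la", "ble"})
bad = process_one(Path("missing.txt"), "en_US", failing_extract)
```

Capturing the exception in the result object lets a batch driver continue with the remaining files and report every failure at the end.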
- class build_tools.pyphen_syllable_extractor.models.BatchResult[source]
Aggregate results from a batch processing operation.
This dataclass stores summary statistics and individual file results from processing multiple files in batch mode.
- total_files
Total number of files attempted in the batch
- successful
Number of files processed successfully
- failed
Number of files that failed to process
- results
List of individual FileProcessingResult objects
- total_time
Total time taken for entire batch operation in seconds
- output_directory
Directory where all outputs were saved
Example
>>> from pathlib import Path
>>> from build_tools.pyphen_syllable_extractor.models import BatchResult
>>> result = BatchResult(
...     total_files=5,
...     successful=4,
...     failed=1,
...     results=[...],
...     total_time=12.34,
...     output_directory=Path("_working/output")
... )
>>> print(f"Success rate: {result.successful/result.total_files*100:.1f}%")
Success rate: 80.0%
- results: List[FileProcessingResult]
- output_directory: pathlib.Path
- format_summary()[source]
Format batch processing summary as a human-readable string.
Creates a detailed summary report showing overall statistics, successful extractions with details, and failed files with error messages.
- Returns:
Multi-line formatted string with batch statistics and results
- Return type:
str
Example
>>> summary = batch_result.format_summary()
>>> print(summary)
======================================================================
BATCH PROCESSING SUMMARY
======================================================================
Total Files: 5
Successful: 4 (80.0%)
...
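To make the aggregation concrete, the following sketch derives the batch counts from a list of per-file outcomes and renders a report shaped like the header shown above. It is only an approximation under stated assumptions: FileOutcome and summarize are hypothetical stand-ins, the 70-character rule width is a guess, and the "Failed:" line and per-file error lines are assumed rather than taken from the real format_summary() output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileOutcome:
    """Stand-in for a per-file result; not the real FileProcessingResult."""
    name: str
    success: bool
    error_message: Optional[str] = None


def summarize(outcomes: List[FileOutcome]) -> str:
    """Hypothetical summary formatter built from per-file outcomes."""
    total = len(outcomes)
    successful = sum(1 for o in outcomes if o.success)
    failed = total - successful
    # Guard against division by zero for an empty batch.
    rate = successful / total * 100 if total else 0.0
    lines = [
        "=" * 70,
        "BATCH PROCESSING SUMMARY",
        "=" * 70,
        f"Total Files: {total}",
        f"Successful: {successful} ({rate:.1f}%)",
        f"Failed: {failed}",  # assumed line, not confirmed by the source
    ]
    # List each failed file with its error message.
    for o in outcomes:
        if not o.success:
            lines.append(f"  {o.name}: {o.error_message}")
    return "\n".join(lines)


outcomes = [FileOutcome(f"file{i}.txt", True) for i in range(4)]
outcomes.append(FileOutcome("broken.txt", False, "cannot read file"))
report = summarize(outcomes)
```

Deriving successful and failed from the result list, rather than storing them independently, keeps the counts consistent with the individual results.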