build_tools.nltk_syllable_extractor.models
==========================================

.. py:module:: build_tools.nltk_syllable_extractor.models

.. autoapi-nested-parse::

   Data models for NLTK-based syllable extraction results.

   This module defines the data structures used to represent extraction results
   and their associated metadata for the NLTK syllable extractor.


Classes
-------

.. autoapisummary::

   build_tools.nltk_syllable_extractor.models.ExtractionResult
   build_tools.nltk_syllable_extractor.models.FileProcessingResult
   build_tools.nltk_syllable_extractor.models.BatchResult


Module Contents
---------------

.. py:class:: ExtractionResult

   Container for syllable extraction results and associated metadata.

   This dataclass stores both the extracted syllables and all relevant metadata
   about the extraction process for reporting and persistence.

   .. attribute:: syllables

      List of all syllables extracted (includes duplicates)

   .. attribute:: language_code

      Language code used (always "en_US" for NLTK extractor)

   .. attribute:: min_syllable_length

      Minimum syllable length constraint

   .. attribute:: max_syllable_length

      Maximum syllable length constraint

   .. attribute:: input_path

      Path to the input text file

   .. attribute:: timestamp

      When the extraction was performed

   .. attribute:: only_hyphenated

      Whether whole words were excluded

   .. attribute:: length_distribution

      Map of syllable length to count

   .. attribute:: sample_syllables

      Representative sample of extracted syllables

   .. attribute:: total_words

      Total words found in source text

   .. attribute:: fallback_count

      Words not in CMUDict (used fallback heuristics)

   .. attribute:: rejected_syllables

      Syllables rejected due to length constraints

   .. attribute:: processed_words

      Words that were successfully processed

   .. py:attribute:: syllables
      :type: List[str]

   .. py:attribute:: language_code
      :type: str

   .. py:attribute:: min_syllable_length
      :type: int

   .. py:attribute:: max_syllable_length
      :type: int

   .. py:attribute:: input_path
      :type: pathlib.Path

   .. py:attribute:: timestamp
      :type: datetime.datetime

   .. py:attribute:: only_hyphenated
      :type: bool
      :value: True

   .. py:attribute:: length_distribution
      :type: Dict[int, int]

   .. py:attribute:: sample_syllables
      :type: List[str]
      :value: []

   .. py:attribute:: total_words
      :type: int
      :value: 0

   .. py:attribute:: fallback_count
      :type: int
      :value: 0

   .. py:attribute:: rejected_syllables
      :type: int
      :value: 0

   .. py:attribute:: processed_words
      :type: int
      :value: 0

   .. py:method:: format_metadata()

      Format extraction metadata as a human-readable string.

      :returns: Multi-line string containing all extraction metadata formatted
                for display or file output.


.. py:class:: FileProcessingResult

   Result of processing a single file in batch mode.

   This dataclass stores the outcome of processing one file during batch
   operations, including success status, extracted syllables count, and
   any error information if processing failed.

   .. attribute:: input_path

      Path to the input file that was processed

   .. attribute:: success

      Whether processing completed successfully

   .. attribute:: syllables_count

      Number of unique syllables extracted (0 if failed)

   .. attribute:: language_code

      Language code used (always "en_US")

   .. attribute:: syllables_output_path

      Path where syllables were saved (None if failed)

   .. attribute:: metadata_output_path

      Path where metadata was saved (None if failed)

   .. attribute:: error_message

      Error message if processing failed (None if success)

   .. attribute:: processing_time

      Time taken to process this file in seconds

   .. admonition:: Example

      >>> result = FileProcessingResult(
      ...     input_path=Path("book.txt"),
      ...     success=True,
      ...     syllables_count=245,
      ...     language_code="en_US",
      ...     syllables_output_path=Path("output.syllables.en_US.txt"),
      ...     metadata_output_path=Path("output.meta.en_US.txt"),
      ...     processing_time=2.45
      ... )
      >>> print(f"Processed {result.syllables_count} syllables")
      Processed 245 syllables

   .. py:attribute:: input_path
      :type: pathlib.Path

   .. py:attribute:: success
      :type: bool

   .. py:attribute:: syllables_count
      :type: int

   .. py:attribute:: language_code
      :type: str

   .. py:attribute:: syllables_output_path
      :type: Optional[pathlib.Path]
      :value: None

   .. py:attribute:: metadata_output_path
      :type: Optional[pathlib.Path]
      :value: None

   .. py:attribute:: error_message
      :type: Optional[str]
      :value: None

   .. py:attribute:: processing_time
      :type: float
      :value: 0.0


.. py:class:: BatchResult

   Aggregate results from a batch processing operation.

   This dataclass stores summary statistics and individual file results
   from processing multiple files in batch mode.

   .. attribute:: total_files

      Total number of files attempted in the batch

   .. attribute:: successful

      Number of files processed successfully

   .. attribute:: failed

      Number of files that failed to process

   .. attribute:: results

      List of individual FileProcessingResult objects

   .. attribute:: total_time

      Total time taken for entire batch operation in seconds

   .. attribute:: output_directory

      Directory where all outputs were saved

   .. admonition:: Example

      >>> result = BatchResult(
      ...     total_files=5,
      ...     successful=4,
      ...     failed=1,
      ...     results=[...],
      ...     total_time=12.34,
      ...     output_directory=Path("_working/output")
      ... )
      >>> print(f"Success rate: {result.successful/result.total_files*100:.1f}%")
      Success rate: 80.0%

   .. py:attribute:: total_files
      :type: int

   .. py:attribute:: successful
      :type: int

   .. py:attribute:: failed
      :type: int

   .. py:attribute:: results
      :type: List[FileProcessingResult]

   .. py:attribute:: total_time
      :type: float

   .. py:attribute:: output_directory
      :type: pathlib.Path

   .. py:method:: format_summary()

      Format batch processing summary as a human-readable string.

      Creates a detailed summary report showing overall statistics,
      successful extractions with details, and failed files with error messages.

      :returns: Multi-line formatted string with batch statistics and results

      .. admonition:: Example

         >>> summary = batch_result.format_summary()
         >>> print(summary)
         ======================================================================
         BATCH PROCESSING SUMMARY
         ======================================================================
         Total Files: 5
         Successful: 4 (80.0%)
         ...
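
The relationship between these models can be sketched as follows. This is a
minimal illustration under stated assumptions, not the package's implementation:
the two dataclasses are hypothetical stand-ins that copy the documented fields
so the snippet runs on its own, and ``summarize`` and ``length_distribution``
are helper names introduced here for illustration only. The sketch shows how a
list of ``FileProcessingResult`` objects could be rolled up into a
``BatchResult``, and how a length-to-count map like
``ExtractionResult.length_distribution`` could be derived from a syllable list.

.. code-block:: python

   from collections import Counter
   from dataclasses import dataclass
   from pathlib import Path
   from typing import Dict, List, Optional


   @dataclass
   class FileProcessingResult:
       """Stand-in mirroring the documented fields (not the real class)."""
       input_path: Path
       success: bool
       syllables_count: int
       language_code: str
       syllables_output_path: Optional[Path] = None
       metadata_output_path: Optional[Path] = None
       error_message: Optional[str] = None
       processing_time: float = 0.0


   @dataclass
   class BatchResult:
       """Stand-in mirroring the documented fields (not the real class)."""
       total_files: int
       successful: int
       failed: int
       results: List[FileProcessingResult]
       total_time: float
       output_directory: Path


   def summarize(results: List[FileProcessingResult],
                 output_directory: Path) -> BatchResult:
       """Hypothetical helper: aggregate per-file results into a BatchResult."""
       successful = sum(1 for r in results if r.success)
       return BatchResult(
           total_files=len(results),
           successful=successful,
           failed=len(results) - successful,
           results=list(results),
           # Assumption: batch time is the sum of per-file times; the real
           # extractor may measure wall-clock time for the whole batch instead.
           total_time=sum(r.processing_time for r in results),
           output_directory=output_directory,
       )


   def length_distribution(syllables: List[str]) -> Dict[int, int]:
       """Hypothetical helper: map each syllable length to its count."""
       return dict(Counter(len(s) for s in syllables))


   results = [
       FileProcessingResult(Path("a.txt"), True, 245, "en_US",
                            processing_time=2.5),
       FileProcessingResult(Path("b.txt"), False, 0, "en_US",
                            error_message="unreadable file",
                            processing_time=0.5),
   ]
   batch = summarize(results, Path("_working/output"))
   print(f"Success rate: {batch.successful / batch.total_files * 100:.1f}%")
   # prints: Success rate: 50.0%

   # Duplicates are counted, matching the documented ``syllables`` list.
   print(length_distribution(["ka", "to", "ment", "ka"]))
   # prints: {2: 3, 4: 1}

Deriving ``failed`` from ``total_files - successful`` keeps the three counters
consistent by construction, which is one reasonable design choice when
building the aggregate by hand.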