build_tools.nltk_syllable_extractor.batch
Batch mode for the NLTK syllable extractor.
This module provides batch processing functionality for extracting syllables from multiple files using NLTK’s CMU Pronouncing Dictionary.
Functions
- process_single_file: Process a single file in batch mode with comprehensive error handling.
- process_batch: Process multiple files sequentially in batch mode.
- run_batch: Batch mode entry point for the NLTK syllable extractor CLI.
Module Contents
- build_tools.nltk_syllable_extractor.batch.process_single_file(input_path, min_len, max_len, output_dir, run_timestamp, verbose=False)
Process a single file in batch mode with comprehensive error handling.
This function attempts to extract syllables from a single file and saves the results. Unlike interactive mode, this function catches all exceptions and returns a result object indicating success or failure, allowing batch processing to continue even when individual files fail.
- Parameters:
input_path (pathlib.Path) – Path to the input text file to process
min_len (int) – Minimum syllable length to include in results
max_len (int) – Maximum syllable length to include in results
output_dir (pathlib.Path) – Directory where output files should be saved
run_timestamp (str) – Timestamp for the batch run (shared across all files in batch)
verbose (bool) – If True, print detailed progress messages (default: False)
- Returns:
FileProcessingResult object with the success status, the syllable count, and either the output paths (on success) or an error message (on failure).
- Return type:
build_tools.nltk_syllable_extractor.models.FileProcessingResult
Note
This function never raises exceptions. All errors are caught and returned in the FileProcessingResult.error_message field.
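The "never raises" contract described above can be illustrated with a minimal sketch. The names `FileResultSketch` and `process_one_sketch` are hypothetical, the result fields are assumed rather than taken from the real FileProcessingResult, and the extraction step is replaced by a simple token count.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class FileResultSketch:
    """Hypothetical stand-in for FileProcessingResult (field names are assumed)."""

    input_path: Path
    success: bool
    syllables_count: int = 0
    error_message: Optional[str] = None


def process_one_sketch(input_path: Path) -> FileResultSketch:
    """Return a result object instead of raising, whatever goes wrong."""
    try:
        text = input_path.read_text(encoding="utf-8")
        # Placeholder for the real extraction step: count whitespace-separated tokens.
        count = len(text.split())
        return FileResultSketch(input_path, success=True, syllables_count=count)
    except Exception as exc:  # deliberately broad so one bad file cannot stop a batch
        return FileResultSketch(input_path, success=False, error_message=str(exc))
```

Because the failure is carried in the returned object, a caller can inspect `success` and `error_message` per file and move on to the next input.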
- build_tools.nltk_syllable_extractor.batch.process_batch(files, min_len, max_len, output_dir, quiet=False, verbose=False)
Process multiple files sequentially in batch mode.
This is a backwards-compatible wrapper around run_batch_extraction.
- Parameters:
files (list[pathlib.Path]) – List of input file paths to process
min_len (int) – Minimum syllable length to include
max_len (int) – Maximum syllable length to include
output_dir (pathlib.Path) – Output directory for all results (created if needed)
quiet (bool) – If True, suppress all output except errors (default: False)
verbose (bool) – If True, show detailed progress for each file (default: False)
- Returns:
BatchResult with overall statistics and individual file results.
- Return type:
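Sequential batch processing with overall statistics might look roughly like the sketch below. `BatchSummarySketch` and `run_sequentially` are hypothetical names, and the summary fields are assumptions; the real BatchResult and run_batch_extraction may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BatchSummarySketch:
    """Hypothetical stand-in for BatchResult (field names are assumed)."""

    total: int = 0
    succeeded: int = 0
    failed: int = 0
    results: List[Dict] = field(default_factory=list)


def run_sequentially(items: List[str], worker: Callable[[str], Dict]) -> BatchSummarySketch:
    """Run worker on each item in order and tally successes and failures."""
    summary = BatchSummarySketch()
    for item in items:
        outcome = worker(item)  # worker must return a dict with a boolean "ok" key
        summary.total += 1
        if outcome["ok"]:
            summary.succeeded += 1
        else:
            summary.failed += 1
        summary.results.append(outcome)
    return summary


# Illustrative worker: an empty string plays the role of a file that fails.
summary = run_sequentially(["a.txt", "", "b.txt"], lambda name: {"ok": bool(name), "item": name})
```

The loop relies on the worker reporting failure in its return value rather than raising, which is what lets the remaining items still run.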
- build_tools.nltk_syllable_extractor.batch.run_batch(args)
Batch mode entry point for the NLTK syllable extractor CLI.
This function processes multiple files based on command-line arguments, providing progress indicators and comprehensive error reporting.
- Parameters:
args (argparse.Namespace) – Parsed command-line arguments containing:
  - file: Single file path (optional)
  - files: List of file paths (optional)
  - source: Directory path for scanning (optional)
  - pattern: File pattern for directory scanning (default: "*.txt")
  - recursive: Whether to scan directories recursively
  - min: Minimum syllable length (default: 1)
  - max: Maximum syllable length (default: 999)
  - output: Output directory (default: _working/output/)
  - quiet: Suppress progress indicators
  - verbose: Show detailed processing information
- Exit Codes:
0: All files processed successfully
1: One or more files failed to process
- Raises:
SystemExit – On validation errors or processing completion
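As a rough illustration of the argument object and exit codes listed above, the sketch below builds an `argparse.Namespace` by hand using the documented attribute names and defaults. The `source` value and the helper `exit_code_for` are hypothetical, and the sketch only computes the exit code rather than calling `sys.exit` as the real entry point presumably does.

```python
import argparse


def exit_code_for(failed_count: int) -> int:
    """Return 0 when every file succeeded and 1 when at least one failed."""
    return 0 if failed_count == 0 else 1


# Attribute names and defaults follow the documented list; values are illustrative.
args = argparse.Namespace(
    file=None,
    files=None,
    source="corpus/",  # hypothetical directory to scan
    pattern="*.txt",
    recursive=False,
    min=1,
    max=999,
    output="_working/output/",
    quiet=False,
    verbose=False,
)
```

Constructing the namespace directly like this can be convenient in tests, since it avoids assuming how the command-line flags themselves are spelled.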