build_tools.syllable_walk_web.services.pipeline_runner
Pipeline runner service for the web application.
Executes extraction → normalization → annotation → database build as sequential subprocesses in a background thread.
Attributes
Functions
- Start a pipeline run in a background thread.
- Cancel a running pipeline job.
- Return current pipeline job status as a JSON-serialisable dict.
Module Contents
- build_tools.syllable_walk_web.services.pipeline_runner.STAGE_NAMES = ('extract', 'normalize', 'annotate', 'database')
- build_tools.syllable_walk_web.services.pipeline_runner.start_pipeline(job, *, extractor='pyphen', language='auto', source_path=None, output_dir=None, file_pattern='*.txt', min_syllable_length=2, max_syllable_length=8, run_normalize=True, run_annotate=True)
Start a pipeline run in a background thread.
Updates job state in-place as stages progress.
Parameters:
job (build_tools.syllable_walk_web.state.PipelineJobState) – Mutable pipeline job state (shared with server).
extractor (str) – "pyphen" or "nltk".
language (str) – Language code for pyphen (e.g. "en_US", "auto").
source_path (str | None) – Source directory containing text files.
output_dir (str | None) – Parent directory for pipeline output.
file_pattern (str) – Glob pattern for input files.
min_syllable_length (int) – Minimum syllable length filter.
max_syllable_length (int) – Maximum syllable length filter.
run_normalize (bool) – Whether to run normalization stage.
run_annotate (bool) – Whether to run annotation stage (requires normalization).
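
The behaviour described above (sequential subprocess stages in a background thread, with a shared state object mutated as stages progress) can be illustrated with a minimal, self-contained sketch. It does not reproduce the module's actual implementation. DemoJobState, select_stages, run_stages and start_demo_pipeline are hypothetical names, the fields of the real PipelineJobState may differ, and the placeholder commands stand in for the real stage tools. Skipping annotation when normalization is off, and always running the database stage, are assumptions made for illustration.

```python
import subprocess
import sys
import threading
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DemoJobState:
    """Hypothetical stand-in for PipelineJobState; real fields may differ."""
    status: str = "idle"
    current_stage: Optional[str] = None
    completed: List[str] = field(default_factory=list)
    error: Optional[str] = None


def select_stages(run_normalize=True, run_annotate=True):
    """Choose stages in order; annotation is only kept when normalization runs.

    Assumption: the database stage always runs, even without normalization.
    """
    stages = ["extract"]
    if run_normalize:
        stages.append("normalize")
        if run_annotate:
            stages.append("annotate")
    stages.append("database")
    return stages


def run_stages(job, stages, commands):
    """Run each stage as a subprocess in order, stopping at the first failure."""
    job.status = "running"
    for name in stages:
        job.current_stage = name
        result = subprocess.run(commands[name], capture_output=True, text=True)
        if result.returncode != 0:
            job.status = "failed"
            job.error = f"{name} exited with code {result.returncode}"
            return
        job.completed.append(name)
    job.current_stage = None
    job.status = "completed"


def start_demo_pipeline(job, *, run_normalize=True, run_annotate=True):
    """Start the stages in a background thread and return the thread."""
    stages = select_stages(run_normalize, run_annotate)
    # Placeholder commands: each stage is a Python one-liner that exits with 0.
    commands = {name: [sys.executable, "-c", "pass"] for name in stages}
    thread = threading.Thread(
        target=run_stages, args=(job, stages, commands), daemon=True
    )
    thread.start()
    return thread


job = DemoJobState()
thread = start_demo_pipeline(job, run_annotate=False)
thread.join()  # a web server would poll the job state instead of joining
print(job.status, job.completed)
```

In this sketch the caller keeps a reference to the state object and reads it while the thread runs, which mirrors the "mutable pipeline job state (shared with server)" wording of the job parameter. A production version would likely need a lock or another synchronisation mechanism around the shared state.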