build_tools.syllable_walk_web.services.pipeline_runner

Pipeline runner service for the web application.

Executes the four pipeline stages (extraction → normalization → annotation → database build) as sequential subprocesses in a background thread.
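The pattern described above (stages run one after another as child processes, all driven from a single worker thread) can be sketched as follows. This is only an illustration, not the module's implementation: `DemoJob`, `run_stages`, and the trivial `python -c "pass"` commands are hypothetical stand-ins, and the real stage commands and job state fields may differ.

```python
import subprocess
import sys
import threading


class DemoJob:
    """Hypothetical stand-in for the shared, mutable job state."""

    def __init__(self):
        self.status = "idle"
        self.current_stage = None
        self.completed = []


def run_stages(job, stages):
    """Run each (name, argv) stage in order and stop at the first failure."""
    job.status = "running"
    for name, argv in stages:
        job.current_stage = name
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            job.status = "failed"
            return
        job.completed.append(name)
    job.current_stage = None
    job.status = "completed"


# Each demo stage is a child Python process that exits immediately.
demo_stages = [
    (name, [sys.executable, "-c", "pass"])
    for name in ("extract", "normalize", "annotate", "database")
]

demo_job = DemoJob()
worker = threading.Thread(
    target=run_stages, args=(demo_job, demo_stages), daemon=True
)
worker.start()  # the caller is free to continue while stages run
worker.join()   # joined here only so the demo can inspect the result
print(demo_job.status, demo_job.completed)
```

Running the stages in a worker thread keeps the calling thread responsive, while the job object is updated in place so that other code can observe progress.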

Attributes

STAGE_NAMES

Functions

`start_pipeline`(job, *[, extractor, language, ...])	Start a pipeline run in a background thread.
`cancel_pipeline`(job)	Cancel a running pipeline job.
`get_status`(job)	Return current pipeline job status as a JSON-serialisable dict.

Module Contents

build_tools.syllable_walk_web.services.pipeline_runner.STAGE_NAMES = ('extract', 'normalize', 'annotate', 'database')
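As a rough illustration of how the ordered stage tuple might interact with the `run_normalize` and `run_annotate` flags of `start_pipeline`, the sketch below filters the stage names. `plan_stages` is a hypothetical helper, and the choice to skip annotation whenever normalization is disabled is an assumption drawn from the note that annotation requires normalization. The module may handle that case differently.

```python
# Copied from the documented value so the sketch runs without the package.
STAGE_NAMES = ("extract", "normalize", "annotate", "database")


def plan_stages(run_normalize=True, run_annotate=True):
    """Return the stage names that would run, in STAGE_NAMES order."""
    skipped = set()
    if not run_normalize:
        # Assumption: annotation needs normalized input, so skip it as well.
        skipped.update({"normalize", "annotate"})
    if not run_annotate:
        skipped.add("annotate")
    return [name for name in STAGE_NAMES if name not in skipped]


print(plan_stages())
print(plan_stages(run_annotate=False))
print(plan_stages(run_normalize=False))
```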

build_tools.syllable_walk_web.services.pipeline_runner.start_pipeline(job, *, extractor='pyphen', language='auto', source_path=None, output_dir=None, file_pattern='*.txt', min_syllable_length=2, max_syllable_length=8, run_normalize=True, run_annotate=True)[source]

Start a pipeline run in a background thread.

Updates job state in-place as stages progress.

Parameters:

job (build_tools.syllable_walk_web.state.PipelineJobState) – Mutable pipeline job state (shared with server).
extractor (str) – "pyphen" or "nltk".
language (str) – Language code for pyphen (e.g. "en_US"), or "auto" (the default).
source_path (str | None) – Source directory containing text files.
output_dir (str | None) – Parent directory for pipeline output.
file_pattern (str) – Glob pattern for input files.
min_syllable_length (int) – Minimum syllable length filter.
max_syllable_length (int) – Maximum syllable length filter.
run_normalize (bool) – Whether to run the normalization stage.
run_annotate (bool) – Whether to run the annotation stage (requires the normalization stage).
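A call site might assemble the keyword arguments up front and check them before handing them to `start_pipeline`. The sketch below is illustrative only: `make_start_kwargs` is a hypothetical helper, the paths are placeholders, and nothing here says whether `start_pipeline` performs similar checks itself. The actual call is left as a comment because it needs a real `PipelineJobState` instance.

```python
def make_start_kwargs(source_path, output_dir, *, extractor="pyphen",
                      language="auto", min_syllable_length=2,
                      max_syllable_length=8):
    """Hypothetical helper: validate options and build keyword arguments."""
    if extractor not in ("pyphen", "nltk"):
        raise ValueError(f"unsupported extractor: {extractor!r}")
    if min_syllable_length > max_syllable_length:
        raise ValueError("min_syllable_length exceeds max_syllable_length")
    return {
        "extractor": extractor,
        "language": language,
        "source_path": source_path,
        "output_dir": output_dir,
        "min_syllable_length": min_syllable_length,
        "max_syllable_length": max_syllable_length,
    }


# Placeholder paths, chosen only for the example.
kwargs = make_start_kwargs("corpus/", "output/", language="en_US")

# With the package installed and a job state object in hand:
#   from build_tools.syllable_walk_web.services.pipeline_runner import start_pipeline
#   start_pipeline(job, **kwargs)
print(kwargs)
```

Because every option after `job` is keyword-only in the documented signature, passing the options as a dict with `**kwargs` matches the calling convention.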

build_tools.syllable_walk_web.services.pipeline_runner.cancel_pipeline(job)[source]

Cancel a running pipeline job.
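How cancellation is implemented is not documented here. One common approach for subprocess-based workers, shown as a sketch under that assumption, is to set a flag on the job and terminate the child process that is currently running. `DemoJob`, `run_long_stage`, and `cancel` are hypothetical names, not part of this module.

```python
import subprocess
import sys
import threading
import time


class DemoJob:
    """Hypothetical job state holding the active child process."""

    def __init__(self):
        self.status = "idle"
        self.process = None
        self.cancel_requested = threading.Event()


def run_long_stage(job):
    """Run one slow stage and record whether it was cancelled."""
    job.status = "running"
    job.process = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    job.process.wait()
    job.status = "cancelled" if job.cancel_requested.is_set() else "completed"


def cancel(job):
    """Flag the job as cancelled and stop the child if it is still alive."""
    job.cancel_requested.set()
    if job.process is not None and job.process.poll() is None:
        job.process.terminate()


demo_job = DemoJob()
worker = threading.Thread(target=run_long_stage, args=(demo_job,), daemon=True)
worker.start()

# Wait (with a deadline) until the child process has been started.
deadline = time.monotonic() + 5
while demo_job.process is None and time.monotonic() < deadline:
    time.sleep(0.01)

cancel(demo_job)
worker.join(timeout=10)
print(demo_job.status)
```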

build_tools.syllable_walk_web.services.pipeline_runner.get_status(job)[source]

Return current pipeline job status as a JSON-serialisable dict.
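The fields of the returned dict are not listed here. As a sketch of what "JSON-serialisable" implies, the example below builds a status dict from plain strings, numbers, lists, and `None`, then round-trips it through `json`. `DemoJob`, `status_dict`, and every field name are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DemoJob:
    """Hypothetical job state; the real fields may differ."""

    status: str = "idle"
    current_stage: Optional[str] = None
    progress_percent: int = 0
    log_lines: List[str] = field(default_factory=list)


def status_dict(job):
    """Copy the job state into a dict made only of JSON-friendly types."""
    return {
        "status": job.status,
        "current_stage": job.current_stage,
        "progress_percent": job.progress_percent,
        # Copy the list so later log appends do not mutate the snapshot.
        "log_lines": list(job.log_lines),
    }


demo_job = DemoJob(status="running", current_stage="normalize",
                   progress_percent=25, log_lines=["extract finished"])
payload = json.dumps(status_dict(demo_job))
decoded = json.loads(payload)
print(decoded["status"], decoded["current_stage"])
```

Returning a snapshot instead of the live state object means a web handler can serialise the response without holding a reference to state that the worker thread is still mutating.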