build_tools.syllable_walk_web.services.pipeline_runner

Pipeline runner service for the web application.

Executes extraction → normalization → annotation → database build as sequential subprocesses in a background thread.
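The pattern described above (stages run one after another as subprocesses, driven from a background thread that updates a shared job object) can be sketched roughly as follows. This is an illustration, not the module's implementation: `DemoJob`, its field names, `start_demo`, and the placeholder commands are all hypothetical, and stopping at the first non-zero exit code is an assumed policy.

```python
import subprocess
import sys
import threading
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

STAGE_NAMES = ("extract", "normalize", "annotate", "database")


@dataclass
class DemoJob:
    """Hypothetical stand-in for PipelineJobState; field names are assumed."""
    status: str = "idle"
    current_stage: Optional[str] = None
    completed: List[str] = field(default_factory=list)
    error: Optional[str] = None


def _run_stages(job: DemoJob, commands: Sequence[Tuple[str, List[str]]]) -> None:
    """Run each (stage name, command) pair in order, mutating job in place."""
    job.status = "running"
    for name, cmd in commands:
        job.current_stage = name
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Assumed policy: the first failing stage aborts the run.
            job.status = "failed"
            job.error = f"{name} exited with code {result.returncode}"
            return
        job.completed.append(name)
    job.current_stage = None
    job.status = "completed"


def start_demo(job: DemoJob, commands) -> threading.Thread:
    """Start the stage loop in a daemon thread and return immediately."""
    thread = threading.Thread(target=_run_stages, args=(job, commands), daemon=True)
    thread.start()
    return thread


# Placeholder commands: each "stage" is a Python subprocess that does nothing.
demo_job = DemoJob()
demo_commands = [(name, [sys.executable, "-c", "pass"]) for name in STAGE_NAMES]
worker = start_demo(demo_job, demo_commands)
worker.join(timeout=60)
print(demo_job.status, demo_job.completed)
```

Because the caller gets control back as soon as the thread starts, a server can keep answering requests and read the job object whenever it needs progress information.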

Attributes

STAGE_NAMES

Functions

start_pipeline(job, *[, extractor, language, ...])

Start a pipeline run in a background thread.

cancel_pipeline(job)

Cancel a running pipeline job.

get_status(job)

Return current pipeline job status as a JSON-serialisable dict.

Module Contents

build_tools.syllable_walk_web.services.pipeline_runner.STAGE_NAMES = ('extract', 'normalize', 'annotate', 'database')

build_tools.syllable_walk_web.services.pipeline_runner.start_pipeline(job, *, extractor='pyphen', language='auto', source_path=None, output_dir=None, file_pattern='*.txt', min_syllable_length=2, max_syllable_length=8, run_normalize=True, run_annotate=True)

Start a pipeline run in a background thread.

Updates job state in-place as stages progress.

Parameters:
  • job (build_tools.syllable_walk_web.state.PipelineJobState) – Mutable pipeline job state (shared with server).

  • extractor (str) – "pyphen" or "nltk".

  • language (str) – Language code for pyphen (e.g. "en_US", "auto").

  • source_path (str | None) – Source directory containing text files.

  • output_dir (str | None) – Parent directory for pipeline output.

  • file_pattern (str) – Glob pattern for input files.

  • min_syllable_length (int) – Minimum syllable length filter.

  • max_syllable_length (int) – Maximum syllable length filter.

  • run_normalize (bool) – Whether to run the normalization stage.

  • run_annotate (bool) – Whether to run the annotation stage (requires normalization).
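The constraints stated in the parameter list (keyword-only options, two extractor choices, a minimum and maximum length, annotation depending on normalization) can be checked up front with a small helper. The sketch below is hypothetical: `check_options` is not part of this module, and raising `ValueError` on an invalid combination is an assumption. The real function may handle such cases differently, for example by skipping the annotation stage.

```python
from typing import Any, Dict

KNOWN_EXTRACTORS = ("pyphen", "nltk")  # the two values named in the docs


def check_options(
    *,
    extractor: str = "pyphen",
    min_syllable_length: int = 2,
    max_syllable_length: int = 8,
    run_normalize: bool = True,
    run_annotate: bool = True,
) -> Dict[str, Any]:
    """Hypothetical pre-flight check mirroring the documented defaults."""
    if extractor not in KNOWN_EXTRACTORS:
        raise ValueError(f"extractor must be one of {KNOWN_EXTRACTORS}")
    if min_syllable_length > max_syllable_length:
        raise ValueError("min_syllable_length exceeds max_syllable_length")
    if run_annotate and not run_normalize:
        # Assumed behavior: reject the combination rather than skip silently.
        raise ValueError("run_annotate=True requires run_normalize=True")
    return {
        "extractor": extractor,
        "min_syllable_length": min_syllable_length,
        "max_syllable_length": max_syllable_length,
        "run_normalize": run_normalize,
        "run_annotate": run_annotate,
    }


options = check_options(extractor="nltk", run_annotate=False)
# The validated options could then be forwarded as keywords, e.g.
# start_pipeline(job, **options), given a real PipelineJobState instance.
```

The bare `*` in the signature makes every option keyword-only, matching the documented `start_pipeline(job, *, ...)` form, so call sites stay readable as the option list grows.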

build_tools.syllable_walk_web.services.pipeline_runner.cancel_pipeline(job)

Cancel a running pipeline job.
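The source does not say how cancellation is carried out. One common approach for a thread that supervises a subprocess is to set a flag and terminate the child that is currently running. The sketch below illustrates that approach only: `CancellableRun`, `run_stage`, and `cancel` are hypothetical names, and the status strings are assumptions.

```python
import subprocess
import sys
import threading
import time
from typing import List, Optional


class CancellableRun:
    """Hypothetical job holder; not the module's PipelineJobState."""

    def __init__(self) -> None:
        self.cancel_requested = threading.Event()
        self.process: Optional[subprocess.Popen] = None
        self.status = "idle"
        self.lock = threading.Lock()


def run_stage(run: CancellableRun, cmd: List[str]) -> None:
    """Run one subprocess unless cancellation was requested first."""
    with run.lock:
        if run.cancel_requested.is_set():
            run.status = "cancelled"
            return
        run.process = subprocess.Popen(cmd)
        run.status = "running"
    run.process.wait()
    with run.lock:
        run.status = "cancelled" if run.cancel_requested.is_set() else "completed"


def cancel(run: CancellableRun) -> None:
    """Request cancellation and terminate the child if it is still alive."""
    run.cancel_requested.set()
    with run.lock:
        proc = run.process
    if proc is not None and proc.poll() is None:
        proc.terminate()


run = CancellableRun()
slow_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
worker = threading.Thread(target=run_stage, args=(run, slow_cmd), daemon=True)
worker.start()

deadline = time.monotonic() + 10
while run.process is None and time.monotonic() < deadline:
    time.sleep(0.01)  # wait until the child has actually been spawned

cancel(run)
worker.join(timeout=10)
print(run.status)
```

Setting the flag before touching the process means a stage that has not started yet is skipped, while one that is already running is interrupted.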

build_tools.syllable_walk_web.services.pipeline_runner.get_status(job)

Return current pipeline job status as a JSON-serialisable dict.
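The keys of the returned dict are not documented here. As a rough illustration of what "JSON-serialisable" implies, the sketch below snapshots a hypothetical job object into a dict of plain types and round-trips it through `json`. `DemoJobState`, its fields, and `demo_status` are assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DemoJobState:
    """Hypothetical job state holding only JSON-friendly field types."""
    status: str = "idle"
    current_stage: Optional[str] = None
    completed_stages: List[str] = field(default_factory=list)
    error: Optional[str] = None


def demo_status(job: DemoJobState) -> Dict[str, Any]:
    """Return a copy of the job's fields as str, list, and None values."""
    # asdict copies nested lists, so later mutation of the job does not
    # change a snapshot that was already handed to the caller.
    return asdict(job)


job = DemoJobState(status="running", current_stage="normalize",
                   completed_stages=["extract"])
snapshot = demo_status(job)
payload = json.dumps(snapshot)  # would raise TypeError on non-serialisable values
job.completed_stages.append("normalize")  # the snapshot is unaffected
print(payload)
```

Returning a copy rather than the live object keeps the web layer from observing a half-updated state while the background thread continues to mutate the job.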