Pipeline TUI

Overview

Pipeline Build Tools TUI - Interactive interface for syllable extraction pipelines.

This TUI provides a visual interface for running and monitoring the syllable extraction, normalization, and annotation pipeline. It complements the existing CLI tools without replacing them.

Key Features:

  • Source Selection: Browse and select input text files/directories

  • Extractor Configuration: Choose pyphen or NLTK extractor with options

  • Pipeline Execution: Run extraction, normalization, annotation in sequence

  • Job Monitoring: Watch progress, view logs, inspect outputs

  • Run History: Browse previous pipeline runs from corpus_db

Relationship to CLI Tools:

This TUI wraps the existing CLI tools and does not replace them:

  • build_tools.pyphen_syllable_extractor - Pyphen extraction

  • build_tools.nltk_syllable_extractor - NLTK extraction

  • build_tools.pyphen_syllable_normaliser - Pyphen normalization

  • build_tools.nltk_syllable_normaliser - NLTK normalization

  • build_tools.syllable_feature_annotator - Feature annotation

Architecture:

pipeline_tui/
├── __init__.py           # Package entry point
├── __main__.py           # Entry point (python -m build_tools.pipeline_tui)
├── core/
│   ├── app.py            # Main PipelineTuiApp class
│   └── state.py          # Application state management
├── screens/
│   ├── config.py         # Pipeline configuration screen
│   ├── monitor.py        # Job monitoring screen
│   └── history.py        # Run history browser
└── services/
    ├── pipeline.py       # Pipeline execution service
    └── validators.py     # Directory/file validators

Usage:

# Launch the TUI
python -m build_tools.pipeline_tui

# With initial directory
python -m build_tools.pipeline_tui --source ~/corpora

Shared Components:

This TUI uses shared components from build_tools.tui_common, which are listed under Architecture below.

The Pipeline TUI is an interactive terminal user interface for running and monitoring the syllable extraction pipeline. Built with Textual, it provides a visual interface for configuring and executing extraction, normalization, and annotation workflows.

Command-Line Interface

# Launch the TUI
python -m build_tools.pipeline_tui

# Start with a specific source directory
python -m build_tools.pipeline_tui --source ~/corpora/english

# Start with a specific output directory
python -m build_tools.pipeline_tui --output _working/output

Options:

Option          Description
--source PATH   Initial source directory for input files
--output PATH   Initial output directory for results (default: _working/output)
--theme NAME    Color theme (nord, dracula, monokai, textual-dark, textual-light)
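
The options above could be parsed along the following lines. This is a minimal sketch using argparse, not the package's actual entry point: the build_parser name is hypothetical, and restricting --theme to the listed names is an assumption. The "nord" default is taken from the PipelineTuiApp signature in the API reference.

```python
import argparse
from pathlib import Path

# Theme names as listed in the options table.
THEMES = ["nord", "dracula", "monokai", "textual-dark", "textual-light"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command-line options."""
    parser = argparse.ArgumentParser(prog="python -m build_tools.pipeline_tui")
    parser.add_argument("--source", type=Path, default=None,
                        help="Initial source directory for input files")
    parser.add_argument("--output", type=Path, default=Path("_working/output"),
                        help="Initial output directory for results")
    # Assumption: unknown theme names are rejected at parse time.
    parser.add_argument("--theme", default="nord", choices=THEMES,
                        help="Color theme")
    return parser


args = build_parser().parse_args(["--source", "corpora/english", "--theme", "dracula"])
print(args.source, args.output, args.theme)
```

The parsed values would then be handed to the application constructor as source_dir, output_dir, and theme.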

Interface Overview

The TUI uses a tabbed interface with three main screens:

Configure Tab

Set up extraction parameters:

  • Source directory selection (d key)

  • Individual file selection (f key)

  • Output directory selection (o key)

  • Extractor type (pyphen/NLTK)

  • Language selection (for pyphen)

  • Syllable length constraints

  • Pipeline stage toggles (normalize, annotate)

Monitor Tab

Watch job progress and logs:

  • Job status indicator

  • Progress bar

  • Current stage display

  • Log output area

  • Cancel button

History Tab

Browse previous pipeline runs:

  • List of runs from corpus_db

  • Run details panel

  • Output file browser

  • Re-run capability

Keyboard Shortcuts

Global Actions:

Key(s)       Action
q / Ctrl+Q   Quit application
? / F1       Show help
1            Switch to Configure tab
2            Switch to Monitor tab
3            Switch to History tab
d            Select source directory
f            Select individual source files
o            Select output directory
r            Run pipeline
c            Cancel running job

Directory Browser:

Key(s)    Action
j / ↓     Move down
k / ↑     Move up
h / ←     Collapse directory
l / →     Expand directory
Space     Toggle expand/collapse
Enter     Select directory
Esc       Cancel selection

Integration Guide

Typical Workflow

# 1. Launch the TUI
python -m build_tools.pipeline_tui

# 2. Select source directory (press 'd')
#    Navigate to directory containing .txt files

# 3. Select output directory (press 'o')
#    Choose where results will be saved

# 4. Configure extraction options
#    Select extractor type, language, etc.

# 5. Run pipeline (press 'r')
#    Monitor progress in Monitor tab

# 6. View results
#    Browse output in History tab

When to Use

Use the Pipeline TUI when:

  • You want a visual interface for pipeline configuration

  • You need to monitor job progress in real-time

  • You’re exploring different extraction options

  • You want to browse previous runs and their outputs

Use the CLI tools when:

  • You’re scripting automated pipelines

  • You need batch processing

  • You want precise control over individual steps

  • You’re integrating with other tools

Architecture

pipeline_tui/
├── __init__.py           # Package entry point
├── __main__.py           # CLI entry point
├── core/
│   ├── app.py            # Main PipelineTuiApp class
│   └── state.py          # Application state management
├── screens/
│   ├── __init__.py       # Screen exports
│   └── configure.py      # ConfigurePanel widget
└── services/
    ├── __init__.py
    └── validators.py     # Directory validation functions

State Management:

  • PipelineState: Top-level application state

  • ExtractionConfig: Extractor settings (type, language, constraints)

  • JobState: Current job execution status

Shared Components:

Uses build_tools.tui_common for:

  • DirectoryBrowserScreen: File browser modal

  • IntSpinner, FloatSlider: Parameter controls

  • RadioOption: Selection widgets

  • KeybindingConfig: Keybinding management

Notes

Current Status:

The Pipeline TUI has a working Configure tab with full settings UI. Monitor and History tabs are placeholders. Pipeline execution is planned for a future release.

Dependencies:

Requires Textual library:

pip install -e ".[dev]"

Python Version:

Requires Python 3.12+.

API Reference

Core

Core components for Pipeline TUI.

This module contains the main application class and state management for the pipeline TUI.

Components:

class build_tools.pipeline_tui.core.PipelineState(config=<factory>, job=<factory>, last_source_dir=<factory>, last_output_dir=<factory>, run_normalize=True, run_annotate=True)

Bases: object

Top-level application state for Pipeline TUI.

This dataclass holds all state for the application, including configuration, job status, and UI state.

config

Current extraction configuration

job

Current or most recent job state

last_source_dir

Last browsed source directory (for browser initial path)

last_output_dir

Last browsed output directory

run_normalize

Whether to run normalization after extraction

run_annotate

Whether to run annotation after normalization

complete_job(output_path)

Mark job as completed successfully.

Parameters:

output_path (Path) – Path to the output directory

Return type:

None

config: ExtractionConfig
fail_job(error)

Mark job as failed with error message.

Parameters:

error (str) – Error message describing the failure

Return type:

None

job: JobState
last_output_dir: Path
last_source_dir: Path
reset_job()

Reset job state to idle, preserving configuration.

Return type:

None

run_annotate: bool = True
run_normalize: bool = True
start_job()

Start a new job with current configuration.

Creates a new JobState with RUNNING status and current timestamp.

Return type:

None
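
The job lifecycle described by start_job, complete_job, fail_job, and reset_job can be sketched with plain dataclasses. This is an illustration under assumptions, not the package's code: the JobStatus values other than RUNNING, the JobState field names, and the SketchState class are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from pathlib import Path
from typing import Optional


class JobStatus(Enum):
    """Hypothetical status values; only RUNNING is named in the docs."""
    IDLE = "idle"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class JobState:
    """Hypothetical job record; field names are assumptions."""
    status: JobStatus = JobStatus.IDLE
    started_at: Optional[datetime] = None
    output_path: Optional[Path] = None
    error: Optional[str] = None


@dataclass
class SketchState:
    """Stand-in for PipelineState, reduced to the job-related parts."""
    job: JobState = field(default_factory=JobState)
    run_normalize: bool = True
    run_annotate: bool = True

    def start_job(self) -> None:
        # A fresh JobState with RUNNING status and the current timestamp.
        self.job = JobState(status=JobStatus.RUNNING, started_at=datetime.now())

    def complete_job(self, output_path: Path) -> None:
        self.job.status = JobStatus.COMPLETED
        self.job.output_path = output_path

    def fail_job(self, error: str) -> None:
        self.job.status = JobStatus.FAILED
        self.job.error = error

    def reset_job(self) -> None:
        # Only the job is replaced; the stage toggles are left untouched.
        self.job = JobState()


state = SketchState(run_annotate=False)
state.start_job()
state.fail_job("extractor exited with code 1")
state.reset_job()
print(state.job.status, state.run_annotate)
```

Replacing the whole JobState on start and reset keeps stale fields (an old error message, an old output path) from leaking into the next run.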

class build_tools.pipeline_tui.core.PipelineTuiApp(source_dir=None, output_dir=None, theme='nord')

Bases: App

Main application for Pipeline Build Tools TUI.

A Textual application providing an interactive interface for running syllable extraction, normalization, and annotation pipelines.

state

Application state (config, job status, UI state)

theme

Color theme name

Keybindings:
  • q: Quit application

  • ?: Show help screen

  • 1/2/3: Switch tabs

  • r: Run pipeline

  • c: Cancel job

  • d: Select source directory

  • f: Select individual source files

  • o: Select output directory

BINDINGS: ClassVar[list[BindingType]] = [Binding(key='q', action='quit', description='Quit', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='ctrl+q', action='quit', description='Quit', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='question_mark', action='help', description='Help', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='1', action='tab_configure', description='Configure', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='2', action='tab_monitor', description='Monitor', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='3', action='tab_history', description='History', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='r', action='run_pipeline', description='Run', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='c', action='cancel_job', description='Cancel', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='d', action='select_source', description='Directory', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='f', action='select_files', description='Select', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None), Binding(key='o', action='select_output', description='Output', show=True, key_display=None, priority=True, tooltip='', id=None, system=False, group=None)]

The default key bindings.

DEFAULT_CSS: ClassVar[str] = '\n    PipelineTuiApp {\n        background: $surface;\n    }\n\n    #main-container {\n        width: 100%;\n        height: 100%;\n    }\n\n    .status-bar {\n        height: 3;\n        background: $panel;\n        border: solid $primary;\n        padding: 0 1;\n    }\n\n    .status-label {\n        width: 1fr;\n    }\n\n    /* Ensure TabPane content fills available space */\n    TabPane {\n        height: 1fr;\n        width: 1fr;\n    }\n\n    ContentSwitcher {\n        height: 1fr;\n        width: 1fr;\n    }\n\n    .placeholder-content {\n        width: 100%;\n        height: 100%;\n        content-align: center middle;\n        color: $text-muted;\n    }\n    '

Default TCSS.

SUB_TITLE: str | None = 'Syllable Extraction Pipeline Manager'

A class variable to set the default sub-title for the application.

To update the sub-title while the app is running, you can set the sub_title attribute (textual.app.App.sub_title). See also the Screen.SUB_TITLE attribute (textual.screen.Screen.SUB_TITLE).

TITLE: str | None = 'Pipeline Build Tools'

A class variable to set the default title for the application.

To update the title while the app is running, you can set the title attribute (textual.app.App.title). See also the Screen.TITLE attribute (textual.screen.Screen.TITLE).

__init__(source_dir=None, output_dir=None, theme='nord')

Initialize the Pipeline TUI application.

Parameters:
  • source_dir (Path | None) – Initial source directory (optional)

  • output_dir (Path | None) – Initial output directory (optional)

  • theme (str) – Color theme name (default: “nord”)

action_cancel_job()

Cancel the currently running job.

Return type:

None

action_help()

Show help screen.

Return type:

None

action_run_pipeline()

Start pipeline execution.

Validates configuration and starts the pipeline job.

Return type:

None

action_select_files()

Open file selector to choose specific files.

Uses FileSelectorScreen for selecting individual files. Runs as a worker to support push_screen_wait.

Return type:

None

action_select_output()

Open directory browser to select output directory.

Uses shared DirectoryBrowserScreen with output validator. Runs as a worker to support push_screen_wait.

Return type:

None

action_select_source()

Open directory browser to select source.

Uses shared DirectoryBrowserScreen with source validator. Runs as a worker to support push_screen_wait.

Return type:

None

action_tab_configure()

Switch to Configure tab.

Return type:

None

action_tab_history()

Switch to History tab.

Return type:

None

action_tab_monitor()

Switch to Monitor tab.

Return type:

None

compose()

Compose the application layout.

Layout:

  • Header with title

  • Status bar showing current config summary

  • Tabbed content (Configure, Monitor, History)

  • Footer with keybinding hints

Yields:

Application widget tree

Return type:

Iterable[Widget]

on_configure_panel_constraints_changed(event)

Handle constraints change from ConfigurePanel.

Updates syllable length constraints and file pattern in state.

Parameters:

event (ConstraintsChanged) – Constraints changed event with new values

Return type:

None

on_configure_panel_extractor_changed(event)

Handle extractor type change from ConfigurePanel.

Updates the application state with the new extractor type. NLTK is English-only, so language setting is ignored for NLTK.

Parameters:

event (ExtractorChanged) – Extractor changed event with new extractor type

Return type:

None
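
The rule that the language setting only matters for pyphen can be expressed as a small helper. This is a sketch only: effective_language is a hypothetical name, and the "pyphen" and "nltk" string values are assumed stand-ins for however the extractor type is actually represented.

```python
from typing import Optional


def effective_language(extractor_type: str, language: str) -> Optional[str]:
    """Return the language to pass to the extractor, or None when it is ignored."""
    if extractor_type == "nltk":
        # NLTK is English-only, so the stored language setting is not used.
        return None
    return language


print(effective_language("pyphen", "de_DE"))
print(effective_language("nltk", "de_DE"))
```

Keeping the stored language untouched when NLTK is selected means the user's pyphen choice survives switching extractors back and forth.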

on_configure_panel_files_selected(event)

Handle file selection request from ConfigurePanel.

Opens the file selector modal for choosing specific files.

Parameters:

event (FilesSelected) – Files selected event

Return type:

None

on_configure_panel_language_changed(event)

Handle language selection change from ConfigurePanel.

Updates the application state with the new language code. Only applies to pyphen extractor.

Parameters:

event (LanguageChanged) – Language changed event with new language code

Return type:

None

on_configure_panel_output_selected(event)

Handle output directory selection request from ConfigurePanel.

Triggers the directory browser modal via the existing action.

Parameters:

event (OutputSelected) – Output selected event

Return type:

None

on_configure_panel_pipeline_stages_changed(event)

Handle pipeline stage toggle changes from ConfigurePanel.

Updates which pipeline stages (normalize, annotate) will run.

Parameters:

event (PipelineStagesChanged) – Pipeline stages changed event with toggle states

Return type:

None

on_configure_panel_source_selected(event)

Handle source directory selection request from ConfigurePanel.

Triggers the directory browser modal via the existing action. The browse button in ConfigurePanel posts this message.

Parameters:

event (SourceSelected) – Source selected event

Return type:

None

Services

Services for Pipeline TUI.

This module contains backend services for directory validation, pipeline execution, and job management.

Services:

  • validators - Directory validation functions for browsers

  • pipeline - Pipeline execution and monitoring

class build_tools.pipeline_tui.services.PipelineExecutor

Bases: object

Executes pipeline stages as subprocesses with progress monitoring.

This class manages the execution of extraction, normalization, and annotation stages as separate Python subprocesses. It provides:

  • Real-time stdout/stderr capture

  • Progress updates via callbacks

  • Cancellation support

  • Clean error handling

_current_process

Currently running subprocess (for cancellation)

_cancelled

Flag indicating if cancellation was requested

Example

>>> import asyncio
>>> async def main():
...     executor = PipelineExecutor()
...     result = await executor.run_pipeline(config, on_progress=callback)
...     if result.success:
...         print(f"Output: {result.run_directory}")
>>> asyncio.run(main())

__init__()

Initialize the pipeline executor.

async cancel()

Cancel the currently running pipeline.

Terminates the current subprocess if one is running.

Return type:

None
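
Cancellation of a running subprocess can be sketched with asyncio as follows. The CancellableRunner class is a hypothetical stand-in that borrows the _current_process and _cancelled attribute names from the description above; how the real executor terminates its process is not shown in this document, so the use of terminate() is an assumption.

```python
import asyncio
import sys
from typing import Optional


class CancellableRunner:
    """Hypothetical stand-in for the executor's cancellation bookkeeping."""

    def __init__(self) -> None:
        self._current_process: Optional[asyncio.subprocess.Process] = None
        self._cancelled = False

    async def run(self, *cmd: str) -> int:
        self._current_process = await asyncio.create_subprocess_exec(*cmd)
        try:
            return await self._current_process.wait()
        finally:
            self._current_process = None

    async def cancel(self) -> None:
        self._cancelled = True
        process = self._current_process
        # Only signal a process that exists and has not exited yet.
        if process is not None and process.returncode is None:
            process.terminate()


async def demo() -> tuple[bool, int]:
    runner = CancellableRunner()
    task = asyncio.create_task(
        runner.run(sys.executable, "-c", "import time; time.sleep(30)")
    )
    # Wait until the child process has actually been started.
    while runner._current_process is None:
        await asyncio.sleep(0.01)
    await runner.cancel()
    return runner._cancelled, await task


cancelled, return_code = asyncio.run(demo())
print(cancelled, return_code)
```

The flag is set before the process is signalled so that a pipeline loop checking it between stages does not start the next stage.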

async run_pipeline(config, run_normalize=True, run_annotate=True, on_progress=None, on_log=None)

Execute the full pipeline with configured stages.

Runs extraction, then optionally normalization and annotation. Progress is reported via callbacks for UI updates.

Parameters:
  • config (ExtractionConfig) – Extraction configuration specifying source, output, etc.

  • run_normalize (bool) – Whether to run normalization after extraction

  • run_annotate (bool) – Whether to run annotation after normalization

  • on_progress (Optional[Callable[[str, int, str], None]]) – Callback for progress updates (stage, percent, message)

  • on_log (Optional[Callable[[str], None]]) – Callback for log messages

Return type:

PipelineResult

Returns:

PipelineResult with success status and stage results

Raises:

ValueError – If config is invalid
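
The callback contract (on_progress receives stage, percent, message; on_log receives one line of text) can be illustrated with a reduced runner. This is a sketch under assumptions, not the executor's implementation: run_stage and run_stages are hypothetical names, the percentage calculation is invented for the example, and the stage commands are trivial Python one-liners rather than the real build_tools modules.

```python
import asyncio
import sys
from typing import Callable, Optional

ProgressCallback = Callable[[str, int, str], None]
LogCallback = Callable[[str], None]


async def run_stage(cmd: list[str], on_log: Optional[LogCallback] = None) -> int:
    """Run one stage as a subprocess, forwarding each output line to on_log."""
    process = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into the same stream
    )
    assert process.stdout is not None
    async for raw_line in process.stdout:
        if on_log is not None:
            on_log(raw_line.decode(errors="replace").rstrip())
    return await process.wait()


async def run_stages(
    stages: list[tuple[str, list[str]]],
    on_progress: Optional[ProgressCallback] = None,
    on_log: Optional[LogCallback] = None,
) -> bool:
    """Run stages in order, stopping at the first non-zero exit code."""
    for index, (name, cmd) in enumerate(stages):
        if on_progress is not None:
            on_progress(name, int(100 * index / len(stages)), f"Starting {name}")
        if await run_stage(cmd, on_log) != 0:
            return False
    if on_progress is not None:
        on_progress("done", 100, "Pipeline finished")
    return True


logs: list[str] = []
progress: list[tuple[str, int, str]] = []
stages = [
    ("extraction", [sys.executable, "-c", "print('extracting')"]),
    ("normalization", [sys.executable, "-c", "print('normalizing')"]),
]
ok = asyncio.run(
    run_stages(stages, lambda *event: progress.append(event), logs.append)
)
print(ok, logs, progress)
```

In a TUI, the two callbacks would typically update a progress bar and append to a log widget instead of filling lists.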

class build_tools.pipeline_tui.services.PipelineResult(success, stages=<factory>, run_directory=None, cancelled=False, total_duration_seconds=0.0)

Bases: object

Result from executing the full pipeline.

success

Whether all stages completed successfully

stages

List of individual stage results

run_directory

Path to the output run directory

cancelled

Whether the pipeline was cancelled

total_duration_seconds

Total pipeline duration

cancelled: bool = False
run_directory: Path | None = None
stages: list[StageResult]
success: bool
total_duration_seconds: float = 0.0
class build_tools.pipeline_tui.services.StageResult(stage, success, output_path=None, return_code=0, stdout='', stderr='', duration_seconds=0.0, error_message='')

Bases: object

Result from executing a single pipeline stage.

stage

Name of the stage (extraction, normalization, annotation)

success

Whether the stage completed successfully

output_path

Path to the output (run directory or file)

return_code

Process return code

stdout

Captured standard output

stderr

Captured standard error

duration_seconds

How long the stage took

error_message

Error message if stage failed

duration_seconds: float = 0.0
error_message: str = ''
output_path: Path | None = None
return_code: int = 0
stage: str
stderr: str = ''
stdout: str = ''
success: bool
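
How per-stage results might roll up into a pipeline result is sketched below. The two dataclasses are local copies of a subset of the documented fields, redefined so the example runs on its own. The summarize function and its rules (success requires every stage to succeed, duration is the sum, the run directory is the first stage's output path) are assumptions for illustration, not the package's documented behavior.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class StageResult:
    """Local copy of a subset of the documented StageResult fields."""
    stage: str
    success: bool
    output_path: Optional[Path] = None
    return_code: int = 0
    duration_seconds: float = 0.0
    error_message: str = ""


@dataclass
class PipelineResult:
    """Local copy of the documented PipelineResult fields."""
    success: bool
    stages: list[StageResult] = field(default_factory=list)
    run_directory: Optional[Path] = None
    cancelled: bool = False
    total_duration_seconds: float = 0.0


def summarize(stages: list[StageResult], cancelled: bool = False) -> PipelineResult:
    """Hypothetical roll-up of stage results into one pipeline result."""
    return PipelineResult(
        success=bool(stages) and not cancelled and all(s.success for s in stages),
        stages=stages,
        run_directory=stages[0].output_path if stages else None,
        cancelled=cancelled,
        total_duration_seconds=sum(s.duration_seconds for s in stages),
    )


result = summarize([
    StageResult("extraction", True, Path("run_1"), duration_seconds=2.5),
    StageResult("normalization", False, return_code=1,
                duration_seconds=0.5, error_message="exit 1"),
])
failed = [s.stage for s in result.stages if not s.success]
print(result.success, result.total_duration_seconds, failed)
```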
build_tools.pipeline_tui.services.validate_output_directory(path)

Validate a directory as an output location for pipeline results.

Any existing directory is valid. Non-existent paths are invalid (the pipeline will create timestamped subdirectories, but the parent must exist).

Parameters:

path (Path) – Directory path to validate

Returns:

  • is_valid: True if directory exists and is writable

  • type_label: “output” if valid

  • message: Status description

Return type:

Tuple of (is_valid, type_label, message)
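
A validator with this return shape could look like the sketch below. It is not the package's implementation: the function name carries a _sketch suffix to mark it as hypothetical, and the empty type_label and the message wording for invalid paths are assumptions.

```python
import os
import tempfile
from pathlib import Path


def validate_output_directory_sketch(path: Path) -> tuple[bool, str, str]:
    """Return (is_valid, type_label, message) for a candidate output directory."""
    if not path.is_dir():
        # Non-existent paths are rejected; the directory itself must already exist.
        return (False, "", f"Not an existing directory: {path}")
    if not os.access(path, os.W_OK):
        return (False, "", f"Directory is not writable: {path}")
    return (True, "output", "Valid output directory")


with tempfile.TemporaryDirectory() as tmp:
    existing = validate_output_directory_sketch(Path(tmp))
    missing = validate_output_directory_sketch(Path(tmp) / "does_not_exist")

print(existing)
print(missing[0])
```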

build_tools.pipeline_tui.services.validate_source_directory(path)

Validate a directory as a source for text extraction.

A valid source directory contains at least one .txt file, either directly or in subdirectories.

Parameters:

path (Path) – Directory path to validate

Returns:

  • is_valid: True if directory contains extractable files

  • type_label: “source” if valid

  • message: File count if valid, error description if invalid

Return type:

Tuple of (is_valid, type_label, message)
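
The source check (at least one .txt file, directly or in subdirectories) can be sketched with a recursive glob. As with the output validator, this is an illustration with a hypothetical function name; the exact message text and the label returned for invalid directories are assumptions.

```python
import tempfile
from pathlib import Path


def validate_source_directory_sketch(path: Path) -> tuple[bool, str, str]:
    """Return (is_valid, type_label, message) for a candidate source directory."""
    if not path.is_dir():
        return (False, "", f"Not a directory: {path}")
    # rglob descends into subdirectories, so nested .txt files are counted too.
    count = sum(1 for candidate in path.rglob("*.txt") if candidate.is_file())
    if count == 0:
        return (False, "", "No .txt files found")
    return (True, "source", f"{count} text file(s)")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    empty = validate_source_directory_sketch(root)
    (root / "nested").mkdir()
    (root / "nested" / "sample.txt").write_text("hello world")
    found = validate_source_directory_sketch(root)

print(empty)
print(found)
```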