build_tools.name_selector.selector
==================================

.. py:module:: build_tools.name_selector.selector

.. autoapi-nested-parse::

   Main selector orchestration logic.

   This module provides the high-level selection function that coordinates
   loading candidates, evaluating them against a policy, and producing
   ranked output.

   The selector is the central orchestrator of the Selection Policy Layer.
   It ties together:

   - Candidate loading (from name_combiner output)
   - Policy evaluation (from policy.py)
   - Result ranking and filtering

   Usage
   -----
   >>> import json
   >>> from build_tools.name_selector import select_names, load_name_classes
   >>>
   >>> # Load policies and candidates
   >>> policies = load_name_classes("data/name_classes.yml")
   >>> with open("candidates/pyphen_candidates_2syl.json") as f:
   ...     candidates_data = json.load(f)
   >>>
   >>> # Select names
   >>> selected = select_names(
   ...     candidates=candidates_data["candidates"],
   ...     policy=policies["first_name"],
   ...     count=100,
   ...     mode="hard",
   ... )
   >>>
   >>> for name in selected[:5]:
   ...     print(f"{name['name']}: score={name['score']}, rank={name['rank']}")
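
The evaluate-and-filter step described above can be illustrated with a small, self-contained sketch. It is not the module's implementation: ``toy_evaluate``, the ``discouraged`` set, the boolean shape of ``features``, and the one-point penalty per discouraged feature are all assumptions made for illustration. The real rules live in ``policy.py``.

```python
from typing import Optional


def toy_evaluate(candidate: dict, discouraged: set, mode: str = "hard") -> Optional[int]:
    """Return a score, or None if the candidate is rejected (hypothetical helper)."""
    if mode not in ("hard", "soft"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumed shape: "features" maps feature names to booleans.
    hits = [
        feature
        for feature, present in candidate["features"].items()
        if present and feature in discouraged
    ]
    if mode == "hard":
        # Hard mode: any discouraged feature rejects the candidate outright.
        return None if hits else 0
    # Soft mode: keep the candidate, subtracting one point per discouraged feature.
    return -len(hits)


candidates = [
    {"name": "kali", "syllables": ["ka", "li"], "features": {"ends_with_stop": False}},
    {"name": "bract", "syllables": ["bract"], "features": {"ends_with_stop": True}},
]
discouraged = {"ends_with_stop"}

# Names that survive hard-mode filtering.
hard = [c["name"] for c in candidates if toy_evaluate(c, discouraged, "hard") is not None]
# Scores for every candidate under soft mode.
soft = {c["name"]: toy_evaluate(c, discouraged, "soft") for c in candidates}
```

Under these assumptions, hard mode keeps only ``kali``, while soft mode keeps both names and gives ``bract`` a lower score.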


Functions
---------

.. autoapisummary::

   build_tools.name_selector.selector.select_names
   build_tools.name_selector.selector.compute_selection_statistics


Module Contents
---------------

.. py:function:: select_names(candidates, policy, count = 100, mode = 'hard', order = 'alphabetical', seed = None)

   Select and rank name candidates against a policy.

   Evaluates all candidates, filters out rejected ones, ranks the rest by
   score, and returns at most ``count`` of them.

   Parameters
   ----------
   candidates : Sequence[dict]
       List of candidate dictionaries from name_combiner output.
       Each must have "name", "syllables", and "features" keys.

   policy : NameClassPolicy
       The policy to evaluate against.

   count : int, optional
       Maximum number of names to return. Default: 100.

   mode : {"hard", "soft"}, optional
       Evaluation mode. "hard" rejects candidates with discouraged features;
       "soft" penalizes them instead. Default: "hard".

   order : {"alphabetical", "random"}, optional
       Ordering for names with equal scores. "alphabetical" sorts by name
       for deterministic output. "random" shuffles within score groups
       using the provided seed. Default: "alphabetical".

   seed : int, optional
       RNG seed for random ordering. Only used when order="random".
       Supply a seed to make the random ordering reproducible. Default: None.

   Returns
   -------
   list[dict]
       List of selected candidates, sorted by score (descending).
       Each candidate is augmented with "score", "rank", and "evaluation".

   Examples
   --------
   >>> selected = select_names(candidates, policy, count=50)
   >>> selected[0]["rank"]
   1
   >>> selected[0]["score"]  # Highest score
   4
   >>> len(selected)
   50

   Notes
   -----
   The returned candidates are augmented with:

   - score: int - The policy score
   - rank: int - 1-based rank (1 = best)
   - evaluation: dict - Detailed evaluation breakdown
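
The ordering rules for ``select_names`` (score descending, ties broken alphabetically or by a seeded shuffle, then truncation to ``count``) can be sketched as follows. This is a minimal illustration, not the function's actual code: ``toy_rank`` is a hypothetical helper that takes already-scored dictionaries, and it assumes tied names receive consecutive ranks rather than a shared rank.

```python
import random
from itertools import groupby


def toy_rank(scored, count=100, order="alphabetical", seed=None):
    """Order scored candidates and attach a 1-based rank (hypothetical helper)."""
    if order not in ("alphabetical", "random"):
        raise ValueError(f"unknown order: {order!r}")
    # Baseline: highest score first, names ascending within equal scores.
    ordered = sorted(scored, key=lambda c: (-c["score"], c["name"]))
    if order == "random":
        rng = random.Random(seed)  # a fixed seed makes the shuffle reproducible
        shuffled = []
        # groupby works here because equal scores are adjacent after sorting.
        for _, group in groupby(ordered, key=lambda c: c["score"]):
            members = list(group)
            rng.shuffle(members)  # shuffle only within one score group
            shuffled.extend(members)
        ordered = shuffled
    # Truncate to count and attach ranks without mutating the inputs.
    return [{**c, "rank": i} for i, c in enumerate(ordered[:count], start=1)]


scored = [
    {"name": "b", "score": 2},
    {"name": "a", "score": 2},
    {"name": "c", "score": 5},
]
alphabetical = toy_rank(scored)
random_once = toy_rank(scored, order="random", seed=7)
random_again = toy_rank(scored, order="random", seed=7)
```

In this sketch the shuffle never moves a candidate across score groups, so ``c`` stays first in either order, and repeating the call with the same seed yields the same list.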


.. py:function:: compute_selection_statistics(candidates, policy, mode = 'hard')

   Compute statistics about a selection operation.

   Evaluates all candidates and returns aggregate statistics without
   building the full result list.

   Parameters
   ----------
   candidates : Sequence[dict]
       List of candidate dictionaries.

   policy : NameClassPolicy
       The policy to evaluate against.

   mode : {"hard", "soft"}, optional
       Evaluation mode. Default: "hard".

   Returns
   -------
   dict
       Statistics dictionary containing:

       - total_evaluated: int
       - admitted: int
       - rejected: int
       - rejection_reasons: dict[str, int]
       - score_distribution: dict[int, int] (score -> count)

   Examples
   --------
   >>> stats = compute_selection_statistics(candidates, policy)
   >>> stats["admitted"]
   2341
   >>> stats["rejection_reasons"]["ends_with_stop"]
   1234
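
One way to accumulate statistics of this shape in a single pass, without keeping the per-candidate results, is sketched below. The input format is an assumption: ``toy_statistics`` is a hypothetical helper that takes ``(admitted, score, reasons)`` tuples, whereas the real function evaluates candidates against a ``NameClassPolicy``. Only the output keys are taken from the documentation above.

```python
from collections import Counter


def toy_statistics(evaluations):
    """Aggregate (admitted, score, reasons) tuples into a statistics dict."""
    admitted = 0
    rejected = 0
    rejection_reasons = Counter()
    score_distribution = Counter()
    for is_admitted, score, reasons in evaluations:
        if is_admitted:
            admitted += 1
            score_distribution[score] += 1  # score -> number of admitted names
        else:
            rejected += 1
            rejection_reasons.update(reasons)  # one count per listed reason
    return {
        "total_evaluated": admitted + rejected,
        "admitted": admitted,
        "rejected": rejected,
        "rejection_reasons": dict(rejection_reasons),
        "score_distribution": dict(score_distribution),
    }


evaluations = [
    (True, 2, []),
    (True, 2, []),
    (False, None, ["ends_with_stop"]),
]
stats = toy_statistics(evaluations)
```

Because only counters are updated, memory use in this sketch depends on the number of distinct scores and rejection reasons, not on the number of candidates.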