Adaptive Sentence Difficulty Cloze — v0.20 Design Spec
Date: 2026-03-15 Status: Approved Release: v0.20
Problem
Cloze cards currently select the example sentence closest to 9 words in length. This ignores whether the surrounding context words are familiar to the learner. A sentence full of rare words is harder than one with common words, regardless of length.
Solution
Replace the length-based _score() function in epub_examples.py with a frequency-based difficulty score. The easiest sentence (most common context words) becomes the cloze. All vetted sentences remain on the card, ordered easy→hard.
Scoring Pipeline
Token Frequency Lookup (5-tier)
Given a nikkud sentence token, resolve its frequency rank:
- Known mapping: look up the token in the nikkud→ktiv_male map built from words.json headwords, conjugations, and inflections (94k mappings). If found, look up the ktiv_male in the frequency data.
- Nikkud prefix stripping: use _try_strip_prefix() to strip validated Hebrew prefixes (בהוכלמש), then resolve the remainder via the known mapping.
- Academy rules converter: apply nikkud_to_ktiv_male.convert() (91.6% accuracy) to produce ktiv_male, then look it up in the frequency data.
- strip_nikkud fallback: use helpers.strip_nikkud() as a lossy fallback.
- Ktiv_male prefix stripping: strip 1-2 character Hebrew prefixes from the converted/stripped form and look up the stem.
Tokens not found in any tier are assigned a default high rank (50,000).
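The tiers above can be sketched as a single fall-through function. This is a minimal illustration, not the planned implementation: it assumes freq_data maps a ktiv_male string to an integer rank, it takes the three helpers as injected callables with assumed signatures (strip_prefix returns the remainder or None), it omits the nikkud_index argument, and it assumes tier 5 uses the same prefix letters as tier 2.

```python
from typing import Callable, Optional

DEFAULT_RANK = 50_000          # rank assigned when no tier resolves the token
HEBREW_PREFIXES = "בהוכלמש"    # assumption: tier 5 reuses the tier 2 prefix letters


def resolve_token_frequency(
    token: str,
    nikkud_map: dict[str, str],
    freq_data: dict[str, int],
    strip_prefix: Callable[[str], Optional[str]],  # stand-in for _try_strip_prefix()
    convert: Callable[[str], str],                 # stand-in for nikkud_to_ktiv_male.convert()
    strip_nikkud: Callable[[str], str],            # stand-in for helpers.strip_nikkud()
) -> int:
    """Return the frequency rank of a nikkud token, trying five tiers in order."""
    # Tier 1: known nikkud -> ktiv_male mapping.
    ktiv = nikkud_map.get(token)
    if ktiv is not None and ktiv in freq_data:
        return freq_data[ktiv]

    # Tier 2: strip a nikkud prefix, then retry the known mapping.
    remainder = strip_prefix(token)
    if remainder is not None:
        ktiv = nikkud_map.get(remainder)
        if ktiv is not None and ktiv in freq_data:
            return freq_data[ktiv]

    # Tier 3 (rules converter), then tier 4 (lossy strip_nikkud fallback).
    candidates = [convert(token), strip_nikkud(token)]
    for form in candidates:
        if form in freq_data:
            return freq_data[form]

    # Tier 5: strip 1-2 prefix letters from the converted/stripped forms.
    for form in candidates:
        for n in (1, 2):
            if len(form) > n and all(ch in HEBREW_PREFIXES for ch in form[:n]):
                stem = form[n:]
                if stem in freq_data:
                    return freq_data[stem]

    return DEFAULT_RANK
```

Passing the helpers in as callables keeps the sketch testable without the real modules; the actual function would presumably import them directly.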
Coverage: ~93% of example sentence tokens resolve to a frequency rank (measured empirically on 7,588 sentences).
Frequency data source: Use frequency_lookup.py which auto-selects frequency_clean.json when available, falling back to frequency_cache.json.
Sentence Difficulty Score
For a given word's candidate sentence:
- Tokenize: split on whitespace, strip punctuation (.,!?;:"'"״׳–—()[]{}), split on maqaf (־).
- Exclude the target word's token using cloze_word_start/cloze_word_end offsets from the matched sentence.
- For each remaining token (length >= 2), resolve its frequency rank via the 5-tier pipeline.
- Score = median frequency rank of context tokens.
Lower score = easier (context words are more common). Median resists outliers (one rare proper noun shouldn't dominate).
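A rough sketch of these steps follows. It assumes the offsets are character offsets with an exclusive end, takes the token resolver as a hypothetical `resolve` callable, truncates a fractional median to an integer, and returns the default rank when no context tokens remain (the spec does not say what happens in that case).

```python
import re
import statistics
from typing import Callable

DEFAULT_RANK = 50_000
# Punctuation to strip; Hebrew gershayim/geresh and the dashes are written as escapes.
PUNCT = ".,!?;:\"'\u201c\u201d\u05f4\u05f3\u2013\u2014()[]{}"
# A token is a run of characters containing neither whitespace nor maqaf (U+05BE).
TOKEN_RE = re.compile(r"[^\s\u05be]+")


def context_tokens(text: str, target_start: int, target_end: int) -> list[str]:
    """Tokenize text and drop every token that overlaps the cloze target span."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        if m.start() < target_end and m.end() > target_start:
            continue  # overlaps the target word
        tok = m.group().strip(PUNCT)
        if len(tok) >= 2:
            tokens.append(tok)
    return tokens


def score_sentence(
    text: str,
    target_start: int,
    target_end: int,
    resolve: Callable[[str], int],
) -> int:
    """Median frequency rank of the context tokens; lower means easier."""
    ranks = [resolve(tok) for tok in context_tokens(text, target_start, target_end)]
    if not ranks:
        return DEFAULT_RANK  # assumption: no context tokens -> treat as hard
    return int(statistics.median(ranks))
```

With context ranks of 1, 5, 500 and 900, the median is 252.5, so a single rank of 50,000 added to that list would move the score to 500 rather than pulling it toward the outlier as a mean would.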
Integration Point
The scoring integrates into epub_examples.py's existing _score() closure inside update_words_json() (line ~677). Currently:
    def _score(s: dict) -> tuple[int,]:
        wc = s["word_count"]
        length_score = abs(wc - 9) if not (6 <= wc <= 12) else 0
        return (length_score,)
New scoring replaces length with frequency-based difficulty. The _score function gains access to the frequency pipeline via closure over the nikkud_map, nikkud_index, and freq_data built once at the start of update_words_json().
Minimum sentence length: Reduced from 4 words to 3 words (MIN_WORDS = 3 in epub_examples.py). Hebrew is more concise than English — 3-word sentences are valid and common. This expands the candidate pool for cloze selection.
Behavioral change: Because pool.sort(key=_score) determines which 3 sentences are selected as best = pool[:3], changing the scoring function changes which sentences are selected, not just their order. This is intentional — we want the easiest sentences as cloze candidates, not the closest-to-9-words ones. Existing cloze GUIDs will be preserved when the same sentence text is re-selected; entries where a different sentence wins will get new GUIDs.
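The selection change can be illustrated with a small stand-alone sketch. The helper name `select_easiest` is hypothetical, the scorer is injected as a callable, and each candidate dict is assumed to carry `text`, `cloze_word_start` and `cloze_word_end`; the real code sorts `pool` in place inside update_words_json().

```python
from typing import Callable


def select_easiest(
    pool: list[dict],
    score_sentence: Callable[[str, int, int], int],
    keep: int = 3,
) -> list[dict]:
    """Return the `keep` easiest candidates, easiest first."""

    def _score(s: dict) -> tuple[int]:
        # Same tuple-returning shape as the existing _score() closure.
        return (score_sentence(s["text"], s["cloze_word_start"], s["cloze_word_end"]),)

    # sorted() is stable, so candidates with equal scores keep their pool order.
    return sorted(pool, key=_score)[:keep]
```

Because the key changes from length distance to difficulty, a sentence that previously ranked fourth or lower can now enter the top three, which is the source of the new GUIDs described above.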
Data Model Changes
words.json
The examples.cloze dict (single sentence) gains an optional difficulty_score field:
    {
      "examples": {
        "vetted": [
          {"text": "...", "source": "...", "match_method": "..."},
          {"text": "...", "source": "...", "match_method": "..."}
        ],
        "cloze": {
          "text": "...",
          "cloze_word_start": 5,
          "cloze_word_end": 10,
          "cloze_hint": null,
          "cloze_guid": "abc123",
          "difficulty_score": 234
        }
      }
    }
The vetted list is also sorted by difficulty (easiest first), so the card back shows sentences in pedagogically useful order.
SCHEMA.yaml
Add difficulty_score as optional integer field under examples.cloze.
Implementation Scope
New file: sentence_difficulty.py
Standalone module for sentence scoring. No pipeline step — called by epub_examples.py.
- score_sentence(sentence_text: str, target_start: int, target_end: int, nikkud_map: dict, nikkud_index: dict, freq_data: dict) -> int. Returns the median context frequency rank. Uses target_start/target_end character offsets to exclude the cloze target token.
- build_nikkud_map(words: dict) -> dict[str, str]. Builds the nikkud→ktiv_male lookup from words.json (headwords + conjugation forms + noun inflections). Returns {nikkud_form: ktiv_male_form}. Implementation note: should share iteration logic with epub_examples._build_nikkud_index() or derive from its output to avoid duplicating the traversal of words.json forms.
- _resolve_token_frequency(token: str, nikkud_map: dict, nikkud_index: dict, freq_data: dict) -> int. The 5-tier lookup. Uses _try_strip_prefix from epub_examples (made importable by removing the underscore or adding a public wrapper).
Modified files
- epub_examples.py:
  - Import sentence_difficulty.score_sentence and sentence_difficulty.build_nikkud_map
  - In update_words_json(): build nikkud_map and load freq_data once at start (before the per-word loop)
  - Replace the _score() closure with frequency-based scoring that calls score_sentence()
  - Sort the vetted list by difficulty score (easiest first)
  - Store difficulty_score in the cloze dict
  - Make _try_strip_prefix importable (rename to try_strip_prefix or add a public alias)
- frequency_lookup.py: add a get_freq_data() -> dict public accessor to expose the loaded frequency dict (avoids accessing the private _freq directly)
- SCHEMA.yaml: add the difficulty_score field
- run.py: no changes; scoring happens inside the epub_examples step
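One possible shape for the frequency_lookup.py accessor is sketched below. Only get_freq_data() and the private _freq name come from this spec; the loader function `load_freq_data`, its `data_dir` parameter, and the error behavior are assumptions made for illustration, and the real module may load its data differently.

```python
import json
from pathlib import Path
from typing import Optional

# Preferred file first, fallback second.
_FREQ_FILES = ("frequency_clean.json", "frequency_cache.json")
_freq: Optional[dict] = None


def load_freq_data(data_dir: Path) -> dict:
    """Load the first frequency file found in data_dir (hypothetical loader)."""
    global _freq
    for name in _FREQ_FILES:
        path = data_dir / name
        if path.exists():
            with path.open(encoding="utf-8") as fh:
                _freq = json.load(fh)
            return _freq
    raise FileNotFoundError(f"no frequency file found in {data_dir}")


def get_freq_data() -> dict:
    """Public accessor so callers never touch the private _freq directly."""
    if _freq is None:
        raise RuntimeError("frequency data not loaded; call load_freq_data() first")
    return _freq
```

Callers such as update_words_json() would then call get_freq_data() once before the per-word loop and pass the result into the scoring functions.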
Not modified
- apkg_builder.py: reads cloze as-is; vetted order is already respected
- nikkud_to_ktiv_male.py: used as-is
- Card templates: no changes needed
Dependencies
- nikkud_to_ktiv_male.convert(): Academy rules converter (already written)
- epub_examples._try_strip_prefix() / _build_nikkud_index(): nikkud prefix stripping and index
- frequency_lookup.py: loads frequency data (auto-selects clean vs cache)
- helpers.strip_nikkud(): fallback converter
Validation
- Unit tests for score_sentence() with known easy/hard sentences
- Unit tests for _resolve_token_frequency() covering all 5 tiers
- Integration test: verify cloze selection picks the easiest sentence and the vetted list is sorted
- Spot check: manually review 10 words with 3+ sentences to confirm ordering
- Regression: existing tests pass, GUID coverage unchanged, deck validates
Constraints
- examples.cloze remains a single dict (not converted to list)
- No new Anki card types or fields
- No runtime JS in Anki cards
- No network calls during scoring
- difficulty_score is informational metadata; card rendering doesn't depend on it
- Existing cloze GUIDs preserved when the same sentence is re-selected
Scope Exclusions (Future Work)
- Pronominal suffix stripping — would improve the ~7% unscored token rate; deferred (PROJECT_NOTES.md)
- Kamatz katan disambiguation — requires morphological analysis; accepted limitation
- Per-learner adaptive difficulty — requires Anki plugin; out of scope for static deck
- Multiple cloze sentences per card — would require schema migration to list; deferred