hebrew_flash_cards/data
Sochen af186e2030 Sprint 17: homograph example dedup + plural audio + prep extraction
- Homograph collision fix: _deduplicate_confusable_examples() clears
  shared examples from less-common confusable group members (36 entries
  fixed). Keeps examples only on highest-frequency meaning.
- Plural deck audio: wired up PluralAudio field in apkg_builder.py,
  downloaded 613 plural audio files from pealim.com for all deck entries.
- Prep extraction upstream: moved Hebrew preposition parsing from build
  time into list/detail scrapers (SCHEMA.yaml prep field added).
- Validation: new no_shared_confusable_examples check in validate_data.py
- Tests: 9 new unit tests for confusable deduplication (98 total)
- Release: v0.19

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 21:51:35 +00:00
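
The homograph fix and the new validation check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code in the repository: the entry keys (`confusable_group`, `frequency`, `examples`) and both function names are assumptions made for this example, and the actual layout of words.json and the body of _deduplicate_confusable_examples() may differ. It assumes each entry is a dict, that members of one confusable group share a group id, and that a higher `frequency` value means a more common meaning.

```python
from collections import defaultdict


def deduplicate_confusable_examples(entries):
    """Remove examples shared inside a confusable group from the
    less frequent members, so each example stays only on the most
    frequent member that carries it.

    Mutates the entries in place and returns how many were changed.
    All field names here are hypothetical.
    """
    groups = defaultdict(list)
    for entry in entries:
        group = entry.get("confusable_group")
        if group is not None:
            groups[group].append(entry)

    changed = 0
    for members in groups.values():
        # Highest frequency first; sorted() is stable, so ties keep input order.
        ranked = sorted(members, key=lambda e: e.get("frequency", 0), reverse=True)
        claimed = set(ranked[0].get("examples", []))
        for entry in ranked[1:]:
            examples = entry.get("examples", [])
            kept = [ex for ex in examples if ex not in claimed]
            if len(kept) != len(examples):
                entry["examples"] = kept
                changed += 1
            claimed.update(kept)
    return changed


def shared_confusable_examples(entries):
    """Return sorted (group, example) pairs that still appear on more
    than one member of the same group; an empty list means the check passes."""
    counts = defaultdict(lambda: defaultdict(int))
    for entry in entries:
        group = entry.get("confusable_group")
        if group is None:
            continue
        for example in set(entry.get("examples", [])):
            counts[group][example] += 1
    return sorted(
        (group, example)
        for group, per_example in counts.items()
        for example, n in per_example.items()
        if n > 1
    )
```

Running the check before and after the deduplication pass on the same list gives a quick way to confirm that the pass removed every shared example. Entries without a group id are ignored by both functions, so an example sentence may still appear on unrelated words.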
fonts feat: Sprint 3 — Heebo font files, image fetch, verb validator scripts 2026-03-03 08:37:08 +00:00
conjugations.json Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
emoji_lookup.json feat: curated emoji denylist, vocab audio URLs in CSV 2026-03-06 12:29:15 +00:00
epub_sentence_index.json Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
examples_cache.json Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
frequency_cache.json feat: add apkg builder, frequency, Ben Yehuda examples, conjugation deck 2026-03-03 01:58:31 +00:00
frequency_clean.json feat: YAP-cleaned frequency corpus + two-tier assignment pipeline 2026-03-10 06:22:55 +00:00
frequency_discarded.json feat: YAP-cleaned frequency corpus + two-tier assignment pipeline 2026-03-10 06:22:55 +00:00
ktiv_male_forms.json v0.14: rescrape vocab, formatting fixes for all decks 2026-03-07 09:26:41 +00:00
legacy_guid_map.json Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
noun_plurals.json v0.14: rescrape vocab, formatting fixes for all decks 2026-03-07 09:26:41 +00:00
noun_slug_map.json Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
refined_meanings.json Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
vetted_sentences.json Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
vocab_sentence_matches.json Sprint 9: cloze cards, plurals deck, project reorg, lint tooling 2026-03-07 08:09:39 +00:00
words.json Sprint 17: homograph example dedup + plural audio + prep extraction 2026-03-14 21:51:35 +00:00