  • v0.20 138acb06d8

    sochen released this 2026-03-15 07:09:19 -07:00 | 4 commits to master since this release

    What's New

    • Frequency-based cloze sentence selection: Sentences are now scored by context-word difficulty (median frequency rank). The easiest sentence becomes the cloze card.
    • 5-tier nikkud→ktiv_male pipeline: Known mappings → nikkud prefix stripping → Academy rules converter → strip_nikkud fallback → ktiv_male prefix stripping (93.4% coverage)
    • 3,163 scored cloze sentences (score range 1–50,000, median 179)
    • MIN_WORDS reduced to 3: Hebrew is more concise than English, so 3-word sentences are now valid candidates
    • New difficulty_score field on cloze entries in words.json
    • New nikkud_to_ktiv_male.py: Academy-rules-based converter (91.6% accuracy vs 77.2% for strip_nikkud)
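
    The frequency-based selection described above could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: score_sentence, pick_cloze_sentence, the freq_rank mapping and the whitespace tokenisation are hypothetical, and the rank of 50,000 for unknown words is a guess.

```python
from statistics import median

MIN_WORDS = 3          # value taken from the release notes
UNKNOWN_RANK = 50_000  # assumption: rank used for words missing from the table


def score_sentence(sentence, target, freq_rank):
    """Median frequency rank of the context words (every word except the target)."""
    words = sentence.split()
    if len(words) < MIN_WORDS:
        return None  # too short to be a candidate
    context = [w for w in words if w != target]
    if not context:
        return None
    return median(freq_rank.get(w, UNKNOWN_RANK) for w in context)


def pick_cloze_sentence(sentences, target, freq_rank):
    """Return the candidate with the lowest (easiest) score, or None."""
    scored = [(score_sentence(s, target, freq_rank), s) for s in sentences]
    scored = [(score, s) for score, s in scored if score is not None]
    return min(scored, default=(None, None))[1]
```

    A lower median rank means the surrounding words are more common, so the learner is more likely to know everything except the blanked word.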

    Files

    12 deck variants (vocab×4, conj×2, conf×2, plurals×2, complete×2)

  • v0.19 af186e2030

    sochen released this 2026-03-14 14:51:35 -07:00 | 14 commits to master since this release

    • Homograph collision fix: _deduplicate_confusable_examples() clears
      shared examples from less-common confusable group members (36 entries
      fixed). Keeps examples only on highest-frequency meaning.
    • Plural deck audio: wired up PluralAudio field in apkg_builder.py,
      downloaded 613 plural audio files from pealim.com for all deck entries.
    • Prep extraction upstream: moved Hebrew preposition parsing from build
      time into list/detail scrapers (SCHEMA.yaml prep field added).
    • Validation: new no_shared_confusable_examples check in validate_data.py
    • Tests: 9 new unit tests for confusable deduplication (98 total)
    • Release: v0.19
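
    A minimal sketch of the deduplication idea from the first bullet, assuming each confusable group is a list of dicts. The keys 'freq_rank' and 'examples' and the function body are hypothetical; only the behaviour (the most frequent meaning keeps a shared example, the others lose it) comes from the notes.

```python
def deduplicate_confusable_examples(group):
    """Clear shared examples from less common members; return how many were cleared.

    `group` is a list of dicts with hypothetical keys 'freq_rank'
    (lower means more common) and 'examples' (list of sentences).
    """
    cleared = 0
    seen = set()
    # Visit the most common meaning first so that it keeps its examples.
    for entry in sorted(group, key=lambda e: e["freq_rank"]):
        kept = [ex for ex in entry["examples"] if ex not in seen]
        cleared += len(entry["examples"]) - len(kept)
        seen.update(kept)
        entry["examples"] = kept
    return cleared


def no_shared_confusable_examples(group):
    """Validation counterpart: True when no example appears on two members."""
    all_examples = [ex for entry in group for ex in entry["examples"]]
    return len(all_examples) == len(set(all_examples))
```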

    Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • v0.18 0d92451271

    sochen released this 2026-03-10 18:34:14 -07:00 | 15 commits to master since this release

    Changes

    • All secondary fields behind a collapsible "מידע נוסף" ("More info") toggle button
    • Related words shown as frequency-sorted table with meanings
    • Conjugation meaning/binyan also behind toggle
    • Dark mode support for new elements

    Deck Stats

    • 9,100 vocab notes (3,598 with examples, 3,289 cloze)
    • 2,567 conjugation notes
    • 333 confusable notes
    • 235 plural notes
    • 12,235 total notes
  • v0.17 c85063ee2f

    sochen released this 2026-03-10 03:44:14 -07:00 | 16 commits to master since this release

    • Regenerated all example sentences from scratch (deleted legacy + stale entries)
    • Added .txt file support to epub_examples.py for Ben Yehuda corpus
    • 7 Ben Yehuda nikkud'd children's texts + 3 new Time Tunnel EPUBs
    • Maqaf-stripped construct form indexing (+68% inflected matches)
    • Total: 3,598 words with examples, 3,289 with cloze (was ~2,900)
    • Cloze prefix preservation (_cloze_prefix_len)
    • Hebrew spoiler stripping from English meanings
    • Gender field (זָכָר/נְקֵבָה) on vocab cards
    • sec-table CSS layout for aligned key:value pairs
    • Mishkal uses mishkal_hebrew on plural cards
    • Improved mishkal extraction from pealim detail pages
    • 21 new pytest tests (cloze, PoS, Hebrew stripping, gender, mishkal)
    • 2 new validate_data.py tests + mishkal stats
    • Colliding forms tracking (local-only)
    • Release tag v0.17
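
    The notes only name _cloze_prefix_len, so the following is a guess at what cloze prefix preservation might involve: when the target word appears with one-letter prefixes attached (as in לבית), only the word itself is blanked and the prefix stays visible. The prefix set, both function bodies, and the restriction to unvowelled, punctuation-free tokens are assumptions.

```python
# Assumption: the usual one-letter Hebrew prefixes, unvowelled text only.
HEBREW_PREFIX_LETTERS = set("ובכלמשה")


def cloze_prefix_len(token, form):
    """Number of prefix letters in front of `form` inside `token`, or -1 if no match."""
    if not token.endswith(form):
        return -1
    prefix = token[: len(token) - len(form)]
    if all(ch in HEBREW_PREFIX_LETTERS for ch in prefix):
        return len(prefix)
    return -1


def make_cloze(sentence, form):
    """Wrap the first occurrence of `form` in Anki cloze syntax, keeping any prefix visible."""
    out = []
    done = False
    for token in sentence.split():
        n = -1 if done else cloze_prefix_len(token, form)
        if n >= 0:
            out.append(token[:n] + "{{c1::" + token[n:] + "}}")
            done = True
        else:
            out.append(token)
    return " ".join(out)
```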

    Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • v0.16 efd0745ada

    sochen released this 2026-03-10 00:44:47 -07:00 | 17 commits to master since this release

    Sprint 14

    Template & CSS fixes (15 items)

    • Fix conjugation front showing 3ms form instead of infinitive
    • Rename conjugation model to "Hebrew Conjugation"
    • Strip Hebrew text from English meanings
    • Shoresh dots (א.כ.ל), parenthesis spacing
    • Remove examples from vocab front/back (cloze only)
    • Center audio buttons, unify fonts
    • Size overhaul: bigger everything
    • Sort confusables by avg frequency
    • Plurals: Hebrew gender labels, emoji cleanup
    • Cloze quotation mark cleanup
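
    As a small illustration of the shoresh-dots item, a root can be rendered in dotted form by keeping only its Hebrew letters and joining them with dots. format_shoresh is a hypothetical helper; the input formats it tolerates are assumptions.

```python
def format_shoresh(root):
    """Render a root in dotted form, e.g. 'אכל' or 'א - כ - ל' -> 'א.כ.ל'."""
    # Keep only Hebrew letters (alef U+05D0 through tav U+05EA), dropping separators.
    letters = [ch for ch in root if "\u05d0" <= ch <= "\u05ea"]
    return ".".join(letters)
```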

    Frequency pipeline (Sprint 13)

    • YAP-cleaned frequency corpus (30,430 entries)
    • Two-tier assignment: 6,691 entries with frequency
    • PoS-aware homograph handling

    Detail scraping (Sprint 12)

    • Adjective/preposition detail pages
    • EPUB example matching rewrite
  • v0.15.1 04a4b52113

    sochen released this 2026-03-08 21:12:45 -07:00 | 20 commits to master since this release

    Homographs (same nikkud form, different meanings) had identical
    plurals_guid values. Regenerated unique GUIDs by including meaning
    in the hash. Also updated build-time fallback to use meaning.
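
    To illustrate why including the meaning fixes the collision, here is a sketch of a hash-based GUID. The function name mirrors the plurals_guid field, but the hash function, separator, and 16-character length are assumptions rather than the project's actual scheme.

```python
import hashlib


def plurals_guid(nikkud_form, meaning):
    """Stable GUID derived from the vowelled form and its meaning."""
    # The unit separator keeps ("ab", "c") and ("a", "bc") from producing the same key.
    key = f"plural\x1f{nikkud_form}\x1f{meaning}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
```

    Two homographs now hash to different values because their meanings differ, while rebuilding the deck with unchanged data reproduces the same GUIDs.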

    Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • v0.14 802c369365

    sochen released this 2026-03-07 01:26:41 -08:00 | 27 commits to master since this release

    • Full pealim.com rescrape: 9,120 words (15 new), all with audio URLs
    • Plurals deck: 2:1 regular:irregular ratio (649 notes), RTL arrows, 1.6x hint text
    • Conjugation deck: blue infinitive on front, plain meaning on back, nikkud labels
    • Confusables deck: larger prompt text (32px), audio only when all words have it
    • Validator: non-audio variants no longer false-fail on audio check
    • 14 new audio files downloaded

    Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • v0.13 17f7458d19

    sochen released this 2026-03-07 00:09:39 -08:00 | 31 commits to master since this release

    What's New

    • Sentence cloze cards: 924 fill-in-the-blank cards from AI-vetted Hebrew book sentences (הנסיך הקטן [The Little Prince], מנהרת הזמן 82 [Time Tunnel 82])
    • Plurals deck: 375 notes (144 irregular + 231 regular exemplars from 86 mishkal patterns)
    • 12 deck variants: vocabulary (4), conjugations (2), confusables (2), plurals (2), complete (2)
    • Combined deck: hebrew_complete.apkg bundles all 4 decks as subdecks

    Code Quality

    • Full ruff/vulture/bandit lint pass (0 errors)
    • Project reorganized: shared helpers.py, scripts/ for one-off tools, smoke tests
    • pyproject.toml with tooling config

    Deck Stats

    Deck          Notes    Cards
    Vocabulary    9,090    19,104 (incl. 924 cloze)
    Conjugations  1,857    1,857
    Confusables   339      339
    Plurals       375      750
    Complete      11,664   22,056
  • v0.12 3fc3a21a33

    sochen released this 2026-03-05 13:01:47 -08:00 | 34 commits to master since this release

    Hebrew Flash Cards v0.12

    Vocabulary Deck

    • 8,957 notes × 2 card types (Heb→Eng, Eng→Heb) = 17,914 cards
    • Frequency-sorted from Hebrew word corpus (49,999 entries)
    • 1,820 words with emoji (Unicode lookup with curated denylist)
    • 158 Hebrew prepositions extracted and displayed with word
    • 8,727 audio files from pealim.com (in audio variants)
    • 40 images for concrete nouns (in image variants)
    • Back shows: root, PoS, related words, unvoweled form, example sentence

    Conjugation Deck

    • 1,834 notes across 70 paradigm verbs (all binyanim)
    • Seeded RNG for deterministic pronoun/gender selection
    • Nikkud on gender labels: זָכָר / נְקֵבָה
    • 1,765 audio files (in audio variant)
    • Random card order

    Download Variants

    File                                 Size    Contents
    hebrew_vocabulary.apkg               6 MB    Base vocab deck
    hebrew_vocabulary_audio.apkg         54 MB   + 8,727 audio files
    hebrew_vocabulary_images.apkg        7 MB    + 40 noun images
    hebrew_vocabulary_audio_images.apkg  55 MB   + audio + images
    hebrew_conjugations.apkg             0.9 MB  Base conjugation deck
    hebrew_conjugations_audio.apkg       11 MB   + 1,765 audio files

    Changes since v0.11

    • Emoji Unicode lookup from emoji-test.txt with curated 80-keyword denylist
    • Hebrew preposition extraction (parenthetical Hebrew → displayed after word)
    • Conjugation nikkud fix (זכר→זָכָר, נקבה→נְקֵבָה)
    • Vocab audio: scraped 8,727 audio URLs from pealim.com, bundled in audio variants
    • Conjugation card reduction: 1 pronoun per present form key (seeded RNG)
    • 6-variant release system (4 vocab + 2 conj)
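
    The emoji lookup in the first bullet might be sketched as follows. The line format is that of the Unicode emoji-test.txt data file; load_emoji_names, emoji_for, the single-word matching strategy, and the two denylist entries are hypothetical stand-ins for the real curated list.

```python
import re

# Matches lines such as: "1F951 ; fully-qualified # 🥑 E3.0 avocado"
_LINE = re.compile(
    r"^[0-9A-F ]+;\s*fully-qualified\s*#\s*(\S+)\s+E\d+\.\d+\s+(.+)$"
)

# Assumption: keywords whose emoji would mislead (e.g. the verbs "to ring", "to watch").
DENYLIST = {"ring", "watch"}


def load_emoji_names(lines):
    """Map lowercase emoji names to emoji, skipping comments and unqualified entries."""
    table = {}
    for line in lines:
        m = _LINE.match(line.strip())
        if m:
            table.setdefault(m.group(2).lower(), m.group(1))
    return table


def emoji_for(meaning, table, denylist=DENYLIST):
    """Return the emoji for the first word of `meaning` that is an allowed emoji name."""
    for word in re.findall(r"[a-z]+", meaning.lower()):
        if word not in denylist and word in table:
            return table[word]
    return None
```

    In practice the lines would come from an open file handle, for example load_emoji_names(open("emoji-test.txt", encoding="utf-8")).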
  • v0.11 64a1b18951

    sochen released this 2026-03-04 21:49:51 -08:00 | 38 commits to master since this release

    Sprint 7 Changes

    Emoji on Vocab Cards

    Emoji extracted from meanings (🥑 avocado, 🍎 apple…) and displayed prominently on card back. Meaning text is cleaned.

    Hebrew Prepositions

    Hebrew parentheticals like (על) extracted from English meaning and displayed inline after the Hebrew word on both card sides.
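
    A minimal sketch of this extraction, assuming the prepositions always appear as a parenthesised run of Hebrew characters inside the English meaning. extract_preps and its return shape are hypothetical.

```python
import re

# A parenthesised group that starts with a Hebrew character (U+0590 to U+05FF), e.g. "(על)".
_HEB_PAREN = re.compile(r"\(\s*([\u0590-\u05FF][\u0590-\u05FF\s,/]*)\)")


def extract_preps(meaning):
    """Split an English meaning into (cleaned meaning, list of Hebrew parentheticals)."""
    preps = [p.strip() for p in _HEB_PAREN.findall(meaning)]
    cleaned = _HEB_PAREN.sub("", meaning)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    return cleaned, preps
```

    English parentheticals such as "(food)" are left in place because the pattern requires a Hebrew first character.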

    Conjugation Card Reduction (~630 fewer cards)

    Present-tense expansion now picks 1 pronoun per form key (seeded RNG, deterministic per verb). past_3p picks 1 gender. 1st-person forms gain a gender label (זכר/נקבה).
    Total conjugation cards: 1,834 (was ~2,464).
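
    The deterministic selection can be sketched with a random.Random instance seeded per verb. The form keys and pronoun lists below are illustrative assumptions; only the idea of one seeded pick per form key comes from the notes.

```python
import random

# Hypothetical form keys; each present-tense form is shared by several pronouns.
PRESENT_PRONOUNS = {
    "present_ms": ["אני", "אתה", "הוא"],
    "present_fs": ["אני", "את", "היא"],
    "present_mp": ["אנחנו", "אתם", "הם"],
    "present_fp": ["אנחנו", "אתן", "הן"],
}


def pick_pronouns(verb_key):
    """Pick one pronoun per present form, reproducibly for a given verb."""
    # A str seed is hashed deterministically by random.Random, whereas the
    # built-in hash() of a str is salted per process.
    rng = random.Random(verb_key)
    return {form: rng.choice(options) for form, options in PRESENT_PRONOUNS.items()}
```

    Because the seed depends only on the verb, rebuilding the deck yields the same cards, so existing review history stays attached to the same notes.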

    Renamed: Hebrew Flash Cards

    • Output files: hebrew_vocabulary.apkg, hebrew_conjugations.apkg
    • Anki deck names: "Hebrew Vocabulary", "Hebrew Conjugations"
    • Anki model name: "Hebrew Flash Cards"
    • No study data loss: deck IDs and note GUIDs unchanged.

    Vocab Audio Infrastructure

    Scraper now captures data-audio URLs from pealim.com list pages into CSV. Audio download step reads directly from CSV (no extra scraping needed).

    Bug Fix

    Deduplicate source CSV entries with identical Hebrew words to prevent GUID collisions (eliminated 149 duplicate entries).
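
    A sketch of such a deduplication pass over CSV rows, assuming a column named "hebrew" and a keep-the-first-occurrence policy. Both are assumptions; the notes do not say which duplicate is retained.

```python
import csv
import io


def dedupe_rows(rows, key="hebrew"):
    """Keep the first row for each distinct value of `key`, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] in seen:
            continue  # a later duplicate would collide on the GUID
        seen.add(row[key])
        unique.append(row)
    return unique


def dedupe_csv_text(text, key="hebrew"):
    """Convenience wrapper: parse CSV text and return the de-duplicated rows."""
    return dedupe_rows(csv.DictReader(io.StringIO(text)), key=key)
```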
