Anki Flash Cards for Learning Hebrew Vocabulary and Conjugations!
Find a file
Sochen d26e4c8ce5 feat: Sprint 3 — passive/active separation, random card order, card UX fixes
Conjugation extraction:
- Active entries now extract active forms only (no auto passive partner)
- Passive (# 3ms:) entries extract passive section only via new
  _extract_passive_from_active_slug(); search-based fallback also uses
  this path so no active forms leak into passive entries
- # slug: VERB SLUG override syntax for search-ambiguous active verbs
- # 3ms: FORM ACTIVE-SLUG syntax for passive entries with known active page
- Fixed verb spellings: בוטל (was בותל), slug overrides for תואם →
  2344-letaem, זוכה → 503-lezakot, לָשִׂים → 45-lasim, העבר → 1442-lehaavir

Card UX:
- Passive card front: shows active partner infinitive (e.g. לְבַטֵּל) with
  (סָבִיל) inline in smaller font instead of bare 3ms past form
- Removed פָּעִיל label from active cards; only passive cards carry voice label
- New cards introduced in random order (new.order=0 via _RandomOrderPackage)
- Frequency badge: words outside top 50k show "50k+" instead of blank

README: updated CLI options, output files table, pipeline list, card
descriptions to reflect Sprint 3 state

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 10:16:50 +00:00
data feat: Sprint 3 — passive/active separation, random card order, card UX fixes 2026-03-03 10:16:50 +00:00
.gitignore feat: Sprint 2 + Sprint 3 — verb list, audio, passive forms, CSS/UX, validation, Heebo font, images 2026-03-03 08:36:51 +00:00
apkg_builder.py feat: Sprint 3 — passive/active separation, random card order, card UX fixes 2026-03-03 10:16:50 +00:00
benyehuda.py feat: Sprint 2 + Sprint 3 — verb list, audio, passive forms, CSS/UX, validation, Heebo font, images 2026-03-03 08:36:51 +00:00
conjugation_extract.py feat: Sprint 3 — passive/active separation, random card order, card UX fixes 2026-03-03 10:16:50 +00:00
extract_verb_list.py feat: Sprint 2 + Sprint 3 — verb list, audio, passive forms, CSS/UX, validation, Heebo font, images 2026-03-03 08:36:51 +00:00
flashcard.png added a pic 2024-06-08 21:24:41 -07:00
frequency_lookup.py feat: add apkg builder, frequency, Ben Yehuda examples, conjugation deck 2026-03-03 01:58:31 +00:00
image_fetch.py feat: Sprint 3 — Heebo font files, image fetch, verb validator scripts 2026-03-03 08:37:08 +00:00
pealim.apkg Initial commit 2024-06-08 21:15:20 -07:00
pealim_dict.csv Initial commit 2024-06-08 21:15:20 -07:00
pealim_dict_for_anki.csv Initial commit 2024-06-08 21:15:20 -07:00
pealim_extract.py Improve scraper robustness and Hebrew text handling 2026-02-26 21:57:20 +00:00
README.md feat: Sprint 3 — passive/active separation, random card order, card UX fixes 2026-03-03 10:16:50 +00:00
requirements.txt feat: add apkg builder, frequency, Ben Yehuda examples, conjugation deck 2026-03-03 01:58:31 +00:00
run.py feat: Sprint 3 — passive/active separation, random card order, card UX fixes 2026-03-03 10:16:50 +00:00
test_scrape.py Improve scraper robustness and Hebrew text handling 2026-02-26 21:57:20 +00:00
validate_verb_list.py feat: Sprint 3 — Heebo font files, image fetch, verb validator scripts 2026-03-03 08:37:08 +00:00
verbs_input.txt feat: Sprint 3 — passive/active separation, random card order, card UX fixes 2026-03-03 10:16:50 +00:00

Pealim — Hebrew Vocabulary & Verb Flashcards for Anki

Flashcard screenshot


For Hebrew learners

This project generates two Anki decks for learning Modern Hebrew:

  • Vocabulary deck — ~14,400 words from pealim.com, with nikkud (vowel marks), roots, parts of speech, related words, example sentences from classic Hebrew literature, and audio pronunciation.
  • Conjugation deck — 71 paradigm verbs from Coffin & Bolozky's A Reference Grammar of Modern Hebrew (2005), fully conjugated in all tenses and persons, across all seven binyanim.

All card data comes from open or academic sources:

  • Word data: pealim.com — a free Modern Hebrew dictionary
  • Example sentences: Project Ben-Yehuda — public-domain Hebrew literature corpus
  • Word frequency: hermitdave/FrequencyWords — Hebrew frequency list
  • Verb paradigm list: Coffin, Edna Amir and Shmuel Bolozky. A Reference Grammar of Modern Hebrew. Cambridge University Press, 2005.

Just give me the flashcards

  1. Download the .apkg files from Releases
  2. Double-click to import into Anki (free, cross-platform)
  3. Start studying

Both decks can be imported independently. If you already have one, re-importing the same file updates your deck without losing study progress.


What's in the vocabulary deck

Each card has two sides:

Hebrew → English: See the Hebrew word (with nikkud) + hear audio → recall the meaning.

English → Hebrew: See the English meaning → recall the Hebrew word, its root, and how to write it.

Fields on each card:

Field Example
Hebrew word (nikkud) שָׁמַר
Meaning kept, watched over
Root שמ״ר
Part of speech פועל (verb)
Without nikkud שמר
Related words שׁוֹמֵר, שְׁמִירָה
Example sentence from Ben-Yehuda corpus
Audio pronunciation from pealim.com
Frequency rank #412

Cards are presented in random order within Anki's spaced-repetition system, but frequency rank is displayed on every card so you can see how common each word is. Words not in the top 50,000 show a "50k+" badge.


What's in the conjugation deck

71 paradigm verbs from Coffin & Bolozky's A Reference Grammar of Modern Hebrew (Appendix 1), covering all seven binyanim:

  • פָּעַל (Pa'al), נִפְעַל (Nif'al), פִּעֵל (Pi'el), פֻּעַל (Pu'al)
  • הִתְפַּעֵל (Hitpa'el), הִפְעִיל (Hif'il), הֻפְעַל (Huf'al)

Each verb is drilled in: present, past, future, imperative, infinitive — all persons and genders.

Present tense expansion: Each present form generates 3 cards (one per pronoun that uses it), so you learn אֲנִי, אַתָּה, and הוּא all separately with the same masculine singular form.

Modern Hebrew 2fp/3fp: Classical feminine plural future forms (e.g., תִּשְׁמֹרְנָה) are shown in parentheses; the card's primary answer is the modern masculine plural form used in everyday speech.

Passive label: Pu'al and Huf'al cards show the active partner's infinitive on the front (e.g., לְבַטֵּל) followed by (סָבִיל) in smaller text, so you know you're drilling the passive conjugation. Active verbs show no label.

Card order: New cards are introduced in random order.

Citation: Coffin, Edna Amir and Shmuel Bolozky. A Reference Grammar of Modern Hebrew. Cambridge University Press, 2005.


Suggested study strategy

Start with the vocabulary deck. Anki will present the most frequent words first. Aim for 1020 new cards per day.

Once you have ~300500 vocabulary words, add the conjugation deck. The conjugation cards reinforce verb forms you've already seen in vocabulary.

Use the Hebrew → English direction to build reading comprehension. Use the English → Hebrew direction to build writing and speaking recall.


About the data sources

pealim.com — A comprehensive free Modern Hebrew dictionary with nikkud, roots, conjugations, and audio. This project scrapes the public dictionary listing (not conjugation tables, which are covered separately).

Project Ben-Yehuda — A public-domain digital library of Hebrew literature. Example sentences come from the nikkud corpus (classic texts with full vowel marks).

FrequencyWords — An open Hebrew word frequency list derived from subtitle data. Used to sort vocabulary cards from most to least common.

Coffin & Bolozky — The verb paradigm list for the conjugation deck comes from Appendix 1 of A Reference Grammar of Modern Hebrew (Cambridge University Press, 2005), which provides a comprehensive reference for Modern Hebrew verbal morphology.


Fixing errors

If you notice a wrong translation, missing audio, or incorrect conjugation:

  • For vocabulary errors: the source is pealim.com — you can suggest corrections there.
  • For conjugation errors: open an issue in this repository with the verb and the correct form.
  • For example sentence issues: open an issue with the word and sentence.

For developers

Installation

pip install -r requirements.txt

Quick test (20 words, no network)

python run.py --skip-scrape --skip-audio --skip-examples --test 20

Full pipeline

# Use cached dictionary (recommended after first run)
python run.py --skip-scrape

# Full rebuild including verb list extraction from PDF
python extract_verb_list.py
python run.py --skip-scrape --refresh-examples

CLI options

python run.py [options]

  --skip-scrape        Use cached data/pealim_dict.csv (no pealim.com scraping)
  --skip-audio         Skip audio .mp3 downloads
  --skip-examples      Skip Ben Yehuda example fetching
  --only {vocab,conjugations}  Run only one deck (skips all unrelated steps)
  --skip-conjugations  Skip verb conjugation extraction
  --skip-images        Skip image fetching for concrete nouns
  --refresh-examples   Force rebuild of Ben Yehuda index (nikkud corpus)
  --test N             Process only first N words

Output files

File Description
data/pealim_dict.csv Raw dictionary
data/pealim_dict_for_anki.csv Enriched Anki CSV
data/conjugations.json Verb conjugation data
data/audio/ Vocabulary audio (.mp3)
data/audio_conj/ Conjugation audio (.mp3)
data/fonts/ Heebo font files (bundled in .apkg)
data/images/ Noun images from Wikipedia/Commons
data/image_cache.json Image fetch cache
output/pealim_vocabulary.apkg Vocabulary Anki deck
output/pealim_conjugations.apkg Conjugation Anki deck

Pipeline overview

  1. pealim_extract.py — scrapes pealim.com dictionary
  2. frequency_lookup.py — downloads/loads Hebrew frequency data
  3. benyehuda.py — builds sentence index from Ben-Yehuda corpus
  4. extract_verb_list.py — extracts verb list from Coffin & Bolozky PDF
  5. conjugation_extract.py — fetches conjugation tables from pealim.com
  6. image_fetch.py — fetches Wikipedia/Commons images for concrete nouns
  7. validate_verb_list.py — validates verb list against pealim.com
  8. apkg_builder.py — assembles both .apkg files
  9. run.py — orchestrates all steps

AnkiWeb

The generated decks will be published on AnkiWeb. See ANKIWEB_DESCRIPTION.md for the submission content.