Anki Flash Cards for Learning Hebrew Vocabulary and Conjugations!
Sochen 372680be3c Add missing 70th verb להתקלח (to shower, Hitpa'el)
להתלקלח in the original source was a typo for להתקלח (1896-lehitkaleach),
not for להתקלקל as previously assumed — it's a completely different word.
Conjugation deck now has the correct 70 paradigm verbs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 06:46:07 +00:00
data Add missing 70th verb להתקלח (to shower, Hitpa'el) 2026-03-04 06:46:07 +00:00
.gitignore fix: correct word/verb counts in README, add missing .gitignore entries 2026-03-04 06:32:44 +00:00
apkg_builder.py Sprint 4: fix insertion order, skip infinitive cards, split past_3p, fix empty binyan 2026-03-04 06:42:32 +00:00
benyehuda.py feat: Sprint 2 + Sprint 3 — verb list, audio, passive forms, CSS/UX, validation, Heebo font, images 2026-03-03 08:36:51 +00:00
conjugation_extract.py Sprint 4: fix insertion order, skip infinitive cards, split past_3p, fix empty binyan 2026-03-04 06:42:32 +00:00
extract_verb_list.py feat: Sprint 2 + Sprint 3 — verb list, audio, passive forms, CSS/UX, validation, Heebo font, images 2026-03-03 08:36:51 +00:00
flashcard.png added a pic 2024-06-08 21:24:41 -07:00
frequency_lookup.py feat: add apkg builder, frequency, Ben Yehuda examples, conjugation deck 2026-03-03 01:58:31 +00:00
image_fetch.py feat: Sprint 3 — Heebo font files, image fetch, verb validator scripts 2026-03-03 08:37:08 +00:00
pealim.apkg Initial commit 2024-06-08 21:15:20 -07:00
pealim_dict.csv Initial commit 2024-06-08 21:15:20 -07:00
pealim_dict_for_anki.csv Initial commit 2024-06-08 21:15:20 -07:00
pealim_extract.py Improve scraper robustness and Hebrew text handling 2026-02-26 21:57:20 +00:00
README.md fix: correct word/verb counts in README, add missing .gitignore entries 2026-03-04 06:32:44 +00:00
requirements.txt feat: add apkg builder, frequency, Ben Yehuda examples, conjugation deck 2026-03-03 01:58:31 +00:00
run.py feat: Sprint 3 — passive/active separation, random card order, card UX fixes 2026-03-03 10:16:50 +00:00
test_scrape.py Improve scraper robustness and Hebrew text handling 2026-02-26 21:57:20 +00:00
validate_verb_list.py feat: Sprint 3 — Heebo font files, image fetch, verb validator scripts 2026-03-03 08:37:08 +00:00
verbs_input.txt Add missing 70th verb להתקלח (to shower, Hitpa'el) 2026-03-04 06:46:07 +00:00

Pealim — Hebrew Vocabulary & Verb Flashcards for Anki

Flashcard screenshot


For Hebrew learners

This project generates two Anki decks for learning Modern Hebrew:

  • Vocabulary deck — ~9,100 words from pealim.com, with nikkud (vowel marks), roots, parts of speech, related words, example sentences from classic Hebrew literature, and audio pronunciation.
  • Conjugation deck — 70 paradigm verbs from Coffin & Bolozky's A Reference Grammar of Modern Hebrew (2005), fully conjugated in all tenses and persons, across all seven binyanim.

All card data comes from open or academic sources:

  • Word data: pealim.com — a free Modern Hebrew dictionary
  • Example sentences: Project Ben-Yehuda — public-domain Hebrew literature corpus
  • Word frequency: hermitdave/FrequencyWords — Hebrew frequency list
  • Verb paradigm list: Coffin, Edna Amir and Shmuel Bolozky. A Reference Grammar of Modern Hebrew. Cambridge University Press, 2005.

Just give me the flashcards

  1. Download the .apkg files from Releases
  2. Double-click to import into Anki (free, cross-platform)
  3. Start studying

Both decks can be imported independently. If you already have one, re-importing the same file updates your deck without losing study progress.


What's in the vocabulary deck

Each card has two sides:

Hebrew → English: See the Hebrew word (with nikkud) + hear audio → recall the meaning.

English → Hebrew: See the English meaning → recall the Hebrew word, its root, and how to write it.

Fields on each card:

Field | Example
Hebrew word (nikkud) | שָׁמַר
Meaning | kept, watched over
Root | שמ״ר
Part of speech | פועל (verb)
Without nikkud | שמר
Related words | שׁוֹמֵר, שְׁמִירָה
Example sentence | from Ben-Yehuda corpus
Audio pronunciation | from pealim.com
Frequency rank | #412
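
The "Without nikkud" field can in principle be derived from the pointed form by dropping Unicode combining marks. The following is a minimal sketch of that idea, not the project's actual code; `strip_nikkud` is a hypothetical helper name.

```python
import unicodedata


def strip_nikkud(text: str) -> str:
    """Return text with Hebrew vowel points and other combining marks removed."""
    # NFD splits precomposed presentation forms into base letter + marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Nikkud and cantillation marks have Unicode category Mn (nonspacing mark).
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(strip_nikkud("שָׁמַר"))
```

For the example word in the table this prints the unpointed form שמר. Note that the category check is not Hebrew-specific: it would also strip accents from Latin text, so a stricter version might restrict itself to the Hebrew block.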

Cards are presented in random order within Anki's spaced-repetition system, but frequency rank is displayed on every card so you can see how common each word is. Words not in the top 50,000 show a "50k+" badge.
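
The badge rule described above could be expressed roughly as follows. This is an illustrative sketch only; `frequency_badge` and `TOP_N` are hypothetical names, and treating a missing rank as `None` is an assumption.

```python
from typing import Optional

TOP_N = 50_000  # cutoff stated in the README


def frequency_badge(rank: Optional[int]) -> str:
    """Format a frequency rank for display on a card."""
    # Words absent from the list (None) or beyond the cutoff get the generic badge.
    if rank is None or rank > TOP_N:
        return "50k+"
    return f"#{rank}"


print(frequency_badge(412))
print(frequency_badge(None))
```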


What's in the conjugation deck

70 paradigm verbs from Coffin & Bolozky's A Reference Grammar of Modern Hebrew (Appendix 1), covering all seven binyanim:

  • פָּעַל (Pa'al), נִפְעַל (Nif'al), פִּעֵל (Pi'el), פֻּעַל (Pu'al)
  • הִתְפַּעֵל (Hitpa'el), הִפְעִיל (Hif'il), הֻפְעַל (Huf'al)

Each verb is drilled in: present, past, future, imperative, infinitive — all persons and genders.

Present tense expansion: Each present form generates 3 cards (one per pronoun that uses it), so you learn אֲנִי, אַתָּה, and הוּא all separately with the same masculine singular form.
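
As a rough illustration of the expansion, the sketch below pairs each present-tense form with the three pronouns that share it. The slot keys (`ms`, `fs`, `mp`, `fp`) and the function name are assumptions made for this example, not names taken from the project.

```python
# Pronouns that share each present-tense form in Modern Hebrew.
PRESENT_PRONOUNS = {
    "ms": ["אֲנִי", "אַתָּה", "הוּא"],
    "fs": ["אֲנִי", "אַתְּ", "הִיא"],
    "mp": ["אֲנַחְנוּ", "אַתֶּם", "הֵם"],
    "fp": ["אֲנַחְנוּ", "אַתֶּן", "הֵן"],
}


def expand_present(forms: dict[str, str]) -> list[tuple[str, str]]:
    """Turn {slot: verb form} into one (pronoun, form) pair per card."""
    cards = []
    for slot, form in forms.items():
        for pronoun in PRESENT_PRONOUNS[slot]:
            cards.append((pronoun, form))
    return cards


cards = expand_present({"ms": "שׁוֹמֵר", "fs": "שׁוֹמֶרֶת"})
print(len(cards))  # two forms x three pronouns each
```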

Modern Hebrew 2fp/3fp: Classical feminine plural future forms (e.g., תִּשְׁמֹרְנָה) are shown in parentheses; the card's primary answer is the modern masculine plural form used in everyday speech.
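
One way to render such an answer is to put the modern form first and append the classical form in parentheses when it differs. This is a hedged sketch; `format_answer` is a hypothetical helper, and the project may build the string differently.

```python
from typing import Optional


def format_answer(modern: str, classical: Optional[str] = None) -> str:
    """Show the modern form, with the classical form in parentheses if distinct."""
    if classical and classical != modern:
        return f"{modern} ({classical})"
    return modern


# Future 2nd person feminine plural of "to guard":
# everyday speech uses the masculine plural form.
print(format_answer("תִּשְׁמְרוּ", "תִּשְׁמֹרְנָה"))
```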

Passive label: Pu'al and Huf'al cards show the active partner's infinitive on the front (e.g., לְבַטֵּל) followed by (סָבִיל) in smaller text, so you know you're drilling the passive conjugation. Active verbs show no label.
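
The labelling rule can be sketched as below. Anki card fields are HTML, so a `<small>` tag is used here for the smaller text, but that tag choice, the binyan spellings used as keys, and the function name are all assumptions for illustration.

```python
PASSIVE_BINYANIM = {"Pu'al", "Huf'al"}
PASSIVE_LABEL = "(סָבִיל)"


def card_front(infinitive: str, binyan: str) -> str:
    """Build the front-of-card text for a conjugation card.

    For passive binyanim, `infinitive` is expected to be the active
    partner's infinitive, and a passive label is appended.
    """
    if binyan in PASSIVE_BINYANIM:
        return f"{infinitive} <small>{PASSIVE_LABEL}</small>"
    return infinitive


print(card_front("לְבַטֵּל", "Pu'al"))
print(card_front("לִשְׁמֹר", "Pa'al"))
```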

Card order: New cards are introduced in random order.

Citation: Coffin, Edna Amir and Shmuel Bolozky. A Reference Grammar of Modern Hebrew. Cambridge University Press, 2005.


Suggested study strategy

Start with the vocabulary deck. Anki will present the most frequent words first. Don't try to study too many cards every day; Anki's default limit is 20 new cards per day.

The conjugation cards reinforce verb forms you've already seen in vocabulary.

Use the Hebrew → English direction to build reading comprehension. Use the English → Hebrew direction to build writing and speaking recall.


About the data sources

pealim.com — A comprehensive free Modern Hebrew dictionary with nikkud, roots, conjugations, and audio. This project scrapes the public dictionary and conjugation tables.

Project Ben-Yehuda — A public-domain digital library of Hebrew literature. Example sentences come from the nikkud corpus (classic texts with full vowel marks).

FrequencyWords — An open Hebrew word frequency list derived from subtitle data. Used to sort vocabulary cards from most to least common.
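
The FrequencyWords files list one `word count` pair per line, most frequent first, so a word's rank is simply its line number. The sketch below shows how such a list could be turned into ranks and used as a sort key. The function name and the sample counts are made up for illustration, and the lookup assumes words are compared without nikkud.

```python
from typing import Iterable


def load_ranks(lines: Iterable[str]) -> dict[str, int]:
    """Map each word to its 1-based line position in a 'word count' list."""
    ranks: dict[str, int] = {}
    for position, line in enumerate(lines, start=1):
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        ranks.setdefault(parts[0], position)  # keep the first (best) rank
    return ranks


sample = ["לא 5000", "את 4000", "שמר 12"]  # illustrative counts only
ranks = load_ranks(sample)

# Words missing from the list sort after every ranked word.
words = ["שמר", "חתול", "לא"]
ordered = sorted(words, key=lambda w: ranks.get(w, float("inf")))
print(ordered)
```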

Coffin & Bolozky — The verb paradigm list for the conjugation deck comes from Appendix 1 of A Reference Grammar of Modern Hebrew (Cambridge University Press, 2005), which provides a comprehensive reference for Modern Hebrew verbal morphology.


Fixing errors

If you notice a wrong translation, missing audio, or incorrect conjugation:

  • For vocabulary errors: the source is pealim.com, so you can suggest corrections there. If you think Morfix has a correct translation that pealim.com lacks, we may be able to encode an override.

For any other issue, whether you know how to code or not, email me at pealim [at] nevo [dot] engineer


For developers

Installation

pip install -r requirements.txt

Quick test (20 words, no network)

python run.py --skip-scrape --skip-audio --skip-examples --test 20

Full pipeline

# Use cached dictionary (recommended after first run)
python run.py --skip-scrape

# Full rebuild including verb list extraction from PDF
python extract_verb_list.py
python run.py --skip-scrape --refresh-examples

CLI options

python run.py [options]

  --skip-scrape        Use cached data/pealim_dict.csv (no pealim.com scraping)
  --skip-audio         Skip audio .mp3 downloads
  --skip-examples      Skip Ben Yehuda example fetching
  --only {vocab,conjugations}  Run only one deck (skips all unrelated steps)
  --skip-conjugations  Skip verb conjugation extraction
  --skip-images        Skip image fetching for concrete nouns
  --refresh-examples   Force rebuild of Ben Yehuda index (nikkud corpus)
  --test N             Process only first N words
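
For readers extending the CLI, the options above map naturally onto `argparse`. The following is a sketch of how such a parser could be declared; it is not copied from run.py, and the real script may be structured differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags (illustrative, not the project's parser)."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("--skip-scrape", action="store_true",
                        help="Use cached data/pealim_dict.csv")
    parser.add_argument("--skip-audio", action="store_true",
                        help="Skip audio .mp3 downloads")
    parser.add_argument("--skip-examples", action="store_true",
                        help="Skip Ben Yehuda example fetching")
    parser.add_argument("--only", choices=["vocab", "conjugations"],
                        help="Run only one deck")
    parser.add_argument("--skip-conjugations", action="store_true",
                        help="Skip verb conjugation extraction")
    parser.add_argument("--skip-images", action="store_true",
                        help="Skip image fetching for concrete nouns")
    parser.add_argument("--refresh-examples", action="store_true",
                        help="Force rebuild of Ben Yehuda index")
    parser.add_argument("--test", type=int, metavar="N",
                        help="Process only first N words")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--skip-scrape", "--only", "vocab", "--test", "20"])
print(args.skip_scrape, args.only, args.test)
```

With `action="store_true"`, omitted flags default to `False`, and `choices` makes argparse reject any `--only` value other than the two documented ones.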

Output files

File | Description
data/pealim_dict.csv | Raw dictionary
data/pealim_dict_for_anki.csv | Enriched Anki CSV
data/conjugations.json | Verb conjugation data
data/audio/ | Vocabulary audio (.mp3)
data/audio_conj/ | Conjugation audio (.mp3)
data/fonts/ | Heebo font files (bundled in .apkg)
data/images/ | Noun images from Wikipedia/Commons
data/image_cache.json | Image fetch cache
output/pealim_vocabulary.apkg | Vocabulary Anki deck
output/pealim_conjugations.apkg | Conjugation Anki deck

Pipeline overview

  1. pealim_extract.py — scrapes pealim.com dictionary
  2. frequency_lookup.py — downloads/loads Hebrew frequency data
  3. benyehuda.py — builds sentence index from Ben-Yehuda corpus
  4. extract_verb_list.py — extracts verb list from Coffin & Bolozky PDF
  5. conjugation_extract.py — fetches conjugation tables from pealim.com
  6. image_fetch.py — fetches Wikipedia/Commons images for concrete nouns
  7. validate_verb_list.py — validates verb list against pealim.com
  8. apkg_builder.py — assembles both .apkg files
  9. run.py — orchestrates all steps

AnkiWeb

The decks will be published as shared decks on AnkiWeb (TBD).