Hapax Legomena: Words Occurring Once in the Biblical Text

A hapax legomenon (Greek: "said once"; plural hapax legomena) is a word that appears exactly once in a given corpus. Biblical hapaxes matter for lexicography and translation because, with no second occurrence to compare against, their meaning must be inferred from context, cognates, or ancient translations.

The book of Job is famously rich in hapaxes, reflecting its archaic vocabulary, possible non-Israelite setting, and the specialized poetic language of lament and disputation. Leviticus concentrates technical cultic vocabulary, some of which appears nowhere else. In the NT, Revelation's vivid apocalyptic imagery calls for Greek vocabulary found nowhere else in the NT.

import sys
sys.path.insert(0, '../../../src')

import pandas as pd
from bible_grammar.hapax import hapax_legomena, hapax_table, hapax_summary

1. What Is a Hapax?

A hapax legomenon is a word found exactly once in a corpus. Because there are no other occurrences to compare against, translators must reason from:

Context: what the surrounding text implies
Cognates: related words in Aramaic, Arabic, Ugaritic, Akkadian, etc.
Ancient translations: the LXX (Septuagint), Vulgate, and Peshitta
Parallelism: in Hebrew poetry, the parallel colon often illuminates meaning
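
The definition above translates directly into a frequency count. The following is a minimal sketch using only the standard library and a made-up lemma list; the function name `find_hapaxes` and the data are illustrative assumptions, not part of `bible_grammar`:

```python
from collections import Counter

def find_hapaxes(lemmas):
    """Return the lemmas that occur exactly once, in first-seen order."""
    counts = Counter(lemmas)
    return [lemma for lemma in counts if counts[lemma] == 1]

# Toy, made-up lemma stream (not real biblical data).
toy_lemmas = ["king", "say", "river-reed", "king", "say",
              "go", "ostrich-wing", "go"]

toy_hapaxes = find_hapaxes(toy_lemmas)
print(toy_hapaxes)
```

Counting lemmas rather than surface forms matters here: otherwise every rare inflected form of a common word would register as a spurious hapax.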

Job is usually cited as the most hapax-rich book in the OT: over 100 unique lemmas in it appear nowhere else in the Hebrew Bible. This contributes to Job being one of the hardest OT books to translate.

The summary below shows hapax counts per book across the entire OT, sorted by count:

ot_summary = hapax_summary('OT')
ot_summary.head(20)

2. OT Hapaxes by Book

The full table of hapaxes per OT book, followed by the NT summary:

# Full OT summary
ot_summary = hapax_summary('OT')
print(f"OT books with hapax data: {len(ot_summary)}")
ot_summary

# NT summary
nt_summary = hapax_summary('NT')
print(f"NT books with hapax data: {len(nt_summary)}")
nt_summary
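
For readers curious how a per-book summary of this kind could be computed, here is a rough standard-library sketch. It is not the `hapax_summary` implementation; the function `hapax_counts_by_book` and the toy `(book, lemma)` pairs are assumptions made up for illustration:

```python
from collections import Counter

# Toy (book, lemma) occurrences; made-up data, not real biblical counts.
toy_tokens = [
    ("Job", "aleph"), ("Job", "bet"), ("Job", "gimel"),
    ("Lev", "aleph"), ("Lev", "dalet"),
    ("Psa", "aleph"), ("Psa", "he"),
]

def hapax_counts_by_book(tokens):
    """Count corpus-wide hapaxes per book, sorted by count descending."""
    # Frequency of each lemma across the whole corpus.
    lemma_freq = Counter(lemma for _, lemma in tokens)
    # Credit each once-only lemma to the book in which it occurs.
    per_book = Counter(book for book, lemma in tokens
                       if lemma_freq[lemma] == 1)
    return per_book.most_common()

toy_summary = hapax_counts_by_book(toy_tokens)
print(toy_summary)
```

In this sketch a book with no hapaxes simply does not appear in the result; a real summary might prefer to list it with a count of zero.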

3. Hapaxes in Job

As noted above, Job is exceptionally rich in hapax legomena. Several factors contribute:

Archaic vocabulary: Job may preserve very ancient Hebrew or reflect a regional dialect
Non-Israelite setting: The characters are associated with Edom and Arabia rather than Israel; the text may incorporate foreign poetic traditions
Specialized themes: Meteorology, cosmology, and animal behavior demand technical vocabulary
Poetic compression: The dialogue sections avoid everyday vocabulary in favor of elevated, rare diction

The top 25 hapaxes in Job (corpus-wide, i.e. words appearing only once in the entire OT):

hapax_table('Job', top_n=25)

4. Hapaxes in Leviticus

Leviticus contains highly specialized cultic and legal vocabulary (terms for sacrificial procedures, skin diseases, priestly garments, and purity regulations) that rarely appears elsewhere in the OT. Many of these terms are technical and may have been used primarily in priestly oral tradition.

The top 20 hapaxes in Leviticus:

hapax_table('Lev', top_n=20)

5. Hapaxes in the NT: Revelation

In the Greek NT, Revelation is notable for its concentration of hapax legomena. John's apocalyptic vision required vocabulary for creatures, catastrophes, and cosmic realities that the Gospels and Paul's epistles have no occasion to use. Some of these NT hapaxes also occur in the LXX, whose language is a major influence on John's biblical idiom.

The top 20 NT hapaxes in Revelation:

hapax_table('Rev', top_n=20, corpus='NT')
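
Raw hapax counts favor long books, so "concentration" is better judged as a rate than as a total. The sketch below shows one way to normalize, as hapaxes per 1,000 running words. The helper `hapax_density` and all of the figures are invented for illustration; they are not measurements of any biblical book:

```python
def hapax_density(hapax_count, total_words, per=1000):
    """Return hapaxes per `per` running words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return per * hapax_count / total_words

# Made-up (hapax_count, total_words) pairs for two hypothetical books.
toy_books = {"LongBook": (120, 20000), "ShortBook": (45, 3000)}

toy_density = {name: hapax_density(h, n)
               for name, (h, n) in toy_books.items()}
print(toy_density)
```

In this toy case the longer book has more hapaxes in total, yet the shorter one has the higher density, which is why the two rankings can disagree.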

6. Filtering by Part of Speech

We can restrict hapax searches to a specific part of speech. Verb hapaxes are especially interesting: when the verbal root is not attested elsewhere either, the action it describes is hard to pin down.

Verb hapaxes across the entire OT:

verb_hapaxes = hapax_legomena(corpus='OT', part_of_speech='Verb')
print(f"OT verb hapaxes: {len(verb_hapaxes)}")
verb_hapaxes.head(20)
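
Conceptually, a part-of-speech filter is a second condition applied on top of the frequency count. The following is a hedged sketch of that idea on made-up data; `hapaxes_by_pos` is a hypothetical helper, and how `hapax_legomena` actually combines the two steps is not shown in this notebook:

```python
from collections import Counter

# Made-up (lemma, part_of_speech) occurrences, for illustration only.
toy_tagged = [
    ("run", "Verb"), ("run", "Verb"), ("glean", "Verb"),
    ("reed", "Noun"), ("reed", "Noun"), ("smelt", "Verb"),
    ("swift", "Adjective"),
]

def hapaxes_by_pos(tagged, part_of_speech):
    """Return lemmas tagged `part_of_speech` that occur once overall."""
    # Count over every token first, then filter, so that frequency is
    # measured against the full corpus rather than one word class.
    freq = Counter(lemma for lemma, _ in tagged)
    return [lemma for lemma, pos in tagged
            if pos == part_of_speech and freq[lemma] == 1]

toy_verb_hapaxes = hapaxes_by_pos(toy_tagged, "Verb")
print(toy_verb_hapaxes)
```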

7. Rare Words (max_count=3)

Extending the definition to words appearing three times or fewer ("dis legomena" and "tris legomena" alongside hapaxes) reveals a broader picture of rare vocabulary. This is useful for identifying specialized domains within a book.

Words in Psalms appearing three times or fewer:

rare_psalms = hapax_legomena(book='Psa', max_count=3)
print(f"Rare words in Psalms (<=3 occurrences in OT): {len(rare_psalms)}")
rare_psalms.head(25)
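
Relaxing the threshold from "exactly once" to "at most N times" is a one-line change to the frequency test. Here is a small sketch under that assumption, with made-up lemmas; `rare_lemmas` and `LABELS` are illustrative names, not part of `bible_grammar`:

```python
from collections import Counter

# Conventional labels for words occurring once, twice, or three times.
LABELS = {1: "hapax legomenon", 2: "dis legomenon", 3: "tris legomenon"}

def rare_lemmas(lemmas, max_count=3):
    """Map each lemma occurring at most `max_count` times to its count."""
    counts = Counter(lemmas)
    return {lemma: n for lemma, n in counts.items() if n <= max_count}

# Toy lemma stream: "a" is common, the rest are rare to varying degrees.
toy_lemmas = ["a"] * 5 + ["b"] * 3 + ["c"] * 2 + ["d"]

toy_rare = rare_lemmas(toy_lemmas, max_count=3)
for lemma, n in toy_rare.items():
    print(lemma, n, LABELS.get(n, "rare"))
```

With `max_count=1` the same function reduces to the strict hapax definition.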

8. Quick Reference

from bible_grammar.hapax import hapax_legomena, hapax_table, hapax_summary

# All OT hapaxes (corpus-wide; one occurrence across entire OT)
hapax_legomena(corpus='OT')

# Hapaxes in a specific book
hapax_legomena(book='Job')
hapax_legomena(book='Rev')          # NT book

# Filter by part of speech
hapax_legomena(corpus='OT', part_of_speech='Verb')
hapax_legomena(corpus='NT', part_of_speech='Noun')

# Rare words (appearing <= N times)
hapax_legomena(corpus='OT', max_count=5)
hapax_legomena(book='Psa', max_count=3)

# Words unique to one book (scope='book')
hapax_legomena(book='Ezk', scope='book')

# Formatted table printed to console
hapax_table(book='Job', top_n=25)
hapax_table('Rev', corpus='NT', top_n=20)

# Summary: hapax count per book
hapax_summary(corpus='OT')          # returns DataFrame sorted by hapax_count desc
hapax_summary(corpus='NT')
hapax_summary(corpus='OT', max_count=3)  # rare words summary
hapax_summary(corpus='OT', part_of_speech='Verb')
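
The difference between a corpus-wide hapax and a word unique to one book is easy to blur. The sketch below illustrates one reading of the distinction on made-up data: it assumes that "unique to one book" means every occurrence of the lemma falls in that book, however many there are. That reading is an assumption, and the exact semantics of `scope='book'` should be checked against the library itself. Both helper functions are hypothetical:

```python
from collections import Counter, defaultdict

# Made-up (book, lemma) occurrences, for illustration only.
toy_tokens = [
    ("Ezk", "wheel"), ("Ezk", "wheel"), ("Ezk", "amber"),
    ("Ezk", "river"), ("Gen", "river"),
]

def corpus_hapaxes_in(tokens, book):
    """Lemmas in `book` that occur exactly once in the whole corpus."""
    freq = Counter(lemma for _, lemma in tokens)
    return sorted(lemma for b, lemma in tokens
                  if b == book and freq[lemma] == 1)

def lemmas_unique_to(tokens, book):
    """Lemmas whose every occurrence is in `book` (assumed reading)."""
    books_by_lemma = defaultdict(set)
    for b, lemma in tokens:
        books_by_lemma[lemma].add(b)
    return sorted(lemma for lemma, books in books_by_lemma.items()
                  if books == {book})

print(corpus_hapaxes_in(toy_tokens, "Ezk"))
print(lemmas_unique_to(toy_tokens, "Ezk"))
```

Here "wheel" occurs twice, so it is not a hapax, but it still counts as unique to the toy book because it never appears anywhere else.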