Hebrew OT Noun Morphology
Statistical analysis of Hebrew noun morphology across the Old Testament. Data: MACULA Hebrew WLC (~468,000 tokens; ~144,000 Hebrew noun tokens).
Hebrew nouns have three morphological categories:
- State: absolute (free) / construct (bound to following genitive) / determined*
- Gender: masculine / feminine
- Number: singular / plural / dual
*Note: the 'determined' state belongs to Aramaic, not Hebrew. Biblical Hebrew marks definiteness with the prefixed definite article ה, not with a distinct state form.
This notebook covers:
- Overall noun statistics and state/gender/number profiles
- State distribution: absolute vs. construct
- Construct chain analysis — top lemmas in construct state
- Gender distribution across books
- Definite article usage by genre
- Top noun lemmas and lemma × state crosstab
- Distribution across OT books and genres
import sys
sys.path.insert(0, '../../../src')
from bible_grammar.ot_noun_profile import (
ot_noun_data, ot_adj_data,
ot_noun_gender_profile, ot_noun_number_profile, ot_noun_state_profile,
ot_noun_gender_state, ot_noun_top_lemmas, ot_noun_lemma_state,
ot_noun_book_distribution, ot_noun_genre_profile,
ot_article_usage, ot_construct_top_lemmas,
print_ot_noun_overview, print_ot_noun_gender, print_ot_noun_state,
print_ot_noun_top_lemmas, print_ot_construct_top_lemmas,
print_ot_noun_genre_profile, print_ot_noun_book_distribution, print_ot_article_usage,
ot_noun_state_chart, ot_noun_gender_chart, ot_noun_genre_heatmap, ot_noun_book_chart,
)
1. Overview
Nouns make up roughly 30% of Hebrew word tokens (~144,000 of ~468,000), which makes them the largest single word class, ahead of verbs and prepositions. There are ~6,100 distinct noun lemmas.
Key facts for BBH students:
- Hebrew does not have a 'determined state' morphology — definiteness is marked by the prefix article ה (class_='art' in MACULA, a separate token)
- Roughly half of all noun tokens appear in the construct state (bound form)
- The masculine gender dominates; feminine is marked (usually by ה- / ת- suffix)
print_ot_noun_overview()
2. State Distribution: Absolute vs. Construct
The construct state marks a noun bound to the next word (its genitive):
- בֵּית יְהוָה = "the house of YHWH" (בֵּית = construct of בַּיִת)
- מֶלֶךְ יִשְׂרָאֵל = "king of Israel"
Construct chains are one of the most common syntactic patterns in Hebrew.
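As a rough illustration of what a state distribution measures, the sketch below tallies absolute vs. construct forms over a handful of hand-tagged tokens. The token list and the `state_shares` helper are hypothetical and exist only for this example; they are not part of `bible_grammar` or of the MACULA data.

```python
from collections import Counter

# Hypothetical toy tokens as (transliterated form, state) pairs.
# These are illustrative only, not real MACULA rows.
toy_nouns = [
    ("bet", "construct"),     # construct form of bayit, "house of"
    ("melek", "construct"),   # "king of"
    ("bayit", "absolute"),
    ("erets", "absolute"),
    ("melek", "absolute"),
]

def state_shares(tokens):
    """Return {state: percentage of tokens}, rounded to one decimal."""
    counts = Counter(state for _, state in tokens)
    total = sum(counts.values())
    return {state: round(100 * n / total, 1) for state, n in counts.items()}

shares = state_shares(toy_nouns)
print(shares)
```

The real profile is presumably computed over all noun tokens (or one book's tokens) in the same spirit: count per state, then divide by the total.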
print_ot_noun_state()
from IPython.display import Image
path = ot_noun_state_chart()
print(f'Saved: {path}')
Image(str(path))
# State profile for Genesis only
print_ot_noun_state('Gen')
3. Construct Chain Analysis
Which nouns appear most frequently in the construct state? These are the "binding" nouns in the most common construct chains.
print_ot_construct_top_lemmas(20)
# Lemma × state crosstab for the top 15 nouns
ct = ot_noun_lemma_state(top_n=15)
ct
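To make the idea of a lemma × state crosstab concrete, here is a minimal standard-library sketch over toy data. It does not call `ot_noun_lemma_state` (whose return type is not shown here); the pair list and both helper functions are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical (lemma, state) pairs using transliterated lemmas.
toy_pairs = [
    ("bayit", "construct"), ("bayit", "construct"), ("bayit", "absolute"),
    ("melek", "construct"), ("melek", "absolute"),
    ("erets", "absolute"),
]

def lemma_state_crosstab(pairs):
    """Build {lemma: Counter({state: count})} from (lemma, state) pairs."""
    table = defaultdict(Counter)
    for lemma, state in pairs:
        table[lemma][state] += 1
    return table

def top_construct(table, n):
    """Return the n lemmas with the most construct-state tokens."""
    ranked = sorted(table.items(), key=lambda kv: kv[1]["construct"], reverse=True)
    return [(lemma, counts["construct"]) for lemma, counts in ranked[:n]]

table = lemma_state_crosstab(toy_pairs)
top = top_construct(table, 2)
print(top)
```

Reading a row of the table left to right shows how strongly a lemma leans toward the bound form; ranking rows by the construct column gives the "binding" nouns discussed above.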
4. Gender Distribution
Hebrew gender is grammatical: nearly every noun has a fixed gender.
- Masculine nouns typically have no special suffix
- Feminine nouns typically end in ה- or ת-
- Some nouns are "both" (common gender — used with either masculine or feminine agreement)
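The suffix tendencies above can be turned into a naive guesser, which is useful mainly for seeing where the rule of thumb fails. This is a sketch under stated assumptions: `strip_points` and `guess_gender` are hypothetical helpers, the rule looks only at the final consonant, and it is not how MACULA assigns gender.

```python
import unicodedata

# Final consonants that typically (not always) mark a feminine noun.
FEMININE_ENDINGS = ("ה", "ת")

def strip_points(word):
    """Remove vowel points and accents (Unicode category Mn) from a word."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def guess_gender(word):
    """Naive guess from the final consonant only; exceptions are common."""
    return "feminine" if strip_points(word).endswith(FEMININE_ENDINGS) else "masculine"

# Illustrative words: queen, daughter, king, night.
examples = ["מַלְכָּה", "בַּת", "מֶלֶךְ", "לַיְלָה"]
guesses = [guess_gender(w) for w in examples]
for word, guess in zip(examples, guesses):
    print(word, guess)
```

The last example is guessed as feminine, yet לַיְלָה (night) is grammatically masculine, and a noun such as אֶרֶץ (land) is feminine without any suffix. That gap is why the profiles in this section rely on the tagged gender rather than on surface form.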
print_ot_noun_gender()
from IPython.display import Image
path = ot_noun_gender_chart()
print(f'Saved: {path}')
Image(str(path))
# Gender × state crosstab
ct = ot_noun_gender_state()
ct
5. Definite Article Usage
In Hebrew the definite article ה is a prefix: it is not a separate word in the MT, but MACULA WLC represents it as a separate token with class_='art'.
The article rate (articles per noun) varies by genre:
- Narrative prose tends to have more anarthrous nouns in construct chains
- Poetry and prophecy often use nouns without an article even in definite contexts
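The article rate is simply article tokens divided by noun tokens within a genre. The sketch below shows that arithmetic; the genre labels and all counts are made-up placeholders, not figures from the corpus, and `article_rate` is a hypothetical helper rather than part of `bible_grammar`.

```python
def article_rate(article_count, noun_count):
    """Articles per noun; returns 0.0 when there are no nouns."""
    if noun_count == 0:
        return 0.0
    return article_count / noun_count

# Placeholder counts for illustration only (NOT real corpus numbers).
toy_counts = {
    "narrative": {"art": 30, "noun": 100},
    "poetry": {"art": 10, "noun": 100},
}

rates = {
    genre: round(article_rate(c["art"], c["noun"]), 2)
    for genre, c in toy_counts.items()
}
print(rates)
```

Because the rate is normalized by noun count, genres of very different sizes can be compared directly.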
print_ot_article_usage()
6. Top Noun Lemmas
The most frequent Hebrew nouns are largely theological and relational: יהוה (LORD), כֹּל (all), בֵּן (son), אֱלֹהִים (God), מֶלֶךְ (king), אֶרֶץ (land/earth).
print_ot_noun_top_lemmas(25)
7. Genre and Book Distribution
print_ot_noun_genre_profile()
from IPython.display import Image
path = ot_noun_genre_heatmap()
print(f'Saved: {path}')
Image(str(path))
print_ot_noun_book_distribution()
from IPython.display import Image
path = ot_noun_book_chart()
print(f'Saved: {path}')
Image(str(path))
Quick Reference
from bible_grammar.ot_noun_profile import (
# Data
ot_noun_data, # Hebrew noun tokens (book= optional)
ot_adj_data, # Hebrew adjective tokens
ot_noun_state_profile, # absolute/construct distribution
ot_noun_gender_profile, # masculine/feminine distribution
ot_noun_number_profile, # singular/plural/dual distribution
ot_noun_gender_state, # gender × state crosstab
ot_noun_top_lemmas, # top-n most frequent noun lemmas
ot_noun_lemma_state, # lemma × state crosstab
ot_construct_top_lemmas, # nouns most often in construct state
ot_article_usage, # article/noun ratio by genre
ot_noun_genre_profile, # state % by genre
ot_noun_book_distribution, # count + % per OT book
# Print
print_ot_noun_overview,
print_ot_noun_state,
print_ot_noun_gender,
print_ot_noun_top_lemmas,
print_ot_construct_top_lemmas,
print_ot_noun_genre_profile,
print_ot_noun_book_distribution,
print_ot_article_usage,
# Charts
ot_noun_state_chart,
ot_noun_gender_chart,
ot_noun_genre_heatmap,
ot_noun_book_chart,
)