Greek NT Style and Authorship Analysis
Quantitative stylometric profiling of New Testament books, using metrics that capture vocabulary richness, syntactic complexity, and morphological fingerprints.
Greek style metrics:
| Metric | What it captures |
|---|---|
| TTR | Type-token ratio — raw vocabulary richness |
| MSTTR (1k window) | Mean segmental TTR — fair cross-length comparison |
| Hapax density % | Rare/unique vocabulary |
| Ptc-to-finite ratio | Participial style vs. parataxis (high in Luke/Hebrews, low in Mark) |
| Optative/1k | Classical register marker (Luke-Acts is dominant) |
| ἵνα/1k | Purpose/content clause density (Johannine signature) |
| Infinitive/1k | Verbal noun usage |
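The three vocabulary metrics in the table can be sketched in a few lines of plain Python. This is a minimal illustration, not the `bible_grammar` implementation: the names `ttr`, `msttr_sketch`, and `hapax_density_pct` are hypothetical, and the sketch assumes that surface forms are compared, that a trailing partial window is dropped, and that hapax density is measured per token.

```python
from collections import Counter


def ttr(tokens):
    """Type-token ratio: distinct forms divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def msttr_sketch(tokens, window=1000):
    """Mean TTR over consecutive, non-overlapping windows of equal size.

    Assumption: a trailing partial window is dropped so that every
    segment has the same length. Returns None if the text is shorter
    than one window.
    """
    segments = [tokens[i:i + window]
                for i in range(0, len(tokens) - window + 1, window)]
    if not segments:
        return None
    return sum(ttr(seg) for seg in segments) / len(segments)


def hapax_density_pct(tokens):
    """Percentage of tokens whose form occurs exactly once in the text."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return 100.0 * hapaxes / len(tokens)


# John 1:1 without punctuation, used only as a tiny demonstration text.
demo = ("ἐν ἀρχῇ ἦν ὁ λόγος καὶ ὁ λόγος ἦν πρὸς τὸν θεόν "
        "καὶ θεὸς ἦν ὁ λόγος").split()
print(f"TTR      = {ttr(demo):.3f}")
print(f"MSTTR(5) = {msttr_sketch(demo, window=5):.3f}")
print(f"Hapax %  = {hapax_density_pct(demo):.1f}")
```

Raw TTR falls as a text gets longer because common words keep repeating, which is why a fixed-window average is the fairer figure when comparing books of very different lengths.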
Classic authorship questions:
- Luke-Acts stylistic unity
- Undisputed Pauline letters (Galatians/Romans) vs. disputed ones (Ephesians/Pastorals)
- Johannine fingerprint across Gospel/Epistles/Revelation
References: Kenny, A Stylometric Study of the New Testament (1986); Turner, Style, vol. 4 of Moulton, A Grammar of New Testament Greek (1976).
import sys
sys.path.insert(0, '../../../src')
from bible_grammar import (
    msttr, book_style_profile, style_comparison,
    print_style_profile, print_style_comparison,
    style_radar_chart, style_heatmap,
)
import pandas as pd
1. Overview — Individual Book Profile
# Luke — known for elevated style
print_style_profile('Luk', lang='G')
# Mark — paratactic, simple
print_style_profile('Mrk', lang='G')
# Hebrews — most literary Greek in the NT
print_style_profile('Heb', lang='G')
2. Participle Density — Elevated vs. Paratactic Greek
The participle-to-finite-verb ratio distinguishes:
- High (Hebrews, Luke, Acts, 1 Peter) — elevated, literary Greek
- Low (Mark, Revelation) — paratactic, Semitic-influenced Greek
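The ratio itself is simple to compute once each verb carries a mood label. The sketch below is illustrative only: the mood labels and the function name `ptc_to_finite_ratio_sketch` are hypothetical, a real morphologically tagged corpus will use its own codes, and the two mood sequences are invented to show the contrast rather than drawn from any NT passage.

```python
from collections import Counter

# Hypothetical mood labels; substitute the codes of the corpus in use.
FINITE_MOODS = {"indicative", "subjunctive", "imperative", "optative"}


def ptc_to_finite_ratio_sketch(moods):
    """Participles divided by finite verbs for a sequence of mood labels.

    Infinitives (and any other label) count toward neither side.
    Returns NaN when there are no finite verbs to divide by.
    """
    counts = Counter(moods)
    finite = sum(counts[m] for m in FINITE_MOODS)
    if finite == 0:
        return float("nan")
    return counts["participle"] / finite


# Invented sequences: one participle-heavy period, one chain of
# coordinated finite verbs.
hypotactic = ["participle", "participle", "indicative", "infinitive"]
paratactic = ["indicative", "indicative", "indicative", "participle"]

print(ptc_to_finite_ratio_sketch(hypotactic))
print(ptc_to_finite_ratio_sketch(paratactic))
```

A writer who subordinates action with participles pushes the ratio up, while a writer who strings finite verbs together with καί pulls it down.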
gospels_acts = ['Mat', 'Mrk', 'Luk', 'Jhn', 'Act']
print_style_comparison(gospels_acts, lang='G')
# NT books ranked by participle density
# (2 John, 3 John, and Jude are not in this list)
all_nt = ['Mat', 'Mrk', 'Luk', 'Jhn', 'Act', 'Rom', '1Co', '2Co',
          'Gal', 'Eph', 'Php', 'Col', '1Th', '2Th', '1Ti', '2Ti',
          'Tit', 'Phm', 'Heb', 'Jas', '1Pe', '2Pe', '1Jn', 'Rev']
df_all = style_comparison(all_nt, lang='G')
df_all[['total_tokens', 'ttr', 'msttr_1k', 'ptc_to_finite_ratio', 'hina_per1k']].sort_values(
    'ptc_to_finite_ratio', ascending=False
)
3. Pauline Letters — Authentic vs. Disputed
Romans, 1–2 Corinthians, Galatians, Philippians, and 1 Thessalonians are widely accepted as authentically Pauline; Ephesians, Colossians, 2 Thessalonians, and the Pastorals (1–2 Tim, Tit) are disputed. Do the stylometric metrics separate the two groups?
pauline = ['Rom', '1Co', '2Co', 'Gal', 'Php', '1Th',  # undisputed
           'Eph', 'Col', '2Th',                       # disputed
           '1Ti', '2Ti', 'Tit']                       # pastoral
print_style_comparison(pauline, lang='G')
# Radar: undisputed vs. disputed Pauline
style_radar_chart(['Rom', '1Co', 'Gal', 'Eph', '1Ti', 'Tit'], lang='G')
# Heatmap of all Pauline letters
style_heatmap(pauline, lang='G')
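One way to put a number on "clustering" is to standardize each metric across books and compare the resulting profiles by Euclidean distance. The sketch below uses only the standard library and is a rough illustration under stated assumptions: the function names are hypothetical, the labels `text_A` to `text_D` and their values are placeholders rather than measurements, and real profiles would have to be taken from the output of `style_comparison`.

```python
import math
from statistics import mean, pstdev


def zscore_columns(profiles):
    """Standardize each metric across texts to mean 0 and std 1.

    `profiles` maps a text label to a list of metric values, with the
    metrics in the same order for every text.
    """
    names = list(profiles)
    columns = list(zip(*profiles.values()))
    scaled = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col)
        # A constant metric carries no information; map it to zeros.
        scaled.append([(x - mu) / sigma if sigma else 0.0 for x in col])
    return {name: [col[i] for col in scaled] for i, name in enumerate(names)}


def nearest_neighbour(profiles):
    """For each text, the label of the closest other text in z-score space."""
    z = zscore_columns(profiles)
    return {
        a: min((b for b in z if b != a), key=lambda b: math.dist(z[a], z[b]))
        for a in z
    }


# Placeholder values only: two metrics per text, not real NT measurements.
toy_profiles = {
    "text_A": [0.40, 1.0],
    "text_B": [0.42, 1.1],
    "text_C": [0.60, 3.0],
    "text_D": [0.62, 3.2],
}
print(nearest_neighbour(toy_profiles))
```

Standardizing first keeps a metric with a large numeric range, such as a per-1k rate, from swamping a ratio that lives between 0 and 1.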
4. Luke-Acts Stylistic Unity
Luke and Acts are attributed to the same author. The optative mood and elevated participle usage are known Lukan signatures. Do the metrics show them clustering together?
# Luke-Acts vs. other Gospels
print_style_comparison(['Mat', 'Mrk', 'Luk', 'Jhn', 'Act'], lang='G')
style_radar_chart(['Luk', 'Act', 'Mrk', 'Jhn', 'Heb'], lang='G')
5. Johannine Fingerprint
John's Gospel, 1–3 John, and Revelation are all associated with the Johannine tradition. ἵνα density is a known Johannine marker. How stylistically similar are they?
johannine = ['Jhn', '1Jn', '2Jn', '3Jn', 'Rev']
print_style_comparison(johannine, lang='G')
style_radar_chart(johannine, lang='G')
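A per-1k rate such as ἵνα density is a count scaled by text length. The sketch below is a minimal illustration with a hypothetical function name, `rate_per_1k`; it assumes matching on the surface form, which is adequate for an indeclinable word like ἵνα but would need lemma matching for inflected words. Both sides are normalized to Unicode NFC because polytonic Greek is often stored with either precomposed or combining accents.

```python
import unicodedata


def rate_per_1k(tokens, target):
    """Occurrences of `target` per 1,000 tokens.

    Forms are compared after NFC normalization and lowercasing so that
    differently encoded accents and sentence-initial capitals still match.
    """
    if not tokens:
        return 0.0

    def norm(s):
        return unicodedata.normalize("NFC", s).lower()

    wanted = norm(target)
    hits = sum(1 for t in tokens if norm(t) == wanted)
    return 1000.0 * hits / len(tokens)


# The same word in decomposed and precomposed encodings counts as one form.
decomposed = unicodedata.normalize("NFD", "ἵνα")
print(rate_per_1k([decomposed, "καὶ", "ἵνα", "ὁ"], "ἵνα"))
```

Without the normalization step, two visually identical tokens can compare as unequal and silently lower the rate.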
6. Vocabulary Richness — MSTTR Across the NT
df_all[['total_tokens', 'ttr', 'msttr_1k', 'hapax_density_pct']].sort_values(
'msttr_1k', ascending=False
)
# Hebrews is often cited as the most literary NT book — confirm with MSTTR
for book in ['Heb', 'Luk', 'Act', 'Rom', 'Rev', 'Mrk']:
    val = msttr(book, lang='G', window=500)
    print(f"{book}: MSTTR(500) = {val}")
7. Full NT Heatmap
nt_sample = [
    'Mat', 'Mrk', 'Luk', 'Jhn', 'Act',
    'Rom', 'Gal', 'Eph', '1Ti',
    'Heb', 'Jas', '1Pe', '1Jn', 'Rev'
]
style_heatmap(nt_sample, lang='G')