Greek NT Style and Authorship Analysis

Quantitative stylometric profiling of New Testament books, using metrics that capture vocabulary richness, syntactic complexity, and morphological fingerprints.

Greek style metrics:

Metric	What it captures
TTR	Type-token ratio — raw vocabulary richness
MSTTR (1k window)	Mean segmental TTR — fair cross-length comparison
Hapax density %	Rare/unique vocabulary
Ptc-to-finite ratio	Participial style vs. parataxis (high in Luke/Hebrews, low in Mark)
Optative/1k	Classical register marker (Luke-Acts is dominant)
ἵνα/1k	Purpose/content clause density (Johannine signature)
Infinitive/1k	Verbal noun usage
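
As a rough illustration of the first three metrics in the table, the sketch below computes TTR, a windowed MSTTR, and hapax density on a plain token list. It is not the `bible_grammar` implementation: the function names, the choice to drop a trailing partial window, and the definition of hapax density as hapaxes per 100 tokens are all assumptions made for this example.

```python
from collections import Counter


def ttr(tokens):
    """Type-token ratio: distinct forms divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def msttr_sketch(tokens, window=1000):
    """Mean TTR over consecutive full windows (assumption: a trailing
    partial window is dropped). Returns None if the text is shorter
    than one window."""
    starts = range(0, len(tokens) - window + 1, window)
    segments = [tokens[i:i + window] for i in starts]
    if not segments:
        return None
    return sum(ttr(seg) for seg in segments) / len(segments)


def hapax_density_pct(tokens):
    """Forms occurring exactly once, per 100 tokens (assumed definition)."""
    counts = Counter(tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return 100 * hapaxes / len(tokens) if tokens else 0.0


# Toy token list standing in for a lemmatized text (not real data)
toy = ["a", "b", "a", "c", "a", "b", "d", "e"]
toy_ttr = ttr(toy)                        # 5 types / 8 tokens
toy_msttr = msttr_sketch(toy, window=4)   # mean of 3/4 and 4/4
toy_hapax = hapax_density_pct(toy)        # 3 hapaxes / 8 tokens
print(toy_ttr, toy_msttr, toy_hapax)      # 0.625 0.875 37.5
```

Because raw TTR falls as a text gets longer, averaging over fixed-size windows is what makes a short letter and a long Gospel comparable. A text shorter than the window has no full segment, which this sketch reports as `None`.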

Classic authorship questions:

Luke-Acts stylistic unity
Pauline authentic (Galatians/Romans) vs. disputed (Ephesians/Pastorals)
Johannine fingerprint across Gospel/Epistles/Revelation

References: Kenny, A Stylometric Study of the New Testament (1986); Moulton and Turner, A Grammar of New Testament Greek, Vol. 4: Style (1976).

import sys
sys.path.insert(0, '../../../src')

from bible_grammar import (
    msttr, book_style_profile, style_comparison,
    print_style_profile, print_style_comparison,
    style_radar_chart, style_heatmap,
)
import pandas as pd

1. Overview: Individual Book Profile

# Luke: known for elevated style
print_style_profile('Luk', lang='G')

# Mark: paratactic, simple
print_style_profile('Mrk', lang='G')

# Hebrews: often regarded as the most literary Greek in the NT
print_style_profile('Heb', lang='G')

2. Participle Density: Elevated vs. Paratactic Greek

The participle-to-finite-verb ratio distinguishes:

High (Hebrews, Luke, Acts, 1 Peter) — elevated, literary Greek
Low (Mark, Revelation) — paratactic, Semitic-influenced Greek
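
To make the ratio concrete, here is a minimal sketch that computes it from a sequence of verb mood labels. The label names, the set of moods counted as finite, and the helper names are hypothetical; the actual tagging scheme behind `ptc_to_finite_ratio` in `bible_grammar` may differ.

```python
# Moods treated as finite verbs in this sketch (an assumption);
# participles and infinitives are the non-finite forms.
FINITE_MOODS = {"indicative", "subjunctive", "optative", "imperative"}


def ptc_to_finite_ratio(moods):
    """Participles divided by finite verbs; None if there are no finite verbs."""
    participles = sum(1 for m in moods if m == "participle")
    finite = sum(1 for m in moods if m in FINITE_MOODS)
    return participles / finite if finite else None


def per_1k(count, total_tokens):
    """Occurrences per 1,000 tokens, the scale used for the */1k metrics."""
    return 1000 * count / total_tokens if total_tokens else 0.0


# Hand-made mood sequence for illustration only (not drawn from any book)
moods = ["participle", "indicative", "indicative", "participle",
         "participle", "infinitive", "subjunctive", "indicative"]
ratio = ptc_to_finite_ratio(moods)   # 3 participles / 4 finite verbs
print(ratio)                         # 0.75
print(per_1k(2, 500))                # 4.0
```

A ratio above 1 would mean more participles than finite verbs, the hypotactic pattern described above. A text that strings finite verbs together with καί pushes the ratio down.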

gospels_acts = ['Mat', 'Mrk', 'Luk', 'Jhn', 'Act']
print_style_comparison(gospels_acts, lang='G')

# Most of the NT, sorted by participle density
# (2 John, 3 John, and Jude are not in this list)
all_nt = ['Mat', 'Mrk', 'Luk', 'Jhn', 'Act', 'Rom', '1Co', '2Co',
          'Gal', 'Eph', 'Php', 'Col', '1Th', '2Th', '1Ti', '2Ti',
          'Tit', 'Phm', 'Heb', 'Jas', '1Pe', '2Pe', '1Jn', 'Rev']
df_all = style_comparison(all_nt, lang='G')
df_all[['total_tokens', 'ttr', 'msttr_1k', 'ptc_to_finite_ratio', 'hina_per1k']].sort_values(
    'ptc_to_finite_ratio', ascending=False
)

3. Pauline Letters: Authentic vs. Disputed

Romans, 1–2 Corinthians, Galatians, Philippians, and 1 Thessalonians are widely accepted as Pauline; Ephesians, Colossians, 2 Thessalonians, and the Pastorals (1–2 Tim, Tit) are disputed. Do the stylometric metrics separate the two groups?

pauline = ['Rom', '1Co', '2Co', 'Gal', 'Php', '1Th',  # undisputed
           'Eph', 'Col', '2Th',                          # disputed
           '1Ti', '2Ti', 'Tit']                          # pastoral
print_style_comparison(pauline, lang='G')

# Radar: undisputed vs. disputed Pauline
style_radar_chart(['Rom', '1Co', 'Gal', 'Eph', '1Ti', 'Tit'], lang='G')

# Heatmap of all Pauline letters
style_heatmap(pauline, lang='G')
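
One way to turn the clustering question into a number is to z-score each metric across the letters and measure the Euclidean distance between profiles. The sketch below shows the idea on placeholder values; the numbers are invented for illustration and are not measurements of any book, and the helper names are hypothetical. If `style_comparison` returns a DataFrame indexed by book (an assumption), something like `df.to_dict('index')` would give the same nested-dict shape.

```python
from math import sqrt
from statistics import mean, pstdev


def zscore_profiles(profiles):
    """Rescale every metric to mean 0 and standard deviation 1 across texts,
    so metrics on different scales contribute comparably."""
    metrics = sorted(next(iter(profiles.values())))
    scaled = {name: {} for name in profiles}
    for metric in metrics:
        values = [profiles[name][metric] for name in profiles]
        mu, sigma = mean(values), pstdev(values)
        for name in profiles:
            raw = profiles[name][metric]
            scaled[name][metric] = (raw - mu) / sigma if sigma else 0.0
    return scaled


def style_distance(scaled, a, b):
    """Euclidean distance between two z-scored profiles."""
    return sqrt(sum((scaled[a][m] - scaled[b][m]) ** 2 for m in scaled[a]))


# Placeholder profiles: invented numbers, NOT real measurements
placeholder = {
    "text_A": {"ptc_to_finite_ratio": 1.0, "hina_per1k": 10.0},
    "text_B": {"ptc_to_finite_ratio": 2.0, "hina_per1k": 20.0},
    "text_C": {"ptc_to_finite_ratio": 3.0, "hina_per1k": 60.0},
}
scaled = zscore_profiles(placeholder)
d_ab = style_distance(scaled, "text_A", "text_B")
d_ac = style_distance(scaled, "text_A", "text_C")
print(d_ab < d_ac)  # True: text_B sits closer to text_A than text_C does
```

Small distances within the undisputed group and larger distances to the disputed letters would be consistent with a stylistic split. Genre, length, and topic also move these metrics, so a distance alone does not settle authorship.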

4. Luke-Acts Stylistic Unity

Luke and Acts are attributed to the same author. The optative mood and elevated participle usage are known Lukan signatures. Do the metrics show them clustering together?

# Luke-Acts vs. other Gospels
print_style_comparison(['Mat', 'Mrk', 'Luk', 'Jhn', 'Act'], lang='G')

style_radar_chart(['Luk', 'Act', 'Mrk', 'Jhn', 'Heb'], lang='G')

5. Johannine Fingerprint

John's Gospel, 1–3 John, and Revelation are all associated with the Johannine tradition. ἵνα density is a known Johannine marker. How stylistically similar are they?

johannine = ['Jhn', '1Jn', '2Jn', '3Jn', 'Rev']
print_style_comparison(johannine, lang='G')

style_radar_chart(johannine, lang='G')
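
The ἵνα density used as a Johannine marker is a per-1,000-token rate. The sketch below shows the arithmetic on a single verse (John 3:16) by matching surface forms after Unicode normalization. The helper names are hypothetical, and a real pipeline would more likely count lemmas over a whole book than whitespace-split words in one verse.

```python
import unicodedata


def normalize(token):
    """NFC-normalize, strip surrounding punctuation, and lowercase a token."""
    return unicodedata.normalize("NFC", token).strip(".,;·'’").lower()


def rate_per_1k(tokens, target):
    """Occurrences of `target` per 1,000 tokens (surface-form match)."""
    target = normalize(target)
    hits = sum(1 for t in tokens if normalize(t) == target)
    return 1000 * hits / len(tokens) if tokens else 0.0


# John 3:16, used only to show the calculation
verse = ("οὕτως γὰρ ἠγάπησεν ὁ θεὸς τὸν κόσμον, ὥστε τὸν υἱὸν τὸν μονογενῆ "
         "ἔδωκεν, ἵνα πᾶς ὁ πιστεύων εἰς αὐτὸν μὴ ἀπόληται ἀλλ' ἔχῃ ζωὴν αἰώνιον.")
tokens = verse.split()
hina_rate = rate_per_1k(tokens, "ἵνα")
print(len(tokens), hina_rate)  # 25 40.0
```

A rate from one verse is not a meaningful estimate; it only shows how one occurrence in 25 tokens scales to 40 per 1,000. Normalizing to NFC before comparing avoids mismatches between precomposed and combining-accent encodings of polytonic Greek.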

6. Vocabulary Richness: MSTTR Across the NT

df_all[['total_tokens', 'ttr', 'msttr_1k', 'hapax_density_pct']].sort_values(
    'msttr_1k', ascending=False
)

# Hebrews is often cited as the most literary NT book; check whether MSTTR
# agrees, here with a 500-token window instead of the 1k window used above
for book in ['Heb', 'Luk', 'Act', 'Rom', 'Rev', 'Mrk']:
    val = msttr(book, lang='G', window=500)
    print(f"{book}: MSTTR(500) = {val}")

7. Full NT Heatmap

nt_sample = [
    'Mat', 'Mrk', 'Luk', 'Jhn', 'Act',
    'Rom', 'Gal', 'Eph', '1Ti',
    'Heb', 'Jas', '1Pe', '1Jn', 'Rev'
]
style_heatmap(nt_sample, lang='G')