BM25 Ranking Visualizer

Explore how term frequency, document frequency, and document length normalization interact in BM25 scoring.

1

Input

Documents

Hyperparameters

Term frequency saturation (k1)

Typical range: balanced term frequency saturation

Length normalization (b)

Typical range: moderate length normalization
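
For orientation, the two hyperparameters above enter the standard Okapi BM25 score as sketched below. The symbols k_1 (term frequency saturation), b (length normalization), f(t, D) (count of term t in document D), and |D| (document length in words) follow the usual textbook convention and are assumptions here, since the page does not show the full formula. The IDF factor is the one defined in the IDF Explorer.

$$\text{score}(D, Q) = \sum_{t \in Q} \text{IDF}(t) \cdot \frac{f(t, D)\,(k_1 + 1)}{f(t, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$

A larger k_1 lets repeated occurrences of a term keep adding to the score for longer before it saturates. With b = 0 document length is ignored, and with b = 1 the term frequency is fully normalized by |D| / avgdl.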

2

Corpus Statistics

Total Documents (N)

4

Avg Document Length (avgdl)

14.0

= total words / 4 docs
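
The corpus statistics above can be reproduced from the four documents listed under Document Rankings. The following is a minimal sketch that assumes lowercase, whitespace-based tokenization; the visualizer's actual tokenizer is not shown, and `tokenize` is a hypothetical helper.

```python
# The four documents shown in the Document Rankings panel.
docs = [
    "The quick brown fox jumps over the lazy dog in the sunny meadow",
    "A fast fox runs through the forest chasing a rabbit through the dense trees",
    "The lazy dog sleeps all day long by the warm fireplace and never chases anything",
    "Foxes and dogs are both popular animals but foxes are much harder to domesticate",
]

def tokenize(text):
    # Assumption: lowercase and split on whitespace (no stemming, no stop words).
    return text.lower().split()

lengths = [len(tokenize(d)) for d in docs]  # words per document
N = len(docs)                               # total documents
avgdl = sum(lengths) / N                    # average document length

print(lengths, N, avgdl)  # [13, 14, 15, 14] 4 14.0
```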

IDF Explorer

Query term: quick
$$\text{IDF}(t) = \ln\!\left(\frac{N - df(t) + 0.5}{df(t) + 0.5} + 1\right)$$
$$= \ln\!\left(\frac{4 - 1 + 0.5}{1 + 0.5} + 1\right) = 1.204$$

"quick" appears in 1 of 4 documents (rare). IDF = 1.204: high weight — discriminating term.

Color Legend

Term Frequency (TF), Inverse Document Frequency (IDF), Document Length Normalization
3

Document Rankings

1
13 words

The quick brown fox jumps over the lazy dog in the sunny meadow

2
14 words

A fast fox runs through the forest chasing a rabbit through the dense trees

3
15 words

The lazy dog sleeps all day long by the warm fireplace and never chases anything

4
14 words

Foxes and dogs are both popular animals but foxes are much harder to domesticate
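
To see how the three components combine into a ranking, here is a minimal BM25 sketch over the four documents above. It rests on several assumptions that the page does not confirm: lowercase whitespace tokenization without stemming (so "foxes" does not match "fox"), the commonly used defaults k1 = 1.2 and b = 0.75, and an example query "quick fox". The names `tokenize` and `bm25_scores` are hypothetical.

```python
import math
from collections import Counter

docs = [
    "The quick brown fox jumps over the lazy dog in the sunny meadow",
    "A fast fox runs through the forest chasing a rabbit through the dense trees",
    "The lazy dog sleeps all day long by the warm fireplace and never chases anything",
    "Foxes and dogs are both popular animals but foxes are much harder to domesticate",
]

def tokenize(text):
    # Assumption: lowercase and split on whitespace (no stemming).
    return text.lower().split()

def bm25_scores(query, docs, k1=1.2, b=0.75):
    # k1 and b are assumed defaults, not values taken from the page.
    tokenized = [tokenize(d) for d in docs]
    N = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / N
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Length normalization factor: above 1 for long documents, below 1 for short ones.
        norm = 1 - b + b * len(tokens) / avgdl
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for t in tokenized if term in t)       # document frequency
            idf = math.log((N - df + 0.5) / (df + 0.5) + 1)   # inverse document frequency
            f = tf[term]                                      # term frequency in this document
            score += idf * f * (k1 + 1) / (f + k1 * norm)
        scores.append(score)
    return scores

scores = bm25_scores("quick fox", docs)
# Sort document indices by descending score (ties keep their original order).
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
for rank, i in enumerate(ranking, start=1):
    print(rank, round(scores[i], 3), docs[i])
```

Under these assumptions the first document ranks highest because it matches both query terms, and the second follows with its single match on "fox". The last two documents score zero because neither "quick" nor "fox" appears in them as an exact token.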