Authors: Srijith Rajamohan, Ahmed Salhin, Josh Frazier, Rohit Kumar, Yu-Cheng Tsai, Todd Cook
Abstract: The output of Large Language Models (LLMs) is a function of the
model's internal parameters and the input provided in the context window. The
hypothesis presented here is that, under a greedy sampling strategy, the
variance in the LLM's output is a function of the conceptual certainty embedded
in the model's parametric knowledge, as well as the lexical variance in the
input. Fine-tuning the model reduces the sensitivity of the model's output to
lexical variations in the input. This hypothesis is then applied to a
classification problem, and a probabilistic method is proposed for estimating
the certainties of the predicted classes.
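The abstract's idea of estimating class certainties from output variance under lexical input variation can be sketched as follows. This is a minimal illustration, not the paper's actual method: it assumes a greedy-decoded LLM classifier is queried with several paraphrases of the same input, and treats the empirical frequency of each predicted label as a certainty estimate. The labels and predictions below are hypothetical.

```python
from collections import Counter

def class_certainties(predictions):
    """Estimate per-class certainty as the empirical frequency of each
    label predicted across lexical paraphrases of the same input."""
    counts = Counter(predictions)
    total = len(predictions)
    return {label: n / total for label, n in counts.items()}

# Hypothetical greedy-decoded predictions for five paraphrases of one input.
preds = ["positive", "positive", "positive", "negative", "positive"]
print(class_certainties(preds))  # {'positive': 0.8, 'negative': 0.2}
```

Under this framing, a fine-tuned model would yield more uniform predictions across paraphrases, concentrating the estimated probability mass on a single class.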
Source: http://arxiv.org/abs/2502.08631v1