Mitigating Hallucination in Large Models: A Modular Framework for Detection and Counterfactual Correction

dc.contributor.author: Nyantakyi, A.B.
dc.date.accessioned: 2026-03-02T15:16:44Z
dc.date.available: 2026-03-02T15:16:44Z
dc.date.issued: 2025-12
dc.description.abstract: Large Language Models (LLMs) demonstrate impressive fluency yet remain unreliable in safety-critical environments due to persistent hallucination: confidently generating factually incorrect or semantically unsupported answers. This research proposes a modular mitigation framework integrating Hallucination Potential Minimization (HPM) with Self-Generated Counterfactual Training (SGCT) to improve factual consistency in generative outputs. A lightweight DistilBERT-based HPM classifier was trained as a binary factuality judge on benchmark datasets including FEVER and TruthfulQA, prioritising recall to ensure conservative hallucination detection. Building on this foundation, SGCT fine-tuned a GPT-2 generative model rather than more recent architectures because of its computational accessibility, reproducibility, and suitability for controlled experimentation under resource constraints. SGCT combines a likelihood loss on factual responses, an unlikelihood loss to penalise hallucinations, and a contrastive objective that separates factual and hallucinated answer representations in an embedding space. Experimental results demonstrated measurable improvements following SGCT: accuracy increased from 0.556 to 0.614, recall from 0.705 to 0.890, precision from 0.532 to 0.548, and F1-score from 0.607 to 0.692. Threshold calibration further revealed flexible trade-offs between factuality and output strictness, enabling uncertain responses to be routed into a safe "abstain" category. The findings indicate that classifier-guided generation provides a practical strategy for enhancing reliability in LLM-based systems while maintaining computational efficiency. The proposed SGCT-HPM pipeline represents a reproducible and adaptable approach to hallucination mitigation, with potential applications in domains requiring verifiable AI-generated content.
dc.identifier.uri: https://space.uenr.edu.gh//handle/123456789/63
dc.language.iso: en
dc.publisher: UENR
dc.subject: Hallucinations
dc.subject: Datasets
dc.subject: Minimization
dc.title: Mitigating Hallucination in Large Models: A Modular Framework for Detection and Counterfactual Correction
dc.type: Thesis
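The SGCT objective summarised in the abstract, a likelihood loss on factual answers, an unlikelihood loss on hallucinated ones, and a contrastive term separating their embeddings, together with the threshold-calibrated "abstain" routing, can be sketched in plain Python. All function names, loss weights, and threshold values below are illustrative assumptions, not the thesis implementation:

```python
import math

def sgct_loss(p_factual, p_hallucinated, emb_factual, emb_hallucinated,
              alpha=1.0, beta=1.0, margin=1.0):
    """Illustrative combination of the three SGCT terms (weights are assumed)."""
    # Likelihood loss: reward high model probability on the factual answer.
    likelihood = -math.log(p_factual)
    # Unlikelihood loss: penalise probability mass on the hallucinated answer.
    unlikelihood = -math.log(1.0 - p_hallucinated)
    # Contrastive term: push the factual and hallucinated answer embeddings
    # at least `margin` apart; zero once they are sufficiently separated.
    dist = math.dist(emb_factual, emb_hallucinated)
    contrastive = max(0.0, margin - dist) ** 2
    return likelihood + alpha * unlikelihood + beta * contrastive

def route(factuality_score, accept=0.7, reject=0.3):
    """Threshold-calibrated routing with an 'abstain' band (thresholds are assumed)."""
    if factuality_score >= accept:
        return "answer"
    if factuality_score <= reject:
        return "reject"
    return "abstain"
```

Raising the `accept` threshold trades coverage for strictness, which mirrors the factuality-versus-strictness trade-off the abstract reports from threshold calibration.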

Files

Original bundle:
Mitigating Hallucination in Large Models_A Modular Framework for Detection and Counterfactual Correction.pdf (1.32 MB, Adobe Portable Document Format)

License bundle:
license.txt (1.71 KB, item-specific license agreed to upon submission)