Mitigating Hallucination in Large Models: A Modular Framework for Detection and Counterfactual Correction

dc.contributor.author: Nyantakyi, A.B.
dc.date.accessioned: 2026-03-02T15:16:44Z
dc.date.available: 2026-03-02T15:16:44Z
dc.date.issued: 2025-12
dc.description.abstract: Large Language Models (LLMs) demonstrate impressive fluency yet remain unreliable in safety-critical environments due to persistent hallucination: confidently generating factually incorrect or semantically unsupported answers. This research proposes a modular mitigation framework integrating Hallucination Potential Minimization (HPM) with Self-Generated Counterfactual Training (SGCT) to improve factual consistency in generative outputs. A lightweight DistilBERT-based HPM classifier was trained as a binary factuality judge on benchmark datasets including FEVER and TruthfulQA, prioritising recall to ensure conservative hallucination detection. Building on this foundation, SGCT fine-tuned a GPT-2 generative model rather than more recent architectures because of its computational accessibility, reproducibility, and suitability for controlled experimentation under resource constraints. SGCT combines a likelihood loss on factual responses, an unlikelihood loss to penalise hallucinations, and a contrastive objective that separates factual and hallucinated answer representations in an embedding space. Experimental results demonstrated measurable improvements following SGCT: accuracy increased from 0.556 to 0.614, recall from 0.705 to 0.890, precision from 0.532 to 0.548, and F1-score from 0.607 to 0.692. Threshold calibration further revealed flexible trade-offs between factuality and output strictness, enabling uncertain responses to be routed into a safe "abstain" category. The findings indicate that classifier-guided generation provides a practical strategy for enhancing reliability in LLM-based systems while maintaining computational efficiency. The proposed SGCT-HPM pipeline represents a reproducible and adaptable approach to hallucination mitigation, with potential applications in domains requiring verifiable AI-generated content.
dc.identifier.uri: https://space.uenr.edu.gh//handle/123456789/63
dc.language.iso: en
dc.publisher: UENR
dc.subject: Hallucinations
dc.subject: Datasets
dc.subject: Minimization
dc.title: Mitigating Hallucination in Large Models: A Modular Framework for Detection and Counterfactual Correction
dc.type: Thesis
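The SGCT objective summarised in the abstract, a likelihood loss on factual answers, an unlikelihood loss on hallucinated ones, and a contrastive term separating their embeddings, together with the threshold-calibrated "abstain" routing, can be sketched in plain Python. All function names, loss weights, and threshold values below are illustrative assumptions, not the thesis implementation:

```python
import math

def sgct_loss(p_factual, p_hallucinated, emb_factual, emb_hallucinated,
              alpha=1.0, beta=1.0, margin=1.0):
    """Illustrative combination of the three SGCT terms (weights are assumed)."""
    # Likelihood loss: reward high model probability on the factual answer.
    likelihood = -math.log(p_factual)
    # Unlikelihood loss: penalise probability mass on the hallucinated answer.
    unlikelihood = -math.log(1.0 - p_hallucinated)
    # Contrastive term: push the factual and hallucinated answer embeddings
    # at least `margin` apart; zero once they are sufficiently separated.
    dist = math.dist(emb_factual, emb_hallucinated)
    contrastive = max(0.0, margin - dist) ** 2
    return likelihood + alpha * unlikelihood + beta * contrastive

def route(factuality_score, accept=0.7, reject=0.3):
    """Threshold-calibrated routing with an 'abstain' band (thresholds are assumed)."""
    if factuality_score >= accept:
        return "answer"
    if factuality_score <= reject:
        return "reject"
    return "abstain"
```

Raising the `accept` threshold trades coverage for strictness, which mirrors the factuality-versus-strictness trade-off the abstract reports from threshold calibration.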

Files

Original bundle:
Mitigating Hallucination in Large Models_A Modular Framework for Detection and Counterfactual Correction.pdf (1.32 MB, Adobe Portable Document Format)

License bundle:
license.txt (1.71 KB, item-specific license agreed to upon submission)