Evaluation
This project aims to evaluate how much structured reasoning improves the reliability of model outputs.
Current Status
- Qualitative examples demonstrate the system's behavior; no quantitative results yet
- System identifies unsupported or exaggerated claims
Planned Evaluation
- Dataset of claims (news / public statements)
- Comparison with baseline LLM outputs
- Metrics:
  - Credibility accuracy
  - Detection of unsupported claims
  - Quality of reasoning
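
The first two metrics above could be computed along the following lines. This is a minimal sketch, not the project's implementation: the `Judgment` type, the label strings, and the toy data are all hypothetical, and it assumes each claim in the dataset carries a gold credibility label and a gold "unsupported" flag. Quality of reasoning is left out because it would likely need human or rubric-based scoring.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One judgment about a claim (hypothetical schema, for illustration only)."""
    credibility: str   # assumed labels: "credible" / "not_credible"
    unsupported: bool  # True if the claim is flagged as unsupported


def credibility_accuracy(gold, predicted):
    """Fraction of claims whose predicted credibility label matches the gold label."""
    if not gold:
        return 0.0
    correct = sum(g.credibility == p.credibility for g, p in zip(gold, predicted))
    return correct / len(gold)


def unsupported_detection(gold, predicted):
    """Precision and recall for flagging unsupported claims."""
    pairs = list(zip(gold, predicted))
    tp = sum(g.unsupported and p.unsupported for g, p in pairs)
    fp = sum((not g.unsupported) and p.unsupported for g, p in pairs)
    fn = sum(g.unsupported and (not p.unsupported) for g, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data, invented purely to show the comparison shape.
gold = [
    Judgment("credible", False),
    Judgment("not_credible", True),
    Judgment("not_credible", True),
    Judgment("credible", False),
]
baseline = [
    Judgment("credible", False),
    Judgment("credible", False),
    Judgment("not_credible", True),
    Judgment("not_credible", True),
]
structured = [
    Judgment("credible", False),
    Judgment("not_credible", True),
    Judgment("not_credible", True),
    Judgment("not_credible", True),
]

for name, preds in [("baseline", baseline), ("structured", structured)]:
    acc = credibility_accuracy(gold, preds)
    precision, recall = unsupported_detection(gold, preds)
    print(f"{name}: accuracy={acc:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Running both systems over the same gold-labeled claims keeps the comparison with the baseline direct; reporting precision and recall separately shows whether a system over-flags or under-flags unsupported claims.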
Goal
To measure whether structured reasoning reduces misleading or unsupported outputs.