ReadFact: A Workflow Framework for Readability and Factual Consistency in Medical Text Simplification

Kexin Weng

doi:10.54097/94ns3t96

Keywords:

Medical text simplification, readability, factual consistency, PICO, Direct Preference Optimization, biomedical NLP

Abstract

Biomedical literature often remains inaccessible to lay readers because of its technical complexity. Medical text simplification (MTS) aims to improve readability while preserving factual accuracy. We propose ReadFact, a workflow that integrates three complementary components: (i) a simplifier trained with Direct Preference Optimization (DPO), (ii) a readability reward model trained with Proximal Policy Optimization (PPO), and (iii) a fact checker based on PICO (Population, Intervention, Comparison, Outcome) elements for structured factual alignment. Our system is trained on the Cochrane Database of Systematic Reviews. Intermediate simplifications are first generated using DeepSeek-V3, and each (source, intermediate, target) triple is then expanded into preference pairs that supervise both DPO and PPO training. Factual consistency is evaluated using SciBERT-based PICO similarity, while readability is optimized through preference-driven learning. Experiments show that ReadFact improves factual consistency by more than 23% over the DPO baseline and increases readability by more than 5%. On the NapSS benchmark, ReadFact-DPO achieves the highest BERTScore, indicating closer alignment with human references.
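The abstract does not specify how a triple is expanded into preference pairs, so the following is only a minimal sketch under a stated assumption: the three texts are ranked from least to most preferred as source, intermediate, target, and every ordered pair of that ranking yields one preference example. The names `PreferencePair` and `expand_triple`, and the field names `prompt`, `chosen`, and `rejected`, are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from itertools import combinations
from typing import List, NamedTuple


class PreferencePair(NamedTuple):
    """One preference example (hypothetical layout, not the paper's)."""
    prompt: str    # text the simplifier is asked to rewrite
    chosen: str    # output assumed to be preferred
    rejected: str  # output assumed to be dispreferred


def expand_triple(source: str, mid: str, target: str) -> List[PreferencePair]:
    """Expand one (source, mid, target) triple into preference pairs.

    ASSUMPTION: preference increases in the order source < mid < target.
    The paper's actual expansion rule may differ.
    """
    ranked = [source, mid, target]  # least preferred first
    pairs = []
    # combinations() keeps list order, so `worse` always precedes `better`.
    for worse, better in combinations(ranked, 2):
        pairs.append(PreferencePair(prompt=source, chosen=better, rejected=worse))
    return pairs


if __name__ == "__main__":
    for pair in expand_triple("original abstract", "draft rewrite", "plain summary"):
        print(pair)
```

Under this assumption each triple produces three pairs: intermediate over source, target over source, and target over intermediate. A stricter scheme might keep only the pairs whose chosen text is the human-written target.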
