Support MedMCQA and MedBullets benchmark #2054

mar-cry · 2025-04-26T05:58:22Z

Motivation

This PR broadens how LLMs can be evaluated in the medical domain. Adding support for two medical benchmarks, MedMCQA and MedBullets, gives a more comprehensive assessment of LLM performance on domain-specific tasks, particularly in healthcare and clinical settings.

Modification

This PR introduces two new benchmark configuration files for MedMCQA and MedBullets. These additions allow users to easily evaluate LLMs on these datasets by leveraging the existing evaluation framework.
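
For readers unfamiliar with how multiple-choice medical QA is typically scored, here is a minimal sketch. It is not the code in this PR's configuration files: the helper names `extract_choice` and `accuracy` are hypothetical, and it assumes that each model response contains a standalone option letter from A to E and that references are single letters.

```python
import re
from typing import List, Optional

# Hypothetical pattern: the first standalone uppercase option letter A-E.
# Assumption for illustration only; real post-processing may be stricter.
ANSWER_PATTERN = re.compile(r"\b([A-E])\b")


def extract_choice(response: str) -> Optional[str]:
    """Return the first standalone option letter found, or None."""
    match = ANSWER_PATTERN.search(response)
    return match.group(1) if match else None


def accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of responses whose extracted letter equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(
        extract_choice(pred) == ref
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)
```

A response with no recognizable option letter counts as incorrect. Note that this simple pattern would also match a leading article such as "A patient ...", so a real evaluator would likely anchor on a phrase like "answer is" first.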

BC-breaking (Optional)

This PR does not introduce any backward-incompatible changes. Existing workflows and configurations remain fully functional without modification.

Use cases (Optional)

With the addition of MedMCQA and MedBullets benchmarks:

Researchers can benchmark LLMs specifically for medical QA and clinical knowledge tasks.
Developers can better understand and improve their models' performance in healthcare-related applications.
Both groups can run more targeted fine-tuning and domain-specific evaluations.

Checklist

Before PR:

✅ Pre-commit or other linting tools have been used to fix potential lint issues.
✅ Bug fixes are fully covered by unit tests.
✅ The modifications are covered by complete unit tests.
✅ Documentation has been updated accordingly.

After PR:

✅ Potential downstream or related projects have been considered for testing.
✅ CLA has been signed and all committers have signed the CLA for this PR.

mar-cry added 2 commits April 26, 2025 03:53

support medmcqa and medbullets benchmark

7ad5116

Add Medbullets data folder for benchmark support

bb05609

mm-assistant bot assigned tonysy Apr 26, 2025
