Our work focuses on LLMs that serve the agricultural sector. Because fine-tuning datasets for LLMs in crop science are currently lacking, we have released the CROP dataset, a large open-source dataset with over 210K Q&A pairs. To provide a high-quality evaluation standard for this vertical domain, we have also introduced the CROP benchmark, an open-source benchmark of 5,045 multiple-choice questions. We hope our work will advance the use of LLMs in agricultural production and contribute to addressing hunger.
Our work has been accepted to the NeurIPS 2024 Datasets and Benchmarks Track. All datasets and benchmarks are open-sourced. See our project page at https://github.com/RenqiChen/The_Crop for more details.
If you find our code and datasets useful, please consider citing our work:
@inproceedings{zhangempowering,
title={Empowering and Assessing the Utility of Large Language Models in Crop Science},
author={Zhang, Hang and Sun, Jiawei and Chen, Renqi and Liu, Wei and Yuan, Zhonghang and Zheng, Xinzhe and Wang, Zhefan and Yang, Zhiyuan and Yan, Hang and Zhong, Han-Sen and others},
booktitle={The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year={2024}
}