ProjectEval is a multi-level benchmark for evaluating LLMs on complex project-level code generation tasks. It simulates realistic software engineering workflows by combining natural language prompts, structured checklists, and code skeletons.
📄 Paper | 🚀 Project | ✉️ Contact Us | 📤 Submit Your Model's Result
**Overall averages**

| Model | Report By | Report Date | Output Format | Cascade Level 1 | Cascade Level 2 | Cascade Avg. | Direct Level 1 | Direct Level 2 | Direct Level 3 | Direct Avg. | All Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|
**Per-metric breakdown**

| Model | Report By | Report Date | Output Format | Cascade Level 1 CL | Cascade Level 1 SK | Cascade Level 1 Code | Cascade Level 1 PV | Cascade Level 2 SK | Cascade Level 2 Code | Cascade Level 2 PV | Direct Level 1 Code | Direct Level 1 PV | Direct Level 2 Code | Direct Level 2 PV | Direct Level 3 Code | Direct Level 3 PV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
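As the column headers suggest, the leaderboard tracks two generation paradigms: **Cascade**, where intermediate artifacts are produced before the final code (the CL and SK columns line up with the checklists and code skeletons mentioned in the introduction), and **Direct**, where the code is generated in a single pass. The sketch below illustrates the distinction; it is a minimal illustration assuming a generic `llm` completion callable, and the function names and prompt wording are hypothetical, not the actual ProjectEval harness.

```python
from dataclasses import dataclass
from typing import Callable

# A completion function: prompt in, text out. Hypothetical stand-in
# for whatever model client a submission actually uses.
LLM = Callable[[str], str]

@dataclass
class CascadeOutput:
    checklist: str  # CL: structured task checklist
    skeleton: str   # SK: code skeleton (files, signatures, docstrings)
    code: str       # final project code

def generate_cascade(llm: LLM, task_prompt: str) -> CascadeOutput:
    """Cascade paradigm: build the project in stages, feeding each
    intermediate artifact into the next generation step."""
    checklist = llm(f"Write a structured checklist for this project:\n{task_prompt}")
    skeleton = llm(f"Write a code skeleton satisfying this checklist:\n{checklist}")
    code = llm(f"Implement the project from this skeleton:\n{skeleton}")
    return CascadeOutput(checklist, skeleton, code)

def generate_direct(llm: LLM, task_prompt: str) -> str:
    """Direct paradigm: generate the full project code in a single pass."""
    return llm(f"Implement the following project end to end:\n{task_prompt}")
```

This staging is also why the per-metric table scores each intermediate artifact separately: Cascade rows carry CL and SK columns that Direct rows do not have.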