
Soft-Evals
| Model | Dev | Test |
|-------|-----|------|
| TBD   | TBD | TBD  |
About CORGI SQL
Welcome to the CORGI SQL benchmark, hosted by Cornell University and Gena. The CORGI benchmark was built to push the boundaries of txt2sql in the generative AI era. There are a few notable differences between CORGI and previous txt2sql benchmarks.
Soft Evaluation System
A significant portion of the questions are recommendation- or prediction-based natural language queries. These queries are "soft evaluated" with human input.
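To make the idea concrete, here is a hypothetical sketch of soft scoring. The function name, inputs, and partial-credit formula are illustrative assumptions, not CORGI's actual protocol, which relies on human judges; the sketch only shows why recommendation-style answers need partial credit rather than exact-match grading.

```python
def soft_score(predicted, accepted):
    """Fraction of a model's recommended items that human judges
    marked as acceptable (hypothetical partial-credit metric)."""
    if not predicted:
        return 0.0
    accepted = set(accepted)
    return sum(1 for item in predicted if item in accepted) / len(predicted)

# A model recommends three products; judges accepted two of them.
print(soft_score(["p1", "p2", "p9"], ["p1", "p2", "p3"]))  # → 0.6666666666666666
```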
Business Domain Focus
CORGI v1.0 contains business-domain databases and queries, designed to test understanding of domain-specific lingo.
Complex Schema Design
CORGI has many more tables and relations per database than previous benchmarks, and many of its schemas are based on real industry schema designs.
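As a miniature illustration of the kind of multi-table structure involved, here is a toy business schema in SQLite. The table and column names are invented for this sketch; real CORGI databases contain far more tables and foreign-key relations, which is precisely what makes the txt2sql task harder.

```python
import sqlite3

# Hypothetical three-table business schema (names are illustrative only).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders      (id INTEGER PRIMARY KEY,
                          customer_id INTEGER REFERENCES customers(id));
CREATE TABLE order_items (order_id INTEGER REFERENCES orders(id),
                          sku TEXT, qty INTEGER);
INSERT INTO customers VALUES (1, 'Acme Corp');
INSERT INTO orders VALUES (10, 1);
INSERT INTO order_items VALUES (10, 'SKU-1', 2), (10, 'SKU-2', 5);
""")

# Even a simple question ("how many items did each customer buy?")
# already requires navigating two foreign-key relations.
row = con.execute("""
    SELECT c.name, SUM(i.qty)
    FROM customers c
    JOIN orders o      ON o.customer_id = c.id
    JOIN order_items i ON i.order_id = o.id
    GROUP BY c.id
""").fetchone()
print(row)  # → ('Acme Corp', 7)
```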
Flexible Evaluation
There is no train split. Groups are free to experiment with zero-shot or template-based methods, or to generate training data themselves.