Back in 2018, our Founder & CEO Ludwig Bull co-authored a blog post on the University of Oxford's website together with Dr. Felix Steffek, Professor of Law at the University of Cambridge and an Advisor to CourtCorrect since our founding in 2019.
The blog post was titled "Paving the Way for Legal Artificial Intelligence – A Common Dataset for Case Outcome Predictions" and is available here.
The post argued that, to meaningfully compare the performance of machine learning models on legal and regulatory tasks, researchers need access to a common dataset against which they can benchmark their models. One example of such a task was showcased in CourtCorrect's BBC Lawyer Challenge, where an early iteration of our system beat over 100 lawyers in predicting the outcomes of FOS decisions on PPI cases.
We are absolutely delighted to share that this dataset has now become a reality in the Cambridge Law Corpus: a Dataset for Legal AI Research.
Co-developed and carefully curated together with researchers from the University of Cambridge and Uppsala University, this dataset contains over 320,000 UK court decisions and other relevant legal data. The paper introducing the dataset was recently published at NeurIPS (Conference on Neural Information Processing Systems), one of the leading peer-reviewed conferences for machine learning research.
CourtCorrect provided technical support in the creation of the dataset, drawing on our experience in handling, structuring and curating large-scale datasets from the legal and regulatory space. Our involvement underlines our commitment to making legal data and information useful for researchers and end-users in a responsible way, with the aim of closing the access-to-justice gap. The paper's publication at one of the leading machine learning conferences worldwide underscores that CourtCorrect operates at the cutting edge of legal and regulatory AI.
A request for access to the dataset can be lodged on the University of Cambridge website and will be subject to the dataset terms. Final decisions on access are made by the academic co-authors of the paper.
Ludwig and Felix discussing the Cambridge Law Corpus at the recent University Alumni Festival
It's been an absolute pleasure to collaborate with Dr. Steffek and his team. Onwards and upwards!