We have scraped a large database of court case summaries from
here. For each district, we created a sample of cases, ranging from roughly 1,000 cases in some districts to more than 500,000 in others. Each sample is analysed across several metrics to identify anomalies. Additionally, we have created an aggregate performance metric using KL divergence.
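
As a rough illustration of how a KL-divergence-based score might be computed, the sketch below compares one district's empirical distribution over a categorical field against the distribution pooled across all districts. The field (case outcome), the category labels, the district names, and the choice of the pooled distribution as the reference are all assumptions made for this example, not a description of the actual pipeline.

```python
import math
from collections import Counter


def empirical_distribution(values):
    """Turn a list of categorical observations into a {category: probability} dict."""
    counts = Counter(values)
    n = sum(counts.values())
    return {k: c / n for k, c in counts.items()}


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) in nats for discrete distributions given as dicts.

    Categories with P(k) = 0 contribute nothing. Q(k) is floored at `eps`
    so that a category missing from Q does not cause a division by zero.
    """
    total = 0.0
    for k, pk in p.items():
        if pk == 0.0:
            continue
        qk = max(q.get(k, 0.0), eps)
        total += pk * math.log(pk / qk)
    return total


# Hypothetical data: case outcomes per district (labels are illustrative only).
districts = {
    "district_a": ["disposed"] * 70 + ["pending"] * 30,
    "district_b": ["disposed"] * 20 + ["pending"] * 80,
}

# Reference distribution: all cases pooled across districts (an assumption).
pooled = empirical_distribution(
    [outcome for cases in districts.values() for outcome in cases]
)

# Higher scores mean the district deviates more from the pooled distribution.
scores = {
    name: kl_divergence(empirical_distribution(cases), pooled)
    for name, cases in districts.items()
}
```

Because the pooled reference includes every district's own cases, each category seen in a district has non-zero reference probability, so the `eps` floor only matters if a different reference distribution is substituted. With samples as small as ~1,000 cases, sparse categories can inflate the score, so some smoothing may be worth considering.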