Guy Lacroix publishes an article co-authored with two former students of the department in the Journal of Quantitative Criminology

April 9, 2025

The article can be viewed here.

Here is the abstract of the article:

Title: Beyond Traditional Risk Scores: Tackling LS/CMI Offender Misclassifications with Machine Learning

Objectives. This paper investigates the accuracy of offender risk assessment scoring methods. We study the degree of misclassification resulting from the conventional practice of aggregating individual items to derive risk scores and categories. We document which types of offenders are prone to misclassification, particularly in relation to age and gender.

Methods. We use a machine learning algorithm to leverage the rich set of information available in the LS/CMI. Using all 45,535 assessments conducted between 2008 and 2015 in Quebec (Canada), we estimate probabilities from a random forest algorithm to predict individual risks of recidivism over a two-year follow-up. We compare the resulting probabilities to those inferred from the risk scores or categories to document the extent of misclassification. We devise a simple algorithm to construct alternative risk categories that reduce misclassification relative to the LS/CMI total scores and categories.

Results. The probabilities obtained from the random forest approach accurately predict individual probabilities to reoffend. Compared with these predictions, the traditional aggregation of items into risk scores or categories yields substantial misclassification for certain groups of offenders. In particular, we find that the risk associated with older individuals when using the LS/CMI risk categories is overestimated by about 10 percentage points. Our alternative risk categories, devised from our machine learning predictions, successfully avoid such misclassification.

Conclusions. Traditional methods of aggregating items from risk assessments into scores may lead to substantial misclassification, especially for older offenders. Misclassification arises from 1) items not being equally risk-relevant; 2) information collected by the LS/CMI being excluded or overly simplified when constructing scores; and 3) age being omitted from risk scores. Machine learning algorithms avoid these pitfalls and can be used to construct less biased categories.
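
As a rough illustration of the comparison described under Methods, the sketch below contrasts the reoffence probability implied by a risk category (taken here as the observed share of reoffenders in that category) with model-based probabilities, averaged by subgroup. It is not the authors' code: the field names (`category`, `age_group`, `model_prob`, `reoffended`), the two helper functions, and the four toy records are all hypothetical, and `model_prob` merely stands in for the output of a fitted model such as a random forest.

```python
from collections import defaultdict
from statistics import mean


def category_rates(records):
    """Observed share of reoffenders within each risk category."""
    by_category = defaultdict(list)
    for r in records:
        by_category[r["category"]].append(r["reoffended"])
    return {cat: mean(vals) for cat, vals in by_category.items()}


def misclassification_gap(records, group_key):
    """Mean (category-implied minus model) probability per subgroup,
    expressed in percentage points. Positive values mean the category
    overstates risk for that subgroup relative to the model."""
    rates = category_rates(records)
    gaps = defaultdict(list)
    for r in records:
        gaps[r[group_key]].append(rates[r["category"]] - r["model_prob"])
    return {group: 100 * mean(vals) for group, vals in gaps.items()}


# Made-up toy records, for illustration only (not data from the article).
records = [
    {"category": "medium", "age_group": "older", "model_prob": 0.20, "reoffended": 0},
    {"category": "medium", "age_group": "older", "model_prob": 0.30, "reoffended": 0},
    {"category": "medium", "age_group": "younger", "model_prob": 0.50, "reoffended": 1},
    {"category": "medium", "age_group": "younger", "model_prob": 0.60, "reoffended": 1},
]

gaps = misclassification_gap(records, "age_group")
for group, gap in gaps.items():
    print(f"{group}: {gap:+.1f} percentage points")
```

A positive gap for a subgroup indicates that everyone in the category is assigned the same implied risk even though the model assigns that subgroup a lower one, which is the kind of discrepancy the abstract reports for older individuals.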