Compas

The Compas dataset for recidivism prediction. The dataset is known to have racial bias issues; see the ProPublica article on the topic.

Configurations and tasks

| Configuration | Task | Description |
|---|---|---|
| encoding | | Encoding dictionary showing original values of encoded features. |
| two-years-recidividity | Binary classification | Will the defendant be a violent recidivist? |
| two-years-recidividity-no-race | Binary classification | As above, but the race feature is removed. |
| priors-prediction | Regression | How many prior crimes has the defendant committed? |
| priors-prediction-no-race | Regression | As above, but the race feature is removed. |
| race | Multiclass classification | What is the race of the defendant? |

Usage

from datasets import load_dataset

dataset = load_dataset("mstz/compas", "two-years-recidividity")["train"]
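
A minimal sketch of a thin wrapper around the call above. `CONFIGS` and `load_compas` are hypothetical helper names, not part of the dataset or of the `datasets` library. The configuration names are copied from the table in this card, and the sketch assumes every configuration exposes a `train` split, as the one shown above does. Unknown configuration names are rejected before any download is attempted.

```python
# Configuration names as listed in this card (hypothetical helper constant).
CONFIGS = (
    "encoding",
    "two-years-recidividity",
    "two-years-recidividity-no-race",
    "priors-prediction",
    "priors-prediction-no-race",
    "race",
)


def load_compas(config: str = "two-years-recidividity"):
    """Load the train split of one Compas configuration.

    Assumption: each configuration has a "train" split.
    """
    if config not in CONFIGS:
        raise ValueError(
            f"Unknown configuration {config!r}; expected one of {CONFIGS}"
        )
    # Imported lazily so the name check above works without the library.
    from datasets import load_dataset

    return load_dataset("mstz/compas", config)["train"]
```

Calling `load_compas("race")` would then fetch the multiclass configuration, while a misspelled name fails immediately with a `ValueError`.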

Features

| Feature | Type | Description |
|---|---|---|
| sex | int64 | |
| age | int64 | |
| race | int64 | |
| number_of_juvenile_fellonies | int64 | |
| decile_score | int64 | Criminality score |
| number_of_juvenile_misdemeanors | int64 | |
| number_of_other_juvenile_offenses | int64 | |
| number_of_prior_offenses | int64 | |
| days_before_screening_arrest | int64 | |
| is_recidivous | int64 | |
| days_in_custody | int64 | Days spent in custody |
| is_violent_recidivous | int64 | |
| violence_decile_score | int64 | Criminality score for violent crimes |
| two_years_recidivous | int64 | |
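
The feature list above can be split into model inputs and a target, optionally dropping `race` in the spirit of the `-no-race` configurations. The following is a sketch only: `split_features_target` is a hypothetical helper, the choice of `two_years_recidivous` as the target column is an assumption about how the loaded rows are laid out, and the toy rows contain made-up values for illustration.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple


def split_features_target(
    rows: Iterable[Dict[str, Any]],
    target: str = "two_years_recidivous",  # assumed target column
    drop: Sequence[str] = (),
) -> Tuple[List[Dict[str, Any]], List[Any]]:
    """Separate each row into a feature dict and a target value."""
    features, targets = [], []
    for row in rows:
        targets.append(row[target])
        # Keep every column except the target and the dropped ones.
        features.append(
            {k: v for k, v in row.items() if k != target and k not in drop}
        )
    return features, targets


# Toy rows with invented values, using column names from the table above.
toy_rows = [
    {"age": 30, "race": 1, "number_of_prior_offenses": 2, "two_years_recidivous": 0},
    {"age": 45, "race": 0, "number_of_prior_offenses": 0, "two_years_recidivous": 1},
]

X, y = split_features_target(toy_rows, drop=("race",))
```

Rows of a loaded split are plain dictionaries when iterated, so the same helper should apply to them, provided the target column is present under the assumed name.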