Data Description
The Financial Fraud Alert Review (FiFAR) is designed for Human-AI collaboration systems, where the goal is to defer instances to an expert or ML model in order to maximize the system's performance.
This dataset contains:
- 1M bank account applications labeled Not Fraud (0) vs. Fraud (1); these constitute the Base variant of the Bank Account Fraud dataset.
- Binary predictions, for each instance, from 50 synthetic fraud analysts and 1 pre-trained ML model.
The predictions for each instance are available in the following files (see the loading sketch below):
- experts/train_predictions.csv - months 1-3 of the Base dataset (ML model training split).
- experts/deployment_predictions.csv - months 4-8 of the Base dataset.
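These prediction tables are plain CSV files and can be inspected directly. A minimal loading sketch, assuming pandas and the file layout above; the exact column names are not specified here, which is why the sketch prints them for inspection:

```python
import pandas as pd

# File paths follow the layout listed above.
train_preds = pd.read_csv("experts/train_predictions.csv")
deploy_preds = pd.read_csv("experts/deployment_predictions.csv")

# Inspect the shape and the expert/model prediction columns before use.
print(train_preds.shape)
print(train_preds.columns.tolist())
```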
Goal
As the fraud prevalence is low (~1%), in our work we weigh the cost of false positive (FP) and false negative (FN) mistakes, assuming an FP incurs 5.7% of the cost of an FN. The combined loss is L = λ · (# false positives) + (# false negatives), with λ = 0.057.
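For concreteness, a minimal sketch of this loss on 0/1 label and prediction arrays; the function and variable names are ours, not part of the dataset:

```python
import numpy as np

LAMBDA = 0.057  # an FP costs 5.7% of an FN

def combined_loss(y_true: np.ndarray, y_pred: np.ndarray, lam: float = LAMBDA) -> float:
    """L = lam * (# false positives) + (# false negatives)."""
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    return lam * fp + fn

# Worked example: 1 FP and 2 FNs -> 0.057 * 1 + 2 = 2.057
y_true = np.array([1, 1, 0, 0, 1])
y_pred = np.array([0, 0, 1, 0, 1])
print(combined_loss(y_true, y_pred))  # 2.057
```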
Realistic Assignment System Scenario
Training: To train assignment algorithms under realistic data availability, we include the following files in testbed/train/small_regular (see the sketch after this list):
- train.csv: tabular dataset in which each bank account application carries a single expert decision, simulating a history of past predictions. Instances are distributed across the team according to the experts' capacity constraints.
- batches.csv: associates each instance with a batch_id, representing a time period.
- capacity.csv: defines the maximum number of instances that can be assigned to each expert in each batch_id.
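As a sketch of how these three files fit together, the snippet below joins the simulated assignment history with its batches and checks the capacity constraints. The join keys (case_id, expert_id, batch_id, capacity) are assumed column names for illustration:

```python
import pandas as pd

train = pd.read_csv("testbed/train/small_regular/train.csv")
batches = pd.read_csv("testbed/train/small_regular/batches.csv")
capacity = pd.read_csv("testbed/train/small_regular/capacity.csv")

# Attach each instance's batch, count assignments per (batch, expert),
# and verify no expert exceeds their per-batch capacity.
merged = train.merge(batches, on="case_id")
counts = (merged.groupby(["batch_id", "expert_id"])
                .size().rename("n_assigned").reset_index())
check = counts.merge(capacity, on=["batch_id", "expert_id"])
assert (check["n_assigned"] <= check["capacity"]).all()
```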
Testing: The resources for testing are available in testbed/test/ (a toy querying loop follows this list):
- test.csv: test split of bank account applications.
- test_expert_pred.csv: expert predictions to be queried.
- We also include an array of 300 expert capacity constraints to test your system.
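A toy end-to-end loop over the test resources might look like the sketch below: assign each instance to an expert with remaining capacity, then query that expert's prediction. The column names, the experts-as-columns layout, and the flat capacity dictionary are assumptions; a real assignment algorithm would rank experts per instance instead of picking the first one available:

```python
import pandas as pd

test = pd.read_csv("testbed/test/test.csv")
expert_pred = pd.read_csv("testbed/test/test_expert_pred.csv").set_index("case_id")

# Placeholder capacity: 100 instances per expert for this batch.
remaining = {expert: 100 for expert in expert_pred.columns}

decisions = {}
for case_id in test["case_id"]:
    # Naive policy: first expert with capacity left.
    expert = next(e for e, c in remaining.items() if c > 0)
    remaining[expert] -= 1
    decisions[case_id] = expert_pred.loc[case_id, expert]
```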
For examples of how to use FiFAR, please follow the steps detailed in our GitHub repo. For details of our data generation process, see our paper.
Additional Information
The synthetically generated predictions capture several aspects of real-world Human-AI collaboration settings:
- Social Bias: Experts are more likely to incorrectly deem older clients' (>50 years) applications as fraudulent, thus unfairly reducing their access to bank accounts. The degree of discrimination varies across experts.
- Algorithmic Bias: Experts are influenced by the model score for each application. The degree of algorithmic bias is variable.
- Feature Dependence: The likelihood that an expert will make an error on a given instance is a function of its features, simulating real human behaviour. The dependence of each expert on the instance's features is unique, creating highly variable decision processes.
- Varying Expertise: The team has a wide variability of expertise, with some experts being more proficient than others at detecting fraud. Experts also have variable propensities for false positive and false negative mistakes.
There are four types of experts, three of which are variations on the "Standard" expert type (a toy parameterisation follows this list):
- Unfair: higher likelihood of false positives for older customers.
- Model Agreeing: higher likelihood of making the same decision as the model.
- Sparse: dependent on fewer features, simulating a simpler decision-making process.
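To make the taxonomy concrete, here is a toy parameterisation of these decision processes. The logistic form and all weights are invented for illustration; the actual generation process is the one described in the paper:

```python
import numpy as np

def flag_probability(age, model_score, x, kind="standard"):
    """Toy probability that an expert flags an application as fraud."""
    w = np.full(len(x), 0.1)        # feature dependence (made-up weights)
    bias = 0.0
    if kind == "sparse":
        w[len(x) // 2:] = 0.0       # depends on fewer features
    elif kind == "unfair":
        bias += 0.8 * (age > 50)    # extra false-positive pressure on older clients
    elif kind == "model_agreeing":
        bias += 1.5 * (model_score - 0.5)  # pulled toward the model score
    logit = w @ x + bias - 1.0
    return 1.0 / (1.0 + np.exp(-logit))

x = np.random.default_rng(0).normal(size=8)
for kind in ["standard", "unfair", "model_agreeing", "sparse"]:
    print(kind, round(float(flag_probability(62, 0.9, x, kind)), 3))
```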