Ethics4DS: Coursework 1

Author

[Candidate number here]

Published

November 1, 2022

Discussion questions

Data science is often framed in utilitarian terms because of its focus on prediction and causation (consequences) and on optimization (maximizing utility). Describe an example data science application in explicitly utilitarian language, then refer to at least one non-consequentialist theory to identify some aspect of the application that utilitarianism might overlook.

  • Example application:

  • A non-utilitarian aspect of this application:

Choose one of the ethical data science guidelines that we read. Find some part of it that you agreed with strongly, quote that part, and describe why you thought it was important. Then find another part that you think is too limited, quote that part, and describe what you think is its most important limitation.

  • Guideline document: (choose one of ASA/RSS/ACM)

  • Agreement

quoted text

Reasoning:

  • Disagreement

quoted text

Reasoning:

Data questions

Computing fairness metrics

Use the fairness package. Pick one of the example datasets in the package and fit a predictive model to it. Choose three different fairness metrics to compute from that model's predictions. For each metric, compute its value in two ways: (1) manually, using standard R functions (e.g. arithmetic operations), and (2) using the fairness package functions. Check whether you get the same answer.

# install.packages("fairness")
library(fairness)
# Predictive model
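
One way the setup could look (a minimal sketch under stated assumptions, not the required approach): the package ships with a compas example dataset, and a plain logistic regression supplies the predicted probabilities that the metrics need. The column names below follow the package's documentation for compas as I recall it; check str(compas) if your version differs.

data('compas')

# Recode the factor outcome as 0/1 for glm()
compas$Two_yr_Recidivism_01 <- ifelse(compas$Two_yr_Recidivism == 'yes', 1, 0)

# Logistic regression; ethnicity (the protected attribute) is deliberately excluded
model <- glm(Two_yr_Recidivism_01 ~ Number_of_Priors + Age_Above_FourtyFive +
               Age_Below_TwentyFive + Female + Misdemeanor,
             data = compas, family = binomial)

# Predicted probabilities and classifications at a 0.5 cutoff
compas$prob <- predict(model, type = 'response')
compas$pred <- ifelse(compas$prob > 0.5, 1, 0)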

Fairness metric 1

Which metric: (name here)

# Computing manually
# Comparing to the fairness package answer
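
If, say, metric 1 is proportional (statistical) parity, the manual version is the share of positive predictions within each ethnicity group, expressed relative to a base group, since the package reports each metric as a ratio to a base group. The prop_parity() arguments below follow the package vignette; see ?prop_parity if your version differs.

# Manual: proportion positively classified within each group,
# relative to the Caucasian group
pos_rate <- tapply(compas$pred, compas$ethnicity, mean)
pos_rate / pos_rate['Caucasian']

# Package version; the $Metric element holds the per-group values
res1 <- prop_parity(data = compas, outcome = 'Two_yr_Recidivism_01',
                    group = 'ethnicity', probs = 'prob',
                    cutoff = 0.5, base = 'Caucasian')
res1$Metric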

Fairness metric 2

Which metric: (name here)

# Computing manually
# Comparing to the fairness package answer
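
If metric 2 is equalized odds, which the fairness package operationalizes as comparing sensitivities (true positive rates) across groups, the manual version is the positive-prediction rate among actual positives in each group. Arguments again follow the vignette.

# Manual: true positive rate (sensitivity) within each group
actual_pos <- compas$Two_yr_Recidivism_01 == 1
tpr <- tapply(compas$pred[actual_pos], compas$ethnicity[actual_pos], mean)
tpr / tpr['Caucasian']

# Package version
res2 <- equal_odds(data = compas, outcome = 'Two_yr_Recidivism_01',
                   group = 'ethnicity', probs = 'prob',
                   cutoff = 0.5, base = 'Caucasian')
res2$Metric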

Fairness metric 3

Which metric: (name here)

# Computing manually
# Comparing to the fairness package answer
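
And if metric 3 is predictive rate parity, which compares precisions (positive predictive values) across groups, the manual version is the actual-positive rate among predicted positives in each group.

# Manual: precision (PPV) within each group
pred_pos <- compas$pred == 1
ppv <- tapply(compas$Two_yr_Recidivism_01[pred_pos],
              compas$ethnicity[pred_pos], mean)
ppv / ppv['Caucasian']

# Package version
res3 <- pred_rate_parity(data = compas, outcome = 'Two_yr_Recidivism_01',
                         group = 'ethnicity', probs = 'prob',
                         cutoff = 0.5, base = 'Caucasian')
res3$Metric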

Simulating a response variable

Now replace the outcome variable in the original dataset with a new variable that you generate; how you generate it is up to you. Your goal is to produce an outcome for which all of the fairness metrics you chose above indicate that the predictive model is fair.

# n <- nrow(datasetname)
# datasetname$outcomename <- somefunction(n, etc)
# Predictive model
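
One generating scheme that tends to satisfy all three metrics at once (a sketch; other choices are possible): draw the outcome as pure noise, independent of every feature and of the protected attribute, and refit the model. The refitted model then carries no group-related signal, so each metric's group ratios should land near 1, up to sampling noise.

set.seed(42)   # for reproducibility
n <- nrow(compas)

# Pure-noise outcome: a fair coin flip, unrelated to all features
compas$Two_yr_Recidivism_01 <- rbinom(n, 1, 0.5)

# Refit the same model on the simulated outcome
model_sim <- glm(Two_yr_Recidivism_01 ~ Number_of_Priors + Age_Above_FourtyFive +
                   Age_Below_TwentyFive + Female + Misdemeanor,
                 data = compas, family = binomial)
compas$prob <- predict(model_sim, type = 'response')
compas$pred <- ifelse(compas$prob > 0.5, 1, 0)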

Fairness metric 1

Which metric: (name here)

# Computing manually
# Comparing to the fairness package answer
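
Since the simulated outcome and refitted predictions overwrote the earlier columns, the metric-1 snippet runs unchanged; repeated here for completeness, with ratios expected near 1.

pos_rate_sim <- tapply(compas$pred, compas$ethnicity, mean)
pos_rate_sim / pos_rate_sim['Caucasian']

prop_parity(data = compas, outcome = 'Two_yr_Recidivism_01',
            group = 'ethnicity', probs = 'prob',
            cutoff = 0.5, base = 'Caucasian')$Metric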

Fairness metric 2

Which metric: (name here)

# Computing manually
# Comparing to the fairness package answer
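
Likewise for the equalized-odds (sensitivity) comparison on the simulated outcome.

actual_pos_sim <- compas$Two_yr_Recidivism_01 == 1
tpr_sim <- tapply(compas$pred[actual_pos_sim],
                  compas$ethnicity[actual_pos_sim], mean)
tpr_sim / tpr_sim['Caucasian']

equal_odds(data = compas, outcome = 'Two_yr_Recidivism_01',
           group = 'ethnicity', probs = 'prob',
           cutoff = 0.5, base = 'Caucasian')$Metric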

Fairness metric 3

Which metric: (name here)

# Computing manually
# Comparing to the fairness package answer
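
And for predictive rate parity (precision) on the simulated outcome.

pred_pos_sim <- compas$pred == 1
ppv_sim <- tapply(compas$Two_yr_Recidivism_01[pred_pos_sim],
                  compas$ethnicity[pred_pos_sim], mean)
ppv_sim / ppv_sim['Caucasian']

pred_rate_parity(data = compas, outcome = 'Two_yr_Recidivism_01',
                 group = 'ethnicity', probs = 'prob',
                 cutoff = 0.5, base = 'Caucasian')$Metric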

Concluding thoughts

Do any of the results above require some explanation? Briefly describe your conclusion here.