With the recent growth of artificial intelligence models and their expanding role in automated decision making, ensuring that these models are not biased is of vital importance. There is abundant evidence that such models can contain, or even amplify, the bias present in the data on which they are trained, an effect inherent to their objective functions and learning algorithms. In this paper, we propose a novel classification algorithm that improves fairness while maintaining prediction accuracy. Using the embedding layer of a classifier pre-trained on the protected attribute, the network applies an attention layer that steers the classification away from relying on the protected attribute in its predictions. We compare our model with six state-of-the-art methods from the fairness literature and show that it outperforms them in minimizing bias while maintaining accuracy.
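The abstract does not spell out the architecture, but the core idea (a frozen embedding from a protected-attribute classifier gating the task features through an attention layer) can be sketched. The following PyTorch snippet is one plausible reading, not the authors' implementation: the class name `FairAttentionClassifier`, the complement-of-attention gating, and all layer sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class FairAttentionClassifier(nn.Module):
    """Hypothetical sketch of the paper's idea: an attention layer,
    conditioned on a frozen protected-attribute embedding, down-weights
    task features that carry protected information."""

    def __init__(self, input_dim, hidden_dim, num_classes, protected_embedder):
        super().__init__()
        # Embedding layer of a classifier pre-trained to predict the
        # protected attribute; frozen so only the task model trains.
        # Assumed to map inputs to hidden_dim-dimensional embeddings.
        self.protected_embedder = protected_embedder
        for p in self.protected_embedder.parameters():
            p.requires_grad = False

        self.encoder = nn.Linear(input_dim, hidden_dim)
        # Attention scores computed from task features and the protected
        # embedding jointly.
        self.attn = nn.Linear(hidden_dim * 2, hidden_dim)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        h = torch.relu(self.encoder(x))          # task features
        z = self.protected_embedder(x)           # protected-attribute embedding
        # Sigmoid attention over feature dimensions; taking its complement
        # suppresses the dimensions most indicative of the protected attribute.
        a = torch.sigmoid(self.attn(torch.cat([h, z], dim=-1)))
        h_fair = h * (1.0 - a)                   # gate out protected information
        return self.head(h_fair)
```

A hypothetical usage, where the embedder stands in for the penultimate layer of the pre-trained protected-attribute classifier:

```python
protected_embedder = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64)
)
model = FairAttentionClassifier(input_dim=32, hidden_dim=64, num_classes=2,
                                protected_embedder=protected_embedder)
logits = model(torch.randn(8, 32))  # batch of 8 examples, 32 features each
```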