Gender bias is the most studied fairness issue in Natural Language Processing models. In their recent EMNLP 2020 article, Vargas & Cotterell show that within word embedding space, gender bias occupies a linear subspace — an assumption that was only implicit in Bolukbasi et al.'s (2016) work on gender bias mitigation.
Although Bolukbasi et al.'s (2016) mitigation technique has been criticized as flawed, Vargas & Cotterell argue it is widespread enough to deserve analysis. They show that Bolukbasi et al.'s (2016) approach of extracting a gender subspace via Singular Value Decomposition (SVD) can be reformulated as Principal Component Analysis (PCA) — a linear method. They then "kernelize" this PCA reformulation, allowing them to extract non-linear (gender bias) subspaces as well. Using a suite of benchmarks, they demonstrate that their non-linear gender bias mitigation technique offers no notable performance benefit over the linear one.
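To make the linear pipeline concrete, here is a minimal sketch of Bolukbasi-style debiasing: the top principal component of difference vectors between definitional word pairs (computed via SVD, as in the PCA reformulation above) spans the gender direction, which is then projected out of a neutral word's vector. The 4-dimensional embeddings and word choices are toy values for illustration, not real model outputs.

```python
import numpy as np

# Toy 4-d embeddings (hypothetical values, stand-ins for real word vectors).
emb = {
    "he":     np.array([ 1.0, 0.2,  0.1, 0.0]),
    "she":    np.array([-1.0, 0.2,  0.1, 0.0]),
    "man":    np.array([ 0.9, 0.1, -0.2, 0.3]),
    "woman":  np.array([-0.9, 0.1, -0.2, 0.3]),
    "doctor": np.array([ 0.4, 0.5,  0.3, 0.2]),
}

# Difference vectors of definitional pairs; their dominant right-singular
# vector (= top principal component) spans the linear gender subspace.
pairs = [("he", "she"), ("man", "woman")]
diffs = np.stack([emb[a] - emb[b] for a, b in pairs])
_, _, vt = np.linalg.svd(diffs, full_matrices=False)
g = vt[0]  # unit-norm gender direction

# Linear debiasing of a gender-neutral word: remove its projection onto g.
w = emb["doctor"]
w_debiased = w - np.dot(w, g) * g
print(abs(np.dot(w_debiased, g)) < 1e-9)  # → True: no component left along g
```

The kernelized variant in the paper replaces this PCA step with kernel PCA to capture non-linear structure; per the authors' benchmarks, that added flexibility did not translate into better mitigation.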
In practice this means:
Although linear bias mitigation isn't perfect, it remains a viable option for debiasing word embedding models to a certain degree.
Ideally, instead of debiasing the NLP models themselves, the training data should be debiased. If neither model debiasing nor data debiasing is an option, (potential) gender bias should be clearly communicated, e.g. via model cards.