In their NeurIPS 2020 paper, Wang et al. discuss another attack against Federated Learning (FL): the injection of backdoors into a model during training. In an FL setting, the goal of a backdoor is to corrupt the global (federated) model so that it mispredicts on a targeted sub-task. For example, in an image classification task a backdoor may force the classifier to misclassify airplanes as trucks. The authors focus on what they call *edge-case backdoors*: backdoors that force a model to misclassify seemingly "easy" inputs that are nevertheless unlikely to appear in the training or test data. In other words, they target inputs (i.e., features) that are rarely observed. Note that the "edge-case" property concerns only the inputs; the outputs (i.e., labels) are unconstrained.
Two ways of injecting backdoors are examined:
- Data poisoning: the attacker has black-box access to a device and fraudulently manipulates its local training data.
- Model poisoning: the attacker has white-box access to a device and fraudulently manipulates its local model updates.
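The black-box variant can be sketched in a few lines: the attacker only needs to mix mislabeled edge-case examples into the device's local training set before local training runs. The function name, the `mix_ratio` parameter, and the toy data below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def poison_local_dataset(x_local, y_local, x_edge, target_label, mix_ratio=0.5):
    """Black-box data poisoning sketch: blend the honest local data with
    edge-case inputs that are all (mis)labeled as the attacker's target class.

    x_edge: rare inputs unlikely to appear in honest clients' data
    target_label: the label the backdoor should force (e.g. "truck")
    """
    n_poison = int(mix_ratio * len(x_edge))
    y_poison = np.full(n_poison, target_label)           # wrong-on-purpose labels
    x_mix = np.concatenate([x_local, x_edge[:n_poison]])
    y_mix = np.concatenate([y_local, y_poison])
    perm = np.random.permutation(len(x_mix))             # shuffle so every batch mixes both
    return x_mix[perm], y_mix[perm]

# toy usage: 4 honest samples plus 2 edge-case samples forced to label 9
x, y = np.arange(8).reshape(4, 2), np.array([0, 1, 0, 1])
xe = np.arange(100, 104).reshape(2, 2)
xp, yp = poison_local_dataset(x, y, xe, target_label=9, mix_ratio=1.0)
```

Because the poisoned points lie in a rarely observed region of the input space, standard local training then absorbs the backdoor with little impact on the main task.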
The authors prove theoretically that the existence of adversarial examples implies the existence of edge-case backdoors. Because robustness w.r.t. adversarial examples is still an open problem, robustness w.r.t. edge-case backdoors is, too. They further prove that detecting such backdoors is hard. Additionally, they provide a plethora of empirical evidence from both Computer Vision and Natural Language Processing experiments that well-known defense mechanisms fail against both black-box and white-box attacks. Five defense mechanisms are investigated:
- norm difference clipping (NDC)
- Krum
- Multi-Krum
- RFA
- adding Gaussian noise during aggregation
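To make the comparison concrete, here is a minimal sketch of Krum, the distance-based robust aggregation rule: each client update is scored by the summed squared distance to its nearest neighbors, and the single best-scoring update becomes the new global update. This is the standard Krum rule, not code from the paper; the variable names are my own.

```python
import numpy as np

def krum(updates, n_byzantine):
    """Krum sketch: score each update by the summed squared distance to its
    n - f - 2 nearest neighbours; return the update with the lowest score.

    updates: list of 1-D parameter vectors
    n_byzantine: f, the assumed number of malicious clients
    """
    n = len(updates)
    k = n - n_byzantine - 2  # number of neighbours that enter the score
    scores = []
    for i, u in enumerate(updates):
        dists = sorted(np.sum((u - v) ** 2) for j, v in enumerate(updates) if j != i)
        scores.append(sum(dists[:k]))  # outliers accumulate large scores
    return updates[int(np.argmin(scores))]

# toy usage: three honest updates near (1, 1) and one blatant outlier
ups = [np.array([1.0, 1.0]), np.array([1.1, 0.9]),
       np.array([0.9, 1.1]), np.array([10.0, 10.0])]
chosen = krum(ups, n_byzantine=1)  # selects one of the honest updates
```

Multi-Krum works the same way but averages the m best-scoring updates instead of taking only one. Note how the rule catches obvious outliers but not a backdoored update whose norm and direction stay close to the honest ones, which is exactly what a well-crafted edge-case attack produces.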
They find that, when norm constraints are carefully tuned, white-box attacks bypass NDC, Krum, Multi-Krum, and RFA. Black-box attacks bypass NDC and RFA but are defended against by Krum and Multi-Krum. Adding Gaussian noise defends against both black-box and white-box attacks, but at the price of reduced model accuracy. Perhaps not too surprisingly, the authors also find that high-capacity models are more prone to edge-case backdoor attacks than low-capacity models. Lower-capacity models are therefore another defense against such attacks, but again at the (potential) price of reduced model accuracy.
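The accuracy/robustness trade-off shows up directly in a server-side aggregation sketch that combines the two knobs discussed above: clipping each update to a fixed L2 norm (as in NDC) and adding Gaussian noise to the average. The function name and parameter values are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def aggregate_with_defenses(updates, clip_norm=1.0, noise_std=0.01, rng=None):
    """Server-side aggregation sketch: clip each client update to at most
    clip_norm in L2 norm, average, then add Gaussian noise.

    Larger noise_std / smaller clip_norm = more robustness, less accuracy.
    """
    rng = rng or np.random.default_rng(0)
    clipped = []
    for u in updates:
        norm = np.linalg.norm(u)
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        clipped.append(u * scale)                 # shrink oversized updates
    avg = np.mean(clipped, axis=0)
    return avg + rng.normal(0.0, noise_std, size=avg.shape)  # noise blurs the backdoor

# toy usage: the second update is suspiciously large and gets clipped to norm 1
ups = [np.array([0.3, 0.4]), np.array([30.0, 40.0])]
agg = aggregate_with_defenses(ups, clip_norm=1.0, noise_std=0.0)
```

Clipping alone can be tuned around by a white-box attacker who keeps the malicious update just under the threshold; the added noise is what degrades a small, carefully hidden backdoor signal, which is also why it costs clean accuracy.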
In practice this means:
For every FL use case, a balance must be found between global model accuracy and robustness.
If malicious participants cannot be ruled out in an FL use case, adding noise when aggregating local models and reducing model capacity are two effective defenses. If malicious intentions can be ruled out, these mechanisms may be dispensed with.