We're now seeing client projects that use federated machine learning (ML). Traditionally, ML model training has required data to be placed in a centralized location where the relevant training algorithm can be run. From a privacy point of view, this is problematic, especially when the training data contains sensitive or personally identifiable information; users might be reluctant to share data or local data protection legislation may prevent us from moving data to a central location. Federated ML is a decentralized technique for training on a large and diverse set of data that allows the data to remain remote — for example, on a user's device. Network bandwidth and the computational limitations of devices still present significant technical challenges, but we like the way federated ML leaves users in control of their own personal information.
Model training generally requires collecting data from its source and transporting it to a centralized location where the model training algorithm runs. This becomes particularly problematic when the training data consists of personally identifiable information. We're encouraged by the emergence of federated learning as a privacy-preserving method for training on a large diverse set of data relating to individuals. Federated learning techniques allow the data to remain on the users' device, under their control, yet contribute to an aggregate corpus of training data. In one such technique, each user device updates a model independently; then the model parameters, rather than the data itself, are combined into a centralized view. Network bandwidth and device computational limitations present some significant technical challenges, but we like the way federated learning leaves users in control of their own personal information.