What is Federated Learning?
Federated learning is a machine learning approach in which AI models are trained across multiple decentralised devices or servers, each holding its own local data, without transferring raw data to a central location. This lets organisations build powerful models while preserving data privacy and complying with data sovereignty regulations.
Federated learning is a distributed machine learning technique in which a model is trained collaboratively across multiple devices or organisations without sharing the underlying raw data. Instead of collecting all data on a central server for training, the model is sent to where the data lives, learns from the local data, and sends only the learned model updates, never the raw data, back to a central coordinator.
This approach was pioneered by Google in 2016 for improving the predictive keyboard on Android phones. Rather than uploading every user's typing data to Google's servers, the keyboard model is sent to each phone, learns from that user's typing patterns locally, and sends only the model improvements back. The user's actual messages never leave their device.
For businesses in Southeast Asia, where data sovereignty regulations vary across countries and cross-border data transfer is increasingly restricted, federated learning offers a way to build AI models across markets without moving sensitive data across borders.
How Federated Learning Works
The federated learning process follows a cyclical pattern:
Step 1: Model Distribution
A central server sends the current global model to participating devices or organisations.
Step 2: Local Training
Each participant trains the model on their local data. This training produces updated model parameters that reflect what the model learned from that participant's data.
Step 3: Model Update Aggregation
Participants send only their model updates (not their data) back to the central server. The server aggregates these updates, typically by averaging them, to create an improved global model.
Step 4: Iteration
The improved global model is sent back to participants, and the cycle repeats. Over many rounds, the global model improves as if it had been trained on the combined data of all participants, without any participant sharing their raw data.
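The four steps above can be sketched in a few lines of plain Python. This is an illustrative toy, not a production system: three simulated participants fit a one-parameter linear model on private data, and the server aggregates by simple averaging. All names, data, and parameter values here are hypothetical.

```python
import numpy as np

# Toy sketch of the four-step cycle: three participants privately hold data
# generated from y = 3x + noise, and collaborate to learn the weight w.
rng = np.random.default_rng(0)
true_w = 3.0
local_data = []
for _ in range(3):
    x = rng.uniform(-1, 1, 50)
    local_data.append((x, true_w * x + rng.normal(0, 0.1, 50)))

def local_train(w, x, y, lr=0.1, epochs=5):
    """Step 2: train on local data, returning the updated parameter."""
    for _ in range(epochs):
        grad = np.mean(2 * (w * x - y) * x)  # gradient of mean squared error
        w -= lr * grad
    return w

w_global = 0.0
for _ in range(20):  # Step 4: repeat the cycle over many rounds
    # Step 1: the server distributes w_global; Step 2: clients train locally
    updates = [local_train(w_global, x, y) for x, y in local_data]
    # Step 3: the server aggregates the updates by averaging them
    w_global = float(np.mean(updates))

print(round(w_global, 1))  # approaches the true weight, 3.0
```

Note that at no point does any participant's `(x, y)` data leave its list entry; only the scalar updates travel to the "server" loop.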
Types of Federated Learning
Cross-Device Federated Learning
Training across many individual devices such as smartphones, IoT sensors, or edge devices. Each device has a small amount of data, and the system must handle thousands or millions of participants with variable connectivity.
Use cases: Mobile keyboard prediction, on-device recommendation systems, health monitoring wearables.
Cross-Silo Federated Learning
Training across a small number of organisations or institutions, each holding large datasets. The participants are typically reliable servers with stable connectivity.
Use cases: Healthcare consortia training diagnostic models across hospitals without sharing patient records, financial institutions collaborating on fraud detection without exposing transaction data, multinational companies training models across regional subsidiaries without transferring data across borders.
Why Federated Learning Matters for Business
Federated learning addresses several critical business challenges:
Data Privacy Compliance
Data protection regulations like Singapore's PDPA, Thailand's PDPA, Indonesia's PDP Law, and the Philippines' Data Privacy Act impose strict requirements on how personal data is collected, stored, and transferred. Federated learning minimises privacy risk by keeping data where it was collected. This is particularly valuable for businesses operating across multiple ASEAN markets with different regulatory frameworks.
Data Sovereignty
Several Southeast Asian countries require certain categories of data to remain within national borders. Federated learning enables multinational organisations to train unified AI models across their ASEAN operations without transferring data between countries, satisfying data localisation requirements while still benefiting from the combined intelligence of data across all markets.
Collaboration Without Exposure
Organisations in the same industry can collaboratively train better AI models without exposing their proprietary data to competitors. For example, multiple banks in Southeast Asia could train a superior fraud detection model together, each contributing patterns from their transaction data without revealing individual customer transactions.
Access to More Data
The quality of AI models generally improves with more training data. Federated learning enables access to broader and more diverse datasets that would be impossible to centralise due to privacy, regulatory, or competitive barriers.
Challenges and Limitations
Federated learning introduces complexities that organisations must address:
- Communication overhead: Transmitting model updates across networks, especially in cross-device scenarios, requires significant bandwidth and careful optimisation
- Non-uniform data: different participants may have very different data distributions (known as non-IID data), which makes model convergence more challenging than in centralised training
- Security considerations: While raw data is not shared, model updates can potentially be reverse-engineered to infer information about the training data. Techniques like differential privacy and secure aggregation address this risk
- Computational requirements: Each participant needs sufficient computing resources to train the model locally
- Coordination complexity: Managing training across distributed participants requires robust orchestration and fault tolerance
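To make the security point concrete, a participant can clip and noise its update before transmission, in the spirit of differential privacy. The sketch below is a simplified illustration with hypothetical parameter values, not a calibrated differential privacy mechanism.

```python
import numpy as np

# Illustrative sketch (not a production DP mechanism): before sending an
# update, a participant clips its norm and adds Gaussian noise, so that any
# single data point's influence on the shared update is bounded and masked.

def privatise_update(update, clip_norm=1.0, noise_std=0.1, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    norm = np.linalg.norm(update)
    # Clip: bound the sensitivity of the update
    if norm > clip_norm:
        update = update * (clip_norm / norm)
    # Noise: mask individual contributions
    return update + rng.normal(0.0, noise_std, size=update.shape)

raw = np.array([3.0, 4.0])  # norm 5.0, exceeds the clip bound
private = privatise_update(raw, rng=np.random.default_rng(0))
print(np.linalg.norm(raw), np.round(private, 2))
```

In a real deployment, `clip_norm` and `noise_std` would be chosen to meet a stated privacy budget rather than picked by hand as here.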
Implementing Federated Learning
For organisations considering federated learning:
- Identify the business case: Federated learning adds complexity, so it should be adopted when centralised training is not possible due to privacy, regulatory, or competitive constraints
- Choose a framework: Flower (flwr) is a widely used open-source federated learning framework, and PySyft by OpenMined focuses on privacy-preserving workflows. Other established open-source options include TensorFlow Federated from Google and NVIDIA FLARE
- Start with cross-silo scenarios: Training across a few organisational partners is significantly simpler than cross-device scenarios involving millions of endpoints
- Add privacy protections: Implement differential privacy and secure aggregation to protect against inference attacks on model updates
- Test with simulated federation first: Before deploying across real participants, simulate the federated setup to validate that your model converges properly with distributed data
- Establish governance frameworks: Define clear agreements between participating organisations about model ownership, update schedules, and dispute resolution
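Before involving real partners, the federation can be simulated locally, as the checklist above suggests. The sketch below is hypothetical: two unequal "silos" hold non-uniform data, each round weights updates by sample count (as the FedAvg algorithm prescribes), and the result is compared against a centralised baseline to confirm the model converges.

```python
import numpy as np

# Hypothetical convergence check: simulate the federation locally and compare
# the federated result against a centralised baseline. Two "silos" hold
# non-uniform data of unequal size; updates are weighted by sample count.

rng = np.random.default_rng(1)

# Silo A: many samples from one region of x; Silo B: fewer, from another.
silos = []
for size, low, high in [(400, -1.0, 0.0), (100, 0.0, 1.0)]:
    x = rng.uniform(low, high, size)
    y = 2.0 * x + rng.normal(0, 0.05, size)
    silos.append((x, y))

def local_fit(w, x, y, lr=0.1, epochs=10):
    for _ in range(epochs):
        w -= lr * np.mean(2 * (w * x - y) * x)
    return w

w_fed = 0.0
total = sum(len(x) for x, _ in silos)
for _ in range(30):
    updates = [local_fit(w_fed, x, y) for x, y in silos]
    # Weight each silo's update by its share of the total data (FedAvg)
    w_fed = sum(u * len(x) / total for u, (x, _) in zip(updates, silos))

# Centralised baseline: ordinary least squares on the pooled data
x_all = np.concatenate([x for x, _ in silos])
y_all = np.concatenate([y for _, y in silos])
w_central = np.sum(x_all * y_all) / np.sum(x_all * x_all)

print(round(w_fed, 2), round(w_central, 2))  # should agree closely
```

If the federated and centralised results diverge in simulation, that is a signal to revisit learning rates, round counts, or aggregation strategy before any real cross-organisation deployment.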
Federated learning is not yet mainstream for most SMBs, but for organisations operating across multiple jurisdictions with strict data regulations, or industries where data collaboration between competitors would create mutual value, it represents an increasingly practical approach to AI model development.
Federated learning matters to CEOs and CTOs because it unlocks AI model development in situations where data cannot be centralised, which is an increasingly common constraint as privacy regulations tighten across Southeast Asia. For multinational businesses operating across ASEAN, the alternative to federated learning is often accepting that data in each country trains only a local model, missing the opportunity to build superior AI that learns from patterns across all markets.
The most compelling business case is in regulated industries. Financial services companies in ASEAN could collaboratively train fraud detection models that identify cross-border fraud patterns no single institution can detect alone. Healthcare organisations could develop diagnostic AI trained on patient data from multiple hospitals without violating patient privacy. These collaborative models would be significantly better than anything a single organisation could build in isolation.
For business leaders evaluating federated learning, the key question is whether the value of training AI on distributed data justifies the additional technical complexity. If your data can be centralised without regulatory or competitive issues, traditional training is simpler and should be preferred. Federated learning is most valuable when centralisation is impossible but collaborative model development would deliver a meaningful competitive or operational advantage.
Key Takeaways
- Adopt federated learning only when centralised training is not feasible due to privacy, regulatory, or competitive constraints. It adds complexity that is not justified when data can be pooled directly.
- Start with cross-silo federated learning between a small number of partner organisations rather than attempting cross-device scenarios with many endpoints.
- Implement differential privacy and secure aggregation to protect against inference attacks that could reverse-engineer information from model updates.
- Establish clear legal and governance frameworks with all participating organisations before beginning federated training, covering data rights, model ownership, and liability.
- Test convergence with simulated federated setups before deploying across real participants to ensure the model trains effectively with distributed, non-uniform data.
- Consider the communication costs of federated learning, particularly for cross-border training across ASEAN markets with varying network quality.
- Evaluate whether a data clean room or trusted execution environment might solve your data collaboration needs more simply than full federated learning.
Frequently Asked Questions
Is federated learning truly private if model updates are shared?
Federated learning significantly improves privacy compared to centralised training because raw data never leaves its source. However, model updates can theoretically leak some information about the training data through techniques like gradient inversion attacks. To address this, federated learning is typically combined with additional protections: differential privacy adds mathematical noise to model updates to prevent individual data inference, and secure aggregation encrypts updates so the central server only sees the aggregated result, not individual contributions. With these protections, federated learning provides strong practical privacy guarantees.
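The secure aggregation idea in this answer can be illustrated with a toy pairwise-masking scheme. Real protocols also handle participant dropouts and derive masks via cryptographic key agreement; everything below is a simplified, hypothetical illustration of the cancellation trick alone.

```python
import numpy as np

# Toy secure aggregation by pairwise masking: each pair of participants
# agrees on a random mask; one adds it, the other subtracts it. Individual
# masked updates look random to the server, yet the masks cancel in the sum.

rng = np.random.default_rng(42)
updates = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([-1.0, 1.0])]
n = len(updates)

# masks[i][j] is the mask shared between participants i and j
masks = [[None] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        m = rng.normal(size=2)
        masks[i][j], masks[j][i] = m, -m  # i adds m, j subtracts it

masked = []
for i, u in enumerate(updates):
    m_total = sum(masks[i][j] for j in range(n) if j != i)
    masked.append(u + m_total)  # what the server actually receives

# The server sums the masked updates; the pairwise masks cancel out
aggregate = sum(masked)
print(np.round(aggregate, 6))  # equals the sum of the raw updates
```

The server learns the aggregate but, because each received vector carries random masks, cannot recover any individual participant's update from it.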
Can federated learning be used across companies in different ASEAN countries?
Yes, this is one of its most valuable applications for Southeast Asian businesses. Federated learning enables organisations across Singapore, Indonesia, Thailand, and other ASEAN markets to train collaborative AI models without transferring data across borders. This satisfies data sovereignty requirements while building models that learn from regional patterns. The key challenges are establishing legal frameworks between participants, managing network latency across countries, and handling non-uniform data distributions. Several industry consortia in banking and healthcare across ASEAN are exploring federated learning for exactly this purpose.
Is anonymising data and centralising it a simpler alternative to federated learning?
Data anonymisation and centralisation is simpler to implement but has significant limitations. Research has repeatedly shown that anonymised datasets can often be re-identified, especially when combined with other data sources. Some regulations, particularly the EU GDPR and emerging ASEAN frameworks, may not consider anonymised data sufficient for sensitive use cases. Federated learning avoids these risks entirely because raw data never moves. However, if your data can be effectively anonymised and your regulatory environment permits centralisation, it remains a valid and simpler approach for many use cases.
Need help implementing Federated Learning?
Pertama Partners helps businesses across Southeast Asia adopt AI strategically. Let's discuss how federated learning fits into your AI roadmap.