Decision tree methods in federated learning with FLEXible.

flex-trees

The flex-trees package is a set of tools and utilities for working with Decision Tree (DT) models in Federated Learning (FL). It is an extension of the FLEXible framework and is designed to be used with it.

flex-trees comes with some state-of-the-art decision tree models for federated learning. It also provides multiple tabular datasets to test the models.

The methods implemented in the repository are:

| Model | Description | Citation |
| --- | --- | --- |
| Federated ID3 | The ID3 model adapted to a federated learning scenario. | A Hybrid Approach to Privacy-Preserving Federated Learning |
| Federated Random Forest | The Random Forest (RF) model adapted to a federated learning scenario. Each client builds an RF locally, then N trees are randomly sampled from each client to form a global RF composed of the trees retrieved from the clients. | Federated Random Forests can improve local performance of predictive models for various healthcare applications |
| Federated Gradient Boosting Decision Trees | The Gradient Boosting Decision Trees model adapted to a federated learning scenario. A global hash table is first created to align the data between the clients without sharing it. After that, N trees (CART) are built by the clients iteratively: one client builds a tree, the tree is added to the ensemble, and the weights of the instances are updated so that the next client builds the next tree with the updated weights. | Practical Federated Gradient Boosting Decision Trees |
| Interpretable Client Decision Tree Aggregation For Federated Learning process (ICDTA4FL process) | Each client builds a decision tree locally, and the local trees are aggregated into a global tree by merging the rules extracted from them. The process is iterative: the clients build their trees, and the trees that surpass a threshold are selected to be merged. To merge the trees, they are transformed into rules, and the merged rules are then used to build a global tree. The process is tree independent, and the code is available for merging ID3, CART and C4.5 trees. | Interpretable Client Decision Tree Aggregation For Federated Learning process |
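The Federated Random Forest aggregation described above (sample N trees from each client, then combine them into one global forest) can be illustrated with a minimal sketch. This is not the flex-trees API: the names `sample_global_forest`, `predict_majority` and `make_stump` are hypothetical, a tree is modelled as any callable that maps a feature vector to a class label, and majority voting is assumed as the way the global forest predicts.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence

# Assumption: a "tree" is any callable mapping a feature vector to a class label.
Tree = Callable[[Sequence[float]], int]


def sample_global_forest(
    client_forests: Dict[str, List[Tree]], n_trees: int, seed: int = 0
) -> List[Tree]:
    """Randomly pick n_trees trees from every client's local forest."""
    rng = random.Random(seed)
    global_forest: List[Tree] = []
    for client_id in sorted(client_forests):
        forest = client_forests[client_id]
        # Never request more trees than the client actually has.
        k = min(n_trees, len(forest))
        global_forest.extend(rng.sample(forest, k))
    return global_forest


def predict_majority(forest: List[Tree], x: Sequence[float]) -> int:
    """Classify x by majority vote over the trees in the forest."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


def make_stump(feature: int, threshold: float) -> Tree:
    """Toy one-split tree standing in for a locally trained tree."""
    return lambda x: int(x[feature] > threshold)


# Two clients, each holding a small local "forest" of toy stumps.
client_forests = {
    "client_a": [make_stump(0, t) for t in (0.2, 0.4, 0.6)],
    "client_b": [make_stump(1, t) for t in (0.3, 0.5, 0.7)],
}

global_forest = sample_global_forest(client_forests, n_trees=2)
print(len(global_forest))                           # 2 trees from each of 2 clients
print(predict_majority(global_forest, [0.9, 0.9]))  # every stump votes for class 1
```

Only the trained trees leave the clients in this sketch; the raw data stays local, which is the point of the federated setting.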

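Two steps of the ICDTA4FL process, selecting the local trees that surpass a threshold and transforming them into rules, can be sketched as follows. Everything here is an assumption made for illustration: the nested-dict tree encoding, the function names `tree_to_rules` and `select_and_pool`, and the per-client scores are hypothetical, and pooling the rules into one list is a simple stand-in for the merging step, whose details are not described here. Building the global tree from the merged rules is not shown.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical tree encoding: a leaf is a class label (str); an internal node is
# {"feature": name, "threshold": value, "left": subtree, "right": subtree},
# where "left" covers samples with feature <= threshold.
Node = Union[str, Dict]
Condition = Tuple[str, str, float]        # (feature, operator, threshold)
Rule = Tuple[Tuple[Condition, ...], str]  # (conditions, predicted label)


def tree_to_rules(node: Node, path: Tuple[Condition, ...] = ()) -> List[Rule]:
    """Turn every root-to-leaf path of the tree into one rule."""
    if not isinstance(node, dict):
        return [(path, node)]
    feat, thr = node["feature"], node["threshold"]
    left = tree_to_rules(node["left"], path + ((feat, "<=", thr),))
    right = tree_to_rules(node["right"], path + ((feat, ">", thr),))
    return left + right


def select_and_pool(
    client_trees: Dict[str, Node], scores: Dict[str, float], threshold: float
) -> List[Rule]:
    """Keep the trees whose score surpasses the threshold and pool their rules."""
    pooled: List[Rule] = []
    for client_id in sorted(client_trees):
        if scores[client_id] > threshold:
            pooled.extend(tree_to_rules(client_trees[client_id]))
    return pooled


tree_a = {
    "feature": "age", "threshold": 30.0,
    "left": "no",
    "right": {"feature": "hours", "threshold": 40.0, "left": "no", "right": "yes"},
}
tree_b = {"feature": "hours", "threshold": 35.0, "left": "no", "right": "yes"}

# Hypothetical evaluation scores; only client "a" surpasses the threshold.
scores = {"a": 0.82, "b": 0.55}
rules = select_and_pool({"a": tree_a, "b": tree_b}, scores, threshold=0.6)
for conditions, label in rules:
    print(conditions, "->", label)
```

Because the rules only depend on root-to-leaf paths, the same extraction applies to any tree learner that produces threshold splits, which is consistent with the process being tree independent.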
The tabular datasets available in the repository are:

| Dataset | Description | Citation |
| --- | --- | --- |
| Adult | Demographic information about people; the task is to predict whether a person's income is greater than 50K. | UCI Machine Learning Repository |
| Breast Cancer | Information about breast cancer cases; the task is to predict whether the cancer is benign or malignant. | UCI Machine Learning Repository |
| Credit Card | Information about credit card transactions; the task is to predict whether a transaction is fraudulent. | Kaggle |
| ILPD | Information about Indian liver patients; the task is to predict whether a patient has liver disease. | UCI Machine Learning Repository |
| Nursery | Information about nursery applications; the task is to predict their acceptability. | UCI Machine Learning Repository |
| Bank Marketing | Information about bank marketing campaigns; the task is to predict whether the client will subscribe to a term deposit. | UCI Machine Learning Repository |
| Magic Gamma | Information about events recorded by the MAGIC gamma telescope; the task is to predict whether an event is signal (gamma) or background. | UCI Machine Learning Repository |

Tutorials

To get started with flex-trees, check the notebooks available in the repository.

Installation

We recommend Anaconda/Miniconda as the package manager. The table below lists the corresponding flex and flex-trees versions and the supported Python versions.

| flex | flex-trees | Python |
| --- | --- | --- |
| main / nightly | main / nightly | >=3.8, <=3.11 |
| v0.6.0 | v0.1.0 | >=3.8, <=3.11 |
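Before installing, it can help to confirm that the active interpreter falls inside the supported range. The sketch below is an illustration, not part of flex-trees: `is_supported` is a hypothetical helper, and it assumes that `<=3.11` covers every 3.11.x patch release, so only the major and minor version numbers are compared.

```python
import sys
from typing import Tuple

# Supported range taken from the compatibility table: >=3.8, <=3.11.
# Assumption: any 3.11.x patch release counts as supported.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 11)


def is_supported(version: Tuple[int, int]) -> bool:
    """Return True if (major, minor) lies within the supported range."""
    return MIN_SUPPORTED <= version <= MAX_SUPPORTED


current = (sys.version_info[0], sys.version_info[1])
if is_supported(current):
    print(f"Python {current[0]}.{current[1]} is within the supported range.")
else:
    print(f"Python {current[0]}.{current[1]} is outside the supported range (3.8 to 3.11).")
```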

To install the package, you can use the following commands:

Using pip:

pip install flextrees

Download the repository and install it locally:

git clone git@github.com:FLEXible-FL/flex-trees.git
cd flex-trees
pip install -e .
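After either installation route, a quick way to check that the package is visible to the current environment is to query its installed version through the standard library. This is a minimal sketch: `installed_version` is a hypothetical helper, and it assumes the distribution is registered under the name used in the pip command above (`flextrees`).

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the distribution name matches the one used with pip.
version = installed_version("flextrees")
if version is None:
    print("flextrees is not installed in this environment.")
else:
    print(f"flextrees {version} is installed.")
```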

Citation

If you use this package, please cite the following paper:

  title={FLEX: FLEXible Federated Learning Framework},
  author={Herrera, Francisco and Jim{\'e}nez-L{\'o}pez, Daniel and Argente-Garrido, Alberto and Rodr{\'\i}guez-Barroso, Nuria and Zuheros, Cristina and Aguilera-Martos, Ignacio and Bello, Beatriz and Garc{\'\i}a-M{\'a}rquez, Mario and Luz{\'o}n, M},
  journal={arXiv preprint arXiv:2404.06127},
  year={2024}
}