Elastic Deep Learning

Elastic Deep Learning (EDL) is an LF AI Foundation incubation project designed to help deep learning cloud service providers build cluster cloud services using deep learning frameworks such as PaddlePaddle. The project was originally developed and open-sourced by Baidu and is licensed under the Apache 2.0 license. EDL includes a Kubernetes controller, a PaddlePaddle auto-scaler that adjusts the number of processes in a distributed job to match the idle hardware resources in the cluster, and a new fault-tolerant architecture.
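
To illustrate the auto-scaling idea only, the Python sketch below hands idle process slots to running jobs one at a time until the slots run out or every job reaches its limit. The `Job` class, the `scale_up` function, and the round-robin policy are hypothetical names and choices made for this example; they are not taken from the EDL code base and do not describe its actual scheduling algorithm.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int      # processes currently assigned to the job
    max_procs: int  # most processes the job can make use of


def scale_up(jobs, idle_slots):
    """Return a {job name: process count} plan after handing out idle slots.

    Slots are given out round-robin, one per job per pass, and a job
    never receives more than its max_procs. (Illustrative policy only.)
    """
    jobs = sorted(jobs, key=lambda j: j.name)  # deterministic order
    plan = {j.name: j.procs for j in jobs}
    progress = True
    while idle_slots > 0 and progress:
        progress = False  # stop once a full pass assigns nothing
        for j in jobs:
            if idle_slots == 0:
                break
            if plan[j.name] < j.max_procs:
                plan[j.name] += 1
                idle_slots -= 1
                progress = True
    return plan


if __name__ == "__main__":
    jobs = [Job("a", 2, 4), Job("b", 3, 3), Job("c", 2, 8)]
    print(scale_up(jobs, idle_slots=5))
```

A real controller would also need to shrink jobs when resources are reclaimed and to respect a per-job minimum; this sketch covers only the scale-up direction.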

Efficiency

Provides parallelism strategies to minimize adjustment overheads.

Consistency

Accuracy is verified on multiple models against runs without scaling.

Flexibility

Any component can be killed, or can join, at any time.

Easy to Use

Only a few lines of code need to be added to support EDL.

Open Source

Please visit us on GitHub, where our development happens. We invite you to join our community both as a user of EDL and as a contributor to its development. We look forward to your contributions!

Join the Conversation

EDL maintains three primary mailing lists. You are invited to join the ones that best meet your interests.

EDL Announce Mailing List: Top-level milestone messages and announcements

EDL Technical Discussion: Technical discussions

EDL Technical Steering Committee: Technical governance discussions