Early-exit network
Class of dynamic neural networks
From Wikipedia, the free encyclopedia
Early-exit networks are a class of dynamic neural networks designed for efficient inference by allowing models to make confident predictions at intermediate layers, rather than processing the full network.[1]
Early-exit mechanisms are methods for deep neural networks that add intermediate classifiers, allowing inference to stop at earlier layers for inputs whose uncertainty is assessed to be low. Decisions to exit are typically based on confidence measures such as softmax-derived scores, classification margins, or entropy-based criteria, with the goal of reducing computational cost. These approaches are commonly paired with specialized training procedures and system-level optimizations to improve efficiency while preserving accuracy.[2]
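An entropy-based exit rule of the kind described above can be sketched as follows. This is a minimal illustration, not an implementation from the cited works: the function names (`early_exit_infer`), the entropy threshold value, and the representation of network stages as plain callables are all assumptions made for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def early_exit_infer(x, blocks, heads, max_entropy=0.5):
    """Run the blocks in sequence; after each block, its intermediate
    head produces class probabilities, and inference stops as soon as
    the prediction is confident enough (entropy below the threshold).
    Returns (predicted class, index of the exit used)."""
    h = x
    probs = None
    for i, (block, head) in enumerate(zip(blocks, heads)):
        h = block(h)
        probs = softmax(head(h))
        if entropy(probs) < max_entropy:
            return probs.index(max(probs)), i  # confident: exit early
    return probs.index(max(probs)), len(blocks) - 1  # fell through to final exit
```

For example, with toy "blocks" that merely rescale their input, an ambiguous input passes the first exit and is classified at the second, while a clearly separable input exits at the first head without running the rest of the network.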
The main idea behind the technique is to stop computation as soon as a sufficiently confident prediction can be made, which can save both computation and time.[3][4]
Early-exit networks have also been extended with expert-based exit criteria, where intermediate classifiers are treated as multiple “experts” whose predictions and confidence scores can be aggregated to decide whether to stop computation early.[5]
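One way such expert aggregation could work is to combine the probability vectors of the exits reached so far and stop once the aggregated top-class probability clears a threshold. The sketch below uses simple averaging as the aggregation rule; the function name, the averaging scheme, and the confidence threshold are illustrative assumptions, not the specific criterion of the cited work.

```python
def aggregate_experts(expert_probs, min_confidence=0.9):
    """Treat each intermediate classifier as an 'expert' voting with its
    class-probability vector. Walk the exits in order, average the votes
    seen so far, and stop once the aggregated top-class probability
    reaches min_confidence. Returns (predicted class, exit index used)."""
    n_classes = len(expert_probs[0])
    running = [0.0] * n_classes
    avg = running
    for i, probs in enumerate(expert_probs):
        running = [r + p for r, p in zip(running, probs)]
        avg = [r / (i + 1) for r in running]  # mean vote over exits 0..i
        if max(avg) >= min_confidence:
            return avg.index(max(avg)), i  # consensus reached: stop here
    return avg.index(max(avg)), len(expert_probs) - 1  # used every exit
```

Here a single very confident early expert triggers an immediate exit, whereas a borderline first expert defers the decision until a later expert pushes the aggregate over the threshold.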
Hardware implementations are also being developed.[6]