Inproceedings,

Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers.

ICLR, OpenReview.net (2023)
