Inproceedings,

Edge Intelligence Optimization for Large Language Model Inference with Batching and Quantization.

WCNC, pages 1-6. IEEE, 2024.
