STRONGHOLD: Fast and Affordable Billion-scale Deep Learning Model Training

Towards Scalable Supercomputing Resource Management

AIACC-Training: Optimizing Distributed Deep Learning Training through Multi-streamed and Concurrent Gradient Communications

Automating Reinforcement Learning Architecture Design for Code Optimization

Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding

Parallelizing and Balancing Large-scale Coupled DSMC/PIC Based Particle Simulations

Explicitly Modeling Importance and Coherence for Timeline Summarization

Optimizing Sparse Matrix Multiplications for Graph Neural Networks

RISE: Robust Wireless Sensing using Probabilistic and Statistical Assessments

LibShalom: Optimizing Small and Irregular-shaped Matrix Multiplications on ARMv8 Multi-Cores