Jiang, B., Cheng, X., Li, Y., Fu, S., Yang, Q., Liu, M., Olvera, A. (2023). Output-Directed Dynamic Quantization for DNN Acceleration. International Conference on Parallel Processing (ICPP). 645-654.
Jiang, B., Cheng, X., Tang, S., Ma, X., Gu, Z., Fu, S., Yang, Q., Liu, M. (2022). MLCNN: Cross-Layer Cooperative Optimization and Accelerator Architecture for Speeding Up Deep Learning Applications. International Parallel and Distributed Processing Symposium (IPDPS). 1184-1194. https://ieeexplore.ieee.org/abstract/document/9820611
Gu, Z., Tang, S., Jiang, B., Huang, S., Guan, Q., Fu, S. (2021). Characterizing Job-Task Dependency in Cloud Workloads Using Graph Learning. 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). https://ieeexplore.ieee.org/abstract/document/9460684
Jiang, B., Cheng, X., Tang, S., Ma, X., Gu, Z., Zhao, H., Fu, S. (2021). APCNN: Explore Multi-Layer Cooperation for CNN Optimization and Acceleration on FPGA. The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. https://dl.acm.org/doi/abs/10.1145/3431920.3439461
Cheng, X., Zhao, H., Kandemir, M., Jiang, B., Mehta, G. (2020). AMOEBA: a coarse-grained reconfigurable architecture for dynamic GPU scaling. The 34th ACM International Conference on Supercomputing. https://dl.acm.org/doi/abs/10.1145/3392717.3392738
Cheng, X., Zhao, H., Kandemir, M., Mohanty, S., Jiang, B. (2020). Alleviating Bottlenecks for DNN Execution on GPUs via Opportunistic Computing. 2020 21st International Symposium on Quality Electronic Design (ISQED). https://ieeexplore.ieee.org/abstract/document/9136967
Cheng, X., Zhao, Y., Robaei, M., Jiang, B., Zhao, H., Fang, J. (2019). A low-cost and energy-efficient NoC architecture for GPGPUs. ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).