Liu, Z., Zhang, S., Garrigus, J., Zhao, H. (2023). Genomics-GPU: A Benchmark Suite for GPU-Accelerated Genome Analysis. 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2023), Raleigh, North Carolina, April 23-25, 2023. 178-188.
Ho, K., Zhao, H., Jog, A., Mohanty, S. (2022). Improving GPU Throughput Through Parallel Execution Using Tensor Cores and CUDA Cores. 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI).
Liu, Z., Exley, T., Geek, A., Yang, R., Zhao, H., Albert, M. (2022). Predicting GPU Performance and System Parameter Configuration Using Machine Learning. 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI).
Kandemir, M. T., Tang, X., Zhao, H., Ryoo, J., Karakoy, M. (2021). Distance-in-time versus distance-in-space. Association for Computing Machinery, New York, NY, United States.
Jiang, B., Cheng, X., Tang, S., Ma, X., Gu, Z., Zhao, H., Fu, S. (2021). APCNN: Explore Multi-Layer Cooperation for CNN Optimization and Acceleration on FPGA. The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. https://dl.acm.org/doi/abs/10.1145/3431920.3439461
Cui, Y., Prabhakar, S., Zhao, H., Mohanty, S., Fang, J. (2020). A Low-Cost Conflict-Free NoC Architecture for Heterogeneous Multicore Systems (ISVLSI).
Cheng, X., Zhao, H., Kandemir, M., Jiang, B., Mehta, G. (2020). AMOEBA: a coarse-grained reconfigurable architecture for dynamic GPU scaling. The 34th ACM International Conference on Supercomputing. https://dl.acm.org/doi/abs/10.1145/3392717.3392738
Kandemir, M., Ryoo, J., Zhao, H., Jung, M., Karakoy, M. (2020). Collective Affinity Aware Computation Mapping (PACT).
Fang, J., Zhang, J., Lu, S., Zhao, H. (2020). Exploration on Task Scheduling Strategy for CPU-GPU Heterogeneous Computing System (ISVLSI).
Cheng, X., Zhao, H., Kandemir, M., Mohanty, S., Jiang, B. (2020). Alleviating Bottlenecks for DNN Execution on GPUs via Opportunistic Computing. 2020 21st International Symposium on Quality Electronic Design (ISQED). https://ieeexplore.ieee.org/abstract/document/9136967
Fang, J., Zhou, K., Zhao, H. (2019). Dynamic block size adjustment and workload balancing strategy based on CPU-GPU heterogeneous platform. ISPA 2019.
Tang, X., Kandemir, M., Zhao, H., Jung, M., Karakoy, M. (2019). Computing with Near Data (abstract). SIGMETRICS (Abstracts) 2019: 27-28.
Zhang, L., Cheng, X., Zhao, H., Mohanty, S., Fang, J. (2019). Exploration of System Configuration in Effective Training of CNNs on GPGPUs. 2019 IEEE International Conference on Consumer Electronics (ICCE). 1-4.
Cheng, X., Zhao, H., Mohanty, S., Fang, J. (2019). Improving GPU NoC Power Efficiency through Dynamic Bandwidth Allocation. 2019 IEEE International Conference on Consumer Electronics (ICCE). 1-4.
Cheng, X., Zhao, Y., Robaei, M., Jiang, B., Zhao, H., Fang, J. (2019). A Low-Cost and Energy-Efficient NoC Architecture for GPGPUs. ANCS.
Cheng, Y., Fang, J., Zhao, H. (2019). A Congestion-adaptive Fault-tolerant Routing Algorithm on HNoC. IEEE Cyber.
Zhao, H., Cheng, X., Mohanty, S., Fang, J. (2018). Designing Scalable Hybrid Wireless NoC for GPGPUs. 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). 703-708.
Fang, J., Chang, Z., Cheng, Y., Zhao, H. (2018). Exploration on Routing Configuration of HNoC with Reasonable Energy Consumption. 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). 744-749.
Cheng, X., Zhao, Y., Zhao, H., Xie, Y. (2018). Packet Pump: Overcoming Network Bottleneck in On-Chip Interconnects for GPGPUs. Design Automation Conference. 84:1-84:6.
Sharifi, A., Ding, W., Guttman, D., Zhao, H., Tang, X., Kandemir, M., Das, C. (2017). DEMM: A Dynamic Energy-Saving Mechanism for Multicore Memories. MASCOTS 2017: 210-220.
Fang, J., Zhang, J., Lu, S., Zhao, H., Zhang, D., Cui, Y. (2022). Task Scheduling Strategy for Heterogeneous Multicore Systems.
Fang, J., Lu, J., Wang, M., Zhao, H. (2019). A Performance Conserving Approach for Reducing Memory Power Consumption in Multi-Core Systems. Journal of Circuits, Systems, and Computers 28(7): 1950113:1-1950113:16. DOI: 10.1142/S0218126619501135.
Cheng, X., Zhao, H., Kandemir, M., Mohanty, S., Jiang, B. (2019). Alleviating Bottlenecks for DNN Execution on GPUs via Opportunistic Computing. Other.
Robaei, M., Zhao, H. (2019). Broadcast-Based Hybrid Wired-Wireless Network-on-Chip for GPGPUs. IEEE Consumer Electronics Magazine 8(6): 62-67.
Fang, J., Hao, X., Fan, Q., Li, K., Zhao, H. (2019). Efficient Data Transfer in a Heterogeneous Multicore-Based CE Systems using Cache Performance Optimization. IEEE Consumer Electronics Magazine 8(5): 46-50.
Tang, X., Kandemir, M., Zhao, H., Jung, M., Karakoy, M. (2018). Computing with Near Data. POMACS 2(3): 42:1-42:30.