Publications
Journal and conference papers
- T. Ji, N. Balasubramanian, M. Ferdman, and P. Milder, “Enabling Efficient SpMM for Sparse Attention on GEMM-Optimized Hardware with Block Aggregation”, FPGA'26 [Github]
- M. Treviso et al., “Efficient Methods for Natural Language Processing: A Survey”, TACL 2023 [arXiv ver.]
- T. Ji, S. Jain, M. Ferdman, P. Milder, H. A. Schwartz, and N. Balasubramanian, “On the Distribution and Sparsity of Attention within Transformers”, Findings of ACL'21 [arXiv ver.][Github]
- Y. Shen, T. Ji, M. Ferdman, and P. Milder, “Argus: An End-to-End Framework for Accelerating CNNs on FPGAs”, IEEE Micro
- Y. Shen, T. Ji, M. Ferdman, and P. Milder, “Medusa: A Scalable Interconnect for Many-Port DNN Accelerators and Wide DRAM Controller Interfaces”, FPL'18 [arXiv ver.]
Thesis
T. Ji, “Accelerating Sparse Attention for Large Language Models on GEMM-Optimized Hardware”, Ph.D. Thesis, Stony Brook University, 2026