Publications

Journal and conference papers

T. Ji, N. Balasubramanian, M. Ferdman, and P. Milder, “Enabling Efficient SpMM for Sparse Attention on GEMM-Optimized Hardware with Block Aggregation”, FPGA'26 [GitHub]
M. Treviso et al., “Efficient Methods for Natural Language Processing: A Survey”, TACL 2023 [arXiv ver.]
T. Ji, S. Jain, M. Ferdman, P. Milder, H. A. Schwartz, and N. Balasubramanian, “On the Distribution and Sparsity of Attention within Transformers”, Findings of ACL'21 [arXiv ver.] [GitHub]
Y. Shen, T. Ji, M. Ferdman, and P. Milder, “Argus: An End-to-End Framework for Accelerating CNNs on FPGAs”, IEEE Micro
Y. Shen, T. Ji, M. Ferdman, and P. Milder, “Medusa: A Scalable Interconnect for Many-Port DNN Accelerators and Wide DRAM Controller Interfaces”, FPL'18 [arXiv ver.]

Thesis

T. Ji, “Accelerating Sparse Attention for Large Language Models on GEMM-Optimized Hardware”, Ph.D. Thesis, Stony Brook University, 2026