Spring 2025 DAIS Qual Reading List

The written exam of the Spring 2025 DAIS Qual will be held on Monday, March 3, 2025, from 1pm to 5pm in Room 2406, Siebel Center.

This reading list consists of multiple topic sections, each containing 2-3 papers. The questions in the written exam will be based on the papers listed here, with 1-2 questions per section: if a section has two papers, you can usually expect one question on it in the exam, while if a section has three papers, you can usually expect two. You only need to answer four of those questions, so there is no need to read every paper. Instead, browse through the list, identify up to four sections whose papers you are most familiar with or most comfortable reading, and focus on reading and digesting those papers. In general, some sections will likely be closer to your interests or background than others, and you can concentrate on the few sections closest to your research interests. We will ask all qual exam participants to submit the four sections they have chosen and will ensure that questions are designed based on those selected sections.

Section 1

  • Daniel Rothchild, Ashwinee Panda, Enayat Ullah, Nikita Ivkin, Ion Stoica, Vladimir Braverman, Joseph Gonzalez, Raman Arora. FetchSGD: Communication-Efficient Federated Learning with Sketching, ICML, 2020 https://arxiv.org/abs/2007.07682
  • Z. Song, Y. Wang, Z. Yu, and L. Zhang. Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability. ICML, 2023. https://arxiv.org/abs/2210.08371
  • M. Shrivastava, B. Isik, Q. Li, S. Koyejo, and A. Banerjee. Sketching for Distributed Deep Learning: A Sharper Analysis. NeurIPS, 2024.

Section 2

Section 3

Section 4

  • Edwards, Carl and Lai, Tuan and Ros, Kevin and Honke, Garrett and Cho, Kyunghyun and Ji, Heng. 2022. Translation between Molecules and Natural Language. Proc. The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP2022). https://blender.cs.illinois.edu/paper/molt5.pdf
  • Wang, Hongwei and Li, Weijiang and Jin, Xiaomeng and Cho, Kyunghyun and Ji, Heng and Han, Jiawei and Burke, Martin. 2022. Chemical-Reaction-aware Molecule Representation Learning. Proc. The International Conference on Learning Representations (ICLR2022). https://blender.cs.illinois.edu/paper/moleculerepresentation2022.pdf
  • Jin, Bowen and Liu, Gang and Han, Chi and Jiang, Meng and Ji, Heng and Han, Jiawei. 2024. Large Language Models on Graphs: A Comprehensive Survey. IEEE Transactions on Knowledge and Data Engineering. https://arxiv.org/pdf/2312.02783

Section 5

Section 6

Section 7

Section 8  

Section 9

  • Jin, Q., Wang, Z., Floudas, C.S. et al. Matching patients to clinical trials with large language models. Nat Commun 15, 9074 (2024). https://doi.org/10.1038/s41467-024-53081-z
  • Theodorou, Brandon, Benjamin Danek, Venkat Tummala, Shivam Pankaj Kumar, Bradley Malin, and Jimeng Sun. 2024. “FairPlay: Improving Medical Machine Learning Models with Generative Balancing for Equity and Excellence.” To appear in npj Digital Medicine. https://doi.org/10.21203/rs.3.rs-5252769/v1.
  • Wang, Hanyin, Chufan Gao, Bolun Liu, Qiping Xu, Guleid Hussein, Mohamad El Labban, Kingsley Iheasirim, Hariprasad Korsapati, Chuck Outcalt, and Jimeng Sun. 2024. “Adapting Open-Source Large Language Models for Cost-Effective, Expert-Level Clinical Note Generation with On-Policy Reinforcement Learning.” arXiv [cs.CL]. http://arxiv.org/abs/2405.00715.

Section 10

  • Zhining Liu, Ruizhong Qiu, Zhichen Zeng, Yada Zhu, Hendrik F. Hamann, Hanghang Tong: AIM: Attributing, Interpreting, Mitigating Data Unfairness. KDD 2024: 2014-2025. https://dl.acm.org/doi/10.1145/3637528.3671797
  • Huan He, Owen Queen, Teddy Koker, Consuelo Cuevas, Theodoros Tsiligkaridis, and Marinka Zitnik. 2023. Domain adaptation for time series under feature and label shifts. In International Conference on Machine Learning. PMLR, 12746–12774. https://arxiv.org/abs/2302.03133
  • Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, Mingsheng Long: iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. arXiv:2310.06625. https://arxiv.org/abs/2310.06625

Section 11

Section 12

Section 13

  • Zhengbao Jiang, Frank Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan, Graham Neubig, “Active Retrieval Augmented Generation”, EMNLP-2023, https://aclanthology.org/2023.emnlp-main.495.pdf 
  • Yunyi Zhang, Ruozhen Yang, Xueqiang Xu, Rui Li, Jinfeng Xiao, Jiaming Shen, Jiawei Han, “TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision”, WWW-2025, https://hanj.cs.illinois.edu/pdf/www25_yyzhang.pdf 

Section 14

  • Shen, S., Zhang, C., Bialkowski, A., Chen, W., and Xu, M. (2024). CaMU: Disentangling causal effects in deep model unlearning. In Proceedings of the 2024 SIAM International Conference on Data Mining (SDM), pages 779–787. (https://epubs.siam.org/doi/pdf/10.1137/1.9781611978032.89)
  • Rohekar, R. Y., Gurwicz, Y., and Nisimov, S. (2024). Causal interpretation of self-attention in pre-trained transformers. Advances in Neural Information Processing Systems, 36. (https://arxiv.org/pdf/2412.07446)

Section 15

  • You, J., Gomes-Selman, J. M., Ying, R., & Leskovec, J. (2021, May). Identity-aware graph neural networks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 35, No. 12, pp. 10737-10745). (https://arxiv.org/abs/2101.10320)
  • You, J., Du, T., & Leskovec, J. (2022, August). ROLAND: Graph learning framework for dynamic graphs. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 2358-2366). (https://arxiv.org/abs/2208.07239)