The written exam of the DAIS Qual Exam in Fall 2022 will be held on Monday, Oct. 3, 2022, from 1pm to 5pm in Siebel Center room 3401.
This reading list consists of multiple topic sections, each containing 2-3 papers. The questions in the written exam will be based on the papers listed here, with 1-2 questions related to each section. That is, if a section has two papers, you can usually expect to see one question related to the section in the qual exam, while if a section has three papers, you can usually expect to see two questions related to the section.
Section 1
- Alistair Moffat, Joel Mackenzie, Paul Thomas, and Leif Azzopardi. 2022. A Flexible Framework for Offline Effectiveness Metrics. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22). Association for Computing Machinery, New York, NY, USA, 578–587. https://doi.org/10.1145/3477495.3531924 PDF file: http://www.library.illinois.edu/proxy/go.php?url=https://doi.org/10.1145/3477495.3531924
- Fernando Diaz and Andres Ferraro. 2022. Offline Retrieval Evaluation Without Evaluation Metrics. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22). Association for Computing Machinery, New York, NY, USA, 599–609. https://doi.org/10.1145/3477495.3532033 PDF file: http://www.library.illinois.edu/proxy/go.php?url=https://doi.org/10.1145/3477495.3532033
Section 2
- Ahmed Alaa and Mihaela van der Schaar. 2020. Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions. In ICML. https://arxiv.org/abs/2007.13481
- Jian Kang, Qinghai Zhou, Hanghang Tong: JuryGCN: Quantifying Jackknife Uncertainty on Graph Convolutional Networks. KDD 2022: 742-752. http://jiank2.web.illinois.edu/files/kdd22/kang22jurygcn.pdf
- Luca Franceschi, Mathias Niepert, Massimiliano Pontil, and Xiao He. 2019. Learning discrete structures for graph neural networks. In ICML. PMLR, 1972–1982. https://arxiv.org/abs/1903.11960
Section 3
- Ziad Obermeyer, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464):447–453, 2019. (DOI: 10.1126/science.aax2342)
- Rediet Abebe, Solon Barocas, Jon Kleinberg, Karen Levy, Manish Raghavan, and David G. Robinson. Roles for computing in social change. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, FAT* ’20, pages 252–260, New York, NY, USA, 2020. Association for Computing Machinery. (https://dl.acm.org/doi/abs/10.1145/3351095.3372871)
Section 4
- Tamari, Ronen and Shani, Chen and Hope, Tom and Petruck, Miriam R L and Abend, Omri and Shahaf, Dafna. 2020. Language (Re)modelling: Towards Embodied Language Understanding. ACL2020. https://aclanthology.org/2020.acl-main.559
- Kolluru, Keshav and Mohammed, Muqeeth and Mittal, Shubham and Chakrabarti, Soumen and Mausam. 2022. Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction. ACL2022. https://aclanthology.org/2022.acl-long.179
- Sundriyal, Megha and Malhotra, Ganeshan and Akhtar, Md Shad and Sengupta, Shubhashis and Fano, Andrew and Chakraborty, Tanmoy. 2022. Document Retrieval and Claim Verification to Mitigate COVID-19 Misinformation. ACL2022. https://aclanthology.org/2022.constraint-1.8
Section 5
- Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang and Jiawei Han, “Topic Discovery via Latent Space Clustering of Language Model Embeddings”, in Proc. The ACM Web Conf. 2022 (WWW’22), April 2022
- Jiaming Shen, Yunyi Zhang, Heng Ji and Jiawei Han, “Corpus-based Open-Domain Event Type Induction”, in Proc. 2021 Conf. on Empirical Methods in Natural Language Processing (EMNLP’21), Nov. 2021
Section 6
- Fu, Tianfan, Cao Xiao, Cheng Qian, Lucas M. Glass, and Jimeng Sun. 2021. “Probabilistic and Dynamic Molecule-Disease Interaction Modeling for Drug Discovery.” In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 404–14. KDD ’21. https://dl.acm.org/doi/pdf/10.1145/3447548.3467286
- Fu, Tianfan, Kexin Huang, Cao Xiao, Lucas M. Glass, and Jimeng Sun. 2022. “HINT: Hierarchical Interaction Network for Trial Outcome Prediction Leveraging Web Data.” Cell Patterns and also at arXiv [cs.CY]. arXiv. http://arxiv.org/abs/2102.04252 .
- Huang, Kexin, Cao Xiao, Lucas M. Glass, and Jimeng Sun. 2020. “MolTrans: Molecular Interaction Transformer for Drug–target Interaction Prediction.” Bioinformatics, October. https://doi.org/10.1093/bioinformatics/btaa880.
Section 7
- Chockchowwat, Supawit, Chaitanya Sood, and Yongjoo Park. “Airphant: Cloud-oriented Document Indexing.” 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2022. https://arxiv.org/pdf/2112.13323.pdf
- Chockchowwat, Supawit, Wenjie Liu, and Yongjoo Park. “Automatically Finding Optimal Index Structure.” arXiv preprint arXiv:2208.03823 (2022). https://arxiv.org/pdf/2208.03823.pdf
Section 8
- J. Negrea, M. Haghifam, G. K. Dziugaite, A. Khisti, and D. M. Roy (2019). “Information-theoretic generalization bounds for SGLD via data-dependent estimates,” NeurIPS, 2019. https://papers.nips.cc/paper/2019/file/05ae14d7ae387b93370d142d82220f1b-Paper.pdf
- A. Banerjee, T. Chen, X. Li, and Y. Zhou (2022), “Stability Based Generalization Bounds for Exponential Family Langevin Dynamics,” ICML, 2022. https://proceedings.mlr.press/v162/banerjee22a/banerjee22a.pdf
Section 9
- Zhuangdi Zhu, Junyuan Hong, Jiayu Zhou: Data-Free Knowledge Distillation for Heterogeneous Federated Learning. ICML 2021: 12878-12889
- Jun Wu, Jingrui He: Domain Adaptation with Dynamic Open-Set Targets. KDD 2022: 2039-2049
- Ekdeep Singh Lubana, Chi Ian Tang, Fahim Kawsar, Robert P. Dick, Akhil Mathur: Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering. ICML 2022: 14461-14484
Section 10
- Hoyeop Lee, Jinbae Im, Seongwon Jang, Hyunsouk Cho, Sehee Chung: MeLU: Meta-Learned User Preference Estimator for Cold-Start Recommendation. KDD 2019: 1073-1082 PDF: https://arxiv.org/abs/1908.00413
- Patrick S. H. Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020 PDF file: https://arxiv.org/abs/2005.11401
- Abram Handler, Brendan T. O’Connor: Relational Summarization for Corpus Analysis. NAACL-HLT 2018: 1760-1769 PDF file: https://aclanthology.org/N18-1159.pdf