The written exam of the DAIS Qual Exam in Spring 2023 will be held on Monday, Feb. 20, 2023, from 1pm to 5pm in room 2407 Siebel Center.
This reading list consists of multiple topic sections, each containing 2-3 papers. The questions in the written exam will be based on the papers listed here, with 1-2 questions per section: a section with two papers will usually yield one question, and a section with three papers will usually yield two. You only need to answer four questions in the exam, so there is no need to read every paper. Instead, browse through the list, identify the 8-10 papers that you are most familiar with or most comfortable reading, and focus on reading and digesting those. You will likely find some sections closer to your interests or background than others, and you can concentrate on the papers in the few sections closest to your research interests.
Section 1
- Yu Zhang, Yunyi Zhang, Martin Michalski, Yucheng Jiang, Yu Meng, and Jiawei Han, “Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts”, in Proc. 2023 ACM Int. Conf. on Web Search and Data Mining (WSDM’23), Feb. 2023
- Yizhu Jiao, Sha Li, Yiqing Xie, Ming Zhong, Heng Ji and Jiawei Han, “Open-Vocabulary Argument Role Prediction for Event Extraction”, in Proc. 2022 Conf. on Empirical Methods in Natural Language Processing (EMNLP’22), Dec. 2022
- Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han, “Generating Training Data with Language Models: Towards Zero-Shot Language Understanding”, in Proc. 2022 Conf. on Neural Information Processing Systems (NeurIPS’22), Nov. 2022
Section 2
- Chen, C., Sun, F., Zhang, M., and Ding, B. Recommendation unlearning. In Proceedings of the ACM Web Conference 2022 (New York, NY, USA, 2022), WWW ’22, Association for Computing Machinery, pp. 2768–2777. (https://dl.acm.org/doi/10.1145/3485447.3511997)
- Zhang, Y., Feng, F., He, X., Wei, T., Song, C., Ling, G., and Zhang, Y. Causal intervention for leveraging popularity bias in recommendation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (New York, NY, USA, 2021), SIGIR ’21, Association for Computing Machinery, pp. 11–20. (https://dl.acm.org/doi/10.1145/3404835.3462875)
- Karimi, A.-H., Schölkopf, B., and Valera, I. Algorithmic recourse: From counterfactual explanations to interventions. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (New York, NY, USA, 2021), FAccT ’21, Association for Computing Machinery, pp. 353–362. (https://dl.acm.org/doi/10.1145/3442188.3445899)
Section 3
- Brown et al. 2020. Language Models are Few-Shot Learners. NeurIPS2020. https://arxiv.org/abs/2005.14165
- Masahiro Kaneko and Danushka Bollegala. 2021. Debiasing Pre-trained Contextualised Embeddings. Proc. EACL2021. https://aclanthology.org/2021.eacl-main.107.pdf
- Yue Guo, Yi Yang, Ahmed Abbasi. 2022. Auto-Debias: Debiasing Masked Language Models with Automated Biased Prompts. ACL2022. https://aclanthology.org/2022.acl-long.72/
Section 4
- Vartak, Manasi, et al. “Mistique: A system to store and query model intermediates for model diagnosis.” Proceedings of the 2018 International Conference on Management of Data. 2018. https://www-cs.stanford.edu/~matei/papers/2018/sigmod_mistique.pdf
- Li, Feifei, et al. “Wander join: Online aggregation via random walks.” Proceedings of the 2016 International Conference on Management of Data. 2016. http://www.cs.utah.edu/~lifeifei/papers/wanderjoin.pdf
- Petersohn, Devin, et al. “Towards Scalable Dataframe Systems.” Proceedings of the VLDB Endowment 13.11. http://www.vldb.org/pvldb/vol13/p2033-petersohn.pdf
Section 5
- Zifeng Wang and Jimeng Sun. TransTab: Learning Transferable Tabular Transformers Across Tables. NeurIPS 2022.
- Zhen Lin, Shubhendu Trivedi, and Jimeng Sun. Conformal Prediction with Temporal Quantile Adjustments. NeurIPS 2022
- Tianfan Fu*, Wenhao Gao*, Connor W. Coley, Jimeng Sun. Reinforced Genetic Algorithm for Structure-based Drug Design. NeurIPS 2022.
Section 6
- Arjovsky, Martin, Léon Bottou, Ishaan Gulrajani, and David Lopez-Paz. “Invariant risk minimization.” https://openreview.net/forum?id=BOz47Bq–NB
- Susan Athey, Mohsen Bayati, Nikolay Doudchenko, Guido Imbens & Khashayar Khosravi (2021) Matrix Completion Methods for Causal Panel Data Models, Journal of the American Statistical Association, 116:536, 1716-1730, https://www.tandfonline.com/doi/full/10.1080/01621459.2021.1891924
- Amjad, Muhammad, Devavrat Shah, and Dennis Shen. “Robust synthetic control.” The Journal of Machine Learning Research 19, no. 1 (2018): 802-852. https://www.jmlr.org/papers/volume19/17-777/17-777.pdf
Section 7
Section 8
- Chen Xu, Piji Li, Wei Wang, Haoran Yang, Siyun Wang, and Chuangbai Xiao. 2022. COSPLAY: Concept Set Guided Personalized Dialogue Generation Across Both Party Personas. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22). Association for Computing Machinery, New York, NY, USA, 201–211. https://www.library.illinois.edu/proxy/go.php?url=https://doi.org/10.1145/3477495.3531957
- Wenqiang Lei, Yao Zhang, Feifan Song, Hongru Liang, Jiaxin Mao, Jiancheng Lv, Zhenglu Yang, and Tat-Seng Chua. 2022. Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22). Association for Computing Machinery, New York, NY, USA, 212–222. https://www.library.illinois.edu/proxy/go.php?url=https://doi.org/10.1145/3477495.3532001
Section 9
- Zhe Xu, Boxin Du, Hanghang Tong: Graph Sanitation with Application to Node Classification. WWW 2022: 1136-1147. (https://arxiv.org/abs/2105.09384)
- Wei Jin, Yao Ma, Xiaorui Liu, Xianfeng Tang, Suhang Wang, and Jiliang Tang. 2020. Graph Structure Learning for Robust Graph Neural Networks. In SIGKDD. ACM, 66–74. (https://arxiv.org/abs/2005.10203)
Section 10
- Park, M., Leahey, E. & Funk, R.J. (2023) Papers and patents are becoming less disruptive over time. Nature 613, 138–144. https://doi.org/10.1038/s41586-022-05543-x [relevant preceding paper DOI: 10.1038/s41586-019-0941-9]
- Fontana, M., Iori, M., Montobbio, F., and Sinatra, R. (2020) New and atypical combinations: An assessment of novelty and interdisciplinarity. Research Policy, vol. 49, issue 7. https://doi.org/10.1016/j.respol.2020.104063 [relevant preceding papers are (i) https://doi.org/10.1162/qss_a_00007 (ii) 10.1126/science.12404]
Section 11
- Wang, B., Wang, X., Tao, T., Zhang, Q., & Xu, J. (2020). Neural Question Generation with Answer Pivot. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 9138-9145. PDF file: https://ojs.aaai.org/index.php/AAAI/article/view/6449
- DEER: Descriptive Knowledge Graph for Explaining Entity Relationships. Jie Huang, Kerui Zhu, Kevin Chen-Chuan Chang, Jinjun Xiong, Wen-mei Hwu. In The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022. PDF file: https://arxiv.org/abs/2205.10479
- Patrick S. H. Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020. PDF file: https://arxiv.org/abs/2005.11401
Section 12
- Y. Zhou, X. Li, and A. Banerjee, Noisy Truncated SGD: Optimization and Generalization, SIAM International Conference on Data Mining (SDM), 2022 https://arxiv.org/abs/2103.00075
- A. Banerjee, T. Chen, X. Li, Y. Zhou, Stability Based Generalization Bounds for Exponential Family Langevin Dynamics, International Conference on Machine Learning (ICML), 2022. https://arxiv.org/abs/2201.03064