DAIS Qual Exam Reading List (Fall 2022) – Data and Information Systems

The Fall 2022 DAIS Qual Exam (written exam) is scheduled for 1-5pm on Monday, Oct. 3, 2022, in Siebel Center room 3401.

This reading list consists of multiple topic sections, each containing 2-3 papers. The written exam questions will be based on the papers listed here, with 1-2 questions per section: a section with two papers can usually be expected to yield one question, and a section with three papers two questions.

Section 1

Maria Maistro, Lucas Chaves Lima, Jakob Grue Simonsen, and Christina Lioma. 2021. Principled Multi-Aspect Evaluation Measures of Rankings. Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery, New York, NY, USA, 1232–1242. DOI:https://doi.org/10.1145/3459637.3482287 PDF file: http://www.library.illinois.edu/proxy/go.php?url=https://dl.acm.org/doi/pdf/10.1145/3459637.3482287
Naseri S., Dalton J., Yates A., Allan J. (2021) CEQE: Contextualized Embeddings for Query Expansion. In: Hiemstra D., Moens MF., Mothe J., Perego R., Potthast M., Sebastiani F. (eds) Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science, vol 12656. Springer, Cham. https://doi.org/10.1007/978-3-030-72113-8_31 PDF file: https://link.springer.com/content/pdf/10.1007%2F978-3-030-72113-8_31.pdf
Salle A., Malmasi S., Rokhlenko O., Agichtein E. (2021) Studying the Effectiveness of Conversational Search Refinement Through User Simulation. In: Hiemstra D., Moens MF., Mothe J., Perego R., Potthast M., Sebastiani F. (eds) Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science, vol 12656. Springer, Cham. https://doi.org/10.1007/978-3-030-72113-8_39. PDF file: https://link.springer.com/content/pdf/10.1007%2F978-3-030-72113-8_39.pdf

Section 2

Ahmed Alaa and Mihaela van der Schaar. 2020. Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions. In ICML. https://arxiv.org/abs/2007.13481
Luca Franceschi, Mathias Niepert, Massimiliano Pontil, and Xiao He. 2019. Learning discrete structures for graph neural networks. In ICML. PMLR, 1972–1982. https://arxiv.org/abs/1903.11960
Jian Kang, Qinghai Zhou, Hanghang Tong: JuryGCN: Quantifying Jackknife Uncertainty on Graph Convolutional Networks. KDD 2022: 742-752. http://jiank2.web.illinois.edu/files/kdd22/kang22jurygcn.pdf

Section 3

Jan Overgoor, Austin Benson, and Johan Ugander. Choosing to grow a graph: modeling network formation as discrete choice. In The World Wide Web Conference, pages 1409–1420. ACM, 2019. https://arxiv.org/abs/1811.05008
Yuxin Xiao, Adit Krishnan, and Hari Sundaram. Discovering strategic behaviors for collaborative content-production in social networks. In The Web Conference (WebConf 2020), pages 2078–2088, Taipei, Taiwan, April 2020. https://arxiv.org/abs/2003.03670
Harshay Shah, Suhansanu Kumar, and Hari Sundaram. Growing attributed networks through local processes. In The World Wide Web Conference – WWW ’19, pages 3208–3214. ACM Press, May 2019. https://arxiv.org/abs/1712.10195

Section 4

Tamari, Ronen and Shani, Chen and Hope, Tom and Petruck, Miriam R L and Abend, Omri and Shahaf, Dafna. 2020. Language (Re)modelling: Towards Embodied Language Understanding. ACL2020. https://aclanthology.org/2020.acl-main.559
Kolluru, Keshav and Mohammed, Muqeeth and Mittal, Shubham and Chakrabarti, Soumen and Mausam. 2022. Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction. ACL2022. https://aclanthology.org/2022.acl-long.179
Sundriyal, Megha and Malhotra, Ganeshan and Akhtar, Md Shad and Sengupta, Shubhashis and Fano, Andrew and Chakraborty, Tanmoy. 2022. Document Retrieval and Claim Verification to Mitigate COVID-19 Misinformation. ACL2022. https://aclanthology.org/2022.constraint-1.8

Section 5

Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang and Jiawei Han, “Topic Discovery via Latent Space Clustering of Language Model Embeddings”, in Proc. The ACM Web Conf. 2022 (WWW’22), April 2022
Jiaming Shen, Yunyi Zhang, Heng Ji and Jiawei Han, “Corpus-based Open-Domain Event Type Induction”, in Proc. 2021 Conf. on Empirical Methods in Natural Language Processing (EMNLP’21), Nov. 2021

Section 6

Fu, Tianfan, Cao Xiao, Cheng Qian, Lucas M. Glass, and Jimeng Sun. 2021. “Probabilistic and Dynamic Molecule-Disease Interaction Modeling for Drug Discovery.” In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 404–14. KDD ’21.
Fu, Tianfan, Kexin Huang, Cao Xiao, Lucas M. Glass, and Jimeng Sun. 2022. “HINT: Hierarchical Interaction Network for Trial Outcome Prediction Leveraging Web Data.” Cell Patterns; also available at arXiv [cs.CY]. http://arxiv.org/abs/2102.04252
Huang, Kexin, Cao Xiao, Lucas M. Glass, and Jimeng Sun. 2020. “MolTrans: Molecular Interaction Transformer for Drug–target Interaction Prediction.” Bioinformatics, October. https://doi.org/10.1093/bioinformatics/btaa880.

Section 7

Agiwal, Ankur, et al. “Napa: powering scalable data warehousing with robust query performance at Google.” Proceedings of the VLDB Endowment 14.12 (2021): 2986-2997. http://www.vldb.org/pvldb/vol14/p2986-sankaranarayanan.pdf
Durner, Dominik, Viktor Leis, and Thomas Neumann. “JSON Tiles: Fast Analytics on Semi-Structured Data.” Proceedings of the 2021 International Conference on Management of Data. 2021. http://www.db.in.tum.de/~durner/papers/json-tiles-sigmod21.pdf
Lu, Yi, et al. “Epoch-based commit and replication in distributed OLTP databases.” Proceedings of the VLDB Endowment 14.5 (2021): 743-756. https://pages.cs.wisc.edu/~yxy/pubs/coco.pdf

Section 8

Y. Zhou, X. Li, and A. Banerjee, “Noisy Truncated SGD: Optimization and Generalization,” SIAM International Conference on Data Mining (SDM), 2022, https://arxiv.org/abs/2103.00075
M. Belkin, D. Hsu, S. Ma, S. Mandal, “Reconciling modern machine learning practice and the bias-variance trade-off,” Proceedings of the National Academy of Sciences (PNAS), 2019, https://www.pnas.org/content/116/32/15849.short

Section 9

Zhuangdi Zhu, Junyuan Hong, Jiayu Zhou: Data-Free Knowledge Distillation for Heterogeneous Federated Learning. ICML 2021: 12878-12889
Jun Wu, Jingrui He: Domain Adaptation with Dynamic Open-Set Targets. KDD 2022: 2039-2049
Ekdeep Singh Lubana, Chi Ian Tang, Fahim Kawsar, Robert P. Dick, Akhil Mathur: Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering. ICML 2022: 14461-14484

Section 10

Jie Huang, Kevin Chang, Jinjun Xiong, Wen-Mei Hwu: Measuring Fine-Grained Domain Relevance of Terms: A Hierarchical Core-Fringe Approach. ACL/IJCNLP (1) 2021: 3641-3651.
Hongbin Pei, Bingzhe Wei, Kevin Chen-Chuan Chang, Yu Lei, Bo Yang: Geom-GCN: Geometric Graph Convolutional Networks. ICLR 2020.