Date: Friday, May 12, 2017
Location: 4405 Siebel Center
Speaker: Dr. Chao Liu, CEO of TianYanCha Inc.
Title: TianYanCha: The Heterogeneous Information Network of Chinese Enterprises
Abstract: TianYanCha Inc. (www.tianyancha.com) is the No. 1 Business Investigation Platform in China, supported by the State’s venture capital. It accumulates all the public information about 120 million Chinese enterprises from the Web and organizes it as a dynamically evolving information network. By checking various aspects of a particular enterprise and its connections to other enterprises, an investigator can usually locate suspicious points that might otherwise be missed. For example, KPMG uses it heavily in the i-section of everyday auditing work to uncover related parties and the transactions between them. In this talk, I will review the motivation and evolution of TianYanCha, and share with the audience some lessons learnt during the startup.
Bio: Dr. Chao Liu is the founder and CEO of TianYanCha Inc. (www.tianyancha.com), the No. 1 Business Investigation Platform in China. Before founding the company, Dr. Liu served as Chief Scientist and director at Sogou and Tencent. Before returning to China, Dr. Liu was a researcher at Microsoft Research in Redmond, where he led the Data Intelligence Group. His research has focused on Web services (e.g., search and ads) and data mining, with about 40 publications in renowned conferences (e.g., ICML, KDD, SIGIR, FSE) and journals (e.g., IEEE-TSE, IEEE-Computer and ACM TKDD). In addition, many of his research results have been transferred to the Microsoft Bing search engine. Chao has been on the program and organizing committees of many conferences, and actively campaigns for mutualism between academia and industry. He served on the funding review committee of the United States National Science Foundation (US-NSF) for the data mining track in Washington DC in 2011. Chao earned his PhD in Computer Science from the University of Illinois at Urbana-Champaign in 2007, and his B.S. in Computer Science from Peking University in 2003.
Date: Monday, March 27, 2017
Time: 9:30 AM
Location: 3405 Siebel Center
Speaker: Xifeng Yan, University of California at Santa Barbara
Title: Knowledge Graph and Question Answering
Abstract: The paradigm of information search is undergoing a significant transformation due to the rise of mobile devices. Techniques that can directly answer user questions and help users navigate the answer space are increasingly in demand. In this talk, I will give a broad introduction to the knowledge graph and question answering problems studied in my lab. Starting with the challenges in knowledge graph query processing, I will discuss our efforts to address them, including schemaless graph querying, user feedback, a factoid question benchmark, natural language questions, and query routing in collaborative networks.
Short Bio: Xifeng Yan is a professor at the University of California, Santa Barbara, where he holds the Venkatesh Narayanamurti Chair of Computer Science. He received his Ph.D. from the University of Illinois at Urbana-Champaign in 2006 and was a research staff member at the IBM T. J. Watson Research Center between 2006 and 2008. He works on modeling, managing, and mining graphs in knowledge graphs, information networks, computer systems, social media, and bioinformatics. His work has been extensively cited, with over 14,000 citations according to Google Scholar. He has received the NSF CAREER Award, the IBM Invention Achievement Award, the ACM-SIGMOD Dissertation Runner-Up Award, and the IEEE ICDM 10-year Highest Impact Paper Award.
Location: 3401 Siebel Center
Speaker: Xiang Ren, University of Illinois at Urbana-Champaign
Abstract: While “Big Data” technologies have had great success in unlocking knowledge from structured data, real-world data are largely unstructured and in the form of natural-language text. One of the grand challenges is to turn such massive text data into machine-actionable structures. Yet most existing systems rely heavily on human effort when dealing with text corpora of various kinds, which slows the development of downstream applications.
In this talk, I will introduce a data-driven framework, Minimal-Effort StructMine, that extracts factual structures from massive text corpora with minimal human involvement. In particular, I will discuss how to apply Minimal-Effort StructMine to three subtasks: identifying typed entities in text, refining entity types to finer-grained levels, and understanding the typed relationships between entities. Together, these three solutions form a clear roadmap for turning a massive corpus into a structured network that represents factual knowledge. Finally, I will share some directions for mining corpus-specific structured networks for knowledge discovery.
Bio: Xiang Ren is a Computer Science PhD candidate at the University of Illinois at Urbana-Champaign, working with Jiawei Han and the Data and Information System Lab. Xiang’s research develops data-driven methods for turning unstructured text data into machine-actionable structures. More broadly, his research interests span data mining, machine learning, and natural language processing, with a focus on making sense of massive text corpora. His research has been recognized with a Google PhD Fellowship, a Yahoo!-DAIS Research Excellence Award, and a C. W. Gear Outstanding Graduate Student Award, and his results have been transferred to the US Army Research Lab and Microsoft Bing.