Towards Theoretical Understanding of Overparametrization in Deep Learning

Published in 2018

Guest Speaker: Jason Lee, Assistant Professor, University of Southern California

Date/Time: Thursday, October 4, 2018, 2:00 pm

Location: 3403 Siebel

Sponsored by the Department of Computer Science and DAIS


Abstract: We provide new theoretical insights on why over-parametrization is effective in learning neural networks. For a shallow network with k hidden nodes, quadratic activation, and n training data points, we show that, as long as k >= sqrt(2n), over-parametrization enables local search algorithms to find a globally optimal solution for general smooth and convex loss functions. Further, although the number of parameters may exceed the sample size, we show that, with weight decay, the solution also generalizes well.

Next, we analyze the implicit regularization effects of various optimization algorithms. In particular, we prove that for least squares with mirror descent, the algorithm converges to the closest solution in terms of the Bregman divergence. For linearly separable classification problems, we prove that steepest descent with respect to a norm solves SVM with respect to the same norm. For over-parametrized non-convex problems such as matrix sensing or neural nets with quadratic activation, we prove that gradient descent converges to the minimum nuclear norm solution, which allows for both meaningful optimization and generalization guarantees.
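
As a small illustration of the least-squares claim (a sketch under stated assumptions, not taken from the talk): with the Euclidean potential, mirror descent reduces to plain gradient descent, and the Bregman divergence reduces to squared Euclidean distance. For an underdetermined problem with a single linear constraint a.x = b, gradient descent should then converge to the solution closest to its starting point. The function names, step size, and numbers below are hypothetical choices made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def gradient_descent(a, b, x0, steps=200):
    """Minimize 0.5 * (a.x - b)^2 by plain gradient descent from x0.

    This is mirror descent with the Euclidean potential (an assumption
    of this sketch; other potentials give other Bregman divergences).
    """
    lr = 0.5 / dot(a, a)  # with this step size the residual halves each step
    x = list(x0)
    for _ in range(steps):
        residual = dot(a, x) - b
        x = [xi - lr * residual * ai for xi, ai in zip(x, a)]
    return x


def closest_solution(a, b, x0):
    """Euclidean projection of x0 onto the solution set {x : a.x = b}."""
    scale = (b - dot(a, x0)) / dot(a, a)
    return [xi + scale * ai for xi, ai in zip(x0, a)]


# One equation, three unknowns: infinitely many exact solutions.
a, b = [1.0, 2.0, 2.0], 9.0

zero_init = [0.0, 0.0, 0.0]
other_init = [1.0, -1.0, 0.5]

x_from_zero = gradient_descent(a, b, zero_init)
x_from_other = gradient_descent(a, b, other_init)

print(x_from_zero, closest_solution(a, b, zero_init))
print(x_from_other, closest_solution(a, b, other_init))
```

Starting from zero, the iterates stay in the span of a, so the limit is the minimum-norm solution; starting elsewhere, the limit is the projection of the starting point onto the solution set. Both runs fit the data exactly, which is why the choice of algorithm and initialization, rather than the loss alone, determines which solution is found.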


Bio: Jason Lee is an assistant professor in Data Sciences and Operations at the University of Southern California. Prior to that, he was a postdoctoral researcher at UC Berkeley working with Michael Jordan. Jason received his PhD at Stanford University advised by Trevor Hastie and Jonathan Taylor. His research interests are in statistics, machine learning, and optimization. Lately, he has worked on high dimensional statistical inference, analysis of non-convex optimization algorithms, and theory for deep learning.

This is joint work with Suriya Gunasekar, Mor Shpigel, Daniel Soudry, Nati Srebro, and Simon Du.

Mao Ye, Associate Professor of Finance, University of Illinois at Urbana-Champaign, “Big Data in Finance”

Published in 2018

Location: 3124 SC

Time: 4:00-5:00pm Friday, June 15, 2018

Abstract. Modern financial markets generate vast quantities of data. As the data environment has become increasingly “big” and analyses increasingly computerized, the information that different market participants extract and use has grown more varied and diverse. At one extreme, high-frequency traders (HFTs) implement ultra-minimalist algorithms optimized for speed. At the other extreme, some industry practitioners apply sophisticated machine-learning techniques that take minutes, hours, or days to run. The proposed project seeks to understand this full spectrum of machine-based trading, with the aim of informing public policy and augmenting theoretical studies of financial markets. Just as insights into human behavior from the psychology literature spawned the field of behavioral finance, insights into algorithmic behavior (or the psychology of machines) can result in an analogous blossoming of research in algorithmic behavioral finance. Most literature to date follows a simple dichotomy that pits HFTs against everyone else. We aim to explore the diversity among cyber-traders, with a particular emphasis on players who are slower than HFTs but faster than humans. This focus will fill the gap between the literature on HFTs, which focuses on milliseconds or nanoseconds, and the literature on institutional investors, which relies on quarterly holding data.

About the speaker: Mao Ye is an associate professor of finance at the University of Illinois, Urbana-Champaign. His research focuses on market microstructure, machine learning, and big data. His papers have been published in the Journal of Finance, the Journal of Financial Economics, and the Review of Financial Studies. He is a fellow of the National Bureau of Economic Research (NBER) and the National Center for Supercomputing Applications (NCSA). In 2016, the University of Illinois, Urbana-Champaign named him the Educator of the Year after a campus-wide competition.

Mao Ye earned his Ph.D. degree from Cornell University. In 2006, he was elected to Cornell’s Board of Trustees, marking the first time an Ivy League institution had elected a trustee from Mainland China. In 2018, the New York Historical Society selected Mao Ye’s story as one of those featured in its book “Journeys: An American Story.”

Dr. Chao Liu, CEO of TianYanCha Inc., “TianYanCha: The Heterogeneous Information Network of Chinese Enterprises”

Published in 2017

Date: Friday, May 12, 2017

Time: 2-3pm

Location: 4405 Siebel Center

Speaker: Dr. Chao Liu, CEO of TianYanCha Inc.

Title: TianYanCha: The Heterogeneous Information Network of Chinese Enterprises

Abstract: TianYanCha Inc. is the No. 1 Business Investigation Platform in China, supported by the State’s venture capital. It accumulates all the public information about 120 million Chinese enterprises from the Web and organizes it as a dynamically evolving information network. By checking various aspects of a particular enterprise and its connections to other enterprises, an investigator can usually locate suspicious points that might otherwise be missed. For example, KPMG uses it heavily in the i-section of everyday auditing work to uncover related parties and the transactions between them. In this talk, I will review the motivation and evolution of TianYanCha, and share with the audience some lessons learned during the startup.

Bio: Dr. Chao Liu is the founder and CEO of TianYanCha Inc., which is the No. 1 Business Investigation Platform in China. Before founding the company, Dr. Liu served as Chief Scientist and director at Sogou and Tencent. Before going back to China, Dr. Liu was a researcher at Microsoft Research in Redmond, where he led the Data Intelligence Group. His research has focused on Web services (e.g., search and ads) and data mining, with about 40 publications in renowned conferences (e.g., ICML, KDD, SIGIR, FSE, etc.) and journals (e.g., IEEE-TSE, IEEE-Computer, and ACM TKDD). In addition, many of his research results have been transferred to the Microsoft Bing search engine. Chao has been on the program and organizing committees of many conferences, and actively campaigns for mutualism between academia and industry. He served on the funding review committee of the United States National Science Foundation (US-NSF) for the data mining track in Washington DC in 2011. Chao earned his PhD in Computer Science from the University of Illinois at Urbana-Champaign in 2007, and his B.S. in Computer Science from Peking University in 2003.

Xifeng Yan, Professor, UCSB, “Knowledge Graph and Question Answering”

Published in 2017

Date: Monday, March 27, 2017

Time: 9:30 AM

Location: 3405 Siebel Center

Speaker: Xifeng Yan, University of California at Santa Barbara

Title: Knowledge Graph and Question Answering

Abstract: The paradigm of information search is undergoing a significant transformation due to the rise of mobile devices. Techniques that can directly answer user questions and help users navigate the answer space are increasingly in demand. In this talk, I will give a broad introduction to the knowledge graph and question answering problems studied in my lab. Starting with the challenges and issues in knowledge graph query processing, I will discuss our efforts to address them, including schemaless graph querying, user feedback, factoid question benchmarks, natural language questions, and query routing in collaborative networks.

Short Bio: Xifeng Yan is a professor at the University of California, Santa Barbara. He holds the Venkatesh Narayanamurti Chair of Computer Science. He received his Ph.D. from the University of Illinois at Urbana-Champaign in 2006 and was a research staff member at the IBM T. J. Watson Research Center between 2006 and 2008. He has been working on modeling, managing, and mining graphs in knowledge graphs, information networks, computer systems, social media, and bioinformatics. His work has been extensively referenced, with over 14,000 citations per Google Scholar. He received the NSF CAREER Award, the IBM Invention Achievement Award, the ACM-SIGMOD Dissertation Runner-Up Award, and the IEEE ICDM 10-year Highest Impact Paper Award.