Tutorials Schedule
Morning, August 12, 2012
Information and Influence Spread in Social Networks (slides: link)
(Carlos Castillo, Wei Chen, Laks V. S. Lakshmanan)
08:30-12:00, Ball B, China National Convention Center (CNCC, map)
There is tremendous interest in information propagation in social networks, fueled by applications such as viral marketing, epidemiology, and the analysis of the spread of innovations. At the core of these applications is a phenomenon called influence propagation, where actions performed by people propagate through a social network. In the general setting we have (i) a network representing social ties; (ii) weights representing strengths or probabilities of influence among people; and (iii) a database of traces of past propagations of actions, sometimes described as information cascades.
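To make this setting concrete, the following is a minimal sketch of the independent cascade model, one of the standard propagation models studied in this literature; the toy graph and edge probabilities are purely illustrative.

```python
import random

def independent_cascade(graph, seeds, rng=random.Random(0)):
    """Simulate one run of the independent cascade model.

    graph: dict mapping node u -> list of (v, p_uv) pairs, where p_uv is
           the probability that u activates v once u becomes active.
    seeds: the initially active nodes.
    Returns the set of all nodes activated in this run.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p_uv in graph.get(u, []):
                # Each newly active node gets exactly one chance to
                # activate each inactive neighbor, with probability p_uv.
                if v not in active and rng.random() < p_uv:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

# Toy network: a influences b and c; b influences c.
g = {"a": [("b", 0.5), ("c", 0.2)], "b": [("c", 0.4)]}
print(independent_cascade(g, {"a"}))
```

Averaging the size of the returned set over many runs estimates the expected spread of a seed set, the quantity at the heart of the influence maximization algorithms surveyed in the tutorial.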
In this tutorial, we will survey the complementary currents that have generated excitement around this area: advances in mathematical sociology, the technology and availability of large real data sets, advances in mathematical and software tools for analysis, and advances in algorithms for analyzing information propagation. After surveying these developments, the bulk of the tutorial will focus on algorithms. No special prerequisite knowledge is needed to attend this tutorial, which is designed for any data mining researcher or practitioner interested in these topics.
Carlos Castillo is a Senior Scientist at the Social Computing group of the Qatar Computing Research Institute. He has published in the areas of information retrieval, spam detection/demotion, usage analysis and social network analysis. His current research interest is the mining of content, links, and usage data from the Web.
Wei Chen is a Lead Researcher at Microsoft Research Asia in Beijing, China, and an Adjunct Professor at Tsinghua University. His latest research interest is the algorithmic and economic aspects of social networks, and he has published a series of papers on social influence diffusion dynamics and influence maximization, community detection, and pricing in social networks.
Laks V. S. Lakshmanan is a professor in the Department of Computer Science at UBC. Laks has published extensively in data management and mining and has served on the PC of all major database and data mining conferences, including SIGMOD, PODS, VLDB, ICDE, KDD, and ICDM, as a member or Area Chair. He is an Associate Editor of the VLDB Journal. His current research interests include social networks and media, personalization, and recommender systems.
Graphical Models
(Xiaojin Zhu)
08:30-12:00, Ball C, China National Convention Center (CNCC, map)
Probabilistic graphical models are important tools in machine learning. This tutorial is an overview of three major aspects of graphical models: (1) Formalism: what graphical models are and what we can do with them. We will go over directed graphical models, undirected graphical models, and factor graphs. (2) Inference: how to answer probabilistic questions given observed evidence. Exact algorithms (such as variable elimination and the junction tree) are effective on certain graph types, while approximate algorithms are needed for general graphs. We will explore both variational methods (loopy belief propagation, mean field) and Markov chain Monte Carlo methods for such approximations, and the connection to exponential families. (3) Learning: how to create graphical models from data, including both parameter learning (the EM algorithm) and graph structure learning. The tutorial is accessible to everyone with some machine learning experience.
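As a small taste of exact inference, the sketch below runs variable elimination on a three-variable chain A -> B -> C to compute the marginal P(C); the probability tables are made up for the example.

```python
# Variable elimination on a toy chain A -> B -> C over binary variables.
p_a = {0: 0.6, 1: 0.4}                       # P(A)
p_b_given_a = {(0, 0): 0.7, (0, 1): 0.3,     # P(B | A)
               (1, 0): 0.2, (1, 1): 0.8}
p_c_given_b = {(0, 0): 0.9, (0, 1): 0.1,     # P(C | B)
               (1, 0): 0.5, (1, 1): 0.5}

# Eliminate A: tau1(b) = sum_a P(a) * P(b | a)
tau1 = {b: sum(p_a[a] * p_b_given_a[(a, b)] for a in (0, 1)) for b in (0, 1)}

# Eliminate B: P(c) = sum_b tau1(b) * P(c | b)
p_c = {c: sum(tau1[b] * p_c_given_b[(b, c)] for b in (0, 1)) for c in (0, 1)}
print(p_c)  # {0: 0.7, 1: 0.3} -- a proper distribution over C
```

On a chain the elimination order is obvious; on general graphs the cost of the same procedure is governed by the graph's treewidth, which is why approximate methods are needed there.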
Xiaojin Zhu is an Associate Professor of Computer Science at the University of Wisconsin-Madison. He received his PhD in 2005 at Carnegie Mellon University. His research focuses on semi-supervised learning, graphical models, and Bayesian nonparametrics, and applications in cognitive science and social media data mining.
Prediction, Belief, and Markets (slides: ppt)
(Jenn Wortman Vaughan, Jacob Abernethy)
08:30-12:00, 205A+B, China National Convention Center (CNCC, map)
Prediction markets are financial markets designed to aggregate opinions across large populations of traders. A typical prediction market offers a set of securities with payoffs determined by the future state of the world. For example, a market might offer a security worth $1 if Barack Obama is re-elected in 2012 and $0 otherwise. Roughly speaking, a trader who believes the probability of Obama's re-election is p should be willing to buy this security at any price less than $p and (short) sell it at any price greater than $p, since the security's expected payoff is exactly $p. For this reason, the going price of this security can be interpreted as traders' collective belief about the likelihood of Obama's re-election. Prediction markets have been used to generate accurate forecasts in a variety of domains including politics, disease surveillance, business, and entertainment, and are cited in the media increasingly often.
This tutorial will cover some of the basic mathematical ideas used in the design of prediction markets, and illustrate several fundamental connections between these ideas and techniques used in machine learning.
We will begin with an overview of proper scoring rules, which can be used to measure the accuracy of a single entity's prediction and are closely related to proper loss functions. We will then discuss market scoring rules, automated market makers based on proper scoring rules that can be used to aggregate the predictions of many forecasters, and describe how market scoring rules can be implemented as inventory-based markets in which securities are bought and sold. We will describe recent research exploring a duality-based interpretation of market scoring rules, which can be exploited to design new markets that can be run efficiently over very large state spaces. Finally, we will explore the fundamental mathematical connections between market scoring rules and two areas of machine learning: online "no-regret" learning and variational inference with exponential families.
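As a concrete illustration, the sketch below implements Hanson's logarithmic market scoring rule (LMSR), the canonical cost-function-based market maker of the kind discussed here; the class and parameter names are our own.

```python
import math

class LMSRMarketMaker:
    """Hanson's logarithmic market scoring rule (LMSR).

    q[i] is the number of outstanding shares of security i; the
    liquidity parameter b > 0 bounds the market maker's worst-case
    loss at b * ln(n) for n outcomes."""

    def __init__(self, n_outcomes, b=100.0):
        self.q = [0.0] * n_outcomes
        self.b = b

    def cost(self, q):
        # C(q) = b * ln(sum_i exp(q_i / b))
        return self.b * math.log(sum(math.exp(qi / self.b) for qi in q))

    def price(self, i):
        # The instantaneous price dC/dq_i, interpretable as the
        # market's current probability estimate for outcome i.
        z = sum(math.exp(qi / self.b) for qi in self.q)
        return math.exp(self.q[i] / self.b) / z

    def buy(self, i, shares):
        # A trader pays the difference in the cost function.
        before = self.cost(self.q)
        self.q[i] += shares
        return self.cost(self.q) - before

mm = LMSRMarketMaker(2)
print(mm.price(0))    # 0.5 before any trades
print(mm.buy(0, 50))  # cost of buying 50 shares of outcome 0
print(mm.price(0))    # the implied probability rises above 0.5
```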
This tutorial will be self-contained. No background on markets or specific areas of machine learning is required.
Jenn Wortman Vaughan is an assistant professor in the Computer Science Department at UCLA and an active member of the UCLA Center for Engineering Economics, Learning, and Networks. She completed her Ph.D. at the University of Pennsylvania in 2009, and subsequently spent a year as a Computing Innovation Fellow at Harvard. Her research interests are in machine learning, algorithmic economics, and social computing, all of which she studies using techniques from theoretical computer science. She is the recipient of Penn's 2009 Rubinoff dissertation award for innovative applications of computer technology, a National Science Foundation CAREER award, and best paper or best student paper awards at COLT, ACM EC, and UAI. In her "spare" time, she is involved in a variety of efforts to provide support for women in computer science; most notably, she co-founded the Annual Workshop for Women in Machine Learning, which will be held for the seventh time in 2012.
Jacob Abernethy is the Simons Postdoctoral Fellow at the University of Pennsylvania, Department of Computer and Information Science, and is hosted by Professor Michael Kearns. In December 2011, he completed his PhD at the University of California, Berkeley, having been advised by Professor Peter Bartlett. Jacob's thesis focused on the problem of online learning in non-stochastic environments, and much of his research has been concerned with the relationship between learning and game theory. He has done recent work in designing combinatorial information markets, the pricing of financial derivatives, and optimizing efficiency in crowdsourcing platforms.
Afternoon, August 12, 2012
Factorization Models for Recommender Systems and Other Applications (slides: link)
(Lars Schmidt-Thieme, Steffen Rendle)
13:30-17:00, Ball B, China National Convention Center (CNCC, map)
Many problems involve nominal variables with many levels; for example, recommender systems try to predict how users will rate items based on past rating behavior, with users and items represented by mere IDs. For such problems, factorization models, which associate every level of these variables with a hidden feature vector and model their influence on a target variable indirectly through the hidden features alone, have proven to work very well.
In this tutorial we will provide a broad introduction to factorization models, starting from the very beginning with matrix factorization and then proceeding to generalizations such as tensor factorization models, multi-relational factorization models, and factorization machines. We will provide theoretical insight into the assumptions behind the respective modelling approaches as well as present state-of-the-art learning algorithms.
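As a starting point, here is a minimal sketch of plain matrix factorization trained with stochastic gradient descent, the simplest member of this model family; the hyperparameters and toy ratings are illustrative only.

```python
import random

def mf_sgd(ratings, n_users, n_items, k=8, lr=0.01, reg=0.05,
           epochs=200, rng=random.Random(0)):
    """Approximate each rating r_ui by the inner product <p_u, q_i> of
    k-dimensional hidden feature vectors, minimizing squared error
    with L2 regularization via stochastic gradient descent."""
    p = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(p[u], q[i]))
            for f in range(k):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return p, q

# (user_id, item_id, rating) triples; users and items are mere IDs.
data = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 4.0)]
p, q = mf_sgd(data, n_users=3, n_items=3)
print(sum(a * b for a, b in zip(p[0], q[0])))  # close to the observed 5.0
print(sum(a * b for a, b in zip(p[2], q[2])))  # prediction for an unseen pair
```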
As time permits, we will cover further aspects of this model family such as factorization models involving time, strong generalization vs. semi-supervised learning with factorization models, factorization models to handle missing data, and Bayesian approaches to factorization models.
Lars Schmidt-Thieme is a full professor of machine learning at the University of Hildesheim, Germany. After studying mathematics at the University of Heidelberg until 1999, he received his PhD in Business Engineering from the University of Karlsruhe in 2003. From 2003 to 2006 he was an assistant professor of Computer Science at the University of Freiburg, and since 2006 he has headed the Information Systems and Machine Learning Lab (ISMLL) at the University of Hildesheim. His main research interests are supervised machine learning methods for problems with complex data, such as those found in recommender systems and time series classification.
Steffen Rendle is an assistant professor of social network analysis at the University of Konstanz, Germany. He studied computer science at the University of Freiburg and received a Ph.D. from the University of Hildesheim in 2010. His research interests include factorization models, relational and sequential data, Bayesian methods, and applications to problems in recommender systems and social media.
Data Mining in Streams (slides: PDF)
(Edo Liberty, Jelani Nelson)
13:30-17:00, Ball C, China National Convention Center (CNCC, map)
This tutorial will focus on basic data mining in streams, that is, efficiently extracting basic properties of data that is presented to the algorithm one item at a time. The algorithm is limited in its memory footprint and can view each item in the stream only once. This setup is common in large-scale computation and in online systems. In practice, such techniques often improve performance even when the data can be read more than once.
First, we consider abstract items in a stream. We try to find the most frequently occurring items, produce histograms, or estimate the number of distinct items, all while maintaining a fixed memory footprint. We will show, for example, how these techniques are used at Yahoo for generating new kinds of email threads. We will conclude this section by showing how to approximate the frequency moments of item streams.
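For instance, the Misra-Gries summary, a classic one-pass algorithm for the frequent-items problem discussed here, fits in a few lines; the toy stream is illustrative.

```python
def misra_gries(stream, k):
    """One pass, O(k) memory. Any item occurring more than n/(k+1)
    times in a stream of length n is guaranteed to survive among the
    counters; the stored counts are underestimates of the true ones."""
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k:
            counters[x] = 1
        else:
            # Decrement every counter and drop those that reach zero.
            for y in list(counters):
                counters[y] -= 1
                if counters[y] == 0:
                    del counters[y]
    return counters

print(misra_gries("abracadabra", k=2))  # 'a' (5 of 11 occurrences) survives
```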
Next, we consider the case where each item is, in fact, a vector (image, object features, document, etc.). Here we consider dimension reduction as well as clustering. Interpreting the vectors as columns of a (possibly infinite) matrix, we consider generating a rank-k approximation to it. Lastly, time permitting, we will also consider receiving the entries of a large matrix one by one in an arbitrary order.
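One basic tool in this vector setting is a Johnson-Lindenstrauss-style random projection, which compresses each arriving vector independently and hence fits the streaming model; below is a minimal sketch with illustrative dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 10_000, 200  # original and reduced dimensionality

# A fixed Gaussian random projection: pairwise distances are preserved
# up to a factor of 1 +/- eps when m = O(log(n) / eps^2) for n vectors.
S = rng.standard_normal((m, d)) / np.sqrt(m)

def sketch(vector):
    # Each item in the stream is compressed on arrival using only the
    # fixed O(m * d) memory of the projection matrix.
    return S @ vector

x, y = rng.standard_normal(d), rng.standard_normal(d)
print(np.linalg.norm(x - y))                  # true distance
print(np.linalg.norm(sketch(x) - sketch(y)))  # approximately the same
```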
Edo Liberty is a researcher at Yahoo! Research. He received his PhD in Computer Science from Yale University, where he stayed for his postdoctoral position in the Applied Mathematics program. His research focuses on algorithms and practical aspects of dealing with large data, especially randomized algorithms and large-scale numerical linear algebra.
Jelani Nelson is a postdoctoral researcher in theoretical computer science at Princeton University and the Institute for Advanced Study, and will begin as an Assistant Professor of Computer Science at Harvard University in Fall 2013. He received the S.B. (2005), M.Eng. (2006), and Ph.D. (2011) degrees from MIT in computer science. Jelani's research interests include algorithms and complexity theory, with specific focuses on streaming algorithms and dimensionality reduction.
Learning to Rank and Its Applications in Web Search and Online Advertising
(Tie-Yan Liu)
13:30-17:00, 205A+B, China National Convention Center (CNCC, map)
Learning to rank is a recently emerged research direction concerned with applying machine learning technologies to generate good ranking results. It has been applied to many real applications, such as web search and online advertising, and has demonstrated its advantages over conventional heuristic methods. This tutorial is an introduction to learning to rank, covering algorithms, theory, and applications.
In the first part of the tutorial, we will introduce the basic concepts and major approaches to learning to rank. In particular, we categorize existing learning to rank algorithms into three groups: the pointwise approach, the pairwise approach, and the listwise approach. For each approach, we will discuss its basic setting, show a couple of popular algorithms, and discuss their pros and cons. In addition, we will present empirical evaluation results for these approaches on benchmark datasets for learning to rank, with the necessary comparisons and discussion.
In the second part of the tutorial, we will go to the two extremes of learning to rank: theory and applications. On the theory side, we will introduce the big picture of statistical learning theory for ranking, including the assumptions on the probability space for ranking, the relationship between the surrogate loss functions widely used in learning to rank algorithms and popular evaluation measures for ranking, and the generalization bounds of several pairwise and listwise ranking algorithms with respect to increasing numbers of queries and documents per query. On the application side, we will take web search and online advertising as examples to demonstrate how one can use learning to rank technologies in practice, including training data collection, feature extraction, feature selection, and algorithm selection and modification. Our goal is for the audience to be able to apply learning to rank technologies to their own tasks after hearing these highly practical guidelines.
At the end of the tutorial, we will talk about future research directions on learning to rank and give references for the audience to learn more about this area.
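To give a flavor of the pairwise approach, here is a minimal sketch of a RankNet-style pairwise logistic surrogate loss; the document scores and relevance labels are illustrative.

```python
import math

def pairwise_logistic_loss(scores, labels):
    """For every document pair where labels[i] > labels[j], add the
    logistic loss log(1 + exp(s_j - s_i)); minimizing this pushes more
    relevant documents above less relevant ones."""
    loss, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                loss += math.log1p(math.exp(scores[j] - scores[i]))
                pairs += 1
    return loss / max(pairs, 1)

# Relevance labels for one query's documents, and two candidate scorings.
labels = [2, 1, 0]
print(pairwise_logistic_loss([3.0, 2.0, 0.5], labels))  # consistent: low loss
print(pairwise_logistic_loss([0.5, 2.0, 3.0], labels))  # inverted: high loss
```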
Tie-Yan Liu is the research manager of the Internet Economics & Computational Advertising group at Microsoft Research Asia. His research interests include learning to rank, large-scale graph ranking, statistical game theory, and empirical economic theory. So far, he has authored two books, more than 90 journal and conference papers, and over 30 granted US / international patents. He is the co-author of the best student paper of SIGIR (2008) and the most cited paper of the Journal of Visual Communication and Image Representation (2004-2006). He is a program committee co-chair of RIAO (2010), a demo/exhibit co-chair of KDD (2012), a track chair of WWW (2011), an area chair of SIGIR (2008-2011) and AIRS (2009-2011), and a co-chair of several workshops at SIGIR, ICML, and NIPS. He is an associate editor of ACM Transactions on Information Systems (TOIS) and an editorial board member of several other journals, including Information Retrieval and ISRN Artificial Intelligence. He has been a keynote speaker at PCM (2010) and CCIR (2011), a plenary panelist at KDD (2011), and a tutorial speaker at several conferences including SIGIR and WWW. Prior to joining Microsoft, he obtained his Ph.D. in electronic engineering from Tsinghua University, China. He is a senior member of the IEEE and a member of the ACM.
Tutorial Co-Chairs
- Yehuda Koren (Google)
- Jennifer Neville (Purdue University)