- AG-66 (Lecture Theatre)
Abstract: A massive amount of data is now available online, which creates excellent opportunities for statistical learning researchers to analyze it and estimate the patterns it contains. Analyzing data at this scale requires us to make certain assumptions about the model that generated it. We present a family of techniques called topic models, which are probabilistic models of text. Topic models assume that text exhibits recurring patterns of semantically related words; each such set of related words is called a topic. Topic models encode these assumptions as latent variables, and uncovering the topics amounts to inferring the posterior distribution over these latent variables given the observed text. Computing this posterior exactly is often intractable, so we present techniques that approximate it. Variational inference is one such approximation algorithm. We illustrate the most commonly used topic model, Latent Dirichlet Allocation, and use variational inference to approximate its posterior distribution.
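
As a rough illustration of how Latent Dirichlet Allocation encodes its assumptions in latent variables, the generative process can be sketched as below. This is a minimal sketch and not the speaker's implementation: the function names `sample_dirichlet` and `generate_corpus`, the symmetric hyperparameters `alpha` and `eta`, and the toy vocabulary are all assumptions made for illustration. Each topic is a distribution over words, each document has its own latent topic proportions, and each word has a latent topic assignment.

```python
import random


def sample_dirichlet(concentration, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(vocab, num_topics, num_docs, doc_len,
                    alpha=0.5, eta=0.5, seed=0):
    """Sample topics and documents from the LDA generative process.

    alpha and eta are symmetric Dirichlet hyperparameters; the default
    values are arbitrary choices for illustration.
    """
    rng = random.Random(seed)

    # One word distribution per topic (latent).
    topics = [sample_dirichlet([eta] * len(vocab), rng)
              for _ in range(num_topics)]

    corpus = []
    for _ in range(num_docs):
        # Per-document topic proportions (latent).
        theta = sample_dirichlet([alpha] * num_topics, rng)
        doc = []
        for _ in range(doc_len):
            # Per-word topic assignment (latent), then the observed word.
            z = rng.choices(range(num_topics), weights=theta)[0]
            w = rng.choices(vocab, weights=topics[z])[0]
            doc.append(w)
        corpus.append(doc)
    return topics, corpus


if __name__ == "__main__":
    toy_vocab = ["gene", "dna", "cell", "ball", "game", "team"]
    topics, corpus = generate_corpus(toy_vocab, num_topics=2,
                                     num_docs=3, doc_len=8)
    for doc in corpus:
        print(" ".join(doc))
```

Only the words in `corpus` would be observed in practice. Inference runs this process in reverse: given the words, it estimates the posterior over `topics`, `theta`, and the per-word assignments, which is the quantity that variational inference approximates.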