The Michigan Student Artificial Intelligence Lab (MSAIL) is a student organization for discussion of artificial intelligence and machine learning. Andrew Ng said:
“...if you read research papers consistently, if you seriously study half a dozen papers a week and you do that for two years, after those two years you will have learned a lot... But that sort of investment, if you spend a whole Saturday studying rather than watching TV, there's no one there to pat you on the back or tell you you did a good job.”
MSAIL is a community in which motivated students can read and discuss modern machine learning literature together. We welcome students of all backgrounds and ability. To join MSAIL and stay up to date, simply join our Slack team! Also be sure to check out our sister organization: the Michigan Data Science Team! We are both graciously sponsored by the Michigan Institute for Data Science.
Traditional NLP models view language as a set of fixed conventions. Pragmatics, by contrast, studies how language is used for communicative purposes. We explore the rational speech act (RSA) model, a probabilistic model that formalizes many intuitive aspects of language use and lets us predict human behavior.
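To make the RSA recursion concrete, here is a minimal sketch in Python. The reference game (three objects, four utterances), the uniform prior, and the rationality parameter alpha are assumptions chosen for illustration and are not taken from the talk. A literal listener interprets utterances by their truth conditions, a pragmatic speaker picks utterances in proportion to how well the literal listener would recover the intended object, and a pragmatic listener reasons about that speaker.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def rsa(lexicon, prior, alpha=1.0):
    """One round of RSA reasoning.

    lexicon[u][o] is 1 if utterance u is literally true of object o, else 0.
    prior[o] is the prior probability of object o.
    alpha is an assumed speaker-rationality parameter.
    """
    n_utt, n_obj = len(lexicon), len(prior)
    # Literal listener: P(o | u) proportional to truth value times prior.
    l0 = [normalize([lexicon[u][o] * prior[o] for o in range(n_obj)])
          for u in range(n_utt)]
    # Pragmatic speaker: P(u | o) proportional to L0(o | u) ** alpha.
    # Indexed as s1[o][u].
    s1 = [normalize([l0[u][o] ** alpha for u in range(n_utt)])
          for o in range(n_obj)]
    # Pragmatic listener: P(o | u) proportional to S1(u | o) times prior.
    l1 = [normalize([s1[o][u] * prior[o] for o in range(n_obj)])
          for u in range(n_utt)]
    return l0, s1, l1

# Hypothetical reference game.
objects = ["blue square", "blue circle", "green square"]
utterances = ["blue", "green", "square", "circle"]
lexicon = [
    [1, 1, 0],  # "blue"
    [0, 0, 1],  # "green"
    [1, 0, 1],  # "square"
    [0, 1, 0],  # "circle"
]
prior = [1 / 3, 1 / 3, 1 / 3]

l0, s1, l1 = rsa(lexicon, prior)
print(l0[0])  # literal listener hearing "blue": [0.5, 0.5, 0.0]
print(l1[0])  # pragmatic listener hearing "blue": roughly [0.6, 0.4, 0.0]
```

In this toy game the literal listener is indifferent between the two blue objects, while the pragmatic listener leans toward the blue square: a speaker who meant the blue circle would more likely have said "circle".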
Neural networks are dense, parametric, and continuous, while language is sparse, non-parametric, and discrete. So how can the former process the latter? Famously, one uses one-hot encodings and softmax sampling to translate between continuous and discrete domains. One uses word embeddings to represent sparse sets of words as dense clouds of semantic vectors. One uses recurrent neural networks to reduce variable-length sequence problems to local, parametric ones. But there has been another breakthrough recently: one can use attention mechanisms to model long-distance relationships between words! Attention lies at the core of this week's papers.
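As a rough illustration of how attention relates words regardless of their distance, here is a minimal sketch of scaled dot-product attention in plain Python. This is one common form of attention and is an assumption here; this week's papers may use a different variant. The toy two-dimensional word vectors are made up for the example.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score every position against the query, near or far alike.
        weights = softmax([dot(q, k) / math.sqrt(d_k) for k in keys])
        # Output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy "sentence" of three word vectors; the first and last are identical.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = attention(words, words, words)
print(weights[0])
```

In this self-attention setup the first word attends to the third word exactly as strongly as to itself, and less to the second word, because the weights depend only on vector similarity and not on position.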
Discriminative models have key limitations: they do not model the probability of seeing a given input example and therefore cannot generate new examples. Generative Adversarial Networks (GANs) are a class of generative models in which a generator and a discriminator are trained to compete against one another in a two-player game. The generator attempts to create samples that deceive the discriminator into believing they are real, and the discriminator attempts to determine which samples are real and which are generated. We explore the motivation behind GANs and the basic theory of how they work, and look at the future of generative models.
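The two-player game can be summarized by a single value function. The sketch below assumes the standard minimax formulation, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximize and the generator tries to minimize. The function name and the batch values are hypothetical, and no networks are trained here; the code only evaluates the objective on given discriminator outputs.

```python
import math

def gan_value(d_real, d_fake):
    """Batch estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs in (0, 1) on real samples.
    d_fake: discriminator outputs in (0, 1) on generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that separates real from generated samples well.
confident = gan_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])

# A fully fooled discriminator that outputs 0.5 everywhere.
fooled = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(confident, fooled)
```

The value is higher when the discriminator tells the samples apart and drops to -log 4 when it can only guess, which is the outcome the generator is pushing toward.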
Self-supervision has been a key source of training signal for many recent state-of-the-art natural language processing models. We explore a new use case for self-supervised learning with VideoBERT, a joint visual-linguistic model trained to learn high-level features without any explicit supervision.