123 episodes

Welcome to the NLP Highlights podcast, where we invite researchers to talk about their work in various areas of natural language processing. The hosts are Matt Gardner, Pradeep Dasigi (research scientists at the Allen Institute for Artificial Intelligence) and Waleed Ammar (research scientist at Google). All views expressed belong to the hosts and guests and do not represent their employers.

NLP Highlights, Allen Institute for Artificial Intelligence

    • Science

    122 - Statutory Reasoning in Tax Law, with Nils Holzenberger

    We invited Nils Holzenberger, a PhD student at JHU, to talk about a dataset on statutory reasoning in tax law that Holzenberger et al. released recently. The dataset includes difficult textual entailment and question answering problems that require reasoning about how sections of tax law apply to specific cases. The authors also released a Prolog solver that fully solves the problems, and showed that learned models using dense representations of text perform poorly. We discussed why this is the case, and how one can train models to solve these challenges.

    Project webpage: https://nlp.jhu.edu/law/

    • 46 min
    121 - Language and the Brain, with Alona Fyshe

    We invited Alona Fyshe to talk about the link between NLP and the human brain. We began by talking about what we currently know about the connection between representations used in NLP and representations recorded in the brain. We also discussed how different brain imaging techniques compare to each other. We then dove into experiments investigating how hidden states of LSTM language models correlate with EEG brain imaging data on three types of language inputs: well-formed grammatical sentences, pseudo-word sentences preserving syntax but not semantics, and word lists preserving neither. We talked about the kinds of conclusions that can be drawn from these correlations and concluded by discussing avenues for future work.

    • 42 min
    120 - Evaluation of Text Generation, with Asli Celikyilmaz

    We invited Asli Celikyilmaz for this episode to talk about evaluation of text generation systems. We discussed the challenges in evaluating generated text, and covered human and automated metrics, with a discussion of recent developments in learning metrics. We also talked about some open research questions, including the difficulties in evaluating factual correctness of generated text.

    Asli Celikyilmaz is a Principal Researcher at Microsoft Research.
    Link to a survey co-authored by Asli on this topic: https://arxiv.org/abs/2006.14799

    • 55 min
    119 - Social NLP, with Diyi Yang

    In this episode, Diyi Yang gives us an overview of using NLP models for social applications, including understanding social relationships, processes, roles, and power. As NLP systems are getting used more and more in the real world, they additionally have increasing social impacts that must be studied. We talk about how to get started in this field, what datasets exist and are commonly used, and potential ethical issues. We additionally cover two of Diyi's recent papers, on neutralizing subjective bias in text, and on modeling persuasiveness in text.

    Diyi Yang is an assistant professor in the School of Interactive Computing at Georgia Tech.

    • 53 min
    118 - Coreference Resolution, with Marta Recasens

    In this episode, we talked about coreference resolution with Marta Recasens, a Research Scientist at Google. We discussed the complexity involved in resolving references in language, how the NLP community has simplified the problem by focusing on specific datasets, and the complex coreference phenomena that are not yet captured in those datasets. We also briefly talked about how coreference is handled in languages other than English, and how some of the notions we have about modeling coreference phenomena in English do not necessarily transfer to other languages. We ended the discussion by talking about large language models, and to what extent they might be good at handling coreference.

    • 47 min
    117 - Interpreting NLP Model Predictions, with Sameer Singh

    We interviewed Sameer Singh for this episode and discussed recent work on interpreting NLP model predictions, particularly instance-level interpretations. We started out by talking about why it is important to interpret model outputs and why it is a hard problem. We then dove into the details of three kinds of interpretation techniques: attribution-based methods, interpretation using influence functions, and generating explanations. Towards the end, we spent some time discussing how explanations of model behavior can be evaluated, and some limitations and potential concerns in evaluation methods.

    Sameer Singh is an Assistant Professor of Computer Science at the University of California, Irvine.
    Some of the techniques discussed in this episode have been implemented in the AllenNLP Interpret framework (details and demo here: https://allennlp.org/interpret).

    • 56 min
