
Sebastian Ruder
I'm a research scientist at DeepMind, London. I completed my PhD in Natural Language Processing and Deep Learning at the Insight Research Centre for Data Analytics, while working as a research scientist at Dublin-based text analytics startup AYLIEN.
Sebastian Ruder
3M ago
This post gives a brief overview of modularity in deep learning. For a more in-depth review, refer to our survey. For modular fine-tuning for NLP, check out our EMNLP 2022 tutorial. For more resources, check out modulardeeplearning.com.
Fuelled by scaling laws, state-of-the-art models in machine learning have been growing larger and larger. These models are monoliths. They are pre-trained from scratch in highly choreographed engineering endeavours. Due to their size, fine-tuning has become expensive, while alternatives such as in-context learning are often brittle in practice. At the same time …
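To make the contrast with monolithic fine-tuning concrete, here is a minimal sketch of one common modular building block: a bottleneck adapter in the style of Houlsby et al. (2019), written in PyTorch. The class name, bottleneck size, and placement are illustrative assumptions rather than the specific formulation used in the survey; the point is simply that only a small module is trained while the large pre-trained model stays frozen.

```python
# Illustrative sketch (not from the post): a bottleneck adapter, one of the
# modular alternatives to full fine-tuning. Only the adapter is trained;
# the pre-trained transformer's weights stay frozen.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen model's representation intact.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Usage: apply the adapter to the output of a frozen transformer layer.
adapter = BottleneckAdapter(hidden_size=768)
hidden = torch.randn(2, 16, 768)   # (batch, sequence, hidden)
adapted = adapter(hidden)          # same shape as the input
# Roughly 100k trainable parameters, versus hundreds of millions in the frozen LM.
print(sum(p.numel() for p in adapter.parameters()))
```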
Sebastian Ruder
4M ago
Models that allow interaction via natural language have become ubiquitous. Research models such as BERT and T5 have become much more accessible, while the latest generation of language and multi-modal models are demonstrating increasingly powerful capabilities. At the same time, a wave of NLP startups has started to put this technology to practical use.
While such language technology may be hugely impactful, recent models have mostly focused on English and a handful of other languages with large amounts of resources. Developing models that work for more languages is important in order to offset …
Sebastian Ruder
4M ago
ACL 2022 took place in Dublin from 22nd–27th May 2022. This was my first in-person conference since ACL 2019 and my first conference highlights post since NAACL 2019. With 1032 accepted papers (604 long, 97 short, 331 in Findings), this post can only offer a glimpse of the diverse research presented at the conference, biased towards my research interests.
Here are the themes that were most noticeable for me across the conference program:
Language Diversity and Multimodality
Prompting
Next Big Ideas
Favorite Papers
The Dark Matter of Language and Intelligence
Hybrid Conference Experience …
Sebastian Ruder
4M ago
Credit for the title image: Liu et al. (2021)
2021 saw many exciting advances in machine learning (ML) and natural language processing (NLP). In this post, I will cover the papers and research areas that I found most inspiring. I tried to cover the papers that I was aware of but likely missed many relevant ones. Feel free to highlight them as well as ones that you found inspiring in the comments. I discuss the following highlights:
Universal Models
Massive Multi-task Learning
Beyond the Transformer
Prompting
Efficient Methods
Benchmarking
Conditional Image Generation
ML for Science
Program Synthesis …
Sebastian Ruder
4M ago
This post expands on the EMNLP 2021 tutorial on Multi-domain Multilingual Question Answering.
The tutorial was organised by Avi Sil and me. In this post, I highlight key insights and takeaways of the tutorial. The slides are available online. You can find the table of contents below:
Introduction
Open-Retrieval QA vs Reading Comprehension
What is a Domain?
Multi-Domain QA
Datasets for Multi-Domain QA
Multi-Domain QA Models
Unsupervised Domain Adaptation for QA
Domain Adaptation with Pre-trained LMs
Domain Generalization
Multilingual QA
Datasets for Multilingual QA
Issues of Multil…
Sebastian Ruder
4M ago
Over the last few years, models in NLP have become much more powerful, driven by advances in transfer learning. A consequence of this drastic increase in performance is that existing benchmarks have been left behind. Recent models "have outpaced the benchmarks to test for them" (AI Index Report 2021), quickly reaching super-human performance on standard benchmarks such as SuperGLUE and SQuAD. Does this mean that we have solved natural language processing? Far from it.
However, the traditional practices for evaluating the performance of NLP models, using a single metric such as accuracy or BLEU, relying …
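For concreteness, this is what that single-metric practice typically looks like in code: one aggregate number per test set. The snippet below is an illustrative sketch assuming the sacrebleu package and a toy classification example; it is not taken from the post.

```python
# Illustrative sketch (not from the post) of single-metric evaluation:
# each evaluation collapses a whole test set into one number.
import sacrebleu

# Classification: a single accuracy score.
predictions = [1, 0, 1, 1]
gold_labels = [1, 0, 0, 1]
accuracy = sum(p == g for p, g in zip(predictions, gold_labels)) / len(gold_labels)
print(f"accuracy: {accuracy:.2f}")

# Machine translation: a single corpus-level BLEU score.
hypotheses = ["the cat sat on the mat", "there is a dog in the garden"]
# One reference stream, aligned with the hypotheses.
references = [["the cat is on the mat", "a dog is in the garden"]]
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")
```

Each call reduces model behaviour to a single aggregate score, which is exactly the practice the post argues is no longer sufficient on its own.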
Sebastian Ruder
4M ago
ACL 2021 took place virtually from 1–6 August 2021. Here are my highlights from the conference:
NLP benchmarking is broken
NLP is all about pre-trained Transformers
Machine translation
Understanding models
Cross-lingual transfer and multilingual NLP
Challenges in natural language generation
Virtual conference notes
NLP benchmarking is broken
Many talks and papers made reference to the current state of NLP benchmarking, which has seen existing benchmarks largely outpaced by rapidly improving pre-trained models.
My favourite resources on this topic from the conference are:
Chris Potts's keynote …
Sebastian Ruder
4M ago
Fine-tuning a pre-trained language model (LM) has become the de facto standard for doing transfer learning in natural language processing. Over the last three years (Ruder, 2018), fine-tuning (Howard & Ruder, 2018) has superseded the use of feature extraction of pre-trained embeddings (Peters et al., 2018) while pre-trained language models are favoured over models trained on translation (McCann et al., 2018), natural language inference (Conneau et al., 2017), and other tasks due to their increased sample efficiency and performance (Zhang and Bowman, 2018). The empirical success of these methods …
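As an illustration of the two transfer styles mentioned here, the sketch below contrasts feature extraction (freezing the pre-trained encoder and training only a task head) with full fine-tuning, using the Hugging Face Transformers API. The checkpoint name, learning rate, and toy batch are assumptions made for the example, not details from the post.

```python
# Illustrative sketch (not from the post): feature extraction vs. fine-tuning
# with a pre-trained LM, assuming the transformers and torch packages.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Feature extraction: freeze the pre-trained encoder, train only the classifier head.
for param in model.bert.parameters():
    param.requires_grad = False

# Fine-tuning (the now-standard recipe): leave every parameter trainable instead,
# typically with a small learning rate.
# for param in model.parameters():
#     param.requires_grad = True

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)

# One toy training step on a tiny sentiment batch.
batch = tokenizer(["a great movie", "a dull movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
```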
Sebastian Ruder
4M ago
The selection of areas and methods is heavily influenced by my own interests; the selected topics are biased towards representation and transfer learning and towards natural language processing (NLP). I tried to cover the papers that I was aware of but likely missed many relevant ones—feel free to highlight them in the comments below. In all, I discuss the following highlights:
Scaling up—and down
Retrieval augmentation
Few-shot learning
Contrastive learning
Evaluation beyond accuracy
Practical concerns of large LMs
Multilinguality
Image Transformers
ML for science
Reinforcement learning
…
Sebastian Ruder
4M ago
Natural language processing (NLP) research predominantly focuses on developing methods that work well for English despite the many benefits of working on other languages. These benefits range from an outsized societal impact to modelling a wealth of linguistic features and avoiding overfitting, as well as posing interesting challenges for machine learning (ML).
There are around 7,000 languages spoken around the world. The map above (see the interactive version at Langscape) gives an overview of where they are spoken, with each green circle representing a native language. Most of t…