
Sebastian Ruder
I'm a research scientist at DeepMind, London. I completed my PhD in Natural Language Processing and Deep Learning at the Insight Research Centre for Data Analytics, while working as a research scientist at Dublin-based text analytics startup AYLIEN.
Sebastian Ruder
3M ago
This post gives a brief overview of modularity in deep learning. For a more in-depth review, refer to our survey. For modular fine-tuning for NLP, check out our EMNLP 2022 tutorial. For more resources, check out modulardeeplearning.com.
Fuelled by scaling laws, state-of-the-art models in machine learning have been growing larger and larger. These models are monoliths. They are pre-trained from scratch in highly choreographed engineering endeavours. Due to their size, fine-tuning has become expensive, while alternatives such as in-context learning are often brittle in practice. At the same time …
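To make the contrast with monolithic fine-tuning concrete, here is a minimal sketch of one common modular building block: a bottleneck adapter in the style of Houlsby et al. (2019), written in PyTorch. The class name, bottleneck size, and placement are illustrative assumptions rather than the specific formulation used in the survey; the point is simply that only a small module is trained while the large pre-trained model stays frozen.

```python
# Illustrative sketch (not from the post): a bottleneck adapter, one of the
# modular alternatives to full fine-tuning. Only the adapter is trained;
# the pre-trained transformer's weights stay frozen.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen model's representation intact.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Usage: apply the adapter to the output of a frozen transformer layer.
adapter = BottleneckAdapter(hidden_size=768)
hidden = torch.randn(2, 16, 768)   # (batch, sequence, hidden)
adapted = adapter(hidden)          # same shape as the input
# Roughly 100k trainable parameters, versus hundreds of millions in the frozen LM.
print(sum(p.numel() for p in adapter.parameters()))
```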
Sebastian Ruder
4M ago
Models that allow interaction via natural language have become ubiquitous. Research models such as BERT and T5 have become much more accessible, while the latest generation of language and multi-modal models are demonstrating increasingly powerful capabilities. At the same time, a wave of NLP startups has started to put this technology to practical use.
While such language technology may be hugely impactful, recent models have mostly focused on English and a handful of other languages with large amounts of resources. Developing models that work for more languages is important in order to offset …
Sebastian Ruder
4M ago
ACL 2022 took place in Dublin from 22nd–27th May 2022. This was my first in-person conference since ACL 2019 and my first conference highlights post since NAACL 2019. With 1032 accepted papers (604 long, 97 short, 331 in Findings), this post can only offer a glimpse of the diverse research presented at the conference, biased towards my research interests.
Here are the themes that were most noticeable for me across the conference program:
Language Diversity and Multimodality
Prompting
Next Big Ideas
Favorite Papers
The Dark Matter of Language and Intelligence
Hybrid Conference Experience …
Sebastian Ruder
4M ago
Credit for the title image: Liu et al. (2021)
2021 saw many exciting advances in machine learning (ML) and natural language processing (NLP). In this post, I will cover the papers and research areas that I found most inspiring. I tried to cover the papers that I was aware of but likely missed many relevant ones. Feel free to highlight them as well as ones that you found inspiring in the comments. I discuss the following highlights:
Universal Models
Massive Multi-task Learning
Beyond the Transformer
Prompting
Efficient Methods
Benchmarking
Conditional Image Generation
ML for Science
Program Synthesis …
Sebastian Ruder
4M ago
This post expands on the EMNLP 2021 tutorial on Multi-domain Multilingual Question Answering.
The tutorial was organised by Avi Sil and me. In this post, I highlight key insights and takeaways of the tutorial. The slides are available online. You can find the table of contents below:
Introduction
Open-Retrieval QA vs Reading Comprehension
What is a Domain?
Multi-Domain QA
Datasets for Multi-Domain QA
Multi-Domain QA Models
Unsupervised Domain Adaptation for QA
Domain Adaptation with Pre-trained LMs
Domain Generalization
Multilingual QA
Datasets for Multilingual QA
Issues of Multil…
Sebastian Ruder
4M ago
Over the last few years, models in NLP have become much more powerful, driven by advances in transfer learning. A consequence of this drastic increase in performance is that existing benchmarks have been left behind. Recent models "have outpaced the benchmarks to test for them" (AI Index Report 2021), quickly reaching super-human performance on standard benchmarks such as SuperGLUE and SQuAD. Does this mean that we have solved natural language processing? Far from it.
However, the traditional practices for evaluating the performance of NLP models, using a single metric such as accuracy or BLEU, relying …
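For concreteness, this is what that single-metric practice typically looks like in code: one aggregate number per test set. The snippet below is an illustrative sketch assuming the sacrebleu package and a toy classification example; it is not taken from the post.

```python
# Illustrative sketch (not from the post) of single-metric evaluation:
# each evaluation collapses a whole test set into one number.
import sacrebleu

# Classification: a single accuracy score.
predictions = [1, 0, 1, 1]
gold_labels = [1, 0, 0, 1]
accuracy = sum(p == g for p, g in zip(predictions, gold_labels)) / len(gold_labels)
print(f"accuracy: {accuracy:.2f}")

# Machine translation: a single corpus-level BLEU score.
hypotheses = ["the cat sat on the mat", "there is a dog in the garden"]
# One reference stream, aligned with the hypotheses.
references = [["the cat is on the mat", "a dog is in the garden"]]
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")
```

Each call reduces model behaviour to a single aggregate score, which is exactly the practice the post argues is no longer sufficient on its own.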
Sebastian Ruder
4M ago
ACL 2021 took place virtually from 1–6 August 2021. Here are my highlights from the conference:
NLP benchmarking is broken
NLP is all about pre-trained Transformers
Machine translation
Understanding models
Cross-lingual transfer and multilingual NLP
Challenges in natural language generation
Virtual conference notes
NLP benchmarking is broken
Many talks and papers made reference to the current state of NLP benchmarking, which has seen existing benchmarks largely outpaced by rapidly improving pre-trained models.
My favourite resources on this topic from the conference are:
Chris Potts's keynote …
Sebastian Ruder
4M ago
Fine-tuning a pre-trained language model (LM) has become the de facto standard for doing transfer learning in natural language processing. Over the last three years (Ruder, 2018), fine-tuning (Howard & Ruder, 2018) has superseded the use of feature extraction of pre-trained embeddings (Peters et al., 2018) while pre-trained language models are favoured over models trained on translation (McCann et al., 2018), natural language inference (Conneau et al., 2017), and other tasks due to their increased sample efficiency and performance (Zhang and Bowman, 2018). The empirical success of these methods …
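As an illustration of the two transfer styles mentioned here, the sketch below contrasts feature extraction (freezing the pre-trained encoder and training only a task head) with full fine-tuning, using the Hugging Face Transformers API. The checkpoint name, learning rate, and toy batch are assumptions made for the example, not details from the post.

```python
# Illustrative sketch (not from the post): feature extraction vs. fine-tuning
# with a pre-trained LM, assuming the transformers and torch packages.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Feature extraction: freeze the pre-trained encoder, train only the classifier head.
for param in model.bert.parameters():
    param.requires_grad = False

# Fine-tuning (the now-standard recipe): leave every parameter trainable instead,
# typically with a small learning rate.
# for param in model.parameters():
#     param.requires_grad = True

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)

# One toy training step on a tiny sentiment batch.
batch = tokenizer(["a great movie", "a dull movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
```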
Sebastian Ruder
4M ago
The selection of areas and methods is heavily influenced by my own interests; the selected topics are biased towards representation and transfer learning and towards natural language processing (NLP). I tried to cover the papers that I was aware of but likely missed many relevant ones—feel free to highlight them in the comments below. In all, I discuss the following highlights:
Scaling up—and down
Retrieval augmentation
Few-shot learning
Contrastive learning
Evaluation beyond accuracy
Practical concerns of large LMs
Multilinguality
Image Transformers
ML for science
Reinforcement learning
…
Sebastian Ruder
4M ago
Natural language processing (NLP) research predominantly focuses on developing methods that work well for English despite the many benefits of working on other languages. These benefits range from an outsized societal impact to modelling a wealth of linguistic features and avoiding overfitting, as well as posing interesting challenges for machine learning (ML).
There are around 7,000 languages spoken around the world. The map above (see the interactive version at Langscape) gives an overview of where they are spoken, with each green circle representing a native language. Most of t…