ML and NLP Publications in 2021
MAREK REI
by Marek
2y ago
I am sharing here the yearly paper analysis for 2021, containing statistics about ML and NLP publications from the past year. It has arrived later than in previous years - preparing it this time took quite a bit longer than intended. The new data required some manual cleaning and updating of the pipeline, which meant the analysis got delayed quite a bit. But finally, here it is now. The analysis of the papers is done using a series of automated tools. These processes are not perfect so some noise and errors may occur. Some authors have also recently started releasing their papers in an obfusc ..read more
Advice for students doing research projects in ML/NLP
MAREK REI
by Marek
2y ago
This is a collection of advice that I give to students doing research projects in NLP/ML/AI. It includes suggestions that I wish I had known when I myself first started, as well as lessons from supervising students in previous years. I would recommend reading this once before starting your project, then again after about a month or two into the project. Different things will seem relevant to you. Implementation and debugging: Don’t assume that your code is bug-free – check and debug. Deep learning code can be difficult to get right. Make sure the tensors have the right shapes and dimensions. O ..read more
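The advice above about checking tensor shapes can be illustrated with a small sketch. It is not taken from the post: the helpers `shape` and `check_shape` are hypothetical names, and plain nested lists stand in for framework tensors so the example runs without any dependencies.

```python
def shape(tensor):
    """Return the shape of a rectangular nested list; raise on ragged input."""
    if not isinstance(tensor, list):
        return ()          # a scalar has an empty shape
    if not tensor:
        return (0,)
    inner = shape(tensor[0])
    for item in tensor[1:]:
        if shape(item) != inner:
            raise ValueError("ragged tensor: rows have different shapes")
    return (len(tensor),) + inner


def check_shape(tensor, expected, name="tensor"):
    """Compare against an expected shape; None in `expected` matches any size."""
    actual = shape(tensor)
    if len(actual) != len(expected):
        raise ValueError(f"{name}: expected {len(expected)} dims, got {actual}")
    for axis, (got, want) in enumerate(zip(actual, expected)):
        if want is not None and got != want:
            raise ValueError(f"{name}: axis {axis} is {got}, expected {want}")
    return actual


# Hypothetical batch: 2 sentences, 3 tokens each, 4-dimensional embeddings.
batch = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]
checked = check_shape(batch, (None, 3, 4), name="batch")
```

With a real deep learning framework, the same idea is usually a one-line assertion on the tensor's shape attribute placed right after each non-trivial operation.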
ML/NLP Publications in 2017
MAREK REI
by Marek
3y ago
It has been a very productive year for NLP and ML research. Both areas continued to grow, with conferences reaching record numbers of publications. In this post I will break these numbers down a bit more, by individual authors and organisations. The statistics cover the following venues: ACL, EMNLP, NAACL, EACL, COLING, TACL, CL, CoNLL, *Sem+SemEval, NIPS, ICML, ICLR. Compared to last year, I’ve now included ICLR which has grown very rapidly in the last two years and become a highly competitive conference. The analysis is done automatically, by crawling publication information from the confere ..read more
Attending to characters in neural sequence labeling models
MAREK REI
by Marek
3y ago
Word embeddings are great. They allow us to represent words as distributed vectors, such that semantically and functionally similar words have similar representations. Having similar vectors means these words also behave similarly in the model, which is what we want for good generalisation properties. However, word embeddings have a couple of weaknesses: If a word doesn’t exist in the training data, we can’t have an embedding for it. Therefore, the best we can do is clump all unseen words together under a single OOV (out-of-vocabulary) token. If a word only occurs a couple of times ..read more
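The first weakness described above, clumping all unseen words under a single OOV token, can be sketched in a few lines. This is an illustrative assumption rather than the post's own code: the token string `<OOV>` and the helpers `build_vocab` and `encode` are hypothetical names.

```python
from collections import Counter

OOV = "<OOV>"  # hypothetical name for the out-of-vocabulary token


def build_vocab(tokens, min_count=1):
    """Map each training word seen at least min_count times to an integer id."""
    counts = Counter(tokens)
    vocab = {OOV: 0}  # id 0 is reserved for every unseen word
    for word, count in counts.items():
        if count >= min_count:
            vocab[word] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Replace each word with its id, falling back to the shared OOV id."""
    return [vocab.get(token, vocab[OOV]) for token in tokens]


train_tokens = "the cat sat on the mat".split()
vocab = build_vocab(train_tokens)

# "dog" never appeared in training, so it shares the OOV id with
# every other unseen word and the model cannot tell them apart.
ids = encode("the dog sat".split(), vocab)
```

Raising `min_count` also sends rare training words to the OOV id, which is one common way of giving that embedding some training signal.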
NLP and ML Publications – Looking Back at 2016
MAREK REI
by Marek
3y ago
After my last post on analysing publication patterns I received quite a lot of feedback and many feature requests, so I decided to create an update once 2016 is over. It is now quite a bit bigger than before, and includes 11 different conferences and journals: ACL, EACL, NAACL, EMNLP, COLING, CL, TACL, CoNLL, *Sem+SemEval, NIPS, and ICML. The information used in these graphs was collected through crawling the web. ACL Anthology was very useful, listing papers in a consistent format. However, information such as the organisation names in each paper still needed to be extracted directly from the ..read more
Analysing NLP publication patterns
MAREK REI
by Marek
3y ago
Recently, I got curious about finding out how much different institutions publish in my area. Does Google publish more than Microsoft? Which university has the strongest publication record in NLP? And are there any interesting trends that can be seen in recent years? Quantity does not necessarily equal quality, but the number of publications is still a reasonable indicator of general activity in the field, how big the research group is, and how outward-facing the research projects are. My approach was to crawl papers from the 6 biggest conferences that are relevant to my research: ACL, EAC ..read more
ML and NLP Publications in 2020
MAREK REI
by Marek
3y ago
I ran my paper analysis pipeline once again in order to get statistics for 2020. It certainly was an unusual year. While ML and NLP conferences again had more publications than ever before, most of them needed to quickly adapt to a new remote format. Each conference took a slightly different approach as everyone was trying to figure out how to make this work. I heard especially positive comments about EMNLP 2020, regarding their smooth organisation and engaging technical solutions. Overall, I think the remote format has its pluses and minuses - while it certainly complicates networking and soc ..read more
ML and NLP Publications in 2019
MAREK REI
by Marek
4y ago
It is about time we once again take a look at the publication statistics of the past year. 2019 was another record-breaking year in machine learning and NLP research. Nearly all conferences had more attendees and more publications than ever before. For example, NeurIPS had 6,743 submissions and 1,428 accepted papers, which eclipses all the previous iterations. Because the conference sold out so fast last year, the organizers had to implement a randomised lottery for the tickets this time. In this post you will find a number of graphs to illustrate the publication patterns from 2019. I have ..read more
74 Summaries of Machine Learning and NLP Research
MAREK REI
by Marek
4y ago
My previous post on summarising 57 research papers turned out to be quite useful for people working in this field, so it is about time for a sequel. Below you will find short summaries of a number of different research papers published in the areas of Machine Learning and Natural Language Processing in the past couple of years (2017-2019). They cover a wide range of different topics, authors and venues. These are not meant to be reviews showing my subjective opinion, but instead I aim to provide a blunt and concise overview of the core contribution of each publication. Given how many papers a ..read more
The Geographic Diversity of NLP Conferences
MAREK REI
by Andrew Caines
4y ago
The growth of interest in NLP technology, fuelled largely by investment in AI applications, has been accompanied by unprecedented expansion of the preeminent NLP conferences: ACL, NAACL and EMNLP in particular. Grzegorz Chrupała’s graph shows the rapid growth in submissions to each conference in recent years. (Figure 1. Graph of conference submission counts, by Grzegorz Chrupała.) These trends can also be seen in the blog post by the ACL 2019 Chairs. Meanwhile, acceptance rates have tended to remain constant, meaning that the conferences have grown in line with submissions, e.g. from ACL 2019: Tab ..read more