Published on Feb 06, 2021 (edited on Dec 12, 2021)
In recent years, Machine Learning (ML) and its applied branch known as Data Science (DS) have exploded in popularity. Industry demand for highly skilled professionals has skyrocketed, and so have new developments in statistical modelling techniques and algorithms. The number of new and improved machine learning methods published in peer-reviewed venues, e.g., conferences and journals, as well as online, e.g., arXiv.org and blogs, has grown so fast that keeping up feels impossible.
To illustrate the point, the chart below shows the number of papers published annually from 2010 to 2019 by the two top machine learning conferences, namely the International Conference on Machine Learning (ICML) and the Conference on Neural Information Processing Systems (NeurIPS).
These two conferences are not the only venues where new research is published. For the sake of brevity, the chart does not include many other machine learning, data science, and artificial intelligence conferences such as ICLR, KDD, AAAI, and IJCAI, nor several major computer vision conferences such as CVPR, ICCV, and ECCV; the latter in particular have exploded in popularity in just half a decade due to the success of Deep Learning and Convolutional Neural Networks.
How can machine learning researchers and data science professionals keep up with this flood of published research?
We all know that it is important to continuously read newly published research. But given that no tool or method can distil new knowledge without some effort on your part, what is one to do?
Our team of experienced researchers has put together a 4-step guide to help you keep up with published research.
Step 1: Specialise
Machine learning is a large research field covering many important topics, and it is next to impossible to keep up with all of it. Your first step should be to specialise, that is, to determine which sub-fields (one or two is our suggestion) are of most interest to you. For example, perhaps your focus is on image segmentation, question answering, graph representation learning, or time-series forecasting. You could also pick an intersection of two areas, such as computer vision and reinforcement learning applied to self-driving cars. Both of those topics are large, but you should not be spending too much of your valuable time reading papers on, say, text summarisation or conversational agents.
That said, once you are comfortably familiar with the core ideas and research works in your specialisation, then you should try reading more broadly.
Step 2: Make reading part of your routine
Reading new papers should not be an activity undertaken at random times or whenever you have free time from other research activities. You should make it part of your weekly routine.
Schedule a block of time, weekly or fortnightly, dedicated to reading. How much time you need depends on how long it takes you to read a paper; we suggest dedicating 2-4 hours weekly. Depending on how familiar you are with the subject, this will allow you to read 2-6 papers. Your focus should be on understanding what problem each paper solves, how it solves it, and how it relates to prior art. Not all papers are worth spending hours to read and understand; put aside those you believe are more significant and come back to them for a more thorough reading. Last but most important, don't forget to take notes on each paper you read. You can always return to your notes a few weeks or months later to refresh your memory, rather than spend hours rereading the same papers.
Step 3: Find the best papers to read
Even if you execute Step 1 perfectly, the number of new (and old) papers you have to read can still be overwhelming. Furthermore, due to the ease and popularity of online publishing before peer review, e.g., posting on arXiv.org, the quality of papers can vary significantly. Your goal is to read the best works. So, how can you find the best papers to read?
We don't have a perfect answer to this question, but our main strategy has always been to follow the leading labs and researchers and their publications. Once you have completed Step 1, try to identify the key papers in your ML specialisation. Usually, the key papers are the most cited ones; try to find some published in the last 5 years or so. Next, identify the authors and their current affiliations. Most university labs have pages listing their most recently published works, and many if not most researchers also have personal pages and social media accounts, e.g., Twitter, where they post their newest and most significant papers. Follow them online to discover the newest works. You can also use online tools such as arxivsanity.org to find highly cited and trending papers.
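As a rough illustration of this shortlisting heuristic, here is a minimal Python sketch that filters a candidate list down to recent papers and ranks them by citation count. The `papers` structure and its fields (`title`, `year`, `citations`) are hypothetical, standing in for whatever metadata you collect from citation trackers; the entries below are made up for illustration, not real citation counts.

```python
def shortlist_key_papers(papers, since_year, top_n=5):
    """Keep papers published in or after `since_year`, ranked by citations.

    `papers` is a list of dicts with hypothetical keys
    'title', 'year', and 'citations'.
    """
    recent = [p for p in papers if p["year"] >= since_year]
    return sorted(recent, key=lambda p: p["citations"], reverse=True)[:top_n]


# Illustrative, made-up entries -- not real papers or citation counts.
candidates = [
    {"title": "Paper A", "year": 2018, "citations": 900},
    {"title": "Paper B", "year": 2014, "citations": 5000},
    {"title": "Paper C", "year": 2019, "citations": 1200},
]

for paper in shortlist_key_papers(candidates, since_year=2016, top_n=2):
    print(paper["title"])
```

Note that raw citation counts favour older papers, which is why the recency filter matters: Paper B above has by far the most citations but falls outside the window.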
As a bonus, look for the papers receiving best paper awards at the major machine learning conferences.
Step 4: Join (or start) a journal club
Keeping up with the latest research doesn't have to be a solitary exercise. Most university research groups organise weekly or fortnightly journal clubs, also known as reading groups, where members come together to discuss the state of the art.
A major benefit of journal clubs is that you can tap into the wisdom of the crowd to discover the best research. More senior members can also guide you in finding the best works and in understanding how published works fit together.
If your research group or work team doesn't have a journal club, then why not organise one yourself?