#1 New LFQA System That Tops the KILT Leaderboard on ELI5
Are current benchmarks and evaluation metrics really suitable for making progress on LFQA?
Open-domain long-form question answering (LFQA) is an essential challenge in natural language processing (NLP) that involves retrieving documents relevant to a given question and using them to generate an elaborate paragraph-length answer. While there has been remarkable recent progress in open-domain question answering (QA), where a short phrase or entity is enough to answer a question, much less work has been done in the area of long-form question answering. However, LFQA is an important task, especially because it provides a testbed to measure the factuality of generative text models.
Google AI now presents a new system for open-domain long-form question answering that leverages two recent advances and is set to appear at NAACL 2021. The new system achieves state-of-the-art results on ELI5, the only large-scale publicly available dataset for long-form question answering.
However, a detailed analysis reveals several issues with the benchmark that prevent using it to inform meaningful modeling advances. Google hopes that the AI community will work together to solve these issues so that researchers can climb the right hills and make meaningful progress in this challenging but important task.
#2 Google AI: Replacing Rewards with Examples in Reinforcement Learning
Robots are among us to make our daily lives easier and better. They are becoming more advanced, user-friendly, and accessible, and can work constantly without breaks, vacations, or sleep.
But the robotics industry still faces many challenges.
According to Google AI researchers, teaching agents to perform new tasks with reinforcement learning (RL), the training of machine learning models to make a sequence of decisions, requires a reward function that provides positive feedback to the agent for taking actions that lead to good outcomes. However, specifying reward functions can be tiresome, and for situations without a clear objective they can be very difficult to define.
To solve this challenge, Google has proposed a machine learning algorithm for teaching agents to solve new tasks by providing examples of success. The proposed algorithm, recursive classification of examples (RCE), does not rely on hand-crafted reward functions, distance functions, or features, but instead learns to solve tasks directly from data. The agent learns how to solve the entire task by itself, without requiring examples of any intermediate states. Experiments show that the approach outperforms prior methods that learn explicit reward functions.
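RCE itself uses a recursive, temporal-difference-style classifier update, but its core intuition — training a classifier on success examples and using its output as a stand-in for a reward function — can be illustrated in a minimal sketch. Everything below (the 2-D states, the hypothetical goal at (1, 1), and plain logistic regression instead of the recursive update) is a simplifying assumption for illustration, not the paper's algorithm:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# hypothetical 2-D states: "success" examples cluster near the goal (1, 1)
success_states = rng.normal(loc=[1.0, 1.0], scale=0.1, size=(100, 2))
# agent experience: unlabeled states scattered across the workspace
experience = rng.uniform(-1.0, 2.0, size=(100, 2))

# label success examples 1 and experience 0, then fit a classifier
X = np.vstack([success_states, experience])
y = np.concatenate([np.ones(100), np.zeros(100)])
clf = LogisticRegression().fit(X, y)

# the classifier's success probability acts as a learned reward signal:
# higher near the goal, lower far from it
near_goal = clf.predict_proba([[1.0, 1.0]])[0, 1]
far_away = clf.predict_proba([[-1.0, -1.0]])[0, 1]
```

The sketch shows why no hand-crafted reward is needed: the only supervision is a set of example success states, and the classifier generalizes a success signal from them.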
#3 This New Face Editing Framework Outperforms State-of-the-art Approaches
A group of researchers revisited cycle consistency in face editing and observed that the generator learns a trick to satisfy the cycle-consistency constraint: it hides signals in the output image. They thoroughly analyzed this phenomenon and now provide an effective solution to the problem.
The solution is a novel wavelet-based face editing method called HifaFace, for high-fidelity and arbitrary face editing.
Qualitative and quantitative results demonstrate the effectiveness of HifaFace in improving the quality of edited face images.
The key idea is to prevent the generator from encoding hidden information and encourage it to synthesize visual details. By inspecting the hidden information, researchers found that it is highly related to the high-frequency signals of an input image. They thus utilized the widely used wavelet transformation to decompose the image into domains with different frequencies and take the high-frequency parts to represent rich details.
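A single level of the Haar wavelet transform — a minimal stand-in for the wavelet transformation described above — splits an image into one low-frequency approximation and three high-frequency detail subbands; the detail subbands are what carry the rich details (and the hidden signals) the researchers focus on. The NumPy implementation below is an illustrative sketch, not HifaFace's actual code:

```python
import numpy as np

def haar2d(img):
    """Single-level 2-D Haar decomposition of a grayscale image
    (height and width must be even)."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    LL = (a + b + c + d) / 4.0  # low-frequency approximation
    LH = (a + b - c - d) / 4.0  # high-frequency horizontal detail
    HL = (a - b + c - d) / 4.0  # high-frequency vertical detail
    HH = (a - b - c + d) / 4.0  # high-frequency diagonal detail
    return LL, LH, HL, HH
```

On a perfectly smooth image the three detail subbands are zero; any non-zero energy there corresponds to the high-frequency signals in which a generator could hide information.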
Hopefully, this work will inspire researchers to solve similar problems of cycle consistency in many other tasks.
#4 Introducing Convolutions to Vision Transformers
Microsoft Cloud + AI in collaboration with McGill University recently presented a detailed study that introduces convolutions into the Vision Transformer architecture to merge the benefits of Transformers with the benefits of CNNs for image recognition tasks.
Their work introduces a new architecture, named Convolutional vision Transformer (CvT), that improves Vision Transformer (ViT) in performance and efficiency by introducing convolutions into ViT to yield the best of both designs.
Extensive experiments in this work demonstrate that the convolutional token embedding and convolutional projection, along with the multi-stage design of the network enabled by convolutions, let the proposed CvT architecture achieve superior performance while maintaining computational efficiency.
Besides, due to the built-in local context structure introduced by convolutions, CvT no longer requires a position embedding, giving it a potential advantage for adaptation to a wide range of vision tasks requiring variable input resolutions. To get the paper, go here.
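A convolutional token embedding — one of the components named above — can be sketched as a strided 2-D convolution whose output grid is flattened into a token sequence; overlapping receptive fields give each token local context, which is why CvT can drop the position embedding. The NumPy loop below is a hypothetical, unoptimized sketch (kernel size 7 and stride 4 are illustrative choices), not Microsoft's implementation:

```python
import numpy as np

def conv_token_embedding(img, kernel, stride):
    """Map an (H, W, C_in) image to a (num_tokens, C_out) sequence
    via a strided convolution with a (k, k, C_in, C_out) kernel."""
    k = kernel.shape[0]
    H, W, _ = img.shape
    c_out = kernel.shape[-1]
    out_h = (H - k) // stride + 1
    out_w = (W - k) // stride + 1
    out = np.zeros((out_h, out_w, c_out))
    for i in range(out_h):
        for j in range(out_w):
            # each output position sees an overlapping k x k patch
            patch = img[i * stride:i * stride + k, j * stride:j * stride + k, :]
            out[i, j] = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))
    # flatten the spatial grid into a token sequence for the Transformer
    return out.reshape(-1, c_out)
```

Because the token count depends only on the input size, kernel, and stride, the same embedding works at any resolution, which illustrates the variable-input-resolution advantage mentioned above.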
#5 Human Activity Analysis and Recognition from Smartphones using Machine Learning Techniques
Human Activity Recognition (HAR) has been a valuable research topic over the last few decades. Different types of machine learning models are used for this purpose, as part of analyzing human behavior through machines.
Analyzing data from wearable sensors is not a trivial task because the data are complex and high-dimensional. Nowadays, researchers mostly use smartphones or smart-home sensors to capture these data.
In this paper, researchers analyze these data with machine learning models to recognize human activities, which are now widely used for many purposes such as physical and mental health monitoring. They apply different machine learning models and compare their performances. Specifically, they use Logistic Regression (LR) as the benchmark model for its simplicity and strong performance, and compare it against Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN) models.
Additionally, they select the best set of parameters for each model by grid search and use the HAR dataset from the UCI Machine Learning Repository as a standard dataset to train and test the models. Throughout the analysis, the researchers found that the Support Vector Machine performed far better (average accuracy 96.33%) than the other methods. The study shows that the results are statistically significant by employing statistical significance tests. Read more: Human Activity Analysis and Recognition from Smartphones using Machine Learning Techniques
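The model-selection procedure described above — a grid search over hyperparameters followed by evaluation on held-out data — can be sketched with scikit-learn. The digits dataset below stands in for the UCI HAR dataset (which requires a separate download), and the small parameter grid is an illustrative assumption, not the paper's actual search space:

```python
import numpy as np
from sklearn.datasets import load_digits  # stand-in for the UCI HAR dataset
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# split into train and held-out test sets
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# small illustrative hyperparameter grid for the SVM,
# the paper's best-performing model
param_grid = {"C": [1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X_train, y_train)

# evaluate the best configuration found by the grid search
acc = search.best_estimator_.score(X_test, y_test)
```

The same loop — grid search per model, then a common test-set comparison — is what allows a fair ranking of LR, DT, SVM, RF, and ANN in the study.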