Using formal grammars to test ability of recurrent neural networks to model long-distance dependencies in sequential data

Abstract

From the early 1990s on, it has been known that long-distance dependencies (LDDs) pose particular difficulties for recurrent neural networks trained using gradient descent. Over the last number of decades various recurrent neural architectures have been proposed to overcome this problem. However, most of the research on developing computational models capable of processing sequential data fails to explicitly analyze, in terms of presence and/or degree, the LDDs within the datasets used to train and evaluate these models. This lack of understanding of the LDDs within benchmark datasets necessarily limits the analysis of model performance in relation to the specific challenge posed by LDDs. One way to address this is to use formal languages to generate benchmark datasets with specific and well-understood properties. For example, when using Strictly k-Piecewise languages to generate datasets the degree of LDDs within the generated data can be controlled through the k parameter, length of the generated strings, and by choosing appropriate forbidden strings. This talk will present research we have carried out using formal languages to explore the capacity of different RNN extensions to model LDDs, by evaluating these models on a sequence of SPk synthesized datasets, where each subsequent dataset exhibits a longer degree of LDD. Even though SPk are simple languages, the presence of LDDs does have significant impact on the performance of recurrent neural architectures, thus making them prime candidate in benchmarking tasks.

Date
Event
ACL 2019 workshop on Deep Learning and Formal Languages: Building Bridges