PhD Proposal: The Role of Syntax in Deep Compositional Models for Natural Language Processing

Talk
Mohit Iyyer
Time: 07.13.2015, 15:00 to 16:30
Location: AVW 3258

Deep neural networks have recently pushed the state-of-the-art for many natural language processing (NLP) tasks. Given variable-length sequences of words as input, such as sentences or documents, these models learn a composition function: a mathematical process for combining multiple words into a single vector. In contrast to bag-of-words (or unordered) models, syntactic models for NLP problems incorporate information about word order and sentence structure into the composition function. Here we investigate how much syntax actually helps for downstream tasks and propose future directions for both unordered and syntactic models.
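To make the contrast concrete, here is a minimal numpy sketch (not taken from the talk) comparing an unordered composition, a plain average of word vectors, with an order-sensitive one; the embeddings and the composition parameters W and b are random stand-ins for values that would be learned in practice.

```python
import numpy as np

# Toy setup: d-dimensional embeddings for a tiny vocabulary (random here;
# in practice they would be learned or pretrained).
d = 50
rng = np.random.RandomState(0)
vocab = {w: rng.randn(d) for w in "the movie was not good".split()}
W, b = rng.randn(d, 2 * d) * 0.1, np.zeros(d)  # hypothetical composition parameters

def compose_unordered(words):
    """Bag-of-words composition: average the word vectors (order is ignored)."""
    return np.mean([vocab[w] for w in words], axis=0)

def compose_sequential(words):
    """Order-sensitive composition: fold the sentence left to right,
    combining the running representation with each next word."""
    h = vocab[words[0]]
    for w in words[1:]:
        h = np.tanh(W @ np.concatenate([h, vocab[w]]) + b)
    return h

s1 = "the movie was not good".split()
s2 = "not the movie was good".split()   # same words, different order
print(np.allclose(compose_unordered(s1), compose_unordered(s2)))    # True
print(np.allclose(compose_sequential(s1), compose_sequential(s2)))  # False
```

The unordered composition assigns identical vectors to the two reorderings, while the order-sensitive composition distinguishes them.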
We start with an overview of the recursive neural network (RecNN), which, along with the convolutional network, is one of the most common deep syntactic models used in NLP. We show that these models outperform standard unordered baselines on two tasks: political ideology detection and factoid question answering. RecNNs have two distinct advantages over these simple unordered models: they nonlinearly transform the input, and they apply these transformations according to syntactic parse trees.
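A minimal sketch of RecNN composition, assuming the standard formulation p = tanh(W [c_left; c_right] + b) applied bottom-up over a binary parse tree; the tree, embeddings, and parameters below are hand-picked stand-ins, with the parse normally supplied by a syntactic parser.

```python
import numpy as np

d = 50
rng = np.random.RandomState(1)
# Hypothetical learned parameters: one composition matrix and bias shared
# across all internal nodes of the parse tree.
W, b = rng.randn(d, 2 * d) * 0.1, np.zeros(d)
vocab = {w: rng.randn(d) for w in "the movie was not good".split()}

def recnn(node):
    """Compose a binary parse tree bottom-up.
    A node is either a word (leaf) or a (left, right) tuple."""
    if isinstance(node, str):
        return vocab[node]
    left, right = node
    # p = tanh(W [c_left; c_right] + b): the composition at each internal node.
    return np.tanh(W @ np.concatenate([recnn(left), recnn(right)]) + b)

# Binary parse of "the movie was not good" (structure chosen by hand here;
# in practice it would come from a parser).
tree = (("the", "movie"), ("was", ("not", "good")))
sentence_vec = recnn(tree)  # a single d-dimensional vector for the sentence
```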
In an attempt to isolate the impact of syntax, we introduce the deep averaging network (DAN), an unordered model that retains the nonlinear transformations of RecNNs. We find that DANs achieve similar, and in some cases higher, accuracies than RecNNs and other syntactic models on sentiment analysis and question answering tasks. Due to the DAN's simpler architecture, training time is on the order of minutes instead of the hours required by syntactic models.
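A sketch of the DAN forward pass under the same toy setup: the composition step is just a mean over word embeddings, followed by stacked nonlinear layers and a softmax classifier. All parameters here are random placeholders for values learned during training, and the two-layer depth is an arbitrary choice for illustration.

```python
import numpy as np

d, n_classes = 50, 2
rng = np.random.RandomState(2)
vocab = {w: rng.randn(d) for w in "the movie was not good".split()}
# Hypothetical parameters of a two-layer DAN (random here, learned in practice).
W1, b1 = rng.randn(d, d) * 0.1, np.zeros(d)
W2, b2 = rng.randn(d, d) * 0.1, np.zeros(d)
Ws, bs = rng.randn(n_classes, d) * 0.1, np.zeros(n_classes)

def dan(words):
    """Deep averaging network: average the word embeddings (unordered),
    then pass the average through stacked nonlinear layers and a softmax."""
    h = np.mean([vocab[w] for w in words], axis=0)  # composition is just a mean
    h = np.tanh(W1 @ h + b1)
    h = np.tanh(W2 @ h + b2)
    logits = Ws @ h + bs
    return np.exp(logits) / np.sum(np.exp(logits))  # class probabilities

print(dan("the movie was not good".split()))
```

Because the expensive per-node tree composition is replaced by a single averaging step, each training example costs only a handful of matrix-vector products, which is where the minutes-versus-hours difference comes from.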
Two natural questions arise from this result: first, how can we better incorporate syntactic information into deep neural networks; and second, how far can we go with just deep unordered models? We propose one possible answer to the first question: an unsupervised shift-reduce RecNN that incrementally learns syntactic structures during training rather than relying on predetermined structures from a parser. We plan to further extend this model to generate grammatical and meaningful text, a task that is impossible without properly modeling syntax, and also describe a hybrid unordered-syntactic model for generation. For the second question, we propose an unordered "neural topic model" that learns literary character archetypes from just the raw text of novels, a task that syntactic models could tackle only with powerful single machines or clusters because of their time complexity.
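One way to picture the proposed shift-reduce composition is the greedy sketch below: word vectors are shifted onto a stack, and a scoring function decides when to reduce the top two items with the RecNN composition, so the tree structure emerges during processing rather than being fixed by a parser. The scorer and all parameters here are hypothetical stand-ins; in the proposed model they would be learned jointly, without parse-tree supervision.

```python
import numpy as np

d = 50
rng = np.random.RandomState(3)
vocab = {w: rng.randn(d) for w in "the movie was not good".split()}
W, b = rng.randn(d, 2 * d) * 0.1, np.zeros(d)
# Hypothetical parameters of a "reduce vs. shift" decision.
w_dec = rng.randn(2 * d) * 0.1

def reduce_score(left, right):
    """Stand-in score for merging the top two stack items (higher = reduce)."""
    return float(w_dec @ np.concatenate([left, right]))

def shift_reduce_compose(words):
    """Greedy shift-reduce composition: shift word vectors onto a stack and,
    when reducing scores higher than shifting (or no words remain), merge the
    top two items with the RecNN composition p = tanh(W [c1; c2] + b)."""
    buffer = [vocab[w] for w in words]
    stack = []
    while buffer or len(stack) > 1:
        can_shift, can_reduce = bool(buffer), len(stack) >= 2
        if can_reduce and (not can_shift or reduce_score(stack[-2], stack[-1]) > 0):
            right, left = stack.pop(), stack.pop()
            stack.append(np.tanh(W @ np.concatenate([left, right]) + b))
        else:
            stack.append(buffer.pop(0))
    return stack[0]  # single vector for the whole sentence

sentence_vec = shift_reduce_compose("the movie was not good".split())
```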
Examining Committee:
Committee Chair: Dr. Jordan Boyd-Graber
Dept's Representative: Dr. Ramani Duraiswami
Committee Member(s): Dr. Hal Daume III