CoNLL 2021

November 10-11, 2021

CoNLL is a yearly conference organized by SIGNLL (ACL's Special Interest Group on Natural Language Learning). This year, CoNLL will be colocated with EMNLP 2021 and will be held entirely online. However, the conference schedule will be considerate of attendees who are at the EMNLP venue. Please find the program here.

For participants who are planning to attend the conference in Punta Cana, we plan to have an extra in-person poster session. More details to come.

CoNLL 2021 Chairs and Organizers

The conference's co-chairs are:

Our email is

  • Publicity Chair: Leshem Choshen (Hebrew University of Jerusalem, Israel)
  • Publication Chair: Mareike Hartmann (University of Copenhagen, Denmark)


September 28, 2021: Please register for CoNLL through the EMNLP general registration here.

May 22, 2021: The list of areas and area chairs is now published.

April 10, 2021: We thank Jennifer Culbertson and Gary Lupyan for agreeing to give keynote talks at the conference. We look forward to them! The titles and abstracts for their talks can be found here and here.

March 18, 2021: The Call for Papers is published (see below).

March 5, 2021: We are soliciting expressions of interest from individuals who would like to serve as Area Chairs for the conference. Please sign up here to express interest in ACing in CoNLL this year.


The program is given in Punta Cana time (UTC-4).
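
For attendees joining from other time zones, the conversion can be sketched as follows. This is a minimal illustration using fixed UTC offsets only; the function name `program_time_to_offset` and the UTC+1 attendee are hypothetical, and daylight-saving rules are not handled.

```python
from datetime import datetime, timedelta, timezone

# Program times are listed in Punta Cana time, which the page gives as UTC-4.
PROGRAM_TZ = timezone(timedelta(hours=-4))


def program_time_to_offset(year, month, day, hour, minute, utc_offset_hours):
    """Convert a program slot (UTC-4) to a zone at a fixed UTC offset.

    Hypothetical helper for illustration; offsets are whole hours.
    """
    slot = datetime(year, month, day, hour, minute, tzinfo=PROGRAM_TZ)
    return slot.astimezone(timezone(timedelta(hours=utc_offset_hours)))


# Day 1 opening remarks start at 10:00 program time on November 10, 2021.
opening_utc = program_time_to_offset(2021, 11, 10, 10, 0, 0)
# Assumed example: an attendee whose local time is UTC+1.
opening_plus1 = program_time_to_offset(2021, 11, 10, 10, 0, 1)

print(opening_utc.strftime("%Y-%m-%d %H:%M"))    # 2021-11-10 14:00
print(opening_plus1.strftime("%Y-%m-%d %H:%M"))  # 2021-11-10 15:00
```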

Day 1 (November 10)

10:00-10:10: Welcome and Opening Remarks (Zoom)

10:10-11:30: Oral Session #1 [Interaction, dialogue, and grounded language learning] (Zoom)

10:10-10:30 "It's our fault!": Insights Into Users' Understanding and Interaction With an Explanatory Collaborative Dialog System Katharina Weitz, Lindsey Vanderlyn, Ngoc Thang Vu and Elisabeth Andre
10:30-10:50 Dependency Induction Through the Lens of Visual Perception Ruisi Su, Shruti Rijhwani, Hao Zhu, Junxian He, Xinyu Wang, Yonatan Bisk and Graham Neubig
10:50-11:10 VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering Ekta Sood, Fabian Kögel, Florian Strohm, Prajit Dhar and Andreas Bulling
11:10-11:30 "It seemed like an annoying woman": On the Perception and Ethical Considerations of Affective Language in Text-Based Conversational Agents Lindsey Vanderlyn, Gianna Weber, Michael Neumann, Dirk Väth, Sarina Meyer and Ngoc Thang Vu

11:30-12:00: Break (gathertown)

12:00-13:10: Keynote Talk: Linking learning to language typology / Jennifer Culbertson (Zoom)

13:10-14:10: Lunch break (gathertown)

14:10-15:50: Oral Session #2 [Theoretical analysis, probing and interpretation of language models] (Zoom)

14:10-14:30 On Language Models for Creoles Heather Lent, Emanuele Bugliarello, Miryam de Lhoneux, Chen Qiu and Anders Søgaard
14:30-14:50 Do pretrained transformers infer telicity like humans? Yiyun Zhao, Jian Gang Ngui, Lucy Hall Hartley and Steven Bethard
14:50-15:10 The Low-Dimensional Linear Geometry of Contextualized Word Representations Evan Hernandez and Jacob Andreas
15:10-15:30 Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network Verna Dankers, Anna Langedijk, Kate McCurdy, Adina Williams and Dieuwke Hupkes
15:30-15:50 Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color Mostafa Abdou, Artur Kulmizev, Daniel Hershcovich, Stella Frank, Ellie Pavlick and Anders Søgaard

15:50-16:20: Break (gathertown)

16:20-18:00: Poster Session #1 (gathertown)

Empathetic Dialog Generation with Fine-Grained Intents Yubo Xie and Pearl Pu
Enriching Language Models with Visually-grounded Word Vectors and the Lancaster Sensorimotor Norms Casey Kennington
Learning Zero-Shot Multifaceted Visually Grounded Word Embeddings via Multi-Task Training Hassan Shahmohammadi, Hendrik P. A. Lensch and Harald Baayen
Does language help generalization in vision models? Benjamin Devillers, Bhavin Choksi, Romain Bielawski and Rufin VanRullen
Understanding Guided Image Captioning Performance across Domains Edwin G. Ng, Bo Pang, Piyush Sharma and Radu Soricut
Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction Shauli Ravfogel, Grusha Prasad, Tal Linzen and Yoav Goldberg
Who’s on First?: Probing the Learning and Representation Capabilities of Language Models on Deterministic Closed Domains David Demeter and Doug Downey
Data Augmentation of Incorporating Real Error Patterns and Linguistic Knowledge for Grammatical Error Correction Xia Li and Junyi He
Agree to Disagree: Analysis of Inter-Annotator Disagreements in Human Evaluation of Machine Translation Output Maja Popović
A Multilingual Benchmark for Probing Negation-Awareness with Minimal Pairs Mareike Hartmann, Miryam de Lhoneux, Daniel Hershcovich, Yova Kementchedjhieva, Lukas Nielsen, Chen Qiu and Anders Søgaard
Explainable Natural Language to Bash Translation using Abstract Syntax Tree Shikhar Bharadwaj and Shirish Shevade
Learned Construction Grammars Converge Across Registers Given Increased Exposure Jonathan Dunn and Harish Tayyar Madabushi
Tokenization Repair in the Presence of Spelling Errors Hannah Bast, Matthias Hertel and Mostafa M. Mohamed
A Coarse-to-Fine Labeling Framework for Joint Word Segmentation, POS Tagging, and Constituent Parsing Yang Hou, Houquan Zhou, Zhenghua Li, Yu Zhang, Min Zhang, Zhefeng Wang, Baoxing Huai and Nicholas Jing Yuan
Understanding the Extent to which Content Quality Metrics Measure the Information Quality of Summaries Daniel Deutsch and Dan Roth

Day 2 (November 11)

10:00-11:40: Oral Session #3 [Lexical, compositional, and discourse semantics; Pragmatics] (Zoom)

10:00-10:20 Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal and Ido Dagan
10:20-10:40 Exploring Metaphoric Paraphrase Generation Kevin Stowe, Nils Beck and Iryna Gurevych
10:40-11:00 Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning Christos Theodoropoulos, James Henderson, Andrei Catalin Coman and Marie-Francine Moens
11:00-11:20 NOPE: A Corpus of Naturally-Occurring Presuppositions in English Alicia Parrish, Sebastian Schuster, Alex Warstadt, Omar S. Agha, Soo-Hwan Lee, Zhuoye Zhao, Samuel R. Bowman and Tal Linzen
11:20-11:40 Pragmatic competence of pre-trained language models through the lens of discourse connectives Lalchand Pandia, Yan Cong and Allyson Ettinger

11:40-13:20: Poster Session #2 (gathertown)

Predicting Text Readability from Scrolling Interactions Sian Gooding, Yevgeni Berzak, Tony Mak and Matt Sharifi
Modeling the Interaction Between Perception-Based and Production-Based Learning in Children's Early Acquisition of Semantic Knowledge Mitja Nikolaus and Abdellah Fourtassi
Scaffolded input promotes atomic organization in the recurrent neural network language model Philip A. Huebner and Jon A. Willits
Grammatical Profiling for Semantic Change Detection Andrey Kutuzov, Lidia Pivovarova and Mario Giulianelli
Deconstructing syntactic generalizations with minimalist grammars Marina Ermolaeva
Relation-aware Bidirectional Path Reasoning for Commonsense Question Answering Junxing Wang, Xinyi Li, Zhen Tan, Xiang Zhao and Weidong Xiao
Does referent predictability affect the choice of referential form? A computational approach using masked coreference resolution Laura Aina, Xixian Liao, Gemma Boleda and Matthijs Westera
Polar Embedding Ran Iwamoto, Ryosuke Kohita and Akifumi Wachi
Commonsense Knowledge in Word Associations and ConceptNet Chunhua Liu, Trevor Cohn and Lea Frermann
Cross-document Event Identity via Dense Annotation Adithya Pratapa, Zhengzhong Liu, Kimihiro Hasegawa, Linwei Li, Yukari Yamakawa, Shikun Zhang and Teruko Mitamura
Tackling Zero Pronoun Resolution and Non-Zero Coreference Resolution Jointly Shisong Chen, Binbin Gu, Jianfeng Qu, Zhixu Li, An Liu, Lei Zhao and Zhigang Chen
Negation-Instance Based Evaluation of End-to-End Negation Resolution Elizaveta Sineva, Stefan Grünewald, Annemarie Friedrich and Jonas Kuhn
Controlling Prosody in End-to-End TTS: A Case Study on Contrastive Focus Generation Siddique Latif, Inyoung Kim, Ioan Calapodescu and Laurent Besacier
A Large-scale Comprehensive Abusiveness Detection Dataset with Multifaceted Labels from Reddit Hoyun Song, Soo Hyun Ryu, Huije Lee and Jong Park
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models Qianchu Liu, Fangyu Liu, Nigel Collier, Anna Korhonen and Ivan Vulić
A Data Bootstrapping Recipe for Low-Resource Multilingual Relation Classification Arijit Nag, Bidisha Samanta, Animesh Mukherjee, Niloy Ganguly and Soumen Chakrabarti
FAST: A carefully sampled and cognitively motivated dataset for distributional semantic evaluation Stefan Evert and Gabriella Lapesa
Automatic Error Type Annotation for Arabic Riadh Belkebir and Nizar Habash

13:20-14:20: Lunch break (gathertown)

14:20-15:30: Keynote Talk: What are we learning from language? / Gary Lupyan (Zoom)

15:30-15:50: Break (gathertown)

15:50-16:50: Oral Session #4 [Language evolution, acquisition and linguistic theories] (Zoom)

15:50-16:10 The Emergence of the Shape Bias Results from Communicative Efficiency Eva Portelance, Michael C. Frank, Dan Jurafsky, Alessandro Sordoni and Romain Laroche
16:10-16:30 BabyBERTa: Learning More Grammar With Small-Scale Child-Directed Language Philip A. Huebner, Elior Sulem, Cynthia Fisher and Dan Roth
16:30-16:50 Analysing Human Strategies of Information Transmission as a Function of Discourse Context Mario Giulianelli and Raquel Fernández

16:50-17:10: Break (gathertown)

17:10-17:50: Oral Session #5 [Speech and phonology] (Zoom)

17:10-17:30 Predicting non-native speech perception using the Perceptual Assimilation Model and state-of-the-art acoustic models Juliette Millet, Ioana Chitoran and Ewan Dunbar
17:30-17:50 The Influence of Regional Pronunciation Variation on Children's Spelling and the Potential Benefits of Accent Adapted Spellcheckers Emma O'Neill, Joe Kenny, Anthony Ventresque and Julie Carson-Berndsen

17:50-18:20: Best Paper Award and Closing Remarks (Zoom)

Keynote Talks

On Day 1

Jennifer Culbertson (The University of Edinburgh, UK)

Linking learning to language typology
One of the most controversial hypotheses in linguistics is that individual-level biases in learning shape language typology at the population-level. While this hypothesis has been around a long time, it has often been supported by less than robust empirical evidence. In this talk, I present a number of studies aimed at providing new sources of evidence linking learning to key features of language. In the first part of the talk, I focus on a classic set of "language universals" which describe common word order patterns. One such pattern is word order harmony, the tendency for syntactic heads and dependents to align across phrases within a language. While harmony has long been claimed to have some special cognitive status, there is also compelling evidence that it may be driven by cognition-external processes of language change. I show that harmony is in fact favoured during learning, influencing how adults and children make inferences under noisy learning conditions, and how they extrapolate to new constructions. I then turn to a more complex pattern of word order which has been proposed to derive from constraints on syntactic representations. I report experimental and quantitative corpus-based evidence to suggest an alternative explanation of this pattern, but one nevertheless driven by learning. In the second part of the talk, I discuss the role of learning in shaping morphosyntactic patterns like grammatical gender. I argue that the different biases of children and adults during learning work together to constrain how such patterns emerge and change over time. Finally, I discuss the implications of this work for linguistic theories and models of language learning.

On Day 2

Gary Lupyan (University of Wisconsin–Madison, USA)

What are we learning from language?
Where does semantic knowledge come from? Previous work on semantic knowledge within cognitive science has focused on studying knowledge acquired from direct experience with the world and through inference. But recent advances in natural language processing combined with greater availability of large text corpora have revealed that languages encode far more semantic information than previously suspected. In some cases, knowledge that was thought to require direct perceptual experience or inferential reasoning can be derived entirely from language itself.
I will present some recent investigations of this idea showing, for example, that embedded within the distributional structure of language is substantial information about visual appearance that people can rely on to learn about what things look like. I will also discuss how distributional semantics are informing our understanding of cross-linguistic differences in word meanings, and the relationship between language and thought. I will end by speculating that the robust availability of linguistic information may conceal radical diversity in human cognition.

Call for Papers

SIGNLL invites submissions to the 25th Conference on Computational Natural Language Learning (CoNLL 2021). The focus of CoNLL is on theoretically, cognitively and scientifically motivated approaches to computational linguistics, rather than on work driven by particular engineering applications. Such approaches include:

  • Computational learning theory and other techniques for theoretical analysis of machine learning models for NLP
  • Models of first, second and bilingual language acquisition by humans
  • Models of language evolution and change
  • Computational simulation and analysis of findings from psycholinguistic and neurolinguistic experiments
  • Analysis and interpretation of NLP models, using methods inspired by cognitive science or linguistics or other methods
  • Data resources, techniques and tools for scientifically-oriented research in computational linguistics
  • Connections between computational models and formal languages or linguistic theories
  • Linguistic typology, translation, and other multilingual work
  • Theoretically, cognitively and scientifically motivated approaches to text generation

We welcome work targeting any aspect of language, including:

  • Speech and phonology
  • Syntax
  • Lexical, compositional and discourse semantics
  • Dialogue and interactive language use
  • Sociolinguistics
  • Multimodal and grounded language learning

We do not restrict the topic of submissions to this list. However, a submission’s relevance to the conference’s focus on theoretically, cognitively and scientifically motivated approaches will play an important role in the review process.

Submitted papers must be anonymous and use the EMNLP 2021 template. Submitted papers may consist of up to 8 pages of content plus unlimited space for references. Authors of accepted papers will have an additional page to address reviewers’ comments in the camera-ready version (9 pages of content in total, excluding references). Optional anonymized supplementary materials and a PDF appendix are allowed, according to the EMNLP 2021 guidelines. Please refer to the EMNLP 2021 Call for Papers for more details on the submission format. Submission is electronic, through the Softconf START conference management system, at this link:

CoNLL adheres to the ACL anonymity policy, as described in the EMNLP 2021 Call for Papers. Briefly, non-anonymized manuscripts submitted to CoNLL cannot be posted to preprint websites such as arXiv or advertised on social media after May 14.

Multiple submission policy

CoNLL 2021 will not accept papers that are currently under submission, or that will be submitted to other meetings or publications, including EMNLP. Papers submitted elsewhere as well as papers that overlap significantly in content or results with papers that will be (or have been) published elsewhere will be rejected. Authors submitting more than one paper to CoNLL 2021 must ensure that the submissions do not overlap significantly (>25%) with each other in content or results.

Important Dates

  • May 14, 2021: Anonymity period begins
  • June 14, 2021: Submission deadline
  • August 31, 2021: Notification of acceptance
  • September 13, 2021: Camera-ready papers due
  • November 10-11, 2021: Conference

All deadlines are at 11:59pm UTC-12h ("anywhere on earth").
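
As a worked example of what "anywhere on earth" means in practice, the sketch below converts a deadline date to UTC. It assumes only what is stated above (11:59pm at UTC-12); the helper name `aoe_deadline_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on earth" (AoE) corresponds to a fixed offset of UTC-12.
AOE = timezone(timedelta(hours=-12))


def aoe_deadline_utc(year, month, day):
    """Return the UTC instant of 11:59pm AoE on the given date.

    Hypothetical helper for illustration only.
    """
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)


# Submission deadline listed above: June 14, 2021.
submission = aoe_deadline_utc(2021, 6, 14)
print(submission.isoformat())  # 2021-06-15T11:59:00+00:00
```

In other words, a deadline dated June 14 AoE falls at 11:59 UTC on June 15.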

Areas and ACs

  • Language evolution, language acquisition, and linguistic theories: Ryan Cotterell, Adina Williams
  • Simulation and analysis of findings from psycholinguistic and neurolinguistic experiments: Micha Elsner, Allyson Ettinger
  • Interaction and grounded language learning: Dipendra Misra, Samira Shaikh
  • Resources and tools for scientifically motivated research: Andrew Caines, Roi Reichart
  • Multilingual work and translation: Maja Popović, Rui Wang
  • Syntax: Carlos Gómez, Rob van der Groot
  • Theoretical analysis and interpretation of ML models for NLP: Dieuwke Hupkes, Xin Eric Wang
  • Lexical, compositional, and discourse semantics: Michael Roth, Gabriel Stanovsky
  • Speech, phonology, computational social science: Tanmoy Chakraborty, Kyle Gorman


We are grateful to Google for its generous support of the conference.