CoNLL 2021

November 10-11, 2021

CoNLL is a yearly conference organized by SIGNLL (ACL's Special Interest Group on Natural Language Learning), focusing on theoretically, cognitively and scientifically motivated approaches to computational linguistics.

This year, CoNLL will be held in a hybrid format: co-located with EMNLP 2021 but also entirely accessible online. The conference's schedule will be considerate of attendees who are at the EMNLP venue, and we plan to have an extra in-person poster session. The session will be held on November 11, 9:00-10:00am in room Bavaro 4.

CoNLL 2021 will feature two invited talks, by Jennifer Culbertson (The University of Edinburgh, UK) and Gary Lupyan (University of Wisconsin–Madison, USA). Please find the complete program here.

CoNLL 2021 Chairs and Organizers

The conference's co-chairs are:

Arianna Bisazza (University of Groningen, The Netherlands)
Omri Abend (Hebrew University of Jerusalem, Israel)

Our email is conll2021chairs@gmail.com.

Publicity Chair: Leshem Choshen (Hebrew University of Jerusalem, Israel)
Publication Chair: Mareike Hartmann (University of Copenhagen, Denmark)

SIGNLL President: Julia Hockenmaier (University of Illinois at Urbana-Champaign, USA)
SIGNLL Secretary: Afra Alishahi (Tilburg University, The Netherlands)

News

November 6, 2021: The proceedings are now out and can be found in the ACL Anthology. We thank the program committee for its professional and diligent work, and would especially like to acknowledge the contribution of our outstanding reviewers; their names are listed in the title section of the proceedings.

October 19, 2021: After a few weeks of uncertainty, we are happy to announce that CoNLL will be held in a hybrid format after all! If you were planning to join online, nothing changes. If you are planning to be at the venue, please know we'll have a dedicated conference room linked to the online Zoom session.

September 28, 2021: Please register for CoNLL through the EMNLP general registration here.

May 22, 2021: The list of areas and area chairs is now published.

April 10, 2021: We thank Jennifer Culbertson and Gary Lupyan for agreeing to give keynote talks at the conference. Looking forward! The titles and abstracts for their talks can be found here and here.

March 18, 2021: The Call for Papers is published (see below).

March 5, 2021: We are soliciting expressions of interest from individuals who would like to serve as Area Chairs for the conference. Please sign up here to express interest in ACing in CoNLL this year.

Best Paper Award

The winner of the Best Paper Award is "Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network" by Verna Dankers, Anna Langedijk, Kate McCurdy, Adina Williams and Dieuwke Hupkes.

The runners-up for the award are "Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline" by Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal and Ido Dagan, and "BabyBERTa: Learning More Grammar With Small-Scale Child-Directed Language" by Philip A. Huebner, Elior Sulem, Cynthia Fisher and Dan Roth.

Program

The program is given in Punta Cana time (UTC-4).
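Online attendees may want to convert program slots to their own time. The following is a minimal sketch, assuming fixed UTC offsets: UTC-4 for the venue as stated above, and UTC+1 purely as an example attendee offset. The helper name `to_offset` is hypothetical and only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Punta Cana time as stated in the program (UTC-4).
PUNTA_CANA = timezone(timedelta(hours=-4))


def to_offset(year, month, day, hour, minute, utc_offset_hours):
    """Convert a program time given in Punta Cana time to a fixed UTC offset.

    `utc_offset_hours` is the attendee's offset from UTC (an assumption the
    caller supplies; daylight saving rules are not handled here).
    """
    local = datetime(year, month, day, hour, minute, tzinfo=PUNTA_CANA)
    target = timezone(timedelta(hours=utc_offset_hours))
    return local.astimezone(target)


# Welcome and Opening Remarks: November 10, 2021, 10:00 Punta Cana time.
opening_utc = to_offset(2021, 11, 10, 10, 0, 0)
opening_plus1 = to_offset(2021, 11, 10, 10, 0, 1)  # UTC+1 is an example offset

print(opening_utc.strftime("%Y-%m-%d %H:%M"))    # time in UTC
print(opening_plus1.strftime("%Y-%m-%d %H:%M"))  # time at UTC+1
```

Fixed offsets keep the sketch free of any timezone database dependency; attendees in regions with daylight saving changes should check their actual offset for the conference dates.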

Day 1 (November 10)

10:00-10:10: Welcome and Opening Remarks (Santo Domingo 1-2, Zoom)

10:10-11:30: Oral Session #1 [Interaction, dialogue, and grounded language learning] (Santo Domingo 1-2, Zoom)

10:10-10:30	"It's our fault!": Insights Into Users' Understanding and Interaction With an Explanatory Collaborative Dialog System	Katharina Weitz, Lindsey Vanderlyn, Ngoc Thang Vu and Elisabeth Andre
10:30-10:50	Dependency Induction Through the Lens of Visual Perception	Ruisi Su, Shruti Rijhwani, Hao Zhu, Junxian He, Xinyu Wang, Yonatan Bisk and Graham Neubig
10:50-11:10	VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering	Ekta Sood, Fabian Kögel, Florian Strohm, Prajit Dhar and Andreas Bulling
11:10-11:30	"It seemed like an annoying woman": On the Perception and Ethical Considerations of Affective Language in Text-Based Conversational Agents	Lindsey Vanderlyn, Gianna Weber, Michael Neumann, Dirk Väth, Sarina Meyer and Ngoc Thang Vu

11:30-12:00: Break (gathertown)

12:00-13:10: Keynote Talk: Linking learning to language typology / Jennifer Culbertson (Santo Domingo 1-2, Zoom)

13:10-14:10: Lunch break (gathertown)

14:10-15:50: Oral Session #2 [Theoretical analysis, probing and interpretation of language models] (Santo Domingo 1-2, Zoom)

14:10-14:30	On Language Models for Creoles	Heather Lent, Emanuele Bugliarello, Miryam de Lhoneux, Chen Qiu and Anders Søgaard
14:30-14:50	Do pretrained transformers infer telicity like humans?	Yiyun Zhao, Jian Gang Ngui, Lucy Hall Hartley and Steven Bethard
14:50-15:10	The Low-Dimensional Linear Geometry of Contextualized Word Representations	Evan Hernandez and Jacob Andreas
15:10-15:30	Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network	Verna Dankers, Anna Langedijk, Kate McCurdy, Adina Williams and Dieuwke Hupkes
15:30-15:50	Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color	Mostafa Abdou, Artur Kulmizev, Daniel Hershcovich, Stella Frank, Ellie Pavlick and Anders Søgaard

15:50-16:20: Break (gathertown)

16:20-18:00: Poster Session #1 (gathertown)

Empathetic Dialog Generation with Fine-Grained Intents (#63)	Yubo Xie and Pearl Pu
Enriching Language Models with Visually-grounded Word Vectors and the Lancaster Sensorimotor Norms (#95)	Casey Kennington
Learning Zero-Shot Multifaceted Visually Grounded Word Embeddings via Multi-Task Training (#101)	Hassan Shahmohammadi, Hendrik P. A. Lensch and Harald Baayen
Does language help generalization in vision models? (#123)	Benjamin Devillers, Bhavin Choksi, Romain Bielawski and Rufin VanRullen
Understanding Guided Image Captioning Performance across Domains (#146)	Edwin G. Ng, Bo Pang, Piyush Sharma and Radu Soricut
Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction (#108)	Shauli Ravfogel, Grusha Prasad, Tal Linzen and Yoav Goldberg
Who’s on First?: Probing the Learning and Representation Capabilities of Language Models on Deterministic Closed Domains (#150)	David Demeter and Doug Downey
Data Augmentation of Incorporating Real Error Patterns and Linguistic Knowledge for Grammatical Error Correction (#79)	Xia Li and Junyi He
Agree to Disagree: Analysis of Inter-Annotator Disagreements in Human Evaluation of Machine Translation Output (#105)	Maja Popović
A Multilingual Benchmark for Probing Negation-Awareness with Minimal Pairs (#149)	Mareike Hartmann, Miryam de Lhoneux, Daniel Hershcovich, Yova Kementchedjhieva, Lukas Nielsen, Chen Qiu and Anders Søgaard
Explainable Natural Language to Bash Translation using Abstract Syntax Tree (#168)	Shikhar Bharadwaj and Shirish Shevade
Learned Construction Grammars Converge Across Registers Given Increased Exposure (#8)	Jonathan Dunn and Harish Tayyar Madabushi
Tokenization Repair in the Presence of Spelling Errors (#29)	Hannah Bast, Matthias Hertel and Mostafa M. Mohamed
A Coarse-to-Fine Labeling Framework for Joint Word Segmentation, POS Tagging, and Constituent Parsing (#68)	Yang Hou, Houquan Zhou, Zhenghua Li, Yu Zhang, Min Zhang, Zhefeng Wang, Baoxing Huai and Nicholas Jing Yuan
Understanding the Extent to which Content Quality Metrics Measure the Information Quality of Summaries (#152)	Daniel Deutsch and Dan Roth

Day 2 (November 11)

9:00-10:00: In-person Poster Session (Room: Bavaro 4)

Tokenization Repair in the Presence of Spelling Errors	Hannah Bast, Matthias Hertel and Mostafa M. Mohamed
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models	Qianchu Liu, Fangyu Liu, Nigel Collier, Anna Korhonen and Ivan Vulić
Modeling the Interaction Between Perception-Based and Production-Based Learning in Children's Early Acquisition of Semantic Knowledge	Mitja Nikolaus and Abdellah Fourtassi
Polar Embedding	Ran Iwamoto, Ryosuke Kohita and Akifumi Wachi
Data Augmentation of Incorporating Real Error Patterns and Linguistic Knowledge for Grammatical Error Correction	Xia Li and Junyi He
Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction	Shauli Ravfogel, Grusha Prasad, Tal Linzen and Yoav Goldberg
Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning	Christos Theodoropoulos, James Henderson, Andrei Catalin Coman and Marie-Francine Moens
Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline	Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal and Ido Dagan
Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color	Mostafa Abdou, Artur Kulmizev, Daniel Hershcovich, Stella Frank, Ellie Pavlick and Anders Søgaard
Grammatical Profiling for Semantic Change Detection	Andrey Kutuzov, Lidia Pivovarova and Mario Giulianelli
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering	Ekta Sood, Fabian Kögel, Florian Strohm, Prajit Dhar and Andreas Bulling

10:00-11:40: Oral Session #3 [Lexical, compositional, and discourse semantics; Pragmatics] (Santo Domingo 1-2, Zoom)

10:00-10:20	Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline	Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal and Ido Dagan
10:20-10:40	Exploring Metaphoric Paraphrase Generation	Kevin Stowe, Nils Beck and Iryna Gurevych
10:40-11:00	Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning	Christos Theodoropoulos, James Henderson, Andrei Catalin Coman and Marie-Francine Moens
11:00-11:20	NOPE: A Corpus of Naturally-Occurring Presuppositions in English	Alicia Parrish, Sebastian Schuster, Alex Warstadt, Omar S. Agha, Soo-Hwan Lee, Zhuoye Zhao, Samuel R. Bowman and Tal Linzen
11:20-11:40	Pragmatic competence of pre-trained language models through the lens of discourse connectives	Lalchand Pandia, Yan Cong and Allyson Ettinger

11:40-13:20: Poster Session #2 (gathertown)

Predicting Text Readability from Scrolling Interactions (#75)	Sian Gooding, Yevgeni Berzak, Tony Mak and Matt Sharifi
Modeling the Interaction Between Perception-Based and Production-Based Learning in Children's Early Acquisition of Semantic Knowledge (#76)	Mitja Nikolaus and Abdellah Fourtassi
Scaffolded input promotes atomic organization in the recurrent neural network language model (#122)	Philip A. Huebner and Jon A. Willits
Grammatical Profiling for Semantic Change Detection (#137)	Andrey Kutuzov, Lidia Pivovarova and Mario Giulianelli
Deconstructing syntactic generalizations with minimalist grammars (#156)	Marina Ermolaeva
Relation-aware Bidirectional Path Reasoning for Commonsense Question Answering (#44)	Junxing Wang, Xinyi Li, Zhen Tan, Xiang Zhao and Weidong Xiao
Does referent predictability affect the choice of referential form? A computational approach using masked coreference resolution (#73)	Laura Aina, Xixian Liao, Gemma Boleda and Matthijs Westera
Polar Embedding (#78)	Ran Iwamoto, Ryosuke Kohita and Akifumi Wachi
Commonsense Knowledge in Word Associations and ConceptNet (#154)	Chunhua Liu, Trevor Cohn and Lea Frermann
Cross-document Event Identity via Dense Annotation (#164)	Adithya Pratapa, Zhengzhong Liu, Kimihiro Hasegawa, Linwei Li, Yukari Yamakawa, Shikun Zhang and Teruko Mitamura
Tackling Zero Pronoun Resolution and Non-Zero Coreference Resolution Jointly (#166)	Shisong Chen, Binbin Gu, Jianfeng Qu, Zhixu Li, An Liu, Lei Zhao and Zhigang Chen
Negation-Instance Based Evaluation of End-to-End Negation Resolution (#173)	Elizaveta Sineva, Stefan Grünewald, Annemarie Friedrich and Jonas Kuhn
Controlling Prosody in End-to-End TTS: A Case Study on Contrastive Focus Generation (#103)	Siddique Latif, Inyoung Kim, Ioan Calapodescu and Laurent Besacier
A Large-scale Comprehensive Abusiveness Detection Dataset with Multifaceted Labels from Reddit (#177)	Hoyun Song, Soo Hyun Ryu, Huije Lee and Jong Park
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models (#53)	Qianchu Liu, Fangyu Liu, Nigel Collier, Anna Korhonen and Ivan Vulić
A Data Bootstrapping Recipe for Low-Resource Multilingual Relation Classification (#174)	Arijit Nag, Bidisha Samanta, Animesh Mukherjee, Niloy Ganguly and Soumen Chakrabarti
FAST: A carefully sampled and cognitively motivated dataset for distributional semantic evaluation (#56)	Stefan Evert and Gabriella Lapesa
Automatic Error Type Annotation for Arabic (#212)	Riadh Belkebir and Nizar Habash

13:20-14:20: Lunch break (gathertown)

14:20-15:30: Keynote Talk: What are we learning from language? / Gary Lupyan (Santo Domingo 1-2, Zoom)

15:30-15:50: Break (gathertown)

15:50-16:50: Oral Session #4 [Language evolution, acquisition and linguistic theories] (Santo Domingo 1-2, Zoom)

15:50-16:10	The Emergence of the Shape Bias Results from Communicative Efficiency	Eva Portelance, Michael C. Frank, Dan Jurafsky, Alessandro Sordoni and Romain Laroche
16:10-16:30	BabyBERTa: Learning More Grammar With Small-Scale Child-Directed Language	Philip A. Huebner, Elior Sulem, Cynthia Fisher and Dan Roth
16:30-16:50	Analysing Human Strategies of Information Transmission as a Function of Discourse Context	Mario Giulianelli and Raquel Fernández

16:50-17:10: Break (gathertown)

17:10-17:50: Oral Session #5 [Speech and phonology] (Santo Domingo 1-2, Zoom)

17:10-17:30	Predicting non-native speech perception using the Perceptual Assimilation Model and state-of-the-art acoustic models	Juliette Millet, Ioana Chitoran and Ewan Dunbar
17:30-17:50	The Influence of Regional Pronunciation Variation on Children's Spelling and the Potential Benefits of Accent Adapted Spellcheckers	Emma O'Neill, Joe Kenny, Anthony Ventresque and Julie Carson-Berndsen

17:50-18:20: Best Paper Award and Closing Remarks (Santo Domingo 1-2, Zoom)

Keynote Talks

On Day 1

Speaker:
Jennifer Culbertson (The University of Edinburgh, UK)

Title:
Linking learning to language typology

Abstract:
One of the most controversial hypotheses in linguistics is that individual-level biases in learning shape language typology at the population level. While this hypothesis has been around a long time, it has often been supported by less than robust empirical evidence. In this talk, I present a number of studies aimed at providing new sources of evidence linking learning to key features of language. In the first part of the talk, I focus on a classic set of "language universals" which describe common word order patterns. One such pattern is word order harmony, the tendency for syntactic heads and dependents to align across phrases within a language. While harmony has long been claimed to have some special cognitive status, there is also compelling evidence that it may be driven by cognition-external processes of language change. I show that harmony is in fact favoured during learning, influencing how adults and children make inferences under noisy learning conditions, and how they extrapolate to new constructions. I then turn to a more complex pattern of word order which has been proposed to derive from constraints on syntactic representations. I report experimental and quantitative corpus-based evidence to suggest an alternative explanation of this pattern, but one nevertheless driven by learning. In the second part of the talk, I discuss the role of learning in shaping morphosyntactic patterns like grammatical gender. I argue that the different biases of children and adults during learning work together to constrain how such patterns emerge and change over time. Finally, I discuss the implications of this work for linguistic theories and models of language learning.

On Day 2

Speaker:
Gary Lupyan (University of Wisconsin–Madison, USA)

Title:
What are we learning from language?

Abstract:
Where does semantic knowledge come from? Previous work on semantic knowledge within cognitive science has focused on studying knowledge acquired from direct experience with the world and through inference. But recent advances in natural language processing combined with greater availability of large text corpora have revealed that languages encode far more semantic information than previously suspected. In some cases, knowledge that was thought to require direct perceptual experience or inferential reasoning can be derived entirely from language itself.
I will present some recent investigations of this idea showing, for example, that embedded within the distributional structure of language is substantial information about visual appearance that people can rely on to learn about what things look like. I will also discuss how distributional semantics are informing our understanding of cross-linguistic differences in word meanings, and the relationship between language and thought. I will end by speculating that the robust availability of linguistic information may conceal radical diversity in human cognition.

Call for Papers

SIGNLL invites submissions to the 25th Conference on Computational Natural Language Learning (CoNLL 2021). The focus of CoNLL is on theoretically, cognitively and scientifically motivated approaches to computational linguistics, rather than on work driven by particular engineering applications. Such approaches include:

Computational learning theory and other techniques for theoretical analysis of machine learning models for NLP
Models of first, second and bilingual language acquisition by humans
Models of language evolution and change
Computational simulation and analysis of findings from psycholinguistic and neurolinguistic experiments
Analysis and interpretation of NLP models, using methods inspired by cognitive science or linguistics or other methods
Data resources, techniques and tools for scientifically-oriented research in computational linguistics
Connections between computational models and formal languages or linguistic theories
Linguistic typology, translation, and other multilingual work
Theoretically, cognitively and scientifically motivated approaches to text generation

We welcome work targeting any aspect of language, including:

Speech and phonology
Syntax
Lexical, compositional and discourse semantics
Dialogue and interactive language use
Sociolinguistics
Multimodal and grounded language learning

Submissions need not fall within this list. However, a submission's relevance to the conference's focus on theoretically, cognitively and scientifically motivated approaches will play an important role in the review process.

Submitted papers must be anonymous and use the EMNLP 2021 template. Submitted papers may consist of up to 8 pages of content plus unlimited space for references. Authors of accepted papers will have an additional page to address reviewers’ comments in the camera-ready version (9 pages of content in total, excluding references). Optional anonymized supplementary materials and a PDF appendix are allowed, according to the EMNLP 2021 guidelines. Please refer to the EMNLP 2021 Call for Papers for more details on the submission format. Submission is electronic, through the Softconf START conference management system, at this link: https://www.softconf.com/emnlp2021/CoNLL/.

CoNLL adheres to the ACL anonymity policy, as described in the EMNLP 2021 Call for Papers. Briefly, non-anonymized manuscripts submitted to CoNLL cannot be posted to preprint websites such as arXiv or advertised on social media after May 14.

Multiple submission policy

CoNLL 2021 will not accept papers that are currently under submission, or that will be submitted to other meetings or publications, including EMNLP. Papers submitted elsewhere as well as papers that overlap significantly in content or results with papers that will be (or have been) published elsewhere will be rejected. Authors submitting more than one paper to CoNLL 2021 must ensure that the submissions do not overlap significantly (>25%) with each other in content or results.

Important Dates

May 14, 2021: Anonymity period begins
June 14, 2021: Submission deadline
August 31, 2021: Notification of acceptance
September 13, 2021: Camera ready papers due
November 10-11, 2021: Conference

All deadlines are at 11:59pm UTC-12h ("anywhere on earth").
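To make the "anywhere on earth" convention concrete, here is a minimal sketch that expresses a deadline in UTC, assuming only that AoE corresponds to a fixed UTC-12 offset as stated above. The helper name `aoe_deadline_utc` is hypothetical and only for illustration.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on earth" (AoE) is the fixed offset UTC-12.
AOE = timezone(timedelta(hours=-12))


def aoe_deadline_utc(year, month, day):
    """Return 11:59pm AoE on the given date, expressed in UTC."""
    deadline = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return deadline.astimezone(timezone.utc)


# Submission deadline: June 14, 2021, 11:59pm AoE.
submission = aoe_deadline_utc(2021, 6, 14)
print(submission.isoformat())  # 2021-06-15T11:59:00+00:00
```

In other words, a deadline dated June 14 AoE falls at 11:59 UTC on June 15.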

Areas and ACs

Simulation and analysis of findings from psycholinguistic and neurolinguistic experiments; language evolution, acquisition, and linguistic theories: Micha Elsner, Allyson Ettinger
Interaction and grounded language learning: Dipendra Misra, Samira Shaikh
Resources and tools for scientifically motivated research: Andrew Caines, Roi Reichart
Multilingual work and translation: Maja Popović, Rui Wang
Syntax, Morphology and Generation: Carlos Gómez, Rob van der Groot, Ryan Cotterell
Theoretical analysis and interpretation of ML models for NLP: Dieuwke Hupkes, Xin Eric Wang
Lexical, compositional, and discourse semantics: Michael Roth, Gabriel Stanovsky, Adina Williams
Speech, phonology, computational social science: Tanmoy Chakraborty, Kyle Gorman

Sponsorships

We are grateful to Google for its generous support of the conference.