Publications, Mark Johnson, Dept of Computing, Macquarie University
(Last updated 12th August 2020)
Links to my entries on Google Scholar and the ACL Anthology network.
Here is the
ACL Anthology network's ranked list of computational linguistics authors.
-
Paria Jamshid Lou and Mark Johnson
Improving Disfluency Detection by Self-Training a Self-Attentive Model.
In
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020),
pages 3754-3763.
(bib).
-
Paria Jamshid Lou, Yufei Wang and Mark Johnson.
Neural Constituency Parsing of Speech Transcripts,
in
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies,
pages 2756-2765.
(bib).
-
Harsh Agrawal, Karan Desai, Yufei Wang, Xinlei Chen, Rishabh Jain, Mark Johnson, Dhruv
Batra, Devi Parikh, Stefan Lee, and Peter Anderson.
nocaps: novel object captioning at scale.
In The IEEE International Conference on Computer Vision (ICCV 2019),
pages 8948-8957, October 2019.
(supplemental material)
(bib)
-
Mohammad Javad Hosseini, Shay B. Cohen, Mark Johnson and Mark Steedman (2019)
Duality of Link Prediction and Entailment Graph Induction,
in
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics,
pages 4736-4746.
(bib).
-
Yufei Wang, Mark Johnson, Stephen Wan, Yifang Sun and Wei Wang (2019)
How to Best Use Syntax in Semantic Role Labelling,
in
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics,
pages 5338-5343.
(bib)
-
Long Duong, Vu Cong Duy Hoang, Tuyen Quang Pham, Yu-Heng Hong, Vladislavs Dovgalecs, Guy Bashkansky, Jason Black, Andrew Bleeker, Serge Le Huitouze and Mark Johnson (2019)
An adaptable task-oriented dialog system for stand-alone embedded devices,
in
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations,
pages 49-57.
(bib)
-
Mathieu Bernard, Roland Thiolliere, Amanda Saksida, Georgia R. Loukatou, Elin Larsen, Mark Johnson, Laia Fibla, Emmanuel Dupoux, Robert Daland, Xuan Nga Cao and Alejandrina Cristia (2019)
WordSeg: Standardizing unsupervised word form segmentation from text,
in
Behavior Research Methods, 4:1-15.
(Springer link),
(bib).
- Mohammad Javad Hosseini, Nathanael Chambers, Siva Reddy, Xavier R. Holt, Shay B. Cohen, Mark Johnson and Mark Steedman (2018)
Learning Typed Entailment Graphs with Global Soft Constraints, in
Transactions of the Association for Computational Linguistics,
6:703-717.
(bib).
-
Peter Anderson, Stephen Gould and Mark Johnson (2018)
Partially-Supervised Image Captioning,
in
Advances in Neural Information Processing Systems 31,
pages 1877-1888.
(bib)
-
Paria Jamshid Lou, Peter Anderson and Mark Johnson (2018)
Disfluency Detection using Auto-Correlational Neural Networks,
in
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing,
pages 4610-4619.
(bib)
-
Jonas Groschwitz, Matthias Lindemann, Meaghan Fowlie, Mark Johnson and Alexander Koller (2018)
AMR dependency parsing with a typed semantic algebra,
in
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics,
pages 1831-1841.
(bib)
-
Long Duong, Hadi Afshar, Dominique Estival, Glen Pink, Philip Cohen and Mark Johnson (2018)
Active learning for deep semantic parsing,
in
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics,
pages 43-48.
(bib)
-
Mark Johnson, Peter Anderson, Mark Dras and Mark Steedman (2018)
Predicting accuracy on large datasets from smaller pilot data,
in
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics,
pages 450-455.
(bib)
-
Peter Anderson, Xiaodong He, Chris Buehler, Damien Teney, Mark Johnson, Stephen Gould and Lei Zhang (2018)
Bottom-up and top-down attention for image captioning and visual question answering, in
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 3, no. 5, p. 6.
(bib)
-
Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko Sünderhauf, Ian Reid, Stephen Gould and Anton van den Hengel (2018)
Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments, in
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2.
(bib)
-
Thanh Vu, Dat Quoc Nguyen, Dai Quoc Nguyen, Mark Dras, Mark Johnson (2018)
VnCoreNLP: A Vietnamese Natural Language Processing Toolkit,
CoRR abs/1801.01331
(bib)
-
Mark Johnson (2017)
Marr's levels and the Minimalist Program,
in
Psychonomic Bulletin and Review, 24.1, pages 171-174.
(bib).
(preprint).
-
Peter Anderson, Basura Fernando, Mark Johnson and Stephen Gould (2017)
Guided Open Vocabulary Image Captioning with Constrained Beam Search,
in
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing,
pages 936-945.
(bib)
-
Paria Jamshid Lou and Mark Johnson (2017)
Disfluency Detection using a Noisy Channel Model and a Deep Neural Language Model,
in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics,
pages 547-553
(bib).
-
Shervin Malmasi, Mark Dras, Mark Johnson, Lan Du, and Magdalena Wolska
(2017)
Unsupervised Text Segmentation Based on Native Language Characteristics,
in
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics,
pages 1457-1469
(bib).
-
Kairit Sirts, Olivier Piguet and Mark Johnson (2017)
Idea density for predicting Alzheimer's disease from transcribed speech,
in
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017),
pages 322-332
(bib).
-
Jonas Groschwitz, Meaghan Fowlie, Mark Johnson and Alexander Koller (2017)
A constrained graph algebra for semantic parsing with AMRs,
in
IWCS 2017 - 12th International Conference on Computational Semantics - Long papers,
(bib).
-
Long Duong, Hadi Afshar, Dominique Estival, Glen Pink, Philip Cohen and Mark Johnson (2017)
Multilingual Semantic Parsing And Code-Switching,
in
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017),
pages 379-389
(bib).
-
Dat Quoc Nguyen, Mark Dras and Mark Johnson (2017)
A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing, in
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 134-142.
(bib)
-
Thanh Vu, Dat Quoc Nguyen, Mark Johnson, Dawei Song and Alistair Willis (2017)
Search Personalization with Embeddings,
in
Advances in Information Retrieval,
39th European Conference on IR Research, ECIR 2017,
pages 598-604,
(bib).
-
Zhuang Li, Lizhen Qu, Qiongkai Xu and Mark Johnson (2016)
Unsupervised Pre-training With Seq2Seq Reconstruction Loss for Deep Relation Extraction Models, in
Proceedings of the Australasian Language Technology Association Workshop 2016,
pages 65-73,
(bib).
-
Dat Quoc Nguyen, Mark Dras and Mark Johnson (2016)
An empirical study for Vietnamese dependency parsing, in
Proceedings of the Australasian Language Technology Association Workshop 2016,
pages 143-149,
(bib).
-
John K Pate and Mark Johnson (2016)
Grammar induction from (lots of) words alone,
in
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics,
pages 23-32,
(bib).
-
Dat Quoc Nguyen, Kairit Sirts, Lizhen Qu and Mark Johnson (2016)
Neighborhood Mixture Model for Knowledge Base Completion,
in
Conference on Computational Natural Language Learning,
pages 40-50,
(bib).
-
Hiroshi Noji, Yusuke Miyao and Mark Johnson (2016)
Using Left-corner Parsing to Encode Universal Structural Constraints in Grammar Induction,
in
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing,
pages 33-43,
(bib).
-
Jonas Groschwitz, Alexander Koller and Mark Johnson (2016)
Efficient techniques for parsing with tree automata,
in
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics,
pages 2042-2051,
(bib).
-
Didi Surian, Dat Quoc Nguyen, Georgina Kennedy, Mark Johnson, Enrico Coiera and Adam G Dunn (2016)
Characterizing Twitter Discussions About HPV Vaccines Using Topic Modeling and Community Detection,
in
Journal of Medical Internet Research,
18(8): e232, doi: 10.2196/jmir.6045,
(bib).
-
Peter Anderson, Basura Fernando, Mark Johnson and Stephen Gould (2016)
SPICE: Semantic Propositional Image Caption Evaluation,
in
European Conference on Computer Vision, 2016,
pages 382-398,
(bib)
(preprint)
-
Dat Quoc Nguyen, Kairit Sirts, Lizhen Qu and Mark Johnson (2016)
STransE: a novel embedding model of entities and relationships in knowledge bases,
in
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies,
pages 460-466,
(bib).
-
Lan Du, Anish Kumar, Mark Johnson and Massimiliano Ciaramita (2015)
Using Entity Information from a Knowledge Base to Improve Relation Extraction,
in
Proceedings of the Australasian Language Technology Association Workshop 2015,
pages 31-38,
(bib).
-
Kairit Sirts and Mark Johnson (2015)
Do POS Tags Help to Learn Better Morphological Segmentations?,
in
Proceedings of the Australasian Language Technology Association Workshop 2015,
pages 91-100,
(bib).
-
Fiona Martin and Mark Johnson (2015)
More Efficient Topic Modelling Through a Noun Only Approach,
in
Proceedings of the Australasian Language Technology Association Workshop 2015,
pages 111-115,
(bib).
-
Matt Honnibal and Mark Johnson (2015)
An Improved Non-monotonic Transition System for Dependency Parsing,
in
Conference on Empirical Methods in Natural Language Processing (EMNLP),
pages 1373-1378,
(bib).
-
Dat Quoc Nguyen, Kairit Sirts and Mark Johnson (2015)
Improving Topic Coherence with Latent Feature Word Representations in MAP Estimation for Topic Modeling,
in
Proceedings of the Australasian Language Technology Association Workshop 2015,
pages 116-121,
(bib).
-
Zhendong Zhao, Lan Du, Benjamin Börschinger, John K Pate, Massimiliano Ciaramita, Mark Steedman and Mark Johnson (2015)
A Computationally Efficient Algorithm for Learning Topical Collocation Models,
in
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing,
pages 1460-1469.
(bib).
-
Dat Quoc Nguyen, Richard Billingsley, Lan Du and Mark Johnson (2015)
Improving Topic Models with Latent Feature Word Representations,
in
Transactions of the Association for Computational Linguistics, (3), pages 299-313.
(bib).
-
Bharat Ram Ambati, Tejaswini Deoskar, Mark Johnson and Mark Steedman (2015)
An Incremental Algorithm for Transition-based CCG Parsing,
in
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies,
pages 53-63.
(bib).
-
Mark Johnson, Joe Pater, Robert Staubs and Emmanuel Dupoux (2015)
Sign constraints on feature weights improve a joint model of word segmentation and phonology,
in
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies,
pages 303-313.
(bib).
-
Lan Du, John Pate and Mark Johnson (2015),
Topic Segmentation with an Ordering-Based Topic Model,
in the
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence,
pages 2232-2238.
(bib)
- Gabriel Synnaeve, Isabelle Dautriche, Benjamin Börschinger, Mark Johnson, and Emmanuel Dupoux (2014)
Unsupervised Word Segmentation in Context,
in
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers,
pages 2326-2334.
(bib)
- Lan Du, John Pate and Mark Johnson (2014)
Topic Models with Topic Ordering Regularities for
Topic Segmentation,
in
IEEE International Conference on Data Mining, pages 803-808,
Shenzhen, China.
(bib)
-
Pate, John and Mark Johnson (2014)
Syllable weight encodes mostly the same information for English word segmentation as dictionary stress,
in
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP),
pages 844-853,
Doha, Qatar.
(bib)
- Johnson, Mark (2014)
The forest for the trees: Comment on “Toward a computational framework for cognitive biology: Unifying approaches from cognitive neuroscience and comparative cognition” by W.T. Fitch.
To appear in
Physics of Life Reviews.
- Mark Johnson, Anne Christophe, Emmanuel Dupoux and Katherine Demuth (2014)
Modelling function words improves unsupervised word segmentation,
in
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, (Volume 1: Long Papers),
pages 282-292.
(bib)
- Matthew Honnibal and Mark Johnson (2014)
Joint Incremental Disfluency Detection and Dependency Parsing,
in
Transactions of the Association for Computational Linguistics, 2 (2014),
pages 131-142.
(bib)
- Benjamin Börschinger and Mark Johnson (2014)
Exploring the Role of Stress in Bayesian Word Segmentation using Adaptor Grammars,
in
Transactions of the Association for Computational Linguistics, 2 (2014),
pages 93-104.
(bib)
- Bogdan Ludusan, Maarten Versteegh, Aren Jansen, Guillaume Gravier, Xuan-Nga Cao, Mark Johnson and Emmanuel Dupoux (2014)
Bridging the gap between speech technology and natural language processing: an evaluation toolbox for term discovery systems,
in
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 560-567.
(bib)
- Shay B. Cohen and Mark Johnson (2013)
The effect of non-tightness on Bayesian estimation of PCFGs.
In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics,
Volume 1, pages 1033-1041.
(bib)
(This is a revised version that fixes a bug in the definition of recursive nonterminals
in the original version, which is available
here).
-
Benjamin Börschinger, Mark Johnson and Katherine Demuth (2013)
A joint model of word segmentation and phonological variation for English word-final /t/-deletion.
In
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics,
Volume 1, pages 1508-1516.
(bib)
- Bevan Keeley Jones and Sharon Goldwater and Mark Johnson (2013)
Modeling Graph Languages with Grammars Extracted via Tree Decompositions.
In
Proceedings of the 11th International Conference on Finite State Methods and Natural Language Processing,
pages 54-62.
(bib)
-
Abdellah Fourtassi, Benjamin Börschinger, Mark Johnson and Emmanuel Dupoux (2013)
Why is English so easy to segment?
In
Proceedings of the Fourth Annual Workshop on Cognitive Modeling and Computational Linguistics (CMCL),
pages 1-10.
(bib)
-
Matthew Honnibal, Yoav Goldberg and Mark Johnson (2013)
A Non-Monotonic Arc-Eager Transition System for Dependency Parsing.
In
Proceedings of the Seventeenth Conference on Computational Natural Language Learning,
pages 163-172.
(bib)
- Minh-Thang Luong, Michael C. Frank and Mark Johnson (2013)
Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning,
in Transactions of the Association for Computational Linguistics, 1 (2013) 315–326.
(bib)
- Mark Johnson (final draft version of 2013)
Language acquisition as statistical inference, in
Stephen R. Anderson and Jacques Moeschler and Fabienne Reboul, eds.,
The Language-Cognition Interface,
pages 109-134,
Librairie Droz, Geneva.
(bib)
-
Lan Du, Wray Buntine and Mark Johnson (2013)
Topic Segmentation with a Structured Topic Model,
in Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 190-200.
(bib)
-
Aren Jansen, Emmanuel Dupoux, Sharon Goldwater, Mark Johnson, Sanjeev Khudanpur,
Kenneth Church, Naomi Feldman, Hynek Hermansky, Florian Metze, Richard Rose, Mike Seltzer,
Pascal Clark, Ian McGraw, Balakrishnan Varadarajan, Erin Bennett, Benjamin Börschinger,
Justin Chiu, Ewan Dunbar, Abdellah Fourtassi, David Harwath, Chia-ying Lee, Keith Levin,
Atta Norouzian, Vijayaditya Peddinti, Rachael Richardson, Thomas Schatz and Samuel Thomas
(2013)
A summary of the 2012 JHU CLSP workshop on Zero Resource speech technologies and models of early language acquisition,
Proceedings of ICASSP 2013.
(bib)
-
Meylan, S., Kurumada, C., Börschinger, B., Johnson, M., and Frank, M. C. (2012).
Modeling online word segmentation performance in structured artificial languages.
Proceedings of the 34th Annual Meeting of the Cognitive Science Society.
(bib)
-
Benjamin Börschinger, Katherine Demuth and Mark Johnson (2012)
Studying the effect of input size for Bayesian word segmentation on the Providence corpus.
In Proceedings of the 24th International Conference
on Computational Linguistics (Coling 2012), pages 325-340, Mumbai, India. Coling 2012
Organizing Committee.
(bib)
-
Sunghwan Mac Kim, Dominick Ng, Mark Johnson, James Curran (2012)
Improving Combinatory Categorial Grammar parse reranking with dependency grammar features.
In Proceedings of the 24th International Conference on Computational Linguistics (Coling 2012),
pages 1441-1458, Mumbai, India.
Coling 2012 Organizing Committee.
(bib)
-
Mark Johnson, Katherine Demuth and Michael Frank (2012)
Exploiting Social Information in Grounded Language Learning via Grammatical Reduction,
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics,
pages 883-891.
(bib)
-
Benjamin Börschinger and Mark Johnson (2012)
Using Rejuvenation to Improve Particle Filtering for Bayesian Word Segmentation,
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics,
pages 85-89.
(bib)
-
Bevan Jones and Mark Johnson and Sharon Goldwater (2012)
Semantic Parsing with Bayesian Tree Transducers,
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics,
pages 488-496.
(bib)
-
Sze-Meng Jojo Wong, Mark Dras and Mark Johnson (2012)
Exploring Adaptor Grammars for Native Language Identification,
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning,
pages 699-709.
(bib)
-
Mark Johnson (2011)
How relevant is linguistics to computational linguistics?,
Linguistic Issues in Language Technology, Volume 12.
-
Benjamin Börschinger and Mark Johnson (2011)
A Particle Filter algorithm for Bayesian Word Segmentation,
Proceedings of the Australasian Language Technology Association Workshop 2011,
pages 10-18.
(bib)
-
Bevan Jones, Mark Johnson and Sharon Goldwater (2011)
Formalizing Semantic Parsing with Tree Transducers,
Proceedings of the Australasian Language Technology Association Workshop 2011,
pages 19-28.
(bib)
-
Mark Johnson (2011)
Parsing in Parallel on Multiple Cores and GPUs,
Proceedings of the Australasian Language Technology Association Workshop 2011,
pages 29-37.
(bib)
-
Mehdi Parviz, Mark Johnson, Blake Johnson and Jon Brock (2011)
Using Language Models and Latent Semantic Analysis to Characterise the N400m Neural Response,
Proceedings of the Australasian Language Technology Association Workshop 2011,
pages 38-46.
(bib)
-
Sze-Meng Jojo Wong, Mark Dras and Mark Johnson (2011)
Topic Modeling for Native Language Identification,
Proceedings of the Australasian Language Technology Association Workshop 2011,
pages 115-124.
(bib)
-
Sharon Goldwater, Thomas L. Griffiths and Mark Johnson (2011)
Producing Power-Law Distributions and Damping Word Frequencies with
Two-Stage Language Models,
Journal of Machine Learning Research,
volume 12 (July), pages 2335-2382.
-
Benjamin Börschinger, Bevan K. Jones and Mark Johnson (2011)
Reducing Grounded Learning Tasks To Grammatical Inference,
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1416-1425.
(bib)
- Simon Zwarts and Mark Johnson (2011)
The impact of language models and loss functions on repair disfluency
detection,
Proceedings of the 49th Annual Meeting of the Association
for Computational Linguistics, pages 703-711.
(bib)
- Ivan Yuen, Katherine Demuth and Mark Johnson (2011)
Prosodic structure in child speech planning and production,
in ICPhS XVII (Hong Kong), pages 2248-2251.
- Mark Johnson, Katherine Demuth, Michael Frank and Bevan Jones (2010)
Synergies in learning words and their referents,
in
Proceedings of NIPS 2010.
(bib)
- Tahira Naseem, Harr Chen, Regina Barzilay and Mark Johnson (2010)
Using Universal Linguistic Knowledge to Guide Grammar Induction,
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing,
pp. 1234-1244.
(bib)
- Mark Johnson and Katherine Demuth (2010)
Unsupervised phonemic Chinese word segmentation using Adaptor Grammars,
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010),
pp. 528-536.
(bib)
- Simon Zwarts, Mark Johnson and Robert Dale (2010)
Detecting Speech Repairs Incrementally Using a Noisy Channel Approach,
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010),
pp. 1371-1378.
(bib)
- Mark Johnson (2010)
PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names,
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics,
pp. 1148-1157.
(bib)
-
Michael Lamar, Yariv Maron, Mark Johnson and Elie Bienenstock (2010)
SVD and Clustering for Unsupervised POS Tagging,
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics,
pp. 215-219.
(bib)
- David McClosky, Eugene Charniak and Mark Johnson (2010)
Automatic Domain Adaptation for Parsing,
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics,
pp. 28-36.
(bib)
- Bevan K. Jones, Mark Johnson and Michael C. Frank (2010)
Learning Words and Their Meanings from Unsegmented Child-directed Speech,
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics,
pp. 501-509.
(bib)
- Mark Johnson and Ahmet Engin Ural (2010)
Reranking the Berkeley and Brown Parsers,
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics,
pp. 665-668.
(bib)
- Phil Blunsom, Trevor Cohn, Sharon Goldwater and Mark Johnson (2009)
A Note on the Implementation of Hierarchical Dirichlet Processes,
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pp. 337-340.
(bib)
- Sharon Goldwater, Tom Griffiths and Mark Johnson (2009)
A Bayesian Framework for Word Segmentation: Exploring the Effects of Context,
Cognition 112:1, pp. 21-54. You can download a preprint
here.
- William P. Headden III, Mark Johnson and David McClosky (2009)
Improving Unsupervised Dependency Parsing with Richer Contexts and Smoothing,
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics,
pp. 101-109.
(bib)
-
Micha Elsner, Eugene Charniak and Mark Johnson (2009)
Structured Generative Models for Unsupervised Named-Entity Clustering,
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics,
pp. 164-172.
(bib)
- Mark Johnson and Sharon Goldwater (2009)
Improving nonparameteric Bayesian inference: experiments on
unsupervised word segmentation with adaptor grammars,
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics,
pp. 317-325.
(bib)
- Jianfeng Gao and Mark Johnson (2008)
A comparison of Bayesian estimators for unsupervised Hidden Markov Model POS taggers,
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp. 344-352.
(bib)
- David McClosky, Eugene Charniak, and Mark Johnson (2008)
When is Self-Training Effective for Parsing?
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008),
pp. 561-568.
(bib)
- Mark Johnson (2008)
Using Adaptor Grammars to Identify Synergies in the Unsupervised Acquisition of Linguistic Structure,
Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies,
pp. 398-406.
(bib)
- Mark Johnson (2008)
Unsupervised Word Segmentation for Sesotho Using Adaptor Grammars,
Proceedings of the Tenth Meeting of ACL Special Interest Group on Computational Morphology and Phonology.
pp. 20-27.
(bib)
- Kristina Toutanova and Mark Johnson (2007)
A Bayesian LDA-based model for semi-supervised part-of-speech tagging,
to appear in
Proceedings of NIPS 20
(bib)
- Noah Smith and Mark Johnson (2007)
Weighted and Probabilistic Context-Free Grammars Are Equally Expressive,
Computational Linguistics 33:4, pages 477-491.
- Mark Johnson (2007)
Why Doesn't EM Find Good HMM POS-Taggers?
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL),
pages 296-305. Note: There are (at least) three mistakes in the paper:
- There are two mistakes in the formula for Digamma in equation (5).
The correct recurrence is: Ψ(x) = Ψ(x+1) - 1/x.
(Thanks to Kevin Gimpel for pointing this out).
- The sign in front of the x^{-4} term is incorrect in the approximation for g(x) in equation (5).
The correct approximation is log(x) + 0.04167 x^{-2} - 0.00729 x^{-4} + 0.00384 x^{-6} - 0.00413 x^{-8}.
For more details on how this was computed,
see the comments in the C implementation of the Digamma function. (Thanks to Jason Baldridge for pointing this out).
- There's a mistake in the Gibbs sampler formula in Figure 4 on page 302.
The last denominator is missing a term "+ s αy" (Thanks to David Chiang for
pointing this out).
The code for this research was written while I was a Visiting Researcher
at Microsoft Research, so unfortunately I can't release it.
However, you can also download
a C++ implementation of a Gibbs sampler
for estimating PCFGs,
a C implementation of
the Digamma function and
an implementation of the Inside-Outside algorithm that can optionally
perform Variational Bayes.
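A minimal sketch of how the corrected recurrence and approximation noted above might fit together, written in Python and not taken from the C implementation mentioned above. The coefficients are those of the standard asymptotic series for Ψ(x + 1/2), so this sketch evaluates g at x - 0.5. The threshold of 7 for switching from the recurrence to g, and the function name, are assumptions of this sketch.

```python
import math


def digamma(x: float) -> float:
    """Approximate the Digamma function Psi(x) for x > 0 (illustrative sketch)."""
    if x <= 0.0:
        raise ValueError("this sketch only handles x > 0")

    result = 0.0
    # Recurrence: Psi(x) = Psi(x + 1) - 1/x.
    # Shift x upward until the asymptotic approximation is accurate.
    # The threshold 7.0 is an assumption chosen for this sketch.
    while x < 7.0:
        result -= 1.0 / x
        x += 1.0

    # The coefficients below belong to the series for Psi(x + 1/2),
    # so evaluate the approximation g at x - 0.5.
    x -= 0.5
    inv2 = 1.0 / (x * x)   # x^{-2}
    inv4 = inv2 * inv2     # x^{-4}
    result += (math.log(x)
               + 0.04167 * inv2
               - 0.00729 * inv4
               + 0.00384 * inv4 * inv2    # x^{-6}
               - 0.00413 * inv4 * inv4)   # x^{-8}
    return result
```

For example, digamma(1.0) should come out close to -0.5772, the negative of the Euler-Mascheroni constant.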
- Mark Johnson (2007)
Transforming Projective Bilexical Dependency Grammars into efficiently-parsable CFGs with Unfold-Fold,
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics,
pages 168-175.
After the final version of this paper was accepted for publication I learnt of very similar work by
Jason Eisner and John Blatz, which predates this paper and takes a very similar perspective
on this issue.
- Jianfeng Gao, Galen Andrew, Mark Johnson and Kristina Toutanova (2007)
A Comparative Study of Parameter Estimation Methods for Statistical Natural Language Processing,
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics,
pages 824-831.
- Mark Johnson, Thomas L. Griffiths and Sharon Goldwater (2007)
Bayesian Inference for PCFGs via Markov Chain Monte Carlo,
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pages 139-146.
- Mark Johnson, Thomas L. Griffiths and Sharon Goldwater (2007)
Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Models, in
B. Schoelkopf, J. Platt and T. Hoffman, eds.,
Advances in Neural Information Processing Systems 19,
The MIT Press.
- The published version (in the NIPS proceedings) has
a typo in equation (4) that defines adaptor grammars; it's fixed in the version here.
(Thanks to Julia Hockenmaier for pointing this out).
- Sharon Goldwater, Thomas L. Griffiths, and Mark Johnson (2007)
Distributional Cues to Word Boundaries: Context is Important,
Proceedings of the 31st Boston University Conference on Language Development.
- Sharon Goldwater and Thomas L. Griffiths and Mark Johnson (2006)
Contextual Dependencies in Unsupervised Word Segmentation, Proceedings of ACL/COLING 2006.
- Matthew Lease, Eugene Charniak, Mark Johnson, and David McClosky (2006)
A Look At Parsing and Its Applications,
Proceedings of AAAI 2006.
- Matt Lease and Mark Johnson (2006)
Early Deletion of Fillers In Processing Conversational Speech, Proceedings of the North American Conference on Computational Linguistics (NAACL'06)
- David McClosky, Eugene Charniak, and Mark Johnson (2006)
Effective Self-Training for Parsing, Proceedings of the North American Conference on Computational Linguistics (NAACL'06)
- Brian Roark, Mary Harper, Eugene Charniak, Bonnie Dorr, Mark Johnson, Jeremy Kahn, Yang Liu, Mari Ostendorf, John Hale, Anna Krasnyanskaya, Matthew Lease, Izhak Shafran, Matthew Snover, Robin Stewart and Lisa Yung (2006)
SParseval: Evaluation Metrics for Parsing Speech.
In Proceedings of the Language Resources and Evaluation Conference (LREC),
Genoa, Italy.
- Sharon Goldwater, Tom Griffiths and Mark Johnson (2005)
Interpolating between types and tokens by estimating
power-law generators (draft), to appear in NIPS 2005.
- Eugene Charniak and Mark Johnson (2005)
Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking,
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005)
- Sharon Goldwater and Mark Johnson (2005)
Representational Bias in Unsupervised Learning of Syllable Structure,
Proceedings of the Workshop on Psychocomputational Models of Human Language Acquisition, ACL 2005.
- Matthew Lease, Eugene Charniak, and Mark Johnson (2005)
Parsing and its Applications for Conversational Speech.
2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'05).
- M. Johnson, E. Charniak and M. Lease (2004)
An Improved Model for Recognizing Disfluencies in Conversational Speech,
Rich Transcription Fall Workshop.
- Michelle Gregory, Mark Johnson and Eugene Charniak (2004)
Sentence-Internal Prosody Does not Help Parsing the Way Punctuation Does,
Proceedings of the Human Language Technology Conference of the
North American Chapter of the Association for Computational Linguistics:
HLT-NAACL 2004
- Massimiliano Ciaramita and Mark Johnson.
Multi-Component Word Sense Disambiguation.
Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text (Senseval 3/ACL 2004), 97-100.
- Sharon Goldwater and Mark Johnson (2004)
Priors in Bayesian Learning of Phonological Rules.
7th Annual Meeting of the ACL Special Interest Group on Computational Phonology (SIGPHON'04), 35-42.
- Keith Hall and Mark Johnson (2004)
Attention Shifting For Parsing Speech. ACL'04, 40-46
- Mark Johnson and Eugene Charniak (2004)
A Tag-Based Noisy Channel Model of Speech Repairs. ACL'04, 33-39.
- Brian Roark, Murat Saraclar, Michael Collins, and Mark Johnson (2004)
Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm. ACL'04, 47-54.
- M. Johnson (2003)
Learning and parsing stochastic unification-based grammars,
in Schoelkopf and Warmuth, "Learning theory and Kernel Machines",
Springer.
- M. Ciaramita and M. Johnson (2003)
Supersense Tagging of Unknown Nouns in WordNet.
In Proceedings of the Conference on Empirical Methods in
Natural Language Processing (EMNLP 2003).
- M. Ciaramita, T. Hofmann and M. Johnson (2003)
Hierarchical Semantic
Classification: Word Sense Disambiguation with World Knowledge. In
Proceedings of the 18th International Joint Conference on Artificial
Intelligence (IJCAI-03).
- Yasemin Altun, Mark Johnson, Thomas Hofmann (2003)
Loss Functions and Optimization Methods for
Discriminative Learning of Label Sequences,
In Proceedings of the Conference on Empirical Methods in
Natural Language Processing (EMNLP 2003).
- Mark Johnson and Stefan Riezler (2002)
Statistical models of language learning and use,
Cognitive Science 26, pages 239-253.
- Yasemin Altun, Thomas Hofmann and Mark Johnson (2002)
Discriminative Learning for Label
Sequences via Boosting, in Advances in Neural Information Processing Systems
(NIPS*15), 2003.
- Keith Hall and Mark Johnson (2003)
Language modeling using efficient best-first bottom-up parsing
ASRU 2003
- Goldwater, S. and M. Johnson (2003)
``Learning
OT Constraint Rankings Using a Maximum Entropy Model'',
In Proceedings of the Stockholm Workshop on
Variation within Optimality Theory, April 26-27, 2003, Stockholm University,
Sweden. Eds: Jennifer Spenader, Anders Eriksson, and Östen Dahl. pp. 111-120.
(also available in gzipped postscript)
- Geman, S. and M. Johnson (2002) ``Dynamic programming
for parsing and estimation of stochastic unification-based grammars'',
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.
- Johnson, M. (2002) ``A simple pattern-matching
algorithm for recovering empty nodes and their antecedents'',
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.
- Riezler, S., T. King, R. Kaplan, R. Crouch, J. Maxwell and M. Johnson (2002)
``Parsing the Wall Street Journal using a Lexical-Functional
Grammar and Discriminative Estimation Techniques'',
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.
- Johnson, M (2002) ``The DOP estimation method is biased and inconsistent''.
Computational Linguistics 28(1), pages 71-76.
Available in Adobe Acrobat format.
- Donald Engel, Eugene Charniak and Mark Johnson (2002) ``Parsing and Disfluency Placement'',
EMNLP. Available in Adobe Acrobat format.
- Don Blaheta and Mark Johnson (2001) ``Unsupervised learning of multi-word verbs.'' Proceedings of the ACL 2001 Workshop on Collocation. Available in either gzipped postscript or Adobe PDF.
- Eugene Charniak and Mark Johnson. ``Edit Detection and Parsing for Transcribed Speech.'' Proceedings of NAACL 2001. Available in either gzipped postscript or Adobe PDF.
- Mark Johnson. ``Joint and Conditional Estimation of Tagging and Parsing Models.'' Proceedings of ACL 2001. Available as gzipped postscript or Adobe PDF.
- Geman, S. and M. Johnson (2001) Probability and statistics in Computational Linguistics: A brief review.
Available in Adobe Acrobat or
gzipped Postscript formats.
- Geman, S. and M. Johnson (2001) Probabilistic Grammars and their Applications.
Available in Adobe Acrobat or
gzipped Postscript formats.
- Stefan Riezler, Detlef Prescher, Jonas Kuhn and Mark Johnson (2000)
``Lexicalized Stochastic Modeling of Constraint-Based Grammars using Log-Linear Measures and EM'',
in Proceedings of the 38th Annual Meeting of the ACL, 2000.
- M. Johnson and S. Riezler (2000) ``Exploiting auxiliary distributions
in stochastic unification-based grammars'',
Proceedings of the 1st NAACL conference.
Available in Adobe PDF and in
gzipped postscript.
- M. Johnson (2000) ``Stochastic Lexical-Functional Grammar'',
slides from talk presented at the LFG 2000 conference, Berkeley.
Available in Adobe PDF and in
gzipped postscript.
- Mark Johnson and Brian Roark (2000)
``Compact non-left-recursive grammars using the selective left-corner transform and factoring'', in
Proceedings of the 18th International Conference on Computational Linguistics (COLING), 2000, pages 355-361.
- Massimiliano Ciaramita and Mark Johnson (2000)
``Explaining away ambiguity: Learning verb selectional preference with Bayesian networks'', in
Proceedings of the 18th International Conference on Computational Linguistics (COLING), Vol.1, p.187.
- M. Johnson, S. Geman, S. Canon, Z. Chi and S. Riezler (1999)
``Estimators for Stochastic ``Unification-based'' Grammars''
in The Proceedings of the ACL 1999.
Available in Adobe PDF and in
gzipped postscript.
- M. Johnson (1999) ``PCFG models of linguistic tree representations''
Computational Linguistics, available in
Gzipped Postscript format or
Adobe PDF format
- M. Johnson (1999) ``Type-driven semantic interpretation and
Feature dependencies in R-LFG'', in The syntax-semantics interface
in LFG, M. Dalrymple, ed.,
available in Gzipped Postscript format or
Adobe PDF format
- M. Johnson (1999) ``A Resource-sensitive Interpretation of Lexical Functional Grammar'', JoLLI,
available in Gzipped Postscript format or
Adobe PDF format
- M. Johnson (to appear) ``Optimality-theoretic Lexical Functional Grammar''
(this is a commentary on Joan Bresnan's presentation at the 1998 CUNY conference)
available in Gzipped Postscript format or
Adobe PDF format
- M. Johnson (1998) ``Finite State Approximation of Constraint-based
Grammars using Left-corner Grammar Transforms''
in 1998 Proceedings of COLING/ACL,
(scanned ACL repository version,
also available in original gzipped Postscript format and
Adobe PDF format)
(bib)
- M. Johnson (1996) ``Left Corner Transforms and Finite State Approximations'',
manuscript (a longer but less polished version of the COLING/ACL 98 paper
above) available in
Gzipped Postscript format or
Adobe PDF format
- E. Charniak, S. Goldwater and M. Johnson (1998)
``Edge-based Best-first Chart Parsing'', in
1998 Proceedings of the Workshop on Very Large Corpora,
available in Gzipped Postscript format or
Adobe PDF format
- M. Johnson (1998) ``Proof Nets and the Complexity of Processing Center-Embedded Constructions'',
The Journal of Logic, Language and Information. 7(4),
pages 433-447.
Preprint available in Adobe PDF format.
- M. Johnson and M. Kay (1997)
Copies of slides used in ESSLLI 1997 summer school course
``Topics in Parsing and Generation''
in Adobe Acrobat format
or
Gzipped Postscript format.
A gzipped
tar file of the Prolog code used in this class is also available.
- M. Johnson (1997) ``Features as Resources in R-LFG''
Proceedings of the 1997 LFG Conference, CSLI Press.
Available in PDF.
- M. Johnson (1996)
Resource-sensitivity in Lexical-Functional Grammar
The Proceedings of the 1996 Roma Workshop.
- M. Johnson and S. Bayer (1995)
Features and Agreement in Lambek Categorial Grammar
Proceedings of the 1995 Formal Grammar Workshop, pages 123-137.
- Mark Johnson (1995)
Memoization in top-down parsing
Computational Linguistics 21:3, pages 405-417
- S. Bayer and M. Johnson (1995)
Features and Agreement
Proceedings of the 33rd Annual Meeting of the Association for
Computational Linguistics.
- Johnson, M. and J. Doerre (1995)
Memoization of Coroutined Constraints
Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics.
- Johnson, M. (1994)
Logical Embedded Push-Down Automata in Tree-Adjoining Grammar Parsing,
in Computational Intelligence, 10:4, pages 495-505.
- Johnson, M. (1994)
Computing with Features and Formulae.
Computational Linguistics, 20.1.
- Johnson, M. (1994)
Two ways of formalizing grammars,
Linguistics and Philosophy 17, pages 221-248.
- Shieber, S. and M. Johnson (1993)
Variations on Incremental Interpretation,
Journal of Psycholinguistic Research,
22(2), pp. 287-318.
- Johnson, M. (1993)
The Complexity of Inducing a Rule from Data,
J. Mead, ed.,
The Proceedings of The Eleventh West Coast Conference on Formal Linguistics,
Stanford Linguistics Association, CSLI Press.
- Johnson, M. (1991)
Features and Formulae.
Computational Linguistics, 17.2.
-
M. Johnson (1991)
Logic and Feature Structures, in The Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI).
- Steven Abney and Mark Johnson. (1990)
Memory Requirements and Local Ambiguities of Parsing Strategies.
Journal of Psycholinguistic Research 20(3), pages 233-250.
-
M. Johnson (1990)
Expressing Disjunctive and Negative Feature Constraints with
Classical First-Order Logic, in The Proceedings of the
28th Annual Meeting of the Association for Computational Linguistics,
Pittsburgh, Pennsylvania; pages 173-179.
(bib)
-
M. Johnson and M. Kay (1990)
Semantic Abstraction and Anaphora, in
The Proceedings of International Conference on Computational Linguistics
(COLING), Helsinki, Finland; pages 17-27.
-
M. Johnson (1990)
Features, Frames and Quantifier-Free Formulae, in
P. Saint-Dizier and S. Szpakowicz, eds.,
Logic and Logic Grammars for Language Processing.
Horwood; pages 94-107.
- Johnson, M. (1988). Attribute Value Logic and Theory of Grammar.
CSLI Lecture Notes Series, University of Chicago Press.
- Johnson, M. and Klein, E. (1986)
Discourse, Anaphora and Parsing,
in the
Proceedings of the 11th International Conference on Computational Linguistics (COLING),
pages 669-675.
- Johnson, M. (1985) Parsing with Discontinuous Constituents, in the Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics
- Johnson, M. (1984) A Discovery Procedure for Certain Phonological Rules,
in the Proceedings of the 10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics.