Quick answer

Paper2017-09-27•Source ↗•10 attns9,398 checkouts

Claim

Replicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets

Authors

Discuss with Grok

Rotem Dror·

Gili Baumer·

Marina Bogomolov·

Roi Reichart

ABSTRACT

With the ever-growing amounts of textual data from a large variety of languages, domains, and genres, it has become standard to evaluate NLP algorithms on multiple datasets in order to ensure consistent performance across heterogeneous setups. However, such multiple comparisons pose significant challenges to traditional statistical analysis methods in NLP and can lead to erroneous conclusions. In this paper, we propose a Replicability Analysis framework for a statistically sound analysis of multiple comparisons between algorithms for NLP tasks. We discuss the theoretical advantages of this framework over the current, statistically unjustified, practice in the NLP literature, and demonstrate its empirical value across four applications: multi-domain dependency parsing, multilingual POS tagging, cross-domain sentiment classification and word similarity prediction.

#machine-learning #machine-learning 📋 Awesome List: nlp-classic

Review Snapshot

Explore ratings

0.0

★★★★★

0 ratings

5 star

4 star

3 star

2 star

1 star

Recommendation

recommend this content.

Review this content

Share your opinion to help other learners triage faster.

Write a review

Invite a reviewer

Invite someone by email to share an invited review for Replicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets.

Author Inquiries

Public questions about this content. Attendemia will route your question to the author. Vote on the most important ones. No guarantee of response.

Post an inquiry

Sort by: Most helpful