Crowdsourcing Image Extraction and Annotation: Software Development and Case Study
Keyword
Crowdsourcing
Image extraction and annotation
Crowdsourcing software
Time Magazine
Data verification methodology
Amazon Mechanical Turk (AMT)
Journal title
Digital Humanities Quarterly
Date Published
2020-03
URI
http://www.digitalhumanities.org/dhq/vol/14/2/000469/000469.html; http://hdl.handle.net/20.500.12648/7156
Abstract
We describe the development of web-based software that facilitates large-scale, crowdsourced image extraction and annotation within image-heavy corpora that are of interest to the digital humanities. An application of this software is then detailed and evaluated through a case study in which it was deployed within Amazon Mechanical Turk to extract and annotate faces from the archives of Time magazine. Annotation labels included categories such as age, gender, and race, which were subsequently used to train machine learning models. The systemization of our crowdsourced data collection and worker quality verification procedures is detailed within this case study. We outline a data verification methodology that used validation images and required only two annotations per image to produce high-fidelity data, with results comparable to those of methods using five annotations per image. Finally, we provide instructions for customizing our software to meet the needs of other studies, with the goal of offering this resource to researchers undertaking the analysis of objects within other image-heavy archives.
Citation
Jofre, Ana, Vincent Berardi, Kathleen P.J. Brennan, Aisha Cornejo, Carl Bennett, and John Harlan. 2020. “Crowdsourcing Image Extraction and Annotation: Software Development and Case Study.” Digital Humanities Quarterly 14 (2). http://www.digitalhumanities.org/dhq/vol/14/2/000469/000469.html.
Collections