Drs. Dutta, Passonneau and Waltz win a grant from the National Endowment for the Humanities to use crowd-sourcing technologies for tagging and learning from historic newspaper articles of the New York Public Library.

Computers may have defeated humans in chess and arithmetic, but there are many areas where the human mind still excels, such as visual cognition and language processing (Comm. of ACM, Vol 52, No 3, March ’09). If one mind is good, it has been argued that several minds are likely to outperform individuals, and even experts, in certain tasks. This project aims to leverage "the wisdom of the crowds" (von Ahn, 2008) to collaboratively tag historical newspaper articles in the holdings of the New York Public Library (NYPL). Patrons and scholars will be encouraged to generate custom tags for articles they read and use often; these tags will be integrated into a metadata library and evaluated for their contribution to improving document retrieval. Novel machine learning algorithms will be designed for automatic categorization of newspaper articles. The creation and analysis of this corpus will enable advanced search mechanisms on these holdings, making them more useful to the general public.


The project has been selected as a "We the People" project at NEH, an initiative to encourage and strengthen the teaching, study and understanding of American history and culture.