Improving Humanitarian Needs Assessments Through Natural Language Processing

Challenge presented by International Rescue Committee

Background

The International Rescue Committee (IRC) presented its challenge on the lack of qualitative methods in humanitarian surveys at the 2018 Emergency Data Science Workshop. IRC, represented by Bobi Morris and Elena Chopyak, asked workshop attendees whether Natural Language Processing (NLP) could be used to automate the transcription, translation, and analysis of open-ended responses gathered through various humanitarian data collection methods. During the workshop, a multi-disciplinary team of designers, NLP engineers, assessment specialists, and information management specialists worked to narrow down the problem and define the technical challenges to be overcome. The working group comprised specialists from ACAPS, Elrha, IRC, OCHA, Pivotal, Purple Compass, World Food Programme (WFP), and York University.

Progress

  • December 4-5, 2018: DIGHR hosted the first International Health Emergency Data Science Workshop. The International Rescue Committee (IRC) presented one of five challenges at the event: because of a lack of human and financial resources, it is challenging to integrate qualitative methods into humanitarian surveys and feedback collection methods. During an all-day design and feasibility session, our workshop explored if and how NLP can be used to automate transcription, translation, and analysis of open-ended responses. The working group comprised specialists from six countries, representing ACAPS, Elrha, Harvard Humanitarian Initiative (HHI), IRC, NetHope, OCHA, Pivotal, Purple Compass, World Food Programme (WFP), and York University.
  • December 6, 2018: DIGHR convened a smaller working group for an all-day discussion to better define the challenges faced by multiple organizations, prioritize next steps, and discuss piloting options.
  • January 2019: WFP and DIGHR began working to identify a pilot project site, which would require building a transcription and translation model from scratch during an ongoing complex emergency. The organizations submitted several funding applications and began establishing a formal partnership that would allow future data sharing.
  • February 2019: Several York researchers, together with multiple partners, began describing the challenge and the proposed solution in a formal article, which will be published in a peer-reviewed journal in early 2020.
  • March 2019: Project lead Tino Kreutzer presented project plans and early results on the ethical challenges at the International Studies Association and Ethics and Humanitarian Research conferences.
  • April 2019: The DIGHR team began working more closely with Translators Without Borders, whose Gamayun project aims to create NLP models for machine translation in low-resource languages.
  • June 2019: DIGHR and HHI are beginning scoping and design work to improve the collection of audio interview responses and to transcribe and translate them using cloud-based NLP methods.

Learn More

Full Challenge Statement
Challenge Presentation
Working Group Solution Presentation