The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems. We used machine translation to convert a subset of the GLUE dataset into Kinyarwanda. In order to evaluate the machine translation noise in this new dataset, we need volunteers to assess the quality of the translated text.
In each screen, you will be shown an original GLUE text in English and the corresponding Kinyarwanda translation. The task is to score the translation on a scale of 1–4 as follows:
1: Invalid or meaningless translation
2: Invalid, but not totally wrong
3: Almost valid, but not totally correct
4: Valid and correct translation
Only your answers will be collected, and the Internet data cost should be minimal. We very much appreciate your help in this effort. If you have any questions, concerns, or suggestions, or would like to collaborate further on the project, please contact us by email: [email protected].