MQ Names Corpus

The MQ Names Corpus contains over 13,000 names from five cultural groups, as outlined in Table 1 of this paper.

The names are annotated for gender and ethnicity.

The groups include names of Arabic, German, Iranian and Japanese origin, all in romanized form. A final set of the most common given names, sourced from 1990 US Census data, is also included.

Download The Data

You can contact me to obtain a copy of the name dataset.