NamePrism: a name-based nationality classifier

NamePrism is a non-commercial nationality/ethnicity classification tool that aims to support academic research, such as sociology and demographic studies. In this project, we learn name embeddings for name parts (first/last names) and classify names into 39 leaf nationalities and 6 U.S. ethnicities.

Contact

Nationality Taxonomy

NamePrism is trained on a set of 74M labeled names from 118 countries. These countries are grouped into a nationality taxonomy with 39 leaves. The details are shown in the following treemap.

Ethnicity Classes

Our ethnicity classifier considers six ethnicities/races: White, Black, API (Asian and Pacific Islander), AIAN (American Indian and Alaska Native), 2PRACE (two or more races), and Hispanic.
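
As a minimal sketch of how these six class codes might be handled in downstream analysis, the Python snippet below maps each code to its description and picks the highest-scoring class from a score table. The `scores` dictionary, its values, and the `top_class` helper are hypothetical and for illustration only; they are not NamePrism's documented output format or API.

```python
# Class codes as listed above; the descriptions paraphrase the text.
ETHNICITY_CLASSES = {
    "White": "White",
    "Black": "Black",
    "API": "Asian and Pacific Islander",
    "AIAN": "American Indian and Alaska Native",
    "2PRACE": "Two or more races",
    "Hispanic": "Hispanic",
}


def top_class(scores):
    """Return the (code, score) pair with the highest score.

    `scores` is a hypothetical dict mapping class codes to probabilities;
    its shape is an assumption, not NamePrism's documented output.
    """
    unknown = set(scores) - set(ETHNICITY_CLASSES)
    if unknown:
        raise ValueError(f"unknown class codes: {sorted(unknown)}")
    code = max(scores, key=scores.get)
    return code, scores[code]


# Made-up scores, for illustration only.
example = {
    "White": 0.10,
    "Black": 0.05,
    "API": 0.70,
    "AIAN": 0.02,
    "2PRACE": 0.03,
    "Hispanic": 0.10,
}
code, score = top_class(example)
print(code, ETHNICITY_CLASSES[code], score)
```

Validating the codes against the fixed list catches a misspelled or unexpected label before it silently skews an aggregate count.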

Citation

NamePrism is a free nationality/ethnicity classification API that achieves the best performance when compared to existing free online systems. Please cite the following publication if you use NamePrism in your work.

Nationality Classification using Name Embeddings.
Junting Ye, Shuchu Han, Yifan Hu, Baris Coskun, Meizhu Liu, Hong Qin and Steven Skiena.
CIKM, Singapore, Nov. 2017.

Credits

The hierarchical pie chart visualization in NamePrism results is inspired by this, under Apache License v2.