NamePrism: a name-based nationality classifier
NamePrism is a non-commercial nationality/ethnicity classification tool intended to support academic research, e.g. sociology and demographic studies. In this project, we learn name embeddings for name parts (first/last names) and classify names into 39 leaf nationalities and 6 U.S. ethnicities.
- Junting Ye: juyye at cs dot stonybrook dot edu
- Prof. Steven Skiena: skiena at cs dot stonybrook dot edu
- Dr. Yifan Hu: yifanhu at yahoo-inc dot com
NamePrism is trained on a set of 74M labeled names from 118 countries. These countries are organized into a 39-leaf nationality taxonomy. The details are shown in the following treemap.
Six ethnicities/races are considered in our ethnicity classifier: White, Black, API (Asian and Pacific Islander), AIAN (American Indian and Alaska Native), 2PRACE (two or more races) and Hispanic.
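As a minimal sketch of how the six labels above might be handled downstream, the Python snippet below maps each abbreviation to its description and picks the highest-scoring label. The `scores` dictionary, the function name `top_ethnicity`, and the example values are hypothetical and only for illustration; they are not the NamePrism API or its output format.

```python
# Label abbreviations taken from the list above, with their expansions.
ETHNICITY_LABELS = {
    "White": "White",
    "Black": "Black",
    "API": "Asian and Pacific Islander",
    "AIAN": "American Indian and Alaska Native",
    "2PRACE": "Two or more races",
    "Hispanic": "Hispanic",
}


def top_ethnicity(scores):
    """Return (label, description, score) for the highest-scoring label.

    `scores` is a hypothetical dict mapping label -> probability;
    it is an assumption, not the documented NamePrism output.
    """
    unknown = set(scores) - set(ETHNICITY_LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    label = max(scores, key=scores.get)
    return label, ETHNICITY_LABELS[label], scores[label]


# Made-up scores, for illustration only.
example_scores = {
    "White": 0.10,
    "Black": 0.05,
    "API": 0.70,
    "AIAN": 0.01,
    "2PRACE": 0.04,
    "Hispanic": 0.10,
}
```

Rejecting unknown labels up front keeps a typo in a label name from silently passing through an analysis.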
NamePrism is a free nationality/ethnicity classification API that achieves the best performance when compared with existing free online systems. Please cite the following publication if you use NamePrism in your work.
Nationality Classification using Name Embeddings.
Junting Ye, Shuchu Han, Yifan Hu, Baris Coskun, Meizhu Liu, Hong Qin and Steven Skiena.
CIKM, Singapore, Nov. 2017.