
String classification - Create code to guess a person's ethnicity based on their name

$30-250 USD

Completed
Posted over 5 years ago


Paid on delivery
I have a dataset of over 500,000 rows. Each row has the following columns:

1) A person's name (first name OR last name).
2) This person's ethnicity. There are 18 different ethnicity groups in the data, such as "english", "chinese", "russian", "indian", "german", "latino", etc.
3) Popularity of this name within this ethnicity, that is, how many times our system has detected a person with this name who belonged to this ethnicity group.
4) Popularity of this name among other ethnicities, that is, how many times our system has detected a person with this name who belonged to a different ethnicity group.

A sample of the dataset is attached to this task as a CSV file.

Your job is to create a program or script that takes a person's name as input (first name and last name, or only one name element) and outputs its guess as to which ethnicity group this person belongs to, based on training on the dataset, AND a confidence or probability number that tells how sure the system is that this ethnicity is the correct answer. For example, if the program receives the input "john smith", it should output the ethnicity class "english" and a confidence number indicating how sure the system is that "john smith" is "english". This is basically a classic string classification problem.

The code must be able to guess the ethnicity of a person whose name does not exist in the dataset. In other words, the code must either be some sort of learning system (such as an artificial neural network trained on the sample dataset), OR use other ways to extract traits from names that hint at the most likely ethnicity, for example n-grams, Bayesian analysis, or something else. The code must not simply be a search algorithm that looks for hits in the dataset and fails when there are none (e.g. if the name 'john' does not exist in the dataset but the user input is 'john', such a system cannot produce any guess that 'john' sounds like an "english" name).

The code should be written either in PHP 7.x or in a way that it can be called from a PHP script (e.g. as a Perl or Python script). In your bid, please tell me what kind of method you would use.
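One way the n-gram and Bayesian idea mentioned in the brief could look is sketched below. This is only an illustration, not the client's or any bidder's solution: the class name `NgramNaiveBayes`, the helper `char_ngrams`, the trigram size, the add-one smoothing, and the tiny `rows` sample are all assumptions made up for the example. It weights each name's character trigrams by the in-group popularity count (column 3) and ignores column 4. Because it scores fragments of a name rather than whole names, it can still produce a guess and a confidence for a name that never appears in the training rows.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n=3):
    """Split a name into character n-grams, padded so prefixes and suffixes count."""
    padded = "^" + name.lower().strip() + "$"
    return [padded[i:i + n] for i in range(max(1, len(padded) - n + 1))]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams with add-one smoothing."""

    def __init__(self, n=3):
        self.n = n
        self.gram_counts = defaultdict(Counter)  # ethnicity -> n-gram -> weight
        self.class_weight = Counter()            # ethnicity -> total name weight
        self.vocab = set()

    def fit(self, rows):
        # rows: iterable of (name, ethnicity, count_in_group).
        for name, ethnicity, count in rows:
            if count <= 0:
                continue  # rows without observations carry no evidence
            self.class_weight[ethnicity] += count
            for gram in char_ngrams(name, self.n):
                self.gram_counts[ethnicity][gram] += count
                self.vocab.add(gram)
        return self

    def predict(self, full_name):
        """Return (best_ethnicity, confidence) for one or more name elements."""
        grams = [g for part in full_name.split() for g in char_ngrams(part, self.n)]
        total = sum(self.class_weight.values())
        log_scores = {}
        for ethnicity, weight in self.class_weight.items():
            denom = sum(self.gram_counts[ethnicity].values()) + len(self.vocab) + 1
            score = math.log(weight / total)  # class prior
            for g in grams:
                score += math.log((self.gram_counts[ethnicity][g] + 1) / denom)
            log_scores[ethnicity] = score
        # Normalise the log scores into probabilities (softmax).
        top = max(log_scores.values())
        exp_scores = {e: math.exp(s - top) for e, s in log_scores.items()}
        best = max(exp_scores, key=exp_scores.get)
        return best, exp_scores[best] / sum(exp_scores.values())


# Hypothetical toy rows standing in for the attached CSV sample.
rows = [
    ("smith", "english", 50), ("johnson", "english", 40), ("john", "english", 60),
    ("ivanov", "russian", 50), ("petrov", "russian", 40), ("smirnov", "russian", 30),
    ("wang", "chinese", 50), ("zhang", "chinese", 40),
]
model = NgramNaiveBayes(n=3).fit(rows)

for query in ("john smith", "johnston", "sidorov"):
    label, confidence = model.predict(query)
    print(f"{query!r}: {label} ({confidence:.3f})")
```

Neither "johnston" nor "sidorov" occurs in the toy rows, yet both receive a guess because they share trigrams with known names. Naive Bayes confidences tend to be overconfident, so on the real data the number would probably need calibration against a held-out set. A script like this could be called from PHP through a command-line wrapper, which is left out here.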
Project ID: 17753084

About the project

17 proposals
Remote project
Active 6 yrs ago

Awarded to:
Hello, I have worked with NLTK problems before, so I believe the job won't be a problem for me. Please check my profile and feel free to ask me anything! I would use a neural network as the predictor. Regards, Žiga
$222 USD in 3 days
5.0 (7 reviews)
4.7
17 freelancers are bidding on average $215 USD for this job
I will be using Python and a long short-term memory (LSTM) network to learn character sequences. I am a data scientist, proficient at implementing machine learning and deep learning models in both R and Python. I am also familiar with web scraping using Python. I have a good command of various regression and classification tasks. I have a strong background in theoretical mathematics, statistics, and probability theory. I have experience implementing multivariate regression, factor analysis, principal component analysis, ANOVA, LASSO and elastic net, classification and regression trees, Monte Carlo methods, etc. in R and Python.
$250 USD in 10 days
4.9 (61 reviews)
5.9
I have good hands-on experience with advanced R and Python, BI tools and technologies, AI, and big data. I have a good knowledge of DL/ML algorithms and have also developed dashboards and web applications using Flask/Django. My areas of expertise are building financial models (stock markets), image processing, building models for the food, healthcare and telecom sectors, classification/prediction/clustering, NLP, and chatbots. Specifically, I have worked for a long time on stock data and on building models for it. I understand the project requirements and will deliver the desired product within the time specified. I would like to hear from you. Thanks, Shivam
$250 USD in 3 days
4.8 (24 reviews)
6.0
Hello, greetings of the day! Your project attracted my attention at first glance because I have rich experience in machine learning and Python programming. I have 5+ years of experience in data science using tools like Python, R, and Spark. I have worked on several similar projects before, across various domains such as life sciences, CPG, and insurance. I have completed a Master of Technology in Machine Learning. I'm confident about your project and very eager to join it. If we have a chance to cooperate, I'll do my best to provide a wonderful result. Work areas: image processing (OpenCV), text mining (NLP), supervised and unsupervised machine learning problems (classification, regression, clustering). Thanks, Abhyudaya D
$155 USD in 3 days
4.7 (49 reviews)
5.6
Hi there! I'm very interested in your project. I am a full-time developer and can work more than 50 hours a week. I am good at PHP and I'm a good software engineer. I have good experience with optimization and search algorithms. I think this problem should be solved using a binary search tree algorithm. I look forward to hearing from you, and hopefully we will have the opportunity to work together. Thank you, Wu
$200 USD in 3 days
4.9 (34 reviews)
5.0
Hi, I am a very experienced statistician, data scientist and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several companies and have done projects requiring high-level quantitative analysis and data interpretation skills to study trends and time behaviour and to compare the variables in the data. I can do advanced analysis in SPSS, R, Python, Weka, Tableau and Excel, including machine learning, hypothesis testing, forecasting, t-tests, ANOVA, etc. Looking forward to discussion. Best regards, Suyash
$250 USD in 3 days
4.1 (23 reviews)
5.7
Hi sir, how are you doing? I have good experience in this field and can do your work in the best possible way. Kindly text me so that we can discuss the work in more detail. Thanks.
$155 USD in 3 days
4.9 (6 reviews)
4.2
Hi sir, I am a computer engineer as well as a certified LabVIEW developer, so I am interested in doing this in LabVIEW. In general I am interested in this type of work, so if you accept, we can cooperate.
$200 USD in 3 days
4.9 (6 reviews)
3.3
Hello! Your problem can easily be solved with an artificial neural network, giving the strings as inputs and the ethnicity as labels. This is how I would solve your problem, and I will gladly help you with it, so please, contact me!
$160 USD in 2 days
5.0 (4 reviews)
3.3
I have completed my Bachelor's degree in Engineering in Electrical and electronics. I have 3 years of work experience. I have worked with Infosys for a year and with various startups for two years. Because of my experience with startups, I am extremely flexible in my skillset and enjoy working with startups. I am currently pursuing my MBA from NALSAR university which is the top law college in India. I will be using SVM and Neural networks. I will compare the accuracy of the two and finally choose the best one to make the classifier model. I will be using numpy, python and tensorflow.
$222 USD in 10 days
0.0 (0 reviews)
0.0
Hello, my name is Alexey. I am a Python expert, although my profile shows only architectural and design work. I can make this program for you, but unfortunately it will be strongly based on the database you have: if there are no "Johns" in the database, it will say "I don't have any information about this name". It doesn't need neural network frameworks; it is a simple analysis of the database and of how often one name or another recurs, with or without a surname. Please let me know if it is possible to cooperate. Over Skype I can show you my Python projects. I have only one: I work on AI, specifically on a word recognition system. Please feel free to contact me. Best regards, Alexey
$111 USD in 5 days
0.0 (0 reviews)
0.0
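For contrast, the lookup-only approach described in the last bid, which the project description rules out, might look like the sketch below. The function name `lookup_guess`, the `table` layout, and the sample counts are hypothetical. The confidence is simply the share of detections that fall inside the chosen group, computed from the two popularity columns.

```python
def lookup_guess(name, table):
    """Return (ethnicity, confidence) for a known name, or None for an unknown one.

    table maps a lowercase name to a list of
    (ethnicity, count_in_group, count_in_other_groups) tuples.
    """
    entries = [e for e in table.get(name.lower().strip(), []) if e[1] + e[2] > 0]
    if not entries:
        return None  # no hit in the dataset, so no guess is possible
    ethnicity, inside, outside = max(entries, key=lambda e: e[1] / (e[1] + e[2]))
    return ethnicity, inside / (inside + outside)


# Hypothetical sample standing in for the real dataset.
table = {
    "smith": [("english", 90, 10)],
    "ivanov": [("russian", 80, 20)],
}

print(lookup_guess("Smith", table))
print(lookup_guess("john", table))
```

The second call returns `None` because 'john' is not a key in the table, which is exactly the failure the client wants to avoid.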

About the client

Turku, Thailand
5.0
642
Payment method verified
Member since Mar 16, 2011
