Implementing the procedures for creating and maintaining an index of scientific publications - 2
€750-1500 EUR
Closed
Posted about 5 years ago
Paid on delivery
Employer:
Institute of Neuroinformatics
University of Zurich & ETH Zurich
Project Objective:
Creating an ElasticSearch index of scientific publications (metadata and full-text) by aggregating content from various data sources (i.e. scientific publication databases). Keeping the ElasticSearch index updated as new content is added to the data sources. Implementation of a flexible workflow to integrate additional future data sources. Being able to handle changes to the APIs of data sources.
Duration: ~1 month
Technologies to use:
Node.js, Python, Elastic Stack, Docker. Open to other suggestions.
Data sources (example list):
1) Crossref ([login to view URL]): contains the metadata of all publications having a digital object identifier (DOI). Content can be downloaded by querying the database through a REST API in a rate-limited fashion.
2) MEDLINE/Pubmed ([login to view URL]): contains metadata and abstracts of most publications related to life sciences. Also contains publications not having DOIs. Content can be bulk downloaded.
3) CORE ([login to view URL]): An aggregate database of most open-access publications, including the full text of some. If the full text is not available (e.g. papers from arXiv), a link to the original source is provided, which should be crawled to fetch the full-text content. Contains data from large databases such as arXiv and CiteSeerX. The database can be bulk downloaded.
The workflow should be flexible to include additional data sources as they become available.
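As an illustration of rate-limited harvesting from a source like Crossref, here is a minimal Python sketch using the public REST API's cursor-based pagination (the `mailto` parameter puts requests in Crossref's "polite" pool; the `harvest` structure and delay value are a hypothetical starting point, not a required design):

```python
import json
import time
import urllib.parse
import urllib.request

CROSSREF_API = "https://api.crossref.org/works"

def build_url(cursor="*", rows=100, mailto="you@example.org"):
    # mailto identifies the client to Crossref's "polite" pool,
    # which grants more generous rate limits
    query = urllib.parse.urlencode(
        {"cursor": cursor, "rows": rows, "mailto": mailto})
    return f"{CROSSREF_API}?{query}"

def fetch_page(cursor="*", rows=100):
    """Fetch one page of works; returns (items, next_cursor)."""
    with urllib.request.urlopen(build_url(cursor, rows), timeout=30) as resp:
        msg = json.load(resp)["message"]
    return msg["items"], msg["next-cursor"]

def harvest(delay=1.0, max_pages=None):
    """Walk the cursor until no items remain, sleeping between requests."""
    cursor, pages = "*", 0
    while True:
        items, cursor = fetch_page(cursor)
        if not items:
            break
        yield from items
        pages += 1
        if max_pages is not None and pages >= max_pages:
            break
        time.sleep(delay)  # crude client-side rate limiting
```

The same skeleton (paged fetch plus a throttled driver loop) carries over to other rate-limited REST sources, with only `build_url` and the response unpacking changing per source.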
Project Parts/Tasks:
Different tasks should be handled by individual Docker microservices
1) Downloading and parsing the entire content of the listed data sources and indexing it in individual ElasticSearch indices. The implementation for parsing the data sources needs to be template-based, i.e. the same functions can be used with a different template for each data source.
2) Extracting the content of PDF files (in an unstructured format) if a data source only provides PDFs (e.g. CORE)
3) Aggregating downloaded content from the data sources in a “meta” ElasticSearch index
4) Keeping the meta index updated as new publications appear
5) Maintaining the meta index: handling duplicates, handling different versions of a publication (e.g. arXiv preprints vs their final publication in a journal), adding new fields to the index, etc.
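The template-based parsing requirement in task 1 could be sketched as follows; the dotted-path convention, the `CROSSREF_TEMPLATE` field paths, and the function names are illustrative assumptions, not a prescribed design:

```python
def get_path(record, path, default=None):
    """Walk a dotted path ('a.b.0.c') through nested dicts and lists."""
    cur = record
    for key in path.split("."):
        try:
            cur = cur[int(key)] if key.isdigit() else cur[key]
        except (KeyError, IndexError, TypeError):
            return default
    return cur

# One declarative template per data source; callables cover fields
# that need more than a path lookup. Paths shown are illustrative.
CROSSREF_TEMPLATE = {
    "title": "title.0",
    "doi": "DOI",
    "journal": "container-title.0",
    "data_source": lambda r: "crossref",
}

def parse(record, template):
    """Apply a template to one raw record, producing a flat ES document."""
    doc = {}
    for field, rule in template.items():
        doc[field] = rule(record) if callable(rule) else get_path(record, rule)
    return doc

sample = {"title": ["A paper"], "DOI": "10.1000/xyz",
          "container-title": ["J. Ex."]}
print(parse(sample, CROSSREF_TEMPLATE))
# {'title': 'A paper', 'doi': '10.1000/xyz', 'journal': 'J. Ex.', 'data_source': 'crossref'}
```

Adding a new data source then means writing a new template dict rather than new parsing code, which is the flexibility the brief asks for.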
We do not ask for the delivery of a database, but the tools to populate it. The source code of your implementation needs to be delivered. Please do not submit code with potential license issues. Third party software/libraries can be used if they are FOSS.
We do not ask for a GUI.
ElasticSearch index fields (not exhaustive):
Title, journal, page, publication date, authors, affiliations, abstract, full-text, references, figures, data source, data source ID, DOI
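One possible Elasticsearch mapping covering the fields above, expressed as a Python dict ready to pass to an index-creation call. Field names and types are a sketch only; analyzers, nested author/affiliation objects, and additional fields would need to be agreed on:

```python
# Hypothetical starting mapping for the "meta" index; keyword fields
# support exact matching and aggregation, text fields full-text search.
META_INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "title":            {"type": "text"},
            "journal":          {"type": "keyword"},
            "page":             {"type": "keyword"},
            "publication_date": {"type": "date"},
            "authors":          {"type": "keyword"},
            "affiliations":     {"type": "keyword"},
            "abstract":         {"type": "text"},
            "full_text":        {"type": "text"},
            "references":       {"type": "keyword"},
            "figures":          {"type": "keyword"},
            "data_source":      {"type": "keyword"},
            "data_source_id":   {"type": "keyword"},
            "doi":              {"type": "keyword"},
        }
    }
}
```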
Hi there, I have checked the details. I have great experience with Docker, Elasticsearch, Node.js, and Python. Please start a chat so we can discuss this job in more detail. Thanks
Hi, dear Employer!
I am really interested in your project.
I have solid experience in Python, C/C++, C#, and Java programming.
I am 100% sure I can satisfy your requirements perfectly.
A user-friendly interface and a clear algorithm will strengthen your project.
I want a long-term relationship with you.
Thank you and best regards!
https://www.freelancer.com/projects/software-architecture/Elastic-search-kibana-cuckoo-API
https://www.freelancer.com/projects/javascript/Webdevelopment-Project-for-Shadab/
I have done similar tasks using Python, Elasticsearch, and Node.js. Let's discuss.
Hello, as a core developer I have the relevant skills and experience you requested in your project description. I can share some demos in further chat. Can we discuss this in more detail to get a better understanding of the project? I have some technical questions, so let me know when you have time to discuss them and clear up the doubts.
Moreover, you can also check my profile page, as I have a repeat-hire ratio of more than 33%, so I work on a long-term basis.
Hello?
How are you?
I have seen the project - "Implementing the procedures for creating and maintaining an index of scientific publications - 2."
I have been working in these fields (Docker, Elasticsearch, Node.js, Python) for 7 years as a freelancer.
I can work full time as you need.
I will never disappoint you, and I will always try my best to deliver good results.
Hope to work with you.
Thank you.
Hello!
I am a Python developer.
I looked at your project and it seems interesting.
I have all necessary skills required for this project.
Ping me to discuss in detail.
Hi there,
Your job post has caught my attention, and I am pleased to inform you that I can do this job, as I have excellent experience in the mentioned technologies.
Thanks
Everlytics is an Enterprise AI company headquartered in Singapore with an offshore development team in Bangalore, India. As a team we have an exhaustive technical skill set and have successfully delivered a number of projects in areas including Machine Learning, Big Data, NLP, Computer Vision, Deep Learning, and Business Intelligence, to name just a few.
We believe one of the most important skills for a Data Scientist is being able to scrape data effectively and efficiently from any data source, be it a custom-built database, an API, or even a website.
For example: scraping 1000 TB of meaningful English sentences from the web in less than 2 months.
We have more than 5 years of experience working with ElasticSearch (ES), Kibana, and Kafka (alongside Spark Streaming and Elasticsearch in the data pipeline).
ES experience:
Elasticsearch Cluster setup, Capacity Planning and Configuration Design
Created indices from web sources, data sources, etc.
Mapping of the data model between ES and the database
Updating the ES index with new feeds from data sources
Deduplication of the index in ES using Logstash and Python scripts
Taking snapshots of the ES clusters for index backup
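A minimal sketch of the deduplication idea mentioned above (a hypothetical standalone helper, not the bidder's actual scripts): fingerprint each record by its normalized DOI, fall back to a normalized title when the DOI is missing, and keep the first record seen per fingerprint:

```python
def fingerprint(doc):
    """Build a dedup key: normalized DOI, else whitespace-collapsed title."""
    doi = (doc.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", " ".join((doc.get("title") or "").lower().split()))

def deduplicate(docs):
    """Keep the first document per fingerprint, preserving input order."""
    seen, unique = set(), []
    for doc in docs:
        key = fingerprint(doc)
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

docs = [
    {"doi": "10.1/ABC", "title": "Paper"},
    {"doi": "10.1/abc", "title": "Paper (preprint)"},  # same DOI, case-folded
    {"doi": "", "title": "Other  Paper"},
]
print(len(deduplicate(docs)))  # 2
```

In practice the "first record wins" rule would be replaced by a merge policy (e.g. prefer the journal version over the arXiv preprint), but the fingerprinting step stays the same.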
We have experience delivering projects where we package the code into a Docker image and host it on a Kubernetes cluster (either cloud-managed or self-managed) for easy deployment and scaling.
We look forward to working with you.
Good day!
I have extensive ELK project experience in banking and telco. Currently I am also engaged in a number of ELK projects similar to your requirements. I also have a background in Big Data, Python, and ETL processes, so I can transform any data before ingesting it into the ELK stack. I might not be as cheap as others, but I provide quality work within the deadline.
I would like to offer to work with ArangoDB, with the limited experience I have with it. Its ArangoSearch feature competes with Elasticsearch, and the gist of it is that it supports graphs, in case you need elaborate search parameters beyond text, although I doubt it stands up to Elasticsearch as far as text is concerned.
Hey, there,
Please if possible give me the list of features and also reference that would be great for me.
Please come over the chat for the further detailed discussion.
Thanks
Hello. I have read all your requirements for 'Implementing the procedures for creating and maintaining an index of scientific publications - 2' and fully understood them. I am confident that I can finish this project. Please get in contact with me so that we can discuss the details via chat. :)
Skills:
Docker, Elasticsearch, node.js, Python
Hi, dear! I am quite interested in your project - 'Implementing the procedures for creating and maintaining an index of scientific publications - 2'. :) I am a skillful software developer with rich experience in this field. I would be glad to hear from you. Thank you in advance.
Skills:
Docker, Elasticsearch, node.js, Python
Hi, I am a PhD scholar studying bioinformatics. I have developed a Python-based prediction tool.
I have already worked with ~20,000 protein data analyses and submitted my paper. I am new to freelancing; if I am given a chance, I will definitely complete the project with all the deliverable outcomes. I am sincere, dedicated, and adaptive, and I have good knowledge of curating manuscripts from online sources. I hope I will be suitable; please give preference to a student like me.
Greetings. I would like to develop your ElasticSearch project using Python 3 for crawling and parsing data, and Docker for containerizing the various modules required for your application.
The containerized modules to be included are:
- Modular template-based crawler for populating ElasticSearch indexes, and for maintaining updated indexes.
- One ElasticSearch index for each data source + one meta index.
- A PostgreSQL database as a demo database; since your contract does not require delivery of the database, it will be up to you to implement your own bindings if you choose to use a custom database.
- A PDF parser for extracting the full text of PDF documents. Simple parsers such as PyPDF2 are not very robust and can have problems with different PDF character encodings, so tika-python will be used instead; it calls a local Apache Tika REST server, which handles PDF parsing much more robustly. The Apache Tika REST server will run within the container and requires Java 7 to be installed.
- REST proxy for forwarding API requests to data sources, with customizable rate-limiting.
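For the PDF-extraction module above, the tika-python call could look like the following sketch. `extract_pdf_text` is a hypothetical wrapper, and running it requires the third-party `tika` package plus Java for the local Tika server, so the import is kept lazy and only the pure text-normalization helper is exercised without a server:

```python
def normalize_text(raw):
    """Collapse blank lines and trim whitespace from extracted text."""
    return "\n".join(
        line.strip() for line in (raw or "").splitlines() if line.strip())

def extract_pdf_text(path):
    """Extract unstructured text from a PDF via a local Apache Tika server.

    tika-python starts a local Tika REST server on first use (this is
    where Java is needed); parser.from_file returns a dict whose
    'content' key holds the extracted text, or None on failure.
    """
    from tika import parser  # third-party: pip install tika
    parsed = parser.from_file(path)
    return normalize_text(parsed.get("content"))
```

Keeping extraction behind a single function like this also makes it easy to swap in a different backend later if Tika proves too heavy for some container.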
Payment for the project would be divided into 4 milestones; please review the bid for details.
I thank you for taking the time to review my application, and look forward to working with you soon.