
Data Scientist with Dask and Kubernetes experience

Technology & Tools: Programming Languages and Frameworks (Python), DevOps Tools (Kubernetes)

CLOSED FOR BIDDING

Project Description

Looking for a Data Scientist with strong Python application development experience who has used Dask and Kubernetes in large application pipeline frameworks.

Role Description

We are looking for someone to help us migrate our existing data science pipeline to Dask. We have started the migration but are running into issues with Dask’s distributed client and with Kubernetes. We need someone to come in, learn our codebase, and help us get the pipeline up and running, ultimately in a Kubernetes environment. Our current pipeline is essentially a large ETL pipeline for unstructured text that incorporates NLP steps such as text cleaning, named entity recognition (NER), scoring/ranking, and document deduplication. We are migrating the pipeline to a distributed system to keep up with the massive quantities of unstructured data our system now has access to.

Skills Required

  • Python (Dask)

  • Kubernetes

  • Dask with spaCy would be a great plus

Project Overview

  • Posted
    March 20, 2020
  • Planned Start
    March 25, 2020
  • Delivery Date
    May 31, 2020
  • Preferred Location
    From anywhere

EXPERTISE REQUIRED
Dask
NLP
Kubernetes
spaCy
Python
