381 Projects that match your criteria

Proof of Concept: Digital influencers Mapping - Developing an analysis framework

We are a global pharmaceutical company.

We would like to have an analysis framework developed for identifying the digital opinion leaders (influencers) for a certain therapeutic area in the Nordics region of Europe.

Application in scope for this project: one therapeutic area

Analysis framework:

  • Data gathering via social media and web scraping (last 1-2 years) by region (DK,SE,FIN,NOR)
  • Develop a scoring system that identifies the most influential individuals for a therapeutic area based on their reach (a combination of social media reach and online activity such as blogs)
  • Apply the algorithm to the collected data to identify who these influencers are and what kind of influencers they are (patients, HCPs, caregivers, other)
  • Identify the influencers’ social media network 
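
One way to read the scoring bullet above, as a minimal sketch only: combine log-scaled follower reach with posting activity into a single weighted score. The field names (`followers`, `posts_per_month`), the weights, and the log scaling are all illustrative assumptions; the brief does not specify how reach and activity should be combined.

```python
import math

# Hypothetical weights; the brief does not say how reach and activity combine.
WEIGHTS = {"followers": 0.6, "posts_per_month": 0.4}


def influence_score(profile, maxima):
    """Weighted sum of log-scaled metrics, each normalised to the 0..1 range."""
    score = 0.0
    for metric, weight in WEIGHTS.items():
        top = math.log1p(maxima[metric])
        value = math.log1p(profile.get(metric, 0))
        score += weight * (value / top if top > 0 else 0.0)
    return score


def rank_influencers(profiles):
    """Return (name, score) pairs sorted from most to least influential."""
    maxima = {m: max(p.get(m, 0) for p in profiles) for m in WEIGHTS}
    scored = [(p["name"], influence_score(p, maxima)) for p in profiles]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up example profiles, one per influencer type.
profiles = [
    {"name": "patient_blogger", "followers": 12000, "posts_per_month": 20},
    {"name": "hcp_account", "followers": 3000, "posts_per_month": 4},
    {"name": "caregiver_forum", "followers": 500, "posts_per_month": 30},
]
ranking = rank_influencers(profiles)
```

Log scaling keeps one very large account from flattening every other score; whether that is desirable depends on the therapeutic area.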


Deliverables:

  • Analysis results (raw data) and graphs (social network)
  • Simple PPT presentation explaining the findings and other relevant insights
  • Source code that should be entirely reproducible across all steps with the adequate documentation 

Languages: R and/or Python (R preferably)

Source code documentation: via R/Python notebooks

Timeline: Q4 2018

Possibility of expanding: apply the same methodology on multiple therapeutic areas (7 additional) by the end of 2018.


  • Please tell us more about how you would tackle this project and an estimate.
  • We would also like to know your past experience performing similar work


Social Media Analytics
Market Research
Customer Analytics

$75/hr - $175/hr

Starts Oct 16, 2018

19 Proposals Status: IN PROGRESS

Client: J*******

Posted: Sep 27, 2018

AI/ML Expert for Voice Recognition

I am working on a project (detailed in the attached product summary and AI Development Tasks) that uses speech-to-text with DeepSpeech, speaker identification, and Named Entity Recognition.  While training can be done on AWS or another cloud provider, these tasks must all run locally on the embedded Linux device.  As a software developer, I've implemented a few of these à la carte but am looking for someone with more expertise in deep learning and voice technology to take this to the next level.

The scope of the first part of this project is defined in the file "AI Development Tasks" and should be completed in 3-4 weeks.  Future product scope will be defined around week 3 and is expected to last considerably longer.

Current Technology Stack: Python 3, DeepSpeech, Embedded Linux

Expertise needed: Deep learning, DeepSpeech, TensorFlow, Python 3, Python open source, NumPy. Voice/Speech Recognition. Understanding of how to optimize code for minimal power consumption on embedded Linux is a plus.

In your proposal please share more details on your expertise with voice recognition.

Deep Learning

$60/hr - $100/hr

Starts Nov 14, 2018

11 Proposals Status: IN PROGRESS

Client: H******* ********** ****

Posted: Sep 26, 2018

Algorithm to Identify Frac Pits and Well Pads Using Satellite Map Data


We are an online marketplace and data source for oilfield water data. In the past year we have started to analyze satellite imagery for oilfield water related features to include these results in our GIS map and marketplace products. We currently use an outside service company to identify these features. We would like to hire a contractor to build our own automated "virtual machine" to perform this analysis so that we can reduce operating costs of the satellite scans and increase their frequency and capabilities without relying on an outside service provider in the future, and also to own our IP from our investment in advancing these methods.

The key region we are covering is about 75,000 sq mi (roughly 200,000 km2).

We are identifying several types of features.


FEATURE 1: WELL PADS

Well pads are square or rectangular areas of cleared, leveled dirt where drilling and fracturing equipment can be placed to drill a new oil well. Samples shown below. Dimensions may be 50m2 up to 100m2 or more. When they are newly constructed they have caliche spread on them, which increases reflectivity. A new dirt road must be built to reach a pad area before the pad can be cleared. We have had very high sensitivity levels using CNN-ML for detection (probably 85-90% detection rate) but also high false positive rates (probably 3:1 FP:TP ratio), although this was just a first pass with no additional tuning or OBIA rules.

We need to:

a) Identify presence of a well pad. We are using 10m resolution Sentinel-2 imagery.

b) Distinguish well pad from non well pad similar features like a building construction site, road intersection, or harvested farm. We might be able to apply some OBIA rules to improve results, for example well pads have a dirt road leading to them and tend to be near other oil wells.

c) We do not care about well pad size or other qualities except to the extent these help our detection and specificity.

d) Well pad detection must occur frequently and generate results quickly, because the sole value of well pad detection is in identifying new activity on the ground, and this new activity becomes widely known and loses its value within a few weeks.
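
The proximity rule mentioned in item b) could be sketched roughly as follows: keep a candidate detection only if a known well lies within some radius. The 5 km radius, the function names, and the (lat, lon) tuple format are assumptions for illustration, not values from the posting.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def near_known_well(candidate, known_wells, max_km=5.0):
    """Proximity rule: True if any known well lies within max_km (assumed radius)."""
    return any(haversine_km(candidate, well) <= max_km for well in known_wells)


def filter_candidates(candidates, known_wells, max_km=5.0):
    """Keep only candidate detections that pass the proximity rule."""
    return [c for c in candidates if near_known_well(c, known_wells, max_km)]
```

A rule like this trades some sensitivity for fewer false positives, so the radius would need tuning against human-confirmed detections.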

Samples: https://screencast.com/t/g5g5bdrqc  and  https://screencast.com/t/JM2DXnorZ

FEATURE 2: FRAC PITS aka Frac Ponds aka Water Impoundments. 

These are artificial ponds used to supply hydraulic fracturing activities. Sample images below. Methods for identifying water in satellite imagery using near-infrared (NIR) reflectivity are well established. We need to:

a) Identify presence of a frac pit  (currently we use OBIA methods for this but ML could work as well or better). We are using 10m resolution Sentinel-2 imagery.

b) Distinguish frac pit from non frac pit water features such as: reserve pits (small water pits used for drilling operations, which should be classified separately), lakes and ponds, swimming pools, reservoirs, streams, wastewater treatment ponds etc. The identification and differentiation of water features can be done through a combination of ML and OBIA rule sets. For example, a frac pit that is nowhere near any known oil well locations might not be a frac pit, or a frac pit in the middle of a residential area might not be a frac pit.

c) Estimate its dimensions (typical size range is 75m2 - 200m2)

d) Estimate volume based on a standard pit design method (a simple formula converting surface area to volume). If we can find a more advanced way to estimate volume, that would be even better.

e) Estimate water type: fresh, brackish or produced water (we have found this can be done through turbidity or color or spectral analysis but have not tested reliability)

f) Track changes in same-pit levels over time.

g) Ideally, identify early signs of frac pit construction, not just presence of water.
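
For the volume item above, a simple surface-area-to-volume conversion might look like the sketch below. It assumes a rectangular pit with uniformly sloped sides and applies the prismoidal formula; depth and side slope cannot be read from 10m imagery, so both are assumed inputs here, and the function name is hypothetical.

```python
def pit_volume_m3(top_length_m, top_width_m, depth_m, side_slope):
    """Volume of a rectangular pit with sloped sides (prismoidal formula).

    side_slope is horizontal run per unit of depth (e.g. 3 means 3:1).
    Depth and slope are assumptions; imagery only gives the top dimensions.
    """
    bottom_length = top_length_m - 2 * side_slope * depth_m
    bottom_width = top_width_m - 2 * side_slope * depth_m
    if bottom_length < 0 or bottom_width < 0:
        raise ValueError("pit is too small for this depth and side slope")
    top_area = top_length_m * top_width_m
    bottom_area = bottom_length * bottom_width
    # Cross-section halfway down, where each side has moved in by slope * depth / 2.
    mid_area = ((top_length_m - side_slope * depth_m)
                * (top_width_m - side_slope * depth_m))
    return depth_m * (top_area + 4 * mid_area + bottom_area) / 6
```

With vertical walls (`side_slope=0`) this reduces to length x width x depth.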

Sample frac pit images: https://screencast.com/t/BHgBh7ZBqQti
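
As a rough illustration of the near-infrared water detection mentioned above, the sketch below computes the Normalized Difference Water Index, NDWI = (Green - NIR) / (Green + NIR), per pixel. For Sentinel-2 the green and NIR bands are B3 and B8, both at 10m. The zero threshold and the plain nested-list band format are simplifying assumptions; a real pipeline would read rasters with a geospatial library.

```python
PIXEL_AREA_M2 = 10 * 10  # one Sentinel-2 10m pixel


def ndwi(green, nir):
    """NDWI for one pixel; open water tends to give positive values."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Boolean mask, True where NDWI exceeds the (assumed) threshold."""
    return [[ndwi(g, n) > threshold for g, n in zip(g_row, n_row)]
            for g_row, n_row in zip(green_band, nir_band)]


def water_area_m2(mask):
    """Approximate water surface area as pixel count times pixel area."""
    return sum(cell for row in mask for cell in row) * PIXEL_AREA_M2


# Tiny made-up reflectance grids: left column water-like, right column land-like.
green = [[0.10, 0.08], [0.09, 0.05]]
nir = [[0.03, 0.30], [0.02, 0.25]]
mask = water_mask(green, nir)
```

The mask only says "water"; separating frac pits from lakes, pools, and reserve pits would still need the OBIA rules or ML described above.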

FUTURE WORK: We have a number of other features we would like to detect in addition to frac pits and well pads. Almost everything that happens on the ground in the Permian Basin is of interest to us. If we hire you to do this work for us we will want to add many more feature categories and we have several in mind already.


IMPORTANT: Well Pad identification is the more urgent requirement at this time. Well pad process is priority 1. Frac pits is priority 2. The well pad detection system below should be built as expeditiously as possible and can operate independently of the frac pit detection system. If they go together, great, but frac pit development should not delay or add initial cost to well pad development.

1) Automatically download, combine and load Sentinel-2 (free, 10m resolution) satellite imagery for the regions of interest on whatever frequency these images become available. I believe this frequency to be about once per week, however I am not sure the entire region of interest is covered with every pass.

  • If necessary for early versions this can be done by a human operator and not automated.
  • In the future we might upgrade our image resolution and frequency to more expensive sources. This might prove necessary to identify some of the features we want to identify in the future.

2) Scan the imagery for the desired features described above using ML and OBIA models to be developed and improved over time.

3) Output the GPS coordinate locations of all well pads and frac pits (include dimensions and estimated volume and water type with frac pits) and other features when added

4) Compare newest scan to all prior scans to screen out all previously identified features and highlight only the new features that have appeared in the most recent scan period

5) Construct and operate a system for human QA of new features. Something like a Captcha system for image confirmation and classification so that all newly identified features are shown in series to trained human QA persons in a remote location for confirmation, rejection, enhancement or reclassification

6) A method for feeding the results of human QA back into the ML model and OBIA rules to improve them, and also output the post-QA feature locations list as the final list for this scan date in an appropriate database format.

7) Match the resulting locations to various external data sets to enhance their meaning and for further QA. For example, we should match newly identified well pads to the most recent state data on drilling permit locations in order to remove well pad locations for which permits have already been filed, since these pads can already be found by other means. Also match new pad locations to the closest published operating drilling permits in order to estimate which company is the likely owner of the new pads.
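
Step 4 above (screening out previously identified features) could be approximated as in the sketch below: a detection counts as new only if no earlier detection lies within a tolerance distance. The 50 m tolerance, the function names, and the (lat, lon) tuple format are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0


def approx_distance_m(a, b):
    """Equirectangular approximation; adequate for short distances."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return EARTH_RADIUS_M * math.hypot(dx, dy)


def new_features(latest, previous, tolerance_m=50.0):
    """Return detections in `latest` that match nothing in `previous`."""
    return [p for p in latest
            if all(approx_distance_m(p, q) > tolerance_m for q in previous)]
```

This brute-force comparison is quadratic in the number of detections; a spatial index would be the natural next step for a region of this size.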


In your proposal, please answer the following four questions:

1) What are your background and qualifications for this project?

2) What would be your approach to solving this problem?  Which algorithms are appropriate?

3) Will your work be limited to the development of the algorithm, or can you develop and deploy a full solution in the cloud?  We are more concerned about performance than the aesthetics of the user interface.

4) How would you estimate the cost of Feature 1 and Feature 2?  Can you provide a flat fee quote for each?

Energy and Utility
Engineering and Design
Computer Vision

$50,000 - $100,000

Starts Oct 04, 2018

10 Proposals Status: COMPLETED

Client: S************ ****

Posted: Sep 06, 2018

IBM Watson Expert

We are developing a device that provides hands-free voice and visual communication to assist consumers in everyday situations.

We are looking for an expert in IBM Watson APIs, particularly visual recognition and voice processing, with a good grounding in IBM Cloud technology, to provide us with an advisory service and a prototype implementation.

The expert must be able to implement our software prototype, and so should have the following skills:

  • proven programming experience
  • solid knowledge of Node-RED and proven experience in Java (3-5 years)
  • a good grounding in IBM Cloud technology
  • proven experience building projects on top of IBM Watson APIs in the IBM Cloud environment
  • a great team player who works well with others, and a great consultant who can explain their knowledge and work
  • experience working on the Linux platform

Nice to have: ML, Deep Learning knowledge particularly in visual recognition and NLP.

The deliverable:

  1. Advisory service 
  2. Approved architecture of the prototype
  3. Working prototype, which will be divided into phases. 

In your proposal please highlight:

  1. How long have you worked with IBM Watson APIs?
  2. What was your last project implemented on IBM Watson APIs?
  3. How familiar are you with the IBM Cloud environment?
  4. Have you architected an entire IBM Watson API project by yourself?

IBM Watson

$50/hr - $120/hr

Starts Sep 02, 2018

6 Proposals Status: COMPLETED

Client: N********** ***

Posted: Aug 30, 2018

Customer-specific Pricing Algorithm Based On Historical Pricing Data

Project Overview:

We are looking for assistance in refining our quoting by creating a pricing algorithm. The end deliverable will be a custom (per customer per material/supplier) pricing model which maximizes revenue and minimizes costs for 1) labor and 2) material.

Company Profile:

We are a local DFW, TX sand and gravel trucking broker primarily serving the construction industry. We do not own trucks nor employ drivers.  We work with owner/operators to complete orders.  

We have two main input costs – 1) material (sand, gravel, rock, etc.) and 2) trucking costs (paying the owner/operators for their trucking services).  Trucking (labor) makes up approximately 80% of our revenue with the remaining 20% derived from material sales.

These two input costs are marked up a certain percentage and passed onto our customers. All jobs will incur trucking costs; only a subset of jobs involve materials costs too.

Data Source:

We have recently invested significant time and resources into a custom quoting, dispatch, and job completion system hosted on Amazon Web Services.  

The MySQL database houses all the quoting information and actual deliveries completed (successful quotes).

Our current quoting methodology is based on drive time (distance/time) for labor, plus material costs from local pits/dumps (marked up by a certain percentage).

We would like to refine our pricing methodology based off individual customers (greater discount to high volume customers, higher mark up to riskier customers, higher costs for slower/less sophisticated customers) maximizing our revenue and minimizing our costs.


The deliverable would integrate into our MySQL database and refine itself over time. The algorithm should take into consideration volumes/seasonality and cover the following areas:

1. Trucking costs – provide an estimated trucking cost to minimize cost on a per unit basis (e.g. per ton, cubic yard, load, hour) given different truck types (sizes). We are thinking about eventually allowing drivers to bid on each project to further refine our pricing. We utilize several different truck sizes which, all other things being equal, would result in a different trucking cost for different size trucks.

Problem to solve: what’s the lowest price we can pay our truckers to successfully find truckers to perform the work?

2. Material costs – material costs are “fixed” by our suppliers (local pits and dumps with whom we have purchasing accounts) and priced by either the ton, cubic yard, or load. The algorithm could suggest lower cost alternatives which may help win the bid but may be a further distance away from the job site.

Problem to solve: are there suppliers which may be located a little further from the job site, but would offer a lower total cost to the customer (with a similar material quality)?
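
The supplier question above comes down to comparing total delivered cost, material plus hauling, across pits. A minimal sketch follows; the prices, round-trip times, truck capacity, and hourly rate are made-up numbers, and it assumes trucking is paid per hour of round-trip time.

```python
import math


def delivered_cost(supplier, tons, truck_capacity_tons, truck_rate_per_hour):
    """Material cost plus hauling cost for one order from one supplier."""
    loads = math.ceil(tons / truck_capacity_tons)
    material = supplier["price_per_ton"] * tons
    hauling = loads * supplier["round_trip_hours"] * truck_rate_per_hour
    return material + hauling


def cheapest_supplier(suppliers, tons, truck_capacity_tons, truck_rate_per_hour):
    """Supplier with the lowest total delivered cost for this order."""
    return min(suppliers, key=lambda s: delivered_cost(
        s, tons, truck_capacity_tons, truck_rate_per_hour))


# Hypothetical suppliers: the far pit is cheaper per ton but a longer haul.
suppliers = [
    {"name": "near_pit", "price_per_ton": 12.0, "round_trip_hours": 1.0},
    {"name": "far_pit", "price_per_ton": 8.0, "round_trip_hours": 1.5},
]
best = cheapest_supplier(suppliers, tons=100, truck_capacity_tons=20,
                         truck_rate_per_hour=90.0)
```

In this made-up case the farther pit wins ($1,475 against $1,650) because the material saving outweighs the extra haul time; a smaller order or a higher truck rate can flip the result.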

3. Trucking + Materials (when applicable) revenue – provide estimated values to maximize our chance of winning bids on a per customer basis. This should be a flexible calculation based on full and partial truck loads. When trucking costs and material costs are priced in the same unit of measure, the customer will only be presented with a single price quote; when trucking costs and material costs are priced in different units of measure, the customer will see separate price quotes for both materials and labor.

Problem to solve: Considering credit risk, what is a total, all-in cost for the customer to maximize our revenue and win the bid?

4. Credit Risk/Quality of Customer – items which may be used in determining credit risk:

  • average size ($$$) of invoices (higher average better)
  • total billed revenue (more better)
  • time as customer (longer better)
  • Days Sales Outstanding (lower better) – how slow does this customer pay; our terms are typically N30.
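
The four indicators above could be folded into one customer-quality score, for example by min-max scaling each across customers and inverting Days Sales Outstanding. This is only a sketch: the equal weighting, the field names, and the scaling choice are assumptions, not part of the posting.

```python
# True means a higher raw value is better; DSO is the one inverted indicator.
HIGHER_IS_BETTER = {
    "avg_invoice": True,
    "total_billed": True,
    "months_as_customer": True,
    "dso_days": False,
}


def min_max(values):
    """Scale values to 0..1; a constant column maps to a neutral 0.5."""
    lo, hi = min(values), max(values)
    return [0.5 if hi == lo else (v - lo) / (hi - lo) for v in values]


def credit_quality(customers):
    """Return {name: score in 0..1}, equally weighting the four indicators."""
    scores = {c["name"]: 0.0 for c in customers}
    for field, higher_better in HIGHER_IS_BETTER.items():
        scaled = min_max([c[field] for c in customers])
        for customer, s in zip(customers, scaled):
            contribution = s if higher_better else 1.0 - s
            scores[customer["name"]] += contribution / len(HIGHER_IS_BETTER)
    return scores
```

A score like this could then drive the markup: a smaller markup for high scores and a risk premium for low ones.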

5. Seasonality – Supply and demand. Rainy season = lower demand / higher (available) supply. Ability exists to pay the driver less when demand is lower.

6. Actual vs estimated time spent on each job – currently our labor cost estimates are based on estimated time to complete. Shortly we will be rolling out the ability to track precisely the time it takes to complete each job. This information will need to be planned for and included in any deliverable.

Transportation and Warehousing

$7,000 - $10,000

Starts Aug 31, 2018

1 Proposal Status: IN PROGRESS

Net 60

Client: A******** ********

Posted: Aug 30, 2018

Pandas & Python Expert to Help Execute Causal Inference Process on Survey Data

About us:

We are a startup in San Francisco that collects and analyzes survey data to measure how well online brand advertising works.


We've developed a custom Python-based pipeline to perform valid causal inference, but it's not yet fully automated. For now, each ad campaign we analyze requires the creation of a custom "configuration" script in Python that cleans and conforms the data to the format that our statistical pipeline requires. We've got a rush of campaigns we need to analyze over the next few months and so we need someone proficient in Python and pandas to help us set up and execute our analysis (and ideally to help us figure out additional ways to automate the process).

The process is relatively well-defined at this point, but there's a lot that can go wrong (e.g. missing or broken data, violations of statistical expectations), so the process needs a fair amount of hand-holding by someone who has the quantitative savvy to critically evaluate the data and results as they go through our pipeline.
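
To give a flavor of what a per-campaign "configuration" step might involve, here is a hedged pandas sketch that renames columns, checks that required fields exist, coerces types, and drops broken rows. The config layout and every column name are hypothetical; the posting does not describe the actual format.

```python
import pandas as pd

# Hypothetical per-campaign configuration; the real format is not described.
CONFIG = {
    "rename": {"resp_id": "respondent_id", "saw_ad": "exposed", "q1": "brand_recall"},
    "required": ["respondent_id", "exposed", "brand_recall"],
    "numeric": ["brand_recall"],
}


def conform(raw, config):
    """Rename, validate, and clean one campaign's survey export."""
    df = raw.rename(columns=config["rename"])
    missing = [col for col in config["required"] if col not in df.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    df = df[config["required"]].copy()
    for col in config["numeric"]:
        # Unparseable answers become NaN so they can be dropped below.
        df[col] = pd.to_numeric(df[col], errors="coerce")
    df = df.dropna(subset=config["required"])
    return df.drop_duplicates(subset="respondent_id").reset_index(drop=True)
```

Failing loudly on missing columns, rather than silently continuing, suits a pipeline where bad inputs would otherwise surface only as odd statistical results.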


Expertise Needed :

Prior experience with causal inference is not necessary, but you do need to be strong in Python and specifically pandas.

We don't really care where you're physically located but we need someone who will overlap with us during business hours (10:30am - 6:30pm PST)


In your proposal: Please tell us how comfortable you are with Python and specifically pandas, and share details on your background and past experience.

Data Analysis and AI Tools
Media and Advertising

$75/hr - $150/hr

Starts Aug 31, 2018

18 Proposals Status: CLOSED

Client: S*******

Posted: Aug 29, 2018

Data Management Healthcare Project

We are looking for a team that can help intake medical claims, pharmacy claims, and biometric data from a variety of external sources, scrub and clean the data, and load it into our internal schema.

We need a combination of ETL and database skills to set up and manage this intake process.

In addition, we are building a longer term AI based solution that can automate data transformations, so we can operate in more of a Schema on Read mode than Schema on Write.

Beyond the data loading task, we need resources to set up a business rules engine to support identification and sizing of improvement opportunities within the realm of cost reduction or health condition improvement.

Ultimately, we will have a long list of "upstream" data sources, "mid-stream" analytic partners and models, and "down-stream" delivery via APIs, Blockchain, and MicroServices. 

Technology Stack:

  • Amazon Services (S3, EC2, SMS, Aurora MySQL, RDS, Glue)
  • Talend Open Studio ETL (Open Source) Java
  • Drools
  • R
  • Power BI

Technical Skills:

  • HIPAA Healthcare Compliance 
  • Encryption
  • Database Design\Analysis
  • SQL Development - Stored Procedures
  • Business Rules\Workflows Development
  • BI\Datawarehouse


Ideal Team:

A mid-size firm with blended senior and junior resources that can achieve project goals. 


$50/hr - $75/hr

Starts Aug 31, 2018

17 Proposals Status: CLOSED

Client: W**********

Posted: Aug 25, 2018

Comparison of Two Analytics Platforms: Alteryx and RapidMiner

Conduct a comparison of the two analytic platforms Alteryx and RapidMiner in the areas of usability, feature/function and completeness to do data preparation and create, validate and deploy predictive models. The criteria for evaluation will follow the Gartner Data Science Platform Evaluation Guide. Completion of the project will include submission of the completed evaluation form (attached) and follow-up interview to discuss the findings.

Profile of Candidate:

  • PhD or Masters degree in a quantitative field of study
  • 2-4 years experience working for commercial business as Data Scientist
  • Preferable that the candidate has also worked as part of a team of Data Scientists
  • To minimize bias, must have either (a) proficient experience with both products, or (b) no experience with either

Target Completion Time: 40-60 hours

We intend to hire two data scientists so we have two separate evaluations for comparison.

Market Research

$100/hr - $150/hr

14 Proposals Status: CLOSED

Client: C************

Posted: Aug 09, 2018

Upgrade & Optimize Jupyter/Python Prototype for Production Environment

About Us:

We provide rich analytics for customer workplace optimization. The Company’s SaaS offering uses AI-powered analytics to enable customers to:

  • Replace outdated methods of measuring office space utilization
  • Leverage better analytics to optimize their workspace
  • Enhance employee collaboration and productivity, and
  • Increase the effectiveness of real estate spend.

Our SaaS product helps migrate clients away from the historically time-consuming, and often error-prone, manual data gathering and analysis required for workplace optimization metrics such as employee attendance and real estate total cost of occupancy (“TCO”). 

The platform:

1. Aggregates and leverages real time office use data that already exists

  • sources include: badge data, WiFi infrastructure, lighting infrastructure, building information, IoT sensors, calendar data, etc.

2. Harmonizes multiple overlapping and complementary data sources using state of the art data science (including machine learning algorithms and advanced statistical methods)

3. Produces robust analytics and actionable insights:

  • average and peak usage
  • underutilization vs. congestion on a campus, building, floor, business unit level
  • intra- and inter- departmental collaboration
  • conference room occupancy and availability
  • avoided costs/potential savings

We have targeted our data analytics platform at Fortune 500 & Global 2000 corporate tenants with many knowledge workers and large amounts of leased or owner-occupied office space, such as Cisco, ExxonMobil, Lenovo, DELL EMC, BP, Abbvie, Comcast, T. Rowe Price, Hilton and Uber, among others.

The problem:

Upgrade our existing manually run pipeline, which consists of a family of Jupyter notebooks and Python scripts, to an automated pipeline that runs (CRON?) when triggered by a green flag from the ETL process. The ETL is being developed by a team of Scala developers to deliver raw data for processing in Parquet.
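
A flag-triggered run could be sketched as below, assuming the ETL signals completion by writing a marker file. The posting does not say what form the "green flag" takes, so the file-based flag and the function name are assumptions. A scheduler such as cron would call this periodically.

```python
from pathlib import Path


def run_if_flagged(flag_path, steps):
    """Run every pipeline step once if the ETL flag exists, then clear the flag.

    Returns True if the pipeline ran, False if there was no flag.
    """
    flag = Path(flag_path)
    if not flag.exists():
        return False
    for step in steps:
        # A failing step raises here, leaving the flag in place for a retry.
        step()
    flag.unlink()
    return True
```

Clearing the flag only after all steps succeed means a crashed run is retried on the next schedule tick rather than silently skipped.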

This process is asynchronous with our SaaS product, which is a FE-BE dyad used by customers. Our SaaS product queries the results output from the pipeline, which are stored in MySQL and ElasticSearch.

We need a senior data scientist to work with our team of talented, but junior, data scientists to develop an automated, optimized production pipeline and create new output tables for a future re-build of the FE-BE SaaS.

Expertise Needed: Python, Spark, AWS, Jupyter, MySQL, ElasticSearch, Parquet. 

Data Sources: post-ETL, all raw data will be in Parquet. Some lookups might be needed to MySQL.

Current Technology Stack: Python, Spark, AWS, Jupyter, MySQL, ElasticSearch, Parquet

Deliverable: Automated  pipeline, sys ops automation on AWS, code base in github, documentation in-code and in Confluence.

In your proposal please tell us more about: 

  • Your experience building algorithm sequences and data pipelines or any other relevant experience.
  • How you would approach this development exercise. 
  • Tell us more about your Python expertise. 
  • Tell us more about your MySQL expertise


$75/hr - $150/hr

Starts Aug 06, 2018

8 Proposals Status: CLOSED

Client: R********* ****

Posted: Aug 06, 2018

Smart Shower : Usage Data Insights

Summary of Product & Product Data

Our digital shower was designed to deliver an ideal consumer shower experience; our goal was to create an engaging product that pleases consumers. The product generates large amounts of data because of its electronic nature, but the data structure was not designed with insight generation in mind.

Strategic Need for Insights

Our organization aspires to be the home water authority. In addition to delivering the best water delivery fixtures, this means providing personalized insights that enhance consumers' experience. The data also gives us the opportunity to elevate our understanding of consumer behavior as it relates to water usage, showering in this case.

The objective of this project is to mine the existing data generated by users of the shower control unit and extract insights that can drive interactions with our consumers and enrich their experience with the product. We also want to learn as much as we can about our consumers' habits and practices, and to reach new insights about showering behaviors in the United States.

Approach and Timeline

We expect this project to be completed within two weeks. We will provide a download of the data and the data dictionary and access to our cloud to get access to the data. We will facilitate a conversation with our teams to clarify any technical or project-related questions. 

Hypothesis / Thought Starters

Shower temperature

o   Shower temperatures by geographic area

o   Profile shower temperatures by season or outside temperature

o   Shower temperature variations by time of day

§  Per consumer cluster

§  For individual consumers

o   How much do shower temperatures vary? Are there any oddities?


Latent Class Clustering of Users 

Using all available attributes (shower length, time of day, temperature, duration, number of showers per day), can clustering techniques identify behavior-based homogeneous user segments (e.g. early-morning quick hot shower-ers are 15% of the market)?

Can we use this to create predictive models for our showering system?
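
As a simpler stand-in for latent class clustering, the sketch below runs plain k-means on two made-up features per user (average shower minutes, usual start hour). Note that k-means is a different technique from latent class analysis, and the features, the choice of k, and the fixed initialization are all assumptions for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; initial centroids are the first k points (no randomness)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up users: (average shower minutes, usual start hour).
users = [(5, 6), (15, 21), (6, 7), (14, 20), (5, 7), (16, 22)]
labels, centroids = kmeans(users, k=2)
```

On real data the features would need scaling to comparable ranges first, and segment shares (such as the 15% example above) would come from counting labels.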


Shower length

o   What is the typical shower length? 

- can we overlay the cost of water in each municipality to get a sense of annual shower cost?

- can we overlay / link water quality data tables?

o   Does it vary per time of day / day of week?

      What are the different shower types? (e.g. most people use only one function, two functions are mainly used in longer evening showers, etc.)

      Are there clusters/profiles of shower types (e.g. time / temperature combination)? How do they differ per type of user?

o   e.g. Users that keep temperature steady for the length of a ~10 min shower

o   e.g. Users that end with a “cold shot” but have a long hot shower

o   e.g. Users that activate all functions and keep them on for the whole duration
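
The annual shower cost question above is simple arithmetic once a flow rate and a local water price are known. A sketch follows, assuming the data provides shower duration and that flow rate and price per gallon come from elsewhere; the example figures are made up.

```python
def annual_shower_cost(minutes_per_shower, flow_gallons_per_minute,
                       showers_per_day, price_per_gallon):
    """Yearly water cost in dollars for one user's showering habit."""
    gallons_per_day = minutes_per_shower * flow_gallons_per_minute * showers_per_day
    return gallons_per_day * 365 * price_per_gallon


# Made-up example: 8-minute showers at 2.0 gal/min, once a day, $0.005 per gallon.
cost = annual_shower_cost(8, 2.0, 1, 0.005)
```

This covers water only; the energy used to heat it would need temperature data and a local energy price.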


o   What types of names are we finding with the presets?

o   What are the most typical presets?

o   What are odd presets?

o   Any fun facts about them:

      e.g. profiles with female names have 10% warmer temperature

      Are there homes with more than one device? If so, what differences are we seeing?

      What is the start water temperature? How does it vary per type of household or region or season?



  • Data Size is approximately 8GB
  • Multiple CSV files with the data schema will be provided. An initial call with the data team after kick-off will also be necessary for alignment and understanding of the data columns.



Deliverables:

  • A comprehensive set of insights based on the questions above backed up by graphs and data analysis
  • A set of recommendations related to data architecture that would make this type of analysis easier in the future, and that could allow us to generate personalized insights on a production basis
  • A set of recommendations of areas of opportunity we may have if we captured additional data



  • Please tell us more about how you would tackle this project and an estimate.
  • We would also like to know your past experience performing similar work 

We would require a non-disclosure agreement to be signed as part of the project.

Consumer Goods and Retail
Customer Behavior Analysis
Consumer Experience

$50/hr - $150/hr

Starts Aug 13, 2018

21 Proposals Status: COMPLETED

Client: M****

Posted: Aug 02, 2018

Matching Providers