
381 Projects that match your criteria

Understanding Our Customer Base Better Through Analysis of Current Base

We have an e-commerce platform and need help analyzing our Google Analytics data to better define our customer.

  • We are a branded premium denim jeans business pivoting to an all-e-commerce, made-in-USA model with our own factory
  • We have a small base of loyal core customers that we need to understand better
  • Digital marketing will be based on Google data analysis that defines our customers so we can target similar ones
  • Available data: a (small) customer list and Google audience data, in any format
  • We are using the BigCommerce platform
  • Goal: understand the current customer base and define a new target base for digital marketing
Consumer Goods and Retail
Customer Acquisition Modeling
Market Segmentation and Targeting

$2,500 - $5,000

Starts Nov 03, 2016

23 Proposals Status: CLOSED

Client: G******* ***** *************

Posted: Nov 02, 2016

Natural Language Processing and Sentiment Analysis of Customer Reviews using RapidMiner

Amway operates in more than 100 countries and is ranked 29th among the largest private companies in the United States. We have been collecting customer reviews and survey data transcribed from phone conversations in different languages. We would like a data scientist well-versed in natural language processing (NLP) to perform sentiment analysis of this data.

Data Format

The customer review data is in unstructured format and contains approximately 2,000 records.  The initial data set is in the Chinese language and surveys from additional languages will be added after the success of the first phase of this project.  A data sample is attached.

Technology Stack

The data will be analyzed by a RapidMiner-certified data scientist and the analysis results will be exported to Tableau.

Data Analysis

We would like to look at the analysis by the slices listed below. For example, can we see the sentiment between months, in different regions, or by different SKUs?

Be able to slice analysis by:

  • Month
  • SKU No.
  • Complaint Code
  • Region
  • Province (?)

Text analysis:

  • Frequency
  • Di/Trigrams
  • Sentiment
  • Word Associations/Correlations/Clusters
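
The frequency and di/trigram items above can be sketched in a few lines. This is a minimal illustration, not the client's method: it assumes each review has already been split into tokens (Chinese text needs a word-segmentation step first, which is not shown), and the sample reviews and the function names `ngrams` and `ngram_counts` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-token windows from one tokenized review."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(reviews: Iterable[List[str]], n: int) -> Counter:
    """Count n-grams across reviews; windows never span two reviews."""
    counts: Counter = Counter()
    for tokens in reviews:
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical, already-tokenized reviews (illustrative only).
sample_reviews = [
    ["delivery", "was", "late"],
    ["delivery", "was", "fast"],
    ["package", "was", "late"],
]

bigram_counts = ngram_counts(sample_reviews, 2)
trigram_counts = ngram_counts(sample_reviews, 3)
print(bigram_counts.most_common(2))
```

Grouping the reviews by month, SKU, or region before calling `ngram_counts` would give the per-slice view requested above.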

The above represents some of our ideas but we encourage data scientists to suggest other approaches.  Please look at the sample data and provide your approach to the analysis and the kind of insights that can be drawn from it.  Please also provide a ball-park estimate of hours this work may take.

Consumer Goods and Retail
Customer Behavior Analysis
Customer Loyalty

$100/hr - $200/hr

Starts Dec 05, 2016

6 Proposals Status: COMPLETED

Client: A*****

Posted: Nov 01, 2016

ML/AI Engineers for Evaluation of Client Engagement

We are supporting a 2-month ML/AI Proof Of Concept initiative for a client in the HR domain. Our team will provide technical expertise in Data Management and ML/AI. 

Data Science skills

  • AlyData is a boutique Data Science/Data Management firm. We are supporting a client that is evaluating ML/AI for their business.
  • Looking for ML/AI engineers to help on a part-time or full-time basis (all virtual)
  • The engineer will recommend the tool and technology stack required, based on customer requirements and problem domain
  • Client will provide data sources and training/prod data (e.g., Excel, JSON, XML, DB)
  • What is your current technology stack? (Hadoop, Oracle, Cloud etc.)
  • What is the deliverable? (Data Wrangling, Data Fitting, Model(s), Algorithms, Model Output, Analysis of the Model Output and key findings explained, Advisory service) Deployed in the cloud
  • Client will provide subject matter experts. 

The ideal candidate would have one or more ML/AI projects under his or her belt, strong data science skills, the ability to work independently, and good communication skills for gathering requirements. You can set your own hours and work remotely, so this is ideal for someone with a full-time job who wishes to moonlight.

Consumer Goods and Retail
Healthcare
Hospitality, Travel and Leisure

$60/hr - $150/hr

Starts Nov 07, 2016

6 Proposals Status: CLOSED

Client: A*******

Posted: Oct 26, 2016

E-Commerce and Financial Services Data Analysis

We are looking for talented data scientists who have both good communication and analytical skills. We look for people who can turn large amounts of data into actionable insights. Specifically, for this project we would ideally like data scientists with experience in e-commerce/retail or financial services data analysis and in marketing statistics.

Here's a bit more on us and what we are looking for:

  • We are a company leveraging data to deliver actionable insights to increase sales.
  • We are looking to take all historical transactions from our businesses and understand best selling products by refined customer segments.
  • We are looking to statistically correlate customer attributes to the different customer segment tiers based on data fields provided.
  • We would ideally like someone with both statistics and data science experience, especially as it relates to retail/e-commerce or financial services.
  • We are primarily working on CSV and Excel files.
  • The deliverable would be a report outlining ranked customer attributes and products based on customer segments.
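
As a rough sketch of the "best selling products by customer segment" ranking described above: the row layout `(segment, product, amount)`, the sample values, and the function name `top_products_by_segment` are assumptions made for illustration, and real input would be read from the CSV or Excel files.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def top_products_by_segment(
    transactions: Iterable[Tuple[str, str, float]], top_n: int = 3
) -> Dict[str, List[Tuple[str, float]]]:
    """Sum sales per (segment, product) and rank products within each segment."""
    revenue: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for segment, product, amount in transactions:
        revenue[segment][product] += amount

    ranked: Dict[str, List[Tuple[str, float]]] = {}
    for segment, by_product in revenue.items():
        # Highest revenue first; ties broken alphabetically for stable output.
        ordered = sorted(by_product.items(), key=lambda kv: (-kv[1], kv[0]))
        ranked[segment] = ordered[:top_n]
    return ranked


# Hypothetical transaction rows: (customer segment, product, sale amount).
rows = [
    ("gold", "jacket", 50.0),
    ("gold", "scarf", 20.0),
    ("gold", "jacket", 30.0),
    ("silver", "scarf", 20.0),
]
ranking = top_products_by_segment(rows, top_n=2)
```
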
Consumer Goods and Retail
Profitability Analysis
Market Research

$100/hr - $150/hr

Starts Nov 01, 2016

21 Proposals Status: IN PROGRESS

Client: S***** **

Posted: Oct 26, 2016

Data Analytics of Corporate Data to Determine Strategic Product, Process and Technology Development Areas

B37 Ventures is a Venture Capital firm. We invest risk capital in early stage technology startups in the USA. Among the investors into our venture fund are several global corporations with complex and sophisticated operations and serving 60 countries with their products and services.

In one sentence, we invest to create strategic value for our corporate investor and a growth advantage for the startup. To slightly expand on that, the model we are proving is that leveraging the unique assets of a large company together with the unique assets of a startup creates value for each player. The corporation gets an otherwise unobtainable competitive advantage through alignment and engagement with the startup. The startup gets an otherwise unobtainable growth advantage through access to the global reach of the corporation. Each party gets something that unleashes valuation.

B37 is now to be given access to a vast amount of data from one corporate investor, a global corporation, including their (i) business intelligence (what was sold, by whom, when and how), (ii) cost to manufacture (supply chain info), and (iii) cost to serve (profitability of each channel, route or product). This data will come from sources like sales’ ERP (Oracle), Maximo, CDE, SATEC, and office tools like email & Yammer.

Our value proposition to this corporate investor is that applying the right analysis to the data will yield results that help B37 make better investment decisions for their strategic needs, and that also serve their articulate and sophisticated corporate “Transformation Objectives”. In short, our corporate investors want to stay out of the “Innovator’s Dilemma”.

“Phase 1” is an advisory service. This phase of the project has two primary objectives: (i) engage a data scientist to consult with B37 and improve our articulation and understanding of what data might have the most utility for our B37 objectives and for our corporate investor, and (ii) define the appropriate parameters and search criteria to meet those objectives. Phase 1 largely consists of advisory services to the B37 Ventures team: an education in how data analytics, machine learning, AI, etc. can be applied to the data we anticipate getting access to and to the outcome described.

I stress that, for the moment, we do not yet have access to the data. We're deep in discussions with one of our corporate investors to receive data, and the advisory service will help us frame our request in the best possible way.

Following Phase 1, we intend to quickly pursue the next Phases (there may be one or more additional Phases), which include (i) accessing the data from our corporate investor (format TBD), (ii) subjecting it to expert analysis/analytics, and (iii) converting the “data” into “information” that has the utility I’ve briefly described above.

All deliverables will come directly and only to the B37 Ventures team.

Note 1:  Many "Categories" are selected below as we'll get access very likely to a wide cross section of the corporation's data.  Together we may decide to narrow and focus on 1 or 2 to start with.

Note 2: The Project Start/End dates and Budget listed below are for Phase 1 as I describe above.

Consumer Goods and Retail
Hi-Tech
Manufacturing

$150/hr - $350/hr

Starts Nov 01, 2016

15 Proposals Status: COMPLETED

Client: B*** ********

Posted: Oct 17, 2016

Entity Extraction from DUI Arrest Logs for Law Firm

We are a Northern California based law firm looking to create an application that will scrape the DUI arrests from arrest logs on these 18 county sites, then scrape the Breeze database and then identify the records that match to build a single file of records found on both websites. The application needs to run once a week and save the record to a webserver. Then when the record is available, it should email the admin with a link to download the record. Sample spreadsheet attached to show info that we're hoping to gather. 
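
The matching step described above (keeping only records found on both sites) could look roughly like the sketch below. It is illustrative only: scraping, scheduling, and emailing are not shown, the field names `name` and `arrest_date` are assumptions, and matching on a normalized name plus date is one simple choice among several.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def match_key(record: Record) -> Tuple[str, str]:
    """Build a comparison key: case- and whitespace-insensitive name plus date."""
    name = " ".join(record["name"].lower().split())
    return (name, record["arrest_date"].strip())


def records_in_both(arrest_logs: List[Record], second_source: List[Record]) -> List[Record]:
    """Return arrest-log records whose key also appears in the second source."""
    keys = {match_key(r) for r in second_source}
    return [r for r in arrest_logs if match_key(r) in keys]


# Hypothetical records using placeholder names (illustrative only).
logs = [
    {"name": "Jane  DOE", "arrest_date": "2016-10-01"},
    {"name": "John Roe", "arrest_date": "2016-10-02"},
]
other = [{"name": "jane doe", "arrest_date": "2016-10-01 "}]
matches = records_in_both(logs, other)
```

Exact-key matching misses spelling variants, so a production version would likely need fuzzier comparison and manual review of near matches.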

Legal
Information Extraction Web Scraping System
Software and Web Development

$80/hr - $120/hr

Starts Jan 24, 2017

14 Proposals Status: COMPLETED

Client: G******* *** **********

Posted: Oct 11, 2016

HR Data Management and Algorithm Design for Predicting Employee and Team Success

We are an HR analytics company building predictive analytics for forecasting teams and success candidates. We are looking for a data scientist to provide consultation on the following:

(a) Define the right inputs and datasets - Number and type of inputs for training the algorithm

(b) Methods to capture data (data scraping from public sites, APIs, creating datasets, open source etc)

(c) Ways to process data, tools needed etc.

(d) Refining our algorithm - # of input variables, accuracy, weights etc  

(e) Methods to visualize the data

We need to define the right predictors of performance based on employee data and provide the relevant output: a definition of the right team (size and skill) and success candidates, with these teams linked to business outcomes.

Deliverables: Defined inputs, algorithm, and dashboard deployed in the cloud

Hi-Tech
Employee Allocation Optimization
Human Resources

$80/hr - $120/hr

Starts Oct 07, 2016

10 Proposals Status: CLOSED

Client: S*******

Posted: Oct 04, 2016

Identify root cause of stalled reaction during manufacturing for one of the largest pharma companies in the world

This project is for Merck KGaA (known as Millipore Sigma in the US and Canada), one of the largest pharmaceutical and chemical companies in the world.

During the manufacturing of a particular product, a reaction is unexpectedly stalling and additional catalyst must be added. This is disruptive to the manufacturing process, and the team would like to understand what is causing it in order to prevent it from occurring in future batches. This product is manufactured infrequently, so only about 30 batches of data are available from the past 3 years. The reaction may occur on either of two pieces of equipment: G144P420 or G144P422. For each batch, time series data (pressure, temperature, etc.) are recorded during the reaction, roughly every second; however, there are gaps. Ideally, we would like to use feature extraction to identify differences between good and bad batches of product.

Attached is sample data for one good batch of product.  Each batch has one file corresponding to it (of the same format as the one attached).  Note that the tag name column refers to the piece of equipment and the metric being collected in German. For instance, "G144P420.Innentemp" means equipment G144P420 and the metric recorded was the Inner Temperature.
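
A minimal sketch of what per-batch feature extraction might look like for one tag, assuming each series is available as `(seconds since batch start, value)` pairs; that layout, the chosen summary statistics, and the function names `split_tag` and `series_features` are assumptions, not part of the attached file format. The largest gap is reported explicitly because the recordings are known to have gaps.

```python
from statistics import mean
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (seconds since batch start, recorded value)


def split_tag(tag: str) -> Tuple[str, str]:
    """Split a tag such as 'G144P420.Innentemp' into (equipment, metric)."""
    equipment, metric = tag.split(".", 1)
    return equipment, metric


def series_features(samples: List[Sample]) -> Dict[str, float]:
    """Summarize one non-empty time series, including its largest sampling gap."""
    ordered = sorted(samples)
    times = [t for t, _ in ordered]
    values = [v for _, v in ordered]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "duration_s": times[-1] - times[0],
        "largest_gap_s": max(gaps) if gaps else 0.0,
    }


# Hypothetical readings with a 4-second gap (illustrative only).
features = series_features([(0.0, 20.0), (1.0, 22.0), (5.0, 24.0)])
```

Computing such a feature dictionary per tag and per batch yields one row per batch, which can then be compared between good and bad batches.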

As a final deliverable, we would like to understand what key features may be driving differences in product quality.

We will not consider proposals over the rate that was specified. 

Manufacturing
Pharmaceutical and Life Sciences
Chemistry

$60/hr - $120/hr

11 Proposals Status: CLOSED

Client: M********* *****

Posted: Sep 27, 2016

Healthcare Paid Claim Data Modeling and Analysis (US-based Candidates)

About the Project:  We would like to have one or more algorithms built that can use large, healthcare paid claim data sets and identify those claims most likely to be part of an accident (motor vehicle, slip and fall, etc), as well as most likely to NOT be part of an accident.  It is possible that there may need to be separate algorithms for the distinct accident types.

About Us:  We perform subrogation-related activities including the identification of  claims that should be paid by another liable party (auto-medical insurance, homeowner’s insurance, etc). Our current identification methodology includes an ETL process that mines paid healthcare claims based on a number of criteria. Selected claims (or a single claim) are aggregated into a case for investigation as to whether or not the claim relates to an accident or injury for which a third party is responsible. 

About the Existing Process:  Currently, our process leverages a proprietary rule set, by which we both identify cases we definitely want to open, as well as claims we know we want to reject.  We review the existing claim on its own merit, as well as claims for the same patient within a reasonable time period, to determine whether the full episode of care appears to be part of an accident.  We augment this advanced process with human review of claims that are classified as “possible” candidates for selection.

About the Data:  We have the claims data, our current rule sets, and our outcomes data, which would be included in this project.

Claims Data includes:

  • Health Plan information
  • Healthcare provider information
  • Patient Information
  • Diagnosis Codes
  • Procedure Codes
  • Occurrence Codes

Existing Case Outcomes Data includes:

  • The determination of whether a case should, or should not, have been paid by a different party
  • Settlements, which are agreements by other parties to repay part or all of the payment made by the health plan
  • Recoveries, which are actual payments made against settlements

We are going to provide an Excel header file which shows, in more detail, the data elements which would be made available during the project.

About the Model:  It is expected that the algorithm will leverage features, both from the claims, and potentially beyond the claims, to predict the likelihood of a claim being part of an accident.  Some potentially predictive features may include:

  • Diagnosis Codes (some ICD-9 and ICD-10 codes specify that the cause was an accident; others may be highly correlated to an accident; still others may help weed out claims that should not be included)
  • Procedure Codes 
  • Patient Age/Gender
  • Patient Zip Code (studies suggest that some states have a significantly higher prevalence of accidents than others)
  • Place of Service (Emergency Room, for example)
  • The use of an ambulance or air-ambulance 
  • Are there claims for multiple covered individuals under the same subscriber for the same DOS at the same provider?  
  • Other third-party data sets that could be incorporated to further augment the prediction (TBD)
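
A few of the candidate features above can be turned into model inputs along these lines. This is a hedged sketch: the claim field names (`diagnosis_codes`, `icd_version`, `place_of_service`, `ambulance_used`) are hypothetical, and the diagnosis check only flags the external-cause code ranges (ICD-10 codes beginning with V, W, X, or Y, and ICD-9 E-codes), which include but are not limited to accidents.

```python
from typing import Dict, List


def has_external_cause_code(codes: List[str], icd_version: int) -> bool:
    """True if any diagnosis code falls in the external-cause range for its version."""
    for raw in codes:
        code = raw.strip().upper()
        if not code:
            continue
        # ICD-10 external causes occupy the V-Y letter range.
        if icd_version == 10 and code[0] in "VWXY":
            return True
        # ICD-9 external causes are the supplementary E-codes.
        if icd_version == 9 and code[0] == "E":
            return True
    return False


def claim_features(claim: Dict) -> Dict[str, int]:
    """Map one claim (hypothetical field names) to binary model features."""
    return {
        "external_cause_dx": int(
            has_external_cause_code(claim["diagnosis_codes"], claim["icd_version"])
        ),
        "emergency_room": int(claim["place_of_service"] == "emergency_room"),
        "ambulance": int(bool(claim["ambulance_used"])),
    }


# Hypothetical claim (illustrative only).
example = claim_features({
    "diagnosis_codes": ["S52.501A", "V43.52XA"],
    "icd_version": 10,
    "place_of_service": "emergency_room",
    "ambulance_used": False,
})
```

The version check matters because a leading "E" means an external cause in ICD-9 but an endocrine or metabolic diagnosis in ICD-10.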

About the Deliverable:  The expected outcome of this project is an algorithm that can be inserted into our existing ETL process and that provides a single key metric: the likelihood of this claim being part of an accident. In addition, the data scientist is expected to show which claims' decisions changed between the actual historical decision and the modeled decision, so that we can estimate profitability lift and be comfortable with the expected outcome of the model.
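
The comparison between historical and modeled decisions could be tallied as in the sketch below. It assumes the model emits a likelihood score per claim and that a cutoff converts the score into an open/reject decision; the `threshold` value, the sample data, and the function name `decision_changes` are illustrative assumptions.

```python
from collections import Counter
from typing import List


def decision_changes(
    historical: List[bool], scores: List[float], threshold: float = 0.5
) -> Counter:
    """Count claims by (historical decision, modeled decision) pair."""
    if len(historical) != len(scores):
        raise ValueError("historical and scores must have the same length")
    changes: Counter = Counter()
    for opened, score in zip(historical, scores):
        modeled = score >= threshold  # modeled decision: open a case or not
        changes[(opened, modeled)] += 1
    return changes


# Hypothetical decisions and model scores (illustrative only).
table = decision_changes([True, True, False, False], [0.9, 0.2, 0.7, 0.1])
changed = table[(True, False)] + table[(False, True)]
```

The off-diagonal cells, `(True, False)` and `(False, True)`, are the claims whose decision would change, which is the population to review when estimating lift.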

Other Notes:  

  • Our technology team would provide access to a secure server location either in AWS or Azure where the data must remain.  We will assist in installation of any and all software needed to perform the model development.
  • The general expectation is that the model will be called during the claim ETL process, but specifics as to the data scientist’s recommended approach to operationalizing the algorithm(s) should be described in their proposal
  • Only U.S. based Data Scientists will be considered

Healthcare
Insurance
Machine Learning

$10,000 - $20,000

Starts Nov 10, 2016

15 Proposals Status: COMPLETED

Net 30

Client: L***********

Posted: Sep 27, 2016

Improving Content Recommendation with Deep Neural Networks

Background: 
Taboola is widely recognized as the world’s leading content discovery platform, reaching 1B unique visitors and serving over 360 billion recommendations every month. Recent ComScore data shows that Taboola is second only to Facebook in terms of reach (https://www.taboola.com/press-release/taboola-crosses-one-billion-user-mark-second-only-facebook-world%E2%80%99s-largest-discovery). 
Publishers, marketers, and agencies leverage Taboola to retain users on their site, monetize their traffic and distribute their content to drive high quality audiences. Publishers using Taboola include USA Today, NYTimes, TMZ, Politico.com, BusinessInsider, CafeMom, Billboard.com, Fox Television, Weather.com, Examiner, and many more. 
Taboola's operation is vast, with ~2,000 servers in 6 data centers processing big data about users, user behavior, content, pages, etc.

What are we looking for?
Taboola is interested in utilizing deep neural networks to improve its predictive capabilities in a number of applications. We are looking for researchers with knowledge and expertise relevant to the use of deep learning for recommendation, Natural Language Processing (inc. speech), and computer vision. More than experts in certain domains, we are looking for people who have experience in utilizing DL techniques to solve new real-world problems, especially where no truth sets exist. These people need to have experience with (or at least knowledge of) the various types of network architectures, in particular the use of embedding techniques (like word2vec) for various types of entities (we do that for users, for instance), RNNs (LSTM, in particular), and CNNs (which are particularly useful for working with images).
The following papers are very relevant for our inquiry: 
http://arxiv.org/abs/1606.07792
http://arxiv.org/pdf/1607.07326.pdf
http://arxiv.org/pdf/1301.3781.pdf

Consumer Goods and Retail
Media and Advertising
R&D

$150/hr - $175/hr

Starts Nov 03, 2016

15 Proposals Status: COMPLETED

Net 60

Client: T*******

Posted: Sep 22, 2016

Matching Providers