Predict the Trustworthiness and Background of a Person Using Multiple Sources of Data

Industry: Professional Services

Specialization or Business Function: Risk and Compliance (Fraud Identification and Prevention)

Technical Function: Analytics (Machine Learning)

CLOSED FOR BIDDING

Project Description

Summary

We would like to build a product that accepts some basic identifying information about a person of interest and then uses both traditional and non-traditional public data to determine whether that person is safe and trustworthy.

We want to do this by replicating the steps a great private investigator takes when performing this very task for clients. A great private investigator will typically:

  • Obtain some basic identifying parameters about the person of interest.   This typically includes things like a photo, name, date of birth, email address, birth place, last known city and state, etc.
  • Run a background check on the person of interest using TLO or Checker or similar. These background checks are fairly commoditized and primarily cover state and county criminal conviction records and other traditional public records.
  • Check all public social media going back as many years as possible to see if the person posted anything racist or otherwise egregious or salacious.
  • Search news and media archives to see if the person was named in a scandal or anything else adverse that wouldn't necessarily show up in a traditional background check.
  • Do a Google search on the person of interest and go several pages deep looking for anything adverse.
  • Search any data that was hacked and then dumped on the dark web. HaveIBeenPwned.com is a good aggregator of this data.
  • Do all of the above not only for the person of interest but also for their 5-10 closest friends and family members or anyone they shared an address or phone number with in the last few years. For example, the person may have a clean record but live with 5 family members who were all convicted of fraud in the last few years, which would imply that the person of interest is at least questionable and not entirely clean.
  • Present all of their findings in a nice, simple report.   This report will have a summary page that looks like a credit report summary page showing some score or grade on trust & safety, a count of how many negative hits there were, a count of how many positive hits there were, and a count of anything neutral or in need of further investigation. The remainder of the report will have the raw or detailed results. See sample reports attached. 
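As a minimal sketch of the summary page described in the last step, the Python below counts negative, positive, and neutral hits and derives a score from them. The names `Verdict`, `Hit`, and `summarize`, and the scoring rule itself, are illustrative assumptions and not part of this posting; the real scoring method would need to be agreed on during the project.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    NEGATIVE = "negative"
    POSITIVE = "positive"
    NEUTRAL = "neutral"  # also covers "needs further investigation"


@dataclass
class Hit:
    source: str   # e.g. "background_check", "news", "breach_data"
    detail: str   # raw or detailed result shown later in the report
    verdict: Verdict


def summarize(hits):
    """Build the summary-page figures from a list of hits."""
    counts = Counter(hit.verdict for hit in hits)
    negative = counts[Verdict.NEGATIVE]
    positive = counts[Verdict.POSITIVE]
    neutral = counts[Verdict.NEUTRAL]
    # Placeholder rule (an assumption): start at 100, subtract 15 per
    # negative hit and 5 per neutral hit, then clamp to the range 0..100.
    score = max(0, min(100, 100 - 15 * negative - 5 * neutral))
    return {
        "score": score,
        "negative": negative,
        "positive": positive,
        "neutral": neutral,
    }


hits = [
    Hit("background_check", "fraud conviction (hypothetical)", Verdict.NEGATIVE),
    Hit("news", "named in an adverse article (hypothetical)", Verdict.NEGATIVE),
    Hit("news", "community award (hypothetical)", Verdict.POSITIVE),
    Hit("breach_data", "email found in a breach (hypothetical)", Verdict.NEUTRAL),
]
summary = summarize(hits)
print(summary)
```

Keeping the detailed `Hit` records separate from the summary lets the rest of the report list the raw results unchanged.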

Scope of Work

We are taking a phased approach, so this specific project will consist of building a minimum viable product (MVP) that demonstrates the basic ability to take some identifying information about a person of interest and return a report showing their trust and safety score or grade, a summary of negative / positive / inconclusive hits, and a verbose list of detailed results.

The MVP should consist of a few main parts:

Data Sources

The MVP should be limited to these data sources:

  • Traditional background check data using TLO or Checker or a similar API (it is OK if this costs money; we want quality data)
  • A news API (Google News? Yahoo! News? AP?)
  • The HaveIBeenPwned data that was recently made available

Algorithm(s)

You will need to come up with some way to match the parameters we get about the person of interest with the data sets that we are searching against.
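One possible starting point for this matching step, sketched in Python, is a weighted comparison of the fields that both the person of interest and a candidate record provide. The field names, the weights, and the use of `difflib.SequenceMatcher` for fuzzy name comparison are assumptions made for illustration; the posting leaves the matching method open.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; real values would be tuned or learned.
WEIGHTS = {"name": 0.5, "date_of_birth": 0.3, "city": 0.2}


def _normalize(value):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(value).lower().split())


def field_similarity(field, a, b):
    """Return a similarity between 0.0 and 1.0 for one field."""
    a, b = _normalize(a), _normalize(b)
    if field == "name":
        # Fuzzy comparison tolerates initials, typos, and small variations.
        return SequenceMatcher(None, a, b).ratio()
    # Other fields are compared exactly in this sketch.
    return 1.0 if a == b else 0.0


def match_score(subject, record):
    """Weighted average similarity over fields present in both inputs."""
    total = 0.0
    weight_sum = 0.0
    for field, weight in WEIGHTS.items():
        if subject.get(field) and record.get(field):
            total += weight * field_similarity(field, subject[field], record[field])
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


subject = {"name": "Jane Q. Doe", "date_of_birth": "1980-02-03", "city": "Austin"}
close = {"name": "jane doe", "date_of_birth": "1980-02-03", "city": "Austin"}
other = {"name": "John Smith", "date_of_birth": "1975-11-30", "city": "Boston"}
print(f"{match_score(subject, close):.0%}", f"{match_score(subject, other):.0%}")
```

Skipping fields that are missing on either side means a sparse record (a news article with only a name, for example) is still scored on what it does contain, at the cost of lower certainty.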

Training / Machine Learning

We will need a way for our employees and investigators to review the result sets and reports, see a percentage for each match showing how accurate you think it is, and either accept the match as accurate or correct it. Each human review should train the algorithm, so that matching improves the more reviews we complete. We assume this will be done through some form of machine learning but are open to specific suggestions.
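To illustrate how reviewer feedback could improve matching over time, here is a small Python sketch of an online logistic regression that nudges its weights after every accept or reject decision. This is only one of many possible approaches; the class name `FeedbackMatcher`, the feature layout, and the learning rate are hypothetical.

```python
import math


class FeedbackMatcher:
    """Online logistic regression updated by reviewer decisions."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def confidence(self, features):
        """Estimated probability (0..1) that a candidate match is correct."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def record_feedback(self, features, accepted):
        """Apply one stochastic gradient step from an accept/reject decision."""
        error = (1.0 if accepted else 0.0) - self.confidence(features)
        self.bias += self.learning_rate * error
        self.weights = [
            w + self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]


# Features per candidate match (assumed): name similarity, DOB match, city match.
model = FeedbackMatcher(n_features=3)
good_match = [0.9, 1.0, 1.0]
bad_match = [0.3, 0.0, 0.0]

initial = model.confidence(good_match)  # untrained model gives 0.5
for _ in range(50):
    model.record_feedback(good_match, accepted=True)
    model.record_feedback(bad_match, accepted=False)

print(f"{model.confidence(good_match):.0%}", f"{model.confidence(bad_match):.0%}")
```

The value returned by `confidence` could serve as the percentage shown next to each match, and each reviewer decision would be passed to `record_feedback`.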

Budget and Timeline

Please provide an estimate of the hours required to build the minimum viable product, which takes data from the three sources and displays the desired report on a dashboard. The user can then mark each match with "thumbs up" or "thumbs down."

In your response, please also provide details of the technology stack you would use. Please keep in mind that the MVP will have a user interface but it need not be polished. 

The budget for the entire version 1, which can be used in production, is USD 100K-150K. We would expect the selected expert to work with us for the entire project; however, we currently need an estimate only for phase 1, i.e. the MVP described above.

Project Overview

  • Posted: December 11, 2016
  • Preferred Location: From anywhere
