Image Similarity and Classification Models

Industry: Real Estate

Specialization Or Business Function

Technical Function: Analytics (Machine Learning, Image Analysis, Deep Learning), Software and Web Development (Scripts & Utilities, Information Extraction Web Scraping System)

Technology & Tools: Programming Languages and Frameworks (JavaScript, R, Python), Machine Learning Frameworks (TensorFlow, Keras)

WORK IN PROGRESS

Project Description

We are a technology company that uses machine learning and data science to deliver a remarkably better leasing experience for commercial real estate.

 

We're looking for someone to develop image categorization and similarity detection models. Specifically, we want to:

  • Identify images that contain certain specific watermarks with very high confidence
  • Identify images that are substantially similar to other images we have (e.g., cropped or re-compressed copies)
  • Tag images based on whether they are building interiors, are building exteriors, are street maps, are floor plans or architectural drawings, contain added text (as opposed to text that naturally occurs within the image), or contain a watermark in general (not just the specific watermarks listed above)
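As a rough illustration of the tagging requirement, the sketch below treats each tag as an independent yes/no decision, so one image can carry several tags at once. It assumes a model that outputs a score between 0 and 1 per tag. The tag names, the `assign_tags` helper, and every threshold value are hypothetical placeholders, with a stricter cutoff for the specific watermarks to reflect the "very high confidence" requirement.

```python
from typing import Dict, List

# Hypothetical tag names based on the list above. The thresholds are
# illustrative assumptions, not tuned values.
THRESHOLDS: Dict[str, float] = {
    "interior": 0.5,
    "exterior": 0.5,
    "street_map": 0.5,
    "floor_plan": 0.5,
    "added_text": 0.5,
    "any_watermark": 0.5,
    "specific_watermark": 0.99,  # stricter cutoff: "very high confidence"
}


def assign_tags(scores: Dict[str, float],
                thresholds: Dict[str, float] = THRESHOLDS) -> List[str]:
    """Return every tag whose score meets its own threshold.

    Tags are not mutually exclusive, so each one is decided independently.
    A tag missing from `scores` is treated as a score of 0.0.
    """
    return sorted(
        tag for tag, cutoff in thresholds.items()
        if scores.get(tag, 0.0) >= cutoff
    )


# Made-up scores for one image, standing in for real model output.
example_scores = {
    "exterior": 0.91,
    "added_text": 0.72,
    "any_watermark": 0.88,
    "specific_watermark": 0.93,  # below 0.99, so not reported
}
example_tags = assign_tags(example_scores)
```

In practice the cutoffs would be chosen per tag from a labeled validation set, trading false positives against missed detections.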

 

Deliverable: One or more operationalizable models. We're guessing the right approach will be some combination of TensorFlow/Keras/OpenCV, but we're open to alternatives. Given working models that meet our requirements, we will handle the associated tasks to put them into production (e.g., creating web service wrappers, deployment, and permissions).

 


For the similarity search, we want to compute an image fingerprint of some kind and use an efficient algorithm (FAISS? Annoy?) to find similar images within a corpus of hundreds of thousands to several million images. Computing fingerprints of new images, adding fingerprints to the index, and searching the index should all be fast (a few seconds at worst).
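To make the fingerprint-and-search idea concrete, here is a minimal sketch using a difference hash (dHash), one common perceptual fingerprint, compared by Hamming distance. It is only one possible approach and an assumption on our part, not a chosen design. Images are represented as plain lists of grayscale rows, and `shrink`, `dhash`, `hamming`, and `nearest` are hypothetical helper names. The linear scan in `nearest` stands in for a real index such as FAISS or Annoy, which a corpus of millions of images would need.

```python
from typing import Dict, List, Tuple

Gray = List[List[float]]  # rows of grayscale pixel values (assumed input format)


def shrink(img: Gray, out_w: int, out_h: int) -> Gray:
    """Downscale by averaging the source pixels that fall into each output cell."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for r in range(out_h):
        r0 = r * in_h // out_h
        r1 = max((r + 1) * in_h // out_h, r0 + 1)
        row = []
        for c in range(out_w):
            c0 = c * in_w // out_w
            c1 = max((c + 1) * in_w // out_w, c0 + 1)
            cell = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(cell) / len(cell))
        out.append(row)
    return out


def dhash(img: Gray, size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pair of cells
    in a (size + 1) x size thumbnail, giving size * size bits."""
    small = shrink(img, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def nearest(query: int, index: Dict[str, int],
            max_dist: int = 10) -> List[Tuple[str, int]]:
    """Brute-force search: every indexed image within max_dist bits, closest first."""
    hits = [(name, hamming(query, fp)) for name, fp in index.items()]
    return sorted((h for h in hits if h[1] <= max_dist), key=lambda h: h[1])


# Synthetic demo: two opposite horizontal gradients and a brightened copy of the first.
img_a = [[float(x) for x in range(72)] for _ in range(64)]
img_b = [[float(71 - x) for x in range(72)] for _ in range(64)]
img_a_bright = [[v + 10.0 for v in row] for row in img_a]

index = {"a": dhash(img_a), "b": dhash(img_b)}
matches = nearest(dhash(img_a_bright), index)  # [("a", 0)]
```

A hash of this kind tends to survive re-compression and brightness changes but is much weaker against heavy cropping, so a learned embedding searched with FAISS or Annoy may be the better fit for the cropping case.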



The watermark and classification models can be trained offline, but we will need some way to refresh them periodically. At runtime, classification should ideally take no more than a few seconds per image. Faster is better.



Models will run on AWS. Images are in S3 and metadata is in MySQL, but we're happy to create extracts for training. We can provide training images, and our team can tag images as needed.

 

In your proposal tell us more about:

 

  • Your past experience performing similar work
  • How you would tackle this project, including your methodology and an estimate

Project Overview

  • Posted
    October 19, 2018
  • Planned Start
    January 21, 2019
  • Preferred Location
    From anywhere

Client Overview

  • T*****

  • Projects
    100% Awarded (1 of 1)
