
Upgrade & Optimize Jupyter/Python Prototype for Production Environment

Industry: Real Estate, Software

Specialization or Business Function: R&D (Performance Analysis, Design Optimization)

Technical Function: Analytics (What if/Scenario Analysis, Data Mining, Real-time Analytics, Machine Learning, Time Series Analysis, Descriptive Analysis, Spatial Analysis, Location Analytics)

Technology & Tools: Business Intelligence and Visualization, Big Data and Cloud (Amazon Elastic MapReduce, MySQL, Amazon Kinesis, Elasticsearch, AWS Elastic Beanstalk, AWS Identity & Access Mgmt (IAM), Amazon EC2, Amazon Web Services, Linux), Data Analysis and AI Tools, Programming Languages and Frameworks (Scala, Go, R, Python, PySpark), Mapping and GIS (Mapbox)

CLOSED FOR BIDDING

Project Description

About Us:

We provide rich analytics for customer workplace optimization. Our SaaS offering uses AI-powered analytics to enable customers to:

  • Replace outdated methods of measuring office space utilization
  • Leverage better analytics to optimize their workspace
  • Enhance employee collaboration and productivity, and
  • Increase the effectiveness of real estate spend.

Our SaaS product helps migrate clients away from the historically time-consuming, and often error-prone, manual data gathering and analysis required for workplace optimization metrics such as employee attendance and real estate total cost of occupancy (“TCO”). 

The platform:

1. Aggregates and leverages real-time office use data that already exists

  • sources include: badge data, WiFi infrastructure, lighting infrastructure, building information, IoT sensors, calendar data, etc.

2. Harmonizes multiple overlapping and complementary data sources using state-of-the-art data science (including machine learning algorithms and advanced statistical methods)

3. Produces robust analytics and actionable insights:

  • average and peak usage
  • underutilization vs. congestion at the campus, building, floor, or business unit level
  • intra- and inter-departmental collaboration
  • conference room occupancy and availability
  • avoided costs/potential savings
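As an illustration of the first metric in the list above, here is a minimal sketch of how average and peak usage per floor could be computed from headcount readings. It is not our production code: the function name `usage_summary`, the `(floor_id, headcount)` input shape, and the seat-capacity lookup are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def usage_summary(
    readings: Iterable[Tuple[str, int]],
    capacity: Dict[str, int],
) -> Dict[str, Dict[str, float]]:
    """Return average and peak utilization (as fractions of capacity) per floor.

    `readings` holds (floor_id, headcount) pairs, one per time slot (assumed shape).
    `capacity` maps each floor_id to its seat count (assumed lookup).
    """
    # Group the headcounts by floor.
    by_floor = defaultdict(list)
    for floor_id, headcount in readings:
        by_floor[floor_id].append(headcount)

    summary = {}
    for floor_id, counts in by_floor.items():
        seats = capacity[floor_id]
        summary[floor_id] = {
            "average": mean(counts) / seats,  # mean headcount over all slots
            "peak": max(counts) / seats,      # busiest single slot
        }
    return summary
```

Called with made-up readings such as `[("F1", 20), ("F1", 60), ("F1", 40)]` and a capacity of 100 seats for `F1`, the function reports an average utilization of about 0.4 and a peak of about 0.6. The same grouping could be applied at the campus, building, or business unit level by changing the key.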

We have targeted our data analytics platform at Fortune 500 & Global 2000 corporate tenants with many knowledge workers and large amounts of leased or owner-occupied office space, such as Cisco, ExxonMobil, Lenovo, Dell EMC, BP, AbbVie, Comcast, T. Rowe Price, Hilton and Uber, among others.

The problem:

Upgrade our existing manually run pipeline, which consists of a family of Jupyter notebooks and Python scripts, to an automated pipeline that runs (possibly via cron) when triggered by a green flag from the ETL process. The ETL is being developed by a team of Scala developers and will deliver raw data for processing in Parquet.
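One way the "green flag" trigger could work is sketched below. The assumptions: the ETL drops a marker file into each finished batch directory (called `_SUCCESS` here, the name Hadoop and Spark jobs commonly write on completion), and a cron-scheduled script calls the function periodically. The `_PROCESSED` marker, the directory layout, and the function name `run_ready_batches` are hypothetical and not part of our current code base.

```python
from pathlib import Path
from typing import Callable, List

READY_FLAG = "_SUCCESS"    # assumed name of the ETL "green flag" marker file
DONE_FLAG = "_PROCESSED"   # hypothetical marker written by this script


def run_ready_batches(root: Path, pipeline: Callable[[Path], None]) -> List[Path]:
    """Run `pipeline` once for every batch directory the ETL has flagged.

    A batch is ready when it contains READY_FLAG and has not yet been
    marked with DONE_FLAG, so repeated cron invocations do not rerun it.
    """
    processed = []
    for batch in sorted(p for p in root.iterdir() if p.is_dir()):
        if not (batch / READY_FLAG).exists():
            continue  # ETL has not finished this batch yet
        if (batch / DONE_FLAG).exists():
            continue  # already handled by an earlier run
        pipeline(batch)
        (batch / DONE_FLAG).touch()  # reached only if pipeline did not raise
        processed.append(batch)
    return processed
```

Because the done marker is written only after the pipeline returns, a batch whose run fails is retried on the next invocation. A real deployment on AWS would more likely react to an event or check object storage than poll a local directory; the polling version is shown only because it is the simplest to reason about.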

This process is asynchronous with our SaaS product, which is a front-end/back-end (FE-BE) pair used by customers. Our SaaS product queries the results output by the pipeline, which are stored in MySQL and ElasticSearch.

We need a senior data scientist to work with our team of talented, but junior, data scientists to develop an automated, optimized production pipeline and create new output tables for a future re-build of the FE-BE SaaS.

Expertise Needed: Python, Spark, AWS, Jupyter, MySQL, ElasticSearch, Parquet. 

Data Sources: Post-ETL, all raw data will be in Parquet. Some lookups against MySQL might be needed.

Current Technology Stack: Python, Spark, AWS, Jupyter, MySQL, ElasticSearch, Parquet

Deliverable: Automated pipeline, sysops automation on AWS, code base in GitHub, documentation in-code and in Confluence.

In your proposal, please tell us more about:

  • Your experience building algorithm sequences and data pipelines, or any other relevant experience.
  • How you would approach this development exercise.
  • Your Python expertise.
  • Your MySQL expertise.

Project Overview

  • Posted: August 06, 2018
  • Planned Start: August 06, 2018
  • Delivery Date: September 14, 2018
  • Preferred Location: United States

EXPERTISE REQUIRED
Python
Spark
Jupyter
Parquet
MySQL
