
Design & Implementation of a System to Query a Massive Dataset in Real Time

Industry Hi-Tech

Specialization Or Business Function R&D (Performance Analysis, Design Optimization, Product Segmentation and Clustering, Unmet Need Analysis)

Technical Function Analytics (Predictive Modeling, Real-time Analytics, In-Memory Analytics, Machine Learning)

Technology & Tools

CLOSED FOR BIDDING

Project Description

We need to design and implement a system that lets us query a massive dataset in real time. We are looking for a Big Data architect to help design the system, using technologies that scale to millions of users, and to implement the solution together with our team. We are free to organize and structure the dataset however the design requires. Here is the problem:

If a user wants to explore a node of interest, x1, with or without filters, we would like to show the incoming and outgoing paths, with aggregations at each node.

 

x12 —> x9 —> x1 —> x2 —> x1 is a path

x12 —> x9 —> x1 —> x3 —> x8 is a path
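
To make the requirement concrete, the two example paths can be aggregated around the node of interest. The sketch below is only an illustration under simplifying assumptions: it looks one hop in each direction, it holds all paths in memory (which the real dataset would not allow), and the names `paths` and `neighbor_counts` are hypothetical.

```python
from collections import Counter

# The two example paths from the description, as ordered lists of nodes.
paths = [
    ["x12", "x9", "x1", "x2", "x1"],
    ["x12", "x9", "x1", "x3", "x8"],
]


def neighbor_counts(paths, node):
    """Count the nodes that directly precede and follow `node` in all paths."""
    incoming, outgoing = Counter(), Counter()
    for path in paths:
        for i, current in enumerate(path):
            if current != node:
                continue
            if i > 0:                      # there is a step leading into `node`
                incoming[path[i - 1]] += 1
            if i + 1 < len(path):          # there is a step leading out of `node`
                outgoing[path[i + 1]] += 1
    return incoming, outgoing


incoming, outgoing = neighbor_counts(paths, "x1")
```

Here `incoming` holds the counts shown to the left of x1 and `outgoing` the counts shown to its right. Extending the same idea to full multi-hop paths is the part that needs a scalable design.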

 

We want to show counts next to each node, both with and without filters applied. Each node has its own set of filters (for x1 the filters f1, f2 and f3..., for x3 the filters fa, fb, f2, f3… and so on).

 

An example of a filter (f45) is model=BMW.

An example of a node (x1) is action=TestDroveCar.

 

The data can be thought of as nodes stored together with their filter attributes:

x1   f1  f2  f56  f45

x9  fx fy

x12 fa fx f2
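
One way to read this layout is as records that carry a node label plus filter attributes, so a count can be taken with or without filters. The following is a minimal sketch, not a storage proposal: the `events` list, the `count_node` helper, and the sample values "Audi" and "RequestedQuote" are made up for illustration; only action=TestDroveCar and model=BMW come from the description.

```python
# Hypothetical in-memory records: a node (action) plus its filter attributes.
events = [
    {"action": "TestDroveCar", "model": "BMW"},
    {"action": "TestDroveCar", "model": "Audi"},    # invented sample value
    {"action": "TestDroveCar", "model": "BMW"},
    {"action": "RequestedQuote", "model": "BMW"},   # invented sample node
]


def count_node(events, action, **filters):
    """Count events for one node, optionally restricted by filter attributes."""
    return sum(
        1
        for event in events
        if event["action"] == action
        and all(event.get(key) == value for key, value in filters.items())
    )


total = count_node(events, "TestDroveCar")                   # without filters
filtered = count_node(events, "TestDroveCar", model="BMW")   # with filter f45
```

The two numbers correspond to the "without filters" and "with filters" counts that would be displayed next to a node.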

 

// pseudo query

select count(x) from table where
(
  (x = x1 and f1 = f1v and f2 = f2v) OR
  (x = x2 and f45 = f45v) OR
  (x = x3)
)
and
x1Timestamp < x2Timestamp and x2Timestamp < x3Timestamp

// alternatively we could use intersect
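
The intent of the pseudo query (x1, then x2 with a filter, then x3, in timestamp order) can be tried out on a toy table. This is a hedged sketch only: the `events` table, the `user_id`, `ts` and `model` columns, and the sample rows are assumptions that do not appear in the description, the filters on x1 are omitted for brevity, and SQLite stands in for whatever store the final design would use. The ordering is expressed with self-joins rather than a chained comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (user_id INTEGER, node TEXT, ts INTEGER, model TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        # user 1: x1 -> x2 -> x3 in order, x2 has model=BMW (should match)
        (1, "x1", 10, "BMW"), (1, "x2", 20, "BMW"), (1, "x3", 30, "BMW"),
        # user 2: x3 happens before x2 (wrong order, should not match)
        (2, "x1", 10, "BMW"), (2, "x3", 20, "BMW"), (2, "x2", 30, "BMW"),
        # user 3: right order, but x2 fails the model filter
        (3, "x1", 10, "Audi"), (3, "x2", 20, "Audi"), (3, "x3", 30, "Audi"),
    ],
)

# Each alias is one step of the path; the join conditions enforce
# x1Timestamp < x2Timestamp < x3Timestamp for the same user.
query = """
SELECT COUNT(DISTINCT a.user_id)
FROM events AS a
JOIN events AS b ON b.user_id = a.user_id AND b.ts > a.ts
JOIN events AS c ON c.user_id = b.user_id AND c.ts > b.ts
WHERE a.node = 'x1'
  AND b.node = 'x2' AND b.model = ?
  AND c.node = 'x3'
"""
(matching_users,) = conn.execute(query, ("BMW",)).fetchone()
```

Only user 1 satisfies both the ordering and the filter. A self-join like this grows expensive as paths get longer, which is one reason the choice of data layout matters for the real system.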

 

Project Overview

  • Posted
    October 28, 2014
  • Planned Start
    November 03, 2014
  • Preferred Location
    From anywhere
