Industry: Financial Services
Specialization or Business Function: Finance, R&D
Technical Function: Analytics (Data Preparation)
Technology & Tools: Big Data and Cloud (MySQL), Data Warehouse Appliances, Programming Languages and Frameworks (C#, Python)
You will clean our historical end-of-day financial dataset, which has 8 fields: date, symbolid, symbol, open, high, low, close, and volume. The dataset contains about 17M records. Most records are clean; however, some contain NULLs, zeros, or otherwise invalid values, and there are some duplicate records. See the attached sample_data.csv.
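To make the kinds of problems above concrete, here is a minimal Python sketch of per-record checks. It is an illustration only, not a required approach. The function names (check_row, clean), the YYYY-MM-DD date format, the use of (date, symbolid) as the uniqueness key, and the sample rows are all assumptions; the real layout is whatever sample_data.csv contains.

```python
import csv
import io
import math
from datetime import datetime

PRICE_FIELDS = ("open", "high", "low", "close")

def check_row(row):
    """Return a list of problems found in one end-of-day record (empty if clean)."""
    problems = []
    try:
        datetime.strptime(row["date"], "%Y-%m-%d")  # assumed date format
    except (ValueError, TypeError):
        problems.append("bad date")
    prices = {}
    for name in PRICE_FIELDS:
        try:
            value = float(row[name])  # "NULL", "" or None all fail here
        except (ValueError, TypeError):
            problems.append(f"{name} missing or non-numeric")
            continue
        if not math.isfinite(value) or value <= 0:
            problems.append(f"{name} is zero, negative or not finite")
        else:
            prices[name] = value
    # Range checks only make sense when all four prices parsed cleanly.
    if len(prices) == 4:
        if prices["high"] < max(prices["open"], prices["close"], prices["low"]):
            problems.append("high below another price")
        if prices["low"] > min(prices["open"], prices["close"]):
            problems.append("low above open or close")
    try:
        if int(float(row["volume"])) < 0:  # zero volume is allowed (assumption)
            problems.append("negative volume")
    except (ValueError, TypeError, OverflowError):
        problems.append("volume missing or non-numeric")
    return problems

def clean(rows):
    """Split rows into (kept, rejected); rejected holds (row, problems) pairs."""
    seen = set()
    kept, rejected = [], []
    for row in rows:
        key = (row["date"], row["symbolid"])  # assumed unique key
        problems = check_row(row)
        if key in seen:
            problems.append("duplicate")
        if problems:
            rejected.append((row, problems))
        else:
            seen.add(key)
            kept.append(row)
    return kept, rejected

# Made-up rows for illustration; not taken from sample_data.csv.
sample = """date,symbolid,symbol,open,high,low,close,volume
2015-01-02,1,AAA,10.0,10.5,9.8,10.2,1000
2015-01-02,1,AAA,10.0,10.5,9.8,10.2,1000
2015-01-05,1,AAA,NULL,10.9,10.1,10.4,1200
2015-01-06,1,AAA,10.4,10.2,10.0,10.3,900
2015-01-07,1,AAA,10.3,10.6,0,10.5,1100
"""
kept, rejected = clean(csv.DictReader(io.StringIO(sample)))
for row, problems in rejected:
    print(row["date"], problems)
```

Rejected rows are kept alongside their reasons rather than dropped, so they can later be repaired from another source.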
This project involves reviewing the full CSV of this data and extracting supplemental data from Xignite for comparison. You will also gather free end-of-day data from Yahoo Finance and Google Finance as an additional comparison. The datasets must then be compared. The goal is to create a single, most accurate dataset, which will likely be a combination of the sources.
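One possible way to combine the sources, shown here only as a sketch, is to take the median value per (date, symbol) and flag any source that disagrees with it by more than a tolerance. The reconcile function, the 1% tolerance, the source labels, and the sample values are hypothetical, and no provider API is called; each source is assumed to be already loaded into a dictionary.

```python
from statistics import median

def reconcile(sources, tolerance=0.01):
    """Combine close prices from several sources.

    sources maps a source name to {(date, symbol): close}.
    Returns (merged, disputed): merged holds the median per key, and
    disputed lists the sources that differ from it by more than
    `tolerance` (relative, 1% by default; an assumed threshold).
    """
    merged, disputed = {}, {}
    keys = set().union(*(data.keys() for data in sources.values()))
    for key in sorted(keys):
        values = {
            name: data[key]
            for name, data in sources.items()
            if data.get(key) is not None
        }
        if not values:
            continue  # no source has a usable value for this key
        consensus = median(values.values())
        merged[key] = consensus
        outliers = sorted(
            name for name, value in values.items()
            if abs(value - consensus) > tolerance * abs(consensus)
        )
        if outliers:
            disputed[key] = outliers
    return merged, disputed

# Made-up values for illustration only.
sources = {
    "internal": {("2015-01-02", "AAA"): 10.2, ("2015-01-05", "AAA"): 0.0},
    "xignite": {("2015-01-02", "AAA"): 10.2, ("2015-01-05", "AAA"): 10.4},
    "yahoo": {("2015-01-02", "AAA"): 10.2, ("2015-01-05", "AAA"): 10.4},
}
merged, disputed = reconcile(sources)
print(merged)
print(disputed)
```

With only two sources the median is simply their average, so disagreements in that case need manual review rather than an automatic pick.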
Qualifications: Quality oriented. Programming and database skills (SQL, XML, and some method of connecting to APIs, e.g. Python or C#). Detail-oriented data cleaning experience. Financial industry background or interest in the stock market is helpful.
In your proposal, please state how many hours you think this project will require.