Data Engineer
Background:
2020 Elections. Climate Change. Meat Substitutes. #MeToo. Issues from the global economy to geopolitical dynamics are continually evolving in a chain reaction of events. It’s almost a cliché by now to cite rising unpredictability as a major concern for global businesses. From the far-reaching effects of elections to the daily impacts of shifting regulations and consumer interests, global trends increasingly shape business decisions that affect the bottom line.
At CTRL Global, we aim to build the most accurate representation of the discrete issues facing companies, municipalities, and institutions, and to deliver prescriptive insights that guide decisions on actions and resource allocation.
As a data engineer at CTRL Global, you will work on predicting trends across a wide array of categories and issues.
Requirements:
– 2–5 years of experience with machine learning frameworks, data visualization, and NLP
– Strong hands-on experience with scripting languages and relational database programming for statistical and scientific computing, including Python and SQL
– Practical experience designing and managing ETL/ELT data pipelines on Google Cloud infrastructure such as Google BigQuery and Cloud Functions
– Excellent understanding of a variety of machine learning techniques (embedding-space modeling, clustering, decision tree learning, neural networks, GLM/regression, random forests)
– Exposure to the challenges of applying statistics in a business setting, such as incomplete data, biased data, large datasets, low signal-to-noise ratios, high variance, and multiple objective functions
– Familiarity with the varied formats of channel-level datasets, macro-economic indicators, and other broad-based consumer data sources
Role:
– Draw on a broad background of data-mining techniques spanning mathematics, statistics, information technology, machine learning, data engineering, design of experiments, visualization, and text mining to discover insightful patterns in data.
– Work with the team to ensure the availability of all relevant datasets; explore the data and develop a thorough understanding of it; document the entire process.
– Build out and maintain data pipelines.
– Carry out statistical and mathematical modeling and create forecasting models.
– Performance metrics: on-time delivery; codebase moving to production; operational performance trends.