Data Science - Analytics
We’re the leading provider of digital identity management and transaction fraud detection. We help innovative FinTech companies and top-tier banks manage KYC, AML, and other components of client onboarding. Alloy’s API gives clients real-time access to over 50 third-party data sources to improve decision-making and streamline client experiences. We're backed by venture capital firms that have taken countless companies to IPO, including Canapi, Bessemer Venture Partners, Primary Venture Partners, and Eniac Ventures. We are well-positioned to bring on incredibly talented individuals who can help take us to the next level!
Why we’re hiring
Alloy is growing rapidly, which means more clients, more questions, and more data! As part of the Data Team, you will scale and improve existing processes for our growing client base, and you will dive into new areas as we expand into multi-product offerings.
You’ll work with a modern data science stack - Postgres, Python (pandas, scikit-learn, Jupyter), and AWS - on a team that values collaboration, reproducible work, and good software engineering practice. You’ll be a valued member of a friendly and experienced engineering team, working closely with backend and frontend engineers. Our data team's culture prizes curiosity, learning, connecting data to Alloy's strategy, open-source tools, rigor, and a spirit of relentless resourcefulness.
If you are analytical, good at tackling big, complex questions, and drawn to the pace and career-growth opportunities of a fast-growing, early-stage startup, we'd like to meet you!
What you'll be doing
You'll work across the Machine Learning and Analytics verticals of the Data team on a mix of client-facing and internal projects. You'll work with deployed machine learning models and on strategic client analyses that shape the onboarding decisions made by top-20 US banks. More specifically, in this role you will:
Support risk and fraud detection efforts:
- analyze the performance of fraud detection models with an eye toward improving precision and recall
- scale up our data pipeline, helping expand the coverage of our fraud models
Generate internal and client-facing data analysis:
- provide internal and external stakeholders with key insights surrounding client use of the product
- enable clients to better understand how ruleset changes will impact evaluation outcomes, fraud capture rates, and false positives
Improve automation and monitoring:
- regularly validate the performance of our models on recent applications and trigger retraining as new training data becomes available
- instrument model inputs and outputs to detect meaningful or unexpected changes in pipelines
- approach analyses with a focus on reproducibility - automation is key, and if one client has a question, it's likely others do too
Who we're looking for
Required
- Technical:
  - Content knowledge equivalent to an undergraduate degree in statistics / machine learning, whether through a formal degree, a bootcamp, or self-guided study
- Python
- Core data science libraries (numpy, pandas, scikit-learn)
- Core data visualization and graphing techniques
  - Core SQL (selects, joins, inserts)
  - Git
- Non-Technical
  - Curious and humble
  - Excellent teammates / friendly collaborators
  - Quick learners
  - Strong verbal communicators who can explain nuanced technical concepts to a variety of audiences
  - Strong writers, good at communicating insights gleaned from exploratory analyses
- "product sense" and a desire to understand Alloy's business, strategy and priorities
Nice to Have
- Finance experience
- Prior startup or product management experience
- AWS (S3, Lambda, SageMaker, Athena)
- Airflow
- Advanced SQL (Postgres PL/pgSQL, functions, materialized views, window functions)
- Other relevant data science experience (R, Julia, etc.)
- Causal inference / econometric coursework
- Probabilistic programming experience (Stan, PyMC)
- Command line / bash scripting experience
- Emacs & Vim dorks welcome
- Experience with modular Python packaging and shared, version-controlled repos