- NEW YORK
- UNITED STATES
Salary: $250,000 - $350,000 base + guaranteed Year 1 bonus (could be 100% to 500%) + sign-on. Total comp is elite tier.
Primary Skills - pytorch, programming, mit, ml, gpu, python, cuda, deep learning, machine learning
Secondary Skills - technical demonstrations, prototyping, model architecture, decision-making, model design, mathematics, ai, experimentation, full stack
Work Experience - 3-15 yrs
Remote Status - No Remote
Client Willing to Sponsor - Yes
Degree - University - Bachelor's Degree/3-4 Year Degree
Major - Only Computer Science, Machine Learning, Mathematics, or a related quantitative field
Relocation Paid - Yes
Recruit From - Nationwide
Salary Details:
Overall Comp - $500,000 - $1.5m
Base salary $250,000 - $300,000 + Guaranteed Bonus + Relocation
Job Description:
Applied ML Systems Engineer - Finance (New York)
A confidential, highly capitalized financial institution in midtown Manhattan is building out a specialized engineering group focused on accelerating its use of modern machine learning across the organization. This is not a bank, not a startup - it is a well-established, technology-first firm where engineering and research are central to the business, not support functions. They need someone who lives in the gap between research papers and running systems. The kind of engineer who reads a new arXiv paper on Monday and has a working version running on GPUs by Thursday - not to publish, but to prove whether it's worth building for real.
Your days will look different from week to week. Some weeks you'll be deep in GPU kernels trying to shave training time. Other weeks you'll be whiteboarding system designs with researchers who have ideas but need an engineer to make them real. The constant is that everything you build gets used - this is not a lab environment where prototypes sit on a shelf.
What the work actually looks like:
- You'll spend significant time in low-level performance work - writing custom GPU code, optimizing memory usage, finding the bottlenecks that keep models from training at the speed the business needs
- When the research team identifies a promising new technique, you're the person who builds the first working version and stress-tests it against real data to see if the theory holds up in practice
- You'll own the design of training infrastructure - figuring out how to let researchers run more experiments faster, without blowing up compute costs or introducing instability
- Some of your prototypes will graduate into production systems that run continuously. You'll be involved in that transition, making sure the thing you built in a week can survive running for a year
- You'll have real input into which technical directions the group pursues - this isn't a ticket-taking role. If you see a better approach, you're expected to make the case and build the proof
Qualifications
- You've actually built ML systems that went to production - not just trained models in notebooks. You know the difference between "it works on my machine" and "it works at scale, reliably, for months." Your resume must show clear evidence of "ML Systems," "Training Infrastructure," "Distributed Training," "GPU Optimization," "Model Performance," etc.
- Must currently work, or have previously worked, at one of: Google (especially Brain, DeepMind, Ads ML, Infra); Meta (FAIR, Infra, Recsys); Amazon (AWS AI, Alexa AI); Apple (ML platform teams); Microsoft (Azure AI, Research, Core AI); AI Labs: OpenAI, Anthropic, Cohere, Mistral, DeepMind; Top Quant / Trading Firms: Two Sigma, Jane Street, Hudson River Trading, Jump Trading, Tower Research, IMC, DRW; High-End Startups: Scale AI, Databricks, Snowflake, Stripe (ML + infra roles), Airbnb / Uber
- Seeking ML Engineers, Infra Engineers working on ML systems, and engineers focused on training, performance, or distributed systems
- Strong undergraduate academic pedigree is required. Must be from: MIT, Stanford, Harvard, Princeton, Caltech, UC Berkeley, Carnegie Mellon, University of Chicago, Columbia, Cornell, UPenn, Yale, UCLA, UC San Diego, Georgia Tech, University of Washington, UT Austin, University of Michigan, Duke, or Northwestern, with a degree in CS, Math, Physics, EE, or a related quantitative field and advanced coursework, research involvement, systems, or ML-heavy projects. Ideally with publications, strong internships during undergrad, and evidence of building things, not just studying them
- You're comfortable working at the hardware level when needed - GPU programming, memory optimization, custom kernels. You don't treat the infrastructure as someone else's problem
- You can move fast without cutting corners. The ability to prototype quickly is essential, but so is knowing when something needs to be built properly
- You've partnered with researchers before and know how to translate "I think this architecture might work" into "here's a running system that proves it does (or doesn't)"
- You stay current with ML research not because someone tells you to, but because you're genuinely curious about what's new and how it could be applied
Why is This a Great Opportunity
The compensation alone makes this worth a conversation - base salary is $250k - $350k, and there's a guaranteed bonus in your first year on top of that, plus a sign-on. Total comp is genuinely best-in-class.
But the real draw is the work itself. You'll be solving problems that most ML engineers only read about. The datasets are massive and proprietary, the compute budget is effectively uncapped for good ideas, and the people around you are operating at an extremely high level.
This is also a firm that actually values engineers - not just as support for the research team, but as equals who shape the technical direction. Your prototypes don't go into a backlog; they get evaluated and deployed, often within weeks.
Benefits are comprehensive: 401(k) matching, sign-on bonus, medical/prescription coverage, wellness reimbursement, family building support, and charitable gift matching.
If you're the kind of engineer who wants to work on hard problems that matter, with people who are the best at what they do, in an environment where what you build has immediate real-world consequences - this is it.