

Updated by OpenAI candidates

Full Stack Engineer, Applied Interview Experience
Funnily enough, I work at Replit, and even for our interviews we have candidates use AI. What you're testing now isn't correctness of code, it's whether someone can reason about the code that gets generated, and OpenAI hadn't really adopted that mindset yet.
Interview process
A recruiter reached out to me cold by email, and the process itself matched what they told me almost exactly. I had a quick recruiter screen, then a long phone screen that was really two interviews back to back: a Playground design prompt and a coding problem around credits. After that I did a virtual onsite that was only three rounds, about four hours total, with a refactoring exercise, a past-project presentation with slides, and a leadership conversation. I ended up getting rejected, and the clearest signal I got afterward was that scale mattered a lot, especially in the project presentation round. One weird process detail is that a different recruiter reached out to me again a few months later, so my impression was that their internal tracking and cooldown handling were not especially tight at that point.
- Recruiter screen
- Phone interview
- Final round
Interview tips
I'd prep this as a full-stack product engineering loop, not a pure backend or pure LeetCode loop. In the system design, ask a ton of clarifying questions and be explicit about what you're abstracting, because they'll let you ignore model-serving details if you ask, and they may want you spending more time on UX and wireframes than you'd expect. For coding, be ready for practical business-logic problems and for messy-code refactoring, not just blank-page algorithms. For the project presentation, pick something that shows real complexity and, if possible, real scale. If your project wasn't actually high scale, be ready to explain exactly how it would scale without sounding hand-wavy.
Company culture
They were hiring pretty broadly at the time. The "Applied" group was described to me as basically everything that wasn't core models or research, and they were still running people through a shared process before deciding where they fit and whether they were senior or staff. They were very accurate about what the loop would be, and there was basically no comp pressure during the process beyond a quick explanation that the RSU structure was a little weird because of the company's restructuring. Interviewer engagement was mixed for me, but the leadership interviewer was very dialed in. The strongest themes I saw were that they expect product engineers to be genuinely full stack, and that they care a lot about elegance, about scale, and about how you handle conflict inside a fast-growing company full of very strong personalities.
Questions asked
Overview
The virtual onsite was about four hours total and had three substantive rounds plus a recruiter wrap-up. The coding round was a refactor exercise, not a fresh algorithm problem. The project round required slides on the most technically challenging thing I'd built. The leadership round was very behavioral and much more engaged than some of the engineer rounds.
Specific questions asked
Here is some messy code. Refactor it so it can handle new requirements, while keeping the existing tests passing.
What abstractions or class structure would you introduce so this scales better?
Can you preserve the current behavior and also satisfy the new cases?
The code was something like 100 to 120 lines, but it was painful to read because it was full of deeply nested conditionals. I had to untangle it, keep parity with the tests that already passed, and then make it clean enough to support a few new requirements. What I liked about this round is that it tested abstraction and long-term design thinking more than raw coding speed. It felt much closer to real work than just solving a blank LeetCode problem.
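To give a feel for that kind of exercise, here is a minimal sketch of untangling nested conditionals into a data-driven structure while keeping parity with existing behavior. It is not the actual interview code: the `Order` type, the plan names, and every price and multiplier are hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Order:
    plan: str      # hypothetical plan names: "free", "pro", "enterprise"
    seats: int
    annual: bool


def price_nested(order: Order) -> float:
    """'Before' style: the pricing rules are buried in nested conditionals."""
    if order.plan == "free":
        return 0.0
    else:
        if order.plan == "pro":
            if order.annual:
                return order.seats * 20 * 12 * 0.9
            else:
                return order.seats * 20
        else:
            if order.plan == "enterprise":
                if order.seats > 100:
                    if order.annual:
                        return order.seats * 35 * 12 * 0.8
                    else:
                        return order.seats * 35
                else:
                    if order.annual:
                        return order.seats * 40 * 12 * 0.9
                    else:
                        return order.seats * 40
            else:
                raise ValueError(f"unknown plan: {order.plan}")


@dataclass(frozen=True)
class Plan:
    """One pricing rule: each field is a function of the seat count."""
    seat_price: Callable[[int], int]            # monthly price per seat
    annual_multiplier: Callable[[int], float]   # applied to 12 months of billing


# 'After' style: each plan is a single table entry, not another branch.
PLANS = {
    "free": Plan(lambda seats: 0, lambda seats: 1.0),
    "pro": Plan(lambda seats: 20, lambda seats: 0.9),
    "enterprise": Plan(
        lambda seats: 35 if seats > 100 else 40,
        lambda seats: 0.8 if seats > 100 else 0.9,
    ),
}


def price(order: Order) -> float:
    """Same behavior as price_nested, with one shared calculation path."""
    plan = PLANS.get(order.plan)
    if plan is None:
        raise ValueError(f"unknown plan: {order.plan}")
    monthly = order.seats * plan.seat_price(order.seats)
    if not order.annual:
        return monthly
    return monthly * 12 * plan.annual_multiplier(order.seats)


def parity_mismatches() -> list:
    """Compare old and new implementations across a small grid of inputs."""
    mismatches = []
    for plan in PLANS:
        for seats in (1, 100, 101, 250):
            for annual in (False, True):
                order = Order(plan, seats, annual)
                if price(order) != price_nested(order):
                    mismatches.append(order)
    return mismatches
```

In this sketch, supporting a new requirement such as another plan means adding one entry to `PLANS`, and the parity check plays the role of the existing tests that had to keep passing.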
Why did you make this design decision?
Why store SQL schemas in a vector database instead of something else?
How did you choose the LLM?
What did your eval structure look like?
How would this scale?
I presented an AI inference system I had built, including using a vector database to store SQL schemas. I was pretty upfront that the product was in a slow-growing B2B SaaS context with only a small number of enterprise customers, so we had optimized more for elegance and stability than immediate massive scale. I also talked through how it would scale if usage really took off. The interviewer asked about storage choices, LLM choice, and whether the evals were real or more vibes-based. Later the recruiter told me scale was a big thing they cared about, and I think that round hurt me.
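For readers unfamiliar with the idea, one common reason to keep SQL schemas in a vector store is to pull only the relevant tables into an LLM prompt for a given question. The sketch below illustrates that general pattern and is not a description of the system I presented: the table definitions are invented, and the word-count `embed` function is a toy stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter

# Hypothetical table definitions, invented for illustration only.
SCHEMAS = {
    "orders": "CREATE TABLE orders (id INT, customer_id INT, total_cents INT, created_at TIMESTAMP)",
    "customers": "CREATE TABLE customers (id INT, name TEXT, email TEXT, region TEXT)",
    "invoices": "CREATE TABLE invoices (id INT, order_id INT, due_date DATE, paid BOOLEAN)",
}


def embed(text: str) -> Counter:
    """Toy embedding: a bag of lowercase word counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# The "vector database" here is just an in-memory dict of precomputed vectors.
INDEX = {name: embed(ddl) for name, ddl in SCHEMAS.items()}


def top_schemas(question: str, k: int = 1) -> list:
    """Return the names of the k schemas most similar to the question."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda name: cosine(q, INDEX[name]), reverse=True)
    return ranked[:k]
```

The retrieved schema text would then be placed in the prompt, so the model only sees the tables likely to matter for that question.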
What exactly did the conflict look like?
Why did you choose that approach?
How did the resolution actually take shape?
I walked through a real conflict story and then went deeper on what the disagreement actually was, why I made the choices I did, and how the resolution happened in practice. This was the most engaged interviewer in the whole process. My read was that he cared a lot about whether I could navigate disagreement well, because the company is growing fast and has a lot of very strong people, so operating through conflict matters a lot there.
