Data Engineer Interview Questions

Review this list of 154 data engineer interview questions and answers verified by hiring managers and candidates.
  • Data Engineer
    Data Modeling
  • Apple logoAsked at Apple 
    +16

    "function isValid(s) { const stack = []; for (let i=0; i < s.length; i++) { const char = s.charAt(i); if (['(', '{', '['].includes(char)) { stack.push(char); } else { const top = stack.pop(); if ((char === ')' && top !== '(') || (char === '}' && top !== '{') || (char === ']' && top !== '[')) { return false; } } } return stack.length === 0"

    Tiago R. - "function isValid(s) { const stack = []; for (let i=0; i < s.length; i++) { const char = s.charAt(i); if (['(', '{', '['].includes(char)) { stack.push(char); } else { const top = stack.pop(); if ((char === ')' && top !== '(') || (char === '}' && top !== '{') || (char === ']' && top !== '[')) { return false; } } } return stack.length === 0"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Amazon logoAsked at Amazon 
    +4

    "Any cycle would cause the prerequisite to be greater than the course. This passes all the tests: function canFinish(_numCourses, prerequisites) { for (const [a, b] of prerequisites) { if (b > a) return false } return true } `"

    Jeremy D. - "Any cycle would cause the prerequisite to be greater than the course. This passes all the tests: function canFinish(_numCourses, prerequisites) { for (const [a, b] of prerequisites) { if (b > a) return false } return true } `"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • +1

    "This is yet another classic case of evolution of data landscape to account for diversities in the data formats sacrificing restrictive but key components at first and added later to make the solution more effective. Data warehouse -> Data Lake -> Data Lakehouse (Data Lake + Data Warehouse) Data warehouse - A solution to store data in central place (analytics (read) heavy) with stringent schema (structured). Very useful for historical queries and analytics. Schema on write check. Only used for"

    Karthik R. - "This is yet another classic case of evolution of data landscape to account for diversities in the data formats sacrificing restrictive but key components at first and added later to make the solution more effective. Data warehouse -> Data Lake -> Data Lakehouse (Data Lake + Data Warehouse) Data warehouse - A solution to store data in central place (analytics (read) heavy) with stringent schema (structured). Very useful for historical queries and analytics. Schema on write check. Only used for"See full answer

    Data Engineer
    Data Pipeline Design
  • Adobe logoAsked at Adobe 
    Video answer for 'Move all zeros to the end of an array.'
    +49

    "Initialize left pointer: Set a left pointer left to 0. Iterate through the array: Iterate through the array from left to right. If the current element is not 0, swap it with the element at the left pointer and increment left. Time complexity: O(n). The loop iterates through the entire array once, making it linear time. Space complexity: O(1). The algorithm operates in-place, modifying the input array directly without using additional data structures. "

    Avon T. - "Initialize left pointer: Set a left pointer left to 0. Iterate through the array: Iterate through the array from left to right. If the current element is not 0, swap it with the element at the left pointer and increment left. Time complexity: O(n). The loop iterates through the entire array once, making it linear time. Space complexity: O(1). The algorithm operates in-place, modifying the input array directly without using additional data structures. "See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • +12

    "-- Write your query here with cte as ( SELECT e.Emp_ID, e.First_Name, e.Middle_Name, e.Last_Name, e.Manager_ID, e.Country, s.Salary as original_salary, CASE WHEN Country = 'IRELAND' THEN s.Salary*1.09 WHEN Country = 'INDIA' THEN s.Salary*0.012 ELSE s.Salary END AS Salary FROM tbl_Employee e LEFT JOIN tbl_EmployeeSalary s ON e.Emp_ID = s.Emp"

    Abhishek K. - "-- Write your query here with cte as ( SELECT e.Emp_ID, e.First_Name, e.Middle_Name, e.Last_Name, e.Manager_ID, e.Country, s.Salary as original_salary, CASE WHEN Country = 'IRELAND' THEN s.Salary*1.09 WHEN Country = 'INDIA' THEN s.Salary*0.012 ELSE s.Salary END AS Salary FROM tbl_Employee e LEFT JOIN tbl_EmployeeSalary s ON e.Emp_ID = s.Emp"See full answer

    Data Engineer
    Coding
    +3 more
  • Visa logoAsked at Visa 

    "There are couple of reasons for it - Kind of role : Its a product manager role loaded with analytical work, So working with data in stringent regulatory guideline make it more exciting and thrilling. Location & industry is like - Cherry on the cake, Bangalore weather and BFI is at its all time peak as people spending behavior is changing continuously, it will be interesting to see big giants like visa are managing it."

    Nidhi S. - "There are couple of reasons for it - Kind of role : Its a product manager role loaded with analytical work, So working with data in stringent regulatory guideline make it more exciting and thrilling. Location & industry is like - Cherry on the cake, Bangalore weather and BFI is at its all time peak as people spending behavior is changing continuously, it will be interesting to see big giants like visa are managing it."See full answer

    Data Engineer
    Behavioral
    +4 more
  • +20

    "select name, stock from products p left join transactions t on p.id = t.product_id order by date desc limit 1"

    Daniel C. - "select name, stock from products p left join transactions t on p.id = t.product_id order by date desc limit 1"See full answer

    Data Engineer
    Coding
    +3 more
  • " A couple of years ago, we were working on a project to integrate a new third-party data feed into our existing data processing pipeline. This data feed was critical for enhancing our trading algorithms with more comprehensive market data. Given the tight timeline and high stakes, I decided to push for a rapid implementation. In my eagerness to meet the deadline, I underestimated the complexity of integrating this new data feed. I did not allocate sufficient time for thorough testing and valida"

    Scott S. - " A couple of years ago, we were working on a project to integrate a new third-party data feed into our existing data processing pipeline. This data feed was critical for enhancing our trading algorithms with more comprehensive market data. Given the tight timeline and high stakes, I decided to push for a rapid implementation. In my eagerness to meet the deadline, I underestimated the complexity of integrating this new data feed. I did not allocate sufficient time for thorough testing and valida"See full answer

    Data Engineer
    Behavioral
    +2 more
  • +13

    "The user table no longer exists as expected - I get an error that user does not contain user_id. Note that querying the table results in only user:swuoevkivrjfta select * FROM user `"

    Evan R. - "The user table no longer exists as expected - I get an error that user does not contain user_id. Note that querying the table results in only user:swuoevkivrjfta select * FROM user `"See full answer

    Data Engineer
    Coding
    +3 more
  • Google logoAsked at Google 
    +2

    "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"

    Yashasvi V. - "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"See full answer

    Data Engineer
    Coding
    +4 more
  • "Firstly, congratulations to both the interviewer and interviewee. This was a great learning experience However, being a Full Stack engineer and I was having the following suggestions around the Data Model - Driver & Approval can be two different tables Approval & Document - Approval can be a tuple of (userid,documentid) - comments against a rejection (marks the document which triggers rejection)In this way we can capture the entire history of approval workflow (initiate/pending/appr"

    Nilanjan D. - "Firstly, congratulations to both the interviewer and interviewee. This was a great learning experience However, being a Full Stack engineer and I was having the following suggestions around the Data Model - Driver & Approval can be two different tables Approval & Document - Approval can be a tuple of (userid,documentid) - comments against a rejection (marks the document which triggers rejection)In this way we can capture the entire history of approval workflow (initiate/pending/appr"See full answer

    Data Engineer
    Data Modeling
  • Adobe logoAsked at Adobe 
    +25

    "There is a faster approach that solves the problem in O(n) time: def find_duplicates(arr1, arr2): arr1 = set(arr1) res = [] for num in arr2: if num in arr1: res.append(num) return res `"

    Victor H. - "There is a faster approach that solves the problem in O(n) time: def find_duplicates(arr1, arr2): arr1 = set(arr1) res = [] for num in arr2: if num in arr1: res.append(num) return res `"See full answer

    Data Engineer
    Data Structures & Algorithms
    +2 more
  • Adobe logoAsked at Adobe 
    Video answer for 'Find the median of two sorted arrays.'
    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Adobe logoAsked at Adobe 
    Video answer for 'Product of Array Except Self'
    +46

    "If 0's aren't a concern, couldn't we just multiply all numbers. and then divide product by each number in the list ? if there's more than one zero, then we just return an array of 0s if there's one zero, then we just replace 0 with product and rest 0s. what am i missing?"

    Sachin R. - "If 0's aren't a concern, couldn't we just multiply all numbers. and then divide product by each number in the list ? if there's more than one zero, then we just return an array of 0s if there's one zero, then we just replace 0 with product and rest 0s. what am i missing?"See full answer

    Data Engineer
    Data Structures & Algorithms
    +3 more
  • Adobe logoAsked at Adobe 
    +6

    " function climbStairs(n) { // 4 iterations of Dynamic Programming solutions: // Step 1: Recursive: // if (n <= 2) return n // return climbStairs(n-1) + climbStairs(n-2) // Step 2: Top-down Memoization // const memo = {0:0, 1:1, 2:2} // function f(x) { // if (x in memo) return memo[x] // memo[x] = f(x-1) + f(x-2) // return memo[x] // } // return f(n) // Step 3: Bottom-up Tabulation // const tab = [0,1,2] // f"

    Matthew K. - " function climbStairs(n) { // 4 iterations of Dynamic Programming solutions: // Step 1: Recursive: // if (n <= 2) return n // return climbStairs(n-1) + climbStairs(n-2) // Step 2: Top-down Memoization // const memo = {0:0, 1:1, 2:2} // function f(x) { // if (x in memo) return memo[x] // memo[x] = f(x-1) + f(x-2) // return memo[x] // } // return f(n) // Step 3: Bottom-up Tabulation // const tab = [0,1,2] // f"See full answer

    Data Engineer
    Data Structures & Algorithms
    +3 more
  • Google logoAsked at Google 
    Video answer for 'Design a high-tech gym.'
    +1

    "[2:53 pm, 02/12/2021] Mayank: Before we deep dive into brainstorming the solution I go ahead and make few assumptions for clarifying questions: Step-1- Framing a problem 🎯Why users go to gym? To relieve stress by doing exercise To maintain their body To reduce their weight To remain active To make good physique 🎯Who goes to gym? Couples Group of friends Individuals Here we are trying to design high tech gym so it means we are looking to create good experience-and"

    Mayank S. - "[2:53 pm, 02/12/2021] Mayank: Before we deep dive into brainstorming the solution I go ahead and make few assumptions for clarifying questions: Step-1- Framing a problem 🎯Why users go to gym? To relieve stress by doing exercise To maintain their body To reduce their weight To remain active To make good physique 🎯Who goes to gym? Couples Group of friends Individuals Here we are trying to design high tech gym so it means we are looking to create good experience-and"See full answer

    Data Engineer
    Product Design
  • +8

    "In the question it says: "above the overall average total posts", which to me implying a >, yet in the solution it uses >= Caused me 1 hr to find out. plz fix"

    Peter W. - "In the question it says: "above the overall average total posts", which to me implying a >, yet in the solution it uses >= Caused me 1 hr to find out. plz fix"See full answer

    Data Engineer
    Coding
    +3 more
  • Amazon logoAsked at Amazon 

    "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"

    Erjan G. - "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"See full answer

    Data Engineer
    Coding
    +4 more
Showing 41-60 of 154