Data Engineer Interview Questions

Review this list of 154 data engineer interview questions and answers verified by hiring managers and candidates.
  • Adobe logoAsked at Adobe 
    +17

    "We can use dictionary to store cache items so that our read / write operations will be O(1). Each time we read or update an existing record, we have to ensure the item is moved to the back of the cache. This will allow us to evict the first item in the cache whenever the cache is full and we need to add new records also making our eviction O(1) Instead of normal dictionary, we will use ordered dictionary to store cache items. This will allow us to efficiently move items to back of the cache a"

    Alfred O. - "We can use dictionary to store cache items so that our read / write operations will be O(1). Each time we read or update an existing record, we have to ensure the item is moved to the back of the cache. This will allow us to evict the first item in the cache whenever the cache is full and we need to add new records also making our eviction O(1) Instead of normal dictionary, we will use ordered dictionary to store cache items. This will allow us to efficiently move items to back of the cache a"See full answer

    Data Engineer
    Data Structures & Algorithms
    +6 more
  • Amazon logoAsked at Amazon 
    Video answer for 'Tell me about a skill you recently learned.'
    +47

    "What are they looking for in the answer? "

    Astro S. - "What are they looking for in the answer? "See full answer

    Data Engineer
    Behavioral
    +1 more
  • "it is really good explanation thanks it is really good explanation thanks"

    Amney M. - "it is really good explanation thanks it is really good explanation thanks"See full answer

    Data Engineer
    Coding
    +4 more
  • "SELECT s.Sale_Date, SUM(si.Quantity * si.SalePrice) AS TotalRevenue FROM Sales s JOIN SaleItems si ON s.SaleID = si.Sale_ID GROUP BY s.Sale_Date ORDER BY s.Sale_Date; "

    Bala G. - "SELECT s.Sale_Date, SUM(si.Quantity * si.SalePrice) AS TotalRevenue FROM Sales s JOIN SaleItems si ON s.SaleID = si.Sale_ID GROUP BY s.Sale_Date ORDER BY s.Sale_Date; "See full answer

    Data Engineer
    Coding
    +1 more
  • +2

    "Always assume good intentions on the part of both parties when resolving conflicts. Then proceed with a STAR example."

    Abhinav M. - "Always assume good intentions on the part of both parties when resolving conflicts. Then proceed with a STAR example."See full answer

    Data Engineer
    Behavioral
    +2 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • +24

    "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"

    David I. - "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"See full answer

    Data Engineer
    Coding
    +3 more
  • +37

    "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"

    Bradley E. - "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"See full answer

    Data Engineer
    Coding
    +3 more
  • +52

    "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"

    Vysali K. - "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"See full answer

    Data Engineer
    Coding
    +3 more
  • +10

    "Would be better to adjust resolution in the video player directly."

    Anonymous Prawn - "Would be better to adjust resolution in the video player directly."See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Adobe logoAsked at Adobe 

    "Use a representative of each, e.g. sort the string and add it to the value of a hashmap> where we put all the words that belong to the same anagram together."

    Gaston B. - "Use a representative of each, e.g. sort the string and add it to the value of a hashmap> where we put all the words that belong to the same anagram together."See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Apple logoAsked at Apple 
    +16

    "we can use two pointer + set like maintain i,j and also insert jth character to set like while set size is equal to our window j-i+1 then maximize our answer and increase jth pointer till last index"

    Kishor J. - "we can use two pointer + set like maintain i,j and also insert jth character to set like while set size is equal to our window j-i+1 then maximize our answer and increase jth pointer till last index"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • "How do you find consecutive days for login (MySQL, SQL, date, subquery, MySQL 5.7, development)? 1 Follow Request Answer More All related (34) Recommended 📷 Trausti Thor Johannsson · Follow Been using MySQL for more than 16 yearsDec 27 There are functions like DATEDIFF but there are also BETWE"

    Hayatu H. - "How do you find consecutive days for login (MySQL, SQL, date, subquery, MySQL 5.7, development)? 1 Follow Request Answer More All related (34) Recommended 📷 Trausti Thor Johannsson · Follow Been using MySQL for more than 16 yearsDec 27 There are functions like DATEDIFF but there are also BETWE"See full answer

    Data Engineer
    Coding
    +1 more
  • +25

    "SELECT d.name as departmentname,e.id as employeeid,e.firstname,e.lastname,MAX(e.salary) as salary FROM employees e LEFT JOIN departments d ON e.department_id=d.id GROUP BY department_name ORDER BY department_name;"

    Anisha S. - "SELECT d.name as departmentname,e.id as employeeid,e.firstname,e.lastname,MAX(e.salary) as salary FROM employees e LEFT JOIN departments d ON e.department_id=d.id GROUP BY department_name ORDER BY department_name;"See full answer

    Data Engineer
    Coding
    +3 more
  • Atlassian logoAsked at Atlassian 
    Data Engineer
    Behavioral
    +7 more
  • Adobe logoAsked at Adobe 
    +22

    "#inplace reversal without inbuilt functions def reverseString(s): chars = list(s) l, r = 0, len(s)-1 while l < r: chars[l],chars[r] = chars[r],chars[l] l += 1 r -= 1 reversed = "".join(chars) return reversed "

    Anonymous Possum - "#inplace reversal without inbuilt functions def reverseString(s): chars = list(s) l, r = 0, len(s)-1 while l < r: chars[l],chars[r] = chars[r],chars[l] l += 1 r -= 1 reversed = "".join(chars) return reversed "See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Meta (Facebook) logoAsked at Meta (Facebook) 
    Video answer for 'Merge Intervals'
    +37

    "const mergeIntervals = (intervals) => { const compare = (a, b) => { if(a[0] b[0]) return 1 else if(a[0] === b[0]) { return a[1] - b[1] } } let current = [] const result = [] const sorted = intervals.sort(compare) for(let i = 0; i = b[0]) current[1] = b[1] els"

    Kofi N. - "const mergeIntervals = (intervals) => { const compare = (a, b) => { if(a[0] b[0]) return 1 else if(a[0] === b[0]) { return a[1] - b[1] } } let current = [] const result = [] const sorted = intervals.sort(compare) for(let i = 0; i = b[0]) current[1] = b[1] els"See full answer

    Data Engineer
    Data Structures & Algorithms
    +6 more
  • +17

    "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"

    Akshay D. - "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"See full answer

    Data Engineer
    Coding
    +3 more
  • OpenAI logoAsked at OpenAI 
    Data Engineer
    Behavioral
    +5 more
  • "Missing Item - User ordered multiple items, few items are missing Wrong Item - Entire order is wrong / there are items in the order that were never ordered How is this measured ? CSAT Missing Items Wrong Items Step 1 : Collect data on orders that reported missing / wrong items. Dive deep to understand if the problem is isolated to a specific metro/zip code/restaurant type (say fast food vs fine dine), time of day (lunch vs dinner), tenure of the courier on th"

    Saurabh K. - "Missing Item - User ordered multiple items, few items are missing Wrong Item - Entire order is wrong / there are items in the order that were never ordered How is this measured ? CSAT Missing Items Wrong Items Step 1 : Collect data on orders that reported missing / wrong items. Dive deep to understand if the problem is isolated to a specific metro/zip code/restaurant type (say fast food vs fine dine), time of day (lunch vs dinner), tenure of the courier on th"See full answer

    Data Engineer
    Statistics & Experimentation
  • "What do all data scientists need to know about how to work with very large datasets? 37 Follow Request Answer More All related (39) Recommended 📷 Corrin Lakeland · Follow , M.S. Data Science, University of St. Thomas, St. Paul (2018)6yData Science consultant and managerUpvoted by[Tom Halloin](https://www.quora"

    Hayatu H. - "What do all data scientists need to know about how to work with very large datasets? 37 Follow Request Answer More All related (39) Recommended 📷 Corrin Lakeland · Follow , M.S. Data Science, University of St. Thomas, St. Paul (2018)6yData Science consultant and managerUpvoted by[Tom Halloin](https://www.quora"See full answer

    Data Engineer
    Data Modeling
Showing 21-40 of 154