Skip to main content

Data Engineer Interview Questions

Review this list of 170 Data Engineer interview questions and answers verified by hiring managers and candidates.
  • Visa logoAsked at Visa 
    1 answer

    "There are couple of reasons for it - Kind of role : Its a product manager role loaded with analytical work, So working with data in stringent regulatory guideline make it more exciting and thrilling. Location & industry is like - Cherry on the cake, Bangalore weather and BFI is at its all time peak as people spending behavior is changing continuously, it will be interesting to see big giants like visa are managing it."

    Nidhi S. - "There are couple of reasons for it - Kind of role : Its a product manager role loaded with analytical work, So working with data in stringent regulatory guideline make it more exciting and thrilling. Location & industry is like - Cherry on the cake, Bangalore weather and BFI is at its all time peak as people spending behavior is changing continuously, it will be interesting to see big giants like visa are managing it."See full answer

    Data Engineer
    Behavioral
    +4 more
  • Adobe logoAsked at Adobe 
    Add answer
    Video answer for 'Find the median of two sorted arrays.'
    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Adobe logoAsked at Adobe 
    18 answers
    Video answer for 'Given an nxn grid of 1s and 0s, return the number of islands in the input.'
    +15

    " from typing import List def getnumberof_islands(binaryMatrix: List[List[int]]) -> int: if not binaryMatrix: return 0 rows = len(binaryMatrix) cols = len(binaryMatrix[0]) islands = 0 for r in range(rows): for c in range(cols): if binaryMatrixr == 1: islands += 1 dfs(binaryMatrix, r, c) return islands def dfs(grid, r, c): if ( r = len(grid) "

    Rick E. - " from typing import List def getnumberof_islands(binaryMatrix: List[List[int]]) -> int: if not binaryMatrix: return 0 rows = len(binaryMatrix) cols = len(binaryMatrix[0]) islands = 0 for r in range(rows): for c in range(cols): if binaryMatrixr == 1: islands += 1 dfs(binaryMatrix, r, c) return islands def dfs(grid, r, c): if ( r = len(grid) "See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • DoorDash logoAsked at DoorDash 
    1 answer

    "Missing Item - User ordered multiple items, few items are missing Wrong Item - Entire order is wrong / there are items in the order that were never ordered How is this measured ? CSAT Missing Items Wrong Items Step 1 : Collect data on orders that reported missing / wrong items. Dive deep to understand if the problem is isolated to a specific metro/zip code/restaurant type (say fast food vs fine dine), time of day (lunch vs dinner), tenure of the courier on th"

    Saurabh K. - "Missing Item - User ordered multiple items, few items are missing Wrong Item - Entire order is wrong / there are items in the order that were never ordered How is this measured ? CSAT Missing Items Wrong Items Step 1 : Collect data on orders that reported missing / wrong items. Dive deep to understand if the problem is isolated to a specific metro/zip code/restaurant type (say fast food vs fine dine), time of day (lunch vs dinner), tenure of the courier on th"See full answer

    Data Engineer
    Statistics & Experimentation
  • Amazon logoAsked at Amazon 
    6 answers

    "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"

    Erjan G. - "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"See full answer

    Data Engineer
    Coding
    +4 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • Adobe logoAsked at Adobe 
    23 answers
    Video answer for 'Find a triplet in an array with a given sum.'
    +17

    "from typing import List def three_sum(nums: List[int]) -> List[List[int]]: nums.sort() triplets = set() for i in range(len(nums) - 2): firstNum = nums[i] l = i + 1 r = len(nums) - 1 while l 0: r -= 1 elif potentialSum < 0: l += 1 "

    Anonymous Roadrunner - "from typing import List def three_sum(nums: List[int]) -> List[List[int]]: nums.sort() triplets = set() for i in range(len(nums) - 2): firstNum = nums[i] l = i + 1 r = len(nums) - 1 while l 0: r -= 1 elif potentialSum < 0: l += 1 "See full answer

    Data Engineer
    Data Structures & Algorithms
    +3 more
  • 1 answer
    Video answer for 'Design a data warehouse schema for Instagram.'
    Data Engineer
    Data Modeling
  • 14 answers
    +10

    " select user_id, b.marketing_channel from user_sessions a Left join attribution b on b.sessionid = a.sessionid group by 1,2 HAVING sum(purchasevalue)>100 and min(adclick_timestamp) `"

    G B. - " select user_id, b.marketing_channel from user_sessions a Left join attribution b on b.sessionid = a.sessionid group by 1,2 HAVING sum(purchasevalue)>100 and min(adclick_timestamp) `"See full answer

    Data Engineer
    Coding
    +3 more
  • Apple logoAsked at Apple 
    17 answers
    +12

    "class ListNode: def init(self, val=0, next=None): self.val = val self.next = next def has_cycle(head: ListNode) -> bool: slow, fast = head, head while fast and fast.next: slow = slow.next fast = fast.next.next if slow == fast: return True return False debug your code below node1 = ListNode(1) node2 = ListNode(2) node3 = ListNode(3) node4 = ListNode(4) creates a linked list with a cycle: 1 -> 2 -> 3 -> 4"

    Anonymous Roadrunner - "class ListNode: def init(self, val=0, next=None): self.val = val self.next = next def has_cycle(head: ListNode) -> bool: slow, fast = head, head while fast and fast.next: slow = slow.next fast = fast.next.next if slow == fast: return True return False debug your code below node1 = ListNode(1) node2 = ListNode(2) node3 = ListNode(3) node4 = ListNode(4) creates a linked list with a cycle: 1 -> 2 -> 3 -> 4"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • 3 answers
    Video answer for 'Design a Data Warehouse Schema for a Ride-Sharing Service'

    "Firstly, congratulations to both the interviewer and interviewee. This was a great learning experience However, being a Full Stack engineer and I was having the following suggestions around the Data Model - Driver & Approval can be two different tables Approval & Document - Approval can be a tuple of (userid,documentid) - comments against a rejection (marks the document which triggers rejection)In this way we can capture the entire history of approval workflow (initiate/pending/appr"

    Nilanjan D. - "Firstly, congratulations to both the interviewer and interviewee. This was a great learning experience However, being a Full Stack engineer and I was having the following suggestions around the Data Model - Driver & Approval can be two different tables Approval & Document - Approval can be a tuple of (userid,documentid) - comments against a rejection (marks the document which triggers rejection)In this way we can capture the entire history of approval workflow (initiate/pending/appr"See full answer

    Data Engineer
    Data Modeling
  • Adobe logoAsked at Adobe 
    12 answers
    +8

    "function findPrimes(n) { if (n < 2) return []; const primes = []; for (let i=2; i <= n; i++) { const half = Math.floor(i/2); let isPrime = true; for (let prime of primes) { if (i % prime === 0) { isPrime = false; break; } } if (isPrime) { primes.push(i); } } return primes; } `"

    Tiago R. - "function findPrimes(n) { if (n < 2) return []; const primes = []; for (let i=2; i <= n; i++) { const half = Math.floor(i/2); let isPrime = true; for (let prime of primes) { if (i % prime === 0) { isPrime = false; break; } } if (isPrime) { primes.push(i); } } return primes; } `"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Add answer
    Video answer for 'Design an ETL Pipeline for a ML Platform for AWS'
    Data Engineer
    Data Pipeline Design
  • Airbnb logoAsked at Airbnb 
    Add answer
    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Adobe logoAsked at Adobe 
    14 answers
    Video answer for 'Generate Parentheses'
    +9

    " O(n) time from typing import List def generate_parentheses(n: int): res = [] def generate(buf, opened, closed): if len(buf) == 2 * n: if n != 0: res.append(buf) return if opened < n: generate( buf + "(", opened + 1, closed) if closed < opened: generate(buf + ")", opened, closed + 1) generate("", 0, 0) return res debug your code below print(generate_parentheses(1"

    Rick E. - " O(n) time from typing import List def generate_parentheses(n: int): res = [] def generate(buf, opened, closed): if len(buf) == 2 * n: if n != 0: res.append(buf) return if opened < n: generate( buf + "(", opened + 1, closed) if closed < opened: generate(buf + ")", opened, closed + 1) generate("", 0, 0) return res debug your code below print(generate_parentheses(1"See full answer

    Data Engineer
    Data Structures & Algorithms
    +3 more
  • "What do all data scientists need to know about how to work with very large datasets? 37 Follow Request Answer More All related (39) Recommended 📷 Corrin Lakeland · Follow , M.S. Data Science, University of St. Thomas, St. Paul (2018)6yData Science consultant and managerUpvoted by[Tom Halloin](https://www.quora"

    Hayatu H. - "What do all data scientists need to know about how to work with very large datasets? 37 Follow Request Answer More All related (39) Recommended 📷 Corrin Lakeland · Follow , M.S. Data Science, University of St. Thomas, St. Paul (2018)6yData Science consultant and managerUpvoted by[Tom Halloin](https://www.quora"See full answer

    Data Engineer
    Data Modeling
  • 15 answers
    +12

    " with youngsuccrate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as yascrate from post where userid in (select userid from post_user where age between 0 and 18) group by post_month ), nonyoungsucc_rate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as nonyasc_rate from post where user_id in (select"

    Bhavna S. - " with youngsuccrate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as yascrate from post where userid in (select userid from post_user where age between 0 and 18) group by post_month ), nonyoungsucc_rate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as nonyasc_rate from post where user_id in (select"See full answer

    Data Engineer
    Coding
    +3 more
  • OpenAI logoAsked at OpenAI 
    Add answer
    Data Engineer
    Behavioral
    +5 more
  • Anthropic logoAsked at Anthropic 
    Add answer
    Data Engineer
    Behavioral
    +1 more
  • Capital One logoAsked at Capital One 
    Add answer
    Data Engineer
    Data Structures & Algorithms
    +2 more
Showing 61-80 of 170
Exponent

Get updates in your inbox with the latest tips, job listings, and more.

Follow Us

Products
Courses
Interview Questions
Interview Experiences
Popular articles
Guides
Coaching
For Partners
Company
Exponent © 2026
Terms of Service | Privacy