Skip to main content

Data Engineer Interview Questions

Review this list of 160 Data Engineer interview questions and answers verified by hiring managers and candidates.
  • Atlassian logoAsked at Atlassian 
    Add answer
    Data Engineer
    Behavioral
    +7 more
  • Google logoAsked at Google 
    2 answers

    "Clarification questions What is the purpose of connecting the DB? Do we expect high-volumes of traffic to hit the DB Do we have scalability or reliability concerns? Format Code -> DB Code -> Cache -> DB API -> Cache -> DB - APIs are built for a purpose and have a specified protocol (GET, POST, DELETE) to speak to the DB. APIs can also use a contract to retrieve information from a DB much faster than code. Load balanced APIs -> Cache -> DB **Aut"

    Aaron W. - "Clarification questions What is the purpose of connecting the DB? Do we expect high-volumes of traffic to hit the DB Do we have scalability or reliability concerns? Format Code -> DB Code -> Cache -> DB API -> Cache -> DB - APIs are built for a purpose and have a specified protocol (GET, POST, DELETE) to speak to the DB. APIs can also use a contract to retrieve information from a DB much faster than code. Load balanced APIs -> Cache -> DB **Aut"See full answer

    Data Engineer
    Concept
    +5 more
  • 1 answer
    Video answer for 'Design a Data Warehouse Schema for Customer Support'

    "not able to understand the accent of the candidate"

    Akash A. - "not able to understand the accent of the candidate"See full answer

    Data Engineer
    Data Modeling
  • Adobe logoAsked at Adobe 
    6 answers
    +3

    "function main(){ const v1=[2,3, 4, 10] const v2= [3,4 ,5,20, 23] return merge(v1,v2); } function merge(left, right){ const result=[]; while(left.length>0&& right.length>0){ if(left[0]0){ result=result.concat(left) } if(right.length>0){ result=result.concat(right) } return result; }"

    Samuel M. - "function main(){ const v1=[2,3, 4, 10] const v2= [3,4 ,5,20, 23] return merge(v1,v2); } function merge(left, right){ const result=[]; while(left.length>0&& right.length>0){ if(left[0]0){ result=result.concat(left) } if(right.length>0){ result=result.concat(right) } return result; }"See full answer

    Data Engineer
    Data Structures & Algorithms
    +5 more
  • Adobe logoAsked at Adobe 
    10 answers
    +6

    "bool isValidBST(TreeNode* root, long min = LONGMIN, long max = LONGMAX){ if (root == NULL) return true; if (root->val val >= max) return false; return isValidBST(root->left, min, root->val) && isValidBST(root->right, root->val, max); } `"

    Alvaro R. - "bool isValidBST(TreeNode* root, long min = LONGMIN, long max = LONGMAX){ if (root == NULL) return true; if (root->val val >= max) return false; return isValidBST(root->left, min, root->val) && isValidBST(root->right, root->val, max); } `"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • Meta logoAsked at Meta 
    1 answer

    "int[] sqSorted(int[] nums) { int i = 0, j = nums.length-1; int k = nums.length-1; int[] sqs = new int[nums.length]; while(i n1) { sqs[k--] = n2; j--; } else { sqs[k--] = n1; i++; } } for(int n: sqs) System.out.println(n); return sqs; }"

    Mahaboob P. - "int[] sqSorted(int[] nums) { int i = 0, j = nums.length-1; int k = nums.length-1; int[] sqs = new int[nums.length]; while(i n1) { sqs[k--] = n2; j--; } else { sqs[k--] = n1; i++; } } for(int n: sqs) System.out.println(n); return sqs; }"See full answer

    Data Engineer
    Data Structures & Algorithms
    +2 more
  • Apple logoAsked at Apple 
    7 answers
    +3

    "This could be done using two-pointer approach assuming array is sorted: left and right pointers. We need track two sums (left and right) as we move pointers. For moving pointers we will move left to right by 1 (increment) when right sum is greater. We will move right pointer to left by 1 (decrement) when left sum is greater. at some point we will either get the sum same and that's when we exit from the loop. 0-left will be one array and right-(n-1) will be another array. We are not going to mo"

    Bhaskar B. - "This could be done using two-pointer approach assuming array is sorted: left and right pointers. We need track two sums (left and right) as we move pointers. For moving pointers we will move left to right by 1 (increment) when right sum is greater. We will move right pointer to left by 1 (decrement) when left sum is greater. at some point we will either get the sum same and that's when we exit from the loop. 0-left will be one array and right-(n-1) will be another array. We are not going to mo"See full answer

    Data Engineer
    Data Structures & Algorithms
    +2 more
  • Tesla logoAsked at Tesla 
    Add answer
    Video answer for 'Given a matrix of m x n elements (m rows, n columns), return all elements of the matrix in clockwise spiral order.'
    Data Engineer
    Data Structures & Algorithms
    +3 more
  • Amazon logoAsked at Amazon 
    2 answers

    "1) Have a common goal 2) Have a clear and fair accountability between teams 3) Ensure conflicts are resolved in time on common issues 4) Promote common Brain-storming , problem solving sessions 5) Most important , Have clear and effective communication established and practised"

    Saurabh N. - "1) Have a common goal 2) Have a clear and fair accountability between teams 3) Ensure conflicts are resolved in time on common issues 4) Promote common Brain-storming , problem solving sessions 5) Most important , Have clear and effective communication established and practised"See full answer

    Data Engineer
    Behavioral
    +5 more
  • Add answer
    Video answer for 'Design a Data Warehouse Schema for Airbnb'
    Data Engineer
    Data Modeling
  • Google logoAsked at Google 
    1 answer

    "Hadoop is better than PySpark when you are dealing with extremely large scale, batch oriented, non-iterative workloads where in-memory computing isn't feasible/ necessary, like log storage or ETL workflows that don't require high response times. It's also better in situations where the Hadoop ecosystem is already deeply embedded and where there is a need for resource conscious, fault tolerant computation without the overhead of Spark's memory constraints. In these such scenarios, Hadoop's disk-b"

    Joshua R. - "Hadoop is better than PySpark when you are dealing with extremely large scale, batch oriented, non-iterative workloads where in-memory computing isn't feasible/ necessary, like log storage or ETL workflows that don't require high response times. It's also better in situations where the Hadoop ecosystem is already deeply embedded and where there is a need for resource conscious, fault tolerant computation without the overhead of Spark's memory constraints. In these such scenarios, Hadoop's disk-b"See full answer

    Data Engineer
    Data Pipeline Design
  • Add answer
    Video answer for 'Design an ETL Pipeline for Slack for School'
    Data Engineer
    Data Pipeline Design
  • Adobe logoAsked at Adobe 
    12 answers
    +9

    "from typing import List def traprainwater(height: List[int]) -> int: if not height: return 0 l, r = 0, len(height) - 1 leftMax, rightMax = height[l], height[r] res = 0 while l < r: if leftMax < rightMax: l += 1 leftMax = max(leftMax, height[l]) res += leftMax - height[l] else: r -= 1 rightMax = max(rightMax, height[r]) "

    Anonymous Roadrunner - "from typing import List def traprainwater(height: List[int]) -> int: if not height: return 0 l, r = 0, len(height) - 1 leftMax, rightMax = height[l], height[r] res = 0 while l < r: if leftMax < rightMax: l += 1 leftMax = max(leftMax, height[l]) res += leftMax - height[l] else: r -= 1 rightMax = max(rightMax, height[r]) "See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Adobe logoAsked at Adobe 
    4 answers
    +1

    "static boolean sudokuSolve(char board) { return sudokuSolve(board, 0, 0); } static boolean sudokuSolve(char board, int r, int c) { if(c>=board[0].length) { r=r+1; c=0; } if(r>=board.length) return true; if(boardr=='.') { for(int num=1; num<=9; num++) { boardr=(char)('0' + num); if(isValidPosition(board, r, c)) { if(sudokuSolve(board, r, c+1)) return true; } boardr='.'; } } else { return sudokuSolve(board, r, c+1); } return false; } static boolean isValidPosition(char b"

    Divya R. - "static boolean sudokuSolve(char board) { return sudokuSolve(board, 0, 0); } static boolean sudokuSolve(char board, int r, int c) { if(c>=board[0].length) { r=r+1; c=0; } if(r>=board.length) return true; if(boardr=='.') { for(int num=1; num<=9; num++) { boardr=(char)('0' + num); if(isValidPosition(board, r, c)) { if(sudokuSolve(board, r, c+1)) return true; } boardr='.'; } } else { return sudokuSolve(board, r, c+1); } return false; } static boolean isValidPosition(char b"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • 6 answers
    +3

    "WITH suspicious_transactions AS ( SELECT c.first_name, c.last_name, t.receipt_number, COUNT(t.receiptnumber) OVER (PARTITION BY c.customerid) AS noofoffences FROM customers c JOIN transactions t ON c.customerid = t.customerid WHERE t.receipt_number LIKE '%999%' OR t.receipt_number LIKE '%1234%' OR t.receipt_number LIKE '%XYZ%' ) SELECT first_name, last_name, receipt_number, noofoffences FROM suspicious_transactions WHERE noofoffences >= 2;"

    Jayveer S. - "WITH suspicious_transactions AS ( SELECT c.first_name, c.last_name, t.receipt_number, COUNT(t.receiptnumber) OVER (PARTITION BY c.customerid) AS noofoffences FROM customers c JOIN transactions t ON c.customerid = t.customerid WHERE t.receipt_number LIKE '%999%' OR t.receipt_number LIKE '%1234%' OR t.receipt_number LIKE '%XYZ%' ) SELECT first_name, last_name, receipt_number, noofoffences FROM suspicious_transactions WHERE noofoffences >= 2;"See full answer

    Data Engineer
    Coding
    +3 more
  • Apple logoAsked at Apple 
    3 answers

    "I was able to provide the optimal approach and coded it up"

    Anonymous Wasp - "I was able to provide the optimal approach and coded it up"See full answer

    Data Engineer
    Data Structures & Algorithms
    +2 more
  • "There are 2 questions popping into my mind: Should the 2nd job have to kick off at 12:30AM? Are there others depending on the 2nd job? If both answers are no, we may simply postpone the second job to allow sufficient time for the first one to complete. If they are yeses, we could let the 2nd job retry to a certain amount of times. Make sure that even reaching the maximum of retries won't delay or fail the following jobs."

    Anzhe M. - "There are 2 questions popping into my mind: Should the 2nd job have to kick off at 12:30AM? Are there others depending on the 2nd job? If both answers are no, we may simply postpone the second job to allow sufficient time for the first one to complete. If they are yeses, we could let the 2nd job retry to a certain amount of times. Make sure that even reaching the maximum of retries won't delay or fail the following jobs."See full answer

    Data Engineer
    Data Pipeline Design
  • Apple logoAsked at Apple 
    3 answers

    "class TrieNode { constructor() { this.children = {}; this.isEndOfWord = false; } } class Trie { constructor() { this.root = new TrieNode(); } insert(word) { let node = this.root; for (const char of word) { if (!node.children[char]) { node.children[char] = new TrieNode(); } node = node.children[char]; } node.isEndOfWord = true; } search(word) { l"

    Tiago R. - "class TrieNode { constructor() { this.children = {}; this.isEndOfWord = false; } } class Trie { constructor() { this.root = new TrieNode(); } insert(word) { let node = this.root; for (const char of word) { if (!node.children[char]) { node.children[char] = new TrieNode(); } node = node.children[char]; } node.isEndOfWord = true; } search(word) { l"See full answer

    Data Engineer
    Data Structures & Algorithms
    +4 more
  • Meta logoAsked at Meta 
    6 answers
    +3

    "My roles included: ➢ B2B Sales of sanitary ware products to clients ➢ Sourcing of New Businesses and Managing Existing Clients ➢ Responsible for managing sales unit ➢ Developing new sales lead to meet set target. ➢ Project Management Skills: Online Sales Management · Sales Management · Customer Engagement · Sales Operations · Direct Sales · B2B Marketing"

    Jerry D. - "My roles included: ➢ B2B Sales of sanitary ware products to clients ➢ Sourcing of New Businesses and Managing Existing Clients ➢ Responsible for managing sales unit ➢ Developing new sales lead to meet set target. ➢ Project Management Skills: Online Sales Management · Sales Management · Customer Engagement · Sales Operations · Direct Sales · B2B Marketing"See full answer

    Data Engineer
    Behavioral
    +3 more
  • Apple logoAsked at Apple 
    1 answer

    "public class HashMap { public class Element { T key; V value; Element(T k, V v) { this.key = k; this.value = v; } } private static final int DEFAULT_CAPACITY = 16; private static final float LOAD_FACTOR = 0.75f; private LinkedList[] table = new LinkedList[DEFAULT_CAPACITY]; private int size = 0; private int threshold = (int) (DEFAULTCAPACITY * LOADFACTOR); public void put(T k"

    Md kamrul H. - "public class HashMap { public class Element { T key; V value; Element(T k, V v) { this.key = k; this.value = v; } } private static final int DEFAULT_CAPACITY = 16; private static final float LOAD_FACTOR = 0.75f; private LinkedList[] table = new LinkedList[DEFAULT_CAPACITY]; private int size = 0; private int threshold = (int) (DEFAULTCAPACITY * LOADFACTOR); public void put(T k"See full answer

    Data Engineer
    Data Structures & Algorithms
    +2 more
Showing 101-120 of 160