Data Engineer Interview Questions

Review this list of 160 Data Engineer interview questions and answers verified by hiring managers and candidates.
  • Asked at Discord
    Data Engineer
    Behavioral
    +4 more
  • Asked at Adobe

    Permutations

    IDE
    Medium
    3 answers

    Tiago R. - "function permute(nums) { if (nums.length <= 1) { return [nums]; } const prevPermutations = permute(nums.slice(0, nums.length-1)); const currentNum = nums[nums.length-1]; const permutations = new Set(); for (let prev of prevPermutations) { for (let i=0; i < prev.length; i++) { permutations.add([...prev.slice(0, i), currentNum, ...prev.slice(i)]); } permutations.add([...prev, currentNum]); } return [...permutations]; }"

    Data Engineer
    Data Structures & Algorithms
    +3 more
  • Asked at Microsoft
    2 answers

    Rafia M. - "SQL is structured query language."

    Data Engineer
    SQL
    +2 more
  • Asked at Adobe
    Video answer for 'Solve John Conway's "Game of Life".'
    Data Engineer
    Data Structures & Algorithms
    +2 more
  • Asked at Databricks
    2 answers

    Ramagiri P. - "Medallion architecture is a layered data architecture used in lakehouse systems. Data flows through Bronze, Silver, and Gold layers, where each layer improves data quality. Bronze stores raw data, Silver contains cleaned and validated datasets, and Gold provides aggregated, business-ready data for analytics and reporting. bronze_df = spark.read.json("/landing/apidata") bronze_df.write.format("delta").save("/bronze/users")"

    Data Engineer
    Data Pipeline Design
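A minimal sketch of the Bronze, Silver, and Gold flow described in the answer above, written in plain Python rather than Spark so it runs without a cluster. The record fields (`user_id`, `country`, `amount`) and the cleaning rules are hypothetical, chosen only for illustration; a real lakehouse pipeline would likely write each layer to Delta tables instead of in-memory lists.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as they arrived (hypothetical fields).
bronze = [
    {"user_id": "1", "country": "us", "amount": "10.5"},
    {"user_id": "2", "country": "DE", "amount": "bad"},   # unparseable amount
    {"user_id": "1", "country": "us", "amount": "10.5"},  # exact duplicate
    {"user_id": "3", "country": "de", "amount": "4.5"},
]


def to_silver(rows):
    """Clean and validate: drop exact duplicates and rows that fail parsing."""
    seen, silver = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        try:
            silver.append({
                "user_id": int(row["user_id"]),
                "country": row["country"].upper(),
                "amount": float(row["amount"]),
            })
        except ValueError:
            continue  # a real pipeline might quarantine bad rows instead
    return silver


def to_gold(rows):
    """Aggregate to a business-ready shape: total amount per country."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping Bronze untouched means the Silver and Gold steps can be rerun with different rules without re-ingesting the source.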
  • Asked at Walmart Labs
    Data Engineer
    Behavioral
    +5 more
  • Asked at Uber
    1 answer

    Anzhe M. - "Not my answer, but rather the details of this question. It should include the following functions:
        int insertNewCustomer(double revenue) -> returns a customer ID (assume auto-incremented & 0-based)
        int insertNewCustomer(double revenue, int referrerID) -> returns a customer ID (assume auto-incremented & 0-based)
        Set getLowestKCustomersByMinTotalRevenue(int k, double minTotalRevenue) -> returns customer IDs
    Note: The total revenue consists of the revenue that this customer bring"

    Data Engineer
    Coding
  • Asked at Adobe
    2 answers

    Reno S. - "func isMatch(text: String, pattern: String) -> Bool { // Convert strings to arrays for easier indexing let s = Array(text.characters) let p = Array(pattern.characters) guard !s.isEmpty && !p.isEmpty else { return true } // Create DP table: dp[i][j] represents if s[0...i-1] matches p[0...j-1] var dp = Array(repeating: Array(repeating: false, count: p.count + 1), count: s.count + 1) // Empty pattern matches empty string dp[0]["

    Data Engineer
    Data Structures & Algorithms
    +3 more
  • Asked at Walmart Labs
    1 answer

    Rajeev K. - "I’ve spent over 6 years building and scaling e-commerce products across EMEA and APAC. At Jumia, I led product initiatives on the checkout and payments side. For example, I launched gamified promotions on PDP and checkout that improved engagement and delivered a 2.3x uplift in conversion. I also introduced automated installment payments and order cancellation flows, which not only improved user trust but also reduced complaints by 30% and lowered operational costs. Before that, at Lazada, I work"

    Data Engineer
    Behavioral
    +2 more
  • Sue G. - "Multithreading: Multiple threads run within the same process, sharing memory. It is more lightweight, with faster context switching, but shared memory creates potential synchronization issues; use locks or synchronized keywords to handle them. Multiprocessing: Multiple processes run independently, each with its own memory space. It is more heavyweight because each process has its own resources, which reduces shared-data corruption issues. It is slower, and separate processes need to be managed. It needs IPC mechanisms like pipes, sockets an"

    Data Engineer
    Concept
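The synchronization point in the answer above can be illustrated with a small Python sketch. It covers only the multithreading half: several threads update one shared counter, and a `threading.Lock` guards each update. The class and function names (`SafeCounter`, `run_workers`) are hypothetical, and the thread and iteration counts are arbitrary.

```python
import threading


class SafeCounter:
    """A counter shared between threads, guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, concurrent read-modify-write steps could
        # interleave and lose updates.
        with self._lock:
            self.value += 1


def run_workers(num_threads, increments_each):
    """Start num_threads threads that each increment one shared counter."""
    counter = SafeCounter()

    def work():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = run_workers(4, 10_000)
```

Separate processes would not share `counter` at all, which is why the multiprocessing case relies on IPC instead of locks around shared objects.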
  • Asked at Apple
    9 answers

    Vaibhav D. - "1. Set current to root. 2. While current is not null: if p and q are both less than current, go left; if p and q are both greater than current, go right; otherwise return current. If the loop ends, return null."

    Data Engineer
    Data Structures & Algorithms
    +4 more
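The steps in the answer above read like the usual iterative walk for the lowest common ancestor in a binary search tree. Assuming that is the question (its title is not shown here), a minimal Python sketch follows. The `Node` class is hypothetical, and `p` and `q` are taken to be values that exist in the tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    val: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def lowest_common_ancestor(root, p, q):
    """Walk down from the root until p and q fall on different sides."""
    current = root
    while current is not None:
        if p < current.val and q < current.val:
            current = current.left    # both targets are in the left subtree
        elif p > current.val and q > current.val:
            current = current.right   # both targets are in the right subtree
        else:
            return current            # split point, or current is p or q
    return None


# Example tree:        6
#                    /   \
#                   2     8
#                  / \   / \
#                 0   4 7   9
#                    / \
#                   3   5
tree = Node(6,
            Node(2, Node(0), Node(4, Node(3), Node(5))),
            Node(8, Node(7), Node(9)))
```

The walk visits one node per level, so it takes time proportional to the tree height and uses constant extra space.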
  • Asked at Amazon
    1 answer

    Nikunj V. - "OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) are two types of data processing systems, each designed for specific purposes in the context of database and data warehouse environments. OLTP (Online Transaction Processing): Purpose: OLTP systems are designed to manage and handle high volumes of transactions, such as inserting, updating, and deleting data. These systems are typically used in day-to-day business operations. Characteristics: Handles small, si"

    Data Engineer
    Technical
    +1 more
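To make the contrast in the answer above concrete, here is a rough sketch using Python's built-in `sqlite3` module. It only illustrates the two workload shapes on one in-memory database; real OLTP and OLAP systems are usually separate engines. The `orders` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# OLTP-style work: small, short transactions that touch a few rows each.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 20.0))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("bob", 15.0))
with conn:
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 5.0))
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (18.0, 2))

# OLAP-style work: a read-only query that scans and aggregates many rows.
report = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) "
    "FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
```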
  • Asked at Adobe
    2 answers

    Murali M. - "The rule doesn't work the other way around. If the array is smaller than n, it can still have duplicates. E.g., n = 10, arr = [3, 3]."

    Data Engineer
    Data Structures & Algorithms
    +2 more
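The question text is not shown here, so the following is a sketch under an assumption: the task is to detect a duplicate in an array whose values lie in 1..n. Under that reading, the point in the answer above is that the pigeonhole shortcut works in one direction only, so shorter arrays still need an explicit check. The function name `has_duplicate` is hypothetical.

```python
def has_duplicate(arr, n):
    """Return True if any value repeats; values are assumed to lie in 1..n."""
    if len(arr) > n:
        # Pigeonhole: more items than distinct possible values.
        return True
    # A short array proves nothing, so check the values explicitly.
    seen = set()
    for x in arr:
        if x in seen:
            return True
        seen.add(x)
    return False
```

With the answer's own example, `has_duplicate([3, 3], 10)` returns `True` even though the array is much shorter than n.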
  • Asked at Discord
    Data Engineer
    Behavioral
    +1 more
  • Data Engineer
    Data Pipeline Design
  • Asked at Databricks
    Data Engineer
    Data Pipeline Design
  • Asked at Salesforce
    Data Engineer
    Behavioral
    +4 more
  • Asked at Adobe
    9 answers

    Tiago R. - "function isPalindrome(s, start, end) { while (s[start] === s[end] && end >= start) { start++; end--; } return end <= start; } function longestPalindromicSubstring(s) { let longestPalindrome = ''; for (let i=0; i < s.length; i++) { let j = s.length-1; while (s[i] !== s[j] && i <= j) { j--; } if (s[i] === s[j]) { if (isPalindrome(s, i, j)) { const validPalindrome = s.substring(i, j+1"

    Data Engineer
    Data Structures & Algorithms
    +3 more
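The answer preview above is cut off, so its full approach cannot be reproduced here. For comparison, this is a sketch of a different, commonly used technique for the same problem: expanding around each possible center. It is not the answer's method, and it runs in O(n²) time with O(1) extra space.

```python
def longest_palindromic_substring(s):
    """Return the first longest palindromic substring of s."""
    if not s:
        return ""

    def expand(left, right):
        # Grow outward while the characters match; return (start, length).
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        return left + 1, right - left - 1

    best_start, best_len = 0, 1
    for i in range(len(s)):
        # Odd-length palindromes center on i; even-length ones on (i, i + 1).
        for start, length in (expand(i, i), expand(i, i + 1)):
            if length > best_len:
                best_start, best_len = start, length
    return s[best_start:best_start + best_len]
```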
  • 2 answers

    Sonveer K. - "na"

    Data Engineer
    Data Modeling
  • Asked at Databricks
    1 answer

    Nitish C. - "Delta Lake is a metadata layer on top of cloud storage that gives a data lake transactional capabilities. It helps implement upsert/merge because it enforces a schema on the data assets stored in the cloud. It also offers other capabilities such as liquid clustering, time travel, schema evolution, and deletes."

    Data Engineer
    Data Pipeline Design
Showing 121-140 of 160