Databricks Interview Questions

Review this list of 18 Databricks interview questions and answers verified by hiring managers and candidates.
  • Asked at Databricks

    Berk C. - "High-level architecture: Client -> API Gateway -> Object Storage -> Message Queue -> Worker -> Database. The client should be able to submit documents through a website or directly through the API services. The API Gateway should be used to upload documents and to get document info and state. Object storage should hold the original document and send an event to the Message Queue to start processing. The Message Queue is necessary because millions of documents must be processed each time. A worker can extract text from a document with OCR. Database shoul"

    Software Engineer
    System Design
    +2 more
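
The flow in this answer (upload, store, enqueue, worker, database) can be sketched in a few lines. This is a minimal in-process illustration, not the answerer's design: the dictionaries stand in for object storage and the database, `queue.Queue` stands in for the message queue, `fake_ocr` is a placeholder for a real OCR engine, and the `queued`/`done`/`failed` states are assumptions (the original answer is cut off before it describes the database). For simplicity the upload function enqueues the event itself rather than having storage emit it.

```python
import queue

# Hypothetical in-memory stand-ins for the real services.
object_storage = {}      # doc_id -> original document bytes
database = {}            # doc_id -> {"state": ..., "text": ...}
events = queue.Queue()   # message queue of doc_ids waiting for processing


def fake_ocr(raw: bytes) -> str:
    """Placeholder for an OCR engine: simply decodes the bytes."""
    return raw.decode("utf-8", errors="replace")


def upload_document(doc_id: str, raw: bytes) -> None:
    """API Gateway 'upload' path: store the original, then enqueue an event."""
    object_storage[doc_id] = raw
    database[doc_id] = {"state": "queued", "text": None}
    events.put(doc_id)


def get_document_state(doc_id: str) -> str:
    """API Gateway 'get document info and state' path."""
    return database[doc_id]["state"]


def worker() -> None:
    """Consume events until a None sentinel arrives; run OCR and save the text."""
    while True:
        doc_id = events.get()
        if doc_id is None:
            events.task_done()
            break
        try:
            text = fake_ocr(object_storage[doc_id])
            database[doc_id] = {"state": "done", "text": text}
        except Exception:
            database[doc_id] = {"state": "failed", "text": None}
        finally:
            events.task_done()
```

Running `worker` in one or more `threading.Thread` instances and pushing one `None` per worker shuts the pool down cleanly; the queue is what lets uploads continue while workers catch up.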
  • Asked at Databricks

    Rahul J. - "Constraints: 4-direction moves; no mode switching (pick exactly one of {1=bicycle, 2=bike, 3=car, 4=bus} for the full trip). Per-mode search: If a mode’s per-step time/cost are uniform, run BFS on allowed cells. Then total_time = steps × time_per_step, tie-break by steps × cost_per_step. If time/cost vary by cell (given matrices), run Dijkstra per mode minimizing (total_time, total_cost) lexicographically. Maintain the best ⟨time, cost⟩ per cell; relax when the new pair is strictly better. S"

    Software Engineer
    Coding
    +1 more
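
The per-mode Dijkstra described in this answer might look like the sketch below. It is an illustration under assumptions, not the answerer's code: the function names, the `allowed` boolean grid, and the convention that time and cost are paid when entering a cell are all hypothetical, and weights are assumed non-negative. Python compares tuples lexicographically, so `(time, cost)` pairs can be used directly as heap priorities.

```python
import heapq

INF = (float("inf"), float("inf"))


def best_time_cost(allowed, time, cost, start, goal):
    """Dijkstra for one mode, minimizing (total_time, total_cost) lexicographically.

    allowed[r][c] is True when this mode may enter the cell; time[r][c] and
    cost[r][c] are charged on entering the cell (assumption). Returns the best
    (total_time, total_cost) pair, or None if the goal is unreachable.
    """
    rows, cols = len(allowed), len(allowed[0])
    if not (allowed[start[0]][start[1]] and allowed[goal[0]][goal[1]]):
        return None
    best = {start: (0, 0)}
    heap = [(0, 0, start)]
    while heap:
        t, c, cell = heapq.heappop(heap)
        if (t, c) > best[cell]:
            continue  # stale heap entry; a better pair was already found
        if cell == goal:
            return (t, c)
        r, col = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-direction moves
            nr, nc = r + dr, col + dc
            if 0 <= nr < rows and 0 <= nc < cols and allowed[nr][nc]:
                cand = (t + time[nr][nc], c + cost[nr][nc])
                # Relax only when the new pair is strictly better.
                if cand < best.get((nr, nc), INF):
                    best[(nr, nc)] = cand
                    heapq.heappush(heap, (cand[0], cand[1], (nr, nc)))
    return None


def best_mode(modes, start, goal):
    """Run the search once per mode (no switching) and keep the best result.

    modes maps a mode name to its (allowed, time, cost) grids.
    Returns ((total_time, total_cost), mode_name), or None if no mode works.
    """
    results = []
    for name, (allowed, time, cost) in modes.items():
        found = best_time_cost(allowed, time, cost, start, goal)
        if found is not None:
            results.append((found, name))
    return min(results) if results else None
```

When a mode's per-step time and cost are uniform, plain BFS gives the same answer with less work, as the answer notes; the Dijkstra version covers both cases.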
  • Asked at Databricks

    Sreeram reddy B. - "User table: userid, username, email, phonenumber, accountcreateddate. Exercises table (types of exercise: indoor walk, outdoor walk, running, stairs, cycling, swimming, etc.): exerciseid, exercisetype. Date table: dateid, date, day, month, year. Session table: userid, sessiondateid (linked to dateid in the date table), exerciseid, distance covered, calories spent, starttime, endtime."

    Data Engineer
    Data Modeling
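
The four tables in this answer can be tried out with Python's built-in `sqlite3` module. The sketch below is one possible rendering: the column types, the plural table names, the surrogate `sessionid` key, and the `calories_by_month` query are assumptions added for illustration and are not part of the original answer.

```python
import sqlite3

# Star-style schema: three dimension tables and one session fact table.
SCHEMA = """
CREATE TABLE users (
    userid INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    email TEXT,
    phonenumber TEXT,
    accountcreateddate TEXT
);
CREATE TABLE exercises (
    exerciseid INTEGER PRIMARY KEY,
    exercisetype TEXT NOT NULL UNIQUE
);
CREATE TABLE dates (
    dateid INTEGER PRIMARY KEY,
    date TEXT NOT NULL UNIQUE,
    day INTEGER,
    month INTEGER,
    year INTEGER
);
CREATE TABLE sessions (
    sessionid INTEGER PRIMARY KEY,  -- assumed surrogate key
    userid INTEGER NOT NULL REFERENCES users(userid),
    sessiondateid INTEGER NOT NULL REFERENCES dates(dateid),
    exerciseid INTEGER NOT NULL REFERENCES exercises(exerciseid),
    distance_covered REAL,
    calories_spent REAL,
    starttime TEXT,
    endtime TEXT
);
"""


def build_db():
    """Create an in-memory database with foreign-key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def calories_by_month(conn, userid):
    """Total calories per (year, month) for one user, via the date table."""
    return conn.execute(
        """
        SELECT d.year, d.month, SUM(s.calories_spent)
        FROM sessions AS s
        JOIN dates AS d ON d.dateid = s.sessiondateid
        WHERE s.userid = ?
        GROUP BY d.year, d.month
        ORDER BY d.year, d.month
        """,
        (userid,),
    ).fetchall()
```

Keeping dates in their own table makes month and year rollups a plain join and group-by, which is presumably why the answer separates them out.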
  • Asked at Databricks
    Software Engineer
    System Design
    +1 more
  • Karthik R. - "This is yet another classic case of the data landscape evolving to accommodate diverse data formats, sacrificing restrictive but key components at first and adding them back later to make the solution more effective. Data warehouse -> Data lake -> Data lakehouse (data lake + data warehouse). Data warehouse: a solution for storing data in a central place (analytics-heavy, i.e. read-heavy) with a stringent, structured schema. Very useful for historical queries and analytics. The schema is checked on write. Only used for"

    Data Engineer
    Data Pipeline Design
  • Asked at Databricks

    Amparo L. - "One accomplishment I'm most proud of is that I graduated from Schaumburg High School in May of 2021 and was able to get up on the stage and collect my diploma. This was a huge milestone, since it meant passing all of my classes and earning all of my credits in order to be a part of the NOW Arena graduation."

    Product Manager
    Behavioral
    +1 more
  • Asked at Databricks
    Video answer for 'What is your leadership style?'

    onering2ruleall - "My leadership style is flexible and adaptive; it varies depending on the team members and the needs of the company. My leadership goal is to empower the team and to inspire and grow leaders. To achieve that, I combine transformational, democratic, and coaching leadership styles. Usually, when we are facing a new type of challenge or are at the early stage of a project, I like to adopt the transformational leadership style, which allows me to listen to all the suggestions from the team members and sta"

    Engineering Manager
    Behavioral
    +4 more
  • Howard H. - "We have to work with the C-suite to understand the direct quarterly outcomes or goals. This could be our epic, which we then try to break down by business value and complexity. This will allow us to prioritize what's next. From there, we can structure an MVP that covers some of these areas to understand the estimation of this work. After the first couple of weeks, we can structure a roadmap and then define when"

    Product Manager
    Behavioral
  • Asked at Databricks
    Video answer for 'Demo LabelBox for an Autonomous Delivery Client'
    Solutions Architect
    Customer Interaction
  • Anzhe M. - "Two questions come to mind: Does the second job have to kick off at 12:30 AM? Are there other jobs depending on the second job? If both answers are no, we may simply postpone the second job to allow sufficient time for the first one to complete. If the answers are yes, we could let the second job retry up to a certain number of times. Make sure that even reaching the maximum number of retries won't delay or fail the following jobs."

    Data Engineer
    Data Pipeline Design
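
The bounded-retry option from this answer can be sketched as follows. This is a hedged illustration rather than any scheduler's real API: `upstream_done` and `run_job` are hypothetical callables supplied by the caller, the default retry count and wait are arbitrary, and `fits_before_downstream` is an assumed helper for checking that the worst-case delay still leaves room for the jobs that follow.

```python
import time


def run_after_upstream(upstream_done, run_job, max_retries=5,
                       wait_seconds=300, sleep=time.sleep):
    """Run the second job once the first has finished, retrying a bounded number of times.

    Makes one initial check plus up to max_retries further checks, waiting
    wait_seconds between them. Returns True if the job ran, False if we gave up.
    """
    for attempt in range(max_retries + 1):
        if upstream_done():
            run_job()
            return True
        if attempt < max_retries:
            sleep(wait_seconds)  # wait before the next check
    return False


def fits_before_downstream(max_retries, wait_seconds, job_seconds, slack_seconds):
    """True if the worst case (every retry used, then the job itself runs)
    still finishes within the slack available before downstream jobs start."""
    return max_retries * wait_seconds + job_seconds <= slack_seconds
```

Passing `sleep` as a parameter keeps the function testable without real waiting; a `False` return is the point at which to alert rather than let downstream jobs start on incomplete data.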
  • Asked at Databricks
    Data Engineer
    Data Pipeline Design
  • Data Engineer
    Data Pipeline Design
  • Asked at Databricks

    Divya K. - "You will need to start from the browser and go all the way up to analytics systems and methods. Everything needs to be covered."

    Technical Program Manager
    Execution
    +2 more
  • Asked at Databricks

    Divya K. - "Explain how you implemented your telemetry and observability in previous projects."

    Technical Program Manager
    Technical
  • Data Engineer
    Data Pipeline Design
  • Asked at Databricks
    Data Engineer
    Data Pipeline Design
  • Data Engineer
    Data Pipeline Design