Data Scientist Technical Interview Questions

Review this list of 10 Technical Data Scientist interview questions and answers verified by hiring managers and candidates.
  • Asked at Amazon
    Implement k-means clustering.

    Taheia S. - "First, I want to know the number of clusters. If I don't know it, I will start with an arbitrary number and use the elbow method, silhouette score, gap statistic, or Davies–Bouldin index to find the best number of clusters. I will then use the scikit-learn library: from sklearn.cluster import KMeans; kmeans = KMeans(n_clusters=2, random_state=0); kmeans.fit(X), where X is my data."

    Data Scientist
    Technical
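
The answer above relies on scikit-learn's KMeans. Since the question says "implement", a from-scratch version may also be worth having in mind. The following is a minimal sketch of Lloyd's algorithm in plain Python, not the answer author's code; the function name `kmeans`, its parameters, and the toy data are illustrative assumptions.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Minimal Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (centroids, labels). Illustrative sketch only: no k-means++
    initialization and no restarts.
    """
    rng = random.Random(seed)
    # Assumption: initialize centroids from k distinct data points.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)

    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Toy data (hypothetical): two well-separated groups in 2-D.
X = [(1.0, 2.0), (1.2, 2.4), (0.8, 1.6), (8.0, 16.0), (8.2, 16.4), (7.9, 15.8)]
centroids, labels = kmeans(X, k=2)
```

On data this cleanly separated, the first three points should end up in one cluster and the last three in the other. On real data the result depends on initialization, which is one reason library implementations run several restarts.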
  • Asked at Google

    Yashasvi V. - "
    WITH RECURSIVE fibonacci_series AS (
        SELECT 1 AS n, 0 AS fib1, 1 AS fib2
        UNION ALL
        SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2
        FROM fibonacci_series
        WHERE n < 20  -- Limit the series to 20 numbers
    )
    SELECT n, fib1 AS fib
    FROM fibonacci_series
    ORDER BY n;
    "

    Data Scientist
    Technical
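
As a quick sanity check, the recursive CTE in this answer can be run against an in-memory SQLite database from Python. This is only a sketch: the choice of SQLite is an assumption, since the answer does not name a database engine, and the variable names are illustrative.

```python
import sqlite3

QUERY = """
WITH RECURSIVE fibonacci_series AS (
    SELECT 1 AS n, 0 AS fib1, 1 AS fib2
    UNION ALL
    SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2
    FROM fibonacci_series
    WHERE n < 20  -- Limit the series to 20 numbers
)
SELECT n, fib1 AS fib
FROM fibonacci_series
ORDER BY n;
"""

conn = sqlite3.connect(":memory:")  # throwaway database, nothing is persisted
rows = conn.execute(QUERY).fetchall()
conn.close()

fib_values = [fib for _, fib in rows]
print(fib_values[:10])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The anchor row seeds the series with 0 and 1, and each recursive step shifts the pair forward by one position until `n` reaches 20.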
  • Asked at Amazon

    Ali H. - "SQL databases are relational; NoSQL databases are non-relational. SQL databases use Structured Query Language and have a predefined schema. NoSQL databases have dynamic schemas suited to unstructured data. SQL databases typically scale vertically, while NoSQL databases typically scale horizontally."

    Data Scientist
    Technical
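
The schema point in this answer can be illustrated with a small sketch. It assumes SQLite as the relational example and uses a plain list of dicts as a stand-in for a document collection; no real NoSQL client is involved, and the table and field names are hypothetical.

```python
import sqlite3

# Relational side: the table schema is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Ada"))

try:
    # The column 'nickname' is not part of the schema, so this insert fails.
    conn.execute("INSERT INTO users (id, nickname) VALUES (?, ?)", (2, "Al"))
    schema_rejected = False
except sqlite3.OperationalError:
    schema_rejected = True
conn.close()

# Document-style side: each record may carry its own set of fields.
documents = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "nickname": "Al", "tags": ["beta"]},
]
field_sets = [sorted(doc) for doc in documents]
```

The relational insert is rejected because the column was never declared, while the document-style records simply differ in shape.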
  • Asked at Google

    Surbhi G. - "Deep learning is a part of artificial intelligence; it is like teaching a machine to think and make decisions on its own. It is like how we teach a child the concept of an apple: it is round, red, and has a stem on top. We show them multiple pictures of apples, and then they understand and can recognize an apple in the future. Similarly, we feed lots of data to the machine, and it slowly starts learning from that data and can then make relevant predictions or decisions based on what it has learned. A co..."

    Data Scientist
    Technical
  • Asked at Google

    Aaron W. - "Clarification questions:
      - What is the purpose of connecting to the DB?
      - Do we expect high volumes of traffic to hit the DB?
      - Do we have scalability or reliability concerns?
    Format:
      - Code -> DB
      - Code -> Cache -> DB
      - API -> Cache -> DB: APIs are built for a purpose and have a specified protocol (GET, POST, DELETE) to speak to the DB. APIs can also use a contract to retrieve information from a DB much faster than code.
      - Load-balanced APIs -> Cache -> DB
    Aut..."

    Data Scientist
    Technical
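
The "Code -> Cache -> DB" layout in this answer can be sketched as a simple read-through lookup. This is a minimal illustration under stated assumptions: SQLite stands in for the database, a dict stands in for the cache, and the `items` table, `get_item` function, and `stats` counter are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items VALUES (1, 'widget')")

cache = {}               # stand-in for a real cache service
stats = {"db_reads": 0}  # counts how often the database is actually queried


def get_item(item_id):
    """Code -> Cache -> DB: consult the cache first, fall back to the DB."""
    if item_id in cache:
        return cache[item_id]
    stats["db_reads"] += 1
    row = conn.execute("SELECT name FROM items WHERE id = ?", (item_id,)).fetchone()
    value = row[0] if row else None
    if value is not None:
        cache[item_id] = value  # populate the cache for later reads
    return value


first = get_item(1)   # cache miss: reads from the DB
second = get_item(1)  # cache hit: served from the dict
```

Only the first call touches the database. A real deployment would also need an eviction and invalidation policy, which this sketch leaves out.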
  • Srijita P. - "Clarification question: How many subscription plans does Tinder offer? If there is more than one subscription plan, we need to ask whether the fluctuation is happening across all plans or in a particular one. Assumption: Let's say the lower-priced subscription plan shows the most fluctuation and there are only two types of plans. Within this plan, which age group shows the most fluctuation (18-24, 25-30, 30+, etc.)? Is there any seasonality trend observed (e.g., placemen..."

    Data Scientist
    Technical
  • Yuexiang Y. - "A Random Forest works by building an ensemble of decision trees, each trained on a slightly different version of the data. The key mechanism is bagging: for each tree, we sample the training data with replacement (bootstrapping), so every tree sees a different subset of examples. On top of that, at each split the algorithm randomly selects a subset of features, so trees explore different predictors. These two sources of randomness decorrelate the trees. When we aggregate them, by averag..."

    Data Scientist
    Technical
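
The two sources of randomness described in this answer can be sketched in plain Python. To keep it short, each "tree" here is a depth-1 decision stump, so the per-split feature subset is drawn once per tree; this is a simplification, not a full Random Forest. The function names, parameters, and toy data are illustrative assumptions.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        # No usable split in this sample: always predict the majority label.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]           # bootstrap: sample with replacement
        feats = rng.sample(range(len(rows[0])), n_features)  # random feature subset
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    # Aggregate by majority vote over all stumps.
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): class 0 near the origin, class 1 far from it.
rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (8, 8), (9, 9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
```

With this setup, `predict(forest, (0, 0))` should vote for class 0 and `predict(forest, (10, 10))` for class 1. For regression, the vote would be replaced by an average of the trees' outputs.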
  • Asked at Apple

    Coin Change

    IDE
    Medium

    Anmol R. - "The example given is wrong. The second test case should have answer 3, since we can reach 6 by using three coins of denomination 2."

    Data Scientist
    Technical
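
The problem statement is not shown here, so the following sketch assumes the common "fewest coins" variant of Coin Change, which is consistent with the comment above. The function name `min_coins` and the inputs are hypothetical; the original test cases are not reproduced.

```python
def min_coins(coins, amount):
    """Fewest coins needed to make `amount`, or -1 if it cannot be made."""
    INF = float("inf")
    best = [0] + [INF] * amount  # best[a] = fewest coins that sum to a
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount] if best[amount] != INF else -1


print(min_coins([2], 6))         # 3  (2 + 2 + 2)
print(min_coins([1, 2, 5], 11))  # 3  (5 + 5 + 1)
print(min_coins([2], 3))         # -1 (odd amounts are unreachable)
```

The table is filled bottom-up, so each amount is solved once and the running time is proportional to `amount * len(coins)`.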
  • Asked at Infosys

    Anonymous Flamingo - "In Python, 'OOPs' (object-oriented programming) concepts refer to a programming paradigm based on objects and classes. OOP allows developers to model real-world concepts and create reusable code through inheritance, polymorphism, and encapsulation. Here are some common OOP concepts in Python:
      - Class: A class is a blueprint for creating objects. It defines the attributes and behaviors that objects of that class will have.
      - Object: An object is an insta..."

    Data Scientist
    Technical
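
The concepts named in this answer (class, object, inheritance, polymorphism, encapsulation) can be shown in a few lines. This is an illustrative sketch; the `Animal`, `Dog`, and `Cat` classes are hypothetical examples, not part of the original answer.

```python
class Animal:
    """Base class: a blueprint that defines shared attributes and behavior."""

    def __init__(self, name):
        self._name = name  # leading underscore: internal by convention

    @property
    def name(self):
        # Encapsulation: expose the name read-only through a property.
        return self._name

    def speak(self):
        raise NotImplementedError("subclasses decide how to speak")

    def describe(self):
        return f"{self.name} says {self.speak()}"


class Dog(Animal):  # inheritance: Dog reuses Animal's constructor and describe()
    def speak(self):
        return "Woof"


class Cat(Animal):
    def speak(self):
        return "Meow"


pets = [Dog("Rex"), Cat("Tom")]  # objects: instances of the classes
# Polymorphism: the same describe() call dispatches to each subclass's speak().
descriptions = [pet.describe() for pet in pets]
print(descriptions)  # ['Rex says Woof', 'Tom says Meow']
```

Because `name` is a property without a setter, assigning to `pets[0].name` raises `AttributeError`, which is one lightweight way Python code expresses encapsulation.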
  • Asked at Tinder

    Trusha M. - "I would first identify the factors that are causing the interference. Then I would use smoothing techniques or algorithms (e.g., Kalman filters for time series), which can help isolate genuine trends from noise. In testing, I would employ techniques like A/B testing to measure interference from unrelated factors and use regression analysis to separate the relevant factors from noise."

    Data Scientist
    Technical
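
The Kalman filter mentioned in this answer can be sketched for the simplest case: a single slowly drifting level observed with noise. This is a minimal scalar version under assumed settings; the function name `kalman_1d`, the variance values, and the synthetic series are all illustrative.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=1.0):
    """Scalar Kalman filter for a slowly drifting level (random-walk model)."""
    estimate = measurements[0]   # initial state: first observation
    error_var = measurement_var  # initial uncertainty of that estimate
    filtered = [estimate]
    for z in measurements[1:]:
        # Predict: the level is assumed to stay put, but uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1 - gain
        filtered.append(estimate)
    return filtered


# Synthetic series (hypothetical): a true level of 10 with alternating +/-1 noise.
noisy = [10 + (1 if i % 2 == 0 else -1) for i in range(20)]
filtered = kalman_1d(noisy)
```

The raw series swings a full unit around the true level on every step, while the filtered values settle much closer to 10 after the first few observations. A smaller `process_var` smooths more aggressively but reacts more slowly to genuine shifts.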