"supervised learning: model is trained on the labeled data
unsupervised learning: no labels provided - model learns by finding patterns , structure and groupings in the data.
Semi-supervised learning: use small set of labels to guide learning for the larger pool of unlabeled data.
reinforcement learning: leans by interacting with students the environment, receives reward and penalties based on actions
self supervised: no labelled data . The model makes its own practice problems by"
Anchal V. - "Supervised learning: the model is trained on labeled data.
Unsupervised learning: no labels are provided; the model learns by finding patterns, structure, and groupings in the data.
Semi-supervised learning: uses a small set of labels to guide learning on a larger pool of unlabeled data.
Reinforcement learning: learns by interacting with the environment, receiving rewards and penalties based on its actions.
Self-supervised learning: no labeled data. The model makes its own practice problems by"
"I would meet with my team to discuss and break down the 12 features into sub-tasks based on priority, then arrange a meeting with stakeholders to align on the priority levels and secure their approval.Next, I’d assign each of the main Priority 1 features to engineers accordingly, ensuring that the first three months focus on P1 delivery. The following three months would be dedicated to testing the P1 features while progressing with lower-priority features in parallel. This ensures that by month"
Riley M. - "I would meet with my team to break the 12 features down into sub-tasks by priority, then arrange a meeting with stakeholders to align on the priority levels and secure their approval. Next, I'd assign each of the main Priority 1 features to engineers, ensuring that the first three months focus on P1 delivery. The following three months would be dedicated to testing the P1 features while progressing lower-priority features in parallel. This ensures that by month"
"Precision - Out of all the things we picked as correct, how many were actually correct?
recall - Out of all the things that were truly correct, how many did we actually find?"
Vineet M. - "Precision - Out of all the things we picked as correct, how many were actually correct?
Recall - Out of all the things that were truly correct, how many did we actually find?"
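The two definitions above can be sketched in code. This is a minimal illustration, assuming binary labels where 1 means "correct/positive"; the function name `precision_recall` and the sample lists are hypothetical and not part of the original answer.

```python
from typing import List, Tuple

def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # False negatives: actually positive but missed by the prediction.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when nothing was picked or nothing was positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Illustrative data: 4 true positives exist, 2 items were picked, 1 pick was right.
print(precision_recall([1, 1, 1, 1, 0, 0], [1, 0, 0, 0, 1, 0]))  # (0.5, 0.25)
```

Here half of the picks were right (precision 0.5), but only one of the four truly positive items was found (recall 0.25).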
"Situation:
At my previous company, we had more than 200 different data sources across 15 business units. These included CRM systems, marketing platforms, HR databases, and even third-party data feeds. The problem was that each team was managing data in its own way. This created inconsistent privacy controls, data quality issues, and compliance gaps, all of which were becoming urgent ahead of a major regulatory audit.
Task:
I was given the responsibility of implementing a unified data governance"
Mark G. - "Situation:
At my previous company, we had more than 200 different data sources across 15 business units. These included CRM systems, marketing platforms, HR databases, and even third-party data feeds. The problem was that each team was managing data in its own way. This created inconsistent privacy controls, data quality issues, and compliance gaps, all of which were becoming urgent ahead of a major regulatory audit.
Task:
I was given the responsibility of implementing a unified data governance"
"Function signature for reference:
def calculate(servers: List[int], k: int) -> int:
...
To resolve this, you can use binary search considering left=0 and right=max(servers) * k
so
Example:
servers=[1,4,5] First server handle 1 request in let's say 1 second, second 4 seconds and last 5 seconds.
k=10
So I want to know the minimal time to process 10 requests
Get the mid for timeline
mid = (left+right)//2 -> mid is 25
Check how many we could process
25//1 = 25 25//4=6 25//5=5 so 25 + 6 +"
Babaa - "Function signature for reference:
def calculate(servers: List[int], k: int) -> int:
    ...
To solve this, you can binary search on time, with left=0 and right=max(servers) * k.
Example:
servers=[1,4,5]: the first server handles one request in, say, 1 second, the second in 4 seconds, and the last in 5 seconds.
k=10
So I want to know the minimal time to process 10 requests.
Get the midpoint of the time range:
mid = (left+right)//2 -> mid is 25
Check how many requests we could process by then:
25//1 = 25, 25//4 = 6, 25//5 = 5, so 25 + 6 +"
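The answer is cut off, but the binary search it describes could look roughly like the sketch below. It assumes `servers` is non-empty and holds positive per-request times, and that the servers work in parallel; the loop structure is one common way to write this search and is not necessarily the author's.

```python
from typing import List

def calculate(servers: List[int], k: int) -> int:
    """Smallest time t at which the servers can finish at least k requests."""
    if k <= 0:
        return 0
    # The slowest server alone needs max(servers) * k, so that is a safe upper bound.
    left, right = 0, max(servers) * k
    while left < right:
        mid = (left + right) // 2
        # A server with per-request time s completes mid // s requests by time mid.
        processed = sum(mid // s for s in servers)
        if processed >= k:
            right = mid      # mid is enough time; try something smaller
        else:
            left = mid + 1   # mid is too little time
    return left

print(calculate([1, 4, 5], 10))  # 8
```

For the example in the answer, time 7 yields 7 + 1 + 1 = 9 requests and time 8 yields 8 + 2 + 1 = 11, so the search settles on 8.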
"Clarifying questions:
What is the issue in the current algorithm? Not obvious
Why do you want to improve? Engagement
Are there any churn trend in YouTube viewers? Yes I see for last few months 10% reduction
Do you have any support tickets from users regarding personalization? Not sure
Have you released any new launches or bug fix into the existing algorithm? No
What is the goal for this improvement suggestion? Engagement
Are you targeting any specific region? NA
Do"
Manjula R. - "Clarifying questions:
What is the issue with the current algorithm? Not obvious.
Why do you want to improve it? Engagement.
Is there any churn trend among YouTube viewers? Yes, I have seen a 10% reduction over the last few months.
Do you have any support tickets from users regarding personalization? Not sure.
Have you released any new launches or bug fixes to the existing algorithm? No.
What is the goal of this improvement? Engagement.
Are you targeting any specific region? NA
Do"
"I would assume that this is similar to an intervals question. Meeting Rooms II (https://www.lintcode.com/problem/919/?fromId=203&_from=collection) on Leetcode seems like the closest comparison, it's a premium question so I linked Lintcode.
I'm assuming that we also need to just return the minimum number of cars used. You need to sort for the most optimal solution, so you're constrained by an O(nlogn) time complexity. So any sorting solution could work (using a heap, sorting the array input arra"
Sohum S. - "I would assume that this is similar to an intervals question. Meeting Rooms II (https://www.lintcode.com/problem/919/?fromId=203&_from=collection) on LeetCode seems like the closest comparison; it's a premium question there, so I linked LintCode.
I'm assuming that we just need to return the minimum number of cars used. The optimal solution requires sorting, so you're constrained by an O(n log n) time complexity. So any sorting-based solution could work (using a heap, sorting the array input arra"
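One way to sketch the heap variant mentioned above, in the style of Meeting Rooms II. The original question is not shown, so this assumes each trip is a `(start, end)` pair that needs one car for its whole duration, and that a car returned at time t can be reused for a trip starting at t. The name `min_cars` is hypothetical.

```python
import heapq
from typing import List, Tuple

def min_cars(trips: List[Tuple[int, int]]) -> int:
    """Minimum number of cars so that no two overlapping trips share a car."""
    end_times: List[int] = []  # min-heap of end times, one entry per car in use
    for start, end in sorted(trips):  # process trips by start time
        if end_times and end_times[0] <= start:
            # The car that frees up earliest is available: reuse it.
            heapq.heapreplace(end_times, end)
        else:
            # Every car is still busy at this start time: add a new one.
            heapq.heappush(end_times, end)
    # Cars are never removed from the heap, only reassigned, so its size is the answer.
    return len(end_times)

print(min_cars([(0, 30), (5, 10), (15, 20)]))  # 2
```

Sorting dominates the cost at O(n log n), and each heap operation is O(log n), which matches the bound stated in the answer.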
"A daily 10 minute cadence to make a note of the progress and the deliverables for the day, should help ascertain the delivery and to meet schedule the self performing teams need to be on toes to deliver. Time lines need to be revisited end of each day to evalute the impact."
Nilesh S. - "A daily 10-minute check-in to note progress and the day's deliverables should help keep delivery on track; to meet the schedule, the self-performing teams need to stay on their toes. Timelines need to be revisited at the end of each day to evaluate the impact."
"Goal:
Maps should reflect reality of places especially with places like hospitals
Ecosystem :
Maps User Usecases:
Daily Commute
Explore places
Critical situations like emergency - hospital visits, work emergencies, family
Businesses using Google Business Profile:
Medical & Health
Hospitals
Clinics
Gyms
Restaurants
Salons
Mom & Pop stores, SMB - Retail
Enterprises
These can be online, physical or both
Local guides
Downstream usecases for other google products:
Google Search
Google Gemini
"
Pooja G. - "Goal:
Maps should reflect reality of places especially with places like hospitals
Ecosystem:
Maps user use cases:
Daily Commute
Explore places
Critical situations like emergency - hospital visits, work emergencies, family
Businesses using Google Business Profile:
Medical & Health
Hospitals
Clinics
Gyms
Restaurants
Salons
Mom & Pop stores, SMB - Retail
Enterprises
These can be online, physical or both
Local guides
Downstream use cases for other Google products:
Google Search
Google Gemini
"See full answer
"Question: An array of n integers is given, and a positive integer k, where k << n. k indicates that the absolute difference between each element's current index (icurrent) and the index in the sorted array (isorted) is less than k (|icurr - isorted| < k).
Sort the given array.
The most common solution is with a Heap:
def solution(arr, k):
min_heap = []
result = []
for i in range(len(arr))
heapq.heappush(min_heap, arr[i])
"
Guilherme M. - "Question: An array of n integers is given, along with a positive integer k, where k << n. k indicates that the absolute difference between each element's current index (i_current) and its index in the sorted array (i_sorted) is less than k (|i_current - i_sorted| < k).
Sort the given array.
The most common solution uses a heap:
import heapq
def solution(arr, k):
    min_heap = []
    result = []
    for i in range(len(arr)):
        heapq.heappush(min_heap, arr[i])
"
"The distribution of daily search queries per user, as shown in the histogram, can be described as approximately normal (or bell-shaped) with a slight positive skew.
Key Characteristics:
Shape: The distribution is roughly symmetrical around its center, resembling a bell curve. This indicates that most users perform a moderate number of daily search queries.
Central Tendency: The peak of the distribution, representing the highest density of users, appears to be around **8"
Sam A. - "The distribution of daily search queries per user, as shown in the histogram, can be described as approximately normal (or bell-shaped) with a slight positive skew.
Key Characteristics:
Shape: The distribution is roughly symmetrical around its center, resembling a bell curve. This indicates that most users perform a moderate number of daily search queries.
Central Tendency: The peak of the distribution, representing the highest density of users, appears to be around **8"