"I've worked on projects not quite like this, but very similar, in the past - I'll borrow from that to answer this:
The Broader Context
this problem doesn't specify the type of data we're working with, or how it's being ingested
to align with my personal background, I'll assume a picture that lends this problem well to being a computer vision (abbreviated "CV") related question:
let's say we have a conveyor belt in a waste facility, which sequentially carries a stream of waste
w"
Zain R. - "I've worked on projects not quite like this, but very similar, in the past - I'll borrow from that to answer this:
The Broader Context
this problem doesn't specify the type of data we're working with, or how it's being ingested
to align with my personal background, I'll assume a picture that lends this problem well to being a computer vision (abbreviated "CV") related question:
let's say we have a conveyor belt in a waste facility, which sequentially carries a stream of waste
w"See full answer
"I gave multiple answers including polling the service every 10 sec to see customer. Or we can have the client side call which will send this data after 10 sec to us. We will store in dynamo DB and then send through pipelines to redshift DB for analytics."
Deepti K. - "I gave multiple answers including polling the service every 10 sec to see customer. Or we can have the client side call which will send this data after 10 sec to us. We will store in dynamo DB and then send through pipelines to redshift DB for analytics."See full answer
"[I'm not sure whether the answer below is the best, as I have not gotten result and feedback from my interview]
Ans: I would solve by first using a VAE-style model, to create a latent space embedding that translates user description to generate images. Training would be done on the 1000 avatar images and 100000 descriptions, following this scheme:
VAE:
description -> encoder -> latent space -> decoder -> image
Q: "OK, but that means you're limiting the generated images to be only the 1000 imag"
Nick S. - "[I'm not sure whether the answer below is the best, as I have not gotten result and feedback from my interview]
Ans: I would solve by first using a VAE-style model, to create a latent space embedding that translates user description to generate images. Training would be done on the 1000 avatar images and 100000 descriptions, following this scheme:
VAE:
description -> encoder -> latent space -> decoder -> image
Q: "OK, but that means you're limiting the generated images to be only the 1000 imag"See full answer
"No ,MSE is suitable for only regression modes. Although the logistic regression in Its name has regression , but it is a classification problem so MSE is not suitable for classification models like logistic regression."
loknadh R. - "No, MSE is suitable only for regression models. Although logistic regression has 'regression' in its name, it solves a classification problem, so MSE is not a suitable loss for it. Pairing MSE with a sigmoid output also yields a non-convex objective whose gradients vanish when the model is confidently wrong, which is why log loss (cross-entropy) is used instead."
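The vanishing-gradient point can be shown numerically; this sketch (not part of the answer) compares the gradient of each loss with respect to the logit for a confidently wrong prediction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One training example: true label 1, but the logit says "confident 0".
y, z = 1.0, -8.0
p = sigmoid(z)  # predicted probability, roughly 0.0003

# Cross-entropy: dL/dz = p - y, a large corrective signal here.
grad_ce = p - y
# MSE: dL/dz = 2 * (p - y) * p * (1 - p), which vanishes as p saturates.
grad_mse = 2 * (p - y) * p * (1 - p)

print(f"cross-entropy gradient: {grad_ce:.4f}")   # about -1.0
print(f"MSE gradient:           {grad_mse:.6f}")  # about -0.0007
```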
"Switching from a linear kernel to RBF / Gaussian kernel is likely to result in overfitting the model. It is a move that adds complexity to the mix, and if the data doesn't need that sort of complexity, it would result in overfitting. On the other hand, all the other three approaches would only try too reduce complexity in the process, thereby doesn't contribute to overfitting the model."
Sri V. - "Switching from a linear kernel to RBF / Gaussian kernel is likely to result in overfitting the model. It is a move that adds complexity to the mix, and if the data doesn't need that sort of complexity, it would result in overfitting. On the other hand, all the other three approaches would only try too reduce complexity in the process, thereby doesn't contribute to overfitting the model."See full answer
"AUC 0.5 equates to a random model, so when creating any machine learning model or statistical model, you ideally want your model to at least beat this random baseline."
Harsh S. - "AUC 0.5 equates to a random model, so when creating any machine learning model or statistical model, you ideally want your model to at least beat this random baseline."See full answer
"Random Forest is a machine learning model used for classification problems or regression problems. It can handle binary classification as well as multi-class classification. It is a very efficient model and is great for a baseline or used in a service that needs extremely low latency depending on the size of the model. It's also a good option for wide datasets (dataset with many features) due to it's random subset of features. it is slightly less optimized for deep datasets on very large dataset"
Jake M. - "Random Forest is a machine learning model used for classification problems or regression problems. It can handle binary classification as well as multi-class classification. It is a very efficient model and is great for a baseline or used in a service that needs extremely low latency depending on the size of the model. It's also a good option for wide datasets (dataset with many features) due to it's random subset of features. it is slightly less optimized for deep datasets on very large dataset"See full answer