"[I'm not sure whether the answer below is the best, as I have not gotten result and feedback from my interview]
Ans: I would solve by first using a VAE-style model, to create a latent space embedding that translates user description to generate images. Training would be done on the 1000 avatar images and 100000 descriptions, following this scheme:
VAE:
description -> encoder -> latent space -> decoder -> image
Q: "OK, but that means you're limiting the generated images to be only the 1000 imag"
Nick S. - "[I'm not sure whether the answer below is the best, as I have not yet received results or feedback from my interview.]
Ans: I would solve this by first using a VAE-style model to create a latent-space embedding that translates user descriptions into generated images. Training would be done on the 1,000 avatar images and 100,000 descriptions, following this scheme:
VAE:
description -> encoder -> latent space -> decoder -> image
Q: "OK, but that means you're limiting the generated images to be only the 1,000 imag..."
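The description -> encoder -> latent space -> decoder -> image scheme can be sketched as a single forward pass. This is only a toy illustration in plain Python, not the answerer's implementation: the weights are random and untrained, and `embed_description`, the layer sizes, and the 8x8 image size are all hypothetical choices made for the example.

```python
import math
import random

random.seed(0)

# Toy sizes, chosen only for illustration.
EMBED_DIM, LATENT_DIM, IMAGE_PIXELS = 16, 4, 8 * 8

def embed_description(text):
    """Hypothetical stand-in for a text encoder: a hashed bag of words."""
    vec = [0.0] * EMBED_DIM
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % EMBED_DIM] += 1.0
    return vec

def random_matrix(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

def linear(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

# Untrained weights; in practice they would be learned from
# (description, image) pairs.
W_mu = random_matrix(LATENT_DIM, EMBED_DIM)
W_logvar = random_matrix(LATENT_DIM, EMBED_DIM)
W_dec = random_matrix(IMAGE_PIXELS, LATENT_DIM)

def encode(description_vec):
    # The encoder outputs the mean and log-variance of the latent distribution.
    return linear(W_mu, description_vec), linear(W_logvar, description_vec)

def sample_latent(mu, logvar):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    return [m + math.exp(0.5 * lv) * random.gauss(0, 1)
            for m, lv in zip(mu, logvar)]

def decode(z):
    # A sigmoid keeps every pixel intensity in (0, 1).
    return [1.0 / (1.0 + math.exp(-a)) for a in linear(W_dec, z)]

def generate(text):
    mu, logvar = encode(embed_description(text))
    return decode(sample_latent(mu, logvar))

image = generate("smiling avatar with red hair")
print(len(image))  # number of pixels in the flattened 8x8 image
```

Because the latent vector is sampled rather than looked up, two calls with the same description return different pixel values in this sketch.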
"Zero in on the problem, the expectations of user are to find a restaurant but their feed is uninspired so they may bounce out of Yelp.
Identify the impact size of user feeling like discovery is not personalised enough by seeing % of users that selected a restaurant from the homepage
If large enough, I will look at who is likely the ones that want personalisation and why? Do they feel like they want to try new restaurants or are they finding it difficult to find restaurants they have been"
Chermaine Y. - "Zero in on the problem: users expect to find a restaurant, but their feed is uninspiring, so they may bounce out of Yelp.
Size the impact of users feeling that discovery is not personalised enough by looking at the % of users who selected a restaurant from the homepage.
If the impact is large enough, I will look at which users are most likely to want personalisation, and why. Do they want to try new restaurants, or are they finding it difficult to find restaurants they have been..."
"No ,MSE is suitable for only regression modes. Although the logistic regression in Its name has regression , but it is a classification problem so MSE is not suitable for classification models like logistic regression."
1036 loknadh R. - "No, MSE is suitable only for regression models. Although logistic regression has 'regression' in its name, it solves a classification problem, so MSE is not a suitable loss for it. Combined with the sigmoid output, MSE gives a non-convex loss, whereas log loss (cross-entropy) is convex and is the standard choice."
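One common way to see why MSE is a poor fit here is to compare gradients for a confidently wrong prediction. The sketch below is an illustration rather than part of the original answer; the logit value `z = -8.0` and the function names are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mse_grad_wrt_logit(z, y):
    # d/dz of (sigmoid(z) - y)^2 = 2 * (p - y) * p * (1 - p)
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)

def log_loss_grad_wrt_logit(z, y):
    # d/dz of -[y*log(p) + (1 - y)*log(1 - p)] = p - y
    return sigmoid(z) - y

# The true label is 1, but the model is confidently wrong (p close to 0).
z, y = -8.0, 1.0
mse_g = mse_grad_wrt_logit(z, y)
ll_g = log_loss_grad_wrt_logit(z, y)
print(f"MSE gradient:      {mse_g:.6f}")
print(f"log-loss gradient: {ll_g:.6f}")
```

The extra `p * (1 - p)` factor shrinks the MSE gradient toward zero exactly when the prediction is most wrong, while the log-loss gradient stays close to -1, so gradient descent corrects the mistake much faster under log loss.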
"Switching from a linear kernel to RBF / Gaussian kernel is likely to result in overfitting the model. It is a move that adds complexity to the mix, and if the data doesn't need that sort of complexity, it would result in overfitting. On the other hand, all the other three approaches would only try too reduce complexity in the process, thereby doesn't contribute to overfitting the model."
Sri V. - "Switching from a linear kernel to an RBF (Gaussian) kernel is the change most likely to cause overfitting. It adds model complexity, and if the data doesn't need that complexity, the model will overfit. The other three approaches only try to reduce complexity, so they do not contribute to overfitting."
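How much complexity an RBF kernel adds depends on its width. The sketch below assumes a width parameter called `gamma` (the answer does not mention one) and uses four toy points to show what happens to the kernel matrix as `gamma` grows. It illustrates the mechanism only and does not train an SVM.

```python
import math

def rbf_kernel(a, b, gamma):
    # k(a, b) = exp(-gamma * ||a - b||^2)
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

# Toy data: the corners of the unit square.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def gram(gamma):
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

def off_diagonal(matrix):
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]

smooth = gram(gamma=0.1)   # wide kernel: neighbouring points look similar
spiky = gram(gamma=50.0)   # narrow kernel: each point resembles only itself

print("min off-diagonal, gamma=0.1:", min(off_diagonal(smooth)))
print("max off-diagonal, gamma=50: ", max(off_diagonal(spiky)))
```

With a large `gamma` the kernel matrix is close to the identity, which means the model can fit any labelling of the training points, including noise. A linear kernel has no comparable way to memorise individual points.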
"AUC 0.5 equates to a random model, so when creating any machine learning model or statistical model, you ideally want your model to at least beat this random baseline."
Harsh S. - "An AUC of 0.5 corresponds to a random model, so when building any machine learning or statistical model, you want it to at least beat this random baseline."
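To make the 0.5 baseline concrete, AUC can be computed as the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counting half. The sketch below uses that pairwise definition on made-up labels and scores; the `auc` helper is written for this example and is not a library function.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 0, 1, 0, 1, 0]

constant = auc(labels, [0.5] * 6)                       # uninformative scores
perfect = auc(labels, [0.9, 0.1, 0.8, 0.2, 0.7, 0.3])   # positives always higher
inverted = auc(labels, [0.1, 0.9, 0.2, 0.8, 0.3, 0.7])  # positives always lower

print(constant, perfect, inverted)  # prints: 0.5 1.0 0.0
```

A model that gives every example the same score lands exactly on 0.5, and purely random scores give 0.5 in expectation. An AUC well below 0.5 means the ranking is informative but inverted.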
"Random Forest is a machine learning model used for classification problems or regression problems. It can handle binary classification as well as multi-class classification. It is a very efficient model and is great for a baseline or used in a service that needs extremely low latency depending on the size of the model. It's also a good option for wide datasets (dataset with many features) due to it's random subset of features. it is slightly less optimized for deep datasets on very large dataset"
Jake M. - "Random Forest is a machine learning model used for classification or regression problems. It can handle binary as well as multi-class classification. It is an efficient model that makes a good baseline, and, depending on the size of the model, it can be used in a service that needs extremely low latency. It's also a good option for wide datasets (datasets with many features) because it considers a random subset of features. It is slightly less optimized for deep datasets on very large dataset..."
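The "random subset of features" idea can be sketched as follows. The snippet assumes the common heuristic of considering about sqrt(n_features) candidates at each split; the answer does not state a subset size, and `candidate_features` is a hypothetical helper, not a library call.

```python
import math
import random

def candidate_features(n_features, rng):
    """Pick the random feature subset considered at one split (sqrt heuristic)."""
    k = max(1, int(math.sqrt(n_features)))
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(0)
n_features = 100  # a "wide" toy dataset

# Each split draws its own subset, so different trees and splits
# look at different features.
splits = [candidate_features(n_features, rng) for _ in range(3)]
for subset in splits:
    print(subset)
```

With 100 features, each split evaluates only 10 candidates, which keeps the per-split cost low on wide data and reduces the correlation between trees.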