Machine Learning Engineer Questions And Answers

21⟩ Can you explain the difference between inductive machine learning and deductive machine learning?

In inductive machine learning, the model learns from a set of observed instances and generalizes from them to a conclusion. In deductive learning, the model starts from rules or premises that are already established and applies them to reach conclusions about specific cases. Let’s understand this with an example: suppose you have to teach a kid that playing with fire can cause burns. There are two ways to go about it. You can show the kid training examples, such as images of various fire accidents or of burn injuries, each labeled “Hazardous”. The kid then learns from the examples that fire is dangerous and does not play with it. This is referred to as inductive machine learning. The other way is to give the kid the rule directly: fire burns skin. Whenever the kid comes across a fire, they apply the rule to that particular fire, conclude that it will burn them, and avoid going near it. This is referred to as deductive learning.

22⟩ How would you explain machine learning to a layperson?

Machine learning is all about making decisions based on previous experience with a task, with the intent of improving performance on that task. There are several examples you can use to explain machine learning to a layperson:

☛ Imagine a curious kid who sticks his palm

☛ You have observed among your acquaintances that obese people tend to develop heart disease more often, so you decide to try to stay thin to lower your own risk. You have observed a lot of data and come up with a general rule of classification.

☛ You are playing blackjack and based on the sequence of cards you see, you decide whether to hit or to stay. In this case based on the previous information you have and by looking at what happens, you make a decision quickly.

23⟩ What is the most frequently used metric to assess model accuracy for classification problems?

Percent Correct Classification (PCC) measures overall accuracy as the share of cases that are classified correctly. It ignores the kind of error that is made: all errors are considered to have the same weight.
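
As a rough illustration of the metric described above, PCC can be computed by counting matching predictions. This is only a sketch: the function name `percent_correct` and the spam/ham labels are made up for the example and do not come from any particular library.

```python
def percent_correct(y_true, y_pred):
    """Return the percentage of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Every mismatch counts the same, whatever kind of error it is.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Hypothetical labels, for illustration only.
y_true = ["spam", "ham", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "spam", "spam", "ham"]
pcc = percent_correct(y_true, y_pred)
print(pcc)
```

Four of the five predictions match, so the single false "spam" costs exactly as much as a missed "spam" would.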

24⟩ Why is the Naïve Bayes machine learning algorithm naïve?

The Naïve Bayes machine learning algorithm is considered naïve because the assumption it makes is virtually impossible to find in real-life data. The conditional probability of the features is calculated as a pure product of the individual probabilities of each feature. This means the algorithm assumes that, given the class variable, the presence or absence of one feature is not related to the presence or absence of any other feature (conditional independence of features). For instance, a fruit may be considered to be a banana if it is yellow, long and about 5 inches in length. Even if these features depend on each other or on the existence of other features, a naïve Bayes classifier will treat each of them as contributing independently to the probability that this fruit is a banana. A dataset in which all features are equally important and independent of one another rarely exists in the real world.
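
The "pure product" in the answer above can be sketched in a few lines. All probabilities below are invented for the banana example, and the helper name `naive_bayes_score` is hypothetical; the point is only that each feature's likelihood is multiplied in as if the features were unrelated.

```python
def naive_bayes_score(prior, feature_likelihoods):
    """Unnormalized P(class | features) under the independence assumption."""
    score = prior
    for likelihood in feature_likelihoods:
        # Each feature contributes independently: just multiply.
        score *= likelihood
    return score


# Made-up numbers: P(yellow | class), P(long | class), P(~5 inches | class).
banana = naive_bayes_score(prior=0.5, feature_likelihoods=[0.8, 0.9, 0.7])
other = naive_bayes_score(prior=0.5, feature_likelihoods=[0.3, 0.1, 0.2])

# Normalize so the two class scores sum to 1.
p_banana = banana / (banana + other)
print(round(p_banana, 3))
```

If "long" and "about 5 inches" are strongly correlated, multiplying both likelihoods counts the same evidence twice, which is exactly the naïve part.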

25⟩ What is your training in machine learning, and what types of hands-on experience do you have?

Your answer to this question will depend on your training in machine learning. Be sure to emphasize any direct projects you’ve completed as part of your education. Don’t fail to mention any additional experience that you have including certifications and how they have prepared you for your role in the machine learning field.

26⟩ What do you think of our current data process?

This kind of question requires you to listen carefully and impart feedback in a manner that is constructive and insightful. Your interviewer is trying to gauge if you’d be a valuable member of their team and whether you grasp the nuances of why certain things are set the way they are in the company’s data process based on company- or industry-specific conditions. They’re trying to see if you can be an intellectual peer. Act accordingly.

27⟩ How do you ensure you’re not overfitting with a model?

This is a simple restatement of a fundamental problem in machine learning: the possibility of overfitting training data and carrying the noise of that data through to the test set, thereby providing inaccurate generalizations.

There are three main methods to avoid overfitting:

1- Keep the model simpler: reduce variance by taking into account fewer variables and parameters, thereby removing some of the noise in the training data.

2- Use cross-validation techniques such as k-fold cross-validation, which estimate how the model performs on data it was not trained on.

3- Use regularization techniques such as LASSO that penalize certain model parameters if they’re likely to cause overfitting.
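
To make the cross-validation item in the list above concrete, here is a minimal sketch of how k-fold splitting partitions the data. The generator name `k_fold_indices` is an assumption for this example; real projects would normally use a library routine and shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))
for train, validation in folds:
    print(validation, train)
```

Each sample is held out exactly once, so averaging the validation score over the folds gives an estimate of generalization that does not depend on a single lucky split.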

28⟩ What is Bayes’ Theorem? How is it useful in a machine learning context?

Bayes’ Theorem gives you the posterior probability of an event given what is known as prior knowledge.

Mathematically, it’s expressed as the true positive rate of a condition sample divided by the sum of the false positive rate of the population and the true positive rate of a condition. Say a flu test comes back positive for 60% of the people who actually have the flu, but it also comes back falsely positive for 50% of the people who do not have the flu, and the overall population only has a 5% chance of having the flu. Would you actually have a 60% chance of having the flu after a positive test?

Bayes’ Theorem says no. It says that

P(flu | positive) = P(positive | flu) P(flu) / [P(positive | flu) P(flu) + P(positive | no flu) P(no flu)]

which gives (0.6 * 0.05) (True Positive Rate of a Condition Sample) / [(0.6 * 0.05) (True Positive Rate of a Condition Sample) + (0.5 * 0.95) (False Positive Rate of a Population)] = 0.0594, or a 5.94% chance of having the flu.

In a machine learning context, Bayes’ Theorem is the basis of the Naive Bayes classifier and, more generally, of methods that update a prior belief as data is observed.
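
The flu calculation above can be checked with a short function. This is a sketch under the numbers given in the answer (60% true positive rate, 50% false positive rate, 5% prevalence); the function and parameter names are chosen for this example only.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive test) by Bayes' Theorem."""
    # Probability of a positive test coming from someone with the condition.
    true_pos = true_positive_rate * prior
    # Probability of a positive test coming from someone without it.
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


p_flu = posterior(prior=0.05, true_positive_rate=0.6, false_positive_rate=0.5)
print(f"{p_flu:.4f}")
```

The posterior stays close to the 5% prior because the test is positive almost as often for healthy people as for sick ones, so a positive result carries little information.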

29⟩ How is KNN different from k-means clustering?

K-Nearest Neighbors is a supervised classification algorithm, while k-means clustering is an unsupervised clustering algorithm. While the mechanisms may seem similar at first, what this really means is that for K-Nearest Neighbors to work, you need labeled data: an unlabeled point is classified by looking at the labels of the labeled points closest to it (thus the nearest neighbor part). K-means clustering requires only a set of unlabeled points and the number of clusters k: the algorithm assigns each point to its nearest cluster center, recomputes each center as the mean of the points assigned to it, and repeats until the assignments stop changing.

The critical difference here is that KNN needs labeled points and is thus supervised learning, while k-means doesn’t — and is thus unsupervised learning.
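
The contrast can be seen side by side in a small sketch. Both functions below are simplified illustrations written for this answer, not library code: `knn_predict` takes points with labels, while `k_means` takes bare points plus initial centroids (fixed here, rather than random, so the run is repeatable).

```python
from collections import Counter


def distance(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_predict(labeled_points, query, k=3):
    """Supervised: majority label among the k labeled points nearest to query."""
    nearest = sorted(labeled_points, key=lambda item: distance(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def k_means(points, centroids, iterations=10):
    """Unsupervised: alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the assigned points (unchanged if cluster is empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids


labeled = [((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
           ((8.0, 8.0), "b"), ((8.0, 9.0), "b"), ((9.0, 8.0), "b")]
prediction = knn_predict(labeled, (2.0, 2.0))

unlabeled = [point for point, _ in labeled]  # labels thrown away
centers = k_means(unlabeled, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(prediction, centers)
```

Note that `k_means` never sees the "a"/"b" labels: it recovers the two groups from the geometry alone, whereas `knn_predict` cannot run without them.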

30⟩ Please explain what deep learning is, and how it contrasts with other machine learning algorithms?

Deep learning is a subset of machine learning that is concerned with neural networks that have many layers: how to use backpropagation and certain principles loosely inspired by neuroscience to more accurately model large sets of unlabelled or semi-structured data. It is not tied to a single learning setting: deep networks can be trained in a supervised, unsupervised or reinforcement setting. What sets deep learning apart from many other machine learning algorithms is that it learns representations of the data through the layers of the neural net, rather than relying on hand-engineered features.

31⟩ Do you know what the “kernel trick” is and how it is useful?

The kernel trick involves kernel functions that let an algorithm operate in a higher-dimensional space without explicitly calculating the coordinates of points in that space: instead, kernel functions compute the inner products between the images of all pairs of data points in the feature space. This gives them the very useful attribute of yielding the result of a higher-dimensional computation while being computationally cheaper than the explicit calculation of the coordinates. Many algorithms can be expressed in terms of inner products. Using the kernel trick enables us to effectively run such algorithms in a high-dimensional space using lower-dimensional data.
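
A small numeric check may make this concrete. The sketch below uses the degree-2 polynomial kernel k(x, y) = (x · y)^2 on 2-D inputs, chosen here purely as an example; its implicit feature space is 3-D, with the map written out in `explicit_map` for comparison.

```python
import math


def poly_kernel(x, y):
    """Degree-2 polynomial kernel: inner product in feature space, computed in 2-D."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot ** 2


def explicit_map(x):
    """The 3-D feature map that the kernel above corresponds to."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, 4.0)
via_kernel = poly_kernel(x, y)
via_mapping = sum(a * b for a, b in zip(explicit_map(x), explicit_map(y)))
print(via_kernel, via_mapping)
```

Both routes give the same inner product, but the kernel never builds the 3-D vectors. For kernels whose feature space is very large or infinite-dimensional, that shortcut is the only practical option.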

32⟩ Tell me how a ROC curve works?

The ROC curve is a graphical representation of the true positive rate plotted against the false positive rate at various classification thresholds. It’s often used as a proxy for the trade-off between the sensitivity of the model (true positive rate) and the fall-out, the probability that it will trigger a false alarm (false positive rate).
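
The points of a ROC curve can be computed by sweeping the threshold over the model's scores. The following is a minimal sketch with made-up labels and scores; `roc_points` is a name invented for this example, and it assumes both classes are present in `labels`.

```python
def roc_points(labels, scores):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical data: 1 = positive class, scores are model confidences.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
curve = roc_points(labels, scores)
for fpr, tpr in curve:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold moves along the curve from (0, 0) to (1, 1): more true positives are caught, at the price of more false alarms.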

33⟩ What’s the difference between a generative and a discriminative model?

A generative model learns how the data in each category is distributed, while a discriminative model simply learns the distinction between the different categories of data. Discriminative models will generally outperform generative models on classification tasks.

34⟩ Which is more important to you: model accuracy or model performance?

This question tests your grasp of the nuances of machine learning model performance! Machine learning interview questions often look towards the details. There are models with higher accuracy that can perform worse in predictive power. How does that make sense?

Well, it has everything to do with how model accuracy is only a subset of model performance, and at that, a sometimes misleading one. For example, if you wanted to detect fraud in a massive dataset with millions of samples, of which only a tiny minority were fraud, the most accurate model would most likely predict no fraud at all. However, this would be useless as a predictive model: a model designed to find fraud that asserts there is no fraud at all! Questions like this help you demonstrate that you understand model accuracy isn’t the be-all and end-all of model performance.
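
The fraud example above can be reproduced with a toy dataset. The numbers are invented (10 fraud cases in 1,000 transactions), and the "model" is simply a constant prediction of "no fraud".

```python
# 1 = fraud, 0 = legitimate. Made-up data: 1% of cases are fraud.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # a "model" that always predicts no fraud

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
actual_positives = sum(y_true)
recall = true_positives / actual_positives  # share of fraud cases that were caught

print(f"accuracy={accuracy:.2%}  recall={recall:.2%}")
```

The model is 99% accurate yet catches none of the fraud, which is why metrics such as recall or precision are usually reported alongside accuracy on imbalanced data.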

35⟩ What kind of problems does regularization solve?

Regularization is used to address overfitting problems: it penalizes large weights by adding to the loss function a multiple of the L1 norm (LASSO) or the squared L2 norm (Ridge) of your weights vector w.
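
As a sketch of what "adding a multiple of a norm" means in practice, the function below adds the penalty to an already computed data loss. The name `penalized_loss`, the multiplier `lam` and the example numbers are all assumptions made for illustration.

```python
def penalized_loss(data_loss, weights, lam, kind="l2"):
    """Return data_loss plus lam times an L1 or squared-L2 penalty on the weights."""
    if kind == "l1":
        penalty = sum(abs(w) for w in weights)   # LASSO-style penalty
    elif kind == "l2":
        penalty = sum(w * w for w in weights)    # Ridge-style penalty
    else:
        raise ValueError("kind must be 'l1' or 'l2'")
    return data_loss + lam * penalty


weights = [3.0, -4.0]
lasso_loss = penalized_loss(data_loss=1.0, weights=weights, lam=0.1, kind="l1")
ridge_loss = penalized_loss(data_loss=1.0, weights=weights, lam=0.1, kind="l2")
print(lasso_loss, ridge_loss)
```

Because large weights now raise the loss, the optimizer is pushed toward smaller weights, and a larger `lam` strengthens that push.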

36⟩ What is decision tree classification?

A decision tree builds classification (or regression) models in the form of a tree structure. While the tree is being built, the dataset is broken up into ever smaller subsets: each internal node tests a feature, each branch corresponds to an outcome of that test, and each leaf holds a prediction. Decision trees can handle both categorical and numerical data.
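
How a tree decides where to break up the data can be sketched for a single numerical feature. The example below uses Gini impurity as the split criterion, which is one common choice and an assumption here, since the answer does not name a criterion; the function names and data are invented.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure subset, higher for a more mixed one."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Find the threshold t (split: value <= t) with the lowest weighted impurity."""
    best = (None, float("inf"))
    # The largest value is skipped so the right-hand subset is never empty.
    for t in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


values = [1, 2, 3, 10, 11, 12]
labels = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_threshold(values, labels)
print(threshold, impurity)
```

A full tree repeats this search on each resulting subset, over all features, until the subsets are pure or a stopping rule is reached.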

37⟩ Do you have experience with Spark or big data tools for machine learning?

You’ll want to get familiar with the meaning of big data for different companies and the different tools they’ll want. Spark is the big data tool most in demand now, able to handle immense datasets with speed. Be honest if you don’t have experience with the tools demanded, but also take a look at job descriptions and see what tools pop up: you’ll want to invest in familiarizing yourself with them.

38⟩ Why is “Naive” Bayes naive?

Despite its practical applications, especially in text mining, Naive Bayes is considered “Naive” because it makes an assumption that is virtually impossible to see in real-life data: the conditional probability is calculated as the pure product of the individual probabilities of components. This implies the absolute independence of features — a condition probably never met in real life.

As a Quora commenter put it whimsically, a Naive Bayes classifier that figured out that you liked pickles and ice cream would probably naively recommend you a pickle ice cream.

39⟩ What is supervised versus unsupervised learning?

Supervised learning is a process of machine learning in which the “machine” is trained on examples whose correct outputs (labels) are known: its predictions are compared with those outputs, and the errors are fed back so that it produces more accurate results the next time. In contrast, unsupervised learning means a computer learns without labeled outputs, finding structure such as clusters in the data on its own.

40⟩ How much data will you allocate for your training, validation and test sets?

There is no single right answer to this question, but there needs to be a balance when allocating data for training, validation and test sets.

If you make the training set too small, the estimated model parameters might have high variance. If the test set is too small, the estimate of model performance may be unreliable. A general rule of thumb is to use an 80:20 train/test split. The training set can then be split further to set aside a validation set.
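
The rule of thumb above might look like this in code. It is only a sketch: the 80:20 train/test ratio comes from the answer, while the share of the remaining data used for validation (`val_fraction=0.25`), the fixed seed and the function name are assumptions made for the example.

```python
import random


def train_val_test_split(items, test_fraction=0.2, val_fraction=0.25, seed=0):
    """Shuffle, hold out a test set, then carve a validation set from the rest."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable

    n_test = round(len(shuffled) * test_fraction)
    test, rest = shuffled[:n_test], shuffled[n_test:]

    # val_fraction is taken from what remains after the test set is removed.
    n_val = round(len(rest) * val_fraction)
    validation, train = rest[:n_val], rest[n_val:]
    return train, validation, test


train, validation, test = train_val_test_split(range(100))
print(len(train), len(validation), len(test))
```

With these defaults, 100 samples end up as 60 for training, 20 for validation and 20 for testing, and no sample appears in more than one set.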
