Using cross-validation technique for a CNN model?
I am working on a CNN model. As usual, I train it in batches over several epochs. Once training and validation are complete, I use a test set to measure the model's performance and generate a confusion matrix. Now I want to use cross-validation to train my model. I can implement it, but I have some questions:
1- Why do most CNN models not use the cross-validation technique?
2- If I use cross-validation, how can I generate a confusion matrix? Can I split the dataset into train/test, then do cross-validation on the train set as train/validation (i.e. doing cross-validation as train/validation instead of the usual train/test), and at last use the test set the same way? Or how?
python deep-learning
asked 4 hours ago
honar.cs
1 Answer
Question 1: Why do most CNN models not apply the cross-validation technique?
$k$-fold cross-validation is often used for simple models that have few parameters, have simple hyperparameters, and are easy to optimize. Typical examples are linear regression, logistic regression, small neural networks and support vector machines.
For a convolutional neural network with many parameters (e.g. more than one million), there are simply too many possible changes to the architecture. What you can do is run some experiments with the learning rate, batch size, dropout (amount and position) and batch normalization (position). Training a convolutional neural network on a huge dataset takes quite a long time, and $k$-fold cross-validation repeats that training $k$ times for every hyperparameter configuration, so a full hyperparameter optimization is usually overkill. Research papers often aim to improve on the results of other papers. The goal there is not to get better results by tuning the chosen hyperparameters, but to come up with new ideas that solve the given task with better accuracy or less computational effort.
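To make the cost argument concrete, here is a rough back-of-the-envelope sketch in Python. The grid values, the number of folds and the six hours per training run are all made-up assumptions for illustration, not measurements or recommendations.

```python
from itertools import product

# Hypothetical search grid: the values are illustrative only.
grid = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 64, 128],
    "dropout": [0.0, 0.25, 0.5],
}
k = 5                # number of folds (assumed)
hours_per_run = 6.0  # assumed time to train the CNN once

# Every combination of grid values is one hyperparameter configuration.
configs = list(product(*grid.values()))

# k-fold cross-validation trains each configuration k times.
runs = len(configs) * k
cv_hours = runs * hours_per_run
single_split_hours = len(configs) * hours_per_run

print(f"{len(configs)} configurations x {k} folds = {runs} training runs")
print(f"about {cv_hours:.0f} hours with k-fold CV, "
      f"{single_split_hours:.0f} hours with one fixed validation split")
```

Even this small grid needs far more training runs with cross-validation than with a single validation split, which is why a few hand-picked experiments are the more common choice for large networks.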
Question 2: If I use cross-validation, how can I generate a confusion matrix? Can I split the dataset into train/test, then do cross-validation on the train set as train/validation (i.e. doing cross-validation as train/validation instead of the usual train/test), and at last use the test set the same way? Or how?
In order to do $k$-fold cross-validation, you first split your initial dataset into two parts: one for the hyperparameter optimization and one for the final validation (this held-out part plays the role of your test set). Then we take the dataset for the hyperparameter optimization and split it into $k$ (hopefully) equally sized datasets $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_k$.
For the sake of clarity, let us set $k=3$. For each hyperparameter combination that we want to test, we fit our model on $\mathcal{D}_1$ and $\mathcal{D}_2$ and validate it on $\mathcal{D}_3$. Then we fit on $\mathcal{D}_2$ and $\mathcal{D}_3$ and validate on $\mathcal{D}_1$. Finally, we fit on $\mathcal{D}_1$ and $\mathcal{D}_3$ and validate on $\mathcal{D}_2$. This gives $3$ confusion matrices for every hyperparameter configuration. To derive a metric from these three results, we take the mean of the confusion matrices.
Then we can scan through all averaged confusion matrices to select the hyperparameter configuration that was the best (you have to define which parts of the confusion matrix are important for your problem). Finally, we pick the 'best' hyperparameters and calculate the prediction performance on the final validation set. These performance metrics are the ones that you report.
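The procedure above can be sketched in plain Python. This is only an illustration under several assumptions: a 1-D nearest-neighbour vote (`fit_predict`) stands in for "train the CNN and predict", the data is synthetic, accuracy is used as the selection criterion, and the final model is fitted on the whole optimization dataset before the held-out part is scored. All function names are hypothetical helpers, not part of any library.

```python
import random

def confusion_matrix(y_true, y_pred, n_classes=2):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices and deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_predict(train, x_eval, n_neighbors):
    """Stand-in for training a CNN: majority vote of the nearest samples."""
    preds = []
    for x in x_eval:
        nearest = sorted(train, key=lambda s: abs(s[0] - x))[:n_neighbors]
        votes = sum(label for _, label in nearest)
        preds.append(1 if 2 * votes > len(nearest) else 0)
    return preds

def mean_confusion_matrix(data, k, n_neighbors):
    """Fit on k-1 folds, validate on the remaining one, average the k matrices."""
    folds = k_fold_indices(len(data), k)
    total = [[0, 0], [0, 0]]
    for fold in folds:
        held_out = set(fold)
        train = [data[j] for j in range(len(data)) if j not in held_out]
        val = [data[j] for j in fold]
        preds = fit_predict(train, [x for x, _ in val], n_neighbors)
        cm = confusion_matrix([y for _, y in val], preds)
        total = [[total[r][c] + cm[r][c] for c in range(2)] for r in range(2)]
    return [[v / k for v in row] for row in total]

def accuracy(cm):
    return (cm[0][0] + cm[1][1]) / sum(map(sum, cm))

# Synthetic two-class data: (feature, label) pairs.
rng = random.Random(42)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(60)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(60)]
rng.shuffle(data)

# Split once: one part for hyperparameter optimization, one held out.
tuning, held_out_set = data[:90], data[90:]

# One averaged confusion matrix per hyperparameter candidate (k = 3).
candidates = [1, 5, 15]
mean_cms = {n: mean_confusion_matrix(tuning, k=3, n_neighbors=n) for n in candidates}
best_n = max(candidates, key=lambda n: accuracy(mean_cms[n]))

# Final evaluation on the held-out part with the selected hyperparameter.
final_preds = fit_predict(tuning, [x for x, _ in held_out_set], best_n)
final_cm = confusion_matrix([y for _, y in held_out_set], final_preds)
print("best n_neighbors:", best_n)
print("held-out confusion matrix:", final_cm)
```

For a real CNN, `fit_predict` would be replaced by your own training and prediction code, and `accuracy` by whichever entries of the confusion matrix matter for your problem.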
answered 4 hours ago
MachineLearner