Thesis on avalanche prediction using One Class SVM
I'm doing my thesis on avalanche prediction using machine learning.
For my input features I'm using avalanche accidents with features like slope, altitude, and facing direction of the slope, combined with the corresponding weather data from the day the avalanche occurred.
I want to predict an avalanche when certain variables combine and create a deadly avalanche situation. So 1: the avalanche occurs. 0: the avalanche does not occur. The only data in my database are avalanches that occurred; I have around 200 samples. So I don't have any data on non-deadly situations, even though those are the most common case.
My question is whether a One Class SVM is a good approach for this classification.
machine-learning python
asked 20 hours ago
Pieter De Malsche
3 Answers
Your problem looks like novelty detection, which belongs to the general area of one-class classification (OCC) problems.
So, the short answer is: yes. You can apply the SVDD (Support Vector Data Description) method to find the smallest hypersphere that contains the samples in your dataset and then assess whether a new observation is an outlier or not.
Of course, the less representative your dataset is, the less accurate your classifier will be.
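As a rough illustration of the hypersphere idea, here is a minimal sketch using only the Python standard library. It is not a full SVDD: it approximates the smallest ball enclosing all points with the Badoiu-Clarkson iteration, which corresponds to the hard-margin case in the original feature space (no kernel, no slack). The function names and the toy feature vectors are made up for illustration, and the sketch assumes the features are already scaled to comparable ranges. In practice you would more likely use scikit-learn's `OneClassSVM` with an RBF kernel.

```python
import math


def enclosing_ball(points, iterations=1000):
    """Approximate the smallest hypersphere that contains all points.

    Badoiu-Clarkson iteration: repeatedly move the centre towards the
    current farthest point, using a step size that shrinks each round.
    """
    center = tuple(points[0])
    for i in range(1, iterations + 1):
        farthest = max(points, key=lambda p: math.dist(center, p))
        step = 1.0 / (i + 1)
        center = tuple(c + step * (f - c) for c, f in zip(center, farthest))
    # The radius is the distance to the farthest training point.
    radius = max(math.dist(center, p) for p in points)
    return center, radius


def is_novel(point, center, radius):
    """True if the point lies outside the fitted hypersphere."""
    return math.dist(center, tuple(point)) > radius


# Made-up, already-scaled feature vectors standing in for avalanche records.
avalanche_points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
center, radius = enclosing_ball(avalanche_points)

print(is_novel((1.2, 0.9), center, radius))  # a point near the middle of the data
print(is_novel((5.0, 5.0), center, radius))  # a point far away from all records
```

Note the direction of the labels in this setting: a point inside the ball resembles the recorded avalanche situations (class 1), while a point outside is flagged as unlike anything in the database.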
answered 18 hours ago
sentence
You can use data mining methods to predict avalanches. However, there are some pitfalls, which I can point out based on my basic avalanche knowledge from mountaineering.
- What do you want to predict? Spontaneous avalanches (mainly threatening villages and roads) or human-triggered avalanches (mainly affecting skiers)? The factors for these are completely different.
- Getting data has already been mentioned. There are some datasets on avalanche incidents, for example at the Swiss avalanche research institute: https://www.slf.ch/de/lawinen/unfaelle-und-schadenlawinen/alle-gemeldeten-lawinenunfaelle-aktuell.html
However, there is naturally little data about cases where no avalanche was triggered, or where an avalanche was triggered but nobody was harmed and it was therefore not reported. There have been some attempts to estimate the number of people on tour based on touring reports on the internet.
- Getting precise data is even more of a problem. Consider figure 2 in this week's report: https://www.slf.ch/de/lawinenbulletin-und-schneesituation/wochen-und-winterberichte/201819/wob-18-25-april.html
It compares the same slope 45 minutes apart, and it looks completely different.
- Feature selection is another big issue. You mention that you want to use weather data from the day of the incident. I think this leads to wrong conclusions, as most skiing avalanches happen at weekends and probably in slightly better weather. Also, most people will be sensible and not go on risky tours on risky days. This has a big potential to skew your data and your model.
answered 13 hours ago
Manziel
Could you look for any possible way to get non-avalanche data?
1) Avalanches happen in mountain chains. Could you add data from neighboring peaks on the same day an avalanche happened?
2) You may get good insights from data exploration. For instance, what is the minimal slope a mountain must have to be able to produce avalanches? What is the range of temperatures?
3) Could you look for other datasets (with non-avalanche entries) that you could combine with your data?
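The exploration suggested in point 2 could start with something as simple as the sketch below, which reports the observed minimum and maximum of each feature across the avalanche records. The feature names (`slope_deg`, `altitude_m`, `temp_c`) and all values are hypothetical placeholders, not real avalanche data.

```python
def feature_ranges(records):
    """Return {feature_name: (min, max)} over a list of dict records."""
    ranges = {}
    for record in records:
        for name, value in record.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


# Hypothetical records; replace with rows from the real database.
records = [
    {"slope_deg": 35, "altitude_m": 2100, "temp_c": -4.0},
    {"slope_deg": 42, "altitude_m": 2600, "temp_c": -9.5},
    {"slope_deg": 32, "altitude_m": 1900, "temp_c": 1.0},
]

ranges = feature_ranges(records)
for name, (low, high) in ranges.items():
    print(f"{name}: min={low}, max={high}")
```

Ranges like these only describe the conditions under which avalanches were recorded. They say nothing about how often the same conditions occurred without an avalanche.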
edited 8 hours ago
answered 18 hours ago
Tatyana
It is incorrect that there is no way to learn a classification model if the training data does not contain at least two classes. The field which addresses this is known as "one-class classification" (or anomaly detection, or outlier detection; it probably also has other names). In this field the broad approach is to build a model of the data and define a distance measure and a threshold. New data points which are closer than the threshold are labelled as belonging to the training class, and those which are further away are labelled as the other class. Hope this helps.
– cfogelberg
10 hours ago
Thanks for the comment and explanation. I didn't know that outlier detection can also be treated as classification. It is a very interesting point of view. I have edited my reply so that it does not mislead others.
– Tatyana
8 hours ago
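The general recipe from the comment above (a model of the data, a distance measure, and a threshold) can be sketched with the standard library alone. This is one simple instance under stated assumptions, not the only possibility: the "model" is the per-feature mean and standard deviation, the distance is Euclidean after standardisation, and the threshold is chosen so that a given fraction of the training samples falls inside it. The `keep_fraction` parameter and the sample values are made up for illustration.

```python
import math
import statistics


def fit_distance_model(samples, keep_fraction=0.9):
    """Fit a centre-and-spread model and choose a distance threshold.

    keep_fraction is the share of training samples that should fall
    inside the threshold (a hypothetical tuning knob).
    """
    columns = list(zip(*samples))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant features to avoid division by zero.
    spreads = [statistics.pstdev(col) or 1.0 for col in columns]

    def distance(x):
        # Euclidean distance to the mean after per-feature standardisation.
        return math.sqrt(
            sum(((v - m) / s) ** 2 for v, m, s in zip(x, means, spreads))
        )

    ranked = sorted(distance(s) for s in samples)
    index = min(len(ranked) - 1, max(0, math.ceil(keep_fraction * len(ranked)) - 1))
    return distance, ranked[index]


def label(x, distance, threshold):
    """1 = resembles the training class, 0 = the other class."""
    return 1 if distance(x) <= threshold else 0


# Made-up (slope in degrees, altitude in metres) pairs for avalanche records.
training = [(35, 2100), (38, 2300), (40, 2500), (36, 2200), (42, 2600), (37, 2400)]
distance, threshold = fit_distance_model(training)

print(label((38, 2350), distance, threshold))  # close to the centre of the data
print(label((5, 300), distance, threshold))    # far from every training sample
```

Lowering `keep_fraction` tightens the threshold, so fewer new points are accepted as similar to the training class, at the cost of rejecting some of the training samples themselves.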