\name{SVMFeatureSelectionSystem-package}
\alias{SVMFeatureSelectionSystem-package}
\alias{SVMFeatureSelectionSystem}
\docType{package}
\title{
Multiobjective Feature Selection for SVM and SVR
}
\description{
This package solves the feature selection problem for support vector machines
(SVM) and support vector regression (SVR). In the feature selection problem we
have a set of candidate input variables for the SVM/SVR, and the goal is to
choose a suitable subset of these variables. The package is based on a
multiobjective genetic algorithm called multimodal NSGA-II. The package also
contains tools to visualize the results.
}
\details{
\tabular{ll}{
Package: \tab SVMFeatureSelectionSystem\cr
Type: \tab Package\cr
Version: \tab 1.0\cr
Date: \tab 2013-12-23\cr
License: \tab BUT OPEN SOURCE LICENCE
Version 1. (type licenseSVMFeatureSelectionsSystem()).\cr
}
This package solves the feature selection problem for support vector machines
and support vector regression. The support vector machine (SVM) [1] is a popular
machine learning algorithm for classification tasks. A variant of SVM called
support vector regression (SVR) [2] can be used to solve regression tasks. In
the feature selection problem we have a set of candidate input variables for
the SVM/SVR, and the goal is to choose a suitable subset of these variables.
The package uses a multiobjective genetic algorithm called multimodal
NSGA-II [3] to solve this problem. For classification, the main goal of feature
selection is to find the subset of inputs for which the number of misclassified
samples is minimal. For regression, the main goal is to find the subset of
inputs for which the root mean squared error is minimal. A multiobjective
genetic algorithm [4] was chosen because in practice we also need to minimize
the number of inputs and the number of samples for which one of the input
values is missing.

There are two basic functions in the package. The function
featureSelectionClassificationGA finds suitable SVM inputs for a classification
task, and the function featureSelectionRegressionGA finds suitable SVR inputs
for a regression task. Both functions return a list that contains the fitness
values of the obtained solutions, the trained SVM/SVR for each solution, and
information about the convergence of the genetic algorithm. The convergence of
the multiobjective genetic algorithm can be visualized with the functions
plotConvergenceClassification and plotConvergenceRegression. The most
interesting solutions are usually those that lie on the Pareto front. The
functions filterParetoOptimalClassification and filterParetoOptimalRegression
take the results and filter out solutions that are dominated by other obtained
solutions. The package also provides functions for visualizing the obtained
results: plotParetoFrontClassification and plotParetoFrontRegression plot the
various compromises obtained at the end of the genetic algorithm run. To obtain
predictions for new data, use the functions predictClassification and
predictRegression. The functions mse, rmse, mae, rae and rse provide more
information about the prediction quality of the obtained models.
}
\author{
Ing. Jiri Petrlik <ipetrlik@fit.vutbr.cz>
}

\note{
This software was supported by IT4Innovations Centre of Excellence CZ.1.05/1.1.00/02.0070.
}

\references{
[1] Burges, C., A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, Vol. 2, Issue 2, 1998

[2] Basak, Debasish, Pal, Srimanta, Patranabis, Dipak Chandra, Support Vector Regression, Neural Information Processing - Letters and Reviews, Vol. 11, No. 10, 2007

[3] Deb, Kalyanmoy, Raji Reddy, A., Reliable classification of two-class cancer data using evolutionary algorithms, Biosystems, 2003

[4] Deb, Kalyanmoy, Multi-Objective Optimization using Evolutionary Algorithms, Wiley, 2009
}

\keyword{ feature selection, SVM, SVR, multimodal NSGAII }

\examples{

# Example of classification task 1:
library(SVMFeatureSelectionSystem);
library(classifly);

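# Columns 3 to 10 are the input features; column 2 is the predicted class.
features<-colnames(olives)[3:10];
predictedVariable<-colnames(olives)[2];

# Shuffle the rows and split them into two halves (training and test data).
shuffleOlives<-olives[sample(nrow(olives)),];
dataTrain<-shuffleOlives[1:286,];
dataTest<-shuffleOlives[287:572,];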

results<-featureSelectionClassificationGA(dataTrain,dataTest,predictedVariable,
  features);

# Example of classification task 2:
library(SVMFeatureSelectionSystem);
library(plsgenomics);
library(stringr);

data(Colon);
ColonDataset<-as.data.frame(Colon$X);
colnames(ColonDataset)<-str_trim(Colon$gene.names);
ColonDataset<-cbind(ColonDataset,result=Colon$Y);

ColonDataset[ColonDataset[,"result"]==1,"result"]<-"normal";
ColonDataset[ColonDataset[,"result"]==2,"result"]<-"tumor";
ColonDataset[,"result"]<-as.factor(ColonDataset[,"result"]);

ColonDataset<-ColonDataset[sample(nrow(ColonDataset)),];
trainColon<-ColonDataset[1:31,];
testColon<-ColonDataset[32:62,];

features<-colnames(trainColon)[1:2000];
predictedVariable<-colnames(trainColon)[2001];

results<-featureSelectionClassificationGA(trainColon,testColon,
  predictedVariable,features,popSize=100,generations=200);
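
# Sketch of Pareto filtering in base R (illustration only):
# The details section says that solutions dominated by other solutions are
# filtered out. The code below shows one way such a filter could work on a
# plain matrix of objective values. It is NOT the implementation behind
# filterParetoOptimalClassification, whose arguments are not shown here.
# The helper name paretoFilterSketch and the toy values are hypothetical,
# and all objectives are assumed to be minimized.
paretoFilterSketch<-function(objectives) {
  n<-nrow(objectives);
  dominated<-logical(n);
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Row j dominates row i if it is no worse in every objective
      # and strictly better in at least one.
      if (i!=j && all(objectives[j,]<=objectives[i,]) &&
          any(objectives[j,]<objectives[i,])) {
        dominated[i]<-TRUE;
        break;
      }
    }
  }
  objectives[!dominated,,drop=FALSE];
}

# Toy objectives: number of misclassified samples and number of inputs.
toyObjectives<-cbind(errors=c(5,3,3,8),inputs=c(2,4,6,1));
paretoFront<-paretoFilterSketch(toyObjectives);
print(paretoFront);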

# Example of regression task 1:
library(SVMFeatureSelectionSystem);
library(rpart);

car90$Country<-as.numeric(car90$Country);
car90$Model2<-as.numeric(car90$Model2);
car90$Reliability<-as.numeric(car90$Reliability);
car90$Rim<-as.numeric(car90$Rim);
car90$Steering<-as.numeric(car90$Steering);
car90$Tires<-as.numeric(car90$Tires);
car90$Trans1<-as.numeric(car90$Trans1);
car90$Trans2<-as.numeric(car90$Trans2);
car90$Type<-as.numeric(car90$Type);

features<-colnames(car90)[colnames(car90)!="Price"];
predictedVariable<-"Price";

shuffleCar90<-car90[sample(nrow(car90)),];
trainData<-shuffleCar90[1:80,];
testData<-shuffleCar90[81:111,];

results<-featureSelectionRegressionGA(trainData,testData,predictedVariable,
  features);

# Example of regression task 2:
library(SVMFeatureSelectionSystem);
library(mlbench);

sim<-mlbench.friedman1(100);
trainData<-as.data.frame(sim$x);
trainData<-cbind(trainData,V11=sim$y);

sim<-mlbench.friedman1(100);
testData<-as.data.frame(sim$x);
testData<-cbind(testData,V11=sim$y);

features<-colnames(trainData)[1:10];
predictedVariable<-"V11";

results<-featureSelectionRegressionGA(trainData,testData,predictedVariable,
  features);
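
# Sketch of common error measures in base R (illustration only):
# The details section mentions the functions mse, rmse and mae for judging
# prediction quality. Their arguments are not shown in this help page, so the
# helpers below use hypothetical names with a Sketch suffix and may differ from
# the package functions. They only show the usual textbook definitions.
mseSketch<-function(actual,predicted) mean((actual-predicted)^2);
rmseSketch<-function(actual,predicted) sqrt(mseSketch(actual,predicted));
maeSketch<-function(actual,predicted) mean(abs(actual-predicted));

# Toy values, not produced by any model in this package.
actualToy<-c(1,2,3,4);
predictedToy<-c(1,2,3,6);
print(c(mse=mseSketch(actualToy,predictedToy),
  rmse=rmseSketch(actualToy,predictedToy),
  mae=maeSketch(actualToy,predictedToy)));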
}
