Overview
We present BudgetedSVM, a C++ toolbox containing highly optimized implementations of three recently proposed algorithms for scalable training of Support Vector Machine (SVM) approximators: Adaptive Multihyperplane Machines (AMM), Budgeted Stochastic Gradient Descent (BSGD), and Lowrank Linearization SVM (LLSVM). The toolbox also includes Pegasos, a stateoftheart linear SVM solver, as it is a special case of AMM. To avoid confusion, we note that our toolbox is not related to BudgetSVM method described in Dekel, O., Singer, Y., Support Vector Machines on a Budget, Advances in Neural Information Processing Systems (NIPS), 2007.
Some of the main features of BudgetedSVM package are listed below:
 We provide efficient implementations of algorithms for highlyscalable nonlinear SVM training.
 The toolbox can handle largescale, highdimensional data sets that cannot be loaded into memory.
 The toolbox requires constant memory to train accurate models that solve highly nonlinear problems.
 We provide commandline and Matlab interfaces to BudgetedSVM.
 We provide an efficient API that provides functionalities for handling largescale, highdimensional data sets. Using BudgetedSVM API, data sets with millions data points and/or features are easily handled. For more details, please see the documentation included in the download package.
BudgetedSVM is capable of training nonlinear models with accuracy comparable to LibSVM and time comparable to LibLinear. Unlike other popular, stateoftheart machine learning libraries which provide very comprehensive set of various machine learning tools, such as
Weka,
Shogun,
Shark, or
scikitlearn, BudgetedSVM toolbox is focused on solving largescale, nonlinear classification problems when only limited computational resources are available. In particular, through highlyoptimized C++ implementation of recently proposed largescale SVM approximators, BudgetedSVM can be used to solve nonlinear classification problems with millions of examples and/or features within minutes, even on a regular personal computer.
BudgetedSVM is a free, opensource software, published under the Modified BSD license. If you find the toolbox useful in your own work, please cite our JMLR paper.
Stepbystep commandprompt example
The interface design follows format of LibSVM and LibLinear, enabling users of these two popular packages to easily adopt BudgetedSVM for use in their projects. Before moving on, you will need to dowload BudgetedSVM toolbox, which can be found at the Download page. After downloading and unzipping the software package, cd
to the directory where you unzipped the toolbox and type make
to compile the toolbox. Then, in order to train and evaluate a classifier on adult9a data set (do not worry about finding the data set online, we included training and testing .txt files in the .zip file), we can run the following commands to train nonlinear classifiers implemented in BudgetedSVM:
>> # train and test AMM algorithm
>> bin/budgetedsvmtrain A 1 e 5 L 0.001 B 20 D 123 v 1 k 10000 c 10 a9a_train.txt a9a_model.txt
>> bin/budgetedsvmpredict v 1 a9a_test.txt a9a_model.txt a9a_predictions.txt
>> # train and test LLSVM algorithm
>> bin/budgetedsvmtrain A 3 L 0.1 K 0 g 0.01 B 100 m 1 D 123 v 1 a9a_train.txt a9a_model.txt
>> bin/budgetedsvmpredict v 1 a9a_test.txt a9a_model.txt a9a_predictions.txt
>> # train and test BSGD algorithm
>> bin/budgetedsvmtrain A 4 g 0.01 e 5 L 0.0001 B 200 m 1 D 123 v 1 a9a_train.txt a9a_model.txt
>> bin/budgetedsvmpredict v 1 a9a_test.txt a9a_model.txt a9a_predictions.txt
Note that the train and predict programs were created in the "./bin" folder, which is why we appended "bin/" to the calls to the functions. If you have Windows operating system, be careful to use "\" (backslash) instead of "/" (forwardslash) when specifying the path to the programs in the command prompt. After running the examples, algorithms should return accuracy of around 15% (the data set gets shuffled during training so the accuracy could fluctuate slightly between runs).
Detailed explanation of the above code examples follows:
 AMM example: The first command uses AMM batch (
A 1
) algorithm to train multihyperplane machine for 5 epochs (e 5
), using regularization parameter lambda of 0.001 (L 0.001
, larger values result in less complex model, or, in other words, more regularization) and setting the maximum number of weights per class to 20 (B 20
). As adult9a data set is of dimensionality 123, we also write D 123
, and choose verbose output (v 1
) which prints detailed steps of the algorithm. Lastly, we specify that pruning of weigths will be performed every 10,000 iterations (k 10000
, smaller values results in more aggressive pruning), and the pruning parameter is set to 10 (c 10
, larger values result in more aggressive pruning). If you did not specify a name for the model file, it will be created such that suffix .model
is appended to the filename of the training file (note that we did include the model filename in the above example). The second command evaluates the model on testing data set, and prints the accuracy on the testing set while saving the predictions to a9a_predictions.txt
.
 LLSVM example: The first command uses LLSVM (
A 3
) algorithm to train a classification model, setting regularization parameter to 0.1 (L 0.1
, larger values result in less complex model, or, in other words, more regularization), which result in higher regularization than in the AMM case described above. We use Gaussian kernel (K 0
) with kernel width 0.01 (g 0.01
). With B 100
option we set the budget, specifying that the model will comprise 100 landmark points which will be chosen by running kmeans on the loaded training data (m 1
). As adult9a data set is of dimensionality 123, we also write D 123
, and choose verbose output (v 1
) which prints detailed steps of the algorithm. If you did not specify a name for the model file, it will be created such that suffix .model
is appended to the filename of the training file (note that we did include the model filename in the above example). The second command evaluates the model on testing data set, and prints the accuracy on the testing set while saving the predictions to a9a_predictions.txt
.
 BSGD example: The first command uses BSGD (
A 4
) algorithm to train a classification model for 5 epochs (e 5
), using learning rate lambda of 0.0001 (L 0.0001
, larger values result in less complex model, or, in other words, more regularization) and kernel width of Gaussian kernel of 0.01 (g 0.01
). With B 200
option we specifying that the model will comprise 200 support vectors, and in the case of a budget overflow two support vectors will be merged (m 1
) to maintain the budget. As adult9a data set is of dimensionality 123, we also write D 123
, and choose verbose output (v 1
) which prints detailed steps of the algorithm. If you did not specify a name for the model file, it will be created such that suffix .model
is appended to the filename of the training file (note that we did include the model filename in the above example). The second command evaluates the model on testing data set, and prints the accuracy on the testing set while saving the predictions to a9a_predictions.txt
.
For a list of all available options simply run
budgetedsvmtrain
and
budgetedsvmpredict
without any arguments, or see the
readme.txt
file included in the package.
Stepbystep Matlab example
We provide similar interface for Matlab/Octave as well. We note that we included functions libsvmread
and libsvmwrite
in the toolbox, used for loading the data sets into Matlab and for saving the Matlab matrices to .txt files, respectively, written by RongEn Fan and KaiWei Chang from National Taiwan University as a part of LibSVM package. After downloading and unzipping the software package, cd
to the matlab
folder located in the directory where you unzipped the toolbox, and type make
to compile the toolbox. Then, in order to train and evaluate AMM classifier on adult9a data set, we can run the following commands:
>> % load the data into Matlab
>> [a9a_label, a9a_inst] = libsvmread('../a9a_train.txt');
>> % train a nonlinear model on the training set
>> model = budgetedsvm_train(a9a_label, a9a_inst, 'A 2 L 0.001 v 1 e 5');
>> % evaluate the trained model on the training data
>> [accuracy, predict_label] = budgetedsvm_predict(a9a_label, a9a_inst, model, 'v 1');
Unlike in the commandprompt example given above, in this example we use AMM online (A 2
) algorithm to train multihyperplane machine. However, the above example assumes that the data can be loaded to Matlab workspace. That might not be possible for largescale data sets, and for that reason we also included a version where the data is loaded into BudgetedSVM directly from the .txt files, and the learned model is stored in the .txt file instead in the Matlab workspace:
>> % train a nonlinear model on the training set
>> budgetedsvm_train('../a9a_train.txt', '../a9a_model.txt', 'A 2 L 0.001 v 1 e 5 D 123');
>> % evaluate the trained model on the testing data
>> [accuracy, predict_label] = budgetedsvm_predict('../a9a_test.txt', '../a9a_model.txt', 'v 1');
After running the examples in Matlab prompt, algorithm should return accuracy of around 15%.
Here we do not provide examples for other implemented algorithms as they can be easily trained by using option strings shown in the above commandprompt examples. For a list of all available options simply run
budgetedsvm_train
and
budgetedsvm_predict
without any arguments, or see the
matlab/readme.txt
file included in the package.
Empirical evaluation
BudgetedSVM algorithms can learn a model even for data sets with millions of data points and features, with training times orders of magnitude faster than SVM with RBF kernel (RBFSVM) trained using LibSVM. For illustration, in Table 1 we give comparison of error rates and training times on several largescale, highdimensional binary classification tasks which can be found here (N is the number of data points, D is the data dimensionality, S is the fraction of nonzero features, B is the size of the trained model). Note that Pegasos is a stateoftheart linear classifier, and its performance is expected to be the same as that of other linear classifiers. We can see that on webspam and rcv1bin data sets it took LibSVM hours to train RBFSVM, while BudgetedSVM algorithms with much smaller budgets achieved high accuracies within minutes, and even seconds in the case of AMM. Note that the budget B of algorithms implemented in BudgetedSVM is several orders of magnitude smaller than number of support vectors found by RBFSVM.
Table 1. Error rates (in %) and training times^{1} on benchmark data sets
 Pegasos  AMM  LLSVM  BSGD  LibLinear^{2}  RBFSVM 
Data set       
 N = 280,000 
webspam  D = 254 
 S = 41.9% 
  
3.46  500  2.5m 
2.60  1,000  6.1m 
1.99  3,000  0.5h 

2.04  500  2.0m 
1.72  1,000  3.9m 
1.49  3,000  0.2h 
  
 N = 677,399 
rcv1bin  D = 47,236 
 S = 0.2% 
  
4.97  500  0.2h 
4.23  1,000  0.5h 
3.05  3,000  2.2h 

3.33  500  0.8h 
2.92  1,000  1.5h 
2.53  3,000  4.4h 
  
 N=8,000,000 
mnist8mbin  D = 784 
 S = 19.3% 
  
6.84  500  1.6h 
4.59  1,000  3.8h 
2.59  3,000  15h 

2.23  500  2.3h 
1.92  1,000  4.9h 
1.62  3,000  16.1h 
  
1. We excluded data loading time, and evaluated performance on Intel E7400 with 2.80GHz processor and 4GB RAM. Scripts used to generate the results in the table can be found here:
BudgetedSVM,
LibLinear,
LibSVM.
2. We used LibLinear extension used when data set can not be loaded into the memory, found
here.
3. Listed accuracy obtained after 2 days of PpackSVM training on 512 processors; from
Zhu, Z. A., Chen, W., Wang, G., Zhu, C., Chen, Z. PpackSVM: Parallel primal gradient descent kernel SVM, International Conference on Data Mining (ICDM), 2009.