This article introduces the optimization of SVM training, focusing on the implementation of the Platt SMO algorithm, and then attempts to use the genetic algorithm framework GAFT to optimize the SVM model directly.
**Heuristic Selection of Variables in SMO**
In the SMO algorithm, we need to select a pair of α values to optimize at each step. Through heuristic selection, we can more efficiently choose the variables to be optimized so that the objective function decreases as rapidly as possible.
Different heuristics are applied in Platt SMO for selecting the first variable α₁ and the second variable α₂.
**Selection of the First Variable**
The choice of the first variable is part of the outer loop. Unlike previous approaches where we iterated through the entire α list, here we alternate between the full training set and the non-boundary sample set:
First, we scan the entire training set to check if any α_i violates the KKT conditions. If an α_i and its corresponding x_i, y_i violate the KKT condition, it needs to be optimized.
The Karush-Kuhn-Tucker (KKT) conditions are the necessary and sufficient conditions for the optimal solution of a positive definite quadratic programming problem. For the SVM dual problem, the KKT conditions are relatively simple:
[Image: KKT conditions for the SVM dual problem]
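For readers who cannot see the image, the conditions being checked are the standard KKT conditions for the soft-margin dual (this is the textbook statement, reconstructed here rather than copied from the figure):

$$
\alpha_i = 0 \Leftrightarrow y_i g(x_i) \ge 1, \qquad
0 < \alpha_i < C \Leftrightarrow y_i g(x_i) = 1, \qquad
\alpha_i = C \Leftrightarrow y_i g(x_i) \le 1,
$$

where $g(x_i) = \sum_{j=1}^{N} \alpha_j y_j K(x_j, x_i) + b$ is the decision function.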
After scanning the entire training set and optimizing the corresponding α values, we then focus only on the non-boundary αs in the next iteration. Non-boundary αs are those not equal to 0 or C. These points are again checked for KKT violations and optimized.
This process continues, alternating between the two datasets. When all αs satisfy the KKT conditions, the algorithm stops.
To quickly select the α that gives the largest step size, we cache the error value E_i of every data point. For this we create an SVMUtil class to store the important SVM variables together with some helper methods.
[Image: SVMUtil class for caching SVM variables and prediction errors]
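Since the class is only shown as an image, here is a rough, hypothetical sketch of what such a helper can look like (the names and structure are illustrative, not the repository's exact code):

```python
import numpy as np

class SVMUtil:
    """Cache of the variables shared by the SMO routines (illustrative sketch)."""

    def __init__(self, dataset, labels, C=1.0, tolerance=0.001):
        self.dataset = np.asarray(dataset, dtype=float)  # shape (N, n_features)
        self.labels = np.asarray(labels, dtype=float)    # entries in {-1, +1}
        self.C = C                    # soft-margin penalty
        self.tolerance = tolerance    # allowed KKT violation
        n = len(self.labels)
        self.alphas = np.zeros(n)     # Lagrange multipliers
        self.b = 0.0                  # intercept
        self.errors = np.zeros(n)     # cached errors E_i = g(x_i) - y_i

    def predict(self, x):
        """Decision function g(x) for a linear kernel."""
        w = (self.alphas * self.labels) @ self.dataset
        return w @ np.asarray(x, dtype=float) + self.b

    def update_error(self, i):
        """Refresh the cached error for sample i after alphas or b change."""
        self.errors[i] = self.predict(self.dataset[i]) - self.labels[i]
```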
Here is a rough code snippet for the alternating traversal that selects the first variable. For the complete Python implementation, refer to this link: [https://github.com/PytLab/MLBox/blob/master/svm/svm_platt_smo.py](https://github.com/PytLab/MLBox/blob/master/svm/svm_platt_smo.py)
[Image: alternating outer-loop traversal for the first variable]
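As a hedged sketch of how that alternation can be structured (building on the hypothetical SVMUtil above; `violates_kkt` is sketched later in this article, and `optimize_pair` stands in for the inner-loop optimization of one α pair):

```python
import numpy as np

def outer_loop(svm, max_iter=100):
    """Platt SMO outer loop: pick the first alpha by alternating between
    full-set scans and non-boundary scans (sketch, not the repository code)."""
    entire_set = True
    pairs_changed = 1
    iter_num = 0
    # Stop once a full scan over the whole training set changes nothing.
    while iter_num < max_iter and (pairs_changed > 0 or entire_set):
        pairs_changed = 0
        if entire_set:
            candidates = range(len(svm.alphas))
        else:
            # Non-boundary samples satisfy 0 < alpha_i < C.
            candidates = np.nonzero((svm.alphas > 0) & (svm.alphas < svm.C))[0]
        for i in candidates:
            if violates_kkt(svm, i):
                pairs_changed += optimize_pair(svm, i)
        # Alternate the scan target for the next pass.
        if entire_set:
            entire_set = False
        elif pairs_changed == 0:
            entire_set = True
        iter_num += 1
```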
**Choice of the Second Variable**
The selection of the second variable in SMO is part of the inner loop. Once α₁ is selected, we aim to choose α₂ such that the optimization leads to a significant change.
From our previous derivation, we know that the step taken for the new α₂ depends on |E₁ - E₂|. When E₁ is positive, we choose the sample with the smallest E_i as E₂; when E₁ is negative, we choose the sample with the largest E_i. In practice, the E_i values of all samples are cached in a list, and we select the α₂ that maximizes |E₁ - E₂| to obtain the largest step size.
Sometimes, even after following these heuristics, the function value doesn’t decrease sufficiently. In such cases, we proceed with the following steps:
- Select a non-boundary sample that causes enough function value reduction as the second variable.
- If none exists in the non-boundary dataset, select from the full dataset.
- If still no improvement, reselect the first variable α₁.
The Python implementation for choosing the second variable is shown below:
[Image: Python implementation of second-variable selection]
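A hedged sketch of the core max-|E₁ - E₂| selection (the fallback choices from the list above are omitted, and the names are illustrative):

```python
def select_second_alpha(svm, i, E_i):
    """Pick j that maximizes |E_i - E_j| using the cached error list (sketch)."""
    best_j, best_E_j, max_delta = -1, 0.0, 0.0
    for j in range(len(svm.alphas)):
        if j == i:
            continue
        E_j = svm.errors[j]
        if abs(E_i - E_j) > max_delta:
            max_delta, best_j, best_E_j = abs(E_i - E_j), j, E_j
    return best_j, best_E_j
```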
**KKT Conditions Allow for Certain Errors**
In Platt's paper, the KKT conditions are checked within a tolerance, i.e., a certain amount of violation is permitted. Here is the corresponding Python implementation:
[Image: KKT check with tolerance]
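A sketch of such a tolerant check (attribute names follow the hypothetical SVMUtil above, not the repository code):

```python
def violates_kkt(svm, i):
    """Return True if sample i violates the KKT conditions beyond tolerance."""
    alpha, tol = svm.alphas[i], svm.tolerance
    # r_i = y_i*E_i = y_i*g(x_i) - 1, since E_i = g(x_i) - y_i and y_i**2 == 1.
    r = svm.labels[i] * svm.errors[i]
    # Either alpha could grow (margin violated) but is not yet at C,
    # or alpha could shrink (margin slack) but is not yet at 0.
    return (r < -tol and alpha < svm.C) or (r > tol and alpha > 0)
```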
For the complete implementation of Platt SMO, see: [https://github.com/PytLab/MLBox/blob/master/svm/svm_platt_smo.py](https://github.com/PytLab/MLBox/blob/master/svm/svm_platt_smo.py)
Using the Platt SMO algorithm, we optimized the previous dataset and obtained the following result:
[Image: Platt SMO optimization output]
Visualizing the decision boundary and support vectors:
[Image: decision boundary and support vectors found by Platt SMO]
It can be seen that the support vectors optimized by Platt SMO differ slightly from those of the simplified SMO algorithm.
**Optimizing SVM Using Genetic Algorithms**
Since I recently wrote a genetic algorithm framework, and genetic algorithms are easy to use as a general heuristic search method, I tried using one to optimize the SVM.
With a genetic algorithm, we can directly optimize the original (primal) form of the SVM, which is its most intuitive representation:
[Image: original form of the SVM optimization problem]
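Judging from the fitness function described below (maximize the minimum distance from the data points to the decision boundary), this is presumably the classic max-margin objective:

$$
\max_{w,\,b} \; \min_{i} \; \frac{y_i \left( w \cdot x_i + b \right)}{\lVert w \rVert}
$$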
With the help of my genetic algorithm framework, only a few lines of Python are needed to optimize the SVM this way. The key part is the fitness function: following the formula above, we compute the distance from each point in the dataset to the decision boundary and return the minimum of those distances as the fitness, which the genetic algorithm then maximizes over its evolutionary iterations.
The GAFT project address is: [https://github.com/PytLab/gaft](https://github.com/PytLab/gaft). Please refer to the README for more details.
We now start building the population for the evolutionary iteration.
**Creating Individuals and Populations**
For two-dimensional data points, we only need to optimize three parameters: [w1, w2] and b. Each individual is defined as follows:
[Image: individual definition]
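A sketch using GAFT's components (I am assuming the BinaryIndividual API shown in the GAFT README; older GAFT versions used different class names, and the variable ranges below are illustrative guesses rather than the article's original values):

```python
from gaft.components import BinaryIndividual

# Each individual encodes the three parameters [w1, w2, b].
# The ranges and precision are illustrative, not tuned values.
indv_template = BinaryIndividual(ranges=[(-2, 2), (-2, 2), (-5, 5)], eps=0.001)
```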
The population size is set as follows:
[Image: population definition]
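Again a sketch against the same assumed API; the size is a guess:

```python
from gaft.components import Population

# A larger population explores more of the (w1, w2, b) space per generation.
population = Population(indv_template=indv_template, size=600).init()
```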
**Creating Genetic Operators and GA Engines**
Nothing special here: we simply use the built-in operators from the framework.
[Image: genetic operators and GA engine]
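A sketch with GAFT's built-in operators (operator names and parameters follow the GAFT README; the values are typical examples, not tuned for this problem):

```python
from gaft import GAEngine
from gaft.operators import RouletteWheelSelection, UniformCrossover, FlipBitMutation

selection = RouletteWheelSelection()          # fitness-proportional selection
crossover = UniformCrossover(pc=0.8, pe=0.5)  # crossover prob., gene-exchange prob.
mutation = FlipBitMutation(pm=0.1)            # per-individual mutation probability

engine = GAEngine(population=population, selection=selection,
                  crossover=crossover, mutation=mutation)
```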
**Fitness Function**
This part simply involves describing the original form of SVM, which requires just three lines of code:
[Image: fitness function]
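A sketch of those three lines, assuming `dataset` is an (N, 2) NumPy array of points, `labels` is the corresponding array of {-1, +1} labels, and the `indv.solution` attribute of recent GAFT versions:

```python
import numpy as np

@engine.fitness_register
def fitness(indv):
    w1, w2, b = indv.solution      # decoded [w1, w2, b]
    w = np.array([w1, w2])
    # Signed distances from all points to the line w.x + b = 0.
    distances = labels * (dataset @ w + b) / np.linalg.norm(w)
    return float(distances.min())  # the GA maximizes the minimum margin
```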
**Starting the Iteration**
We ran the evolution for 300 generations.
[Image: running the GA engine]
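Under the same assumed API, the call is simply:

```python
# ng is the number of generations, per the GAFT README.
engine.run(ng=300)
```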
Visualizing the partition line optimized by the genetic algorithm:
[Image: visualization of the GA-optimized partition line]
The resulting partition line is shown below:
[Image: partition line found by the genetic algorithm]