
Machine Learning Naive Bayes Application Tutorial

Today, we introduce a common probability-based classification method in machine learning: Naive Bayes. Unlike hard-decision methods such as KNN or decision trees, which output only a class label (0 or 1 in a two-class problem), Naive Bayes provides the probability that a sample belongs to a specific class, a value between 0 and 1. This probabilistic approach makes it particularly effective in tasks like text classification and spam detection.

The foundation of Naive Bayes is Bayes' Theorem, a well-known rule in probability theory. Consider a two-class problem with classes c1 and c2. Given a sample, such as an email represented by a feature vector x, we want to determine whether it belongs to c1 or c2. This is expressed with the posterior probabilities:

P(c1 | x) and P(c2 | x)

and we assign x to whichever class has the larger posterior. According to Bayes' Theorem, each posterior can be computed as:

P(ci | x) = P(x | ci) · P(ci) / P(x)

where P(ci) is the prior probability of class ci, P(x | ci) is the likelihood of observing x given that class, and P(x) is the same for both classes, so it can be ignored when comparing them.

This is the core idea behind Naive Bayes. The algorithm assumes that all features (words in the case of text classification) are conditionally independent given the class, hence the term "naive." Despite this simplification, it often performs surprisingly well in practice.
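In symbols, if a document is represented by word features x = (x1, x2, ..., xn), the independence assumption lets the likelihood factor into a product of per-word probabilities, each of which is easy to estimate from word counts:

P(x | ci) = P(x1, x2, ..., xn | ci) ≈ P(x1 | ci) · P(x2 | ci) · ... · P(xn | ci)

Combined with the priors P(ci), which are just the class frequencies in the training data, this gives everything the classifier needs.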

Let’s walk through an example of using Naive Bayes for text categorization. First, we create a dataset of sample texts along with their corresponding labels:

def load_dataset():
    """Return a toy corpus of tokenized posts and their labels."""
    posting_list = [['my', 'dog', 'has', 'flea', 'problems', 'help', 'please'],
                    ['maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid'],
                    ['my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him'],
                    ['stop', 'posting', 'stupid', 'worthless', 'garbage'],
                    ['mr', 'licks', 'ate', 'my', 'steak', 'how', 'to', 'stop', 'him'],
                    ['quit', 'buying', 'worthless', 'dog', 'food', 'stupid']]
    class_vec = [0, 1, 0, 1, 0, 1]  # 1 = abusive post, 0 = normal post
    return posting_list, class_vec

Next, we build a vocabulary list that maps each unique word to an index. This will help us convert text into numerical vectors for further processing:

def create_vocab_list(dataset):
    """Build a list of all unique words that appear in the dataset."""
    vocab_set = set([])
    for document in dataset:
        vocab_set = vocab_set | set(document)  # union with this document's words
    return list(vocab_set)
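A quick sanity check on the toy dataset (the order of the words will vary, because Python sets are unordered):

posting_list, class_vec = load_dataset()
vocab_list = create_vocab_list(posting_list)
print(len(vocab_list))   # 32 unique words in the toy dataset
print(vocab_list[:5])    # e.g. ['cute', 'garbage', 'park', 'flea', 'food'] -- order varies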

Once we have the vocabulary, we can transform each text document into a vector. Two common approaches are the bag-of-words model and the binary presence model:

def word2vec(vocab_list, input_set):
    """Set-of-words model: mark each vocabulary word as present (1) or absent (0)."""
    return_vec = [0] * len(vocab_list)
    for word in input_set:
        if word in vocab_list:
            return_vec[vocab_list.index(word)] = 1
        else:
            print("the word %s is not in the vocabulary" % word)
    return return_vec

def bow_vec(vocab_list, input_set):
    """Bag-of-words model: count how many times each vocabulary word occurs."""
    return_vec = [0] * len(vocab_list)
    for word in input_set:
        if word in vocab_list:
            return_vec[vocab_list.index(word)] += 1
        else:
            print("the word %s is not in the vocabulary" % word)
    return return_vec
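To see the difference between the two representations, here is a small check on a made-up document that repeats the word 'dog': the set-of-words vector only records presence, while the bag-of-words vector records the count.

posting_list, class_vec = load_dataset()
vocab_list = create_vocab_list(posting_list)

sample = ['my', 'dog', 'ate', 'my', 'dog', 'food']              # made-up document for illustration
print(word2vec(vocab_list, sample)[vocab_list.index('dog')])    # 1 -- presence only
print(bow_vec(vocab_list, sample)[vocab_list.index('dog')])     # 2 -- occurrence count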

Now, let's build the Naive Bayes classifier. Two practical details matter here. First, word counts are initialized to 1 and the denominators to 2 (Laplace smoothing), so a word that never appears in one class does not force the whole probability to zero. Second, multiplying many small probabilities quickly underflows to zero in floating point, so we take logarithms and add them instead of multiplying; a short sketch after the training function illustrates the problem.

import numpy as np

def train_nb(train_mat, train_class):
    """Estimate log word probabilities per class and the prior of class 1."""
    num_doc = len(train_mat)
    num_word = len(train_mat[0])
    p_1 = sum(train_class) / float(num_doc)   # prior probability of class 1
    p0_num = np.zeros(num_word) + 1           # word counts start at 1 (Laplace smoothing)
    p1_num = np.zeros(num_word) + 1
    p0_deno = 2.0                             # denominators start at 2 accordingly
    p1_deno = 2.0
    for i in range(num_doc):
        if train_class[i] == 1:
            p1_num += train_mat[i]
            p1_deno += sum(train_mat[i])
        else:
            p0_num += train_mat[i]
            p0_deno += sum(train_mat[i])
    p1_vec = np.log(p1_num / p1_deno)         # log P(word | class 1)
    p0_vec = np.log(p0_num / p0_deno)         # log P(word | class 0)
    return p_1, p1_vec, p0_vec
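To see why the logarithm is needed, here is a quick standalone sketch (not part of the classifier): a product of 200 word probabilities of 0.01 underflows to 0.0 in floating point, while the corresponding sum of logs remains a perfectly usable score.

import math

probs = [0.01] * 200            # 200 small per-word probabilities
product = 1.0
for p in probs:
    product *= p
print(product)                  # 0.0 -- underflows (the true value is 1e-400)

log_score = sum(math.log(p) for p in probs)
print(log_score)                # about -921.03 -- still easy to compare between classes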

To classify a new sample, we calculate the log probability for each class and choose the one with the highest score:

import math

def classify_nb(test_vec, p0_vec, p1_vec, p_class1):
    """Return 1 if the class-1 log score is higher, otherwise 0."""
    p1 = sum(test_vec * p1_vec) + math.log(p_class1)        # log P(x | c1) + log P(c1)
    p0 = sum(test_vec * p0_vec) + math.log(1.0 - p_class1)  # log P(x | c0) + log P(c0)
    if p1 > p0:
        return 1
    else:
        return 0

Before training the model, we need to process the text. A simple function can split a string into tokens and clean them up:

def text_parse(long_string):
    """Split a raw string on non-word characters and lower-case the tokens."""
    import re
    reg_ex = re.compile(r'\W+')   # split on runs of non-alphanumeric characters
    list_of_tokens = reg_ex.split(long_string)
    return [tok.lower() for tok in list_of_tokens if len(tok) > 0]
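A quick call shows punctuation being dropped and everything lower-cased:

print(text_parse('Stop posting, STUPID worthless garbage!'))
# ['stop', 'posting', 'stupid', 'worthless', 'garbage']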

Finally, here's a simple test to see how the model works:

test_string = 'This book is the best book on Python or ML\nI have ever laid eyes upon.'
word_list = text_parse(test_string)

my_data, class_vec = load_dataset()
vocab_list = create_vocab_list(my_data)
train_mat = []
for doc in my_data:
    train_mat.append(word2vec(vocab_list, doc))
p_1, p1_vec, p0_vec = train_nb(train_mat, class_vec)

test_vec = np.array(word2vec(vocab_list, word_list))  # words outside the toy vocabulary are reported and ignored
print(classify_nb(test_vec, p0_vec, p1_vec, p_1))     # 0 -- classified as a normal post
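As a further check, two short token lists whose words are all in the toy vocabulary classify as expected: the friendly one as class 0, the abusive-sounding one as class 1.

friendly = np.array(word2vec(vocab_list, ['love', 'my', 'dalmation']))
abusive = np.array(word2vec(vocab_list, ['stupid', 'garbage']))
print(classify_nb(friendly, p0_vec, p1_vec, p_1))   # 0
print(classify_nb(abusive, p0_vec, p1_vec, p_1))    # 1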
