Vectors

Vectors are the most fundamental data structure in machine learning. Every data point, every feature set, every word embedding, and every neural network weight is represented as a vector. Mastering vector operations is the first step to understanding how AI processes information.

What is a Vector?

In machine learning, a vector is an ordered list of numbers. Each number represents a feature or dimension of the data. A vector with n elements lives in n-dimensional space.

Python
import numpy as np

# A 3-dimensional vector
v = np.array([1, 2, 3])

# In ML: a data point with 4 features
sample = np.array([5.1, 3.5, 1.4, 0.2])  # Iris flower measurements

# A word embedding (simplified)
word_vec = np.array([0.2, -0.5, 0.8, 0.1, -0.3])

Vector Operations

Addition and Scalar Multiplication

Vectors can be added element-wise and scaled by a number (scalar):

Python
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Vector addition
c = a + b  # [5, 7, 9]

# Scalar multiplication
d = 3 * a  # [3, 6, 9]

# Linear combination
e = 2 * a + 3 * b  # [14, 19, 24]

Dot Product

The dot product is the most important vector operation in ML. It measures how strongly two vectors align with each other and forms the basis of neural network computations.

Python
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Dot product: sum of element-wise products
dot = np.dot(a, b)  # 1*4 + 2*5 + 3*6 = 32

# Equivalent notation
dot = a @ b  # 32

ML Connection: In a neural network, each neuron computes a dot product between its weight vector and the input vector, then applies an activation function. The dot product is the fundamental operation of deep learning.
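As a minimal sketch of that idea, here is a single neuron computed by hand. The specific weights, bias, and sigmoid activation below are illustrative choices, not values from any real network:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical weights and bias for one neuron
w = np.array([0.5, -0.2, 0.1])  # weight vector
b = 0.4                          # bias term
x = np.array([1.0, 2.0, 3.0])   # input vector

# The neuron: a dot product, then an activation
output = sigmoid(np.dot(w, x) + b)
print(output)  # sigmoid(0.4 + 0.4) = sigmoid(0.8)
```

A whole layer of neurons just stacks these weight vectors into a matrix and computes all the dot products at once.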

Vector Norms

Norms measure the "size" or "length" of a vector. Different norms are used for different purposes in ML:

Norm              Formula                        ML Use Case
L1 (Manhattan)    Sum of absolute values         Lasso regularization, sparse models
L2 (Euclidean)    Square root of sum of squares  Ridge regularization, distance metrics
L-infinity (Max)  Maximum absolute value         Adversarial robustness bounds

Python
v = np.array([3, -4])

# L1 norm
l1 = np.linalg.norm(v, ord=1)  # |3| + |-4| = 7

# L2 norm (default)
l2 = np.linalg.norm(v)  # sqrt(9 + 16) = 5.0

# L-infinity norm
linf = np.linalg.norm(v, ord=np.inf)  # max(|3|, |-4|) = 4

# Normalize a vector (unit vector)
unit_v = v / np.linalg.norm(v)  # [0.6, -0.8]

Cosine Similarity

Cosine similarity measures the angle between two vectors, ignoring their magnitude. It is the backbone of search engines, recommendation systems, and NLP:

Python
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Similar vectors have cosine similarity close to 1
v1 = np.array([1, 2, 3])
v2 = np.array([2, 4, 6])
print(cosine_similarity(v1, v2))  # ~1.0 (identical direction)

# Orthogonal vectors have cosine similarity of 0
v3 = np.array([1, 0])
v4 = np.array([0, 1])
print(cosine_similarity(v3, v4))  # 0.0

In Practice: Word embeddings like Word2Vec represent words as vectors where cosine similarity captures semantic meaning. For example, the vector for "king" minus "man" plus "woman" gives a vector close to "queen."
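The famous analogy can be sketched in a few lines. The tiny 3-dimensional "embeddings" below are invented purely for illustration; real Word2Vec vectors have hundreds of dimensions and are learned from text, not written by hand:

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy embeddings invented for this example (not real Word2Vec values)
king  = np.array([0.8, 0.3, 0.1])
man   = np.array([0.6, 0.1, 0.1])
woman = np.array([0.6, 0.1, 0.8])
queen = np.array([0.8, 0.3, 0.8])

# king - man + woman points (almost) exactly at queen
analogy = king - man + woman
print(cosine_similarity(analogy, queen))  # close to 1.0 for these toy vectors
```

In a real embedding space you would search for the vocabulary word whose vector has the highest cosine similarity to the analogy vector.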

Next Up: Matrices

Now that you understand vectors, let's explore matrices — the structures that transform vectors and form the heart of neural networks.
