Friday, December 20, 2024

ML Terminology

Here is a glossary of common Machine Learning (ML) terminology to help you navigate the field:

General Concepts

  1. Machine Learning (ML): A field of AI where systems learn patterns from data to make decisions or predictions without being explicitly programmed.
  2. Artificial Intelligence (AI): A broader field aimed at creating systems capable of performing tasks that typically require human intelligence.
  3. Deep Learning: A subset of ML focused on neural networks with many layers, used for tasks like image recognition and natural language processing.

Types of Learning

  1. Supervised Learning: Learning with labeled data (input-output pairs).
  2. Unsupervised Learning: Learning from data without labeled outputs (e.g., clustering).
  3. Semi-Supervised Learning: Combines a small amount of labeled data with a large amount of unlabeled data.
  4. Reinforcement Learning (RL): Learning by interacting with an environment to maximize a reward.
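To make the supervised case concrete, here is a minimal sketch using a toy one-nearest-neighbour classifier; the data points and labels are invented for illustration:

```python
# Toy supervised learning: labeled (input, output) pairs, hypothetical data.
train_X = [1.0, 2.0, 8.0, 9.0]
train_y = ["small", "small", "large", "large"]

def predict(x):
    # 1-nearest-neighbour: return the label of the closest training point.
    nearest = min(range(len(train_X)), key=lambda i: abs(train_X[i] - x))
    return train_y[nearest]

print(predict(1.5))  # "small" -- closest to the labeled points 1.0 and 2.0
print(predict(8.5))  # "large" -- closest to the labeled points 8.0 and 9.0
```

The defining feature is that the model sees both inputs and labels during training; an unsupervised method would receive only `train_X`.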

Data and Features

  1. Dataset: A collection of data used for training, validation, and testing a model.
  2. Features: Input variables used by the model for predictions (e.g., age, income).
  3. Labels: The output or target variable in supervised learning.
  4. Feature Engineering: The process of creating, transforming, or selecting features to improve model performance.
  5. Feature Scaling: Normalizing or standardizing features to ensure consistent ranges.
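Feature scaling is easy to show directly. Below is a sketch of standardization (z-score scaling), which rescales a feature to zero mean and unit variance; the sample values are hypothetical:

```python
def standardize(values):
    # Z-score standardization: subtract the mean, divide by the std deviation.
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]

ages = [20, 30, 40, 50, 60]   # made-up feature values
scaled = standardize(ages)
# The scaled feature now has mean 0 and standard deviation 1.
```

Min-max normalization (rescaling to [0, 1]) is the other common choice; which one to use depends on the model and the data distribution.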

Models and Algorithms

  1. Model: A mathematical representation of a process learned from data.
  2. Algorithm: A procedure or formula for solving a problem, such as training a model.
  3. Hyperparameters: Configuration settings external to the model that need to be specified before training (e.g., learning rate, number of layers).
  4. Parameters: Internal values of a model learned from training (e.g., weights in a neural network).

Training and Evaluation

  1. Training: The process of learning model parameters using training data.
  2. Validation: Evaluating the model on a separate dataset to fine-tune hyperparameters.
  3. Testing: Assessing the final model’s performance on unseen data.
  4. Overfitting: When a model learns noise in the training data instead of generalizing.
  5. Underfitting: When a model is too simple to capture the patterns in the data.
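The training/validation/testing split can be sketched in a few lines; the 70/15/15 proportions below are a common convention, not a rule:

```python
import random

random.seed(0)                      # fixed seed for reproducibility
data = list(range(100))             # stand-in for 100 labeled examples
random.shuffle(data)                # shuffle before splitting

train = data[:70]                   # used to learn model parameters
val   = data[70:85]                 # used to tune hyperparameters
test  = data[85:]                   # held out for the final evaluation
```

Keeping the test set untouched until the very end is what makes its score an honest estimate of performance on unseen data.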

Metrics and Evaluation

  1. Accuracy: The proportion of correctly predicted samples out of the total.
  2. Precision: The proportion of true positives out of predicted positives.
  3. Recall: The proportion of true positives out of actual positives.
  4. F1 Score: The harmonic mean of precision and recall.
  5. ROC Curve: A graph showing the trade-off between true positive rate and false positive rate.
  6. Loss Function: A mathematical function measuring the error of predictions (e.g., Mean Squared Error, Cross-Entropy).
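The classification metrics above all derive from the same counts. Here is a sketch computing precision, recall, and F1 from a made-up set of predictions:

```python
y_true = [1, 0, 1, 1, 0, 1]   # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 1, 1]   # hypothetical model predictions

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)                          # 3 / 4 = 0.75
recall = tp / (tp + fn)                             # 3 / 4 = 0.75
f1 = 2 * precision * recall / (precision + recall)  # 0.75
```

F1 being the harmonic mean matters when precision and recall differ: it is pulled toward the lower of the two, so a model cannot score well by sacrificing one for the other.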

Advanced Concepts

  1. Gradient Descent: An optimization algorithm used to minimize the loss function by adjusting parameters.
  2. Learning Rate: A hyperparameter that controls how much to adjust the model's weights during training.
  3. Backpropagation: The algorithm for efficiently computing the gradient of the loss with respect to every weight in a neural network; an optimizer such as gradient descent then uses these gradients to update the weights.
  4. Neural Network: A model inspired by the human brain, consisting of layers of neurons connected by weights.
  5. Activation Function: Non-linear functions applied to neurons (e.g., ReLU, sigmoid, softmax).
  6. Regularization: Techniques to prevent overfitting by penalizing large weights (e.g., L1, L2).
  7. Dropout: A regularization method that randomly deactivates neurons during training.
  8. Transfer Learning: Reusing a pre-trained model on a new but related task.
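Gradient descent and the learning rate can be shown on a one-dimensional toy problem: minimizing the loss f(w) = (w - 3)², whose minimum is at w = 3. The setup is invented purely for illustration:

```python
w = 0.0    # parameter, initialized arbitrarily
lr = 0.1   # learning rate (a hyperparameter)

for _ in range(100):
    grad = 2 * (w - 3)   # derivative of the loss (w - 3)**2
    w -= lr * grad       # step in the direction that reduces the loss

print(w)  # converges to roughly 3.0, the minimum of the loss
```

A learning rate that is too small makes convergence slow; one that is too large (here, above 1.0) makes the updates overshoot and diverge, which is why it is one of the first hyperparameters to tune.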

Specialized Topics

  1. Clustering: Grouping similar data points (e.g., k-means).
  2. Dimensionality Reduction: Reducing the number of features while preserving information (e.g., PCA, t-SNE).
  3. Ensemble Learning: Combining multiple models to improve performance (e.g., Random Forest, Gradient Boosting).
  4. Time Series: Analysis of data points ordered by time.
  5. Natural Language Processing (NLP): Techniques for analyzing and generating human language.
  6. Computer Vision: Techniques for analyzing and interpreting visual data.
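Clustering is the most compact of these to sketch. Below is a minimal one-dimensional k-means, with made-up points that form two obvious groups (a real implementation would also handle empty clusters and random initialization):

```python
points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]  # hypothetical data, two groups
centroids = [1.0, 10.0]                      # initial centroid guesses

for _ in range(10):
    # Assignment step: each point joins its nearest centroid's cluster.
    clusters = [[], []]
    for p in points:
        idx = 0 if abs(p - centroids[0]) <= abs(p - centroids[1]) else 1
        clusters[idx].append(p)
    # Update step: move each centroid to the mean of its cluster.
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # roughly [1.5, 10.5]
```

The same assign-then-update loop generalizes to higher dimensions by replacing the absolute difference with Euclidean distance.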

Let me know in the comments if you would like a deeper look at any of these terms or related concepts.
