Friday, December 20, 2024

ML Terminology

Here is a glossary of common Machine Learning (ML) terminology to help you navigate the field:

General Concepts

  1. Machine Learning (ML): A field of AI where systems learn patterns from data to make decisions or predictions without being explicitly programmed.
  2. Artificial Intelligence (AI): A broader field aimed at creating systems capable of performing tasks that typically require human intelligence.
  3. Deep Learning: A subset of ML focused on neural networks with many layers, used for tasks like image recognition and natural language processing.

Types of Learning

  1. Supervised Learning: Learning with labeled data (input-output pairs).
  2. Unsupervised Learning: Learning from data without labeled outputs (e.g., clustering).
  3. Semi-Supervised Learning: Combines a small amount of labeled data with a large amount of unlabeled data.
  4. Reinforcement Learning (RL): Learning by interacting with an environment to maximize a reward.
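To make the supervised case concrete, here is a minimal sketch using a toy one-nearest-neighbour classifier; the data points and labels are invented for illustration:

```python
# Toy supervised learning: labeled (input, output) pairs, hypothetical data.
train_X = [1.0, 2.0, 8.0, 9.0]
train_y = ["small", "small", "large", "large"]

def predict(x):
    # 1-nearest-neighbour: return the label of the closest training point.
    nearest = min(range(len(train_X)), key=lambda i: abs(train_X[i] - x))
    return train_y[nearest]

print(predict(1.5))  # "small" -- closest to the labeled points 1.0 and 2.0
print(predict(8.5))  # "large" -- closest to the labeled points 8.0 and 9.0
```

The defining feature is that the model sees both inputs and labels during training; an unsupervised method would receive only `train_X`.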

Data and Features

  1. Dataset: A collection of data used for training, validation, and testing a model.
  2. Features: Input variables used by the model for predictions (e.g., age, income).
  3. Labels: The output or target variable in supervised learning.
  4. Feature Engineering: The process of creating, transforming, or selecting features to improve model performance.
  5. Feature Scaling: Normalizing or standardizing features to ensure consistent ranges.
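Feature scaling is easy to show directly. Below is a sketch of standardization (z-score scaling), which rescales a feature to zero mean and unit variance; the sample values are hypothetical:

```python
def standardize(values):
    # Z-score standardization: subtract the mean, divide by the std deviation.
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]

ages = [20, 30, 40, 50, 60]   # made-up feature values
scaled = standardize(ages)
# The scaled feature now has mean 0 and standard deviation 1.
```

Min-max normalization (rescaling to [0, 1]) is the other common choice; which one to use depends on the model and the data distribution.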

Models and Algorithms

  1. Model: A mathematical representation of a process learned from data.
  2. Algorithm: A procedure or formula for solving a problem, such as training a model.
  3. Hyperparameters: Configuration settings external to the model that need to be specified before training (e.g., learning rate, number of layers).
  4. Parameters: Internal values of a model learned from training (e.g., weights in a neural network).

Training and Evaluation

  1. Training: The process of learning model parameters using training data.
  2. Validation: Evaluating the model on a separate dataset to fine-tune hyperparameters.
  3. Testing: Assessing the final model’s performance on unseen data.
  4. Overfitting: When a model learns noise in the training data instead of generalizing.
  5. Underfitting: When a model is too simple to capture the patterns in the data.
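The training/validation/testing split can be sketched in a few lines; the 70/15/15 proportions below are a common convention, not a rule:

```python
import random

random.seed(0)                      # fixed seed for reproducibility
data = list(range(100))             # stand-in for 100 labeled examples
random.shuffle(data)                # shuffle before splitting

train = data[:70]                   # used to learn model parameters
val   = data[70:85]                 # used to tune hyperparameters
test  = data[85:]                   # held out for the final evaluation
```

Keeping the test set untouched until the very end is what makes its score an honest estimate of performance on unseen data.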

Metrics and Evaluation

  1. Accuracy: The proportion of correctly predicted samples out of the total.
  2. Precision: The proportion of true positives out of predicted positives.
  3. Recall: The proportion of true positives out of actual positives.
  4. F1 Score: The harmonic mean of precision and recall.
  5. ROC Curve: A graph showing the trade-off between true positive rate and false positive rate.
  6. Loss Function: A mathematical function measuring the error of predictions (e.g., Mean Squared Error, Cross-Entropy).
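The classification metrics above all derive from the same counts. Here is a sketch computing precision, recall, and F1 from a made-up set of predictions:

```python
y_true = [1, 0, 1, 1, 0, 1]   # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 1, 1]   # hypothetical model predictions

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)                          # 3 / 4 = 0.75
recall = tp / (tp + fn)                             # 3 / 4 = 0.75
f1 = 2 * precision * recall / (precision + recall)  # 0.75
```

F1 being the harmonic mean matters when precision and recall differ: it is pulled toward the lower of the two, so a model cannot score well by sacrificing one for the other.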

Advanced Concepts

  1. Gradient Descent: An optimization algorithm used to minimize the loss function by adjusting parameters.
  2. Learning Rate: A hyperparameter that controls how much to adjust the model's weights during training.
  3. Backpropagation: The algorithm for efficiently computing the gradient of the loss with respect to every weight in a neural network; an optimizer such as gradient descent then uses these gradients to update the weights.
  4. Neural Network: A model inspired by the human brain, consisting of layers of neurons connected by weights.
  5. Activation Function: Non-linear functions applied to neurons (e.g., ReLU, sigmoid, softmax).
  6. Regularization: Techniques to prevent overfitting by penalizing large weights (e.g., L1, L2).
  7. Dropout: A regularization method that randomly deactivates neurons during training.
  8. Transfer Learning: Reusing a pre-trained model on a new but related task.
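Gradient descent and the learning rate can be shown on a one-dimensional toy problem: minimizing the loss f(w) = (w - 3)², whose minimum is at w = 3. The setup is invented purely for illustration:

```python
w = 0.0    # parameter, initialized arbitrarily
lr = 0.1   # learning rate (a hyperparameter)

for _ in range(100):
    grad = 2 * (w - 3)   # derivative of the loss (w - 3)**2
    w -= lr * grad       # step in the direction that reduces the loss

print(w)  # converges to roughly 3.0, the minimum of the loss
```

A learning rate that is too small makes convergence slow; one that is too large (here, above 1.0) makes the updates overshoot and diverge, which is why it is one of the first hyperparameters to tune.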

Specialized Topics

  1. Clustering: Grouping similar data points (e.g., k-means).
  2. Dimensionality Reduction: Reducing the number of features while preserving information (e.g., PCA, t-SNE).
  3. Ensemble Learning: Combining multiple models to improve performance (e.g., Random Forest, Gradient Boosting).
  4. Time Series: Analysis of data points ordered by time.
  5. Natural Language Processing (NLP): Techniques for analyzing and generating human language.
  6. Computer Vision: Techniques for analyzing and interpreting visual data.
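Clustering is the most compact of these to sketch. Below is a minimal one-dimensional k-means, with made-up points that form two obvious groups (a real implementation would also handle empty clusters and random initialization):

```python
points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]  # hypothetical data, two groups
centroids = [1.0, 10.0]                      # initial centroid guesses

for _ in range(10):
    # Assignment step: each point joins its nearest centroid's cluster.
    clusters = [[], []]
    for p in points:
        idx = 0 if abs(p - centroids[0]) <= abs(p - centroids[1]) else 1
        clusters[idx].append(p)
    # Update step: move each centroid to the mean of its cluster.
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # roughly [1.5, 10.5]
```

The same assign-then-update loop generalizes to higher dimensions by replacing the absolute difference with Euclidean distance.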

Let me know in the comments if you would like a deeper look at any of these terms or related concepts.
