Here is a glossary of common machine learning (ML) terms to help you understand the field better:
General Concepts
- Machine Learning (ML): A field of AI where systems learn patterns from data to make decisions or predictions without being explicitly programmed.
- Artificial Intelligence (AI): A broader field aimed at creating systems capable of performing tasks that typically require human intelligence.
- Deep Learning: A subset of ML focused on neural networks with many layers, used for tasks like image recognition and natural language processing.
Types of Learning
- Supervised Learning: Learning with labeled data (input-output pairs).
- Unsupervised Learning: Learning from data without labeled outputs (e.g., clustering).
- Semi-Supervised Learning: Combines a small amount of labeled data with a large amount of unlabeled data.
- Reinforcement Learning (RL): Learning by interacting with an environment and adjusting behavior to maximize cumulative reward.
Data and Features
- Dataset: A collection of data used to train, validate, and test a model.
- Features: Input variables used by the model for predictions (e.g., age, income).
- Labels: The output or target variable in supervised learning.
- Feature Engineering: The process of creating, transforming, or selecting features to improve model performance.
- Feature Scaling: Normalizing or standardizing features to ensure consistent ranges.
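To make feature scaling concrete, here is a minimal sketch of the two most common approaches, min-max normalization and standardization. The function names and the sample `ages` values are made up for illustration, and the code uses only the Python standard library.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no range to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column.
ages = [20, 30, 40, 50, 60]
scaled = min_max_scale(ages)   # values now lie in [0, 1]
z_scores = standardize(ages)   # values now have mean 0 and std 1
```

In practice the minimum, maximum, mean, and standard deviation should be computed on the training data only and then reused for validation and test data.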
Models and Algorithms
- Model: A mathematical representation of a process learned from data.
- Algorithm: A procedure or formula for solving a problem, such as training a model.
- Hyperparameters: Configuration settings external to the model that need to be specified before training (e.g., learning rate, number of layers).
- Parameters: Internal values of a model learned from training (e.g., weights in a neural network).
Training and Evaluation
- Training: The process of learning model parameters using training data.
- Validation: Evaluating the model on a held-out dataset, separate from the training data, to tune hyperparameters and compare candidate models.
- Testing: Assessing the final model’s performance on unseen data.
- Overfitting: When a model fits noise in the training data instead of the underlying pattern, so it performs well on training data but poorly on unseen data.
- Underfitting: When a model is too simple to capture the patterns in the data.
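The training, validation, and testing roles above are usually set up by splitting one dataset into three disjoint parts. The sketch below shows one simple way to do that; `split_dataset` and the 70/15/15 proportions are illustrative assumptions, not a standard.

```python
import random


def split_dataset(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle rows and split them into train, validation, and test lists."""
    rows = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


# Hypothetical dataset of 100 examples, identified here by index.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

A large gap between training and validation performance is a typical sign of overfitting, while poor performance on both suggests underfitting.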
Metrics and Evaluation
- Accuracy: The proportion of correctly predicted samples out of the total.
- Precision: The proportion of true positives out of predicted positives.
- Recall: The proportion of true positives out of actual positives.
- F1 Score: The harmonic mean of precision and recall.
- ROC Curve: A plot of true positive rate against false positive rate as the classification threshold varies.
- Loss Function: A mathematical function measuring the error of predictions (e.g., Mean Squared Error, Cross-Entropy).
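The classification metrics above all come from the same four counts: true positives, false positives, false negatives, and true negatives. Here is a minimal sketch for binary labels (1 = positive, 0 = negative); the function name and the sample labels are invented for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when there are no predicted or actual positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here the model finds 3 of the 4 actual positives (recall 0.75) but only 3 of its 5 positive predictions are correct (precision 0.6), which is why the two metrics are reported together.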
Advanced Concepts
- Gradient Descent: An optimization algorithm that minimizes the loss function by repeatedly adjusting parameters in the direction of the negative gradient.
- Learning Rate: A hyperparameter that controls how much to adjust the model's weights during training.
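- Backpropagation: An algorithm that computes the gradient of the loss with respect to each weight in a neural network by applying the chain rule backward through the layers; an optimizer such as gradient descent then uses these gradients to update the weights.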
- Neural Network: A model inspired by the human brain, consisting of layers of neurons connected by weights.
- Activation Function: A function, usually non-linear, applied to the outputs of neurons (e.g., ReLU, sigmoid, softmax).
- Regularization: Techniques to prevent overfitting, often by penalizing large weights (e.g., L1, L2).
- Dropout: A regularization method that randomly deactivates neurons during training.
- Transfer Learning: Reusing a pre-trained model on a new but related task.
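To show how gradient descent, the learning rate, and a loss function fit together, here is a minimal sketch that fits a straight line to data by minimizing Mean Squared Error. It is a toy example, not a neural network: `fit_line`, the sample data, and the chosen learning rate and epoch count are all assumptions made for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on Mean Squared Error."""
    w, b = 0.0, 0.0  # parameters: learned from the data
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for each sample under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step against the gradient; the learning rate sets the step size.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

The learning rate and the number of epochs are hyperparameters chosen before training, while `w` and `b` are parameters learned during it. A learning rate that is too large makes the updates diverge, and one that is too small makes training slow.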
Specialized Topics
- Clustering: Grouping similar data points (e.g., k-means).
- Dimensionality Reduction: Reducing the number of features while preserving as much useful information as possible (e.g., PCA, t-SNE).
- Ensemble Learning: Combining multiple models to improve performance (e.g., Random Forest, Gradient Boosting).
- Time Series: Analysis of data points ordered by time.
- Natural Language Processing (NLP): Techniques for analyzing and generating human language.
- Computer Vision: Techniques for analyzing and interpreting visual data.
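As an illustration of clustering, here is a minimal sketch of the k-means idea: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points, and repeat. The `kmeans` function, the sample points, and the hand-picked starting centroids are assumptions for this example; real implementations typically add smarter initialization and a convergence check.

```python
def kmeans(points, centroids, iterations=10):
    """Run a fixed number of k-means iterations from the given starting centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

No labels are used anywhere in this code, which is what makes clustering an unsupervised method.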
Would you like details on any of these terms or related concepts?