Basic Concepts of Machine Learning

1. Types of Problems
The line between modern machine learning and artificial intelligence (AI) has become increasingly blurred. The first step in machine learning is to state a problem in a form that machines can process while still accurately expressing human intent. So what kinds of problems do people want machine learning to solve? Broadly speaking, they fall into the following categories:
- Exploration: finding patterns, trends, or relationships by analyzing data. A classic example is the long-debated relationship between smoking and lung cancer.
- Description: summarizing and describing data.
- Inference: using data to test whether a hypothesis holds.
- Prediction: using historical data to predict the future.
- Attribution: identifying the causes of events or phenomena.
- Mechanism: discovering underlying rules or laws.
2. The Order of Solving Problems
The process of using machine learning to solve problems can be divided into the following six steps:
- Definition: define and explain the goals and requirements according to the problem type described above.
- Preparation: collect and organize the data needed to solve the problem.
- Modeling: choose and build a machine learning model suited to the problem.
- Implementation: train the model on the prepared data.
- Testing: repeatedly evaluate the model on data it was not trained on, and adjust it to get the best possible results.
- Deployment: put the model into use in the real world.
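The six steps above can be sketched end to end on a toy problem. This is only a minimal illustration, not a real project: the data is synthetic (generated from an assumed line y = 2x + 1 plus noise), and the names `fit_line`, `mse`, and `predict` are hypothetical helpers chosen for this example.

```python
import random

# 1. Definition: a prediction problem. Given x, predict y; success means
#    a low error on data the model has not seen.

# 2. Preparation: synthetic data (an assumption made for this sketch).
random.seed(0)
data = [(float(x), 2.0 * x + 1.0 + random.gauss(0, 0.1)) for x in range(20)]
random.shuffle(data)
train, test = data[:15], data[15:]  # hold out 5 points for testing

# 3. Modeling: choose a straight line y = a * x + b as the model.
def fit_line(points):
    """Ordinary least squares fit for a single input variable."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x

# 4. Implementation: fit the model on the training data only.
a, b = fit_line(train)

# 5. Testing: measure mean squared error on the held-out data.
def mse(points, a, b):
    return sum((y - (a * x + b)) ** 2 for x, y in points) / len(points)

test_error = mse(test, a, b)

# 6. Deployment: expose the fitted model as a function others can call.
def predict(x):
    return a * x + b

print(f"slope={a:.3f} intercept={b:.3f} test MSE={test_error:.4f}")
```

In practice steps 3 to 5 are repeated many times, and deployment usually involves far more than wrapping the model in a function.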
3. Categories of Machine Learning Tasks
Machine learning tasks can be roughly divided into the following three types:
- Supervised learning: learning from labeled data to build a model that can make predictions on future data. Typical examples include spam detection and signature verification.
- Unsupervised learning: working with unlabeled data, with the goal of discovering patterns in the data. For example, by learning what normal server access logs look like, it is possible to identify abnormal access patterns.
- Reinforcement learning: an agent learns by interacting with an environment. Instead of labeled examples, it receives reward and penalty signals for its actions and gradually adjusts its behavior to maximize the total reward.
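To make the difference between the three task types more concrete, here is a rough sketch of each on toy data. Everything in it is an assumption made for illustration: the spam features, the request counts, and the reward probabilities are invented, and the chosen techniques (1-nearest-neighbor, a z-score outlier test, and an epsilon-greedy bandit) are just simple representatives of each category.

```python
import random
import statistics

# --- Supervised: learn from labeled examples (hypothetical toy data) ---
# Each example is ((message length, number of links), label).
labeled = [((120, 5), "spam"), ((80, 4), "spam"),
           ((300, 0), "ham"), ((250, 1), "ham")]

def nearest_neighbor(x, examples):
    """Return the label of the labeled example closest to x (1-NN)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(examples, key=lambda ex: dist2(x, ex[0]))[1]

# --- Unsupervised: no labels, look for structure in the data itself ---
requests_per_minute = [52, 48, 50, 51, 49, 53, 47, 50, 400, 51]

def find_outliers(values, threshold=2.5):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) > threshold * stdev]

# --- Reinforcement: act, observe a reward, and adjust ---
def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent choosing among arms with unknown reward rates."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)    # how often each arm was tried
    values = [0.0] * len(reward_probs)  # running average reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(reward_probs))            # explore
        else:
            arm = max(range(len(values)), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts

print(nearest_neighbor((100, 6), labeled))
print(find_outliers(requests_per_minute))
print(run_bandit([0.2, 0.8])[1])
```

Note that only the first function ever sees a label. The outlier test works from the data alone, and the bandit agent learns purely from the rewards its own choices produce.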
4. How Machines Work
To solve problems by completing tasks, machines generally perform the following types of work:
- Classification: assigning each input to one of a set of discrete categories, as in the spam detection example above. It is one of the most common tasks.
- Regression: despite the name, regression does not mean “going back” to somewhere. It means estimating the functional relationship between independent variables and dependent variables, usually in order to predict a continuous value.
- Clustering: when no predefined categories exist, items are grouped by similarity, much as galaxies are grouped by approximate distance and density.
- Dimensionality reduction: whether in anime worlds or in machine learning, reducing dimensions can greatly improve efficiency. Here it means representing the data with fewer features while keeping most of the useful information.
- Trial and error: mistakes in model design, data splitting, and many other areas can lead to absurd conclusions, so a large part of the work is repeated trial and error using training and test sets.
- Optimization: many fields, including finance, mathematics, and engineering, involve optimization problems: finding the values that maximize or minimize an objective function subject to constraints.
- Linear programming: a special case of optimization in which both the objective function and the constraints are linear. It should not be confused with linear algebra, which machine learning depends on heavily, although solving linear programs does rely on it.
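Two of the items above, clustering and linear programming, can be sketched in a few lines each. The data and the constraints are invented for illustration, and the methods are deliberately simple stand-ins: a one-dimensional k-means for clustering, and vertex enumeration for a two-variable linear program (a problem whose objective and constraints are all linear). The enumeration assumes the feasible region is bounded; real solvers use far more efficient algorithms.

```python
from itertools import combinations

def kmeans_1d(values, centers, iterations=10):
    """Cluster numbers by repeatedly assigning each to its nearest center."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

def solve_lp_2d(objective, constraints, tol=1e-9):
    """Maximize cx*x + cy*y subject to constraints (a, b, c): a*x + b*y <= c.

    Checks every intersection of two constraint boundaries and keeps the
    best feasible one. Assumes the feasible region is bounded.
    """
    cx, cy = objective
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel boundaries never intersect
        x = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + tol for a, b, c in constraints):
            value = cx * x + cy * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best

# Clustering: two obvious groups of measurements.
centers, groups = kmeans_1d([1.0, 1.2, 0.8, 5.0, 5.3, 4.7], centers=[0.0, 6.0])

# Linear programming: maximize 3x + 2y with x + y <= 4, x <= 3, x >= 0, y >= 0.
constraints = [(1.0, 1.0, 4.0), (1.0, 0.0, 3.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
best_value, best_x, best_y = solve_lp_2d((3.0, 2.0), constraints)
print(centers, (best_value, best_x, best_y))
```

For this small program the optimum sits at the corner x = 3, y = 1, which illustrates the general fact that a bounded linear program attains its optimum at a vertex of the feasible region.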
5. Types of Models
Models themselves can also be categorized. Broadly speaking, they can be divided into grouping models, which split the data into segments, and grading models, which score instances on a continuous scale. More specifically, models can roughly be divided into the following three categories:
- Geometric models: models built from geometric notions such as lines, planes, and distances. On a two-dimensional plane, for example, a classifier that separates the data with a line is easy to visualize. The same ideas extend to vector spaces made up of multidimensional vectors.
- Probabilistic models: one of the most typical examples is the Bayesian classifier, which is built on concepts such as prior and posterior probability. These models are a very important branch of machine learning.
- Logical models: models expressed as logical rules, such as decision trees. They are easy to interpret and highly valuable in practice.
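As a rough illustration, each of the three model families can be reduced to a tiny function. The weights, probabilities, and rules below are invented for this sketch and were not learned from any data; in a real system they would be estimated during training.

```python
# Geometric: a line w . x + b = 0 splits the plane into two half-planes.
def linear_classify(point, weights=(1.0, -1.0), bias=0.0):
    """Classify a 2-D point by which side of the line it falls on."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return "positive" if score >= 0 else "negative"

# Probabilistic: Bayes' rule turns a prior into a posterior.
def posterior(prior, likelihood, likelihood_given_not):
    """P(H | E) from P(H), P(E | H), and P(E | not H)."""
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence

# Logical: a decision tree is a set of nested if/else rules.
def decision_tree(has_link, sender_known):
    """Hypothetical spam rules: unknown senders with links are spam."""
    if has_link:
        return "ham" if sender_known else "spam"
    return "ham"

# Example: 20% of mail is spam (prior); a given word appears in 60% of
# spam and 5% of legitimate mail (assumed numbers).
p_spam_given_word = posterior(prior=0.2, likelihood=0.6, likelihood_given_not=0.05)
print(linear_classify((2.0, 1.0)), round(p_spam_given_word, 2), decision_tree(True, False))
```

The three functions answer the same kind of question in different languages: the first uses position in space, the second uses degrees of belief, and the third uses explicit rules a person can read.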
As this shows, machine learning draws on multiple disciplines, including geometry, linear algebra, probability theory, logic, and decision theory. This does not even count specialized application domains such as finance, engineering, economics, and physics, nor the knowledge required to turn theory into practice, such as programming and software engineering. Machine learning is a highly interdisciplinary field.


