Syllabus Point
- Research models used by software engineers to design and analyse ML
Including:
- decision trees
- neural networks
Traditional Computing vs Machine Learning
Understanding the differences between traditional computing approaches and machine learning approaches is fundamental for selecting the right solution strategy.
Purpose
- Traditional Computing: Decision support based on predefined rules
- Machine Learning: Predictive modelling based on learned patterns
Implementation
- Traditional Computing: Static, rule-based systems
- Machine Learning: Dynamic, data-driven learning from historical data
Complexity Handling
- Traditional Computing: Limited handling of complex scenarios
- Machine Learning: Capable of modelling complex, non-linear relationships
Adaptability
- Traditional Computing: Doesn't adapt to new information
- Machine Learning: Learns and improves with new data
Typical Usage
- Traditional Computing: Business processes, troubleshooting
- Machine Learning: Predictive analytics, classification tasks
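The contrast above can be sketched in code. This is a hypothetical spam check, with the feature (number of links in an email) and all data invented for illustration: the traditional version hard-codes its rule, while the learned version derives its threshold from historical examples.

```python
# Traditional computing: a predefined, static rule.
def is_spam_rule_based(link_count):
    return link_count > 5  # threshold chosen by a programmer; never changes

# Machine learning (minimal sketch): learn the threshold from labelled data.
def learn_threshold(examples):
    """examples: list of (link_count, is_spam) pairs."""
    candidates = sorted({count for count, _ in examples})
    best_t, best_correct = None, -1
    for t in candidates:
        # Count how many historical examples this threshold classifies correctly.
        correct = sum((count > t) == label for count, label in examples)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

history = [(1, False), (2, False), (8, True), (12, True), (3, False), (9, True)]
t = learn_threshold(history)
print(t)  # threshold inferred from data, not hand-coded
```

Feeding the learner different historical data would change the threshold, which is the adaptability the table describes; the rule-based version stays fixed until a programmer edits it.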
Decision Trees
A decision tree is a supervised learning model used for classification and regression. It uses a tree-like structure of nodes and branches to represent combinations of decisions and their possible consequences. Each decision path leads to either another decision or a final action.
Key Terminology
- Root node: The topmost node, representing the entire population/dataset - gets further divided into two or more homogeneous sets
- Splitting: Process of dividing a node into two or more sub-nodes
- Node: A decision point in the tree, typically testing a feature or attribute
- Decision (internal) node: A sub-node that splits into further sub-nodes
- Leaf/terminal node: A node with no further split nodes - represents the final outcomes/class labels
- Pruning: Process of removing sub-nodes of a decision node (opposite of splitting)
- Branches: Connections between nodes; each branch represents one outcome of the condition tested at a node
- Features: Input variables in a dataset - specific characteristics/measurements that help describe each data point
- Target (label/class): The output variable the model is trying to predict. The decision tree should learn patterns in the features to predict the target
Classification
Using a tree-like structure to predict a categorical outcome or class label from a set of data - sorting inputs into distinct groups.
Regression
Using the tree to predict a continuous numeric value, rather than a discrete class or label.
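The two task types above differ only in what the leaves hold. These hand-written toy trees (features, labels, and values all invented for illustration) show a classification tree returning a class label and a regression tree returning a number:

```python
# Classification tree: leaves hold class labels.
def classify_fruit(weight_g, is_round):
    if weight_g > 120:
        return "orange" if is_round else "banana"
    return "plum"

# Regression tree: leaves hold continuous numeric values (e.g. a price estimate).
def estimate_price(size_sqm):
    if size_sqm > 100:
        return 550_000.0
    return 320_000.0

print(classify_fruit(150, True))  # a discrete class label
print(estimate_price(80))         # a continuous numeric value
```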
How It Works
- At each node, the dataset is split based on the value of a specific feature
- Splitting continues until stopping criteria are met
- The model chooses splits that maximise information gain
- The goal is to build an accurate tree that is generalisable: able to make good predictions on new, unseen data
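The split-selection step above can be sketched with entropy-based information gain: the gain of a split is the parent node's entropy minus the weighted entropy of its children. The toy labels are invented for illustration.

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels (0 for a pure node)."""
    total = len(labels)
    return -sum((labels.count(c) / total) * log2(labels.count(c) / total)
                for c in set(labels))

def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting parent into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
# Candidate split A separates the classes perfectly; candidate B does not.
gain_a = information_gain(parent, ["yes", "yes"], ["no", "no"])
gain_b = information_gain(parent, ["yes", "no"], ["yes", "no"])
print(gain_a, gain_b)  # the model prefers the split with the higher gain
```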
Gini Impurity
Measures how mixed the classes are in a node. The higher the impurity, the more mixed the classes are.
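A minimal sketch of the Gini impurity calculation: 1 minus the sum of squared class proportions, giving 0 for a pure node and rising as the classes become more mixed.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1 - sum((labels.count(c) / total) ** 2 for c in set(labels))

print(gini(["A", "A", "A", "A"]))  # pure node -> 0.0
print(gini(["A", "A", "B", "B"]))  # evenly mixed two-class node -> 0.5
```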
Overfitting
- When a model becomes too tailored to the training data → captures noise instead of meaningful patterns
- Happens when the tree grows too deep/too many specific rules
- Pruning: removing unnecessary branches after the tree is built
Advantages
- Easy to interpret and visualise
- Good for datasets with clearly separable features
- Used in spam filters, customer segmentation, risk assessment
- Can handle nonlinear relationships
Limitations
- Can grow too complex or biased → poor performance on new, unseen data
- Sensitive to small changes in data
Neural Networks
Neural networks are a set of algorithms designed to recognise patterns by mimicking the way the human brain works. They are suitable for complex problems involving large amounts of unstructured data.
Overview
- Consist of a series of interconnected nodes (artificial neurons)
- Each neuron processes input, applies a function and passes the result forward
- Series of algorithms that recognise relationships in data; commonly used in deep learning to solve complex problems
Architecture: Input Layer
- Receives raw data
- Each neuron in the layer represents a feature of the input data
- Example: grid of pixels
Architecture: Hidden Layers
- Perform computations and transformations on the input data
- The more layers, the deeper the network (hence 'deep learning')
- Inside the network, there are connections between artificial neurons like tiny switches
- Connections have weights (how important each input is) and thresholds (how strong a signal must be to activate the next neuron)
Architecture: Output Layer
- Produces the final result/prediction
- Each neuron corresponds to a possible output class
Adjustable Parameters
- Adjustable parameters within these neurons are called weights and biases
- Weights: determine the strength of connections between neurons
- Biases: provide an additional parameter that shifts the activation function's output
- Activation function: decides whether a neuron should be activated based on the weighted sum of its inputs and a bias
- As the network learns these are adjusted to determine the strength of input signals
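The parameters above come together in a single neuron: a weighted sum of the inputs, plus a bias, passed through an activation function. This minimal sketch uses the sigmoid activation; the input values, weights, and bias are invented for illustration.

```python
import math

def sigmoid(z):
    """Squashes any real value into the range (0, 1)."""
    return 1 / (1 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of inputs, shifted by the bias...
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ...then the activation function decides how strongly the neuron "fires".
    return sigmoid(z)

out = neuron([1.0, 0.5], weights=[0.4, -0.2], bias=0.1)
print(round(out, 3))
```

Changing a weight changes how much its input matters; changing the bias shifts the point at which the neuron starts to activate.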
Training Cycle
The training cycle is when the neural network learns from data.
- Input data is fed in
- Network makes prediction
- Prediction compared to correct answer (called a label)
- Error is calculated (how far off)
- Network uses backpropagation to adjust its internal settings (weights): it works backward from the output error, layer by layer, to determine how each weight should change
- Process repeats (iteration) over many epochs (passes over the data) to reduce errors
- The goal is to help the network learn to make accurate predictions by adjusting its parameters based on examples and feedback
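The steps above can be sketched with the smallest possible network: one neuron with one weight, trained by gradient descent on squared error. The data is invented for illustration and follows the target relationship y = 2x, so the network should learn a weight close to 2.

```python
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, label) pairs
w = 0.0             # start with an arbitrary weight
learning_rate = 0.05

for epoch in range(100):             # repeated passes over the data (epochs)
    for x, label in data:
        prediction = w * x           # 1-2. feed input in, make a prediction
        error = prediction - label   # 3-4. compare to the label, get the error
        w -= learning_rate * error * x  # 5. adjust the weight to reduce the error

print(round(w, 2))  # close to the true slope of 2
```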
Execution Cycle
The execution cycle is when the trained neural network is used to make real-world decisions (also known as inference).
- New input is fed into the network
- Network runs its learned pattern-matching process
- Outputs a prediction or result
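In contrast to training, inference can be sketched with the weights held fixed: new input simply flows through the learned function, with no error calculation and no updates. The weight value here is assumed to come from a previous training run.

```python
w = 2.0  # learned during training; not changed during inference

def predict(x):
    return w * x  # forward pass only: no labels, no weight updates

print(predict(5.0))  # new, unseen input -> prediction
```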
Importance of Neural Networks
- Automating tasks to save time and money
- Improved decision making with better insights
- Increased efficiency
- New products and services
Uses
- Image recognition
- Natural language processing (speech recognition)
- Time series prediction (e.g. financial forecasting)