A systematic methodology to evaluating optimised machine learning based network intrusion detection systems
- Authors: Chindove, Hatitye Ethridge
- Date: 2022-10-14
- Subjects: Intrusion detection systems (Computer security) , Machine learning , Computer networks Security measures , Principal components analysis
- Language: English
- Type: Academic theses , Master's theses , text
- Identifier: http://hdl.handle.net/10962/362774 , vital:65361
- Description: A network intrusion detection system (NIDS) is essential for mitigating computer network attacks in various scenarios. However, the increasing complexity of computer networks and attacks makes classifying unseen or novel network traffic challenging. Supervised machine learning (ML) techniques used in a NIDS can be affected by different scenarios. Thus, dataset recency, size, and applicability are essential factors when selecting and tuning a machine learning classifier. This thesis explores developing and optimising several supervised ML algorithms with relatively new datasets constructed to depict real-world scenarios. The methodology includes empirical analyses of systematic ML-based NIDS for a near real-world network system to improve intrusion detection. The thesis is experiment-heavy in its model assessment. Data preparation methods are explored, followed by feature engineering techniques. The model evaluation process involves three experiments testing against a validation, untrained, and retrained set. The experiments compare several traditional machine learning and deep learning classifiers to identify the best NIDS model. Results show that the focus on feature scaling, feature selection methods and ML algorithm hyper-parameter tuning per model is an essential optimisation component. Distance-based ML algorithms performed much better with quantile transformation whilst the tree-based algorithms performed better without scaling. Permutation importance performs as a feature selection method compared to feature extraction using Principal Component Analysis (PCA) when applied against all ML algorithms explored. Random forests, Support Vector Machines and recurrent neural networks consistently achieved the best results, with high macro F1-scores of 90%, 81% and 73% for the CICIDS 2017 dataset; and 72%, 68% and 73% against the CICIDS 2018 dataset. , Thesis (MSc) -- Faculty of Science, Computer Science, 2022
- Date Issued: 2022-10-14
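
The description above reports results as macro F1-scores. As a reading aid, here is a minimal sketch of how a macro F1-score is computed in general: the F1-score of each class is calculated separately and the scores are averaged with equal weight, so rare attack classes count as much as the dominant benign class. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the thesis.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("no labels to score")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with hypothetical labels: one benign flow is misclassified.
y_true = ["benign", "benign", "benign", "attack"]
y_pred = ["benign", "benign", "attack", "attack"]
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy case the benign class scores 0.8 and the attack class about 0.667; the macro average weights the two equally, even though benign traffic is three times as frequent.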
Remote fidelity of Container-Based Network Emulators
- Authors: Peach, Schalk Willem
- Date: 2021-10-29
- Subjects: Computer networks Security measures , Intrusion detection systems (Computer security) , Computer security , Host-based intrusion detection systems (Computer security) , Emulators (Computer programs) , Computer network protocols , Container-Based Network Emulators (CBNEs) , Network Experimentation Platforms (NEPs)
- Language: English
- Type: Master's theses , text
- Identifier: http://hdl.handle.net/10962/192141 , vital:45199
- Description: This thesis examines whether Container-Based Network Emulators (CBNEs) are able to instantiate emulated nodes that provide sufficient realism to be used in information security experiments. The realism measure used is based on the information available from the point of view of a remote attacker. During the evaluation of a CBNE as a platform to replicate production networks for information security experiments, it was observed that nmap fingerprinting returned Operating System (OS) family and version results inconsistent with those of the host OS. CBNEs utilise Linux namespaces, the technology used for containerisation, to instantiate "emulated" hosts for experimental networks. Linux containers partition resources of the host OS to create lightweight virtual machines that share a single OS kernel. As all emulated hosts share the same kernel in a CBNE network, there is a reasonable expectation that the fingerprints of the host OS and emulated hosts should be the same. Given how CBNEs instantiate emulated networks, and given that fingerprinting returned inconsistent results, it was hypothesised that the technologies used to construct CBNEs are capable of influencing fingerprints generated by utilities such as nmap. It was predicted that hosts emulated using different CBNEs would show deviations in remotely generated fingerprints when compared to fingerprints generated for the host OS. An experimental network consisting of two emulated hosts and a Layer 2 switch was instantiated on multiple CBNEs using the same host OS. Active and passive fingerprinting were conducted between the emulated hosts to generate fingerprints and OS family and version matches. Passive fingerprinting failed to produce OS family and version matches as the fingerprint databases for these utilities are no longer maintained. For active fingerprinting, the OS family results were consistent between tested systems and the host OS, though the reported OS version results were inconsistent. A comparison of the generated fingerprints revealed that, for certain CBNEs, fingerprint features related to network stack optimisations of the host OS deviated from other CBNEs and the host OS. The hypothesis that CBNEs can influence remotely generated fingerprints was partially confirmed. One CBNE system modified Linux kernel networking options, causing a deviation from fingerprints generated for other tested systems and the host OS. The hypothesis was also partially rejected as the technologies used by CBNEs do not influence the remote fidelity of emulated hosts. , Thesis (MSc) -- Faculty of Science, Computer Science, 2021
- Date Issued: 2021-10-29