Vulnerability Detection by Using Machine Learning

Article 1 – “Machine Learning Methods for Inferring Interaction Design Patterns from Textual Requirements” by Silva-Rodríguez et al. (2019).

  1. Which method, or what kind of method, has been used?

The study explored the benefits of ambient intelligence for a variety of embedded computing systems. It applied natural language processing alongside other machine learning algorithms.

  1. Which data has been used?

The exercise used machine learning to establish the presence of an intrusion. The data sets included texts collected from many sources, such as the PROMISE corpus (Silva-Rodríguez et al., 2019). The data were then classified based on the Toxboe collection.

  1. Which training has been done, and how?

Training used a textual model and a design pattern classification, IDPatternM, and was then completed using the naïve Bayes algorithm.
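
To make the naïve Bayes step concrete, the following is a minimal sketch of a multinomial naïve Bayes text classifier with add-one smoothing. It is not the authors' implementation: the function names, the four requirement sentences, and the use of two pattern names as labels are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train_naive_bayes(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = log(label_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += log((word_counts[label][word] + 1)
                         / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical requirement sentences, invented for this sketch.
samples = [
    ("users can filter search results by category and price", "search filters"),
    ("the system shall let users narrow search results with filters", "search filters"),
    ("the input field accepts dates typed in any format", "forgiving format"),
    ("users may type a phone number in any format into the field", "forgiving format"),
]
model = train_naive_bayes(samples)
label = predict(model, "filter the search results by date")
```

With this toy data, `label` is `"search filters"`, because the query shares most of its words with the sentences under that label.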

  1. Which machine learning algorithms have been used?

Two main algorithms were used: a semi-supervised learning algorithm and the LinearSVM algorithm (Silva-Rodríguez et al., 2019).

  1. What’s the input?

The following inputs were used: search filters, table filter, morphing elements and a forgiving format.

  1. What’s the output?

The model produced a design pattern prediction as an output.

 

Article 2 – “Feature-Based Software Design Pattern Detection” by Nazar & Aleti (2020).

  1. Which method, or what kind of method, has been used?

The method entailed collecting a corpus of Java source code. The process then selected code features by extracting a semantic representation of the corpus, known as SSLR.

  1. Which data has been used?

The data were collected from the GitHub Java Corpus (GJC). The materials contained multiple Java files, CSS, HRM and unit test cases.

  1. Which training has been done, and how?

A training set was used alongside a test set. Training entailed learning classifiers from the main group of labelled instances. Each instance was then linked to its group through its label.

  1. Which machine learning algorithms have been used?

The Word2Vec algorithm was used.

  1. What’s the input?

The following inputs were used: the SSLR file, source code features and the source code corpus.

  1. What’s the output?

The output was the design pattern labels (DPL).

 

Article 3 – “Malware detection using machine learning based analysis of virtual memory access patterns” by Zu et al. (2017).

  1. Which method, or what kind of method, has been used?

The evaluation examined a system infected with a rootkit and observed how running a utility served the rootkit's goals, such as hiding malicious actions. Memory accesses were then monitored using specialized hardware. The machine learning algorithm attempted to detect both previously unseen attacks and those it had already recorded.

  1. Which data has been used?

The system collected memory access patterns using a Pin tool, along with records of detected attacks.

  1. Which training has been done, and how?

The experiment labelled each execution as malicious or otherwise. It then trained the model to identify labels for all functions during execution.

  1. Which machine learning algorithms have been used?

The process employed machine learning algorithms from the scikit-learn library (Zu et al., 2017).

  1. What’s the input?

The following inputs were used: memory corruption attacks, Linux kernel rootkits and user code.

  1. What’s the output?

The following outputs were produced: false positive rates and true positive rates.
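
As a brief illustration of these two metrics, the sketch below computes a true positive rate and a false positive rate from ground-truth and predicted labels. The label vectors are hypothetical and do not come from the paper. The function name `detection_rates` is likewise an assumption.

```python
def detection_rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for 0/1 labels,
    where 1 marks a malicious execution."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical labels for eight executions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
tpr, fpr = detection_rates(y_true, y_pred)  # (0.75, 0.25)
```

Here three of four malicious executions are detected and one of four benign executions is wrongly flagged.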

 

Article 4 – “Machine learning for network automation: overview, architecture, and applications” by Rafique and Luis (2018).

  1. Which method, or what kind of method, has been used?

The procedure used machine learning as a universal toolbox for categorizing intrusions, identifying the best class for a new set of observations, and evaluating the links among data samples.

  1. Which data has been used?

The data used included training sets, evaluation sets, algorithm sets and patterns.

  1. Which training has been done, and how?

The training phase entailed extracting data from historical databases. The data were then preprocessed to normalize them and remove outliers.
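
The preprocessing step can be sketched as follows. This is one possible reading, not the procedure from the article: it assumes a z-score rule for outliers and min-max scaling for normalization, and the threshold, function names and sample readings are hypothetical.

```python
from statistics import mean, pstdev


def remove_outliers(values, z_threshold=2.0):
    """Drop values whose z-score exceeds the threshold (an assumed rule)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)  # all values identical, nothing to drop
    return [v for v in values if abs(v - mu) / sigma <= z_threshold]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical historical readings with one obvious outlier.
readings = [10.0, 12.0, 11.0, 13.0, 12.0, 100.0]
cleaned = remove_outliers(readings)      # the value 100.0 is dropped
normalized = min_max_normalize(cleaned)  # remaining values scaled to [0, 1]
```

Outliers are removed before scaling in this sketch because a single extreme value would otherwise compress all other readings toward zero.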

  1. Which machine learning algorithms have been used?

The procedure used the following algorithms: K-Nearest Neighbors, artificial neural networks, and support vector machines (Rafique and Luis, 2018).
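
Of the algorithms listed, K-Nearest Neighbors is the simplest to illustrate. The sketch below is a minimal version that classifies a query point by majority vote among its k closest training points. The two-dimensional points and the labels "normal" and "degraded" are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical two-feature measurements with assumed labels.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["normal", "normal", "normal",
                "degraded", "degraded", "degraded"]
prediction = knn_predict(train_points, train_labels, (1.1, 1.0))
```

In this toy data the query lies next to the first cluster, so `prediction` is `"normal"`.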

  1. What’s the input?

The following input data was applied: maximum likelihood learning and generative models.

  1. What’s the output?

The output end featured neurons and reconfigured files.

Articles 5 and 6 – “Malicious URL detection using machine learning: A survey” by Sahoo et al. (2017) and “Using Lexical Features for Malicious URL Detection–A Machine Learning Approach” by Joshi et al. (2019).

  1. Which method, or what kind of method, has been used?

The methods discussed included CRISP-DM, whose phases cover understanding, preparing, modelling, evaluating and deploying data on malicious URLs (Joshi et al., 2019), and malicious URL detectors for fraud or cybersecurity applications (Sahoo et al., 2017).

  1. Which data has been used?

The exercise used millions of URLs drawn from a variety of sources and online repositories such as Alexa and OpenPhish.

  1. Which training has been done, and how?

Training was based on a comparison of different learning systems, which indicated that bagging algorithms were well suited to reducing variance.
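
The idea behind bagging can be sketched briefly: several simple models are each trained on a bootstrap resample of the data, and their votes are combined so that the quirks of any single resample average out. The sketch below is illustrative only. The one-feature threshold model, the use of URL length as the feature, and the sample values are assumptions, not details from either article.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-feature threshold classifier on (value, label) pairs, labels 0/1."""
    zeros = [v for v, y in samples if y == 0]
    ones = [v for v, y in samples if y == 1]
    if not zeros or not ones:
        # The resample contained a single class: always predict that class.
        only = 1 if ones else 0
        return lambda value: only
    mean_zero = sum(zeros) / len(zeros)
    mean_one = sum(ones) / len(ones)
    threshold = (mean_zero + mean_one) / 2  # midpoint between class means
    high_label = 1 if mean_one > mean_zero else 0
    return lambda value: high_label if value > threshold else 1 - high_label


def bagging_predict(samples, value, n_models=25, seed=0):
    """Majority vote of stumps, each trained on a bootstrap resample."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        # Sample with replacement, same size as the original data.
        bootstrap = [rng.choice(samples) for _ in samples]
        votes[train_stump(bootstrap)(value)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical data: (URL length, label), where 1 = malicious.
samples = [(12, 0), (15, 0), (18, 0), (20, 0),
           (60, 1), (72, 1), (85, 1), (90, 1)]
verdict = bagging_predict(samples, 80)
```

Each stump alone depends heavily on which rows its resample happened to contain, while the combined vote is far more stable.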

  1. Which machine learning algorithms have been used?

The following machine learning algorithms were used: AdaBoost, Naïve Bayes, Logistic Regression and Random Forest.

  1. What’s the input?

Numerical input was used, which entailed turning URL strings into numerical vectors.
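
Turning a URL string into a numerical vector can be illustrated with a handful of lexical features. The seven features below are a hypothetical selection chosen for this sketch and are not the feature set used in either article. The function name `url_to_vector` is also an assumption.

```python
from urllib.parse import urlparse


def url_to_vector(url):
    """Map a URL string to a small vector of lexical features."""
    parsed = urlparse(url)
    host = parsed.netloc
    return [
        float(len(url)),                            # total length
        float(len(host)),                           # hostname length
        float(url.count(".")),                      # number of dots
        float(sum(ch.isdigit() for ch in url)),     # number of digits
        float(url.count("-")),                      # number of hyphens
        1.0 if "@" in url else 0.0,                 # contains '@'
        1.0 if parsed.scheme == "https" else 0.0,   # uses HTTPS
    ]


vector = url_to_vector("http://example.com/login-page?id=42")
# vector == [35.0, 11.0, 1.0, 2.0, 1.0, 0.0, 0.0]
```

A vector of this kind can be passed to any of the classifiers named above, since they all expect fixed-length numerical input.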

  1. What’s the output?

Short URLs were produced as outputs.