Recent Submissions

  • Text Detection from an Image

    Andriamanalimanana, Bruno R.; Thesis Advisor; Novillo, Jorge; Thesis Committee; Spetka, Scott; Thesis Committee; Goda, Piyush Jain (2020-12)
    Recently, a variety of real-world applications have triggered a huge demand for techniques that can extract textual information from images and videos. As a result, image text detection and recognition have become active research topics in computer vision. The current trend in object detection and localization is to learn predictions with high-capacity deep neural networks trained on very large amounts of annotated data and requiring substantial processing power. In this project, I built an approach to text detection using object detection techniques: text regions are treated as objects, and an object detector, YOLO (You Only Look Once), is used to detect text in images. Detection is framed as a regression problem over spatially separated bounding boxes and associated class probabilities. YOLO is a single neural network that predicts bounding boxes and class probabilities directly from full images in one evaluation; because the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. The pre-trained MobileNet architecture was used and modified in different ways to find the best-performing model. The goal is to achieve high accuracy in text spotting. Experiments on the standard ICDAR 2015 dataset demonstrate that the proposed algorithm significantly outperforms comparable methods in both accuracy and efficiency.
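    As an illustration of the single-shot regression framing described above, the following is a minimal sketch, not the project's actual code, of a single-class YOLO-style detection head on a MobileNet backbone in Keras; the input resolution, head layout, and loss are assumptions.

```python
# Minimal sketch of a single-class ("text") YOLO-style detection head on a
# MobileNet backbone. Input size and head layout are illustrative choices.
import tensorflow as tf

INPUT_SIZE = 224  # MobileNet's default input resolution

backbone = tf.keras.applications.MobileNet(
    input_shape=(INPUT_SIZE, INPUT_SIZE, 3),
    include_top=False,
    weights="imagenet",
)

x = backbone.output  # 7x7x1024 feature map; the 7x7 grid serves as the prediction grid
x = tf.keras.layers.Conv2D(256, 3, padding="same", activation="relu")(x)
# 5 values per grid cell: x, y, w, h offsets plus a text-confidence score.
preds = tf.keras.layers.Conv2D(5, 1, activation="sigmoid")(x)

model = tf.keras.Model(backbone.input, preds)
model.compile(optimizer="adam", loss="mse")  # stand-in for the full YOLO loss
model.summary()
```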
  • Technology Case Study in Storage Area Networks

    Marsh, John; Thesis Advisor; Hash, Larry J.; Climek, David; Bull, Ronny; Pethe, Ameya (2014-05)
    In today's world we need immediate access to data. The demand for networked data access has increased exponentially in the last 20 years, and with that demand the importance and volume of networked data have also grown exponentially. The speed at which data can be accessed has increased, and data has moved from individual workstations to networked locations. Over the last decade there has been a trend to move mission-critical data away from individual workstations to a centralized data center, which removes the location constraint for accessing the data. If critical data is stored on individual servers, a failure makes that data inaccessible. Today, mission-critical applications are spanned over multiple servers for redundancy. With this topology, having the data in a central location allows the individual servers to work with the data more effectively. With the addition of virtualization, servers can be moved online from one physical host to another; if the data is centralized, it can be presented to all hosts in the cluster, allowing servers to move efficiently between hosts without losing access to critical data. Many businesses in industries such as finance, airlines, healthcare, and research depend on the speed and secure availability of their centralized data to function efficiently.
  • Non-Convex Optimization: RMSProp Based Optimization for Long Short-Term Memory Network

    Andriamanalimanana, Bruno; Committee Chair; Chiang, Chen-Fu; Thesis Committee; Novillo, Jorge; Thesis Committee; Yan, Jianzhi (2020-05)
    This project gives a comprehensive picture of non-convex optimization for deep learning and explains Long Short-Term Memory (LSTM) networks and RMSProp in detail. We start by illustrating the internal mechanisms of LSTM, such as the network structure and backpropagation through time (BPTT). We then introduce RMSProp optimization, together with relevant mathematical theorems and proofs, which give a clear picture of how the RMSProp algorithm helps escape saddle points. Finally, we train an LSTM with RMSProp in our experiments; the results present the method's efficiency and accuracy, in particular how it beats a traditional strategy in non-convex optimization.
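    For reference, the RMSProp update discussed above can be written in a few lines of NumPy; the learning rate, decay rate, and epsilon below are common defaults rather than values taken from the thesis.

```python
import numpy as np

def rmsprop_step(param, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp update: scale the gradient by a running RMS of past gradients."""
    cache = decay * cache + (1.0 - decay) * grad ** 2   # moving average of squared gradients
    param = param - lr * grad / (np.sqrt(cache) + eps)  # per-parameter adaptive step
    return param, cache

# Toy usage on the non-convex function f(w) = w**4 - 3*w**2.
w, cache = 2.0, 0.0
for _ in range(500):
    grad = 4 * w ** 3 - 6 * w          # f'(w)
    w, cache = rmsprop_step(w, grad, cache)
print(w)                               # approaches the local minimum near sqrt(1.5)
```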
  • Exploratory Data Analysis and Sentiment Analysis on Brazilian E-Commerce Website

    Andriamanalimanana, Bruno; Committee Chair; Novillo, Jorge; Thesis Committee; Reale, Michael; Thesis Committee; Patel, Mihir (2020-05)
    In the past few years, the growth of e-commerce and digital marketing has generated a huge volume of opinionated data, and analyzing those data provides enterprises with insight for better business decisions. E-commerce web applications are almost ubiquitous in our day-to-day life; however, as useful as they are, most of them have little to no adaptation to user needs, which in turn can cause both lower conversion rates and unsatisfied customers. We propose a machine learning system which learns user behavior from multiple previous sessions and predicts useful metrics for the current session. In turn, these metrics can be used by applications to customize and better target the customer, from offering better deals on specific products to targeted notifications or smart ad placement. With recent advances in every field, the need for efficient analytics and prediction techniques has increased to a large extent, and as data volumes grow it becomes difficult for companies to handle them, so new approaches are developed. Here we work with the dataset from the Olist e-commerce website covering the years 2016 to 2018. We study sentiment analysis of product reviews in Portuguese, since this dataset contains data from Brazilian supermarkets. Understanding customer sentiment is of paramount importance in marketing strategies today: not only does it give companies insight into how customers perceive their products and services, it also gives them an idea of how to improve their offers. This project examines the correlation of different variables in customer reviews of e-commerce products, and classifies each review by whether it recommends the reviewed product and whether it expresses positive, negative, or neutral sentiment.
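    A common baseline for the review-sentiment classification described above is TF-IDF features with a linear classifier; the sketch below uses scikit-learn and a few made-up Portuguese reviews rather than the Olist data itself.

```python
# Minimal sentiment baseline: TF-IDF features + logistic regression.
# The example reviews and labels are made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "produto excelente, chegou rápido",   # positive
    "entrega atrasou e veio quebrado",    # negative
    "ok, nada de especial",               # neutral
]
labels = ["positive", "negative", "neutral"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(reviews, labels)
print(clf.predict(["chegou antes do prazo, recomendo"]))
```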
  • Detection of Brain Tumor in Magnetic Resonance Imaging (MRI) Images using Fuzzy C-Means and Thresholding

    Andriamanalimanana, Bruno; Kalakuntla, Shashank; Andriamanalimanana, Bruno R.; First Reader; Novillo, Jorge E.; Second Reader; Spetka, Scott; Third Reader (SUNY Polytechnic Institute, 2020-08)
    Although many clinical experts and radiologists are well trained to identify tumors and other abnormalities in the brain, the identification, detection, and segmentation of the affected area is a tedious and time-consuming task. MRI is a conventional imaging technique for visualizing structures of the human body, and it is very difficult to visualize abnormal brain structures with simpler imaging techniques. MRI uses several imaging modalities that scan and capture the internal structure of the human brain; even so, it is difficult and tedious for the human eye to reliably detect brain tumors in these images. With emerging technology, we can ease the process of detection. This project focuses on the identification of brain tumors in MR images. It involves removing noise with an AMF noise-removal step, enhancing the images with the Balance Contrast Enhancement Technique (BCET), segmenting the images with fuzzy c-means, and finally feeding the segmented images to a Canny edge detector to produce the tumor image. This report describes the approach, design, and implementation of the application, and finally the results. The application was implemented in Python, and a Jupyter notebook provides a block simulation of the entire flow of the project.
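    The segmentation step can be sketched as a plain NumPy fuzzy c-means over pixel intensities followed by thresholding the membership of the brightest cluster; the cluster count, fuzzifier, and threshold are assumptions, and the AMF/BCET preprocessing and Canny edge step are omitted.

```python
# Sketch of fuzzy c-means segmentation over MRI pixel intensities (NumPy only).
import numpy as np

def fuzzy_cmeans(values, c=3, m=2.0, iters=100, seed=0):
    """Cluster 1-D intensity values into c fuzzy clusters; return memberships and centers."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(values), c))
    u /= u.sum(axis=1, keepdims=True)               # memberships sum to 1 per pixel
    for _ in range(iters):
        um = u ** m
        centers = (um.T @ values) / um.sum(axis=0)  # weighted cluster centers
        dist = np.abs(values[:, None] - centers[None, :]) + 1e-9
        u = 1.0 / dist ** (2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)           # standard FCM membership update
    return u, centers

# Toy "MRI slice": a bright region on a darker, slightly noisy background.
img = np.zeros((64, 64))
img[20:35, 25:40] = 0.9
img += 0.05 * np.random.default_rng(1).random(img.shape)

u, centers = fuzzy_cmeans(img.ravel())
tumor_cluster = int(np.argmax(centers))             # assume the brightest cluster is the lesion
mask = (u[:, tumor_cluster] > 0.5).reshape(img.shape)
print(mask.sum(), "pixels flagged as tumor")
```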
  • Non-Convex Optimization: RMSProp Based Optimization for Long Short-Term Memory Network

    Andriamanalimanana, Bruno; Yan, Jianzhi; Andriamanalimanana, Bruno; First Reader; Chiang, Chen-Fu; Second Reader; Novillo, Jorge; Third Reader (SUNY Polytechnic Institute, 2020-05-09)
    This project gives a comprehensive picture of non-convex optimization for deep learning and explains Long Short-Term Memory (LSTM) networks and RMSProp in detail. We start by illustrating the internal mechanisms of LSTM, such as the network structure and backpropagation through time (BPTT). We then introduce RMSProp optimization, together with relevant mathematical theorems and proofs, which give a clear picture of how the RMSProp algorithm helps escape saddle points. Finally, we train an LSTM with RMSProp in our experiments; the results present the method's efficiency and accuracy, in particular how it beats a traditional strategy in non-convex optimization.
  • Cyber Security Advantages of Optical Communications in SATCOM Networks

    Kholidy, Hisham A.; Baker, Cameron; Kholidy, Hisham A.; Advisor (SUNY Polytechnic Institute, 2020-12)
    Space-based communications, whether ground-to-space or inter-satellite, have so far been carried primarily in the RF spectrum. With the increase in space missions and the need to send larger amounts of data to and from satellites, the near-infrared or optical spectrum has started to be used more widely instead of RF. Higher bandwidth is not the only advantage of optical communications over RF; there is an inherent security advantage as well. Currently, there is far too little enforcement of security standards for space communications networks, and the use of RF only worsens the problem because of its very large beam spread compared to optics. This paper seeks to show that optics is a far superior technology for space communications networks from a security standpoint, in addition to providing an increase in available bandwidth. These points are supported by first introducing the technology through an examination of current Free Space Optics (FSO) systems and space optics systems offered by manufacturers. Secondly, the paper discusses the current state of space communications security, the issues RF-based space communications networks face given the recent advancement of low-cost SmallSat operations that threaten existing space vehicles, and the lack of standard security practices within these networks. Lastly, the paper provides evidence for why optical communications can improve the security of space-based communications, owing to their lower beam spread and the ability to incorporate quantum key distribution into the communications channel.
  • A Wireless Intrusion Detection for the Next Generation (5G) Networks

    Kholidy, Hisham A.; Ferrucci, Richard; Kholidy, Hisham A.; Advisor (SUNY Polytechnic Institute, 2020-05)
    5G data systems are close to public delivery. The question remains how security will impact the release of this cutting-edge architecture. 5G data systems will carry massive amounts of personal data, since nearly everyone now uses a mobile phone. With everyone using a 5G device, this architecture will present a huge attack surface for adversaries to compromise. Building on machine learning techniques previously applied to 802.11 networks, we show that we can get a better handle on security when it comes to 5G architecture. We find that a machine learning classifier known as LogitBoost, combined with a selected combination of features, provides optimal results in identifying three different classes of traffic, referred to as normal, flooding, and injection traffic, while drastically decreasing the time taken to perform the classification and improving the results. We simulate the device-to-device (D2D) connections involved in 5G systems using the AWID dataset. The evaluation and validation of the classification approach are discussed in detail in this thesis.
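    scikit-learn has no LogitBoost implementation, so the sketch below substitutes its GradientBoostingClassifier, combined with univariate feature selection, to illustrate the normal/flooding/injection classification workflow; the CSV path, column names, and k are hypothetical.

```python
# Illustrative 3-class traffic classification (normal / flooding / injection).
# GradientBoostingClassifier stands in for LogitBoost, which scikit-learn does not
# provide; "awid_subset.csv", the "label" column, and k=20 are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("awid_subset.csv")                   # hypothetical preprocessed AWID extract
X = pd.get_dummies(df.drop(columns=["label"]))        # one-hot encode categorical fields
y = df["label"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

clf = make_pipeline(SelectKBest(f_classif, k=20),     # keep the 20 most informative features
                    GradientBoostingClassifier(random_state=0))
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```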
  • Generic Datasets, Beamforming Vectors Prediction of 5G Cellular Networks

    Kholidy, Hisham A.; Singh, Manjit; Kholidy, Hisham A.; Advisor (SUNY Polytechnic Institute, 2020)
    The early stages of 5G evolution revolve around delivering higher data speeds, latency improvements, and the functional redesign of mobile networks to enable greater agility, efficiency, and openness. The millimeter-wave (mmWave) massive multiple-input multiple-output (massive MIMO) system is one of the dominant technologies that consistently features in the list of 5G enablers and opens up new frontiers of services and applications for next-generation 5G cellular networks. mmWave massive MIMO shows potential to significantly raise user throughput, enhance spectral and energy efficiency, and increase the capacity of mobile networks by jointly exploiting the huge bandwidth available in the mmWave frequency bands and the high multiplexing gains achievable with massive antenna arrays. In this report, we present the preliminary outcomes of research on mmWave massive MIMO (research on this subject is still in the exploratory phase) and study two papers related to mmWave and massive MIMO for next-generation 5G wireless systems. We focus on how a generic dataset incorporates accurate real-world measurements via ray-tracing data, and how machine learning and deep learning can find correlations in this ray-tracing data for better beam-vector prediction. We also study a deep learning model generated and trained using TensorFlow and Google Colaboratory.
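    A minimal sketch of the kind of beam-prediction model discussed above: a small fully connected network mapping ray-tracing channel features to a beam index in a fixed codebook. The feature dimension, codebook size, and synthetic data are assumptions, not the dataset studied in the report.

```python
# Toy beam-index prediction: map channel features to one of N_BEAMS codebook beams.
# Feature size, codebook size, and the random data are illustrative stand-ins.
import numpy as np
import tensorflow as tf

N_FEATURES, N_BEAMS = 64, 32
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, N_FEATURES)).astype("float32")  # stand-in for ray-tracing features
y = rng.integers(0, N_BEAMS, size=2000)                    # stand-in for best-beam labels

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(N_FEATURES,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(N_BEAMS, activation="softmax"),  # probability over codebook beams
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=64, verbose=0)
```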
  • Cloud-SCADA Penetrate: Practical Implementation for Hacking Cloud Computing and Critical SCADA Systems

    Kholidy, Hisham A. (SUNY Polytechnic Institute, 2020)
    In this report, we discuss some of our hacking and security solutions that we developed at our Advanced Cybersecurity Research Lab (ACRL). This report consists of the following five main experimental packages: 1) Exploiting the cloud computing system using a DDoS attack and developing a distributed deployment of a cloud based Intrusion Detection System (IDS) solution. 2) Hacking SCADA systems components. 3) Hacking Metasploitable machines. 4) Hacking Windows 7 system. 5) Windows Post Exploitation.
  • 5G Networks Security: Attack Detection Using the J48 and the Random Forest Tree Classifiers

    Kholidy, Hisham A.; Steele II, Bruce; Kholidy, Hisham A.; Advisor (SUNY Polytechnic Institute, 2020)
    5G is the next generation of cellular networks, succeeding and improving upon the last generation of 4G Long Term Evolution (LTE) networks. With the introduction of 5G come significant improvements over the previous generation, including the ability to support new and emerging technologies and the growth in the number of devices. The purpose of this report is to give a broad overview of what 5G encompasses, including the architecture, underlying technology, advanced features, use cases/applications, and security, and to evaluate the security of these new networks using existing machine learning classification techniques such as the J48 tree classifier and the Random Forest tree classifier. The evaluation is based on the UNSW-NB15 dataset, created at the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) at the University of New South Wales. Since 5G datasets have yet to be created, there is no publicly available dataset for 5G systems; however, while the UNSW-NB15 dataset is built from a standard wireless computer network, we use it to simulate the device-to-device (D2D) connections that 5G will support. On the UNSW dataset, the J48 tree classifier fits more accurately than the Random Forest classifier: J48 achieved 86.422% correctly classified instances, while the Random Forest classifier achieved 85.8451%.
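    Outside Weka, the same comparison can be sketched with scikit-learn; note that DecisionTreeClassifier implements CART rather than J48's C4.5, so it is only a rough stand-in, and the CSV path and label column are placeholders for a preprocessed UNSW-NB15 extract.

```python
# Sketch: single decision tree vs. random forest on a preprocessed UNSW-NB15 extract.
# DecisionTreeClassifier (CART) only approximates Weka's J48 (C4.5);
# "unsw_nb15.csv" and "attack_cat" are hypothetical names.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("unsw_nb15.csv")
X = pd.get_dummies(df.drop(columns=["attack_cat"]))   # one-hot encode categorical fields
y = df["attack_cat"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("random forest", RandomForestClassifier(n_estimators=100, random_state=0))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```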
  • An Empirical Wi-Fi Intrusion Detection System

    Kholidy, Hisham A.; Basnet, Diwash Bikram; Kholidy, Hisham A.; Advisor (SUNY Polytechnic Institute, 2020-05)
    Today, wireless network devices are growing rapidly in number, and securing those devices is of utmost importance. Attackers and hackers use new methods and techniques to trick systems and steal the most important data. Intrusion detection systems detect attacks by inspecting network traffic or logs. This work demonstrates the effectiveness of detecting attacks using machine learning techniques on the AWID dataset, which is produced from real wireless network logging. The authors of the AWID dataset have used several supervised learning models to successfully detect intrusions. In this paper, we propose a newer intrusion detection model based on dense neural networks and long short-term memory (LSTM) networks and evaluate it against the AWID-CLS-R subset. To get the best results from the model, we applied feature selection by replacing unknown data with the value “none”, removing all repeated values, and keeping only the important features. We preprocessed and feature-scaled both the training and testing datasets; we also reshaped the 2-dimensional input into a 3-dimensional array, because the LSTM expects 3-dimensional input, and later used Flatten layers to convert back to a 2-dimensional array for the output. A comprehensive evaluation of DNN and LSTM networks is used to classify and predict the attacks and compute precision, recall, and F1 score. We perform binary and multiclass classification on the dataset using neural networks and achieve accuracy ranging from 86.70% to 96.01%.
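    The reshape-to-3D / Flatten-back arrangement described above can be sketched in Keras as follows; the feature count, timestep grouping, and binary output are illustrative assumptions rather than the thesis's exact configuration.

```python
# Sketch of the arrangement described above: 2-D tabular features reshaped to a
# 3-D (samples, timesteps, features) tensor, an LSTM layer, then Flatten back to
# 2-D for the output layer. All dimensions and the random data are illustrative.
import numpy as np
import tensorflow as tf

N_SAMPLES, N_FEATURES, TIMESTEPS = 1000, 30, 1
X = np.random.default_rng(0).random((N_SAMPLES, N_FEATURES)).astype("float32")
y = np.random.default_rng(1).integers(0, 2, N_SAMPLES)   # stand-in labels: normal vs. attack

X3d = X.reshape(N_SAMPLES, TIMESTEPS, N_FEATURES)         # LSTM expects a 3-D input array

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(TIMESTEPS, N_FEATURES)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.Flatten(),                            # back to 2-D before the dense output
    tf.keras.layers.Dense(1, activation="sigmoid"),       # binary classification
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X3d, y, epochs=2, batch_size=32, verbose=0)
```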
  • An Analysis of a Signature-based Approach for an Intrusion Detection System in a Wireless Body Area Network (WBAN) using Data Mining Techniques

    Kholidy, Hisham A.; Medina, Serene Elisabeth; Kholidy, Hisham A.; Advisor (SUNY Polytechnic Institute, 2020)
    Wireless Body Area Networks (WBANs) use biosensors worn on, or in, the human body to collect and monitor a patient's medical condition. WBANs have become increasingly beneficial in the medical field by lowering healthcare costs and providing more useful information that medical professionals can use for a more accurate and faster diagnosis. Because the data collected from a WBAN is transmitted over a wireless network, there are several security concerns involved. This research looks at the various attacks and concerns involved with WBANs. A real physiological dataset, consisting of ECG signals obtained from a 25-year-old male, was used to test the accuracy of various decision tree classifiers. The Weka software was used to analyze the accuracy and detection-rate results on this dataset in its original form versus a reduced dataset consisting of fewer, more important attributes. The results show that decision tree classifiers built with data mining are an efficient way to improve accuracy on a real WBAN dataset once it has been reduced. The original dataset produced ROC curves ranging from 0.313 (31%) to 0.68 (68%), meaning accuracy was not very high and the detection rate was low. Once attribute selection was applied, the reduced dataset showed ROC curves ranging from 0.68 (68%) to 0.969 (97%) among the three classes. As a result, the decision tree models were much more accurate, with a higher detection rate, when used on a real dataset that had been reduced to function better as a detector for a WBAN.
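    The before/after comparison can be sketched with scikit-learn by training a decision tree on the full attribute set and on a reduced set and comparing one-vs-rest ROC AUC over the three classes; the data file, label column, and number of selected attributes are hypothetical.

```python
# Sketch: ROC AUC of a decision tree on the full attribute set versus a reduced
# set chosen by univariate selection. File name, label column, and k are placeholders.
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("wban_ecg.csv")                      # hypothetical WBAN/ECG dataset
X, y = df.drop(columns=["class"]), df["class"]        # assumes a 3-class label column
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

def ovr_auc(train_X, test_X):
    tree = DecisionTreeClassifier(random_state=0).fit(train_X, y_tr)
    return roc_auc_score(y_te, tree.predict_proba(test_X), multi_class="ovr")

selector = SelectKBest(f_classif, k=5).fit(X_tr, y_tr)   # keep the 5 strongest attributes
print("full attribute set AUC:", ovr_auc(X_tr, X_te))
print("reduced set AUC:       ", ovr_auc(selector.transform(X_tr), selector.transform(X_te)))
```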
  • Evaluating Variant Deep Learning and Machine Learning Approaches for the Detection of Cyberattacks on the Next Generation 5G Systems

    Kholidy, Hisham A.; Borgesen, Michael E.; Kholidy, Hisham A.; Advisor (SUNY Polytechnic Institute, 2020)
    5G technology promises to completely transform telecommunication networks, introducing a wealth of benefits such as faster download speeds, lower download times, low latency, and high network capacity. These benefits will pave the way for new capabilities and support connectivity for applications like smart homes and cities, industrial automation, autonomous vehicles, telemedicine, and virtual/augmented reality. However, attackers can use these same resources to their advantage to speed up the attacking process. This report evaluates four different machine learning and deep learning approaches, namely the Naïve Bayes model, the logistic regression model, the decision tree model, and the random forest model. The performance evaluation and validation of these approaches are discussed in detail in this report.
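    The four-model comparison can be sketched as a loop over scikit-learn estimators with cross-validation; the data file and label column below are placeholders, not the dataset used in the report.

```python
# Sketch: compare Naive Bayes, logistic regression, a decision tree, and a random
# forest with 5-fold cross-validation. "traffic.csv" and "label" are placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("traffic.csv")
X, y = pd.get_dummies(df.drop(columns=["label"])), df["label"]

models = {
    "Naive Bayes":         GaussianNB(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Decision tree":       DecisionTreeClassifier(random_state=0),
    "Random forest":       RandomForestClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")
```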
  • A Genetic Algorithm for Locating Acceptable Structure Models of Systems (Reconstructability Analysis)

    Heath, Joshua; Cavallo, Roger; Advisor; Reale, Michael; Reviewer; Sengupta, Saumendra; Reviewer (2018-05)
    The emergence of the field of General Systems Theory (GST) can best be attributed to the belief that all systems, irrespective of context, share simple organizational principles capable of being mathematically modeled with any of many forms of abstraction. Structure modeling is a well-developed aspect of GST specializing in analyzing the structure of a system, that is, the interactions between the attributes of a system. These interactions, while intuitive in smaller systems, become increasingly difficult to comprehend as the number of measurable attributes of a system increases. To combat this, one may approach an overall system by analyzing its various subsystems and, potentially, reconstruct properties of that system using knowledge gained from considering a collection of these subsystems (a structure model). In situations where the overall system cannot be fully reconstructed from a given structure model, the benefits and detriments of using such a model should both be considered: for example, while a model may be simpler to understand, or require less storage space in memory than the system as a whole, not all information about that system may be inferable from the model. As systems grow in size, determining the acceptability of every meaningful structure model of a system in order to find the most acceptable becomes exceedingly resource-intensive. In this thesis, a measure of the memory requirements associated with storing a system or a set of subsystems (a structure model) is defined and is used to define an objective measure of the acceptability of a structure as a representation of an overall system. A Genetic Algorithm for Locating Acceptable Structures (GALAS) is then outlined, with this acceptability criterion serving as an optimizable fitness function. The goal of this heuristic is to search the set of all meaningful structure models, without exhaustively generating each one, and produce those that are most acceptable based on predefined acceptability criteria.
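    A skeleton of a GALAS-style search: encode a structure model as a bit string over candidate subsystems, score it with an acceptability function, and evolve the population with selection, crossover, and mutation. The fitness function below is a toy placeholder, not the thesis's memory-based acceptability measure.

```python
# Skeleton of a genetic algorithm over structure models encoded as bit strings
# (one bit per candidate subsystem). acceptability() is a toy placeholder; the
# thesis derives it from memory requirements and reconstructability, not shown here.
import random

random.seed(0)
N_SUBSYSTEMS, POP, GENS = 16, 30, 50

def acceptability(model):
    k = sum(model)                      # number of subsystems included
    return k - 0.3 * k ** 2             # toy trade-off between coverage and size

def mutate(model, rate=0.05):
    return [bit ^ (random.random() < rate) for bit in model]

def crossover(a, b):
    cut = random.randrange(1, N_SUBSYSTEMS)
    return a[:cut] + b[cut:]

population = [[random.randint(0, 1) for _ in range(N_SUBSYSTEMS)] for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=acceptability, reverse=True)
    parents = population[: POP // 2]    # truncation selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children

print(max(population, key=acceptability))
```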
  • SECURITY CHALLENGES IN SDN IMPLEMENTATION

    Patil, Pradnya; Hash, Larry; Advisor; White, Joshua; Reviewer; Tekeoglu, Ali; Reviewer (2018-05)
    This study analyzes how security challenges caused by data and control layer separation in the SDN, such as Denial of Service attacks and unauthorized access attacks, limit SDN deployment. This study also offers network engineers’ views on preventing those security issues and whether implementing SDN is a good idea in the first place. This study was conducted in order to answer three questions: 1. How does data and control layer separation in SDN cause DoS and unauthorized access attacks? 2. What are the best practices and measures to minimize such security threats from the engineer’s point of view? 3. Do security threats at the lower layer affect the decision to implement SDN? These questions were answered by reviewing research papers and interviewing engineers from the telecommunication field. DoS and unauthorized access attacks are due to vulnerabilities in OpenFlow, SDN switches and SDN controllers. Table 6 presents solutions for preventing DoS and unauthorized access attacks. Most of the network engineers said SDN should be implemented based on cost, limited risk, customers’ positive views, and company projects, despite the current security challenges.
  • Password Habits of Security Literate Individuals

    Mahesh, Namrata; Hash, Larry; Advisor; Marsh, John; Reviewer; White, Joshua; Reviewer (2018-05)
    In the age of the Internet, the common user has accounts on multiple websites. Basic account authentication uses a username and a password. While the username may be common knowledge, the password is secret, and it is important to use good password habits. Security-literate internet users, i.e., students, faculty, and professionals in the IT industry, are expected to know better than to use bad password habits, but that may not always be the case. This thesis aims to test the hypothesis that security-literate internet users have bad password habits despite knowing better, and then proceeds to understand the underlying factors behind these habits through a survey. The survey consisted of questions about basic password habits, and the responses were analyzed for better insights.
  • OBJECT ORIENTED ARTIFICIAL NEURAL NETWORK SIMULATOR IN TEXT AND SYMBOL RECOGNITION

    Piszcz, Alan; Ishaq, Naseem; Advisor; Novillo, Jorge E; Reviewer; Sengupta, Saumendra; Reviewer (1993)
    Object-oriented languages and artificial neural networks are new areas of research and development. This thesis investigates the application of artificial neural networks using an object-oriented C++ backpropagation simulator. The application domain investigated is hand-printed text and engineering symbol recognition. An object-oriented approach to the simulator allows other simulator paradigms to reuse a large body of the object classes developed for this particular application. The review and implementation of image feature extraction methodologies is another area researched in this paper. Four feature techniques are researched, developed, applied, and tested using digits, upper-case alphabet characters, and engineering symbol images. Final implementation and testing of the feature extraction methods against a baseline technique is analyzed for applicability in the domain of hand-printed text and engineering symbols.
  • Applicability of the Julia Programming Language to Forward Error-Correction Coding in Digital Communications Systems

    Quinn, Ryan; Andriamanalimanana, Bruno R.; Advisor; Sengupta, Saumendra; Reviewer; Spetka, Scott; Reviewer (2018-05)
    Traditionally, software-defined radio (SDR) has been implemented in C and C++ for execution speed and processor efficiency; interpreted and high-level languages were considered too slow to handle the challenges of digital signal processing (DSP). The Julia programming language is a new language developed for scientific and mathematical purposes that is meant to write like Python or MATLAB and execute like C or FORTRAN. Given the touted strengths of the Julia language, it was worth investigating whether it is suitable for DSP. This project specifically addresses the applicability of Julia to forward error correction (FEC), a highly mathematical topic to which Julia should be well suited. It was found that Julia offers many advantages over C/C++ for faithful implementations of FEC specifications, but the optimizations necessary to use FEC in real systems are likely to blunt this advantage in normal use. The Julia implementations generally achieved a 33% or greater reduction in the source lines of code (SLOC) required, but were generally no more than 1/3 the speed of mature C/C++ implementations. While Julia has the potential to achieve the required performance for FEC, the optimizations required to do so will generally obscure the closeness between implementation and specification. At the current time it seems unlikely that Julia will pose a serious challenge to the dominance of C/C++ in the field of DSP.
  • NYS Fair Events Mobile Application With Client-Side Caching

    Kanala, Sumant; Cheng, Chen-Fu; Advisor; Gherasoiu, Iulian; Reviewer; Tekeoglu, Ali; Reviewer (2017-12)
    NYS Fair Events collects data about fair events that happen in New York State throughout the year, bundles them, and displays the upcoming events together with useful information about each event, the weather forecast, and a Google Maps route from the user's location to the event. The motivation for this project came from the growing market for mobile applications and from several months of web development work at a startup. A trend has been established in which more users prefer mobile apps over traditional PCs as their information exchange tool, so the development of better apps should be geared towards mobile phones and tablets. Development of the app is divided into two parts, the client side and the server side. For the client side I developed a Cordova-based mobile app which is cross-platform and can be compiled to run on Android and iOS devices. For the server side, I used the Node.js runtime environment and deployed it onto Heroku's free dyno tier, a cloud-based Platform as a Service (PaaS). Based on the user's actions, data is requested from the server's endpoints and the appropriate information is served and shown to the user in an intuitive manner.
