Publications!

Shiblee Sadik and Le Gruenwald, “Online Outlier Detection for Data Stream,” IDEAS 2011
Outlier detection is a well established area of statistics but most of the existing outlier detection techniques are designed for applications where the entire dataset is available for random access. A typical outlier detection technique constructs a standard data distribution or model and identifies the deviated data points from the model as outliers. Evidently these techniques are not suitable for online data streams where the entire dataset, due to its unbounded volume, is not available for random access. Moreover, the data distribution in data streams changes over time, which challenges the existing outlier detection techniques that assume a constant standard data distribution for the entire dataset. In addition, data streams are characterized by uncertainty which imposes further complexity. In this paper we propose an adaptive, online outlier detection technique addressing the aforementioned characteristics of data streams, called Adaptive Outlier Detection for Data Streams (A-ODDS), which identifies outliers with respect to all the received data points as well as temporally close data points. The temporally close data points are selected based on time and change of data distribution. We also present an efficient and online implementation of the technique and a performance study showing the superiority of A-ODDS over existing techniques in terms of accuracy and execution time on a real-life dataset collected from meteorological applications.
Shiblee Sadik and Le Gruenwald, “An Adaptive Outlier Detection Technique for Data Streams,” SSDBM 2011
This work presents an adaptive outlier detection technique for data streams, called Automatic Outlier Detection for Data Streams (A-ODDS), which identifies outliers with respect to all the received data points (global context) as well as temporally close data points (local context), where the local context is selected based on time and change of data distribution.
Md. Shiblee Sadik and Le Gruenwald, “Security for Data Stream Management System,” Security in Computing and Networking Systems: The State-of-the-Art, Eds. William McQuay and Waleed W. Smari
New applications are emerging, such as environment monitoring, Web click streams, and network traffic monitoring, where data take the form of streams that arrive continuously, usually at high speed and with a changing data distribution. Due to the unbounded data volume and the real-time, continuous, high-rate data collection and processing characteristics of those applications, traditional database management systems are not suitable to manage them. To fill the gap, researchers have proposed a new type of system, called Data Stream Management Systems (DSMS). Like traditional database management systems, DSMS need to provide security mechanisms to protect streams of data along with the system against malicious attacks in sensitive applications. The special characteristics of data stream applications raise new issues that must be considered when developing security mechanisms for DSMS. This paper discusses those issues, reviews how they have been addressed in the literature, and identifies future research directions.
Le Gruenwald, Md. Shiblee Sadik, Rahul Shukla and Hanqing Yang, “DEMS: A Data Mining Based Technique to Handle Missing Data in Mobile Sensor Network Applications,” DMSN’ 2010
In Mobile Sensor Network (MSN) applications, sensors move to increase the area of coverage and/or to compensate for the failure of other sensors. In such applications, loss or corruption of sensor data, known as the missing sensor data phenomenon, occurs for various reasons, such as power outage, network interference, and sensor mobility. A desirable way to address this issue is to develop a technique that can effectively and efficiently estimate the values of the missing sensor data in order to provide timely responses to queries that need to access the missing data. There exists work that aims at achieving such a goal for applications in static sensor networks (SSNs), but little research has been done for those in MSNs, which are more complex than SSNs due to the mobility of mobile sensors. In this paper, we propose a novel data mining based technique, called Data Estimation for Mobile Sensors (DEMS), to handle missing data in MSN applications. DEMS mines the spatial and temporal relationships among mobile sensors with the help of virtual static sensors. DEMS converts mobile sensor readings into virtual static sensor readings and applies the discovered relationships to virtual static sensor readings to estimate the values of the missing sensor data. We also present experimental results using both real-life and synthetic datasets to demonstrate the efficacy of DEMS in terms of data estimation accuracy.
Md. Shiblee Sadik and Le Gruenwald, “DBOD-DS: Distance Based Outlier Detection for Data Streams,” DEXA’ 2010
The data stream is a newly emerging data model for applications like environment monitoring, Web click streams, network traffic monitoring, etc. It consists of an infinite sequence of data points accompanied by timestamps coming from an external data source. Typically data sources are located onsite and are very vulnerable to external attacks and natural calamities, thus outliers are very common in the datasets. Existing techniques for outlier detection are inadequate for data streams because of their metamorphic data distribution and uncertainty. In this paper we propose an outlier detection technique, called Distance-Based Outlier Detection for Data Streams (DBOD-DS), based on a novel continuously adaptive probability density function that addresses all the new issues of data streams. Extensive experiments on a real dataset for meteorology applications show the supremacy of DBOD-DS over existing techniques in terms of accuracy.
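To make the general idea of distance-based outlier detection on a stream concrete, here is a minimal sketch. It is not DBOD-DS: it uses no adaptive probability density function, and the class name, the fixed-size window, and the `radius` and `min_neighbors` parameters are all assumptions chosen for illustration. A new point is flagged when too few recent points lie within a fixed distance of it.

```python
from collections import deque


class SlidingWindowOutlierDetector:
    """Generic distance-based detector over the most recent points.

    Illustrative only; the names and parameters are hypothetical and this
    is not the DBOD-DS algorithm from the abstract above.
    """

    def __init__(self, window_size, radius, min_neighbors):
        self.window = deque(maxlen=window_size)  # old points drop off automatically
        self.radius = radius
        self.min_neighbors = min_neighbors

    def update(self, value):
        """Return True if `value` looks like an outlier, then store it."""
        neighbors = sum(1 for v in self.window if abs(v - value) <= self.radius)
        # While the window holds fewer points than min_neighbors, no verdict is possible.
        warmed_up = len(self.window) >= self.min_neighbors
        is_outlier = warmed_up and neighbors < self.min_neighbors
        self.window.append(value)
        return is_outlier


detector = SlidingWindowOutlierDetector(window_size=5, radius=1.0, min_neighbors=2)
flags = [detector.update(v) for v in [10.0, 10.2, 9.9, 10.1, 25.0, 10.0]]
```

Because the window forgets old points, the notion of "normal" follows the stream as its distribution drifts, although far more crudely than a continuously adapted density estimate would.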
Le Gruenwald, Hanqing Yang, Md. Shiblee Sadik and Rahul Shukla, “Using Data Mining to Handle Missing Data in Multi-Hop Sensor Network Applications,” MobiDe’ 2010
A sensor’s data loss or corruption, aka sensor data missing, is a common phenomenon in modern wireless sensor networks. It is more severe for multi-hop sensor network (MSN) applications where sensor data reach the base station via other sensors; hence a sensor’s failure can cause multiple missing data. In this paper we present MASTER-M, a data estimation framework based on data clustering and association rule mining to estimate the values of missing sensor data for MSN. Estimating, instead of resending, the missing sensor data is becoming popular as it may reduce query response time and sensor energy consumption; however the current works cater to only single-hop sensor networks. To fill this gap, our novel technique addresses the issues related to MSN, such as simultaneous missing sensors and missing spatially correlated sensors. It consists of three steps: 1) clustering sensors online; 2) capturing association rules between sensors inside each cluster, and 3) estimating the values of the missing data using the obtained association rules. Experimental results on both real-life sensor data and synthetic sensor data demonstrate the efficacy of MASTER-M in terms of estimation accuracy compared to the existing techniques. Moreover, we also present experiments showing the supremacy of data estimation by MASTER-M in terms of energy savings over re-transmission of missing data.
Boris S. Verkhovsky and Md. Shiblee Sadik, “Accelerated Search for Gaussian Generator Based on Triple Prime Integers,” Journal of Computer Science, 2009
Problem statement: Modern cryptographic algorithms are based on complexity of two problems: Integer factorization of real integers and a Discrete Logarithm Problem (DLP). Approach: The latter problem is even more complicated in the domain of complex integers, where Public Key Cryptosystems (PKC) had an advantage over analogous encryption-decryption protocols in arithmetic of real integers modulo p: The former PKC have quadratic cycles of order O(p^2) while the latter PKC had linear cycles of order O(p). Results: An accelerated non-deterministic search algorithm for a primitive root (generator) in a domain of complex integers modulo triple prime p was provided in this study. It showed the properties of triple primes, the frequencies of their occurrence on a specified interval and analyzed the efficiency of the proposed algorithm. Conclusion: Numerous computer experiments and their analysis indicated that three trials were sufficient on average to find a Gaussian generator.
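As a rough illustration of what a randomized search for a Gaussian generator involves, the sketch below draws random Gaussian integers a + bi modulo a prime p and tests whether each has full multiplicative order p^2 - 1. It is a generic textbook-style check, not the accelerated algorithm of the paper: it assumes p is a prime with p ≡ 3 (mod 4), so that the Gaussian integers modulo p form a field, it factors p^2 - 1 by plain trial division, and it does not exploit any triple-prime structure. All function names are hypothetical.

```python
import random


def gmul(x, y, p):
    """Multiply Gaussian integers x = (a, b) and y = (c, d) modulo p."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def gpow(x, e, p):
    """Raise a Gaussian integer to the power e modulo p by square-and-multiply."""
    result = (1, 0)
    while e > 0:
        if e & 1:
            result = gmul(result, x, p)
        x = gmul(x, x, p)
        e >>= 1
    return result


def prime_factors(n):
    """Distinct prime factors of n by trial division (fine for small n only)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def is_generator(g, p):
    """True if g has order p^2 - 1; assumes p is prime and p % 4 == 3."""
    if g == (0, 0):
        return False
    order = p * p - 1
    return all(gpow(g, order // q, p) != (1, 0) for q in prime_factors(order))


def find_generator(p, rng=random):
    """Try random candidates until one is a generator; return it with the trial count."""
    trials = 0
    while True:
        trials += 1
        g = (rng.randrange(p), rng.randrange(p))
        if is_generator(g, p):
            return g, trials
```

Calling `find_generator(7)` returns a pair `(g, trials)`; the number of trials varies from run to run because candidates are drawn at random.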
Md. Ahsan Arefin, Md. Shiblee Sadik, Forhad Rabbi, and M. A. Mottalib, “3-Tier Architecture of Data Server on Grid: Implemented using Globus Toolkit,” GCA’ 2007
A Grid system is one of the newest forms of distributed system. In a distributed system, we often hope to distribute the overhead of the entire system across different PCs. Parallel execution is one of the major features of a Grid system, and RAM or other resources can also be shared. However, it is quite challenging to design a data server on a Grid. A traditional Database Management System (DBMS) fails to deliver the full performance of a Grid system, so here we propose a modified Data Server architecture that uses an existing database system. This implementation-based paper also includes comparisons, figures and tables indicating the performance of the Data Server on Grid relative to the traditional DBMS. We have used the Globus Toolkit in our implementation, and all the terms in this paper are similar to the terms used in the Globus Toolkit.
Md. Ahsan Arefin, Md. Shiblee Sadik, Serena Coetzee, Judith Bishop, “Alchemi Vs Globus: A Performance Comparison,” ICECE’ 2006
Alchemi and the Globus Toolkit are open source software toolkits for implementing a Grid. Although both toolkits are designed for the same purpose, their architecture and underlying technology are completely different. Thus, a performance comparison of a Grid implementation in Alchemi with a similar Grid implementation in the Globus Toolkit will be interesting. We built a test bed to compare the performance of the two toolkits. This paper includes tables and graphs to illustrate the comparison.
Md. Ahsan Arefin, Md. Shiblee Sadik, “Implementation of Server on Grid System: A Super Computer Approach,” ISD’ 2006
Internet technology has already changed the Information Society in profound ways, and will continue to do so. Nowadays many people foresee a similar trajectory for the next generation of the Internet: Grid Technology. As an emerging computational and networking infrastructure, Grid Computing is designed to provide pervasive, uniform and reliable access to data, computational and human resources distributed in a dynamic, heterogeneous environment. The development of Grid Security also provides a secure environment for this work. For these reasons, we have used Grid Technology to implement a server that responds to many clients for database queries or other database applications. Our design provides single-level distribution of Grid applications, which yields faster response time and higher throughput than a normal server application and even a distributed application. We have used the open source software “Alchemi” throughout our work. This document summarizes the design and implementation of a server using Grid Technology and compares its performance.