Human Behavior Modeling in Video Surveillance of Conference Halls
In this paper, we present an approach to modeling human behavior in video scenes. The approach is used to model normal behaviors in conference halls. We use Probabilistic Latent Semantic Analysis (PLSA) with the 'Bag-of-Terms' paradigm as a tool for exploring video data and learning the model by grouping similar activities. Our term vocabulary consists of 3D spatio-temporal patch groups assigned according to the direction of motion. Our video representation preserves the spatial information, the object trajectory, and the motion. The main advantage of this approach is that it can be adapted to detect abnormal behaviors in order to ensure and enhance human security.
Ensuring Uniform Energy Consumption in Non-Deterministic Wireless Sensor Network to Protract Networks Lifetime
Wireless sensor networks have attracted much attention from researchers around the world, owing to their extensive applicability in agricultural, industrial and military fields. Energy-conserving node deployment strategies play a notable role in the effective implementation of wireless sensor networks. Clustering is an approach that improves energy efficiency in the network. A clustering algorithm needs an optimal size and number of clusters because clustering, if not implemented properly, cannot effectively increase the life of the network. In this paper, an algorithm is proposed to address connectivity issues with the aim of ensuring uniform energy consumption of nodes in every part of the network. Simulation results show that the proposed algorithm has an edge over existing algorithms in terms of throughput and network lifetime.
K-Means Based Matching Algorithm for Multi-Resolution Feature Descriptors
Matching high-dimensional features between images is computationally expensive for exhaustive search approaches in computer vision. Although the dimension of the features can be reduced by simplifying the prior knowledge of homography, matching accuracy may degrade as a tradeoff. In this paper, we present a feature matching method based on the k-means algorithm that reduces the matching cost and matches features between images without relying on a simplified geometric assumption. Experimental results show that the proposed method outperforms previous linear exhaustive search approaches in terms of the inlier ratio of matched pairs.
Generalization of Clustering Coefficient on Lattice Networks Applied to Criminal Networks
A lattice network is a special type of network in
which all nodes have the same number of links, and its boundary
conditions are periodic. The most basic lattice network is the ring, a
one-dimensional network with periodic border conditions. In contrast,
the Cartesian product of d rings forms a d-dimensional lattice
network. An analytical expression currently exists for the clustering
coefficient in this type of network, but the theoretical value is valid
only up to a certain connectivity value; in other words, the analytical
expression is incomplete. Here we obtain analytically the clustering
coefficient expression in d-dimensional lattice networks for any link
density. Our analytical results show that, as the link density of a
lattice network tends to 1, its clustering coefficient approaches the
value of the clustering coefficient of a fully connected network. We
developed a criminology model in which the generalized clustering
coefficient expression is applied. The model states that delinquents
learn the know-how of the crime business by sharing knowledge, directly
or indirectly, with their friends in the gang. This generalization sheds
light on the network properties, which is important for developing new
models in different fields where network structure plays an important
role in the system dynamics, such as criminology, evolutionary game
theory, and econophysics, among others.
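The limiting behavior described in this abstract can be checked numerically on the simplest case, the ring. The following is a minimal sketch, not the authors' analytical expression: the function names `ring_lattice` and `clustering_coefficient` are hypothetical, and the coefficient is computed by brute-force triangle counting.

```python
from itertools import combinations


def ring_lattice(n, k):
    """Adjacency sets of a ring of n nodes, each linked to its k nearest neighbors (k even)."""
    half = k // 2
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, half + 1):
            j = (i + step) % n
            if j != i:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Average local clustering coefficient: fraction of linked neighbor pairs per node."""
    total = 0.0
    for nbrs in adj.values():
        deg = len(nbrs)
        if deg < 2:
            continue  # nodes with fewer than two neighbors contribute zero
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (deg * (deg - 1))
    return total / len(adj)


sparse = clustering_coefficient(ring_lattice(20, 4))  # low link density
dense = clustering_coefficient(ring_lattice(7, 6))    # every node linked to all others
```

In the dense case the ring is a complete graph, so the coefficient reaches 1, the value for a fully connected network.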
Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering
Coastal regions are among the most heavily used places, shaped by both the natural balance and a growing population. In coastal engineering, the most valuable data is wave behavior. The amount of this data becomes very large because observations take place over periods of hours, days and months. In this study, statistical methods such as wave spectrum analysis methods and standard statistical methods have been used. The goal of this study is to discover profiles of different coastal areas by using these statistical methods and thus to obtain, from the big data, an instance-based data set that can be analyzed with data mining algorithms. In the experimental studies, six sample data sets of wave behavior, obtained from 20-minute observations in Mersin Bay, Turkey, were converted to an instance-based form, and different clustering techniques from data mining were used to discover similar coastal places. Moreover, this study discusses how this summarization approach can be used in other fields that collect big data, such as medicine.
Hybrid Hierarchical Clustering Approach for Community Detection in Social Network
Social Networks generally present a hierarchy of
communities. To determine these communities and the relationship
between them, detection algorithms should be applied. Most of
the existing algorithms, proposed for hierarchical communities
identification, are based on either agglomerative clustering or
divisive clustering. In this paper, we present a hybrid hierarchical
clustering approach for community detection based on both
bottom-up and top-down clustering. Our approach
provides a more relevant community structure than hierarchical
methods that consider only divisive or agglomerative clustering
to identify communities. Moreover, we performed comparative
experiments to evaluate the quality of the clustering results and to
show the effectiveness of our algorithm.
A Model Based Metaheuristic for Hybrid Hierarchical Community Structure in Social Networks
In recent years, the study of community detection
in social networks has received great attention. The hierarchical
structure of the network leads to convergence
to a locally optimal community structure. In this paper, we aim
to avoid this local optimum in the introduced hybrid hierarchical
method. To achieve this purpose, we present an objective function
that incorporates structural and semantic similarity
based modularity, together with a metaheuristic, namely the bee colony
algorithm, to optimize our objective function at both hierarchical levels,
divisive and agglomerative. In order to assess the efficiency and the accuracy
of the introduced hybrid bee colony model, we perform an extensive
experimental evaluation on both synthetic and real networks.
A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database
The increasing amount of collected data has limited the performance of current analysis algorithms. Thus, developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has attracted significant interest. In this paper, a modified k-means based algorithm is developed and tested. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. It uses the City Block (Manhattan) distance and a new stop criterion to guarantee convergence. Experiments conducted on a real data set show its high performance compared with the original k-means version.
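As an illustration of clustering under the City Block distance, here is a minimal sketch. It is not the algorithm proposed in this abstract: the abstract does not specify the new stop criterion, so this sketch simply stops when the assignments no longer change, and it updates each center with the coordinate-wise median (an assumption, and a common choice for the Manhattan metric). The names `manhattan` and `kmeans_city_block` are hypothetical.

```python
from statistics import median


def manhattan(p, q):
    """City Block (L1) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def kmeans_city_block(points, centers, max_iter=100):
    """Cluster points by L1 distance, starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under the Manhattan distance.
        new_labels = [
            min(range(len(centers)), key=lambda j: manhattan(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assumed stop criterion: assignments are stable
            break
        labels = new_labels
        # Update step: coordinate-wise median of each cluster's members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(median(col) for col in zip(*members))
    return labels, centers


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans_city_block(points, [(0, 0), (10, 10)])
```

The L1 distance avoids the squaring and square root of the Euclidean distance, which is one plausible source of the reduced computational load mentioned above.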
Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency
Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. The main problem encountered in data mining applications is clustering the categorical data that is so prevalent in these datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead of k-modes. In this paper, we test an approach based on this idea, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are tested. The results show that our proposed method outperforms the binary method in all cases.
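The transformation described in this abstract can be sketched as follows. This is a minimal illustration that assumes the relative frequency of a modality is its count in the attribute divided by the number of records; the function name `relative_frequency_encode` and the toy data are hypothetical.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency within its attribute."""
    n = len(rows)
    columns = list(zip(*rows))                 # one tuple of values per attribute
    freqs = [Counter(col) for col in columns]  # modality counts per attribute
    return [[freqs[j][value] / n for j, value in enumerate(row)] for row in rows]


rows = [("red", "S"), ("red", "M"), ("blue", "S"), ("red", "S")]
encoded = relative_frequency_encode(rows)
```

The encoded rows are purely numeric, so a standard k-means implementation can be applied to them directly. Note that two different modalities with the same frequency receive the same numeric value under this encoding.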
Optimal Maintenance Clustering for Rail Track Components Subject to Possession Capacity Constraints
This paper studies the optimal planning of preventive maintenance and renewal activities for components in a single railway track when the available time for maintenance is limited. The rail-track system consists of several types of components, such as rail, ballast, and switches, with different preventive maintenance and renewal intervals. To perform maintenance or renewal on the track, a train-free period for maintenance, called a possession, is required. Since a major possession directly affects the regular train schedule, maintenance and renewal activities are clustered as much as possible. In a highly dense and utilized railway network, the possession time on the track is critical since the demand for train operations is very high and a long possession has a severe impact on the regular train schedule. We present an optimization model and investigate the maintenance schedules with and without the possession capacity constraint. In addition, we integrate into the optimization model the socio-economic cost related to the effect of the maintenance time on the variable possession cost. A numerical example is provided to illustrate the model.
The Survey Research and Evaluation of Green Residential Building Based on the Improved Group Analytical Hierarchy Process Method in Yinchuan
Due to the economic downturn and the deterioration of the living environment, the development of residential buildings, which are high energy consuming buildings, is gradually changing from “extensive” building to green building in China. The evaluation system for green building is therefore continuously being improved, but current evaluation work has the following problems: (1) There are differences between the actual investment cost and the purchasing power of residents, and the construction target of green residential building is single and lacks multi-objective performance development. (2) Green building evaluation lacks regional characteristics and cannot reflect the demands of residents in different regions. (3) In the process of determining the criteria weights, the experts’ judgment matrix rarely meets the consistency requirement. To solve these problems, questionnaires about green residential building in the Ningxia area were first distributed; their results reflect the purchasing power of residents and their acceptance of the cost of green building. Secondly, the evaluation criteria system for green residential building is constructed in combination with the geographical features of the Ningxia minority areas. Finally, the criteria weights are determined using the improved group AHP method and the grey clustering method, and a real case located in Xing Qing district, Ningxia, is evaluated. We conclude that the professional evaluation of this project is broadly consistent with its good social recognition.
Hierarchical Checkpoint Protocol in Data Grids
Grids of computing nodes have emerged as a
representative means of connecting distributed computers or
resources scattered all over the world for the purpose of computing
and distributed storage. Since fault tolerance becomes complex due
to the availability of resources in a decentralized grid environment,
it can be used in connection with replication in data grids. The
objective of our work is to present fault tolerance in data grids
with a data replication-driven model based on clustering. The
performance of the protocol is evaluated with the OMNeT++ simulator.
The computational results show the efficiency of our protocol in
terms of recovery time and the number of processes in rollbacks.
Energy-Efficient Clustering Protocol in Wireless Sensor Networks for Healthcare Monitoring
Wireless sensor networks (WSNs) can facilitate continuous monitoring of patients and increase early detection of emergency conditions and diseases. High-density WSNs help us to accurately monitor a remote environment by intelligently combining the data from the individual nodes. Due to the limited energy capacity of sensors, enhancing the lifetime and the reliability of WSNs is an important factor in the design of these networks. Clustering strategies have been verified as effective and practical algorithms for reducing energy consumption in WSNs and can tackle the limitations of WSNs. In this paper, an Energy-efficient Weight-based Clustering Protocol (EWCP) is presented. An artificial retina is selected as a case study of WSNs applied in body sensors. Cluster head (CH) selection is based on energy-efficient parameters. Moreover, cluster members are selected based on their distance to the selected CHs. Compared with the other benchmark protocols, the lifetime of EWCP is improved significantly.
Chemical Reaction Algorithm for Expectation Maximization Clustering
Clustering has been an intensive research topic for some years
because of its multifaceted applications in fields such as biology,
information retrieval, medicine and business. Expectation maximization
(EM) is an algorithm framework for clustering methods and is described
as one of the ten algorithms of machine learning. Traditionally, optimization
of an objective function has been the standard approach in EM. Hence,
research has investigated the utility of evolutionary computing and
related techniques in this regard. Chemical Reaction Optimization
(CRO) is a recently established method, and its properties
can be used to solve optimization problems. This paper presents
an algorithm framework (EM-CRO) with modified CRO operators
for EM clustering problems. The hybrid algorithm mainly
addresses the sensitivity to initial values of clustering algorithms
based on objective function optimization. Our experiments
take the classic EM-type algorithms k-means and fuzzy k-means as
examples, using the CRO algorithm to optimize their initial values, which
yields the K-means-CRO and FKM-CRO algorithms. The experimental results
show improved efficiency in solving objective
function optimization clustering problems.
3D Mesh Coarsening via Uniform Clustering
In this paper, we present a fast and efficient mesh coarsening algorithm for 3D triangular meshes. This approach can be applied to very complex 3D meshes of arbitrary topology and with millions of vertices. The algorithm is based on clustering the input mesh elements: it divides the faces of an input mesh into a given number of clusters by approximating the Centroidal Voronoi Tessellation of the input mesh. Once a clustering is achieved, it provides an efficient way to construct uniform tessellations and therefore leads to good coarsening of polygonal meshes. With the proliferation of 3D scanners, this coarsening algorithm is particularly useful for reverse engineering applications of 3D models, which in many cases are dense, non-uniform, irregular and of arbitrary topology. Examples demonstrating the effectiveness of the new algorithm are also included in the paper.
LiDAR Based Real Time Multiple Vehicle Detection and Tracking
Self-driving vehicles require a high level of situational
awareness in order to maneuver safely when driving in real-world
conditions. This paper presents a LiDAR based real time perception
system that is able to process raw sensor data for multiple target
detection and tracking in dynamic environments. The proposed
algorithm is nonparametric and deterministic; that is, no assumptions
or a priori knowledge about the input data are needed, and no
initialization is required. Additionally, the proposed method
works directly on the three-dimensional data generated by LiDAR,
without sacrificing the rich information contained in the 3D
domain. Moreover, a fast and efficient real time clustering algorithm
based on a radially bounded nearest neighbor (RBNN) approach is applied.
The Hungarian algorithm and adaptive Kalman filtering are
used for data association and tracking. The proposed
algorithm is able to run in real time with an average run time of 70 ms
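The radially bounded nearest neighbor idea can be sketched as follows: any two points closer than a fixed radius end up in the same cluster, so no cluster count or initialization is needed. This is a minimal illustration rather than the system described in this abstract. The brute-force pairwise search and the union-find bookkeeping are assumptions made for brevity (a real-time system would typically use a spatial index), and the name `rbnn_clusters` is hypothetical.

```python
import math


def rbnn_clusters(points, radius):
    """Label 3D points so that points within `radius` of each other share a cluster."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two clusters

    # Relabel representatives as consecutive integers in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]


cloud = [(0, 0, 0), (0.5, 0, 0), (1.0, 0, 0), (5, 5, 5), (5.4, 5, 5)]
labels = rbnn_clusters(cloud, radius=0.6)
```

The result is deterministic for a given point order and radius, which matches the nonparametric, initialization-free property stated above.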
Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques
Renewable energy (RE) is referred to as "clean energy", and popular support for its use rests on providing electricity with zero carbon dioxide emissions. This study provides useful insight into RE in the European Union (EU), especially into electricity generation from renewables and the associated targets. The objective of this study is to identify groups of European countries using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The hierarchical cluster analysis is based on Ward’s clustering method and squared Euclidean distances, and it identified eight distinct clusters of European countries. Then, the non-hierarchical (k-means) clustering method was applied. Discriminant analysis was used to determine the validity of the results, with data normalized by Z-score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.
Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting
In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) has been used for classification of a diabetes disease dataset. The aims of this study are to reduce the variance within the attributes of the diabetes dataset and to improve the classification accuracy of the classifier algorithm by transforming the dataset from non-linearly separable to linearly separable. The Pima Indians Diabetes dataset has two classes: normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of the K-means clustering method and is one of the most used clustering methods in data mining and machine learning applications. In this study, as the first stage, the fuzzy C-means clustering process has been used to find the centers of the attributes in the Pima Indians diabetes dataset, and the dataset is then weighted according to the ratios of the means of the attributes to their centers. Secondly, after the weighting process, support vector machine (SVM) and k-nearest neighbor (k-NN) classifiers have been used to classify the weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of the Pima Indians diabetes dataset.
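For readers unfamiliar with fuzzy C-means, the step that distinguishes it from K-means is the soft membership of each sample in every cluster. The sketch below shows only the standard membership computation for one sample and fixed centers; it does not reproduce the FCMAW weighting itself. The function name `fcm_memberships` and the default fuzzifier `m=2.0` are assumptions for illustration.

```python
import math


def fcm_memberships(x, centers, m=2.0):
    """Standard fuzzy C-means memberships of sample x with respect to each center."""
    dists = [math.dist(x, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # A sample lying exactly on a center belongs fully to that center.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** power for d_k in dists)
        for d_i in dists
    ]


u = fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
```

The memberships of a sample always sum to 1, and the closer center receives the larger share.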
Applying Hybrid Graph Drawing and Clustering Methods on Stock Investment Analysis
Stock investment decisions are often made based on current events in the global economy and the analysis of historical data. Visual representation, in turn, could help investors gain deeper understanding of and better insight into stock market trends more efficiently. The trend analysis is based on long-term data collection. The study adopts a hybrid method that combines a clustering algorithm and a force-directed algorithm to overcome the scalability problem when visualizing large data. This method reveals the potential relationships between stocks and determines the degree of strength and connectivity, which gives investors another view of stock relationships for reference. Information derived from the visualization will also help them make informed decisions. The results of the experiments show that the proposed method is able to produce aesthetically pleasing visualizations with clearer views of connectivity and edge weights.
A Cuckoo Search with Differential Evolution for Clustering Microarray Gene Expression Data
A DNA microarray is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. This is handled by clustering, which reveals the natural structures and identifies the interesting patterns in the underlying data. In this paper, gene-based clustering of gene expression data is proposed using Cuckoo Search with Differential Evolution (CS-DE). The experimental results are analyzed with gene expression benchmark datasets. The results show that CS-DE outperforms CS on the benchmark datasets. To validate the clustering results, this work is tested with one internal and one external cluster validation index.
Cluster-Based Multi-Path Routing Algorithm in Wireless Sensor Networks
Small-size and low-power sensors with sensing, signal
processing and wireless communication capabilities are suitable for
wireless sensor networks. Due to the limited resources and battery
constraints, complex routing algorithms used for ad-hoc networks
cannot be employed in sensor networks. In this paper, we propose
node-disjoint multi-path hexagon-based routing algorithms in wireless
sensor networks. We describe the details of the algorithm and compare
it with other works. Simulation results show that the proposed scheme
achieves better performance in terms of efficiency and message
Hierarchical Clustering Algorithms in Data Mining
Clustering is a process of grouping objects and data
into clusters so that data objects from the same
cluster are similar to each other. Clustering is one of the
areas of data mining, and its algorithms can be classified into
partition-based, hierarchical, density-based and grid-based. In this paper we
survey and review four major hierarchical clustering algorithms called
CURE, ROCK, CHAMELEON and BIRCH. The obtained state of
the art of these algorithms will help in eliminating the current
problems as well as deriving more robust and scalable algorithms for
Personalization of Web Search Using Web Page Clustering Technique
The Information Retrieval community is facing the problem of effectively representing Web search results. When web search results are organized into clusters, it becomes easy for users to browse through them quickly. Traditional search engines organize search results into clusters for ambiguous queries, with one cluster for each meaning of the query. The clusters are obtained according to the topical similarity of the retrieved search results, but it is possible for results to be totally dissimilar and still correspond to the same meaning of the query. People search is also one of the most common tasks on the Web nowadays, but when a particular person’s name is queried, search engines return web pages related to different persons who share the queried name. This places the burden of disambiguating and collecting pages relevant to a particular person on the user. In this paper, we develop an approach that clusters web pages based on their association with different people, with clusters based on generic entity search.
A Spanning Tree for Enhanced Cluster Based Routing in Wireless Sensor Network
Wireless Sensor Network (WSN) clustering architecture enables features like network scalability, communication overhead reduction, and fault tolerance. After clustering, aggregated data is transferred to the data sink, reducing unnecessary, redundant data transfer. This reduces the number of transmitting nodes and so saves energy. It also allows scalability to many nodes, reduces communication overhead, and allows efficient use of WSN resources. Clustering based routing methods manage network energy consumption efficiently. Building spanning trees for data collection rooted at a sink node is a fundamental data aggregation method in sensor networks. Determining the optimal number of Cluster Heads (CHs) is an NP-hard problem. In this paper, we combine cluster based routing features for cluster formation and CH selection and use a Minimum Spanning Tree (MST) for intra-cluster communication. The proposed method is based on optimizing the MST using Simulated Annealing (SA). In this work, normalized values of mobility, delay, and remaining energy are considered for finding the optimal MST. Simulation results demonstrate the effectiveness of the proposed method in improving the packet delivery ratio and reducing the end-to-end delay.
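To make the intra-cluster spanning tree concrete, here is a minimal sketch of building an MST over the nodes of one cluster with Prim's algorithm. It is a plain MST only, not the Simulated Annealing optimization described in this abstract, and it uses Euclidean distance between node positions as a stand-in edge weight (an assumption; the abstract's weights combine mobility, delay, and remaining energy). The name `minimum_spanning_tree` is hypothetical.

```python
import heapq
import math


def minimum_spanning_tree(positions, root=0):
    """Prim's algorithm over a complete graph of nodes; returns (parent, child, cost) edges."""
    n = len(positions)
    visited = [False] * n
    tree = []
    heap = [(0.0, root, root)]  # (edge cost, parent, node); the root has no real edge
    while heap:
        cost, parent, node = heapq.heappop(heap)
        if visited[node]:
            continue
        visited[node] = True
        if node != parent:
            tree.append((parent, node, cost))
        # Offer an edge from the newly added node to every node still outside the tree.
        for other in range(n):
            if not visited[other]:
                weight = math.dist(positions[node], positions[other])
                heapq.heappush(heap, (weight, node, other))
    return tree


cluster = [(0, 0), (1, 0), (2, 0), (2, 1)]  # index 0 plays the role of the cluster head
edges = minimum_spanning_tree(cluster, root=0)
```

Each member node then forwards its data to its parent in the tree, so traffic converges on the root.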
A Review on Enhanced Dynamic Clustering in WSN
Recent advances in wireless internetworking have produced a number of dynamic routing protocols for sensor networks, and a number of revisions have been made based on their energy efficiency, lifetime and mobility. However, to the best of our knowledge, no extensive survey of this kind has been prepared. A review is therefore needed of cluster-based structures for dynamic wireless networks. In this paper, we examine and compare several aspects and characteristics of some extensively explored hierarchical dynamic clustering protocols in wireless sensor networks. This document also presents a discussion of future research topics and the challenges of dynamic hierarchical clustering in wireless sensor networks.
A Fuzzy Approach to Liver Tumor Segmentation with Zernike Moments
In this paper, we present a new segmentation approach
for liver lesions in regions of interest within MRI (Magnetic
Resonance Imaging). This approach, based on a two-cluster Fuzzy C-Means
methodology, considers the parameter variable compactness
to handle uncertainty. Fine boundaries are detected by a local
recursive merging of ambiguous pixels with a sequential forward
floating selection with Zernike moments. The method has been tested
on both synthetic and real images. When applied to synthetic images,
the proposed approach performs well: the segmentations
obtained are accurate, their shape is consistent with the ground truth,
and the extracted information is reliable. The results obtained on MR
images confirm these observations. Even for
difficult MR images, our approach extracts a segmentation with good
performance in terms of accuracy and shape, which implies that the
geometry of the tumor is preserved for further clinical activities (such
as automatic extraction of pharmaco-kinetics properties, lesion
Liver Lesion Extraction with Fuzzy Thresholding in Contrast Enhanced Ultrasound Images
In this paper, we present a new segmentation approach
for focal liver lesions in contrast enhanced ultrasound imaging. This
approach, based on a two-cluster Fuzzy C-Means methodology,
considers type-II fuzzy sets to handle uncertainty due to the image
modality (presence of speckle noise, low contrast, etc.), and to
calculate the optimum inter-cluster threshold. Fine boundaries are
detected by a local recursive merging of ambiguous pixels. The
method has been tested on a representative database. Compared to
both Otsu and type-I Fuzzy C-Means techniques, the proposed
method significantly reduces the segmentation errors.
Upgraded Rough Clustering and Outlier Detection Method on Yeast Dataset by Entropy Rough K-Means Method
Rough set theory handles uncertainty and incomplete information by means of two accurate sets, the lower approximation and the upper approximation. In this paper, rough clustering algorithms are improved by adopting Similarity, Dissimilarity-Similarity and Entropy based initial centroid selection methods. Three clustering algorithms, namely Entropy based Rough K-Means (ERKM), Similarity based Rough K-Means (SRKM) and Dissimilarity-Similarity based Rough K-Means (DSRKM), were developed and run on the yeast dataset. The rough clustering algorithms are validated with cluster validity indexes, namely the Rand and Adjusted Rand indexes. Experimental results show that the ERKM clustering algorithm performs effectively and delivers better results than the other clustering methods. Outlier detection is an important task in data mining; outliers are objects that differ markedly from the rest of the objects in the clusters. The Entropy based Rough Outlier Factor (EROF) method is suited to detecting outliers effectively in the yeast dataset. In the rough K-Means (RKM) method, tuning the epsilon (ε) value from 0.8 to 1.08 allows outliers in the boundary region to be detected, and the RKM algorithm delivers better results when epsilon (ε) is chosen in this range. Experimental results show that the EROF method performed very well with the clustering algorithm and is suitable for detecting outliers effectively for all datasets. Further, the experimental readings show that the ERKM clustering method outperformed the other methods.
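The Rand index used for validation in this abstract measures how often two labelings agree on whether a pair of objects belongs together. A minimal sketch by direct pair counting follows; the function name `rand_index` is hypothetical, and the Adjusted Rand index (which additionally corrects for chance agreement) is not shown.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two clusterings agree (same/same or different/different)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


same_partition = rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # cluster names differ, grouping is identical
crossed_partition = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The index depends only on the grouping, not on the label values, so renaming clusters leaves it unchanged.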
An Energy Aware Data Aggregation in Wireless Sensor Network Using Connected Dominant Set
Wireless Sensor Networks (WSNs) have many advantages. Their deployment is easier and faster than that of wired sensor networks or other wireless networks, as they do not need fixed infrastructure. Nodes are partitioned into many small groups, called clusters, so that data can be aggregated through network organization. WSN clustering supports the performance of sensor nodes. Sensor node energy consumption is reduced by eliminating redundant energy use and balancing energy use among sensor nodes across the network. The aim of such clustering protocols is to prolong network life. Low Energy Adaptive Clustering Hierarchy (LEACH) is a popular protocol in WSNs. LEACH is a clustering protocol in which random rotation of local cluster heads is used to distribute the energy load among all sensor nodes in the network. This paper proposes Connected Dominant Set (CDS) based cluster formation. CDS-based data aggregation is a promising approach for reducing routing overhead, since messages are transmitted only within the virtual backbone formed by the CDS, and data aggregation also lowers the ratio of responding hosts to the hosts in the virtual backbone. CDS tries to increase network lifetime by considering parameters such as sensor lifetime, remaining energy and consumed energy in order to achieve almost optimal data aggregation within the network. Experimental results showed that CDS outperformed LEACH regarding number of cluster formations, average packet loss rate, average end-to-end delay, life computation, and remaining energy computation.
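The random rotation of cluster heads in LEACH, the baseline protocol named in this abstract, is commonly described with a per-round election threshold. The sketch below illustrates that baseline only, not the CDS-based method proposed here. It assumes the widely cited threshold T = p / (1 - p * (r mod 1/p)), where p is the desired fraction of cluster heads and r is the current round; the names `leach_threshold` and `elect_cluster_heads` are hypothetical.

```python
import random


def leach_threshold(p, round_number):
    """Election threshold for nodes that have not yet served as cluster head in this cycle."""
    period = round(1 / p)  # number of rounds in one rotation cycle
    return p / (1 - p * (round_number % period))


def elect_cluster_heads(node_ids, eligible, p, round_number, rng=random):
    """Each eligible node draws a uniform number and becomes a head if it falls below the threshold."""
    threshold = leach_threshold(p, round_number)
    return [n for n in node_ids if n in eligible and rng.random() < threshold]


first_round = leach_threshold(0.25, 0)  # start of a cycle: threshold equals p
last_round = leach_threshold(0.25, 3)   # end of a cycle: every remaining eligible node is elected
heads = elect_cluster_heads([1, 2, 3], eligible={1, 3}, p=0.25, round_number=3)
```

Because the threshold rises over the cycle, nodes that have not yet served become increasingly likely to be chosen, which spreads the energy load across the network.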
Knowledge Representation Based On Interval Type-2 CFCM Clustering
This paper is concerned with knowledge representation
and extraction of fuzzy if-then rules using Interval Type-2
Context-based Fuzzy C-Means clustering (IT2-CFCM) with the aid of
fuzzy granulation. The proposed clustering algorithm is based on
information granulation in the form of IT2 based Fuzzy C-Means
(IT2-FCM) clustering and estimates the cluster centers by preserving
the homogeneity between the clustered patterns from the IT2 contexts
produced in the output space. Furthermore, we can obtain the
automatic knowledge representation in the design of Radial Basis
Function Networks (RBFN), Linguistic Model (LM), and Adaptive
Neuro-Fuzzy Networks (ANFN) from the numerical input-output data
pairs. We focus on the design of ANFN in this paper. The
experimental results on an energy performance estimation problem
reveal that the proposed method shows good knowledge
representation and performance in comparison with the previous