Excellence in Research and Innovation for Humanity

International Science Index

Commenced in January 1999 | Frequency: Monthly | Edition: International | Paper Count: 63

Computer, Electrical, Automation, Control and Information Engineering

  • 63
    463
    A Proposal of an Automatic Formatting Method for Transforming XML Data
    Abstract:

    PPX (Pretty Printer for XML) is a query language that offers a concise way of formatting XML data into HTML. In this paper, we propose a simple formatting specification that combines automatic layout operators and variables in the layout expression of the GENERATE clause of PPX. This method can automatically format irregular XML data embedded in a document, using layout decision rules derived from the DTD. In our experiments, a quick comparison shows that PPX requires far less description than XSLT or XQuery programs performing the same tasks.

    62
    993
    AJcFgraph - AspectJ Control Flow Graph Builder for Aspect-Oriented Software
    Abstract:
    The ever-growing use of aspect-oriented development methodology in software engineering requires tool support for both research environments and industry. So far, tool support for many activities in aspect-oriented software development has been proposed to automate and facilitate development; for instance, AJaTS provides a transformation system to support aspect-oriented development and refactoring. It is well established that the abstract interpretation of programs, in any paradigm, pursued in static analysis is best served by a high-level program representation such as the Control Flow Graph (CFG): such analysis can more easily locate common programmatic idioms for which helpful transformations are already known, and the association between the input program and the intermediate representation can be maintained more closely. However, although current research defines good concepts and foundations for control flow analysis of aspect-oriented programs, it does not provide a concrete tool that can construct the CFG of these programs on its own. Furthermore, most of these works focus on other issues in Aspect-Oriented Software Development (AOSD), such as testing or data flow analysis, rather than on the CFG itself. Therefore, this study is dedicated to building an aspect-oriented control flow graph construction tool called AJcFgraph Builder. The tool can be applied in many software engineering tasks in the context of AOSD, such as software testing and software metrics.
    61
    1053
    Improving Worm Detection with Artificial Neural Networks through Feature Selection and Temporal Analysis Techniques
    Abstract:
    Computer worm detection is commonly performed by antivirus software tools that rely on prior explicit knowledge of the worm's code (detection based on code signatures). We present an approach for detecting the presence of computer worms based on Artificial Neural Networks (ANN) using the computer's behavioral measures. Identification of the significant features that describe the activity of a worm within a host is commonly acquired from security experts. We suggest acquiring these features by applying feature selection methods. We compare three different feature selection techniques for dimensionality reduction and identification of the most prominent features to capture computer behavior efficiently in the context of worm activity. Additionally, we explore three different temporal representation techniques for the most prominent features. To evaluate the different techniques, several computers were infected with five different worms, and 323 different features of the infected computers were measured. We evaluated each technique by preprocessing the dataset accordingly and training the ANN model with the preprocessed data. We then evaluated the ability of the model to detect the presence of a new computer worm, in particular during heavy user activity on the infected computers.
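
    As a rough illustration of the pipeline this abstract describes (feature selection followed by ANN training), the following sketch uses scikit-learn on synthetic data; the feature counts, scoring function and network size are illustrative assumptions, not the authors' configuration.

        # Hedged sketch: univariate feature selection + ANN classifier (scikit-learn).
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.neural_network import MLPClassifier
        from sklearn.model_selection import train_test_split

        # Synthetic stand-in for the 323 behavioral measures mentioned in the abstract.
        X, y = make_classification(n_samples=1000, n_features=323, n_informative=20,
                                   random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        selector = SelectKBest(f_classif, k=20).fit(X_tr, y_tr)  # keep 20 prominent features
        clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
        clf.fit(selector.transform(X_tr), y_tr)
        print("held-out accuracy:", clf.score(selector.transform(X_te), y_te))
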
    60
    1741
    EEIA: Energy Efficient Indexed Aggregation in Smart Wireless Sensor Networks
    Abstract:
    The main idea behind in-network aggregation is that, rather than sending individual data items from sensors to sinks, multiple data items are aggregated as they are forwarded by the sensor network. Existing sensor network data aggregation techniques assume that the nodes are preprogrammed and send data to a central sink for offline querying and analysis. This approach faces two major drawbacks. First, the system behavior is preprogrammed and cannot be modified on the fly. Second, the increased energy wastage due to communication overhead decreases the overall system lifetime. Thus, energy conservation is of prime consideration in sensor network protocols in order to maximize the network's operational lifetime. In this paper, we give an energy-efficient approach to query processing by implementing new optimization techniques applied to in-network aggregation. We first discuss earlier approaches to sensor data management and highlight their disadvantages. We then present our approach, "Energy Efficient Indexed Aggregation" (EEIA), and evaluate it through several simulations to demonstrate its efficiency, competence and effectiveness.
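
    As a minimal illustration of the in-network aggregation idea itself (not of the EEIA indexing scheme, which the abstract does not detail), the sketch below aggregates readings bottom-up along a routing tree so that each node forwards a single partial aggregate instead of every raw reading.

        # Hedged sketch: bottom-up SUM aggregation along a sensor routing tree.
        tree = {"sink": ["a", "b"], "a": ["c", "d"], "b": [], "c": [], "d": []}
        reading = {"sink": 0.0, "a": 3.0, "b": 5.0, "c": 2.0, "d": 7.0}

        def aggregate(node):
            # Each node sends one value upstream: its own reading plus its
            # children's partial aggregates, rather than all raw data items.
            return reading[node] + sum(aggregate(child) for child in tree[node])

        print(aggregate("sink"))  # 17.0: the sink receives one aggregated value
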
    59
    1872
    New Mitigating Technique to Overcome DDOS Attack
    Abstract:

    In this paper, we explore a new scheme for filtering spoofed packets (DDoS attack) that combines the path fingerprint and client puzzle concepts. In this scheme, each IP packet carries an embedded unique fingerprint that represents the route the packet has traversed. The server maintains a mapping table containing each client IP address and its corresponding fingerprint. A client puzzle is placed at the ingress router: for each request, the puzzle issuer provides a puzzle that the source has to solve. Our design has the following advantages over prior approaches: 1) it reduces network traffic, since the client puzzle is placed at the ingress router; and 2) the mapping table at the server is lightweight and of moderate size.
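    The abstract does not specify the puzzle construction; a common choice, shown below purely as an assumed illustration, is a hash-preimage puzzle in which the client must find a nonce whose hash has a given number of leading zero bits. The difficulty, hash function and encoding are all illustrative.

        # Hedged sketch: a hash-based client puzzle (assumed construction, not the paper's).
        import hashlib, os, itertools

        DIFFICULTY = 16  # required number of leading zero bits in the hash

        def issue_puzzle():
            return os.urandom(8)  # fresh challenge per request

        def solve(challenge):
            for nonce in itertools.count():
                digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
                if int.from_bytes(digest[:4], "big") >> (32 - DIFFICULTY) == 0:
                    return nonce  # cheap to verify, moderately costly to find

        def verify(challenge, nonce):
            digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
            return int.from_bytes(digest[:4], "big") >> (32 - DIFFICULTY) == 0

        c = issue_puzzle()
        n = solve(c)
        assert verify(c, n)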

    58
    1991
    Similarity Detection in Collaborative Development of Object-Oriented Formal Specifications
    Abstract:
    The complexity of today's software systems makes collaborative development necessary to accomplish tasks. Frameworks are necessary to allow developers to perform their tasks independently yet collaboratively. Similarity detection is one of the major issues to consider when developing such frameworks. It allows developers to mine existing repositories when developing their own views of a software artifact, and it is necessary for identifying the correspondences between views in order to merge them and check their consistency. Given the importance of the requirements specification stage in software development, this paper proposes a framework for collaborative development of Object-Oriented formal specifications, along with a similarity detection approach to support the creation, merging and consistency checking of specifications. The paper also explores the impact of using additional concepts on improving the matching results. Finally, the proposed approach is empirically evaluated.
    57
    2001
    An Efficient and Optimized Multi Constrained Path Computation for Real Time Interactive Applications in Packet Switched Networks
    Abstract:

    Quality of Service (QoS) routing aims to find a path between source and destination that satisfies the QoS requirements while using the network resources and the underlying routing algorithm efficiently, i.e., to find low-cost paths that satisfy given QoS constraints. One of the key issues in providing end-to-end QoS guarantees in packet networks is determining a feasible path that satisfies a number of QoS constraints. We present an Optimized Multi-Constrained Routing (OMCR) algorithm for the computation of constrained paths for QoS routing in computer networks. OMCR applies distance-vector routing to construct a shortest path for each destination with reference to a given optimization metric, from which a set of feasible paths is derived at each node. OMCR is able to find feasible paths as well as optimize the utilization of network resources. OMCR operates with the hop-by-hop, connectionless routing model of the IP Internet and does not create any loops while finding the feasible paths. Nodes running OMCR need not maintain a global view of the network state such as topology and resource information, and routing updates are sent only to neighboring nodes, whereas the link-state routing methods it is compared against depend on complete network state for constrained path computation, which incurs excessive communication overhead.
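
    OMCR itself is not published in the abstract; as background for the distance-vector step it builds on, here is a minimal Bellman-Ford sketch that computes shortest paths for one optimization metric. The graph shape and weights are made up for illustration.

        # Hedged sketch: Bellman-Ford, the distance-vector core OMCR builds on.
        INF = float("inf")
        edges = [("s", "a", 2), ("s", "b", 5), ("a", "b", 1), ("a", "t", 6), ("b", "t", 2)]
        nodes = {"s", "a", "b", "t"}

        dist = {v: INF for v in nodes}
        dist["s"] = 0
        for _ in range(len(nodes) - 1):          # relax all edges |V|-1 times
            for u, v, w in edges:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w

        print(dist["t"])  # 7, via s -> a -> b -> t
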

    56
    2169
    Extended Study on Removing Gaussian Noise in Mechanical Engineering Drawing Images using Median Filters
    Abstract:
    In this paper, an extended study is performed on the effect of different factors on the quality of vector data, building on a previous study. For the noise factor, one kind of noise that appears in document images, namely Gaussian noise, is studied, whereas the previous study involved only salt-and-pepper noise; both high and low levels of noise are examined. For the noise cleaning methods, algorithms that were not covered in the previous study are used, namely median filters and their variants. For the vectorization factor, one of the best available commercial raster-to-vector packages, VPstudio, is used to convert raster images into vector format. The performance of line detection is judged using an objective performance evaluation method, and the output of the evaluation is then analyzed statistically to highlight the factors that affect vector quality.
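
    For readers unfamiliar with the cleaning step, the following sketch applies a plain median filter to a synthetically degraded image with SciPy; the kernel size and noise level are arbitrary choices, and the paper's median-filter variants are not reproduced here.

        # Hedged sketch: median filtering of a noisy line drawing (SciPy).
        import numpy as np
        from scipy.ndimage import median_filter

        rng = np.random.default_rng(0)
        img = np.zeros((64, 64))
        img[30, :] = 1.0                              # a horizontal "drawing line"
        noisy = img + rng.normal(0, 0.2, img.shape)   # additive Gaussian noise

        cleaned = median_filter(noisy, size=3)        # 3x3 median window
        print(float(abs(cleaned - img).mean()) < float(abs(noisy - img).mean()))  # True
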
    55
    3059
    A Novel Hybrid Mobile Agent Based Distributed Intrusion Detection System
    Abstract:
    The first generation of Mobile Agent based Intrusion Detection Systems had just two components, namely data collection and a single centralized analyzer. The disadvantage of this type of intrusion detection is that if the connection to the analyzer fails, the entire system becomes useless. In this work, we propose a novel hybrid model for a Mobile Agent based Distributed Intrusion Detection System to overcome this problem. The proposed model has new features such as robustness, the capability of detecting intrusions against the IDS itself, and the capability of updating itself to detect new patterns of intrusion. In addition, our proposed model is also capable of tackling some of the weaknesses of centralized Intrusion Detection System models.
    54
    3379
    Pruning Method of Belief Decision Trees
    Abstract:
    The belief decision tree (BDT) approach is a decision tree approach for uncertain environments, where the uncertainty is represented through the Transferable Belief Model (TBM), one interpretation of belief function theory. The uncertainty can appear either in the actual class of the training objects or in the attribute values of the objects to classify. In this paper, we develop a post-pruning method for belief decision trees in order to reduce their size and improve classification accuracy on unseen cases. The pruning of decision trees has received considerable attention in the area of machine learning.
    53
    4490
    Using Automatic Ontology Learning Methods in Human Plausible Reasoning Based Systems
    Abstract:
    Knowledge discovery from text and ontology learning are relatively new fields. However, they are used in many areas, such as Information Retrieval (IR) and its related domains. Human Plausible Reasoning based (HPR) IR systems, for example, need a knowledge base as their underlying component, which is currently built by hand. In this paper we propose an architecture based on ontology learning methods to automatically generate the needed HPR knowledge base.
    52
    4997
    A Robust Salient Region Extraction Based on Color and Texture Features
    Abstract:
    In current research reports, salient regions are usually defined as those regions that present the main meaningful or semantic content. However, there are no uniform saliency metrics that describe the saliency of implicit image regions. Most common metrics treat as salient those regions that have many abrupt changes or some unpredictable characteristics, but such metrics fail to detect salient, useful regions with flat textures. In fact, according to human semantic perception, color and texture distinctions are the main characteristics that distinguish different regions. Thus, we present a novel saliency metric coupled with color and texture features, and corresponding salient region extraction methods. In order to evaluate the saliency values of implicit regions in an image, three main colors and multi-resolution Gabor features are used as the color and texture features, respectively. For each region, the saliency value is computed as the sum of its Euclidean distances to the other regions in the color and texture spaces. A specially synthesized image and several practical images with clear salient regions are used to evaluate the performance of the proposed saliency metric against several common metrics, i.e., scale saliency, wavelet transform modulus maxima point density, and important-index based metrics. Experimental results verify that the proposed saliency metric achieves more robust performance than these common saliency metrics.
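
    The feature extraction (dominant colors, Gabor responses) is beyond a short example, but the scoring rule itself is simple; the sketch below computes, for each region's feature vector, the sum of Euclidean distances to all other regions, with random vectors standing in for the real color/texture features.

        # Hedged sketch: saliency as the sum of distances to all other regions.
        import numpy as np

        rng = np.random.default_rng(1)
        features = rng.random((5, 8))   # 5 regions, 8-dim stand-in color/texture features

        diffs = features[:, None, :] - features[None, :, :]
        dist = np.linalg.norm(diffs, axis=2)        # pairwise Euclidean distances
        saliency = dist.sum(axis=1)                 # a region far from the rest is salient
        print(int(saliency.argmax()), "is the most salient region")
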
    51
    5048
    Performance Comparison and Evaluation of AdaBoost and SoftBoost Algorithms on Generic Object Recognition
    Abstract:
    SoftBoost is a recently presented boosting algorithm which trades off the size of the achieved classification margin against generalization performance. This paper presents a performance evaluation of the SoftBoost algorithm on the generic object recognition problem. An appearance-based generic object recognition model is used. The evaluation experiments are performed using a difficult object recognition benchmark. An assessment with respect to different degrees of label noise, as well as a comparison to the well-known AdaBoost algorithm, is performed. The obtained results reveal that SoftBoost is preferable in cases where the training data is known to have a high degree of noise; otherwise, AdaBoost can achieve better performance.
    50
    5330
    IMLFQ Scheduling Algorithm with Combinational Fault Tolerant Method
    Abstract:
    Scheduling algorithms are used in operating systems to optimize the usage of processors. One of the most efficient scheduling algorithms is the Multi-Layer Feedback Queue (MLFQ) algorithm, which uses several queues with different quanta. The most important weakness of this method is the inability to determine the optimal number of queues and the quantum of each queue, both of which affect the response time directly; this weakness is addressed in the IMLFQ scheduling algorithm. In this paper, we review the IMLFQ algorithm for solving these problems and minimizing the response time. In this algorithm, a Recurrent Neural Network is utilized to find both the number of queues and the optimized quantum of each queue. Also, in order to prevent probable faults in the computation of process response times, a new fault-tolerant approach based on combinational software redundancy is presented. The experimental results show that the IMLFQ algorithm yields better response times than other scheduling algorithms, and that the fault-tolerant mechanism further improves IMLFQ's performance.
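
    For context, the following is a minimal sketch of a plain MLFQ scheduler (demotion on quantum expiry), not of IMLFQ's neural tuning or redundancy mechanism; the queue count and quanta here are exactly the parameters IMLFQ would learn.

        # Hedged sketch: a basic multi-level feedback queue with demotion on expiry.
        from collections import deque

        quanta = [2, 4, 8]                       # per-level time slices (IMLFQ tunes these)
        queues = [deque() for _ in quanta]
        queues[0].extend([("p1", 5), ("p2", 3)]) # (pid, remaining work)

        time = 0
        while any(queues):
            level = next(i for i, q in enumerate(queues) if q)
            pid, remaining = queues[level].popleft()
            run = min(quanta[level], remaining)
            time += run
            remaining -= run
            if remaining > 0:                    # used its full slice: demote one level
                queues[min(level + 1, len(queues) - 1)].append((pid, remaining))
            else:
                print(f"{pid} finished at t={time}")
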
    49
    5353
    A New Source Code Auditing Algorithm for Detecting LFI and RFI in PHP Programs
    Abstract:
    Static analysis of source code is used for auditing web applications to detect vulnerabilities. In this paper, we propose a new algorithm to analyze PHP source code for detecting potential LFI (Local File Inclusion) and RFI (Remote File Inclusion) vulnerabilities. In our approach, we first define some patterns for finding functions that have the potential to be abused through unhandled user inputs. More precisely, we use regular expressions as a fast and simple method to define detection patterns. Since inclusion functions can also be used safely, many false positives (FPs) can occur. The first cause of these FPs is that a function may not use a user-supplied variable as an argument, so we extract a list of user-supplied variables to be used for detecting vulnerable lines of code. In addition, since a vulnerability can spread among variables, for example through multi-level assignment, we also try to extract the hidden user-supplied variables. We use the resulting list to decrease the false positives of our method. Finally, as there are ways to prevent the vulnerability of inclusion functions, we also define patterns to detect them and further decrease our false positives.
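
    The paper's exact patterns are not given in the abstract; the sketch below shows the general shape of such a regex pass, flagging include/require calls whose argument contains a variable. The pattern and the toy PHP snippet are illustrative assumptions.

        # Hedged sketch: flag PHP include/require calls that take a variable argument.
        import re

        php = '''
        include($_GET["page"]);          // user-controlled: likely LFI/RFI
        require_once("config.php");      // constant string: safe
        include($page . ".php");         // variable: needs taint checking
        '''

        pattern = re.compile(r'\b(include|include_once|require|require_once)\s*\(([^)]*\$[^)]*)\)')
        for m in pattern.finditer(php):
            print(f"potentially vulnerable: {m.group(0).strip()}")
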
    48
    5398
    Segmentation Problems and Solutions in Printed Degraded Gurmukhi Script
    Abstract:
    Character segmentation is an important preprocessing step for text recognition. In degraded documents, the existence of touching characters drastically decreases the recognition rate of any optical character recognition (OCR) system. In this paper we propose a complete solution for segmenting touching characters in all three zones of printed Gurmukhi script. A study of touching Gurmukhi characters was carried out, and after careful analysis these characters were divided into various categories defined by the structural properties of the Gurmukhi characters. New algorithms are proposed to segment the touching characters in the middle, upper and lower zones, and they show a reasonable improvement in segmenting touching characters in degraded printed Gurmukhi script. The algorithms proposed in this paper are applicable only to machine-printed text. We also discuss a new and useful technique for segmenting horizontally overlapping lines.
    47
    5572
    A Distributed Algorithm for Intrinsic Cluster Detection over Large Spatial Data
    Abstract:
    Clustering algorithms help to understand the hidden information present in datasets. A dataset may contain intrinsic and nested clusters, the detection of which is of utmost importance. This paper presents a Distributed Grid-based Density Clustering algorithm capable of identifying arbitrarily shaped embedded clusters as well as multi-density clusters over large spatial datasets. For handling massive datasets, we implemented our method using a 'shared-nothing' architecture in which multiple computers are interconnected over a network. Experimental results are reported to establish the superiority of the technique in terms of scale-up and speedup as well as cluster quality.
    46
    5749
    Meta-Classification using SVM Classifiers for Text Documents
    Abstract:
    Text categorization is the problem of classifying text documents into a set of predefined classes. In this paper, we investigate three approaches to building a meta-classifier in order to increase classification accuracy. The basic idea is to learn a meta-classifier that optimally selects the best component classifier for each data point. The experimental results show that combining classifiers can significantly improve classification accuracy and that our meta-classification strategy gives better results than each individual classifier. For 7083 Reuters text documents we obtained classification accuracies of up to 92.04%.
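
    As a rough illustration of per-point classifier selection (the abstract does not publish the authors' exact strategy), the sketch below trains two base classifiers and a meta-classifier that predicts, from the input features, which base classifier to trust for each point. All model choices are assumptions of the example.

        # Hedged sketch: a meta-classifier that picks a base classifier per data point.
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        from sklearn.svm import LinearSVC
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.linear_model import LogisticRegression

        X, y = make_classification(n_samples=1500, n_features=20, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        bases = [LinearSVC(dual=False).fit(X_tr, y_tr),
                 DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)]

        # Meta-target: index of a base classifier that is correct on each training point.
        correct = np.array([b.predict(X_tr) == y_tr for b in bases])
        meta_y = correct.argmax(axis=0)
        meta = LogisticRegression(max_iter=1000).fit(X_tr, meta_y)

        choice = meta.predict(X_te)
        preds = np.array([b.predict(X_te) for b in bases])
        final = preds[choice, np.arange(len(y_te))]
        print("meta accuracy:", (final == y_te).mean())
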
    45
    5982
    Grid-HPA: Predicting Resource Requirements of a Job in the Grid Computing Environment
    Abstract:

    For complete support of Quality of Service, it is better that the Grid computing environment itself predict the resource requirements of a job by using special methods. Exact and correct prediction leads to exact matching of required resources with available resources. After the execution of each job, the used resources are saved in an active database named "History". First, some attributes are extracted from the submitted job, and according to a defined similarity algorithm the most similar executed job is retrieved from "History"; using statistical measures such as linear regression or averaging, the resource requirements are then predicted. The new idea in this research is the use of an active database and centralized history maintenance. Implementation and testing of the proposed architecture yields a prediction accuracy of 96.68% for the CPU usage of jobs, 91.29% for memory usage and 89.80% for bandwidth usage.
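
    The abstract's similarity algorithm is not specified; the sketch below illustrates the overall idea with a nearest-neighbor lookup over a toy "History" table followed by averaging the matched jobs' recorded usage. The attribute vectors and records are invented for illustration.

        # Hedged sketch: predict a job's CPU usage from the most similar past jobs.
        import numpy as np

        # Toy "History": job attribute vectors and the CPU usage observed for each.
        history_attrs = np.array([[1.0, 200], [1.2, 210], [4.0, 900], [3.8, 850]])
        history_cpu = np.array([12.0, 13.0, 55.0, 52.0])

        def predict_cpu(job_attrs, k=2):
            d = np.linalg.norm(history_attrs - job_attrs, axis=1)
            nearest = np.argsort(d)[:k]          # k most similar executed jobs
            return history_cpu[nearest].mean()   # average their recorded usage

        print(predict_cpu(np.array([3.9, 870])))  # ~53.5, near the two similar jobs
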

    44
    6132
    VLSI Design of 2-D Discrete Wavelet Transform for Area-Efficient and High-Speed Image Computing
    Abstract:
    This paper presents a VLSI design approach for high-speed, real-time 2-D Discrete Wavelet Transform computation. The proposed architecture, based on a new and fast convolution approach, reduces the hardware complexity and shortens the critical path to the multiplier delay. Furthermore, an advanced two-dimensional (2-D) discrete wavelet transform (DWT) implementation with an efficient memory area is designed to produce one output in every clock cycle, so a very high speed is attained. The system is verified, using JPEG2000 coefficient filters, on a Xilinx Virtex-II Field Programmable Gate Array (FPGA) device without accessing any external memory. The resulting computing rate is up to 270 Msamples/s, and the (9,7) 2-D wavelet filter uses only 18 kb of memory (16 kb of first-in-first-out memory) for a 256×256 image size. In this way, the developed design requires less memory and provides very high-speed processing as well as high PSNR quality.
    43
    6213
    Automatic Translation of Ada-ECATNet Using Rewriting Logic
    Abstract:
    One major difficulty facing developers of concurrent and distributed software is the analysis of concurrency-related faults such as deadlocks. Petri nets are used extensively in the verification of the correctness of concurrent programs. ECATNets are a category of algebraic Petri nets based on a sound combination of algebraic abstract data types and high-level Petri nets. ECATNets have 'sound' and 'complete' semantics because of their integration into rewriting logic and its programming language Maude. Rewriting logic is considered one of the most powerful logics for the description, verification and programming of concurrent systems. We previously proposed a method for translating Ada-95 tasking programs into the ECATNets formalism (Ada-ECATNet) and showed that the ECATNets formalism provides a more compact translation for Ada programs than other approaches based on simple Petri nets or Colored Petri nets. We also showed previously how the ECATNet formalism offers Ada many validation and verification tools such as simulation, model checking, accessibility analysis and static analysis. In this paper, we describe the implementation of our translation of Ada programs into ECATNets.
    42
    6385
    Topology Preservation in SOM
    Abstract:
    The SOM has several beneficial features which make it a useful method for data mining. One of the most important features is its ability to preserve topology in the projection. There are several measures that can be used to quantify the goodness of the map in order to obtain the optimal projection, including the average quantization error and various topological errors. Many researchers have studied how topology preservation should be measured. One option is the topographic error, which considers the ratio of data vectors for which the first and second best-matching units (BMUs) are not adjacent. In this work we present a study of the behaviour of the topographic error in different kinds of maps. We have found that this error penalizes rectangular maps, and we have studied the reasons why this happens. Finally, we suggest a new topological error that remedies this deficiency of the topographic error.
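
    The topographic error the abstract describes is straightforward to compute; the sketch below does so for a small 2-D map, treating units as adjacent when their grid coordinates differ by at most one in each direction. The adjacency rule, map size and random data are assumptions of the example, not the paper's setup.

        # Hedged sketch: topographic error of a SOM codebook on sample data.
        import numpy as np

        rng = np.random.default_rng(0)
        rows, cols, dim = 4, 4, 3
        codebook = rng.random((rows * cols, dim))   # stand-in for trained SOM weights
        data = rng.random((200, dim))

        def grid_pos(i):
            return divmod(i, cols)                  # unit index -> (row, col)

        errors = 0
        for x in data:
            d = np.linalg.norm(codebook - x, axis=1)
            bmu1, bmu2 = np.argsort(d)[:2]          # first and second best-matching units
            (r1, c1), (r2, c2) = grid_pos(bmu1), grid_pos(bmu2)
            if max(abs(r1 - r2), abs(c1 - c2)) > 1: # not neighbours on the map grid
                errors += 1

        print("topographic error:", errors / len(data))
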
    41
    6593
    Improving Convergence of Parameter Tuning Process of the Additive Fuzzy System by New Learning Strategy
    Abstract:
    An additive fuzzy system comprising m rules with n inputs and p outputs in each rule has at least m(2n + 2p + 1) parameters that need to be tuned. The system consists of a large number of if-then fuzzy rules and takes a long time to tune its parameters, especially in the case of a large number of training data samples. In this paper, a new learning strategy is investigated to cope with this obstacle: parameters that tend toward constant values during the learning process are fixed early and are not tuned until the end of the learning time. Experiments based on applications of the additive fuzzy system to function approximation demonstrate that the proposed approach reduces the learning time and hence considerably improves convergence speed.
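
    To give a concrete sense of scale for this parameter count: with m = 100 rules, n = 4 inputs and p = 2 outputs, the system has at least 100 × (2·4 + 2·2 + 1) = 1300 tunable parameters, which illustrates why freezing parameters that stabilize early can shorten the learning process considerably.
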
    40
    7095
    An Innovational Intermittent Algorithm in Networks-On-Chip (NOC)
    Abstract:
    Every day, human life encounters new equipment that is more automatic and more capable, so the need for faster processors does not seem to be ending. Despite new architectures and higher frequencies, a single processor is not adequate for many applications. Parallel processing and networks are earlier solutions to this problem; the newer solution of putting a network of resources on a chip is called a NoC (network on a chip). The most common topology for NoC is the mesh topology, and there are several routing algorithms suitable for it, such as XY and fully adaptive routing. In this paper we suggest a new algorithm named Intermittent X,Y (IX/Y). We have implemented the new algorithm in a simulation environment to compare its delay and power consumption with those of earlier algorithms.
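
    For reference, the baseline XY algorithm that IX/Y is compared against is simple enough to sketch: a packet first travels along the X dimension until its column matches the destination, then along Y. The mesh coordinates below are illustrative.

        # Hedged sketch: deterministic XY routing in a 2-D mesh NoC.
        def xy_route(src, dst):
            x, y = src
            dx, dy = dst
            hops = []
            while x != dx:                    # phase 1: correct the X coordinate
                x += 1 if dx > x else -1
                hops.append((x, y))
            while y != dy:                    # phase 2: then correct the Y coordinate
                y += 1 if dy > y else -1
                hops.append((x, y))
            return hops

        print(xy_route((0, 0), (2, 1)))  # [(1, 0), (2, 0), (2, 1)]
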
    39
    7466
    Real-Time Visual Simulation and Interactive Animation of Shadow Play Puppets Using OpenGL
    Abstract:
    This paper describes a method for modeling shadow play puppets using sophisticated computer graphics techniques available in OpenGL, in order to allow interactive play in a real-time environment as well as to produce realistic animation. A novel real-time method is proposed for modeling the puppet and its shadow image that allows interactive play of a virtual shadow play using texture mapping and blending techniques. Special effects such as lighting and blurring for the virtual shadow play environment are also developed. Moreover, the use of geometric transformations and hierarchical modeling facilitates interaction among the different parts of the puppet during animation. Based on the experiments and the survey that were carried out, the respondents involved were very satisfied with the outcomes of these techniques.
    38
    7683
    Web Application to Profiling Scientific Institutions through Citation Mining
    Abstract:

    Recently, data mining has been applied to scientific bibliographic databases to analyze pathways of knowledge and the core scientific relevance of a laureate or a country. This specific case of data mining has been named citation mining: the integration of citation bibliometrics and text mining. In this paper we present an improved web implementation of statistical physics algorithms to perform the text mining component of citation mining. In particular, we use an entropy-like distance between compressed texts as an indicator of the similarity between them. Finally, we have included the recently proposed h-index to characterize scientific production. We have used this web implementation to identify the users, applications and impact of the Mexican scientific institutions located in the State of Morelos.

    37
    7781
    Integrating Security Indifference Curve to Formal Decision Evaluation
    Abstract:
    Decisions are regularly made during a project or in daily life. Some decisions are critical and have a direct impact on project or human success. Formal evaluation is thus required, especially for crucial decisions, to arrive at the optimal solution among alternatives. According to microeconomic theory, all people's decisions can be modeled as indifference curves. The proposed approach supports formal analysis and decision making by constructing an indifference curve model from previous experts' decision criteria. The knowledge embedded in the system can be reused, and it helps naïve users select alternative solutions to similar problems. Moreover, the method is flexible enough to cope with an unlimited number of factors influencing the decision-making. The preliminary experimental results show that the alternative selections accurately match the experts' decisions.
    36
    7852
    Another Formal Proposal For Stealth
    Abstract:
    Taking into account the link between the efficiency of a detector and the complexity of a stealth mechanism, we propose in this paper a new formalism for stealth using graph theory.
    35
    7918
    Robust Adaptive ELS-QR Algorithm for Linear Discrete Time Stochastic Systems Identification
    Abstract:
    This work proposes a recursive weighted ELS algorithm for system identification that applies numerically robust orthogonal Householder transformations. The properties of the proposed algorithm show that it obtains acceptable results in a noisy environment: fast convergence and asymptotically unbiased estimates. A comparative analysis with other robust methods well known from the literature is also presented.
    34
    8005
    An Approach to Concerns and Aspects Mining for Web Applications
    Abstract:
    Web applications have become very complex and crucial, especially when combined with areas such as CRM (Customer Relationship Management) and BPR (Business Process Reengineering). As a result, the scientific community has focused its attention on Web application design, development, analysis and testing, studying and proposing methodologies and tools. This paper proposes an approach to automatic multi-dimensional concern mining for Web applications, based on concept analysis, impact analysis, and token-based concern identification. This approach lets the user analyse and traverse Web software relevant to a particular concern (concept, goal, purpose, etc.) via multi-dimensional separation of concerns, in order to document, understand and test Web applications. The technique was developed in the context of the WAAT (Web Applications Analysis and Testing) project. A semi-automatic tool to support it is currently under development.
    33
    8257
    Feature's Extraction of Human Body Composition in Images by Segmentation Method
    Abstract:

    Detection and recognition of human body composition and extraction of its measures (the width and length of the human body) in images are major issues in object detection, and an important field in image, signal and vision computing in recent years. Finding people in images and extracting their features is a particularly important object recognition problem, because people can vary greatly in appearance. This variability may be due to the configuration of a person (e.g., standing vs. sitting vs. jogging), the pose (e.g., frontal vs. lateral view), clothing, and variations in illumination. In this study, the human body is first recognized in the image, and its measures are then extracted from the image.

    32
    8644
    A New Scheduling Algorithm Based on Traffic Classification Using Imprecise Computation
    Abstract:
    Wireless channels are characterized by more serious bursty and location-dependent errors. Many packet scheduling algorithms have been proposed for wireless networks to guarantee fairness and delay bounds. However, most existing schemes do not consider the differences in traffic nature among packet flows, which causes the delay-weight coupling problem; in particular, serious queuing delays may be incurred for real-time flows. In this paper, we propose a scheduling algorithm that takes the traffic types of flows into consideration when scheduling packets and that provides scheduling flexibility by trading off video quality to meet the playback deadline.
    31
    8711
    Application of Wireless Visual Sensor for Semi-Autonomous Mine Navigation System
    Abstract:
    The present paper represents the efforts undertaken to develop a semi-automatic robot that may be used for planning various post-disaster rescue operations and subsequently executing them, using one-way communication of video and of data from the robot to the controller and from the controller to the robot, respectively. Wireless communication has been used so that the robot can easily access unapproachable places. It is expected that the information obtained from the robot will be of definite help to a rescue team in better planning and executing their operations.
    30
    8791
    A New Approach to Annotate the Text's of the Websites and Documents with a Quite Comprehensive Knowledge Base
    Abstract:
    Machine-understandable data, when strongly interlinked, constitutes the basis for the Semantic Web. Annotating web documents is one of the major techniques for creating metadata on the Web. Annotating websites defines the data they contain in a form which is suitable for interpretation by machines. In this paper, we present a new approach to annotating websites and documents by raising the abstraction level of the annotation process to a conceptual level. By this means, we hope to solve some of the problems of current annotation solutions.
    29
    9393
    Maximum Common Substructure Extraction in RNA Secondary Structures Using Clique Detection Approach
    Abstract:
    The similarity comparison of RNA secondary structures is important in studying the functions of RNAs. In recent years, most existing tools have represented secondary structures by tree-based presentations and calculated similarity by tree alignment distance. Unlike previous approaches, we propose a new method based on a maximum clique detection algorithm to extract the maximum common structural elements in compared RNA secondary structures. A new graph-based similarity measurement and maximum common subgraph detection procedure for comparing pure RNA secondary structures is introduced. Given two RNA secondary structures, the proposed algorithm first determines the score of the structural similarity by comparing vertex labels, labelled edges and the exact degree of each vertex, and then extracts the common structural elements between the compared secondary structures based on a maximum clique formulation of the problem. The graph-based model can also work with the NC-IUB code to perform pattern-based searching; therefore, it can be used to identify functional RNA motifs in databases or to extract common substructures between complex RNA secondary structures. We demonstrate the performance of the proposed algorithm through experimental results. It provides a new way of comparing RNA secondary structures and is helpful to those interested in structural bioinformatics.
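
    The classical reduction the abstract alludes to (maximum common subgraph via maximum clique) can be sketched compactly: build the modular product of two labelled graphs and search for a maximum clique with networkx. The toy graphs and uniform labels below are invented, and no RNA-specific encoding is attempted.

        # Hedged sketch: maximum common subgraph via max clique on the modular product.
        import networkx as nx
        from itertools import combinations

        def modular_product(g1, g2):
            p = nx.Graph()
            p.add_nodes_from((u, v) for u in g1 for v in g2
                             if g1.nodes[u]["label"] == g2.nodes[v]["label"])
            for (u1, v1), (u2, v2) in combinations(p.nodes, 2):
                if u1 != u2 and v1 != v2 and g1.has_edge(u1, u2) == g2.has_edge(v1, v2):
                    p.add_edge((u1, v1), (u2, v2))
            return p

        g1, g2 = nx.path_graph(3), nx.path_graph(4)
        for g in (g1, g2):
            nx.set_node_attributes(g, "N", "label")   # uniform labels for the toy case

        product = modular_product(g1, g2)
        mapping = max(nx.find_cliques(product), key=len)
        print(mapping)   # pairs (u, v): a largest common substructure correspondence
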
    28
    9452
    Digital Social Networks: Examining the Knowledge Characteristics
    Abstract:
    In today's information age, many organizations are still debating how to capitalize on the values of Information Technology (IT) and Knowledge Management (KM) so that individuals can benefit and effective communication among individuals can be established. IT enables positive improvement in communication among knowledge workers (k-workers) through a number of social network technology domains in the workplace. The acceptance of digital discourse in the sharing of knowledge, and the facilitation of knowledge and information flows at most organizations, indeed promotes a culture of knowledge sharing in Digital Social Networks (DSN). Therefore, this study examines whether k-workers with an IT background confer an effect on three knowledge characteristics: conceptual, contextual, and operational. Derived from these three knowledge characteristics, five potential factors are examined for their effects on knowledge exchange via the e-mail domain as the chosen query. It is expected that the results could provide a parameter for exploring how DSN contributes to supporting k-workers' virtues, performance and qualities, as well as revealing the mutual point between IT and KM.
    27
    9632
    Systholic Boolean Orthonormalizer Network in Wavelet Domain for Microarray Denoising
    Abstract:

    We describe a novel method for removing noise of unknown variance from microarrays in the wavelet domain. The method is based on the following procedure: 1) apply the Bidimensional Discrete Wavelet Transform (DWT-2D) to the noisy microarray; 2) scale and round the coefficients of the highest subbands (to obtain integer and positive coefficients); 3) bit-slice the new highest subbands (to obtain bit-planes); 4) apply the Systholic Boolean Orthonormalizer Network (SBON) to the input bit-plane set, obtaining two orthonormal output bit-plane sets (in a Boolean sense), and project one set onto the other by means of an AND operation; 5) re-assemble; 6) rescale; and finally 7) apply the inverse DWT-2D and reconstruct the microarray from the modified wavelet coefficients. Denoising results compare favorably with most methods currently in use.
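
    The SBON step is specific to the paper, but the surrounding wavelet machinery is standard; as an assumed, simplified stand-in, the sketch below denoises an image by soft-thresholding the detail coefficients of a single-level DWT-2D with PyWavelets. The wavelet and threshold are arbitrary choices.

        # Hedged sketch: generic DWT-2D soft-threshold denoising (not the SBON method).
        import numpy as np
        import pywt

        rng = np.random.default_rng(0)
        clean = np.outer(np.hanning(64), np.hanning(64))     # smooth stand-in "microarray"
        noisy = clean + rng.normal(0, 0.05, clean.shape)

        cA, (cH, cV, cD) = pywt.dwt2(noisy, "db2")           # step 1: forward DWT-2D
        cH, cV, cD = (pywt.threshold(c, 0.1, mode="soft") for c in (cH, cV, cD))
        denoised = pywt.idwt2((cA, (cH, cV, cD)), "db2")     # step 7: inverse DWT-2D

        print(float(np.abs(denoised[:64, :64] - clean).mean())
              < float(np.abs(noisy - clean).mean()))          # usually True
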

    26
    10072
    DCBOR: A Density Clustering Based on Outlier Removal
    Abstract:
    Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well-known single-link clustering algorithm, which we call DCBOR. The proposed algorithm alleviates the chaining effect by removing outliers from the given dataset, thus providing outlier detection and data clustering simultaneously. The algorithm does not need to update a distance matrix, since it merges the k nearest objects in a single step and a cluster continues to grow as long as possible under a specified condition. The algorithm consists of two phases: in the first phase, it removes the outliers from the input dataset; in the second phase, it performs the clustering process. The algorithm discovers clusters of different shapes, sizes and densities, and requires only one input parameter, a threshold for outlier points whose value ranges from 0 to 1; the algorithm supports the user in determining an appropriate value for it. We have tested the algorithm on datasets that contain outliers and clusters connected by chains of density points, and it discovers the correct clusters. The results of our experiments demonstrate the effectiveness and efficiency of DCBOR.
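
    DCBOR's exact removal condition is not given in the abstract; as an assumed illustration of the two-phase idea, the sketch below first drops points whose k-NN distance is unusually large, then runs standard single-link clustering on the remainder with SciPy.

        # Hedged sketch: outlier removal followed by single-link clustering.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.spatial.distance import cdist

        rng = np.random.default_rng(0)
        a = rng.normal((0, 0), 0.3, (50, 2))
        b = rng.normal((5, 5), 0.3, (50, 2))
        bridge = np.array([[2.5, 2.5]])            # an outlier that would cause chaining
        X = np.vstack([a, b, bridge])

        # Phase 1: remove points whose distance to their k-th neighbour is too large.
        k = 4
        d = np.sort(cdist(X, X), axis=1)[:, k]
        keep = d < d.mean() + 2 * d.std()
        X_clean = X[keep]

        # Phase 2: single-link clustering on the cleaned data.
        labels = fcluster(linkage(X_clean, method="single"), t=2, criterion="maxclust")
        print("removed:", int((~keep).sum()), "clusters found:", len(set(labels)))
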
    25
    10457
    Service Identification Approach to SOA Development
    Abstract:
    Service identification is one of the main activities in the modeling of a service-oriented solution; errors made during identification can therefore flow down through the detailed design and implementation activities and necessitate multiple iterations, especially in building composite applications. Different strategies exist for identifying candidate services, each with its own benefits and trade-offs. The approach presented in this paper proposes selective identification of services, based on in-depth business process analysis coupled with use case and existing asset analysis and goal service modeling. This article emphasizes the key activities needed for the analysis and service identification required to build an optimized service-oriented architecture. In contrast to other approaches, this article presents best practices and steps, wherever appropriate, to point out the vagueness involved in service identification.
    24
    10464
    A Parallel Quadtree Approach for Image Compression using Wavelets
    Abstract:
    Wavelet transforms are multiresolution decompositions that can be used to analyze signals and images. Image compression is one of the major applications of wavelet transforms in image processing, and is considered one of the most powerful methods, providing a high compression ratio. However, its implementation is very time-consuming. On the other hand, parallel computing technologies offer an efficient route to image compression using wavelets. In this paper, we propose a parallel wavelet compression algorithm based on quadtrees. We implement the algorithm using MatlabMPI (a parallel, message-passing version of Matlab), compute its isoefficiency function, and show that it is scalable. Our experimental results also confirm the efficiency of the algorithm.
    23
    10492
    A New Model for Discovering XML Association Rules from XML Documents
    Abstract:
    The inherent flexibility of XML in both structure and semantics makes mining XML data a complex task, with more challenges than traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We use frequent subtree mining techniques directly in the discovery process and do not ignore the tree structure of the data in the final rules. The frequent subtrees, mined with the user-provided support, are split into complementary subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.
    22
    10830
    A Subjective Scheduler Based on Backpropagation Neural Network for Formulating a Real-life Scheduling Situation
    Abstract:
    This paper presents a subjective job scheduler based on a 3-layer Backpropagation Neural Network (BPNN) and a greedy alignment procedure in order to formulate a real-life scheduling situation. The BPNN estimates the critical values of jobs based on the given subjective criteria. The scheduler is formulated in such a way that, at each time period, the most critical job is selected from the job queue and transferred to a single machine before the next periodic job arrives. If the selected job is one of the oldest jobs in the queue and its deadline is earlier than the arrival time of the current job, its deadline is updated in order to prevent the critical job from being eliminated. The proposed satisfiability criteria indicate the satisfaction of the scheduler with respect to the performance of the BPNN, the validity of the jobs, and the feasibility of the scheduler.
    21
    11337
    Artificial Neural Network Development by means of Genetic Programming with Graph Codification
    Abstract:
    The development of Artificial Neural Networks (ANNs) is usually a slow process in which a human expert has to test several architectures until finding one that achieves the best results for a given problem. This work presents a new technique that uses Genetic Programming (GP) to generate ANNs automatically. To do this, the GP algorithm had to be changed to work with graph structures so that ANNs can be developed. The technique also yields simplified networks that solve the problem with a small group of neurons. In order to measure the performance of the system and to compare the results with other ANN development methods based on Evolutionary Computation (EC) techniques, several tests were performed on problems drawn from some of the most widely used test databases. The comparisons show that the system achieves good results, comparable with the existing techniques, and in most cases works better than they do.
    20
    11564
    Underlying Cognitive Complexity Measure Computation with Combinatorial Rules
    Abstract:
    Measuring the complexity of software has been an unsolved problem in software engineering. Complexity measures can be used to predict critical information about the testability, reliability, and maintainability of software systems from automatic analysis of the source code. During the past few years, many complexity measures have been invented based on the emerging Cognitive Informatics discipline. These software complexity measures, including cognitive functional size, are built on the total cognitive weights of basic control structures such as loops and branches. This paper shows that the currently existing calculation method can generate different results for expressions that are algebraically equivalent. An analysis of the combinatorial meaning of this calculation method reveals a significant flaw in the measure, which also explains why it does not satisfy Weyuker's properties. Based on these findings, improvement directions, such as measure fusion and a cumulative variable counting scheme, are suggested to enhance the effectiveness of cognitive complexity measures.
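
    As a rough illustration of cognitive-weight calculation (the weights and the sum-then-multiply rule below follow the usual cognitive functional size convention and are assumed here rather than taken from this paper), the sketch scores a small nested structure: sibling structures add, nested structures multiply.

        # Hedged sketch: total cognitive weight with assumed BCS weights
        # (sequence=1, branch=2, iteration=3): siblings add, nesting multiplies.
        WEIGHT = {"seq": 1, "branch": 2, "loop": 3}

        def cognitive_weight(node):
            kind, children = node
            inner = sum(cognitive_weight(c) for c in children) if children else 1
            return WEIGHT[kind] * inner

        # A loop containing a branch and a sequence: 3 * (2 + 1) = 9.
        program = ("loop", [("branch", []), ("seq", [])])
        print(cognitive_weight(program))
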
    19
    11750
    Estimating Shortest Circuit Path Length Complexity
    Abstract:
    When binary decision diagrams are formed from uniformly distributed Monte Carlo data for a large number of variables, the complexity of the decision diagrams exhibits a predictable relationship to the number of variables and minterms. In the present work, a neural network model has been used to analyze the pattern of shortest path length for larger numbers of Monte Carlo data points. The neural model shows strong descriptive power for the ISCAS benchmark data, with an RMS error of 0.102 for the shortest path length complexity. The model can therefore be considered a method of predicting path length complexities; this is expected to lead to minimum time complexity of very large-scale integrated circuitry and related computer-aided design tools that use binary decision diagrams.
    18
    12310
    Feature Selection with Kohonen Self Organizing Classification Algorithm
    Abstract:
    In this paper, a one-dimensional Self-Organizing Map (SOM) algorithm for feature selection is presented. The algorithm is based on a first classification of the input dataset in a similarity space. From this classification, a set of positive and negative features is computed for each class, and this set of features is selected as the result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions; these datasets come from KDT applications, drug discovery, and other applications. The known correct classification of the training and validation datasets is used to optimize the parameters for positive and negative feature extraction. The process becomes feasible for large and sparse datasets, such as those obtained in KDT applications, by using compression techniques to store the similarity matrix and speed-up techniques for the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements, together with grid computing, make it feasible to apply the methodology to massive datasets.
    17
    12768
    Rotation Invariant Fusion of Partial Image Parts in Vista Creation using Missing View Regeneration
    Abstract:
    The automatic construction of large, high-resolution image vistas (mosaics) is an active area of research in the fields of photogrammetry [1,2], computer vision [1,4], medical image processing [4], computer graphics [3] and biometrics [8]. Image stitching is one way to obtain image mosaics. Vista creation in image processing is used to construct an image with a larger field of view than could be obtained with a single photograph. It refers to transforming and stitching multiple images into a new aggregate image without any visible seam or distortion in the overlapping areas: the process aligns two partial images over each other and blends them together. Image mosaics allow one to compensate for differences in viewing geometry, so they can be used to simplify tasks by simulating the condition in which the scene is viewed from a fixed position with a single camera. While obtaining partial images, geometric anomalies such as rotation and scaling are bound to happen. To nullify the effect of rotation of partial images on the vista creation process, we propose a rotation-invariant vista creation algorithm in this paper. Rotation of partial image parts in the proposed method may introduce missing regions in the vista; to correct this error, i.e., to fill the missing regions, we apply an image inpainting method to the created vista. This missing view regeneration method also overcomes the problem of missing views [31] in the vista due to cropping, irregular boundaries of partial image parts and errors in digitization [35]. The method generates the missing view of the vista using the information present in the vista itself.
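
    The inpainting step can be illustrated with OpenCV's built-in routines (the paper's specific inpainting method is not named in the abstract, so Telea's algorithm is used here purely as an assumed stand-in): given the stitched vista and a mask marking the missing region, the hole is filled from surrounding pixels.

        # Hedged sketch: fill a missing region in a mosaic with OpenCV inpainting.
        import numpy as np
        import cv2

        vista = np.full((128, 256, 3), 180, dtype=np.uint8)   # stand-in stitched vista
        cv2.line(vista, (0, 64), (255, 64), (0, 0, 255), 3)   # some image content

        mask = np.zeros((128, 256), dtype=np.uint8)
        mask[40:90, 100:140] = 255                            # missing region after rotation

        vista[mask > 0] = 0                                   # simulate the hole
        filled = cv2.inpaint(vista, mask, 3, cv2.INPAINT_TELEA)
        cv2.imwrite("vista_filled.png", filled)
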
    16
    12779
    K-Means for Spherical Clusters with Large Variance in Sizes
    Abstract:
    Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, it is suitable for spherically shaped clusters of similar sizes and densities; the quality of the resulting clusters decreases when the data set contains spherical clusters with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster and recomputing the membership of the small cluster's points. The experimental results reveal that the proposed algorithm produces satisfactory results.
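
    The failure mode the paper targets is easy to reproduce; the following sketch (an illustration of the problem, not of the authors' correction) clusters one large and one small spherical cluster with scikit-learn's k-means and counts how many points of the big cluster get pulled into the small cluster's partition.

        # Hedged sketch: k-means mislabels points when cluster sizes differ widely.
        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        big = rng.normal((0, 0), 2.0, (1000, 2))     # large, wide cluster
        small = rng.normal((6, 0), 0.3, (50, 2))     # small, tight cluster
        X = np.vstack([big, small])

        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
        small_label = np.bincount(labels[-50:]).argmax()
        stolen = (labels[:1000] == small_label).sum()
        print(f"{stolen} points of the big cluster were assigned to the small one")
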
    15
    13070
    A Message Passing Implementation of a New Parallel Arrangement Algorithm
    Abstract:
    This paper describes a new parallel sorting algorithm, based on Odd-Even Mergesort, called division and concurrent mixes. The main idea of the algorithm is that each processor uses a sequential algorithm to order a part of the vector, and the processors then work in pairs to merge two of these ordered sections into a larger, also ordered, one; after several iterations, the vector is completely ordered. The paper describes an implementation of the new algorithm in a message passing environment (such as MPI) and compares the experimental results obtained with the sequential quicksort algorithm and with parallel implementations (also on MPI) of quicksort and bitonic sort. The comparison was carried out on an 8-processor cluster under GNU/Linux, each node being a single-processor PC.
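
    As a minimal sketch of the pairwise-merge pattern in MPI (assuming mpi4py; this mirrors odd-even transposition on blocks rather than the paper's exact scheme), each rank sorts its local block, then repeatedly exchanges blocks with a parity-determined partner and keeps the lower or upper half.

        # Hedged sketch: odd-even block sort with mpi4py (run: mpiexec -n 4 python sort.py).
        from mpi4py import MPI
        import random

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        local = sorted(random.Random(rank).random() for _ in range(8))  # sequential phase

        for phase in range(size):
            partner = rank + 1 if (rank + phase) % 2 == 0 else rank - 1
            if 0 <= partner < size:
                other = comm.sendrecv(local, dest=partner, source=partner)
                merged = sorted(local + other)                 # concurrent "mix" of a pair
                half = len(merged) // 2
                local = merged[:half] if rank < partner else merged[half:]

        print(rank, local[0], local[-1])  # blocks end up globally ordered across ranks
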
    14
    13275
    Support Vector Machine based Intelligent Watermark Decoding for Anticipated Attack
    Abstract:
    In this paper, we present an innovative scheme for blindly extracting message bits from an image distorted by an attack. A Support Vector Machine (SVM) is used to nonlinearly classify the bits of the embedded message. Traditionally, a hard decoder is used under the assumption that the underlying model of the Discrete Cosine Transform (DCT) coefficients does not change appreciably. In case of an attack, however, the distribution of the image coefficients is heavily altered: the distributions of the sufficient statistics at the receiving end corresponding to the antipodal signals overlap, and a simple hard decoder fails to classify them properly. We treat message retrieval of the antipodal signal as a binary classification problem and use machine learning techniques such as SVM to retrieve the message when a certain specific class of attacks is most probable. In order to validate the SVM-based decoding scheme, we have taken Gaussian noise as a test case. We generate a data set using 125 images and 25 different keys. The polynomial kernel of the SVM achieved 100 percent accuracy on the test data.
    13
    13484
    ANN Based Currency Recognition System using Compressed Gray Scale and Application for Sri Lankan Currency Notes - SLCRec
    Abstract:
    Automatic currency note recognition invariably depends on the currency note characteristics of a particular country, and the extraction of features directly affects the recognition ability. Sri Lanka has not previously been involved in research or implementation of this kind. The proposed system, "SLCRec", provides a solution focused on minimizing the false rejection of notes. Sri Lankan currency notes undergo severe changes in image quality with usage; hence a special linear transformation function is adopted to wipe out noise patterns from the background without affecting the notes' characteristic images, and to re-emphasize the images of interest. The transformation maps the original gray-scale range into the smaller range of 0 to 125. Applying edge detection after the transformation provides better robustness to noise and a fair representation of edges for both new and old damaged notes. A three-layer backpropagation neural network is trained with the number of edges detected in row order of the notes, and classification is performed over the four classes of interest, namely the 100, 500, 1000 and 2000 rupee notes. The experiments showed good classification results and proved that the proposed methodology is capable of separating the classes properly under varying image conditions.
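
    The described preprocessing is a linear range compression followed by edge detection; the sketch below shows one plausible reading of it with NumPy and OpenCV. The exact transformation coefficients and the edge detector used in SLCRec are not given in the abstract, so the straight rescale, the Canny detector and the input file name are all assumptions.

        # Hedged sketch: compress the gray range to 0..125, then detect edges.
        import numpy as np
        import cv2

        note = cv2.imread("note.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
        lo, hi = note.min(), note.max()
        compressed = ((note.astype(np.float32) - lo) / max(hi - lo, 1) * 125).astype(np.uint8)

        edges = cv2.Canny(compressed, 30, 90)                 # assumed edge detector
        edges_per_row = (edges > 0).sum(axis=1)               # row-order edge counts
        print(edges_per_row[:10])                             # head of the feature vector
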
    12
    13828
    A Deterministic Polynomial-time Algorithm for the Clique Problem and the Equality of P and NP Complexity Classes
    Abstract:
    In this paper, a deterministic polynomial-time algorithm is presented for the Clique problem. The problem is treated as that of omitting the minimum number of vertices from the input graph so that none of the zeroes of the graph's adjacency matrix (except the main diagonal entries) remains in the adjacency matrix of the resulting subgraph. The existence of a deterministic polynomial-time algorithm for the Clique problem, an NP-complete problem, will prove the equality of the P and NP complexity classes.
    11
    14016
    A Knowledge-Based E-mail System Using Semantic Categorization and Rating Mechanisms
    Abstract:
    Knowledge-based e-mail systems incorporate a knowledge management approach in order to enhance traditional e-mail systems. In this paper, we present a knowledge-based e-mail system called KS-Mail, in which people not only send and receive e-mail conventionally but are also able to create a sense of knowledge flow. We introduce semantic processing of the e-mail contents by automatically assigning categories and providing links to semantically related e-mails. This is done to enrich the knowledge value of each e-mail as well as to ease the organization of the e-mails and their contents. At the application level, we have also built components such as the service manager, evaluation engine and search engine to handle the e-mail processes efficiently by providing the means to share and reuse knowledge. For this purpose, we present the KS-Mail architecture and elaborate on the details of the e-mail server and the application server. We present the ontology mapping technique used to categorize the e-mail contents, as well as the protocols that we have developed to handle the transactions in the e-mail system. Finally, we discuss the implementation of the modules presented in the KS-Mail architecture.
    10
    14122
    Novelty as a Measure of Interestingness in Knowledge Discovery
    Abstract:
    Rule discovery is an important technique for mining knowledge from large databases. The use of objective measures for discovering interesting rules leads to another data mining problem, although one of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules and ultimately improve the overall efficiency of the KDD process. In this paper we study the novelty of discovered rules as a subjective measure of interestingness. We propose a hybrid approach, based on both objective and subjective measures, to quantify the novelty of discovered rules in terms of their deviation from known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to a user-specified threshold. We implement the proposed framework and experiment with some public datasets, and the experimental results are promising.
    9
    14259
    Design and Implementation of Shared Memory based Parallel File System Logging Method for High Performance Computing
    Abstract:
    I/O workload is a critical and important factor in analyzing I/O patterns and file system performance. However, tracing I/O operations on the fly in a distributed parallel file system is non-trivial due to the collection overhead and the large volume of data. In this paper, we design and implement a parallel file system logging method for high performance computing using a shared-memory-based multi-layer scheme. It minimizes the overhead with a reduced logging-operation response time and provides an efficient post-processing scheme through shared memory. A separate logging server can collect sequential logs from multiple clients in a cluster through packet communication. Implementation and evaluation results show the low overhead and high scalability of this architecture for high performance parallel logging analysis.
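    A minimal sketch of the shared-memory idea follows, assuming fixed-size records and a single writer per buffer; the real system's multi-layer scheme and its packet protocol to the logging server are not reproduced here.

    from multiprocessing import shared_memory
    import struct, time

    REC = struct.Struct("dII")   # timestamp, opcode, byte count
    CAP = 1024                   # records per ring buffer

    shm = shared_memory.SharedMemory(create=True, size=8 + REC.size * CAP)
    head = 0

    def log_io(op, nbytes):
        # Append one I/O record; no syscall on the hot path, so the
        # response-time overhead of a logging operation stays small.
        global head
        off = 8 + (head % CAP) * REC.size
        REC.pack_into(shm.buf, off, time.time(), op, nbytes)
        head += 1
        struct.pack_into("Q", shm.buf, 0, head)  # publish write position

    log_io(0, 4096)  # e.g. a 4 KiB read
    shm.close(); shm.unlink()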
    8
    14265
    Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks
    Abstract:
    This paper presents a new strategy for the identification and classification of pathological voices using a hybrid method based on the wavelet transform and neural networks. After speech acquisition from a patient, the speech signal is analysed in order to extract acoustic parameters such as the pitch, the formants, jitter, and shimmer. The obtained results are then compared with normal, standard values by means of a programmable database. Sounds are collected from normal people and patients, and then classified into two different categories. The speech database consists of several pathological and normal voices collected from the national hospital “Rabta-Tunis”. The speech processing algorithm is conducted in a supervised mode, first for the discrimination of normal and pathological voices and then for the classification between neural and vocal pathologies (Parkinson's, Alzheimer's, laryngeal, dyslexia...). Several simulation results are presented as a function of the disease and compared with the clinical diagnosis in order to obtain an objective evaluation of the developed tool.
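    A hedged sketch of such a hybrid pipeline is given below: wavelet sub-band energies feed a small multilayer network. PyWavelets and scikit-learn stand in for the paper's unspecified implementation, and the 'db4' wavelet, layer sizes and synthetic data are placeholder choices.

    import numpy as np
    import pywt
    from sklearn.neural_network import MLPClassifier

    def wavelet_energies(signal, level=5):
        # Energy of each wavelet sub-band as a compact voice descriptor.
        coeffs = pywt.wavedec(signal, "db4", level=level)
        return np.array([np.sum(c ** 2) for c in coeffs])

    # X: one feature vector per recording; y: 0 = normal, 1 = pathological
    rng = np.random.default_rng(0)
    X = np.stack([wavelet_energies(rng.standard_normal(2048)) for _ in range(40)])
    y = rng.integers(0, 2, size=40)

    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    clf.fit(X, y)

    In practice the feature vector would also carry the pitch, formant, jitter and shimmer values mentioned above.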
    7
    15023
    The Modified Eigenface Method using Two Thresholds
    Abstract:
    A new approach based on Turk and Pentland's eigenface method is adopted in this paper. It was found that the probability density function of the distance between the projection vector of the input face image and the average projection vector of the subject in the face database follows a Rayleigh distribution. In order to decrease the false acceptance rate and increase the recognition rate, the input face image is recognized using two thresholds: an acceptance threshold and a rejection threshold. We also find that the values of the two thresholds approach each other as the number of trials increases. During training, in order to reduce the number of trials, the projection vectors for each subject have been averaged. The recognition experiments using the proposed algorithm show that the recognition rate reaches 92.875%, whilst the average number of judgments is only 2.56.
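    The two-threshold decision rule can be summarized in a short sketch; the thresholds and the averaged subject projections are placeholders rather than the paper's fitted Rayleigh-based values.

    import numpy as np

    def recognize(probe_vec, subject_means, t_accept, t_reject):
        # Accept the closest subject if within t_accept, reject outright
        # if beyond t_reject, otherwise request another trial image.
        d = np.linalg.norm(subject_means - probe_vec, axis=1)
        best = int(np.argmin(d))
        if d[best] <= t_accept:
            return ("accept", best)
        if d[best] >= t_reject:
            return ("reject", None)
        return ("retry", None)  # capture another image and judge again

    Averaging each subject's projection vectors during training, as described above, is what keeps the expected number of such judgments low.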
    6
    15050
    Generating Speq Rules based on Automatic Proof of Logical Equivalence
    Abstract:
    In the Equivalent Transformation (ET) computation model, a program is constructed by the successive accumulation of ET rules. A meta-computation method by which correct ET rules are generated has been proposed. Although the method covers a broad range in the generation of ET rules, not all important ET rules are necessarily generated. The generation of more ET rules can be achieved by supplementing it with generation methods that are specialized for important ET rules. A Specialization-by-Equation (Speq) rule is one of those important rules. A Speq rule describes a procedure in which two variables included in an atom conjunction are equalized due to predicate constraints. In this paper, we propose an algorithm that systematically and recursively generates Speq rules, and we discuss its effectiveness in the synthesis of ET programs. A Speq rule is generated based on the proof of a logical formula consisting of a given atom set and a dis-equality. The proof is carried out by utilizing some ET rules, and the rules ultimately obtained are used in generating Speq rules.
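    As an illustrative example (not the paper's generator): if a predicate p is known to be functional in its first argument, the conjunction p(K, X), p(K, Y) forces X = Y, which is exactly the kind of variable equalization a Speq rule captures. A minimal sketch of that check:

    def speq_equalize(atoms, functional_preds):
        # Return variable pairs proven equal by functionality constraints;
        # atoms are (predicate, key_var, value_var) triples.
        seen, equal = {}, []
        for pred, key, val in atoms:
            if pred in functional_preds:
                if (pred, key) in seen and seen[(pred, key)] != val:
                    equal.append((seen[(pred, key)], val))
                seen[(pred, key)] = val
        return equal

    print(speq_equalize([("p", "K", "X"), ("p", "K", "Y")], {"p"}))  # [('X', 'Y')]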
    5
    15193
    Calibration Method for an Augmented Reality System
    Abstract:
    In geometrical camera calibration, the objective is to determine a set of camera parameters that describe the mapping between 3D reference coordinates and 2D image coordinates. In this paper, a calibration and tracking technique based on a least-squares method, together with a correlation technique, is presented, developed as part of an augmented reality system. This approach is fast and can be used in a real-time system.
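    One standard least-squares formulation of geometrical calibration is the direct linear transform (DLT), sketched below; it is assumed here for illustration, since the abstract does not spell out the paper's exact equations.

    import numpy as np

    def calibrate_dlt(world, image):
        # Estimate the 3x4 projection matrix P mapping homogeneous 3D
        # reference points to 2D image points, from >= 6 correspondences.
        rows = []
        for (X, Y, Z), (u, v) in zip(world, image):
            rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
            rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
        A = np.asarray(rows, dtype=float)
        _, _, vt = np.linalg.svd(A)  # least-squares solution is the right
        return vt[-1].reshape(3, 4)  # singular vector of the smallest value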
    4
    15212
    Considerations of Public Key Infrastructure (PKI), Functioning as a Chain of Trust in Electronic Payments Systems
    Abstract:
    The growth of open networks has created interest in their commercialisation. The establishment of an electronic business mechanism must be accompanied by a digital, electronic payment system to transfer the value of transactions. Financial organizations are requested to offer a secure e-payment synthesis with a level of security equivalent to that served in conventional paper-based payment transactions. PKI, which functions as a chain of trust in a security architecture, can extend the security services of cryptography to e-payments, in order to take advantage of the wider base of customers and trading partners, and of the reduction in transaction costs achieved by the use of Internet channels. The paper addresses the possibilities of PKI in relevance to electronic payments and offers implementation suggestions by proposing a framework to be followed.
    3
    15268
    An Efficient Key Management Scheme for Secure SCADA Communication
    Abstract:
    A SCADA (Supervisory Control And Data Acquisition) system is an industrial control and monitoring system for national infrastructures. In the past, SCADA systems were used in closed environments without consideration of security functionality. As communication technology develops, there are attempts to connect SCADA systems to open networks; therefore, the security of SCADA systems has become an issue. Key management for SCADA systems has also been studied. However, existing key management schemes for SCADA systems, such as SKE (Key Establishment for SCADA systems) and SKMA (Key Management scheme for SCADA systems), cannot support broadcast communication. To solve this problem, an Advanced Key Management Architecture for Secure SCADA Communication was proposed by Choi et al. However, their scheme incurs a high computational cost for multicast communication. In this paper, we propose an enhanced scheme that improves the computational cost of multicast communication while taking into account the number of keys to be stored in a low-power communication device (RTU).
    2
    15302
    Application of “Multiple Risk Communicator“ to the Personal Information Leakage Problem
    Abstract:
    Along with the progress of our information society, various risks are becoming increasingly common, causing multiple social problems. For this reason, risk communication for establishing consensus among stakeholders who have different priorities has become important. However, it is not always easy for decision makers to agree on measures to reduce risks based on opposing concepts such as security, privacy and cost. Therefore, we previously developed and proposed the “Multiple Risk Communicator” (MRC) with the following functions: (1) modeling support for the risk specialist, (2) an optimization engine, and (3) display of the computed results. In this paper, MRC program version 1.0 is applied to the personal information leakage problem. The application process and validation of the results are discussed.
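    As a stand-in for the optimization engine (function 2), the toy sketch below picks the set of countermeasures that minimizes residual leakage risk within a cost budget; the measures, numbers and risk model are hypothetical.

    from itertools import product

    MEASURES = [  # (name, cost, residual-risk multiplier)
        ("encrypt laptops", 3.0, 0.6),
        ("access control",  2.0, 0.7),
        ("staff training",  1.0, 0.8),
    ]

    def optimize(budget, base_risk=1.0):
        # Exhaustively try every subset of measures within the budget.
        best = (base_risk, ())
        for picks in product([0, 1], repeat=len(MEASURES)):
            cost = sum(m[1] for m, p in zip(MEASURES, picks) if p)
            if cost > budget:
                continue
            risk = base_risk
            for m, p in zip(MEASURES, picks):
                if p:
                    risk *= m[2]
            best = min(best, (risk, picks))
        return best

    print(optimize(budget=4.0))  # lowest residual risk within the budget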
    1
    15879
    Evolutionary Feature Selection for Text Documents using the SVM
    Abstract:
    Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection (called SVM_FS) and a Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with the GA_SVM method for a relatively small dimension of the feature vector.
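    A compact sketch of GA-based feature selection wrapped around an SVM, in the spirit of GA_SVM, is shown below; the population size, mutation rate, fitness definition and synthetic data are illustrative choices rather than the paper's settings.

    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    y = (X[:, 3] + X[:, 17] > 0).astype(int)  # two informative features

    def fitness(mask):
        # Cross-validated SVM accuracy on the selected feature subset.
        if not mask.any():
            return 0.0
        return cross_val_score(LinearSVC(dual=False), X[:, mask], y, cv=3).mean()

    pop = rng.random((20, X.shape[1])) < 0.5  # random boolean feature masks
    for gen in range(15):
        scores = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(scores)[-10:]]          # keep the fittest half
        kids = []
        for _ in range(10):
            a, b = parents[rng.integers(10, size=2)]
            cut = rng.integers(1, X.shape[1])
            child = np.concatenate([a[:cut], b[cut:]])   # one-point crossover
            child ^= rng.random(X.shape[1]) < 0.02       # bit-flip mutation
            kids.append(child)
        pop = np.vstack([parents, kids])

    best = pop[np.argmax([fitness(m) for m in pop])]
    print("selected features:", np.flatnonzero(best))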