Machine learning is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as sensor data or databases. Put another way, it is the study of computer algorithms that improve automatically with experience.


Tool Names


Open Source Computer Vision Library

Open source computer learning system making use of the Bayesian inferencing engine.

Weka 3: Data Mining Software

A collection of tools that implement decision trees and tables, rule learners, Naive Bayes, voted perceptrons, multi-layer perceptrons, and support vector machines. Its meta schemes include bagging, stacking, and boosting.
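To make the idea of a meta scheme concrete, here is a minimal sketch of bagging in plain Python. It is not Weka code and does not use Weka's API; the function names, the toy data set, and the choice of a one-feature decision stump as the base learner are all assumptions made for illustration. Each base model is trained on a bootstrap resample of the data, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter

def train_stump(samples):
    """Fit a one-feature threshold classifier on (x, label) pairs with labels 0/1."""
    best = None
    for threshold, _ in samples:
        # Try both orientations: which label goes on the left of the threshold.
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= threshold else right) != y for x, y in samples)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right

def bagging(samples, n_models=15, seed=0):
    """Train n_models stumps, each on a bootstrap resample, and combine them by vote."""
    rng = random.Random(seed)
    models = [
        train_stump([rng.choice(samples) for _ in samples])  # sample with replacement
        for _ in range(n_models)
    ]

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict

# Hypothetical toy data: small x values are class 0, large x values are class 1.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
predict = bagging(data)
```

Calling predict on a new value returns the label most of the stumps agree on. Boosting and stacking differ in how the base models are trained and combined, and are not shown here.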

Pfam: Database of Protein Families and HMMs

Pfam is a large collection of multiple sequence alignments and trained hidden Markov models covering many common protein domains. It provides alignments and models for 8296 protein families, based on the Swissprot 48.9 and SP-TrEMBL 31.9 protein sequence databases.

HMMER: Biosequence Analysis

A tool mainly used to build HMMs from multiple alignments and to calculate E-values for sequence matches.
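As a rough illustration of how an HMM assigns a probability to a sequence, the sketch below implements the standard forward algorithm in plain Python. It is not HMMER's implementation, which uses profile HMMs and its own scoring, and the two-state model with its state names and probabilities is entirely hypothetical.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under the HMM, summing over all hidden state paths."""
    # alpha[s] = probability of the symbols seen so far, with the path ending in state s
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical two-state model over DNA: an AT-rich region and a GC-rich region.
states = ("at_rich", "gc_rich")
start_p = {"at_rich": 0.5, "gc_rich": 0.5}
trans_p = {
    "at_rich": {"at_rich": 0.9, "gc_rich": 0.1},
    "gc_rich": {"at_rich": 0.1, "gc_rich": 0.9},
}
emit_p = {
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "gc_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}

likelihood = forward("GGCA", states, start_p, trans_p, emit_p)
```

For long sequences a practical implementation would work in log space or rescale alpha at each step to avoid numerical underflow; that is left out to keep the sketch short.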


A well-known free software bot developed to promote AIML (Artificial Intelligence Markup Language).

GAlib: Matthew's Genetic Algorithms Library 

A library of genetic algorithm objects for performing optimization in C++. The documentation covers the implementation and includes examples with screenshots. The graphical examples use the Athena or Motif widget sets, or MFC.
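For readers new to the technique, the following is a minimal genetic algorithm sketched in Python rather than C++. It does not use or imitate GAlib's classes; the function name evolve, its parameters, and the bit-counting fitness function are assumptions chosen for illustration. It uses tournament selection, one-point crossover, bit-flip mutation, and keeps the best individual of each generation (elitism).

```python
import random

def evolve(fitness, n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=1):
    """Maximize fitness over fixed-length bit lists with a simple genetic algorithm."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def tournament():
        # Pick two individuals at random and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        best = max(population, key=fitness)
        next_gen = [best[:]]  # elitism: the current best always survives
        while len(next_gen) < pop_size:
            mother, father = tournament(), tournament()
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

# Toy objective: maximize the number of 1 bits in the genome.
best = evolve(sum)
```

Because of elitism, the best fitness never decreases from one generation to the next. The fixed seed only makes runs repeatable; it is not part of the technique.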

RapidMiner

RapidMiner is an environment for machine learning built around nested operators.

MEME/MAST: Motif Discovery and Search

A software package for discovering motifs (highly conserved regions) in groups of related DNA or protein sequences, and for searching sequence databases using those motifs.

Pattern Matching Pointers

Pointers to algorithms for searching and matching strings and more complicated patterns such as trees, regular expressions, graphs, point sets, and arrays.
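The simplest case in that list, exact string matching, can be sketched with the Knuth-Morris-Pratt algorithm. The code below is an illustrative Python version with function names of my own choosing; it is not taken from the resource described above.

```python
def build_failure_table(pattern):
    """failure[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure

def find_all(text, pattern):
    """Return the start index of every occurrence of pattern in text, overlaps included."""
    if not pattern:
        return []
    failure = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches
```

The failure table lets the search skip ahead after a mismatch instead of restarting, so the text is scanned once and the running time is linear in the combined length of text and pattern.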

Tree Visualizer

Software which allows one to navigate (fly) through the data tree, zoom in on interesting nodes, click on bars to get counts, and mark interesting places in the tree.

mloss: Machine Learning Open Source Software

A community effort to support reproducible research through open source software, open access to data and results, and open standards for interchange.

Spider: General Purpose Machine Learning Toolbox in Matlab

An object-oriented environment for machine learning in Matlab. Algorithms can be plugged together and compared using model selection, statistical tests, and visual plots.

HMM: Pattern search and discovery

A collection of tools for creating and using HMMs for biological sequences.

Milepost GCC Compiler

A compiler with built-in machine learning that searches for the most efficient compilation for the processor the program runs on. It is an open source compiler created by IBM.

ArrayMiner - ClassMarker

It programmatically isolates similarities between scattered classes of genes and has a rich graphical interface. Given enough data, samples of an unknown class can be classified.

Classification Toolbox for MATLAB

A complete set of algorithms for classification, clustering, feature selection, and feature reduction in Matlab.

MALLET: Advanced Machine Learning for Language 

An integrated collection of Java code for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications.