So far, we have published prototype packages and web services for data mining
and parallel computing, mostly as Python modules. All are licensed as open
source software.
Software
- DMtools
DMtools is an efficient and flexible toolbox for common data mining tasks,
written in Python.
- Pypar
Pypar is a module that allows Python programs to run in parallel, using MPI
for communication.
- Febrl
Febrl (Freely Extensible Biomedical Record Linkage) currently contains modules
for name and address cleaning and standardisation. Modules for probabilistic
data linkage will be included in the next version.
- qlapply
qlapply is an add-on package for the open-source statistical language R that
provides a parallel version of R's lapply function.
- Predictive Modelling with Sparse Grids (PMSG)
PMSG is a collection of Python modules implementing
predictive modelling using sparse grids.
- POZITIV
POZITIV performs alignment of biological sequences in Python.
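
The idea behind a parallel lapply, as provided by qlapply for R, is to apply one function to every element of a list concurrently and return the results in input order. The following is only a minimal Python sketch of that idea using the standard library's concurrent.futures module. It does not use the API of qlapply or Pypar, and the names `parallel_lapply` and `square` are hypothetical, chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_lapply(items, func, max_workers=4):
    """Apply func to each element of items concurrently.

    Hypothetical helper for illustration; results are returned as a list
    in the same order as the input, as with a sequential map.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves the order of the input iterable.
        return list(pool.map(func, items))


def square(x):
    """Example worker function (an assumption, not from any package above)."""
    return x * x


if __name__ == "__main__":
    print(parallel_lapply([1, 2, 3, 4], square))
```

This sketch uses threads, which in CPython mainly help with I/O-bound work. For CPU-bound functions, ProcessPoolExecutor from the same module could be substituted, and a cluster setting would need a message-passing layer such as MPI instead.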
Web Services
- Geographic Demo
A demonstration of the sparse grid predictive model applied to geographical
data mining.