Project Details
Description
Large-scale genomics projects that exploit leading high-throughput technologies have produced, and continue to produce, massive data sets at exponentially growing rates. So far, only a small part of this data can be abstracted, managed and processed, which gives an incomplete understanding of the biological processes being observed. The lack of processing power is a bottleneck in obtaining results. Comparative genomics is a good example because it combines all the ingredients: huge and ever-growing data sets, complex applications that demand large computational resources, and new mathematical and statistical models for analysing and synthesising genomic information. A promising approach to handling such massive data sets is the creation of new computer software that makes effective use of parallel processing. This proposal links different research domains in a coordinated, multidisciplinary approach to developing tools for Big Data and computationally intensive scientific applications. Generic solutions for Big Data storage, management, distribution, processing and final analysis will be developed. These solutions will target a broad range of scientific applications; specifically, as a proof of concept, they will be implemented in the ‘Comparative Genomics’ field of the bioinformatics and biomedical domains. Example applications include the detection of main evolutionary events; new comparative genomics models that can be evaluated experimentally, e.g. for inter-species evolutionary distance; the composition of k-mer dictionaries for each species; and the customisation of symbolic computing methods to determine the consensus tree from a sequence of trees, with application in multiple sequence alignment, phylogenetic studies, clustering algorithms, etc. Such problems are present in diverse fields of bioinformatics, from NGS DNA assembly to gene expression; all of them are well suited to HPC-CC approaches and have high and attractive potential for commercialisation.
Status | Finished |
---|---|
Effective start/end date | 01.02.2013 → 31.01.2017 |
Fields of science
- 106005 Bioinformatics
- 305 Other Human Medicine, Health Sciences
- 102018 Artificial neural networks
- 102 Computer Sciences
- 106041 Structural biology
- 101029 Mathematical statistics
- 106023 Molecular biology
- 106013 Genetics
- 102001 Artificial intelligence
- 106002 Biochemistry
- 101004 Biomathematics
- 102015 Information systems
- 101019 Stochastics
- 102003 Image processing
- 103029 Statistical physics
- 101018 Statistics
- 101017 Game theory
- 101016 Optimisation
- 202017 Embedded systems
- 101015 Operations research
- 101014 Numerical mathematics
- 101028 Mathematical modelling
- 101026 Time series analysis
- 101024 Probability theory
- 102032 Computational intelligence
- 102004 Bioinformatics
- 101027 Dynamical systems
- 102013 Human-computer interaction
- 305907 Medical statistics
- 305905 Medical informatics
- 101031 Approximation theory
- 102033 Data mining
- 305901 Computer-aided diagnosis and therapy
- 102019 Machine learning
- 106007 Biostatistics
- 202037 Signal processing
- 202036 Sensor systems
- 202035 Robotics
JKU Focus areas
- Digital Transformation