TY - GEN
T1 - A Cloud-based GWAS Analysis Pipeline for Clinical Researchers
AU - Heinzlreiter, Paul
AU - Perkins, James Richard
AU - Torreno, Oscar
AU - Karlsson, Johan
AU - Ranea, Juan Antonio
AU - Mitterecker, Andreas
AU - Blanca, Miguel
AU - Trelles, Oswaldo
PY - 2014
Y1 - 2014
N2 - The cost of obtaining genome-scale biomedical data continues to drop rapidly, with many hospitals and universities being able to produce large amounts of data. Managing and analysing such ever-growing datasets is becoming a crucial issue. Cloud computing presents a good solution to this problem due to its flexibility in obtaining computational resources. However, it is essential to allow end-users with no experience to take advantage of the cloud computing model of elastic resource provisioning. This paper presents a workflow that allows the end-user to perform the core steps of a genome-wide association analysis where raw gene-expression data is quality assessed. A number of steps in this process are computationally intensive and vary greatly depending on the size of the study, from a few samples to a few thousand. Therefore cloud computing provides an ideal solution to this problem by enabling scalability due to elastic resource provisioning. The key contributions of this paper are a real-world application of cloud computing addressing a critical problem in biomedicine through parallelization of the appropriate parts of the workflow as well as enabling the end-user to concentrate on data analysis and biological interpretation of results by taking care of the computational aspects.
AB - The cost of obtaining genome-scale biomedical data continues to drop rapidly, with many hospitals and universities being able to produce large amounts of data. Managing and analysing such ever-growing datasets is becoming a crucial issue. Cloud computing presents a good solution to this problem due to its flexibility in obtaining computational resources. However, it is essential to allow end-users with no experience to take advantage of the cloud computing model of elastic resource provisioning. This paper presents a workflow that allows the end-user to perform the core steps of a genome-wide association analysis where raw gene-expression data is quality assessed. A number of steps in this process are computationally intensive and vary greatly depending on the size of the study, from a few samples to a few thousand. Therefore cloud computing provides an ideal solution to this problem by enabling scalability due to elastic resource provisioning. The key contributions of this paper are a real-world application of cloud computing addressing a critical problem in biomedicine through parallelization of the appropriate parts of the workflow as well as enabling the end-user to concentrate on data analysis and biological interpretation of results by taking care of the computational aspects.
UR - https://www.scopus.com/pages/publications/84902352565
U2 - 10.5220/0004802103870394
DO - 10.5220/0004802103870394
M3 - Conference proceedings
SN - 978-989-758-019-2
T3 - CLOSER 2014 - Proceedings of the 4th International Conference on Cloud Computing and Services Science
SP - 387
EP - 394
BT - Proc. of the 4th International Conference on Cloud Computing and Services Science (CLOSER 2014)
ER -