DiseaseGene

From HP-SEE Wiki

Revision as of 17:03, 20 February 2012 by Lifesci (Talk | contribs)

General Information

  • Application's name: DiseaseGene - In-silico Disease Gene Mapper
  • Virtual Research Community: Life Sciences
  • Scientific contact: Kozlovszky Miklos, Windisch Gergely; m.kozlovszky at sztaki.hu
  • Technical contact: Kozlovszky Miklos, Windisch Gergely; m.kozlovszky at sztaki.hu
  • Developers: Biotech Group, Obuda University – John von Neumann Faculty of Informatics
  • Application website (HP SEE Bioinformatics Portal): http://ls-hpsee.nik.uni-obuda.hu:8080
  • Web site: https://biotech.nik.uni-obuda.hu

Short Description

DiseaseGene is a complex data mining and data processing tool that uses large-scale external open-access databases. The aim of the task is to port a data mining tool to the SEE-HPC infrastructure, where it can help researchers perform comparative analysis and target candidate genes for further research into polygenic diseases. The implemented solution can target candidate genes for various diseases such as asthma, diabetes, epilepsy, hypertension or schizophrenia using external online open-access eukaryotic (animal: mouse, rat, B. rerio, etc.) databases. The application performs an in-silico mapping between the genes of the different model animals and searches for unexplored potential target genes. With small modifications the application can be used to target human genes too.

Problems Solved

The implemented solution can target candidate genes for various diseases such as asthma, diabetes, epilepsy, hypertension or schizophrenia using external online open-access eukaryotic (animal: mouse, rat, B. rerio, etc.) databases. The application performs an in-silico mapping between the genes of the different model animals and searches for unexplored potential target genes. With small modifications the application can be used to target human genes too. The reliability parameters and response time (1-5 min) of the Grid are not suitable for such a service.
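The cross-species mapping described above can be illustrated with a minimal sketch. It is not the application's actual code (which is written in C/C++ with Perl/BioPerl pre- and post-processing); it only assumes that genes of the model animals have already been grouped into ortholog groups and that each species has a list of genes already associated with the disease. The function name, the data layout and all gene identifiers below are hypothetical placeholders, not real database entries.

```python
def find_candidates(ortholog_groups, known_disease_genes):
    """Find genes that are not yet linked to the disease in their own species
    but have an ortholog that is linked to it in another species.

    ortholog_groups: {group_id: {species: gene_id}}        (assumed layout)
    known_disease_genes: {species: set of gene_ids}         (assumed layout)
    """
    candidates = {}
    for group_id, members in ortholog_groups.items():
        # Species in which this group's gene is already annotated for the disease.
        annotated = {
            species
            for species, gene in members.items()
            if gene in known_disease_genes.get(species, set())
        }
        if not annotated:
            continue  # no evidence in any species, nothing to transfer
        # Genes of the same group in species without an annotation yet.
        unexplored = {
            species: gene
            for species, gene in members.items()
            if species not in annotated
        }
        if unexplored:
            candidates[group_id] = {
                "evidence_from": sorted(annotated),
                "candidates": unexplored,
            }
    return candidates


# Toy input with placeholder identifiers (hypothetical, for illustration only).
ortholog_groups = {
    "OG1": {"mouse": "m_gene1", "rat": "r_gene1", "zebrafish": "z_gene1"},
    "OG2": {"mouse": "m_gene2", "rat": "r_gene2"},
    "OG3": {"mouse": "m_gene3", "rat": "r_gene3"},
}
known_disease_genes = {
    "mouse": {"m_gene1", "m_gene3"},
    "rat": {"r_gene3"},
}

candidates = find_candidates(ortholog_groups, known_disease_genes)
print(candidates)
```

In this toy input only group OG1 yields candidates: the mouse gene is annotated, while its rat and zebrafish orthologs are not. OG2 has no annotation in any species, and OG3 is already annotated in every species it covers.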

Scientific and Social Impact

Researchers in the region will be able to target candidate genes for further research into polygenic diseases. The project creates a data mining service on the SEE-HPC infrastructure that helps researchers carry out comparative analysis.

Collaborations and Beneficiaries

People who are interested in using short fragment alignments will greatly benefit from the availability of this service. The service will be freely available to the LS community. We estimate that 2-5 scientific groups (5-15 researchers) worldwide will use the service. Ongoing collaborations so far: Hungarian Bioinformatics Association, Semmelweis University.


Technical Features and HP-SEE Implementation

  • Primary programming language: C/C++
  • Parallel programming paradigm: Clustered multiprocessing (e.g. using MPI) + multiple serial jobs (data splitting, parametric studies)
  • Main parallel code: WS-PGRADE/gUSE + C/C++
  • Pre/post processing code: Perl/BioPerl (in-house development)
  • Application tools and libraries: Perl/BioPerl (in-house development)
  • Number of cores required: 128-256
  • Minimum RAM/core required: 4-8 GB
  • Storage space during a single run: 2-5 GB
  • Long-term data storage: 5-10 TB
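The "multiple serial jobs (data splitting)" paradigm listed above can be sketched as follows. This is an illustrative example only, written in Python for brevity; the actual application relies on WS-PGRADE/gUSE and C/C++, and the source does not describe how it partitions its input. The function name and the gene identifiers are hypothetical. The idea is to cut the input into nearly equal chunks so that each chunk can be processed by an independent serial job.

```python
def split_into_chunks(items, n_jobs):
    """Split items into n_jobs contiguous chunks whose sizes differ by at most one."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    base, extra = divmod(len(items), n_jobs)
    chunks = []
    start = 0
    for job in range(n_jobs):
        # The first `extra` jobs each take one additional item.
        size = base + (1 if job < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Hypothetical work list: 10 placeholder gene identifiers split over 3 jobs.
items = [f"gene_{i}" for i in range(10)]
chunks = split_into_chunks(items, 3)
for job_id, chunk in enumerate(chunks):
    print(job_id, chunk)
```

With 10 items and 3 jobs the chunks hold 4, 3 and 3 items. The same function could be called with `n_jobs` set to the number of available cores (128-256 in the list above), in which case some chunks may be empty if there are fewer items than jobs.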

Usage Example

To be filled in: text and (maybe) images.

Publications

  • ...
  • ...