DiseaseGene

From HP-SEE Wiki


Revision as of 20:26, 7 July 2011

General Information

  • Application's name: DiseaseGene - In-silico Disease Gene Mapper
  • Virtual Research Community: Life Sciences
  • Scientific contact: Kozlovszky Miklos, Windisch Gergely; m.kozlovszky at sztaki.hu
  • Technical contact: Kozlovszky Miklos, Windisch Gergely; m.kozlovszky at sztaki.hu
  • Developers: Biotech Group, Obuda University – John von Neumann Faculty of Informatics
  • Web site: https://biotech.nik.bmf.hu/web/

Short Description

DiseaseGene is a data mining and data processing tool that draws on large-scale, external, open-access databases. The aim of the task is to port the tool to the SEE-HPC infrastructure, where it can help researchers perform comparative analysis and identify candidate genes for further research into polygenic diseases. The implemented solution can identify candidate genes for diseases such as asthma, diabetes, epilepsy, hypertension and schizophrenia using external, online, open-access eukaryotic databases (model animals: mouse, rat, B. rerio, etc.). The application performs an in-silico mapping between the genes of the different model animals and searches for unexplored potential target genes. With small modifications the application can also be used to target human genes.

Problems Solved

The implemented solution can identify candidate genes for diseases such as asthma, diabetes, epilepsy, hypertension and schizophrenia using external, online, open-access eukaryotic databases (model animals: mouse, rat, B. rerio, etc.). The application performs an in-silico mapping between the genes of the different model animals and searches for unexplored potential target genes. With small modifications the application can also be used to target human genes. The reliability parameters and response time (1-5 min) of the Grid are not suitable for such a service.
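
The cross-species mapping described above can be illustrated with a minimal sketch. This is not the DiseaseGene implementation (which is written in C/C++ with Perl/BioPerl pre/post processing); it only assumes that each model-animal gene can be assigned to a shared ortholog group, and that a group counts as an unexplored candidate when it is reported in several model animals but is not yet on a list of known targets. All gene names, group IDs, the `min_species` threshold and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical ortholog table: (species, gene) -> shared ortholog group ID.
ORTHOLOGS = {
    ("mouse", "GeneA"): "OG1",
    ("rat", "GeneA"): "OG1",
    ("zebrafish", "geneA"): "OG1",
    ("mouse", "GeneB"): "OG2",
    ("rat", "GeneB"): "OG2",
    ("mouse", "GeneC"): "OG3",
}

# Hypothetical disease-associated genes reported for each model animal.
DISEASE_GENES = {
    "mouse": {"GeneA", "GeneB", "GeneC"},
    "rat": {"GeneA", "GeneB"},
    "zebrafish": {"geneA"},
}


def candidate_groups(disease_genes, orthologs, known_groups, min_species=2):
    """Return ortholog groups supported by at least `min_species` model animals
    and not already in `known_groups`, mapped to the supporting species."""
    support = defaultdict(set)
    for species, genes in disease_genes.items():
        for gene in genes:
            group = orthologs.get((species, gene))
            if group is not None:  # genes without a known ortholog are skipped
                support[group].add(species)
    return {
        group: sorted(species_set)
        for group, species_set in support.items()
        if len(species_set) >= min_species and group not in known_groups
    }


# OG1 is treated as an already explored target, so only OG2 remains.
result = candidate_groups(DISEASE_GENES, ORTHOLOGS, known_groups={"OG1"})
print(result)
```

In this toy data set OG3 is dropped because only one model animal supports it, and OG1 is dropped because it is already known; a real run would read both tables from the external databases instead of hard-coding them.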

Scientific and Social Impact

Researchers in the region will be able to identify candidate genes for further research into polygenic diseases. The project also provides a data mining service on the SEE-HPC infrastructure that can help researchers perform comparative analysis.

Collaborations and Beneficiaries

People who are interested in using short fragment alignments will benefit greatly from the availability of this service. The service will be freely available to the LS community. We estimate that 2-5 scientific groups (5-15 researchers) worldwide will use our service. Ongoing collaborations so far: Hungarian Bioinformatics Association, Semmelweis University.


Technical Features and HP-SEE Implementation

  • Primary programming language: C/C++
  • Parallel programming paradigm: Clustered multiprocessing (e.g. using MPI) + Multiple serial jobs (data-splitting, parametric studies)
  • Main parallel code: WS-PGRADE/gUSE + C/C++
  • Pre/post processing code: Perl/BioPerl (in-house development)
  • Application tools and libraries: Perl/BioPerl (in-house development)
  • Number of cores required: 128-256
  • Minimum RAM/core required: 4-8 GB
  • Storage space during a single run: 2-5 GB
  • Long-term data storage: 5-10 TB
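
The "multiple serial jobs (data-splitting)" paradigm listed above can be sketched as follows. This is only an illustration of the general idea, assuming the input is a flat list of work items (here hypothetical gene identifiers) that is divided into near-equal chunks, one per serial job; how DiseaseGene actually partitions its data and submits jobs through WS-PGRADE/gUSE is not described on this page. The function name and the item count are assumptions.

```python
def split_into_jobs(items, n_jobs):
    """Split `items` into at most `n_jobs` contiguous chunks
    whose sizes differ by at most one."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    # Never create more chunks than there are items (but always at least one).
    n_jobs = min(n_jobs, len(items)) or 1
    base, extra = divmod(len(items), n_jobs)
    chunks, start = [], 0
    for i in range(n_jobs):
        # The first `extra` chunks receive one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Hypothetical workload: 1000 gene identifiers spread over 128 serial jobs,
# the lower bound of the core count listed above.
genes = [f"gene{i}" for i in range(1000)]
chunks = split_into_jobs(genes, 128)
print(len(chunks), min(len(c) for c in chunks), max(len(c) for c in chunks))
```

Each chunk could then be written to its own input file and processed by an independent serial job, with the per-chunk results merged in a post-processing step.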

Usage Example

Tobefilledin, text and (maybe) images.

Publications

  • ...
  • ...