DNAMA

From HP-SEE Wiki


Revision as of 12:33, 4 October 2012

General Information

  • Application's name: DNA Multicore Analysis
  • Application's acronym: DNAMA
  • Virtual Research Community: Life Sciences
  • Scientific contact: Danilo Mrdak, danilomrdak@gmail.com
  • Technical contact: Luka Filipovic, lukaf@ac.me
  • Developers: Center of Information System & Faculty of Natural Sciences - University of Montenegro
  • Web site: http://wiki.hp-see.eu/index.php/DNAMA

Short Description

Using a networked computer cluster with supercomputer-class performance for DNA sequence analysis greatly expands what is feasible in DNA research, both in the number of sequences that can be analyzed and in the time an analysis takes. Many DNA comparison and analysis programs use Monte Carlo and Markov chain algorithms whose running time grows with the number of sequences, so supercomputer resources speed up the work and make robust, comprehensive analyses possible. Using all published sequences for one group (e.g. all salmonid species: salmon, trout, grayling, huchen) from the same DNA region (mitochondrial D-loop DNA, cytochrome b gene, …) will give a more detailed insight into their phylogenetic relationships.

The DNAMA application is based on the RAxML application from The Exelixis Lab.

Problems Solved

The computing resources available through networked computer clusters will allow us to analyze as many samples as we wish and to finish those analyses within one to a few hours. Moreover, we will try to modify the algorithms to support multi-loci analysis, producing a consensus tree that suggests the most probable phylogenetic pathways with a much higher level of confidence.

Scientific and Social Impact

Using a networked computer cluster with supercomputer-class performance for DNA sequence comparison greatly expands what is feasible in DNA research. Using all published sequences for one group (e.g. all salmonid species: salmon, trout, grayling, huchen) from the same DNA region (mitochondrial D-loop DNA, cytochrome b gene, …) will give a more detailed insight into their phylogenetic relationships (RAxML).

Problems solved: Analyzing as many samples as possible within a few hours. Modifying the algorithms to support multi-loci analysis, producing a consensus tree that suggests the most probable phylogenetic pathways with a much higher level of confidence.

Impact: Access to reliable computing resources is one of the main obstacles to scientific breakthroughs in the fields of Molecular Biology and Phylogeny (Evolution). HPC allows for faster and more reliable results. It also enhances competitiveness in regional and European collaboration and draws the attention of national stakeholders to the future building of Montenegro as a "society of knowledge".

Collaborations

Beneficiaries

  • University of Montenegro - Faculty of Natural Sciences - Biology Department

Number of users

15-20 from the Faculty of Natural Sciences, University of Montenegro

Development Plan

  • Concept: before the start of the project - finished by RAxML developers
  • Start of alpha stage: before the start of the project
  • Start of beta stage: 09.2010
  • Start of testing stage: 03.2011
  • Start of deployment stage: 11.2011
  • Start of production stage: 11.2011

Resource Requirements

  • Number of cores required for a single run: up to 512 cores
  • Minimum RAM/core required: 1 GB
  • Storage space during a single run: 256 MB
  • Long-term data storage: 1 GB
  • Total core hours required: .

Technical Features and HP-SEE Implementation

  • Primary programming language: C
  • Parallel programming paradigm: MPI, OpenMPI
  • Main parallel code: MPI, OpenMPI
  • Pre/post processing code: C, Dendroscope (for visualization of results)
  • Application tools and libraries: RAxML

Usage Example

Execution from command line:

  /opt/exp_software/mpi/mpiexec/mpiexec-0.84-mpich2-pmi/bin/mpiexec -np 128 /home/lukaf/raxml/RAxML-7.2.6/raxmlHPC-MPI -m GTRGAMMA -s /home/lukaf/raxml/trutte_input.txt -# 1000 -n T16x8
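The command line above can also be assembled programmatically, which helps when the same run is repeated with different process counts or input files. The following is a minimal Python sketch, not part of DNAMA itself: the helper name build_raxml_command is hypothetical, and it only reproduces the flags that appear in the example above (-np, -m, -s, -#, -n) without checking that the paths exist.

```python
import shlex


def build_raxml_command(mpiexec, raxml_binary, alignment, run_name,
                        processes=128, model="GTRGAMMA", runs=1000):
    """Return the mpiexec argument list for one RAxML MPI run.

    Hypothetical helper: it only mirrors the flags used in the usage
    example (-np, -m, -s, -#, -n) and does not validate the paths.
    """
    if processes < 1 or runs < 1:
        raise ValueError("processes and runs must be positive")
    return [
        mpiexec, "-np", str(processes),  # number of MPI processes
        raxml_binary,
        "-m", model,                     # substitution model
        "-s", alignment,                 # input alignment file
        "-#", str(runs),                 # number of runs
        "-n", run_name,                  # suffix for output file names
    ]


if __name__ == "__main__":
    cmd = build_raxml_command(
        "/opt/exp_software/mpi/mpiexec/mpiexec-0.84-mpich2-pmi/bin/mpiexec",
        "/home/lukaf/raxml/RAxML-7.2.6/raxmlHPC-MPI",
        "/home/lukaf/raxml/trutte_input.txt",
        "T16x8",
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed directly to subprocess.run without shell quoting issues.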

Infrastructure Usage

  • Home system: HPCG, Bulgaria
    • Applied for access on: 05.2011
    • Access granted on: 05.2011
    • Achieved scalability: up to 256 cores
  • Accessed production systems:
  1. Debrecen SC, NIIF
    • Applied for access on: 02.2012
    • Access granted on: 03.2012
    • Achieved scalability: 240 cores
  2. Pecs SC, NIIF, HU
    • Applied for access on: 02.2012
    • Access granted on: 03.2012
    • Achieved scalability: ... cores
  • Porting activities: ...
  • Scalability studies: ...

Running on Several HP-SEE Centres

  • Benchmarking activities and results: .
  • Other issues: .

Achieved Results

Publications

  • Luka Filipović, Danilo Mrdak and Božo Krstajić, "Performance evaluation of computational phylogeny software in parallel computing environment", ICT Innovations 2012

Foreseen Activities

  • Data analysis for new DNA sequences
  • Multigene analysis (mtDNA, Cyt b, …) and 3 simulated genes
  • Benchmark activities for MPI, PThreads & Hybrid version