Nexenta Cluster at the German Cancer Research Center (Deutsches Krebsforschungszentrum – DKFZ)

As the largest biomedical research institution in Germany and a member of the Helmholtz Association of German Research Centres, the German Cancer Research Center (DKFZ) is, in accordance with its articles of association, dedicated exclusively to cancer research.

The research center's staff of more than 3,000 employees, more than 1,000 of them scientists, investigate the causes of cancer in more than 90 departments and research groups, record risk factors for cancer, and seek strategies to prevent people from developing cancer.

They also develop new approaches for more precise tumor diagnosis and more successful treatment of cancer patients. Modern procedures that allow the analysis of complete genome sequences (genomes) or complete sets of proteins (proteomes) generate enormous data volumes that cannot be handled by human beings alone.

Mission

While most bioinformatics procedures of the past few years were aimed at analysing genome sequences, they are now primarily used to clarify the relationships between the higher-order organisation of the genome and its multiple functions.

The objective is to achieve a holistic understanding of complex biological processes. To support this work, the department required a highly available NFS server, to be used primarily for storing the home directories of Unix users.

In addition, the server was intended to allow the secure exchange of sensitive research data between the research groups and their cooperation partners, and to host the centrally installed bioinformatics programs and molecular-biological reference data required for a Linux cluster.

The Nexenta HA cluster was purchased because it provides uninterrupted operation and simplified administration at the same time, while ensuring data security through NFSv4 and Kerberos.

Up to 50 users can access a Linux HPC cluster with 50 nodes and 1,600 cores via the system and process a variety of tasks from the fields of next-generation sequencing, network modelling, and systems biology.

Solution

EUROstor offers cluster systems with Nexenta software that provide especially highly available NFS services. To keep the complete hardware redundant, the two cluster nodes are connected to two JBODs containing the hard disk drives.

The disk drives are mirrored in pairs, with one drive of each pair in each JBOD, so that even the failure of a complete JBOD does not interrupt data availability. As the file system, ZFS ensures high data integrity by means of checksums, which also make it easy to detect and correct silent errors (“stealth errors”) on individual disk drives.
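The interplay of mirroring and checksums can be illustrated with a small toy model. This is a conceptual sketch only, not how ZFS is implemented: the class `MirroredBlock` and its `read` method are hypothetical names, and SHA-256 is used as the checksum purely for illustration. On a read, each mirror copy is verified against the stored checksum; the first copy that verifies is returned and is used to overwrite any copy that does not.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest used to verify a block (illustrative choice: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Toy model of one data block stored on a two-way mirror (hypothetical)."""

    def __init__(self, data: bytes):
        # The checksum is kept separately from the data copies.
        self.expected = checksum(data)
        # One copy per mirror side, e.g. one drive in each JBOD.
        self.copies = [bytearray(data), bytearray(data)]

    def read(self) -> bytes:
        """Return verified data and repair any copy that fails verification."""
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                # Overwrite every corrupted copy with the verified one.
                for i, other in enumerate(self.copies):
                    if checksum(bytes(other)) != self.expected:
                        self.copies[i] = bytearray(good)
                return good
        raise IOError("all mirror copies failed checksum verification")


block = MirroredBlock(b"reference genome chunk")
block.copies[0][0] ^= 0xFF  # simulate a silent bit error on the first drive
data = block.read()         # served from the intact copy; bad copy is repaired
```

Without the checksum, the mirror alone could not tell which of two differing copies is the correct one; with it, the corruption is both detected and corrected during a normal read.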

ZFS also allows a nearly unlimited number of snapshots to be created in order to restore an earlier state of the data. On this file system, NFS volumes can be made accessible to the clients transparently via virtual IP addresses.

To provide high-speed data access, SAS disks are used together with RAM disks for the ZIL (the ZFS intent log, which absorbs synchronous writes) and SSDs for the L2ARC (the second-level read cache), both from Stec/HGST.
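As a rough illustration of how such a layout fits together, the following sketch assembles a `zpool create` command line with mirrored data pairs, a log device for the ZIL, and a cache device for the L2ARC. It assumes that each mirror pair spans the two JBODs, as the failure tolerance described above implies. The actual pool layout at DKFZ is not documented here; the helper `build_zpool_create` and all pool and device names are hypothetical, and the command is only built, never executed.

```python
def build_zpool_create(pool, jbod_a, jbod_b, log_devices, cache_devices):
    """Build a `zpool create` argument list (hypothetical helper).

    jbod_a / jbod_b: equally long lists of data disks, one list per JBOD.
    log_devices:     devices for the ZIL (mirrored if more than one is given).
    cache_devices:   devices for the L2ARC read cache.
    """
    if len(jbod_a) != len(jbod_b):
        raise ValueError("both JBODs must contribute the same number of disks")

    cmd = ["zpool", "create", pool]

    # Pair disk i of JBOD A with disk i of JBOD B, so that losing a whole
    # JBOD still leaves one intact side of every mirror.
    for disk_a, disk_b in zip(jbod_a, jbod_b):
        cmd += ["mirror", disk_a, disk_b]

    if log_devices:
        cmd.append("log")
        if len(log_devices) > 1:
            cmd.append("mirror")
        cmd += log_devices

    if cache_devices:
        # Cache devices are not mirrored; they hold only copies of pool data.
        cmd += ["cache", *cache_devices]

    return cmd


# All names below are placeholders, not real device paths.
command = build_zpool_create(
    pool="tank",
    jbod_a=["jbodA-d0", "jbodA-d1"],
    jbod_b=["jbodB-d0", "jbodB-d1"],
    log_devices=["ram0", "ram1"],
    cache_devices=["ssd0"],
)
print(" ".join(command))
```

Pairing by position across the two enclosures is the design choice that matters: if both disks of a mirror sat in the same JBOD, the loss of that JBOD would take the whole pool offline.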

ES-2800 Nexenta HA Cluster with shared Storage:
