BIO-IT World
Information Technology for the Life Sciences
















Netezza Appliance Speeds Bioinformatics Data Searches, Queries


By Salvatore Salamone
Bio-IT World (online)

(10/02/03) - Data storage appliance vendor Netezza Corp. has introduced the Netezza Performance Server (NPS) data warehouse for bioinformatics.

Essentially, the NPS system lets a life science company build a "biologically aware" data warehouse that integrates sequence searches and comparisons within the same analytic database that is used for discovery tasks, without the need for working with multiple copies of data.

Specifically, the company has integrated BLAST and defined genomic data types (i.e., large nucleotide and protein text types) that can be directly searched by a type of SQL JOIN query that supports NCBI BLAST. The result is a system that has the capacity to store terabyte-sized genomic databases with dedicated hardware and software to process sequence analysis SQL queries of such databases.
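To make the idea of a sequence search expressed as a SQL JOIN concrete, here is a minimal Python sketch. It does not use Netezza's SQL syntax or NCBI BLAST, neither of which is shown in this article. It assumes SQLite as a stand-in database, a toy k-mer overlap score (the hypothetical `kmer_similarity` function) in place of BLAST, and invented table names and sequences.

```python
import sqlite3

def kmer_similarity(a: str, b: str, k: int = 3) -> float:
    """Toy score: Jaccard overlap of k-mers. A stand-in for BLAST, not BLAST."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    if not kmers_a or not kmers_b:
        return 0.0
    return len(kmers_a & kmers_b) / len(kmers_a | kmers_b)

# SQLite stands in for the warehouse; the tables and data are made up.
conn = sqlite3.connect(":memory:")
conn.create_function("kmer_similarity", 2, kmer_similarity)
conn.executescript("""
    CREATE TABLE queries (query_id TEXT, seq TEXT);
    CREATE TABLE library (subject_id TEXT, seq TEXT);
    INSERT INTO queries VALUES ('q1', 'ACGTACGTGA');
    INSERT INTO library VALUES ('s1', 'ACGTACGTGA'),
                               ('s2', 'TTTTTTTTTT'),
                               ('s3', 'ACGTACGTTT');
""")

# The sequence comparison is the JOIN condition, so the search runs
# inside the same database that holds the rest of the data.
hits = conn.execute("""
    SELECT q.query_id, l.subject_id, kmer_similarity(q.seq, l.seq) AS score
    FROM queries AS q
    JOIN library AS l ON kmer_similarity(q.seq, l.seq) >= 0.5
    ORDER BY score DESC
""").fetchall()

for query_id, subject_id, score in hits:
    print(query_id, subject_id, score)
```

Because the hits come back as ordinary rows, they could be joined to annotation or assay tables in the same query, which is the kind of integration the article describes.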

The company claims that its appliance's unique architecture eliminates some of the performance bottlenecks experienced today in bioinformatics searches of large databases. For instance, Netezza claims the NPS can do sequence similarity checking in times that are comparable to those achieved when using supercomputing clusters - all while offering the benefits of using an appliance (i.e., it's easy to manage and offers a low cost of ownership).

Netezza claims that its approach addresses the limiting factors experienced in many bioinformatics data warehouse applications today. For example, much of the effort in research today is placed on increasing the computing power available for bioinformatics work. But often, "the real problem is the data, not computer [capacity]," says Bill Blake, Netezza's senior vice president of product development.

How it works
At the heart of Netezza's approach is an architecture that addresses the limiting factors in many bioinformatics queries. In many searches, two steps slow down the entire process: retrieving data from storage systems and partitioning the data so that a query can be processed. Query performance also degrades as more simultaneous queries and more complex queries are run against a database.

The NPS addresses all of these issues using what Netezza calls an "asymmetric massively parallel processing" architecture. This architecture uses features of two other common architectures - symmetric multiprocessing (SMP) and massively parallel processing (MPP).

For example, one bottleneck in many database systems is the challenge of handling large numbers of simultaneous queries or very complex queries. To deal with this performance-limiting issue, Netezza uses an SMP-based host to compile queries in parallel while supplying the processing power to sort and aggregate large sets of query results.

Another factor that limits performance is the time it takes to simply move data. To deal with this, the NPS uses an MPP architecture to move data onto and off the multiple nodes (within the NPS) over which a large database is distributed. This addresses the input/output performance issues commonly encountered when working with terabyte-sized databases.
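The division of labor described above, with a host that coordinates queries and aggregates results and a set of nodes that each scan their own slice of the data, can be sketched in a few lines of Python. This is only an illustration of the general scatter/gather pattern, not Netezza's implementation: the node count, the hash-based row placement, the thread pool standing in for separate hardware, and all function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_NODES = 4  # hypothetical node count, chosen for the example

def assign_node(key: str) -> int:
    """Pick a node for a row by hashing its key (an assumed placement rule)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_NODES

def distribute(rows):
    """Spread the rows of one large table across the nodes."""
    nodes = [[] for _ in range(NUM_NODES)]
    for row in rows:
        nodes[assign_node(row["id"])].append(row)
    return nodes

def node_scan(partition, min_length):
    """Runs on each node: filter local rows and return a small partial result."""
    matched = [r for r in partition if r["length"] >= min_length]
    return len(matched), sum(r["length"] for r in matched)

def host_query(nodes, min_length):
    """Runs on the host: fan the scan out to all nodes, then combine partials."""
    with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
        partials = list(pool.map(lambda p: node_scan(p, min_length), nodes))
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return count, total

rows = [{"id": f"seq{i}", "length": 100 * i} for i in range(1, 11)]
nodes = distribute(rows)
result = host_query(nodes, min_length=500)
print(result)
```

The point of the pattern is that only the small partial results travel back to the host; the bulk of the data stays on the node that stores it.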

Being an appliance, the NPS is designed to fit into existing life science infrastructures. The idea is to load the NPS up with data and keep on using all existing front-end database and analytical applications without having to modify the applications themselves.

The way Netezza accomplishes this is by using common and standards-based application programming interfaces (APIs) that allow applications to submit queries against the data stored on the NPS in the same way queries would be done with any other database system. APIs supported include SQL, ODBC, and JDBC.
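As a rough illustration of why standards-based interfaces let existing applications stay unchanged, the following Python sketch keeps the application logic separate from the connection it is handed. It uses Python's DB-API with SQLite as a stand-in database; the table, column, and function names are invented for the example, and nothing here is specific to the NPS or its drivers.

```python
import sqlite3

def count_long_sequences(conn, min_length):
    """Application code: depends only on the DB-API connection it is given."""
    cur = conn.cursor()
    cur.execute(
        "SELECT COUNT(*) FROM sequences WHERE seq_length >= ?",
        (min_length,),
    )
    return cur.fetchone()[0]

# Stand-in backend for the sketch. In a real deployment the connection
# would come from whatever ODBC or JDBC bridge the site uses (an
# assumption; no driver details are given in the article).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (id TEXT, seq_length INTEGER)")
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?)",
    [("seq1", 250), ("seq2", 600), ("seq3", 1200)],
)

n = count_long_sequences(conn, 500)
print(n)
```

Only the line that opens the connection names a specific database; the query function would not need to change if the connection pointed elsewhere, provided the driver accepts the same placeholder style.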

The NPS is available now. Versions of the NPS line include models that support from 4.5 TB to 81 TB of total storage capacity. Pricing starts at $622,000.








© 2002-2003 Bio-IT World Inc.

October 02, 2003