Carl on Storage and Network Enhancements

Carl is using the SAS staff to preview his all-hands talk on storage. He’ll be giving the same talk (with edits based on our feedback) at next Tuesday’s all-hands.

To get the extra money to redo the network and storage infrastructure, we had to demonstrate that we needed to fund a major infrastructure initiative.
Current infrastructure will not support future storage growth at the Institute.

Current model is direct-attached storage with limited backups: 40 TB on 1300 cross-mounted disks. About 10% of network load is just keep-alive messaging for the cross mounts. No accountability for data integrity, and significant hardware and network dependency.

The data volume growth is driven by both science and functional work. It will grow from routine work, and will explode with SM-4. 60% growth per year, which is doubling roughly every 18 months. 260 TB by 2010.
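As a quick sanity check on those numbers, here is a minimal sketch of the compounding. It assumes the 40 TB figure is the 2006 baseline and reads the 60% as annual growth, since that is the rate that matches an 18-month doubling time; the function names are just for illustration.

```python
import math


def projected_tb(start_tb, annual_growth, years):
    """Storage volume after compounding annual growth for `years` years."""
    return start_tb * (1 + annual_growth) ** years


def doubling_time_months(annual_growth):
    """Months needed to double at a given annual growth rate."""
    return 12 * math.log(2) / math.log(1 + annual_growth)


# Assumed baseline: 40 TB in 2006, growing 60% per year.
for year in range(2006, 2011):
    print(year, round(projected_tb(40, 0.60, year - 2006)), "TB")

print("doubling time:", round(doubling_time_months(0.60), 1), "months")
```

Under those assumptions the projection lands at about 262 TB in 2010, with a doubling time a little under 18 months, which lines up with the figures in the talk.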

Carl relayed Paul's description of how INS processes an ACS image of M51: all the steps in the pipeline, mosaicking, drizzling, combination. All of that consumes lots of space, and every step adds more. As instruments and detector packages grow, so will our storage needs, and the current network and disk space will break down.

So he walked through some scenarios about the desired end-state. Main goal is to significantly improve data integrity without losing performance.
First example was centralized storage over a gigabit network. The example was clearly adequate, provided it scales. This still only works, however, if people move from direct-attached to centralized storage.

In order to do storage, we first need to improve network bandwidth. The goal is a 10 Gbit backbone and 1 Gbit to the desktop, but with no upgrade to Bloomberg. That will stay at 10/100 to the desktop, and about a 1 Gbit backbone. In Muller, three closet switches talk to a central switch over 10 Gbit.
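To get a feel for why the network has to come first, here is a rough back-of-the-envelope sketch of how long it takes to move data at each of the link speeds mentioned. It assumes ideal line rate with no protocol overhead and decimal units (1 TB = 8e12 bits), so real transfers would be slower; the 1 TB data set is just an illustrative size.

```python
def transfer_hours(size_tb, link_gbps):
    """Hours to move size_tb terabytes over a link at link_gbps (ideal line rate)."""
    bits = size_tb * 8e12               # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9)  # Gbit/s to bit/s
    return seconds / 3600


# Link speeds from the talk: 100 Mbit desktop, 1 Gbit desktop, 10 Gbit backbone.
for label, gbps in [("100 Mbit", 0.1), ("1 Gbit", 1.0), ("10 Gbit", 10.0)]:
    print(f"1 TB over {label}: {transfer_hours(1, gbps):.2f} hours")
```

Even under these generous assumptions, a terabyte takes most of a day at 100 Mbit and a couple of hours at 1 Gbit.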

For storage, we have a central store with virtualization and ILM, so that the desktop talks over a LAN to virtual disks that are part of a central store. Data migrates from the NAS to a disk library, and then to a tape backup. All the migration off the NAS is over the SAN fabric; only client traffic goes through the NAS. Virtualization takes care of where things are, and ILM gives you the ability to define rules about where things should live. The rules are configurable, so we can be flexible based on users, file types, age, etc.
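To make the ILM idea concrete, here is a minimal sketch of what rule-based placement could look like. This is not the product's actual rule syntax; the tier names, the `.fits` example, and the age thresholds are all made-up assumptions, chosen only to show rules keyed on file type and age picking a tier.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FileInfo:
    owner: str
    suffix: str
    age_days: int


@dataclass
class Rule:
    name: str
    matches: Callable[[FileInfo], bool]
    tier: str


# Hypothetical rules, checked in order; the first match wins.
RULES: List[Rule] = [
    Rule("anything older than a year goes to tape",
         lambda f: f.age_days > 365, "tape"),
    Rule("FITS files older than 90 days go to the disk library",
         lambda f: f.suffix == ".fits" and f.age_days > 90, "disk_library"),
]


def place(f: FileInfo, rules: List[Rule] = RULES, default: str = "nas") -> str:
    """Return the tier a file should live on; unmatched files stay on the NAS."""
    for rule in rules:
        if rule.matches(f):
            return rule.tier
    return default


print(place(FileInfo("alice", ".fits", 10)))   # recent file stays on the NAS
print(place(FileInfo("alice", ".fits", 120)))  # aging FITS file moves down a tier
print(place(FileInfo("bob", ".txt", 400)))     # old file goes to tape
```

The point of keeping the rules as data is the flexibility Carl mentioned: changing the policy means editing the rule list, not the code that moves files.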

Current costs are based on engineering estimates, with no attempt to negotiate. Cable costs have gone way up, so the network is between $411K and $611K. (One third to one half of which is copper in petroleum-based insulators.) The total cost is $2.4M, of which $1.78M is for initial procurement in FY06, though the cost was split across FY06 and FY07.

Partly because of the organizational changes, Carl will continue to lead the process for a while.

And Carl will do a version of this presentation at the All-Hands next week.

