Minutes of the JDCMG Meeting

December 1, 2005, UCSB

1. Status of Disaster Recovery at each campus:

  • UCOP
    • If a disaster starts, UCOP would have to pay $600K a month to keep things running - insufficient network bandwidth as well - likely cost $25K per day to run all their services (currently pay $210K/yr for a subset of systems - no 'distributed' systems)
    • Unix and Windows servers actually much tougher to recover since not as monolithic (DNS servers, etc. as predecessors)
    • Don't have business aspects appropriately engaged
    • If you want to be serious about this then need staff and funding
  • UCLA
    • Has 50% of a manager FTE working on this - trying to get a full FTE
    • Important functions are payroll and vendor payment
    • Smaller cost to IBM but still $75K/year
    • Trying to move to more offsite data transfers (once a week is being requested) - quicker to 30 miles away, much further (300 miles) twice a month
    • Audits discussed - campuses are being audited for IT, which has established the need for disaster recovery testing
    • Gartner report for UCLA
      • Business impact analysis
      • Risk assessment for each business
        • Cost of being down: What would be the cost per day if no e-mail services existed?
        • Timeline for restoration: Can you be down for a day?
      • Authentication, emergency web site, and emergency e-mail system are seen as critical as well
      • Again distributed servers are difficult and expensive to have disaster recovery plans - had an example with a system upgrade
      • Interesting suggestion during discussion: Have important campus people get an alternate web-based e-mail account, e.g., Yahoo or Gmail, just for emergency communications
  • UCB
    • Has an FTE for IT and one on campus for admin/emergency
    • Used IBM for disaster recovery for mainframe ($45K/year for just mainframe)
    • Other services are now seen as critical
      • e-mail
      • authentication
      • BearFacts (student information)
      • web-casting
    • Want to look at partnering with other UCs instead of paying IBM for the mainframe service only
      • E-mail
      • CalNet (Kerberos authentication?)
      • BearFacts
      • Webcast
    • Seems to be more interest in non-mainframe systems
    • Need to reconfirm with suppliers 'what is the *real* state of replacement'
  • UCSB
    • Business services claimed that two-week downtime in an emergency would be acceptable
    • To do this, would buy new gear from a vendor on one- to five-day delivery and recover from tapes
    • No additional staff has been allocated
    • Starting discussions on campus - no input yet, and some assume there is already a rapid recovery plan and system
    • No arranged locations on campus and no agreement off-campus
    • Was shipping tapes to a nearby CSU but then tapes were returned as bad quality - ended service
    • Small enough mainframe demand that can buy a spare computer
    • Datacenter HVAC failed in April destroying a disk array which brought down everything - five-day downtime
  • LBL
    • Covered by NIST standards for IT - must have alternate site for DR, but doesn't have funding for DR!
    • All distributed systems - no mainframe IBM
    • Alternate site with UCD including installing equipment next year - would be interested in seeing the MOU between UCD and LBL
    • Critical services:
      • E-mail
      • Payroll
    • Created emergency response action checklists/plan
    • Got emergency equipment - laptops, phones, radios
    • Created lots of documentation about critical services, rebuilding process
    • Created high value emergency purchasing parts - like a very high value credit card kept in a safe - interesting idea!
  • UCSD
    • Various things have been done over eight years but none of the proposals have been funded - no DR services
    • Emergency plans are written to CD and handed to emergency personnel
    • No recovery site arrangements - on- or off-campus
    • Tape backups of mainframe on a variety of timetables and shipped via Iron Mountain (35 miles away)
    • Would not like to depend on tape recovery; prefer remote disk
    • Looking at NetApp 'remote hands' software to create remote site for UNIX/Windows 'distributed' servers
    • Machine room upgrades:
      • Water detection
      • Extra network connections
      • Seismic retrofit via isolation dampers - neat stuff we saw when we visited there
    • The campus is financially unprepared to provide necessary funds to build a separate site for communications services (e-mail, Web sites, etc.)
    • Question raised: Do software contracts allow for running on a separate server, i.e., are they based on capacity rather than server count?

  • UCR
    • Systems deemed critical:
      • Payroll
      • Student information
      • E-mail
      • Network
      • Phone
    • Wants to partner with other UC site. How about a stand-alone location in AZ?

  • UCI
    • Administrative Computing plans
      • Reciprocity with NACS and HSIS
      • Mainline recovery of hardware
      • UCSD to run payroll remotely
      • No plans for distributed systems - hearing it's harder with open systems
      • Tapes sent to Iron Mountain
      • Machine room upgrades awaiting EPA permit for generator
      • Discussed UCSD/UCI plans for payroll
    • I discussed NACS issues - see my notes on presentation

  • UCD Medical
    • Changed electronic data a few days ago and can't be down for almost any length of time
    • Went with Epic
    • Good idea to have co-gen plant - from their experience over the 2002 power crises
    • Have a 2nd site at nearby hospital machine room
    • Stable environmental area
    • One SAN extended to each site

  • UCSC - new member of group from UCSC!
    • With all the new changes on campus there is no DR plan
    • Reorganization has been successful so far due to lots of planning and focus on users and needs
    • Disk failure caused part of this reorganization
    • Recent major power outage - really tested Data Center power services - resulted in increase in demand
    • Long term commitment to build out function
    • Found a 2nd site a few miles from campus - a former commercial computer site (9000 sq ft of raised floor)

  • UC Merced - new member of group from UCM!
    • UCLA does payroll for them
    • So far distributed systems only
    • Doing tape backups so far
    • School opened in September 2005
    • New telecommunication building and network
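
The dollar figures reported by the campuses above are quoted per day, per month, and per year, which makes them hard to compare at a glance. The sketch below puts them on a monthly basis. It assumes a 30-day month and assumes that the UCOP $600K/month and $25K/day figures are comparable, which the minutes do not confirm; the function and variable names are illustrative only.

```python
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def monthly_from_daily(daily_cost, days=DAYS_PER_MONTH):
    """Convert a per-day cost to a per-month cost."""
    return daily_cost * days


def monthly_from_annual(annual_cost):
    """Convert a per-year cost to a per-month cost."""
    return annual_cost / 12


# Figures as reported in the campus summaries
ucop_disaster_monthly = 600_000   # UCOP cost once a disaster starts
ucop_all_services_daily = 25_000  # UCOP estimate to run all services
annual_retainers = {
    "UCOP (subset of systems)": 210_000,
    "UCLA": 75_000,
    "UCB (mainframe only)": 45_000,
}

print(f"UCOP during a disaster: ${ucop_disaster_monthly:,.0f}/month")
print(f"UCOP all services: ${monthly_from_daily(ucop_all_services_daily):,.0f}/month")
for campus, annual in annual_retainers.items():
    print(f"{campus} retainer: ${monthly_from_annual(annual):,.0f}/month")
```

On these assumptions the UCOP retainer works out to $17,500 a month, so the cost during a declared disaster would be more than thirty times the standing cost.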

2. Gartner Review of Disaster Recovery

  • Went over the major issues that need to be planned for via DR
    • Interesting numbers
      • 2-7% of IT spending on business continuity
      • Single path of power and cooling - 99.671% uptime
      • Single path with redundant components - 99.741% uptime; not much improvement compared to the previous configuration
      • Multiple cooling and power paths but only one path active - 99.982% uptime
    • Useful discussion afterwards:
      • Are there universities or systems that have used Gartner services for DR?
      • There is a negotiation with Gartner research across UC for research purposes
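
The uptime percentages above are easier to judge as hours of downtime per year. A minimal sketch of that conversion follows, assuming a 365-day year (8,760 hours) and treating each figure as an annual average; the function name and dictionary labels are illustrative, not from the Gartner material.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def annual_downtime_hours(uptime_percent):
    """Expected hours of downtime per year for a given uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


# Uptime figures as quoted in the minutes
configurations = {
    "Single path of power and cooling": 99.671,
    "Single path with redundant components": 99.741,
    "Multiple paths, one active": 99.982,
}

for name, uptime in configurations.items():
    hours = annual_downtime_hours(uptime)
    print(f"{name}: about {hours:.1f} hours of downtime per year")
```

This gives roughly 28.8, 22.7, and 1.6 hours per year respectively, which is consistent with the note that redundant components on a single path bring little improvement while multiple paths bring a large one.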

3. Update on IBM negotiations from Karen Melick
  • We thought we had a deal on hardware, but IBM said it's not going to work since there is no published IBM 'list pricing' for mainframes - so they can't quote us a price
  • Software side: we're close to a contract that's on the table. IBM wants it done by end of the calendar year. They are forcing us to choose a list of products but will have a substitution clause for other products or new ones. Changes will be a major issue if not on the substitution list.
  • Lots of discussion about particulars, including commitment to a 3-year agreement
  • Agreement that UCOP would do license administration
  • Does Karen continue this process or not? She's going to continue, but there is not a lot of enthusiasm for the level of discount we got
  • Lots of talk about stopping any development on IBM systems, since pricing has stayed unchanged despite attempts to negotiate based on grouping our expenses

4. Discussion on how to improve DR planning across UC

  • Karen summarized our morning campus summaries
  • Karen is creating a document to submit to ITLC about our input on DR planning
    • Need two people to assess what needs to be done and what's important to do
    • Determine costs (as a comparison, about $50K per campus for a Gartner BIA report)

Next Meetings:
  • Conference call
      January 13th and February 10th
  • Meeting
      March 9th at UCOP
For questions or changes to the minutes please contact Charlotte Klock, Director, UCSD Data Center, (858) 822-1223.
 
 
Copyright © 2007 The Regents of the University of California, All Rights Reserved. UC Joint Data Center Management Group (JDCMG)
Updated: January 26, 2010