Minutes of the JDCMG Meeting
December 1, 2005, UCSB
- UCOP
- If a disaster occurs, UCOP would have to pay $600K a month to keep things running, and network bandwidth would be insufficient as well. Running all their services would likely cost $25K per day (they currently pay $210K/yr for a subset of systems, with no 'distributed' systems covered)
- Unix and Windows servers are actually much tougher to recover since they are not as monolithic (DNS servers, etc. have to be restored first)
- The business side is not appropriately engaged
- If you want to be serious about this, you need staff and funding
- UCLA
- Has 50% of a manager FTE working on this - trying to get a full FTE
- Important functions are payroll and vendor payment
- Smaller cost paid to IBM, but still $75K/year
- Trying to move to more frequent offsite data transfers (once a week is being requested) - quicker transfers to a site 30 miles away, and transfers to a much farther site (300 miles) twice a month
- Audits discussed - campuses are getting audited for IT, which has established the need for disaster recovery testing
- Gartner report for UCLA
- Business impact analysis
- Risk assessment for each business
- Cost of being down: What would be the cost per day if no e-mail services existed?
- Timeline for restoration: Can you be down for a day?
- Authentication, emergency web site, and emergency e-mail system are seen as critical as well
- Again, distributed servers are difficult and expensive to cover with disaster recovery plans - had an example with a system upgrade
- Interesting suggestion during discussion: Have important campus people get an alternate web-based e-mail account, e.g., Yahoo or Gmail, just for emergency communications
- UCB
- Has an FTE for IT and one on campus for admin/emergency
- Used IBM for disaster recovery for mainframe ($45K/year for just mainframe)
- Other services are now seen as critical
- e-mail
- authentication
- BearFacts (student information)
- web-casting
- Want to look at partnering with other UCs instead of paying IBM for the mainframe service only
- E-mail
- CalNet (Kerberos authentication?)
- BearFacts
- Webcast
- There seems to be more interest in non-mainframe systems
- Need to reconfirm with suppliers 'what is the *real* state of replacement'
- UCSB
- Business services claimed that two-week downtime in an emergency would be acceptable
- To do this, the plan is to buy new gear from vendors on one- to five-day delivery and recover from tapes
- No additional staff has been allocated
- Starting discussions on campus - no input yet, and some assume there is already a rapid recovery plan and system
- No arranged locations on campus and no agreement off-campus
- Was shipping tapes to a nearby CSU, but then the tapes were returned as being of bad quality - ended service
- Mainframe demand is small enough that a spare computer can be bought
- Datacenter HVAC failed in April, destroying a disk array, which brought down everything - five-day downtime
- LBL
- Covered by NIST standards for IT - must have alternate site for DR, but doesn't have funding for DR!
- All distributed systems - no IBM mainframe
- Alternate site arrangement with UCD, including installing equipment next year - the group would be interested in seeing the MOU between UCD and LBL
- Critical services:
- Created emergency response action checklists/plan
- Got emergency equipment - laptops, phones, radios
- Created lots of documentation about critical services, rebuilding process
- Set up high-value emergency purchasing - like a very high value credit card kept in a safe - interesting idea!
- UCSD
- Various things have been done over eight years, but none of the proposals have been funded - no DR services
- Emergency plans are written to CD and handed to emergency personnel
- No recovery site arrangements, on- or off-campus
- Tape backups of mainframe on a variety of timetables and shipped via Iron Mountain (35 miles away)
- Would not like to depend on tape recovery; prefer remote disk
- Looking at NetApp 'remote hands' software to create a remote site for UNIX/Windows 'distributed' servers
- Machine room upgrades:
- Water detection
- Extra network connections
- Seismic retrofit via isolation dampers - neat stuff we saw when we visited there
- The campus is financially unprepared to provide necessary funds to build a separate site for communications services (e-mail, Web sites, etc.)
- Question raised: Do software contracts allow for running on a separate server, i.e., are they based on capacity rather than on the number of servers?
- UCR
- Systems deemed critical:
- Payroll
- Student information
- E-mail
- Network
- Phone
- Wants to partner with other UC site. How about a stand-alone location in AZ?
- UCI
- Administrative Computing plans
- Reciprocity with NACS and HSIS
- Mainline recovery of hardware
- UCSD to run payroll remotely
- No plans for distributed systems - hearing it's harder with open systems
- Tapes sent to Iron Mountain
- Machine room upgrades awaiting EPA permit for generator
- Discussed UCSD/UCI plans for payroll
- I discussed NACS issues - see my notes on presentation
- UCD Medical
- Changed electronic data systems a few days ago and can't be down for almost any length of time
- Went with Epic
- Good idea to have a co-gen plant - based on their experience during the 2002 power crises
- Have a 2nd site at nearby hospital machine room
- Stable environmental area
- One SAN extended to each site
- UCSC - new member of group from UCSC!
- With all the new changes on campus there is no DR plan
- Reorganization has been successful so far due to lots of planning and focus on users and needs
- Disk failure caused part of this reorganization
- Recent major power outage - really tested Data Center power services - resulted in an increase in demand
- Long term commitment to build out function
- Found a 2nd site a few miles from campus - a former commercial computer site (9000 sq ft of raised floor)
- UC Merced - new member of group from UCM!
- UCLA does payroll for them
- So far distributed systems only
- Doing tape backups so far
- School opened in September 2005
- New telecommunication building and network
- Went over the major issues that need to be planned for in DR
- Interesting numbers
- 2-7% of IT spending on business continuity
- Single path of power and cooling - 99.671% uptime
- Single path with redundant components - 99.741% uptime - not much improvement compared to previous
- Multiple cooling and power paths but only one path active - 99.982% uptime
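To put the uptime figures above in perspective, here is a minimal sketch that converts each one into expected hours of downtime per year. It assumes the figures are percentages of a full 8760-hour year (the minutes do not state the measurement period); the function and variable names are illustrative only.

```python
# Convert the uptime figures quoted in the minutes into hours of downtime per year.
# Assumption: the figures are percentages measured over a 365-day year.
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_downtime_hours(uptime_percent: float) -> float:
    """Return the expected hours of downtime per year for a given uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


# Figures as recorded in the minutes
configurations = {
    "Single path of power and cooling": 99.671,
    "Single path with redundant components": 99.741,
    "Multiple paths, only one active": 99.982,
}

for name, uptime in configurations.items():
    hours = annual_downtime_hours(uptime)
    print(f"{name}: {uptime}% uptime -> about {hours:.1f} hours down per year")
```

Under that assumption the three configurations work out to roughly 28.8, 22.7, and 1.6 hours of downtime per year, which matches the note that adding redundant components on a single path changes little compared to adding a second path.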
- Useful discussion afterwards:
- Are there universities or systems that have used Gartner services for DR?
- There is a negotiation with Gartner research across UC for research purposes
- We thought we had a deal on hardware, but IBM said it's not going to work since there is no published IBM 'list pricing' for mainframes, so they can't quote us a price
- Software side: we're close to a contract that's on the table. IBM wants it done by end of the calendar year.
They are forcing us to choose a list of products but will have a substitution clause for other products or new ones.
Changes will be a major issue if not on the substitution list.
- Lots of discussion about particulars, including the commitment to a 3-year agreement
- Agreement that UCOP would do license administration
- Does Karen continue this process or not?
She's going to continue, but there is not a lot of enthusiasm for the level of discount we got
- Lots of talk about stopping any development on IBM systems, since pricing has remained unchanged despite attempts to negotiate based on grouping our expenses
- Karen summarized our morning campus summaries
- Karen is creating a document to submit to ITLC about our input on DR planning
- Need two people to assess what needs to be done and what's important to do
- Determine costs (for comparison, about $50K per campus for a Gartner BIA report)
- Conference call: January 13th and February 10th
- Meeting
For questions or changes to the minutes please contact Charlotte Klock,
Director, UCSD Data Center, (858) 822-1223.