Date

Attendees

  • PSNC, GRNET, CSC, ARNES, UNINETT, Uni.Pisa, GÉANT

Goals

  • Organizing work together to get a first release of the TCO calculator

Discussion items

Preliminary TCO calculator tool

A preliminary tool has been designed and distributed by PSNC.

The initial aim was to calculate the minimum volume of a local storage solution for PSNC.

The numbers in the tool are dummy numbers.

Re-use other tools or build our own?

A short review of the existing material was made, including:

  • SNIA cost calculator – models too simplistic, failed to simulate e.g. Ceph-based cluster performance
  • Wmarow's IOPS calculator – might be worth a look as it includes some modelling

TCO Tool improvements/requirements

Here is a summary of the discussions around the tool

 Quick summary of suggested improvements / extensions:

  • split "areas" into multiple sheets – so that we can distribute the work
  • it would be a good idea to have a sheet on example / reference server configs and disk parameters
  • network uplinks – include 1Gbit for management in rack space / ports budget – see network part
  • cooling efficiency factor – might be included within power price – see electricity
  • collecting failure rates would be good (again might be perceived as sensitive)

Power consumption:

  • Power for the disks must be separated from the power for the main board and other server components: memory, network interfaces etc.
  • Historical data on power consumption needs to be analysed for the different storage architectures.
  • Various architectures may need power consumption modelling. Work on the models as options to select.
  • GRNET has some data collected – Panos will check if they can be explored and will share conclusions/data
  • CSC has some data / models – options what can be shared will be checked.
  • PSNC will be able to provide the data it is planning to collect.
  • We need long-term averaged measurements, not point-in-time data, as the power consumption varies depending on system activity.
  • We expect real-life data to be more usable than models; however, modelling should still be explored.
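The point about long-term averages versus point-in-time readings can be illustrated with a minimal sketch. It assumes power readings arrive as (timestamp in seconds, watts) pairs; the function names and the sample values are hypothetical and are not taken from the TCO tool.

```python
from typing import Sequence, Tuple


def time_weighted_average_power(samples: Sequence[Tuple[float, float]]) -> float:
    """Return the average power in watts over the sampled interval.

    samples: (timestamp_seconds, watts) pairs sorted by time.
    Trapezoidal integration is used so that unevenly spaced readings
    are weighted by the time they cover.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    energy_joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        energy_joules += 0.5 * (p0 + p1) * (t1 - t0)
    duration = samples[-1][0] - samples[0][0]
    return energy_joules / duration


def annual_energy_kwh(average_watts: float, hours_per_year: float = 8760.0) -> float:
    """Convert an average power draw into energy used per year, in kWh."""
    return average_watts * hours_per_year / 1000.0


# Dummy readings: idle for 10 s, then ramping up under load.
readings = [(0.0, 100.0), (10.0, 100.0), (20.0, 300.0)]
average = time_weighted_average_power(readings)
print(average, annual_energy_kwh(average))
```

A single reading taken at the end of this interval would report 300 W, twice the time-weighted average of 150 W, which is why averaged measurements are preferred as input.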

Staff expenses

  • Costs differ from country to country; real numbers should not be shared in public as they are politically sensitive.
  • Overhead should be included in the total cost of an employee.
  • 0.5 FTE for maintenance, including both hardware and operating system, is realistic.
  • The lower the quality of the hardware (i.e. the cheaper it is), the more HW maintenance you need. This factor can be based on experimental data.
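As a rough sketch of how these staff points could combine into one figure: the formula below is an assumption (loaded salary times FTE share times a hardware-quality multiplier), and the parameter names and numbers are hypothetical dummy values, in line with the decision not to publish real ones.

```python
def annual_staff_cost(
    gross_salary: float,
    overhead_rate: float,
    fte: float = 0.5,
    hw_maintenance_factor: float = 1.0,
) -> float:
    """Yearly staff cost attributed to the storage system.

    gross_salary          -- yearly salary of one employee (dummy value)
    overhead_rate         -- employer overhead as a fraction of salary, e.g. 0.25
    fte                   -- share of a full-time employee spent on maintenance
    hw_maintenance_factor -- assumed multiplier > 1.0 for cheaper hardware,
                             to be calibrated from experimental data
    """
    loaded_cost = gross_salary * (1.0 + overhead_rate)
    return loaded_cost * fte * hw_maintenance_factor


# Dummy numbers only.
print(annual_staff_cost(40_000.0, 0.25))
print(annual_staff_cost(40_000.0, 0.25, hw_maintenance_factor=1.2))
```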

OPEX

  • Data Centre operations costs (using the existing facility), including building maintenance, security, UPS, etc., can be virtualized and factored in.
  • The same model can be applied for rented spaces (co-location) – we should enable calculating in RUs
  • Cooling can be part of these costs or a separate part of the electricity calculation (for now it is part of the electricity cost, but the impression is that it should be analysed more explicitly in the model in order to enable more detailed analysis using various parameters)
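One possible way to make the cooling term explicit while still charging facility costs per RU is sketched below. The multiplicative cooling factor (similar in spirit to a PUE value) and all names and numbers are assumptions for illustration, not the formula used in the current sheet.

```python
def annual_electricity_cost(
    average_it_watts: float,
    price_per_kwh: float,
    cooling_factor: float = 1.0,
    hours_per_year: float = 8760.0,
) -> float:
    """Yearly electricity cost, with cooling as an explicit multiplier.

    cooling_factor = 1.0 means cooling is already folded into price_per_kwh;
    a value such as 1.5 adds 50% on top of the IT load for cooling.
    """
    kwh = average_it_watts / 1000.0 * hours_per_year
    return kwh * price_per_kwh * cooling_factor


def annual_facility_cost(rack_units: int, cost_per_ru_year: float) -> float:
    """Yearly facility or co-location cost, charged per rack unit (RU)."""
    return rack_units * cost_per_ru_year


# Dummy numbers: 1 kW average load, 0.10 per kWh, 6 RUs at 50 per RU per year.
print(annual_electricity_cost(1000.0, 0.10, cooling_factor=1.5))
print(annual_facility_cost(6, 50.0))
```

Keeping the cooling factor as a separate parameter lets the same sheet cover both the current approach (factor 1.0, cooling inside the price) and a more detailed analysis.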

Network

  • Cost of the switches is included (10 Gbit ports to servers, pair of ToR switches with uplink – see below)
  • Uplink component is missing for. Fibre Channel could also be considered as an alternative technology.
  • At least 10G ports for uplink per switch and a network for management (access to IPMI interfaces) must be considered. In some cases, 40G ports.
  • Should the network be redundant? It depends on the size of the setup and the other redundancy features of the architecture. Network redundancy can be checked against the desired availability figures. (We should be careful, as saving money on switches may force a high data replication factor, which brings its own costs.)
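A small sketch of the port budget implied by the points above, assuming one 1G management (IPMI) port per server and data ports spread evenly over the ToR switches. The function name, defaults and layout are hypothetical.

```python
import math
from typing import Dict


def tor_port_budget(
    servers: int,
    data_ports_per_server: int = 2,
    uplinks_per_switch: int = 1,
    redundant: bool = True,
) -> Dict[str, int]:
    """Count switches and ports needed for one group of servers.

    Assumptions: a redundant setup uses a pair of ToR switches and each
    server's 10G data ports are split evenly between them; every server
    also needs one 1G port on the management network.
    """
    switches = 2 if redundant else 1
    data_ports_per_switch = math.ceil(servers * data_ports_per_server / switches)
    return {
        "switches": switches,
        "data_ports_per_switch": data_ports_per_switch,
        "uplink_ports_per_switch": uplinks_per_switch,
        "management_ports": servers,
    }


print(tor_port_budget(20))
print(tor_port_budget(20, redundant=False))
```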

IOPS

  • There are some Java tools available. Panos, GRNET will play with them and share experiences.
  • Calculations can be included in a separate sheet.

Modelling

  • Electricity, Cooling factor, Staff expenses, and other costs (i.e. OPEX above) are the main cost components we have to calculate with.
  • The number of racks used should be an open parameter in the model, to allow for different disk/server/rack densities.
  • Consider not only full racks but also fractions of a rack (a couple of RUs). In some cases this level of granularity is needed.
  • We could come up with 4-5 different server configurations as different calculation models to use. Footnotes are needed to explain the differences and the reasoning behind the models.
  • Create separate XLS sheets per cost component, and re-distribute the calculator.
  • Share numbers on power consumption, if possible.
  • Using a Google doc with dummy numbers on sensitive data (staffing, etc.) is fine for now. Some XLS macros may need to be developed off-line. Let's decide how to share the final product later.
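The modelling points above could be tied together roughly as follows. This is a sketch under stated assumptions: a 42 RU rack, a hypothetical ServerConfig record standing in for one of the 4-5 reference configurations, and dummy numbers throughout.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

RACK_UNITS_PER_RACK = 42  # assumption: standard full-height rack


@dataclass
class ServerConfig:
    """One reference server configuration (all values are dummy numbers)."""
    name: str
    rack_units: int
    disks: int
    disk_capacity_tb: float
    purchase_price: float


def servers_and_rack_units(config: ServerConfig, raw_capacity_tb: float) -> Tuple[int, int]:
    """Return (number of servers, total RUs) needed for the requested raw capacity."""
    capacity_per_server = config.disks * config.disk_capacity_tb
    servers = math.ceil(raw_capacity_tb / capacity_per_server)
    return servers, servers * config.rack_units


def total_cost_of_ownership(capex: float, annual_costs: Dict[str, float], years: int) -> float:
    """CAPEX plus the yearly cost components summed over the lifetime."""
    return capex + years * sum(annual_costs.values())


dense = ServerConfig("dense-4u", rack_units=4, disks=36, disk_capacity_tb=4.0,
                     purchase_price=10_000.0)
servers, rus = servers_and_rack_units(dense, raw_capacity_tb=1000.0)
racks = rus / RACK_UNITS_PER_RACK  # fractional racks are allowed
annual = {"electricity": 1_000.0, "staff": 25_000.0, "facility": 1_400.0}
print(servers, rus, round(racks, 2))
print(total_cost_of_ownership(servers * dense.purchase_price, annual, years=5))
```

Keeping each cost component as its own entry mirrors the one-sheet-per-component split, so a component can be replaced by measured data without touching the rest.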

Action items

Document editing:

  • Maciej, PSNC to share the updated version of the TCO calculator (split into areas such as servers, energy, staff, etc.) and start collecting comments/notes in a separate document, to be developed as a "Cost Effective Storage how-to"
  • ALL to connect on the TCO tools and start working on the various sheets.

IOPS modelling:

  • Panos, GRNET to play with the IOPS tools and report on the experience.
  • others to review / look for other sources of IOPS/bandwidth and power consumption models

Power usage (important part of cost):

  • Panos, GRNET to review the power consumption datasets and share them.
  • volunteers needed to explore power consumption data
  • PSNC/CSC to speak about this during week 23-27.03.2015 and share results