General Clarifications

How do we combine subcategories into a score?

For example, OS1 has several subcomponents, each of which an infrastructure may implement to different degrees, or not at all.

I propose either a mean of the subcomponent scores or no combined score (Adam Slagell).

DaveK - I agree.

Standardize Language

The spreadsheet and the SCIv1 document use terminology inconsistently. For example, one refers to service providers and the other to service operators.

DaveK - yes - we need to check the whole document for this

Base-level Examples

Questions of scope and completeness always arise when filling out this evaluation form. No implementation or documentation is ever exhaustive or covers every corner case, but if there are significant gaps it is useful to note the scope that is covered. For example, an infrastructure may have centrally managed services while shared infrastructure at the resource providers follows different policies, or there may be different policies for different tiers of the infrastructure worth noting.

Operational Security

[OS1]

A security model addressing issues such as authentication, authorisation, access control, confidentiality, integrity and availability, together with compliance mechanisms ensuring its implementation.

Examples of an authentication model might be a Kerberos system or a PKI used to identify users. Another piece that may be included in an authentication model is how one federates with other identity providers.

Authorization models might include something like VOMS or a central database to manage allocations and a corresponding process to decide which projects or communities get allocations. Another important process is how PIs authorize who can be on their projects.
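As a purely illustrative sketch (not part of the SCI wording), the Python fragment below shows the kind of check an allocation-based authorization model implies: a central record of which users a PI has approved on each project, consulted before access to the allocation is granted. The names (ALLOCATIONS, is_authorised) and data are hypothetical.

```python
# Hypothetical sketch of an allocation-based authorisation model: a central
# record of which users each PI has approved on a project, consulted before
# any access to the allocation is granted.

ALLOCATIONS = {
    "climate-sim": {"pi": "alice", "members": {"alice", "bob"}, "active": True},
    "genomics":    {"pi": "carol", "members": {"carol"},        "active": False},
}

def is_authorised(username, project):
    """Return True if the user is an approved member of an active allocation."""
    alloc = ALLOCATIONS.get(project)
    return bool(alloc and alloc["active"] and username in alloc["members"])

if __name__ == "__main__":
    print(is_authorised("bob", "climate-sim"))   # True: the PI added bob to the project
    print(is_authorised("carol", "genomics"))    # False: the allocation is no longer active
```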

Access control example Dave?

DaveK - from minutes of the 1/6 meeting - "Access control" for files relates to role-based authZ to read/write/delete/control files. For XSEDE, Adam comments that their most important example of central access control is for accounting.

Confidentiality example Dave?

DaveK - No access unless authorised. Hide the existence of jobs and their details

Integrity example Dave?

DaveK - Researchers like to be sure that their data has not been tampered with. It is interesting to know what has been done to ensure integrity during data transfer and then also during storage

Examples of compliance mechanisms are top-level security policies, resource provider agreements, and terms of service that allow the organization to enforce its policies against entities that bypass the model. For example, a resource provider that sets up a gateway bypassing authentication and authorization by sharing an account might be cut off from resources for breaking the model.

Dave, does this just duplicate OS7?

DaveK - I guess it could do, but I think the idea was that OS1 talks more about the management commitment to ensure compliance and the policies requiring this, whereas OS7 is more about the escalation and enforcement procedures. The words don't make this clear, so we need to modify them.

[OS2]

A process that ensures that security patches in operating system and application software are applied in a timely manner, and that patch application is recorded and communicated to the appropriate contacts.


A simple patch management process might be regular vulnerability scans, with a process to assign tickets to owners, and regular reviews of tickets to ensure that they are resolved within the timelines specified by security policies. Sometimes this may be the responsibility of the distributed infrastructure, but other times it may be the responsibility of service operators. Patch management policies may differ for different classes of resources, too.

Recording and communication could be as simple as assigning tickets to appropriate service operators.
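A minimal sketch of the ticket-review step described above, assuming a hypothetical policy that gives each severity level a maximum number of days to resolution. The ticket data, owners, and deadlines are invented for illustration.

```python
# Hypothetical sketch of the regular ticket review in a simple patch management
# process: flag any open vulnerability ticket that has exceeded the resolution
# deadline allowed for its severity under an assumed security policy.

from datetime import date, timedelta

POLICY_DAYS = {"critical": 7, "high": 30, "medium": 90}  # assumed policy deadlines

open_tickets = [
    {"id": "VULN-101", "owner": "storage-admins", "severity": "critical", "opened": date(2015, 5, 1)},
    {"id": "VULN-102", "owner": "portal-team",    "severity": "medium",   "opened": date(2015, 5, 20)},
]

def overdue(tickets, today):
    """Return tickets older than the policy deadline for their severity."""
    return [t for t in tickets
            if today - t["opened"] > timedelta(days=POLICY_DAYS[t["severity"]])]

if __name__ == "__main__":
    for t in overdue(open_tickets, date(2015, 6, 1)):
        print(f"{t['id']} ({t['severity']}) assigned to {t['owner']} is past its deadline")
```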

[OS3]

A process to manage vulnerabilities (including reporting and disclosure) in any software distributed within the infrastructure. This process must be sufficiently dynamic to respond to changing threat environments.

This item differs from the patch management process in that it is about software owned or distributed by the infrastructure to the resource providers. In OS2 we might be talking about an XSS flaw in a central user portal or website for the infrastructure, whereas here we might be talking about accounting or job submission software pushed out to all the service operators.

This process could be as simple as a regular meeting to discuss new vulnerabilities, e.g., the latest OpenSSL flaws, to determine the impact on software distributed by the infrastructure along with an email list to distribute such information to each service operator.

Dave, I don't know what to say about "dynamic".

[OS4]

The capability to detect possible intrusions and protect the infrastructure against significant and immediate threats on the infrastructure.

This does not mean the ability to detect or prevent all kinds of attacks. It could be something as simple as detecting brute-force login attempts or compromised accounts, together with a mechanism to lock out accounts manually or automatically.
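A minimal sketch of the simplest case mentioned above: counting failed logins per account and flagging accounts for lockout. The threshold and log format are assumptions, not anything specified by this document.

```python
# Hypothetical sketch of simple brute-force detection: count failed logins per
# account in recent authentication logs and flag accounts for lockout.

from collections import Counter

FAILURE_THRESHOLD = 5  # assumed local policy, not specified by the document

# Assumed input format: (username, outcome) pairs extracted from auth logs.
auth_events = [
    ("alice", "fail"), ("alice", "fail"), ("alice", "fail"),
    ("alice", "fail"), ("alice", "fail"), ("alice", "fail"),
    ("bob", "ok"), ("bob", "fail"),
]

def accounts_to_lock(events, threshold=FAILURE_THRESHOLD):
    """Return usernames whose failed-login count meets or exceeds the threshold."""
    failures = Counter(user for user, outcome in events if outcome == "fail")
    return [user for user, count in failures.items() if count >= threshold]

if __name__ == "__main__":
    for user in accounts_to_lock(auth_events):
        print("lockout candidate:", user)   # alice, with 6 failed attempts
```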

Dave, I don't know how useful this is without agreeing on a few required threats and actions. Maybe you should be able to block IPs or networks, detect brute-force attacks, lock out accounts, and detect compromised accounts. I don't know what others count as significant.

[OS5]

The capability to regulate the access of authenticated users.

There simply needs to be a way to suspend access and terminate existing sessions and jobs in an emergency.

Dave, how does this differ from OS7?

[OS6]   

The capability to identify and contact authenticated users, service providers and resource providers.

Identifying users could be as simple as having unique usernames tied to email addresses. Each resource provider should have a security incident contact recorded in a central place, as should the administrator of each service. This could simply be a spreadsheet in a shared location.
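A minimal sketch of such a shared record, assuming a hypothetical CSV layout with one row per provider and service and the security contact responsible for it. The sites and addresses are invented.

```python
# Hypothetical sketch of the "spreadsheet in a shared location": a CSV with one
# row per resource provider/service and the security contact responsible for it.

import csv
import io

CONTACTS_CSV = """provider,service,security_contact
SiteA,compute-cluster,security@sitea.example.org
SiteB,user-portal,csirt@siteb.example.org
"""

def security_contact(provider, service):
    """Look up the recorded security contact for a provider's service, if any."""
    for row in csv.DictReader(io.StringIO(CONTACTS_CSV)):
        if row["provider"] == provider and row["service"] == service:
            return row["security_contact"]
    return None

if __name__ == "__main__":
    print(security_contact("SiteB", "user-portal"))  # csirt@siteb.example.org
```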

[OS7]

The capability to enforce the implementation of the security policies, including an escalation procedure, and the powers to require actions as deemed necessary to protect resources from or contain the spread of an incident.

Enforcement may just be the ability to remove individuals and resource providers from the infrastructure for violating policies. A resource provider might still allow a user locally even after that user has been removed from the infrastructure.

An escalation procedure could simply be a chain of command to escalate noticed policy violations to senior levels of management with the authority to censure violators.

Emergency powers could simply be a way for incident response teams to disable accounts directly or remove authorizations for the infrastructure. Even if they cannot remove all access at a single resource provider, they should be able to remove users from centralized authentication, authorization and access control to limit the spread of an incident. For example, they might revoke certificates and access to a user portal for a user, while the individual resource providers retain control of local credentials to other services. Critically, an infrastructure should be able to contain a compromise within its own infrastructure and prevent it from spreading to other infrastructures, e.g., by revoking certificates or disabling accounts in its identity provider.
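A minimal sketch of what such emergency powers might look like when reduced to code: one routine that suspends a user at every point the infrastructure controls centrally (identity provider account, portal sessions, central authorizations), while local credentials at resource providers remain untouched. The data structures and names are hypothetical stand-ins, not any real infrastructure's API.

```python
# Hypothetical sketch of OS7-style emergency powers: suspend a user at every
# point the infrastructure controls centrally, even though local credentials
# at individual resource providers may remain.

# In-memory stand-ins for central services (invented for illustration).
idp_accounts = {"mallory": "active", "alice": "active"}
portal_sessions = {"mallory": ["sess-42"], "alice": ["sess-17"]}
central_authorisations = {"mallory": {"vo-physics"}, "alice": {"vo-biology"}}

def emergency_suspend(username):
    """Disable the account, drop live portal sessions, and strip central authorisations."""
    actions = []
    if idp_accounts.get(username) == "active":
        idp_accounts[username] = "suspended"
        actions.append("disabled identity-provider account")
    if portal_sessions.pop(username, None):
        actions.append("terminated portal sessions")
    if central_authorisations.pop(username, None):
        actions.append("removed central authorisations")
    return actions

if __name__ == "__main__":
    print(emergency_suspend("mallory"))
```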

Incident Response

[IR1]

Security contact information for all service providers, resource providers and communities together with expected response times for critical situations.

A simple spreadsheet or wiki page with security contacts for the resource providers and the owners/operators of any services suffices.

Dave, what do we mean by communities?

Expected incident response times for an infrastructure must be documented and shared, but they do not necessarily require formal SLAs, MOUs, charters, etc.

[IR2]

A formal Incident Response procedure. This must address: roles and responsibilities, identification and assessment of an incident, minimizing damage, response & recovery strategies, communication tools and procedures.

Do you have answers to the following questions? 

  • Who might be pulled into an incident response activity and what are their responsibilities?
  • What counts as a real incident? How do you rate the criticality?
  • How do you contain common kinds of incidents, such as an account compromise?
  • How do you determine when a service can be returned to normal operations or an account restored?
  • How do you securely communicate with everyone who is investigating and responding to an incident?

[IR3]

The capability to collaborate in the handling of a security incident with affected service and resource providers, communities, and infrastructures.

I don't really know what is here that isn't already covered by procedures and communication channels. If this is about communicating with external infrastructures, then maybe all it is about is having a security point of contact and participating in relevant trust groups –Adam.

[IR4]

Assurance of compliance with information sharing restrictions on incident data obtained during collaborative investigations. If no information sharing guidelines are specified, incident data will only be shared with site-specific security teams on a need to know basis, and will not be redistributed further without prior approval.

A good privacy policy would cover this, but so would an understanding that the security team has some autonomy and shares on a need-to-know basis.


Some explanations from Dave Kelsey (my personal views - recalling the history)

Section 4 - Operational Security

OS1 - What is meant by a "security model"?

Here we were considering an architecture, or an agreed set of technical and managerial/policy components. In EGI, for example, this means: authentication is today based on an X.509 PKI with an approved set of CAs (as accredited by IGTF). Authorisation is in the hands of the VOs using VOMS attribute certificates, together with a set of technical components at the service level for policy enforcement (LCAS, LCMAPS, ARGUS, etc.). We have security policies on the approved CAs and on the VO membership management procedures (registration, renewal, suspension, etc.), and a top-level security policy which specifies what happens in cases of non-compliance.

This works for eInfrastructures (or did work) because we had a single security architecture and we needed all participants and services to use it.

With the current move to different technologies, more generalised federated identity management and different levels of assurance, not forgetting new types of service like the EGI Federated Cloud service, this is no longer true.

OS1.3 - What is meant by "access control"?

"Access control" is the technical means to enforce authorisation policy and decisions. In EGI, VOMS specifies VO and sub-group membership and possession of other generalised attributes. The Access Control system then decides whether a job can be run, whether a file can be written or read based on the authorisation attributes.

 

to be continued
