PhD thesis

Order number 05ISAL0055
Year 2005
Thesis
Conception et mise en oeuvre de mécanismes
sécurisés d’échange de données
confidentielles ; application à la gestion de
données biomédicales dans le cadre
d’architectures de grilles de calcul/données
presented to
the Institut National des Sciences Appliquées de Lyon
to obtain
the degree of Doctor
Doctoral school: Informatique et Information pour la Société
(EDIIS-EDA 335)
Specialty: Documents Multimédia, Images et Systèmes
D’Information Communicants (DISIC)
by
Ludwig SEITZ
Defended on 11 July 2005 before the examination committee
Jury
BERTINO Elisa        Professor            Reviewer
PUCHERAL Philippe    Professor            Reviewer
BRUNIE Lionel        Professor            Thesis supervisor
PIERSON Jean-Marc    Associate Professor  Thesis co-supervisor
MULMO Olle           Researcher           Examiner
ROCH Jean-Louis      Associate Professor  Examiner
Order number 05ISAL0055
Year 2005
Thesis
Design and Implementation of Secure
Mechanisms for Sharing Confidential Data;
Application to the Management of
Biomedical Data in a Grid Computing
Environment
Submitted to the
National Institute of Applied Sciences of Lyon
In fulfillment of the requirements for a
Doctoral Degree
Doctoral school of Computer Science and Informatics (EDIIS-EDA 335)
Affiliated Area: Computer Science
Prepared by
Ludwig SEITZ
Defended on 11th July 2005 in front of the Examination Committee
Committee Members
BERTINO Elisa        Professor            Reviewer
PUCHERAL Philippe    Professor            Reviewer
BRUNIE Lionel        Professor            Supervisor
PIERSON Jean-Marc    Associate Professor  Co-supervisor
MULMO Olle           Researcher           Examiner
ROCH Jean-Louis      Associate Professor  Examiner
Résumé
Computing grids have become one of the architectures of choice for
applications that consume large volumes of data and demand a lot of
computing power. Grids make it possible to share multiple heterogeneous
resources, such as computing power, storage space and data, through an
architecture that lets these resources interoperate in a way that is
transparent to the user.
Health-care networks are a recent application of grids. The goal of such a
network is, on the one hand, to let physicians use the computing power of
grids for their medical image analysis algorithms and, on the other hand,
to enable transparent, multi-institutional sharing of distributed patient
data.
Unlike the first applications of grids (for example particle physics or
Earth observation), security is very important for medical applications.
Patient data must be protected against illicit access while remaining
accessible to authorized persons. Classical data protection mechanisms are
of only limited use for this task, because of the new challenges posed by
the Grid. The biggest problem for data security on a grid is that data may
be replicated outside the domain of their owner in order to bring them
closer to a computing unit that is meant to process them. For this reason
an access control system must be decentralized, and the owner of a piece
of data must control who has access to it. Since it may be necessary to
access data quickly, the access control system must support immediate
delegation of rights. Finally, because the data in question may be highly
confidential, one cannot rely on an access control mechanism alone, since
an attacker with access to the physical storage hardware can bypass this
control.
In this thesis we propose an architecture for the protection of
confidential data on a grid. This architecture comprises an access control
system and an encrypted storage system.
The proposed access control mechanism, Sygn, allows decentralized storage
and management of permissions. All permissions in Sygn are encoded in
certificates, which are stored by their owners and used when needed.
Permissions can be created at any time by the owners of the resources or
by administrators to whom this right has been delegated. Creating
permissions requires no interaction with a centralized permission storage
system. Multi-step delegation is realized in Sygn through certificate
chains. For these reasons, Sygn allows the management of dynamically
changing resources and permissions.
The Sygn access control servers store a minimum of security-critical
information. They are deployed close to the resources to which they
control access, in order to minimize the impact of a successful attack.
Sygn avoids the use of centralized services and minimizes trusted third
parties. Sygn has been successfully integrated into a minimal grid
architecture.
The proposed encrypted storage system, CryptStore, is designed to let a
dynamic community of users store encrypted data, access these data and
modify them. To this end, CryptStore uses distributed key servers that
give authorized users access to the decryption keys. To minimize the
impact of a successful attack on a key server, no server stores an entire
key. Keys are split using classical secret sharing algorithms, and the
shares are distributed over several key servers.
To avoid adding an additional and potentially inconsistent layer of access
control, access to key shares is controlled through the grid's data access
control mechanism. To this end, the key servers have a generic interface
that can be adapted to any access control mechanism on the grid. An
adaptation of this interface that allows Sygn to be used for access
control to keys has been implemented for CryptStore.
Abstract
Grid computing has become the architecture of choice for applications that
process a large amount of data and require a lot of computing power. Indeed Grids allow users to share multiple heterogeneous resources, such as
computing power, storage capacity and data, and provide an architecture for
transparent interoperation of these resources from the user’s point of view.
An upcoming application for Grids is health-care, with the goal of giving
medical doctors the computing power of Grids to speed up and improve
their diagnosis software, or to gain a transparent, cross-institutional access
to distributed patient records.
More than for the first applications of Grids (e.g. particle physics, terrestrial observation), security is a major issue for medical applications. Conventional data protection mechanisms are only of limited use, due to the novel
security challenges posed by Grids. The most important challenge is that,
at the request of the middleware, data on a Grid may be copied outside the
home domain of their owner in order to be stored close to some distant computing
resource. To respond to these challenges we propose an access control system
that is decentralized and where the owners of some data are in control of the
permissions concerning their data.
Furthermore, since data may be needed at very short notice, the access
control system must support a delegation of rights that is effective
immediately. Grid users also need delegation mechanisms to give rights to
processes that act on their behalf. As these processes may spawn
subprocesses, multi-step delegation must be possible.
In addition to these usability requirements, the transparent storage and
replication mechanisms of Grids make it necessary to implement additional
protection mechanisms for confidential data. Access control can be circumvented by attackers having access to the physical storage medium. We therefore need encrypted storage mechanisms to enhance the protection of data
stored on a Grid.
In this thesis we propose a comprehensive architecture for the protection
of confidential data on Grids. This architecture includes an access control
system and an encrypted storage scheme.
The proposed access control mechanism Sygn provides a decentralized
permission storage and management system. All permissions in Sygn are
encoded in certificates, which are stored by their owners and used when required. Permissions can be created on demand, by the owners of the resources
or by administrators to whom this responsibility has been delegated, without
the need to contact a central permission storage system. Multi-step delegation of permissions is realized in Sygn through the use of certificate chains.
Thus, Sygn allows an efficient decentralized administration of dynamically
changing resources and permissions.
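The idea of multi-step delegation through certificate chains can be illustrated with a minimal sketch. The `Certificate` fields and the `chain_grants` function below are hypothetical names chosen for this illustration and do not reproduce the actual Sygn certificate format; signature verification, which a real system would require, is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    """Hypothetical, unsigned stand-in for an authorization certificate."""
    issuer: str         # who grants the permission
    subject: str        # who receives it
    resource: str       # what the permission is about
    action: str         # e.g. "read"
    may_delegate: bool  # whether the subject may pass the permission on


def chain_grants(chain, owner, user, resource, action):
    """Return True if `chain` links `owner` to `user` for (resource, action)."""
    if not chain:
        return False
    expected_issuer = owner  # the chain must start at the resource owner
    for index, cert in enumerate(chain):
        is_last = index == len(chain) - 1
        if cert.issuer != expected_issuer:
            return False  # broken link in the chain
        if (cert.resource, cert.action) != (resource, action):
            return False  # certificate is about something else
        if not is_last and not cert.may_delegate:
            return False  # an intermediate subject was not allowed to delegate
        expected_issuer = cert.subject
    return expected_issuer == user
```

In this sketch the verifier only needs the chain presented by the requester and the identity of the resource owner, which mirrors the point that no central permission store has to be contacted.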
The format of Sygn permissions allows fine-grained specification of resources to be protected, in order to give each resource owner the possibility
to express specific authorizations on his or her resource.
The Sygn access control servers are deployed close to the resources they
control and store only minimal security-critical information, in order to minimize the impact of a successful attack. Sygn thus avoids the use of centralized
services and minimizes the use of trusted third parties in order to enhance
security and extensibility.
The proposed encrypted storage scheme CryptStore is designed to allow
dynamically changing user communities to manage dynamically changing data
sets. To achieve this goal, CryptStore relies on distributed key-servers
that allow dynamic sharing of decryption keys based on file access
permissions. In order to minimize the impact of a successful attack on a
key-server, no single key-server stores full encryption keys. Instead,
keys are split using a classical secret sharing algorithm and distributed
among several key-servers.
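The principle of splitting a key so that no single key-server holds it in full can be sketched as follows. This is an illustration only: the abstract does not name the secret sharing algorithm used by CryptStore, so the sketch substitutes the simplest classical scheme, XOR-based n-out-of-n sharing, in which all shares are needed to rebuild the key. The function names `split_key` and `combine_shares` are assumptions made for this example.

```python
import secrets
from functools import reduce


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, n_servers: int) -> list[bytes]:
    """Split `key` into one share per key-server (all shares are required)."""
    if n_servers < 2:
        raise ValueError("need at least two key-servers")
    # n-1 shares are uniformly random; the last one is chosen so that
    # the XOR of all shares equals the key.
    random_shares = [secrets.token_bytes(len(key)) for _ in range(n_servers - 1)]
    last_share = reduce(_xor, random_shares, key)
    return random_shares + [last_share]


def combine_shares(shares: list[bytes]) -> bytes:
    """Rebuild the key from the complete set of shares."""
    return reduce(_xor, shares)
```

With this scheme an attacker who compromises fewer than all key-servers learns nothing about the key, since any incomplete subset of shares is indistinguishable from random bytes. A threshold scheme, in which only some of the shares are needed, would additionally tolerate unavailable servers.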
To avoid adding a duplicate and possibly incoherent layer of access control, access to key-shares stored on CryptStore key-servers is granted according to file-access permissions of the Grid access control system. Thus the
key-servers have a generic interface that can be adapted to interact with any
Grid access control system.
Sygn has been successfully integrated in a lightweight grid middleware for
access control to files, and an instantiation of the generic CryptStore
access control interface has been implemented that allows Sygn to be used
for key access control.
Acknowledgments
First and foremost I would like to thank my supervisors Lionel Brunie and
Jean-Marc Pierson. Thanks to them I have discovered Grid computing as an
interesting field of research for security applications. Through all the years
their advice and expertise have been very valuable for me and have helped
me to succeed in this work.
I would also like to thank Elisa Bertino and Philippe Pucheral who have
accepted the hard task of reviewing my work and being members of my
examination committee.
My thanks also go to Olle Mulmo and Jean-Louis Roch for agreeing to be members of my examination committee.
I am deeply grateful to my girlfriend Elin A. Topp, who has supported
me even though thousands of kilometers separate us and who has raised my
morale when I was frustrated.
Many thanks also go to the students Dan Hididis and Didier Oriol, who
have directly contributed to my work in their projects and who both did an
excellent job.
Johan Montagnat has also made valuable contributions through his advice and questions and by providing his excellent software libraries.
Furthermore Olle Mulmo and Thomas Sandholm from KTH as well as
Babak Sadighi Firozabadi and Erik Rissanen from SICS in Sweden have all
helped me a lot through comments, pointers to literature and suggestions.
Pierre Maret and Jacques Calmet have made this work possible by establishing the contact to INSA Lyon.
And finally I wish to thank my colleagues from the LISI/LIRIS laboratory with whom I have discussed some of the gruesome details and implications of my thesis: Solomon Atnafu, David Coquil, Girma Berhe, Sonia Ben
Mokhtar, Amine Demidem, Hector Duque, Rami Rifaieh, Yonny Cardenas,
Ny-Haingo Andrianarisoa, Dejene Ejigu, Marian Scuturici, Rachid Saadi and
Julien Gossa.
Contents
1 French Summary
1.1 Introduction
1.2 Motivations
1.3 State of the art
1.3.1 Access control models
1.3.2 Message sequences for access control
1.3.3 Access control policy expression languages
1.3.4 Certificates
1.3.5 Access control systems
1.3.6 Why encrypted storage?
1.3.7 Encrypted storage systems
1.4 The Sygn access control system
1.4.1 Overview of Sygn
1.4.2 The Sygn language
1.4.3 Sygn meta-data
1.4.4 The Sygn decision algorithm
1.4.5 Discussion
1.5 Encrypted storage with CryptStore
1.5.1 Basic concepts of CryptStore
1.5.2 Architecture of CryptStore
1.5.3 CryptStore meta-data
1.5.4 CryptStore algorithms
1.5.5 Discussion
1.6 Sygn and CryptStore integrated in a Grid
1.6.1 µgrid
1.6.2 The OGSA and WSRF standards
1.6.3 Integrating Sygn in a grid
1.6.4 Integrating CryptStore in a grid
1.7 Conclusion
2 Introduction
2.1 Security aspects of resource sharing on a Grid
2.2 Why Grids pose novel security challenges
2.3 Outline
3 Motivation
3.1 Use-Cases
3.2 General principles of good security
3.3 Constraints of the Grid environment
3.4 Constraints of the application
3.5 Legal issues dealing with medical data
3.5.1 European laws concerning privacy protection
3.5.2 French Law concerning privacy protection
4 Related Work in Access Control
4.1 Terminology
4.2 Access Control Models
4.2.1 Discretionary Access Control
4.2.2 Mandatory Access Control
4.2.3 Role Based Access Control
4.2.4 Current directions in access control
4.3 Authorization Frameworks
4.4 Authorization Expression Languages
4.4.1 KeyNote
4.4.2 XACML
4.4.3 XrML
4.4.4 General remarks
4.5 Standards for authorization assertion
4.5.1 SAML
4.5.2 X.509 Attribute Certificates
4.5.3 SPKI
4.6 Access Control Systems
4.6.1 Shibboleth
4.6.2 Akenti
4.6.3 PERMIS
4.6.4 CAS
4.6.5 VOMS
4.6.6 Cardea
4.6.7 PRIMA
4.6.8 Summary
5 Related Work in Storage Security
5.1 Overview of encryption algorithms for storage
5.2 Standardization
5.3 Encrypted storage systems
5.3.1 CFS
5.3.2 TCFS
5.3.3 CryptFS
5.3.4 P. Gutmann’s SFS
5.3.5 WinEFS
5.3.6 SNAD
5.3.7 Cepheus
5.3.8 J.P. Hughes’ SFS
5.3.9 C-SDA
5.3.10 Summary
6 Sygn access control
6.1 Sygn overview
6.2 Syntax and semantics of the Sygn Language
6.2.1 Subjects
6.2.2 Objects
6.2.3 Actions
6.2.4 Capabilities
6.2.5 Authorization Certificates
6.2.6 Certificate Paths
6.2.7 User requests
6.2.8 Sygn-PDP responses
6.2.9 Extensibility
6.2.10 Formal representation
6.3 PDP meta-data
6.4 PDP algorithm
6.5 Sygn performance
6.6 Discussion
7 CryptStore encrypted storage
7.1 Basic concepts of CryptStore
7.2 Architecture and use of CryptStore
7.3 CryptStore meta-data
7.4 CryptStore algorithms
7.4.1 Cryptographic algorithms
7.4.2 Request handling
7.5 Security Analysis
7.6 Discussion
8 Sygn and CryptStore in a Grid
8.1 µgrid
8.2 OGSA/WSRF standardized Grids
8.3 Integrating Sygn in a Grid
8.4 Setting up CryptStore as a Grid service
8.5 Using Sygn for CryptStore access control
8.6 Summary
9 Conclusions and Future Works
A XML Schema for the Sygn language
B XML Schema for CryptStore
C Sygn permission creation GUI
List of Figures
1.1 Sygn algorithm, automaton representation, French version
4.1 Authorization Message Sequences
4.2 Example of an XACML Policy
5.1 Cipher block chaining mode
5.2 Ciphertext stealing in CBC mode
5.3 Cipher-feedback mode
5.4 The lockbox concept
6.1 Sygn deployment and interactions
6.2 Example of Sygn user identifiers
6.3 Example of Sygn role identifier
6.4 Example of Sygn file-set identifier
6.5 Example of Sygn action
6.6 Example of Sygn capability
6.7 Example of Sygn add to set capability
6.8 Example of Sygn authorization certificate
6.9 Example of Sygn certificate path
6.10 Example of Sygn request
6.11 Example of a Sygn-PDP response
6.12 Example of Sygn administrative command
6.13 Complex certificate path example
6.14 Sygn algorithm, automaton representation
6.15 Sygn-PDP performance
7.1 Simple authorization example
7.2 Authorization concerning sets of files
7.3 Authorization concerning groups of users
7.4 Authorization concerning sets of files and groups of users
7.5 CryptStore usage
7.6 Example of a CryptStore meta-data header
7.7 The concept of secret sharing
7.8 Examples of CryptStore file owner requests
7.9 Examples of CryptStore file user requests
8.1 Web service invocation
8.2 Relationship between OGSA, WSRF and Web services
C.1 Sygn Certificate Creation Tool
List of Tables
4.1 Summary of how different architectures respond to requirements of a medical application.
4.2 Summary of how different architectures follow principles of good security.
4.3 Summary of how different architectures respond to requirements of a Grid environment.
5.1 Summary of block cipher modes
5.2 Summary of encrypted storage systems.
Chapter 1
French Summary
This summary is intended for French-speaking readers. Its purpose is to
give them a precise idea of the content of this thesis. Because of its
brevity, the details and precise explanations of many points cannot be
covered in this summary. We ask the interested reader to consult the
English part of this document.
1.1 Introduction
Resource sharing has enjoyed growing popularity since the creation of the
Internet. The use of hardware resources, and of computing power in
particular, is often characterized by long periods of inactivity
interrupted by short intervals of intensive activity. By pooling such
resources in order to share them, each user can have very substantial
hardware capacity available at the moment it is needed, provided that the
users do not all need the resources at the same time.
Applications that consume and produce large quantities of data can thus
benefit from shared storage space, especially if this sharing is combined
with the sharing of computing resources. In particular, shared storage
space makes it possible to keep a copy of the data close to the
application that needs it.
Data sharing is also of great interest, whether for informational purposes
or to support distributed projects or collaborations.
A problem that often arises when sharing resources over a network is the
heterogeneity of the systems involved, which makes them unable to
interoperate. Even with identical operating systems, configuration details
can cause the use of remote resources to fail. Solving this problem often
requires both complicated manual configuration work and excellent
knowledge of the specifics of the remote resource.
Computing grids offer a new approach to facilitate the sharing of
resources (such as computing power, storage space, data or sensors) and to
overcome the difficulties related to interoperability. A grid architecture
implements a common platform for resource sharing. The allocation and use
of resources are thus managed transparently for the user.
In this thesis we address the problem of the security of confidential data
shared through a computing grid.
The first grid applications addressed subjects that require great
computing power, such as particle physics and Earth observation. Security
questions, in particular those related to data protection, are less
important in these domains.
More recently, several projects have been carried out to deploy biomedical
grids dedicated to applications that handle biological and medical data
(notably health-care networks [70]). Indeed, this type of application
handles considerable volumes of data¹ that are distributed (hospitals,
care centers, attending physicians, etc.) and generates processing that is
very costly in terms of computing power (for example 2D and 3D imaging, or
epidemiological studies on very large cohorts). Computing grids are a very
promising architectural solution (hardware and software) for this type of
application, provided that they guarantee a high level of confidentiality.
This thesis proposes an access control architecture for the resources of a
grid. Protecting stored data also led us to take an interest in encrypted
storage.
1.2 Motivations
To evaluate the state of the art and to decide which improvements are
needed, we first listed the conditions and constraints related to access
control that a grid used for medical applications should satisfy.
The results of these reflections can be classified into three areas: the
principles of good security in general, the specifics of computing grids,
and the conditions and constraints related to medical applications.
¹ A medium-sized university hospital generates on the order of 1 to 10 TB
of digital data each year (medical images, patient records, biological
analyses, etc.).
With regard to security in general, we reached the following conclusions:
– It is preferable to avoid centralizing services that manage
security-related functions or data. Not only do such services scale
poorly, they are also an ideal target for attacks.
– The number of trusted third parties must be minimized, to reduce the
opportunities for attack.
– Concerning access control more specifically, we consider it important to
use minimal permissions for the execution of any operation. This reduces
the damage that can be done by malicious processes acting in the name and
with the permissions of an honest user.
– Separation of duties, and of the permissions tied to them, is another
important principle to which we adhere. It makes permission management
easier and prevents abuses resulting from unexpected combinations of
permissions.
– We consider it important to preserve the consistency of permissions
concerning identical objects. When multiple copies of a piece of data may
exist, it must be possible to apply the same permissions to each of these
copies. This is all the more important in a grid, where replication
mechanisms can automatically generate copies of data to bring them closer
to a processing node.
– It seems essential to us to secure permissions, especially for long-term
storage. Should a permission be intercepted, this protection must prevent
an attacker from using it for his or her own benefit.
The constraints introduced by grid environments form the second set of
constraints that we examined:
– Grids bring together user communities that evolve dynamically. Services
that must be pre-configured with user identities are therefore not viable.
Rights delegation mechanisms can help an access control system react
flexibly to these dynamic changes.
– The resources of a computing grid (computing capacity, storage space and
data) are subject to dynamic availability. The security system must take
into account the fact that resources may suddenly become unavailable.
– Computing grids bring together heterogeneous computer systems and
provide transparent access to the resources of these systems. It is
therefore essential that security services on a grid be generic and not
depend on a specific hardware or software architecture (for example at the
operating system level).
– A computing grid allows resources to be shared across institutional
boundaries. A security system therefore cannot be imposed at the grid
level; it must remain under the individual control of the institutions
participating in the grid (principle of subsidiarity). Access control must
make it possible to enforce the security policy of both the user's
organization and the owner of the resource concerned.
– The permissions concerning a piece of data must be independent of its
storage location (cf. migration or duplication of the data).
– Finally, computing grids involve a large number of resources and users.
Solutions that work well at small scale may fail at large scale. It is
therefore important that any system used on a grid scales well
(scalability property).
A last group of constraints concerns our target domain: biomedical applications. It is worth noting that the constraints listed below could be derived from many other applications that handle confidential data.
– Users of medical applications have structured tasks that require specific permissions. These permissions frequently have a hierarchical structure, in which users at a higher hierarchical level inherit the permissions of the lower levels (for example: head of clinic > chief physician of a department > nurse). Role-based access control (RBAC) is a very effective approach for managing such a permission structure.
– For the processing of personal data, and especially of medical data, very strict protection requirements are imposed by law. A sine qua non is the non-repudiable traceability of every access to these data.
– Given the heavy responsibilities borne by the owners of medical data, it is not conceivable that anyone other than the owners themselves be the sources of authority. Every authorization granting access to a piece of data must have a source that traces back to the owner. Access control systems that provide delegation mechanisms can help meet this requirement.
– Since storage resources on a grid can also be accessed locally, measures must be taken to prevent any access to the confidential data stored on these resources that would bypass the access control system.
1.3 State of the Art
In this section we present a short state of the art on access control and then on encrypted storage, while motivating the link between these two areas of security².
1.3.1 Access Control Models
In the field of access control, three models are generally recognized:
– Discretionary access control (DAC).
– Mandatory access control (MAC).
– Role-based access control (RBAC).
In the DAC model, permissions are represented by a matrix in which each row corresponds to a user and each column to a resource. Each entry of this matrix defines the access rights that the user of that row holds on the resource of that column.
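As an illustration only, the access matrix can be sketched as a nested mapping. The user names, resource names, and rights below are hypothetical and do not come from the thesis.

```python
# Hypothetical access matrix: rows are users, columns are resources,
# and each entry is the set of rights the user holds on the resource.
ACCESS_MATRIX = {
    "alice": {"scan42": {"read", "write"}, "report7": {"read"}},
    "bob": {"scan42": {"read"}},
}


def is_allowed(user, resource, right):
    """Look up the matrix entry for (user, resource); a missing entry means no rights."""
    return right in ACCESS_MATRIX.get(user, {}).get(resource, set())
```

An empty or missing entry denies access, which matches the usual reading of a sparse access matrix.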
The MAC model assigns a security level to each user and to each resource. A user is granted access only if their security level is greater than or equal to the level of the resource they want to access. In addition, to prevent information from leaking to less secure levels, a user who is exercising their access level is forbidden from writing data at a lower level. This concept can be enriched by using a classification (for example army, navy, air force) of the data and users in addition to their security level. Users are then entitled only to resources belonging to the same classes as they do.
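The two MAC rules above (no reading above one's level, no writing below it) can be sketched as follows. The level names and their ordering are assumptions made for this example, and the treatment of classes in `can_write` follows the usual Bell-LaPadula reading rather than anything stated in the text.

```python
from dataclasses import dataclass

# Hypothetical ordering of security levels, for illustration only.
LEVELS = {"public": 0, "confidential": 1, "secret": 2}


@dataclass(frozen=True)
class Label:
    level: str
    classes: frozenset = frozenset()  # e.g. {"army"}, {"navy"}


def can_read(subject: Label, resource: Label) -> bool:
    """The subject's level must be at least the resource's level,
    and the subject must belong to every class of the resource."""
    return (LEVELS[subject.level] >= LEVELS[resource.level]
            and resource.classes <= subject.classes)


def can_write(subject: Label, resource: Label) -> bool:
    """No write-down: a subject may not write to a resource of lower level.
    The class check mirrors can_read (assumed, Bell-LaPadula style)."""
    return (LEVELS[subject.level] <= LEVELS[resource.level]
            and subject.classes <= resource.classes)
```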
Finally, the RBAC model aims to simplify the management of permissions associated with tasks. Permissions are grouped by task and assigned to a role, which is then given to the users who have to carry out that task. Permission changes when a user is assigned to a new task thus become easier to manage, since only the role assignments need to change. Likewise, if the permissions tied to a task change, it suffices to add these permissions to the role or remove them from it. The RBAC model introduces two further important concepts: hierarchical roles (a role can inherit the full set of permissions of a hierarchically lower role) and separation of duties (the simultaneous use of two roles can be forbidden, to prevent abuse resulting from the combination of the associated rights).
² We refer the reader to the body of the manuscript for a more detailed study of the state of the art.
The flexibility of the DAC model for ad hoc permissions and the efficient rights management of the RBAC model are two desirable attributes for our application.
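The role hierarchy and separation of duties that RBAC introduces can be sketched as follows. The roles, permissions, and the exclusive pair are hypothetical examples loosely inspired by the clinical hierarchy mentioned earlier; they are not taken from the thesis.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "nurse": {"read_chart"},
    "physician": {"prescribe"},
    "chief": {"sign_discharge"},
}
# Each role inherits the permissions of the roles listed here (its juniors).
JUNIORS = {"chief": {"physician"}, "physician": {"nurse"}, "nurse": set()}
# Pairs of roles that may not be active in the same session.
EXCLUSIVE = {frozenset({"physician", "auditor"})}


def permissions(role):
    """Collect a role's own permissions plus those inherited from junior roles."""
    perms = set(ROLE_PERMISSIONS.get(role, set()))
    for junior in JUNIORS.get(role, set()):
        perms |= permissions(junior)
    return perms


def can_activate(active_roles):
    """Separation of duties: reject any session containing an exclusive pair."""
    roles = set(active_roles)
    return not any(pair <= roles for pair in EXCLUSIVE)
```

Changing what a task may do then only means editing one entry of `ROLE_PERMISSIONS`, which is the management benefit the RBAC model aims for.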
1.3.2 Message Sequences for Access Control
The IETF and ISO have defined architectures (“frameworks”) for access control. Conceptually similar, they differ mainly in their choice of vocabulary.
RFC 2904 [96] proposes three message exchange sequences between users, resources, and authorization servers: the agent sequence, the pull sequence, and the push sequence. In the agent sequence, the authorization server acts as an agent between the user and the resource. The user therefore interacts only with the authorization server, which forwards the user's requests to the resource after verifying the user's rights and takes care of returning the resource's response to the user.
In the pull sequence, the resource handles all interaction with the authorization server. The user submits a request to the resource, which in turn asks the authorization server whether the request is authorized. If the answer is positive, the resource grants access to the user.
The push sequence decouples the resource from the authorization server. To execute a request, a user first asks the authorization server to certify that the user is entitled to make this request, and then presents this certification to the resource together with the request.
These three message sequences are illustrated in figure 4.1, page 55.
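The two steps of the push sequence can be sketched as follows. This is a toy model under stated assumptions: the function names are hypothetical, and an HMAC with a shared key stands in for the digital signature that a real authorization server would produce with a private key.

```python
import hashlib
import hmac
import json

# Stand-in for the authorization server's signing key (assumption). A real
# deployment would use a public-key signature, so that the resource can
# verify certifications without holding any secret.
SERVER_KEY = b"demo-key"


def issue_certification(user, resource, action):
    """Step 1 (authorization server): certify that the request is allowed."""
    claim = json.dumps({"user": user, "resource": resource, "action": action},
                       sort_keys=True).encode()
    tag = hmac.new(SERVER_KEY, claim, hashlib.sha256).hexdigest()
    return claim, tag


def resource_accepts(claim, tag, user, resource, action):
    """Step 2 (resource): check the certification presented with the request."""
    expected = hmac.new(SERVER_KEY, claim, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False  # certification was forged or altered
    return json.loads(claim) == {"user": user, "resource": resource,
                                 "action": action}
```

The resource never contacts the authorization server in step 2, which is the decoupling that characterizes the push sequence.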
The push sequence has the advantage of decoupling in time the certification of a permission from its use. The load on the authorization servers (compared with the agent sequence) and on the resources (compared with the pull sequence) is reduced, which makes the system more scalable. Moreover, the push sequence makes it possible to apply the principle of least privilege safely, since the user keeps control over the permissions supplied to the access control service. If the user needs several authorizations from different sources, this model lets the user easily retrieve them separately and requires no cooperation protocol between the authorization servers. The drawback of the push sequence is that it requires a permission revocation mechanism: once a permission has been issued to a user, it cannot otherwise be withdrawn before its expiration date.
1.3.3 Access Control Policy Languages
To express access authorizations and define the general principles of an access control policy, a formal language is needed. Several such languages have been proposed. We examined KeyNote [14], XACML [52], and XrML [28] to determine what they could contribute to our work.
The language defined by KeyNote [14] binds authorizations to public keys, as in the SPKI approach (see next section). It supports delegation through certificates. The drawback of the KeyNote language is that it provides no support for the RBAC model, since it is specifically oriented toward the DAC model.
The XACML language [52] is a proposed standard from the OASIS consortium for a generic language for defining access control policies. Based on XML, this language offers a wide variety of data types and of functions for combining or comparing data. However, the genericity of XACML means that even the simplest policies are very long and hard to read, understand, and modify. Moreover, delegation is not supported in the current version of XACML.
XrML [28] is also an XML-based language for describing access control policies. Its approach is fundamentally the same as that of XACML. However, it is less generic because it is oriented toward digital rights management (DRM). In addition, XrML has no specific support for the RBAC model.
1.3.4 Certificates
Some access control architectures implement permission storage mechanisms that do not protect the permissions against fraudulent modification. We believe this is unsatisfactory for an application requiring strong security. In such a system, as much of the security-related data as possible, and especially the access control data, should be encoded in certificates bearing digital signatures.
Ordered sequences of authorization certificates make it possible to form certification paths that can be used for the delegation of permissions. Certificates thus allow flexible and secure management of dynamic access rights.
We examined three approaches for encoding authorization data as certificates: SAML [68], X.509 AC [45], and SPKI [39].
SAML is a proposed standard from the OASIS consortium, like XACML. SAML is based on XML and defines formats for requesting and providing certificates that confirm a user's authentication, attributes, or permissions. Like XACML, SAML is very generic and provides a large number of data types and comparison functions. This makes the SAML language almost as hard to read as XACML. Furthermore, the SAML specification does not address delegation. Several recent proposals ([77, 97]) extend the SAML standard to remedy this shortcoming.
Attribute certificates (AC) are an extension of the X.509 certificate format used for authentication. The purpose of this extension is to allow authorization-related information to be encoded in X.509 certificates. The proposed format is very limited, because it recommends against supporting delegation through certificate chains, which it deems too complex. Moreover, for each set of attributes that X.509 ACs can certify, there must be one and only one authority issuing certificates. Such a restriction is highly detrimental to decentralized management of access control and imposes severe limits on scalability.
RFCs 2692 and 2693 [38, 39] propose SPKI, a simple public-key-based infrastructure for trust management (authentication and authorization). In SPKI, a list of sources of authority is associated with each resource, specifying the entities that may issue permission certificates for that resource. SPKI defines a delegation system using certificate chains. Every entity is identified by its public key, which simplifies the verification of digital signatures and avoids the confusion that homonyms can cause. The disadvantage of this system (compared with binding permissions to user names, as in X.509) is that if a key is revoked, all permissions issued for that key must be revoked as well. If a user name is used instead, the user can be given a new key without changing the name, in which case the permissions associated with the name need not be revoked. We believe that the advantages of binding permissions to a key largely outweigh these drawbacks. Work on the standardization of SPKI stopped in 2001, and so some important questions, such as RBAC support in SPKI, have not been addressed.
1.3.5 Access Control Systems
We studied a large number of distributed access control systems, some of them dedicated to computing grids: Akenti [91, 92] from the Distributed Systems Department of Lawrence Berkeley National Laboratory in the United States; PERMIS [23] from the Information Systems Security Research Group of the University of Salford in the United Kingdom; CAS [79, 78], a grid-specific system developed by the Globus Alliance; VOMS [2], also grid-specific, developed during the European DataGrid project; Cardea [65] from the NASA Advanced Supercomputing (NAS) Division at NASA Ames Research Center in the United States; and PRIMA [67, 66] from the Department of Computer Science at Virginia Polytechnic Institute and State University in the United States.
Measured against the requirements and constraints we established earlier, none of these systems proved satisfactory. The results of this analysis are presented in tables 4.1–4.3, pages 67–69.
1.3.6 Why Encrypted Storage?
A question that must be asked when deploying access control mechanisms is how to prevent these mechanisms from being bypassed. For access control on data, this is particularly problematic when an adversary can gain access to the physical storage hardware. In such a case, disabling the access control mechanisms is always simple. We therefore concluded that a satisfactory solution for data security had to prevent such direct access to the raw data. To this end, we decided to couple our access control system with mechanisms for encrypting the data.
The problem that must be solved for an encrypted storage system to be usable on a grid is the sharing of encrypted files. The groups of users who have access to such files are dynamic, as are the sets of shared encrypted files. This fluctuation of members or elements must therefore be managed in a way that does not significantly slow down access to the files, which rules out manual distribution of encryption keys.
1.3.7 Encrypted Storage Systems
We examined encrypted storage systems mainly with regard to their key sharing mechanisms. CFS [12], developed by Matt Blaze in 1993 at AT&T laboratories, is one of the oldest available systems for encrypted storage. CFS has no key sharing mechanism. TCFS [21, 22], developed at the University of Salerno in Italy in 1997, improves on CFS but still has no key sharing mechanism. CryptFS [103] is an architecture proposed to improve on the functions of CFS by making them more efficient and more resistant to attacks by people with precise knowledge of the system. Like CFS and TCFS, CryptFS has no key sharing mechanism.
SFS by Peter Gutmann [56], created in 1995, is another encrypted storage system. Like the systems presented above, it offers no key sharing mechanism, but it does have an interesting feature: so that a key can be recovered if it is lost, SFS allows it to be split into pieces using Shamir's secret sharing algorithm [90] and these pieces to be distributed to trusted third parties. In case of loss, the key can be reconstructed by assembling a certain number of the pieces returned by the trusted third parties. The number of pieces generated and the number of pieces required for reconstruction can be freely chosen by the user. It does not matter which pieces are used for the reconstruction, as long as the required number is gathered. A single trusted third party, on the other hand, cannot deduce any information about the complete key from the share entrusted to it. We adapted this idea for our approach (cf. section 1.5).
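Shamir's scheme can be sketched in a few lines. This is a minimal illustration over a prime field, not the code used by SFS or by the thesis; the prime, the function names, and the representation of the key as an integer are choices made for this example.

```python
import secrets

# Field modulus: the Mersenne prime 2**127 - 1. The secret must be smaller.
PRIME = 2**127 - 1


def split(secret, n, k):
    """Return n shares (x, y); any k of them reconstruct the secret.

    The secret is the constant term of a random polynomial of degree k - 1.
    """
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def poly(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME

    return [(x, poly(x)) for x in range(1, n + 1)]


def reconstruct(shares):
    """Recover the constant term by Lagrange interpolation at x = 0."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `split(key, 5, 3)`, any three of the five shares give back `key`, whichever three are chosen, which is the property described above.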
WinEFS (Windows Encrypting File System) [73] is an encrypted storage system that offers the possibility of sharing keys. For each user who has access to a file, the encryption key is stored in the file header, encrypted with that user's public key. This piece of information (called a lockbox) allows that user (and only that user) to access the decryption key. Clearly, such a system scales poorly when faced with frequent permission updates and a large, dynamic community of users.
SNAD [74], developed at the University of California in 2002, is another encrypted storage system that uses the lockbox concept. It therefore has the same limitations as WinEFS with respect to our application.
The Cepheus system [50] was developed at MIT in the United States between 1998 and 1999 by Kevin E. Fu. It is based on SFS by David Mazières [71] and proposes a group server to manage encrypted files shared by groups of users. A file shared by a group is encrypted with a group key. For each member of the group, a copy of this key is stored in a lockbox on the group server. This requires the administrator of the encrypted file to know all members of the group and to update the group server manually at every change. Such an approach is clearly inefficient in a distributed environment with groups whose membership changes dynamically.
Hughes et al. proposed another system called SFS, which improves on the group server concept. For individual users, SFS stores the encryption key in the file header by means of a lockbox. For sharing by groups, the file header contains permissions signed by the file owner, which specify the groups that have access to the file, together with a lockbox that can be decrypted only by the group server. A group member who wants to access the decryption key must send the permissions and the lockbox to the group server. The server verifies the permissions and, if they are correct, decrypts the lockbox and sends the key to the requesting user. The problem with SFS is that the group server is a trusted third party, and therefore an important target for attackers.
The problem common to all encrypted storage systems that allow file sharing is that they add a new access control layer to the system. A user can thus end up in an inconsistent situation in which the access control system grants access to a file but the key sharing system refuses access to the decryption key.
Bouganim et al. proposed the C-SDA system, which implements encrypted storage with the help of a smart card. The keys to which the user has access are stored on the card. The same card also handles the decryption of the data, so the keys never leave the card. The data that a user can access through C-SDA may be generated dynamically (a view on a database, for example). As a result, the encryption keys may not correspond to the access permissions. The card therefore also manages the user's permissions and the dynamic data created from the decrypted raw data that the user wants to access. The problem we see with C-SDA is that the card is assumed to be tamper-proof, even against its own user. Yet since P. Kocher introduced side-channel attacks [62, 63], the cryptographic community has continually developed attacks on smart cards based on this principle (for example [81] or [69]). The protection mechanisms of the algorithms used on a smart card must therefore be updated frequently, which represents a considerable effort.
1.4 The Sygn Access Control System
To meet the requirements and constraints we established for access control on a grid for medical applications, we designed the Sygn access control system³. We begin with an overview of the system, then present the language in which authorizations are expressed in Sygn, followed by the metadata used by Sygn, and finally the decision algorithm. We conclude with a discussion of the principles of Sygn with respect to our problem statement and the state of the art.
1.4.1 Overview of Sygn
Since our main goals are to support the management of ad hoc access control decisions and a decentralized system for delegating permissions, we decided to use chains of authorization certificates to assign permissions. We chose a push message sequence for the reasons presented in section 1.3.2. Sygn users therefore obtain and store the certificates issued to them themselves. These certificates are protected against illicit manipulation by their digital signature.
The process of granting permissions and accessing a resource, illustrated in figure 6.1, page 87, runs as follows: the administrator of a resource makes it available on the grid and registers as source of authority (SOA) for this resource in the metadata database of the local Sygn server (step 1). The administrator then creates one or more authorization certificates that grant the right to access the resource and transfers them to the users concerned (step 2). These users store the certificates and use them when they want to access the resource, to prove their access right to the Sygn server local to the resource (step 3).
1.4.2 The Sygn Language
The Sygn language introduces various elements for defining permissions and requests. Users are identified by a uid. A uid is a public key (Sygn follows the approach of SPKI and KeyNote in binding permissions to public keys).
Subject identifiers (SID) refer to a user (uid) or to a role (rid). SIDs are used to identify the owner(s) of an authorization certificate as well as the source(s) of authority of a resource.
³ In Norse mythology, Sygn is a goddess of truth, but also of doors and locks. She guards the entrance to the palace Wingolf and lets in only honest people.
Object identifiers (OID) refer to the different types of resources that permissions can apply to: files (fid), collections of files (fsid), hardware resources (resid), and roles when they are the object of a permission (roid).
Sygn defines actions (action) that a user can perform on a resource.
Using these elements, one can define a capability (CAPABILITY), which consists of an object (OID) and an action on this object. These capabilities are assigned to users in an authorization certificate (AC). In addition to the capability, it contains the uid of the creator of the certificate (CREATOR), the SID of the owner of the certificate (OWNER), validity limits (NOT_BEFORE, NOT_AFTER), a limit on the delegation depth (DELEGATION), restrictions (NOT_WITH) listing the roles that may not be used in the same request as this AC, and a digital signature by the creator of the certificate.
A chain of ACs (AC_CHAIN) aims to authorize a target capability (TARGET) for a given user. Together they form an authorization path (PATH).
A user (ISSUER) can submit to a Sygn server a request (SURF) asking for the validation of several target capabilities through several authorization paths contained in the request.
The Sygn language (in a slightly simplified version) is described by the following grammar, whose terminal symbols are uid, rid, fid, fsid, resid, roid, and action as defined above, as well as timestamp, which represents a date and time, integer_value, which represents an integer greater than or equal to zero, and signature, which represents a digital signature.
SID -> uid | rid
OID -> fid | fsid | resid | roid
CAPABILITY -> OID, action
CREATOR -> uid
OWNER -> SID
NOT_BEFORE -> timestamp
NOT_AFTER -> timestamp
DELEGATION -> integer_value
NOT_WITH -> rid | NOT_WITH, rid
AC -> CREATOR, OWNER, CAPABILITY, NOT_BEFORE, NOT_AFTER,
      NOT_WITH, DELEGATION, signature
    | CREATOR, OWNER, CAPABILITY, NOT_BEFORE, NOT_AFTER,
      DELEGATION, signature
TARGET -> CAPABILITY
AC_CHAIN -> AC | AC, AC_CHAIN
PATH -> TARGET, AC_CHAIN
ISSUER -> uid
PATHES -> PATH | PATH, PATHES
SURF -> ISSUER, PATHES
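To make the structure of the grammar concrete, it can be mirrored by a few immutable record types. This is only a sketch: the class and field names are a hypothetical Python rendering of the nonterminals, identifiers are plain strings rather than real public keys, and timestamps are plain numbers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Capability:
    oid: str     # fid, fsid, resid or roid
    action: str


@dataclass(frozen=True)
class AC:
    creator: str                    # CREATOR: a uid
    owner: str                      # OWNER: a SID (uid or rid)
    capability: Capability
    not_before: float               # timestamp
    not_after: float                # timestamp
    delegation: int                 # DELEGATION: integer >= 0
    not_with: Tuple[str, ...] = ()  # NOT_WITH: optional list of rids
    signature: bytes = b""          # placeholder; no real signing here


@dataclass(frozen=True)
class Path:
    target: Capability              # TARGET
    chain: Tuple[AC, ...]           # AC_CHAIN: one or more ACs


@dataclass(frozen=True)
class Surf:
    issuer: str                     # ISSUER: a uid
    paths: Tuple[Path, ...]         # PATHES: one or more paths
```

The optional NOT_WITH field corresponds to the two alternatives of the AC production: an empty tuple models the alternative without NOT_WITH.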
1.4.3 Sygn Metadata
A Sygn server requires a certain amount of metadata to operate. This metadata is stored in a database located close to the server. Since the information in this database is critical for the security of Sygn, access to it must be well protected. Sygn allows some of this metadata to be administered remotely. To do so, a user submits to the Sygn server an administrative command together with a certificate path that authorizes the execution of this command; the server verifies the permissions and then updates its database.
The most important metadata stored with Sygn servers are the sources of authority (SOA) of the resources (hardware and files) under the server's control. The SOAs are the roots of every delegation of rights on their resources. The identifiers of the SOAs are used by the decision engine to bootstrap the processing of a request.
For hardware resources (computing power and storage space), the Sygn server also stores the usage made by the various authorized entities, in order to enforce usage quotas. Measuring this resource usage data and transmitting it to the Sygn server is the responsibility of mechanisms external to Sygn.
Sygn also maintains a list of users banned from all access to a site (blacklist). The local administrator can thus exclude anyone who deliberately disrupts the operation of the system. Authorization certificates created by a person banned from a site are not recognized on that site.
To support the revocation of authorization certificates, Sygn also maintains a list of the identifiers of invalidated certificates. A certificate can be invalidated either by its creator or by the SOA of the capability it delegates.
If tracing is enabled, the Sygn server also records all requests submitted to it.
1.4.4 The Sygn Decision Algorithm
The Sygn decision algorithm processes certificate paths and decides whether they grant the right to the target capability. It is the core of the Sygn architecture. Broadly based on the principle of complete induction, it uses a global memory. We describe this algorithm informally by means of an automaton.
The input parameters of the algorithm are a capability target, composed of a target_action and a target_object, and a request_issuer for whom the path must authorize the capability target.
The algorithm uses the variable current_target, which may come to differ from target if target_object is added to a collection. current_target consists of a current_action, which is always equal to target_action, and a current_object, which is updated whenever current_object is added to a collection. The initial value of current_target is target.
The automaton representing the algorithm is shown in figure 1.1. It has three basic states and four intermediate states that handle the delegation of a role and the declaration of a role hierarchy. Transitions between states are driven by the next certificate in the path, whose function is described in the destination state. Transitions may be subject to additional conditions, which are indicated separately in the figure next to the transitions. The basic states are:
– the granting of the permission to use the capability current_target, and the associated state that the automaton enters if this permission is granted to a role.
– the addition of current_object to a collection, and the associated state that the automaton enters if the source of authority (SOA) of this collection is a role. The collection to which current_object is added becomes the new current_object. Collection hierarchies can be declared implicitly from this state of the automaton if current_object is already a collection.
– the granting of the permission to add current_object to a collection, and the associated state that the automaton enters if this permission is granted to a role.
The algorithm has three sets of conditions: the start conditions, the induction conditions, and the end conditions. For a certificate path to be valid, its first certificate must satisfy the start conditions, each consecutive pair of certificates must satisfy the induction conditions, and the last certificate must satisfy the end conditions.
Fig. 1.1 – Informal representation of the Sygn decision algorithm as an automaton. A transition to a state is triggered by a certificate. The text of a state indicates the nature of the certificate that triggered the transition; the texts adjacent to the transitions are either explanations or additional conditions. (Diagram not reproduced.)
The algorithm checks four cases at each step of the delegation chain:
– Simple delegation of the target capability.
– Activation of roles granting the permission to delegate the target capability (this may include the activation of a role hierarchy).
– Delegation of the permission to add the object of the target capability to a collection.
– Addition of the target object to a collection. The collection then becomes the new object of the target capability. The collection can itself later be added to other collections, thereby creating collection hierarchies.
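The start, induction, and end conditions can be illustrated for the first case only (simple delegation of the target capability). The sketch below is not the Sygn algorithm: it ignores roles, collections, and signature verification, and the rule that the delegation depth must strictly decrease along the chain is an assumed reading of the DELEGATION field. All names are hypothetical.

```python
from collections import namedtuple

# Minimal stand-in for an authorization certificate (hypothetical fields).
AC = namedtuple(
    "AC", "ac_id creator owner capability not_before not_after delegation")


def chain_is_valid(chain, target, issuer, soa, now, revoked=(), blacklist=()):
    """Check a chain of ACs for the simple-delegation case only."""
    if not chain:
        return False
    # Start condition: the first certificate is created by the SOA.
    if chain[0].creator != soa:
        return False
    for ac in chain:
        if ac.capability != target:
            return False
        if not (ac.not_before <= now <= ac.not_after):
            return False
        if ac.ac_id in revoked or ac.creator in blacklist:
            return False
    # Induction condition: each AC is created by the owner of the previous
    # one, and the remaining delegation depth strictly decreases (assumed).
    for prev, nxt in zip(chain, chain[1:]):
        if nxt.creator != prev.owner:
            return False
        if prev.delegation < 1 or nxt.delegation >= prev.delegation:
            return False
    # End condition: the last certificate is owned by the request issuer.
    return chain[-1].owner == issuer
```

The `revoked` and `blacklist` arguments stand for the revocation list and the list of banned users kept in the server metadata.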
1.4.5 Discussion
Sygn uses a push sequence to transmit permissions to the access control server. Since Sygn ACs are designed not only as short-lived permissions but can also be used to store permanent permissions, a revocation system is needed that can invalidate a permission before the AC that granted it expires. This drawback is inherent to push sequences and must be weighed against their advantages. A user can submit exactly the ACs needed to authorize a request, which allows the user to follow the principle of least privilege. In addition, the user can choose exactly which permissions are exposed to the various grid services.
Binding permissions to public keys has a drawback compared with approaches that use user names: if the private key corresponding to the public key is stolen, all permissions bound to the public key must be revoked. On the other hand, this scheme has the advantage of making signature verification easier and of avoiding the transfer of chains of authentication certificates to bind a user name to a public key. It also avoids the problems that can arise with homonyms (identical user names).
A central property of Sygn is its support for the decentralized creation of
permissions. Different SOAs can administer permissions at a very fine-grained
level, without the intervention of a third party. However, this property
implies that it is impossible to be sure of the complete set of permissions
granted to a user or a role. The results of permission review functions
(mandatory in the RBAC standard) are not necessarily complete. To guarantee
complete results, all permissions would have to be validated by a central
service, which would undermine the advantages of decentralized, ad-hoc
assignment of permissions. For this reason, permission review functions
depend on the goodwill of permission creators, who are expected to register
every permission they create for a role in a review system.
Another central property of Sygn is its delegation mechanism. Following the
approach of SPKI, we examined the following three options for controlling
delegation:
1. No control. Any user can delegate all the permissions assigned to them.
2. Boolean control. Each permission specifies whether or not the user has the
right to delegate it.
3. Delegation depth control. Each permission specifies over how many levels
it may be delegated.
The argument for the first option is that, if delegation is restricted, users
will share their authenticators in order to delegate while circumventing the
restrictions. According to this argument, imposing constraints on the
delegation of permissions is therefore detrimental to the security of the
authentication mechanism.
We chose not to follow this line of argument, because we believe that
educating users about security should prevent such aberrations. If the users
of a system have such bad security habits, no system will manage to protect
the resources on a grid against illicit access.
The argument for the two other options is that it may be necessary to
restrict the delegation of permissions for reasons of (legal) liability of
the SOA. If a delegated permission is abused, the SOA may be held partially
responsible, since it is at the top of the delegation chain. It can therefore
be important to distinguish whether a permission may be delegated or not.
The creators of SPKI argue that controlling the delegation depth gives no
real control over the proliferation of a delegated permission, because only
the depth of delegation can be controlled, not the number of delegations at
the same level. They therefore opt for boolean control.
We acknowledge that this argument is valid, but we nevertheless believe that
delegation depth control is preferable to boolean control. Depth control
restricts the depth of the delegation tree, which makes it easier to find
those responsible when a permission is abused. Moreover, grids often use
proxy mechanisms to create temporary authenticators from long-term
authenticators (see [93] for more details). With boolean delegation control,
a user would therefore have to be given the right to delegate all of their
long-lived permissions. This would make delegation control almost useless.
With depth control, delegation can be restricted to one level, which lets
users delegate their permissions to their proxies. For these reasons, we
chose delegation depth control for Sygn.
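As an illustration of delegation depth control, the following minimal Python
sketch checks a delegation chain. It is not Sygn's actual verification
algorithm: the `Cert` record, its field names and the function
`chain_is_valid` are hypothetical, and signatures, validity periods, roles
and collections are deliberately left out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Cert:
    """Hypothetical, simplified authorization certificate."""
    issuer: str   # public key of the party granting the permission
    subject: str  # public key of the party receiving the permission
    depth: int    # number of further delegation steps the subject may perform


def chain_is_valid(chain: Sequence[Cert], soa: str) -> bool:
    """Check that a chain starts at the SOA, links up, and respects depth."""
    if not chain or chain[0].issuer != soa:
        return False
    for prev, cur in zip(chain, chain[1:]):
        # Each delegator must be the subject of the previous certificate
        # and must still be allowed at least one delegation step.
        if cur.issuer != prev.subject or prev.depth < 1:
            return False
        # A delegate can never receive more depth than the delegator has left.
        if cur.depth > prev.depth - 1:
            return False
    return True
```

With a depth of one, a user can delegate to a proxy, but the proxy cannot
delegate any further, which is the configuration argued for above.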
Sygn supports RBAC for permission management, but also DAC-style permissions.
This duality makes it possible to use each type of permission in the
situations where it is most appropriate. If a complex permission structure is
in place, or if authorizations are task-based, Sygn allows the use of RBAC.
For creating ad-hoc permissions, or in similar situations where RBAC is too
cumbersome, Sygn handles DAC permissions, which are easier to create and use.
Finally, the structure of Sygn supports scenarios in which several
permissions are needed simultaneously. A simple example is the replication of
a file on the grid. Such an operation requires permission to read the file as
well as permission to use a certain amount of storage space at the
replication site. By using Sygn requests with several certificate paths, the
authorizations for such operations can conveniently be grouped in a single
access control request.
1.5 Encrypted Storage with CryptStore
To address the problem of protecting stored data and sharing the keys of
encrypted files, we designed the CryptStore system. We first present the
basic concepts of CryptStore. We then describe its architecture and its use.
The metadata of CryptStore is the subject of the following section. The
algorithms of CryptStore are then presented, and we conclude with a
discussion of the design choices made in CryptStore.
1.5.1 Basic Concepts of CryptStore
CryptStore allows a user who controls a file to encrypt it before storing it
on a grid. The metadata needed to process such an encrypted file is generated
automatically by a tool that is part of CryptStore and that can either be
integrated into a grid interface for general file management or be used
separately.
To allow encrypted files to be shared, CryptStore requires several key
servers to be set up. Users who want to access an encrypted file can submit a
request to the key servers to obtain the decryption key. To prevent the key
servers from themselves becoming attractive targets for attacks, a key to be
stored on the servers is split into several shares using Shamir's secret
sharing algorithm [90].
The file administration tool of CryptStore handles the tasks related to
encrypting the file, generating the key shares, and storing the shares and
the associated metadata on the key servers.
To access an encrypted file, CryptStore provides a tool that finds the key
servers in the metadata of the encrypted file, contacts the servers to
retrieve the key shares, reconstructs the key from the shares, and finally
decrypts the file.
Access to the key shares is controlled using the access permissions of the
files. The key servers therefore have a generic interface that allows them to
be integrated with the access control system of the grid. If the access
control system works in a decentralized manner, an instance of it can be
co-located with the key server.
1.5.2 Architecture of CryptStore
CryptStore requires three components to be deployed in order to be functional
on a grid: the file administration tool, the encrypted-file access tool, and
the key servers.
The administration tool handles the following functions:
– Encryption of the file on the user's machine.
– Optionally, creation of a message authentication code from the encryption
key to protect the integrity of the file.
– Generation of the key shares and their storage on key servers together with
the associated metadata.
– Storage, in the file header, of metadata that makes it possible to locate
the key servers and to configure the decryption algorithm.
– Update of the key shares and of the other metadata on the key servers when
the encryption is renewed.
The encrypted-file access tool handles the following functions:
– Extraction, from the header of the encrypted file, of the addresses of the
key servers that store shares of the decryption key.
– Submission of requests to the key servers to retrieve the key shares. The
user must take part in authentication and must also supply the permissions if
the access control system uses a push message sequence.
– Reconstruction of the key from the shares.
– Decryption of the file, including extraction of the configuration
parameters of the decryption algorithm.
The key servers provide the following services:
– Storage and update of key shares and of the identifiers of the files to
which these shares correspond.
– An access point for the file access tool, through which requests for key
shares can be submitted.
– A generic interface to the access control system of the grid, used to
determine which key shares a user may access.
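The three key server services listed above can be sketched as follows. This
is only an illustrative Python outline under stated assumptions: the class
`KeyServer`, its method names and the `AccessDecision` callback signature are
hypothetical and do not describe CryptStore's real interface. In particular,
gating `store_share` behind a "write" decision is an assumption of this
sketch. The point is that the server delegates every decision to a pluggable
access control hook instead of keeping its own permission list.

```python
from typing import Callable, Dict

# Hypothetical hook into the grid's access control system:
# (user, file_id, action) -> True if the action is authorized.
AccessDecision = Callable[[str, str, str], bool]


class KeyServer:
    """Stores one key share per file identifier (illustrative only)."""

    def __init__(self, decide: AccessDecision) -> None:
        self._decide = decide
        self._shares: Dict[str, bytes] = {}

    def store_share(self, user: str, file_id: str, share: bytes) -> None:
        """Store or update the share associated with a file identifier."""
        if not self._decide(user, file_id, "write"):
            raise PermissionError(f"{user} may not store a share for {file_id}")
        self._shares[file_id] = share

    def get_share(self, user: str, file_id: str) -> bytes:
        """Return the share only if file access is authorized for the user."""
        if not self._decide(user, file_id, "read"):
            raise PermissionError(f"{user} may not read the share for {file_id}")
        return self._shares[file_id]
```

Because the decision function is injected, the same server code could be
combined with a co-located decentralized access control instance or with a
remote authorization service.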
The use of CryptStore is illustrated in figure 7.5 on page 120. It proceeds
in seven steps. The first step is handled by the administration tool and
consists of encrypting the file and generating shares of the encryption key.
In the second step, the administration tool contacts different key servers to
store the key shares and the identifier of the associated file. In the third
step, the administration tool generates the metadata used to locate the key
servers and stores the encrypted file, with this information in its header,
on a storage server of the grid. The fourth step takes place outside
CryptStore and consists of the file administrator granting a user access
permissions to the file.
This user retrieves the encrypted file by virtue of these access rights and
uses the CryptStore access tool to find the addresses of the key servers in
the file header (step five). In the sixth step, the user contacts the
different key servers and retrieves as many key shares as are needed to
reconstruct the key. The seventh and last step consists of reconstructing the
decryption key with the access tool and decrypting the file.
1.5.3 CryptStore Metadata
CryptStore requires three categories of metadata: metadata on the parameters
of the encryption function (except the key), metadata for locating the key
servers of an encrypted file, and metadata on the key servers that allows
them to associate a key share with a file.
The first two types of metadata must be stored with the encrypted file and
may optionally also include the message authentication code used to verify
the integrity of the message.
The current design of CryptStore stores this information in the header of the
encrypted file. The size of this data is relatively small and therefore does
not increase the file size much. The encrypted file can then be handled like
a standard file by the storage systems of the grid. We are nevertheless aware
of applications where this scheme might not be usable: if the encrypted data
is stored in database tables of fixed size, adding the header information may
increase the size of the data beyond the fixed limit. In such a case,
CryptStore would have to be modified slightly to allow this metadata to be
stored outside the file. We return to this problem in the discussion.
If a decentralized access control mechanism is co-located with the key
servers, it may also be necessary to store the SOAs of the files for which
the server stores key shares. However, this metadata is managed by the access
control system and not by CryptStore.
1.5.4 The Algorithms of CryptStore
For file encryption, one first has to choose between a block cipher and a
stream cipher. In general, stream ciphers are faster than block ciphers.
However, it is not safe to reuse keys with a stream cipher. Since renewing a
key is fairly costly in CryptStore (all key servers concerned must be
contacted), it can be useful to be able to re-encrypt data with the same key.
We therefore chose AES, which is a block cipher. AES is the American standard
and is also used by the majority of non-American cryptographic products. We
use the cipher block chaining (CBC) mode, which hides repetitions across the
blocks of the encrypted file and allows random access to the blocks of the
file. To keep the file size constant in view of possible storage
restrictions, we also use the ciphertext stealing (CTS) technique to encrypt
the last block of the file (cf. figure 5.2, page 75).
To protect file integrity, we chose message authentication codes (MAC, not to
be confused with mandatory access control) rather than digital signatures.
Unlike a digital signature, a MAC uses a secret key to generate the message
authentication code (digest). A MAC is more practical for protecting the
integrity of a file that may be modified by several users. A digital
signature would require the signer's public key to be supplied in order to
verify it. With a MAC, we can use the same key that was used for encryption
to generate the authentication code of the file. The MAC algorithm we use is
HMAC, because it is standardized and used by many systems.
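The integrity check described above can be sketched with Python's standard
`hmac` module. This is a minimal illustration, not CryptStore's code: the
function names `file_tag` and `verify_file` are hypothetical, SHA-256 is an
assumed choice of hash function (the text only specifies HMAC), and whether
the tag is computed over the plaintext or the ciphertext is left open by
passing generic `data`.

```python
import hashlib
import hmac


def file_tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC tag over the file data using the file's secret key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_file(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(file_tag(key, data), tag)
```

Any user who holds the file key can both verify and regenerate the tag after
an update, which is why no signer's public key has to be distributed.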
The key shares stored on the key servers are created with Shamir's secret
sharing algorithm [90]. The algorithm lets a user choose two parameters: the
number n of shares to generate, and the number m (n ≥ m) of shares required
to reconstruct the secret. Any set of m shares generated in this way is
sufficient to reconstruct the shared secret. No set of fewer than m shares
gives any information that would reduce the complexity of an exhaustive
search for the secret. This principle is illustrated in figure 7.7 on page
124.
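The (m, n) threshold scheme can be illustrated with the following compact
Python sketch of Shamir's secret sharing over a prime field. It is a teaching
sketch, not CryptStore's implementation: the function names `split` and
`combine`, the choice of the Mersenne prime 2^521 - 1 as field modulus, and
the encoding of the key as an integer are all assumptions. It requires
Python 3.8 or later for the modular inverse via `pow`.

```python
import secrets
from typing import List, Sequence, Tuple

# Assumed field modulus: a Mersenne prime larger than any 256-bit key.
PRIME = 2**521 - 1

Share = Tuple[int, int]  # (x, f(x)) point on the secret polynomial


def split(secret: int, n: int, m: int) -> List[Share]:
    """Create n shares such that any m of them reconstruct the secret."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]


def combine(shares: Sequence[Share]) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, with n = 5 and m = 3, each of five key servers would hold one
share, and the access tool would need answers from any three of them.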
For more details on the cryptographic algorithms, we refer the reader to
[87].
1.5.5 Discussion
In this section we discuss the algorithmic and architectural choices made for
CryptStore.
Our application concerns the processing of medical data, which may include
very large radiological, tomographic, MRI and other images. We must therefore
take into account that the files handled can be very large. From this
perspective, stream ciphers outperform block ciphers. Conversely, block
ciphers allow data to be re-encrypted with the same key without compromising
security. Another important criterion is the ability of an encryption
algorithm to leave the size of the encrypted data unchanged. This property is
inherent to stream ciphers, since they process streams of bits (or of bytes
in the case of the RC4 algorithm) one at a time. For block ciphers, the same
property can be obtained by using the ciphertext stealing (CTS) mode for the
last block of the data.
To allow frequent updates of the encrypted data by several different users
without having to change the key each time, we chose a block cipher in CBC
mode. CBC also permits the use of CTS, which makes it possible to encrypt
data without increasing its size, should this prove necessary.
The decision to store the metadata on the encryption and on the key servers
in the header of the encrypted files was made so that encrypted files can be
treated like any other files from the point of view of the storage servers of
the grid. We are aware that this changes the size of the data, which can be a
problem if the data is stored in a database. Extending the current design of
CryptStore to allow external storage of this metadata would not pose a major
problem, since the grid architecture has to store metadata about the files on
the grid in any case. These metadata storage mechanisms could be used to
store the CryptStore metadata.
The decision to use a secret sharing algorithm for storing keys on the key
servers is motivated by the general paradigm of this thesis: to avoid, where
possible, third parties that can be a central point of failure. It seems
impossible to avoid a third party if we want to support the sharing of
collections of encrypted files by dynamic user groups. To limit the impact of
an attack on one of the key servers, we chose not to entrust them with whole
keys. Thanks to the properties of secret sharing algorithms we gain further
advantages: CryptStore is robust against a certain number of key server
failures, provided that the number of key shares created is greater than the
number needed to reconstruct the key. In addition, the stored key shares can
serve as a backup of the key in case the administrator of the encrypted file
loses it.
When using CryptStore, it is important to decide on a re-encryption policy.
If the permissions of a user who had access to the decryption key are
revoked, we cannot be sure that this user has not kept a copy of the key.
There are three ways to handle this case: the first is to do nothing and hope
that the access control system will prevent access to the file; the second is
to re-encrypt the file with a new key as soon as it is updated (lazy
re-encryption); and the third is to re-encrypt the file immediately with a
new key. Since we cannot prevent users who have had access to the file from
making copies and distributing them to unauthorized persons, we recommend
lazy re-encryption, which prevents a user whose rights have been revoked from
learning of updates to a file.
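The lazy re-encryption policy recommended above can be sketched as a small
piece of bookkeeping. The Python code below is a hypothetical illustration:
the `EncryptedFile` record, its `key_revoked` flag, and the two functions are
not part of CryptStore, and the actual encryption and the redistribution of
new key shares to the key servers are omitted.

```python
import secrets
from dataclasses import dataclass


@dataclass
class EncryptedFile:
    """Hypothetical bookkeeping record for one encrypted file."""
    key: bytes
    key_revoked: bool = False  # True once someone who knew the key lost access


def on_revocation(f: EncryptedFile) -> None:
    """Lazy policy: only mark the key; the stored file is left untouched."""
    f.key_revoked = True


def key_for_update(f: EncryptedFile) -> bytes:
    """Return the key to encrypt an update with, rotating it if needed."""
    if f.key_revoked:
        f.key = secrets.token_bytes(32)  # fresh 256-bit key
        f.key_revoked = False
        # New key shares would have to be pushed to the key servers here.
    return f.key
```

The old ciphertext stays readable with the old key, which matches the
observation that a revoked user may already hold a copy of the old contents;
only later updates are protected.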
This re-encryption problem could be avoided if users never had access to the
decryption key. This would require a decryption service. However, such a
service would be a trusted third party and a central point of failure. We
therefore decided to leave decryption on the machine of the end user of the
data, where it is handled by the CryptStore access tool.
The most important concept of CryptStore is the generic interface to the
access control service. The motivation for this approach is to keep the
access permissions on files consistent with the access permissions on the
keys that decrypt those files. We therefore chose to avoid adding a second
layer of access control. Interfacing with the dedicated access control
service lets us make access decisions that are consistent with the access
decisions for files on the grid. This approach requires file owners to also
be the source of authority (SOA) for every access control decision concerning
their files. If, as in the case of VOMS, the administrators of the storage
sites are the SOAs for file access, our approach brings no improvement in
security. Since one of the requirements we argue for in this thesis is access
control by owners, this constraint is compatible with the general approach we
pursue.
1.6 Sygn and CryptStore Integrated in a Grid
In this section we present our work on integrating our systems Sygn and
CryptStore into a real grid computing environment. As examples we chose two
grid architectures: µgrid, a minimal grid architecture developed by Johan
Montagnat and Diane Lingrant [88], and Globus Toolkit version 4, which offers
standardized OGSA and WSRF services.
1.6.1 µgrid
The µgrid architecture was created as a minimal grid architecture for testing
scientific applications on a grid. The basic idea of µgrid is to be simple to
install, configure and administer, which is not the case for the grid
architectures used at large scale for production applications.
The µgrid architecture consists of three parts. The user client lets users
access the grid. The farm manager is the entry point to the grid; it groups
resources and handles job scheduling, resource allocation and data
distribution. The computers that provide resources to the grid are controlled
by the third component, the host manager, which takes care of executing
computations and storing data. All communication between the components is
carried out over sockets, using a client/server architecture.
µgrid thus allows transparent sharing of resources while remaining simple to
use. As far as files are concerned, µgrid allows copying files from a local
disk to the grid and vice versa, replicating a file on the grid, and deleting
a file on the grid. A C++ API makes these file manipulation commands
available to software running on the grid.
Authentication is implemented using OpenSSL and a public key infrastructure
(PKI). Every user, farm and host has its own certificate, which enables
mutual authentication. The current version of µgrid assumes that there is a
single certification authority for the entire grid.
In its current version, µgrid does not scale well: the farm manager is
quickly overloaded if it is given too many resources to manage. For this
reason, the authors of µgrid plan to add a layer of servers above the farm
managers, which will also serve as new access points to the grid.
1.6.2 The OGSA and WSRF Standards
The Open Grid Services Architecture (OGSA) is a standard developed by the
Global Grid Forum (GGF). OGSA is intended as a common architecture for
applications running on a computing grid. OGSA requires a distributed
computing service architecture in order to be implemented. This architecture
is realized with web services.
Web services use the WSDL language to describe and publish the interfaces
they expose on a network. The communication protocol generally used is SOAP.
The W3C has defined web services as stateless. Consequently, pure web
services do not meet the specifications of the OGSA standard, which calls for
services that can manage state information. For this reason, the OASIS
consortium developed the Web Services Resource Framework (WSRF). WSRF
specifies how web services can be augmented with state information.
The OGSA standard addresses access control only very briefly (see [76]),
stating that each domain will generally have its own authorization service,
and that the authorization model of a grid should therefore be based on
emerging standards such as XACML, SAML and WS-Authorization in order to
guarantee interoperability.
1.6.3 Integration of Sygn in a Grid
To make Sygn independent of the grid architecture on which it is used, we
chose the following approach: an integration module is responsible for
enforcing the decisions of Sygn on users. This module acts as an agent
between the user and the resource. For each request on the grid, the user
generates a Sygn request that authorizes the grid request. The integration
module receives the requests, separates them, and submits the Sygn request to
the Sygn decision algorithm. If the answer is positive, the integration
module verifies that the Sygn request indeed comes from the same user who
submitted the grid request. To do so, the integration module must interact
with the authentication mechanisms of the grid to obtain the public key of
the user. The integration module also verifies that the resources concerned
and the actions requested on them match between the grid request and the Sygn
request. If these checks succeed, the integration module forwards the request
to the resource for processing.
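The sequence of checks performed by the integration module can be sketched as
follows. This Python outline is an assumption-laden illustration, not the
module implemented for µgrid: the `GridRequest` and `SygnRequest` records,
their fields, and the `sygn_decide` callback standing in for the Sygn
decision algorithm are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GridRequest:
    user_key: str  # public key obtained from the grid's authentication layer
    resource: str
    action: str


@dataclass(frozen=True)
class SygnRequest:
    requester_key: str  # public key that the Sygn request is bound to
    resource: str
    action: str


def enforce(grid_req: GridRequest,
            sygn_req: SygnRequest,
            sygn_decide: Callable[[SygnRequest], bool]) -> bool:
    """Return True if the grid request may be forwarded to the resource."""
    # 1. Ask the (stand-in) Sygn decision algorithm.
    if not sygn_decide(sygn_req):
        return False
    # 2. Both requests must come from the same authenticated user.
    if sygn_req.requester_key != grid_req.user_key:
        return False
    # 3. Resource and action must match between the two requests.
    return (sygn_req.resource, sygn_req.action) == (
        grid_req.resource, grid_req.action)
```

Keeping the decision function as a parameter mirrors the goal stated above:
the grid-specific glue stays in the integration module, while the decision
logic remains independent of the grid architecture.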
Such an integration module for Sygn and the µgrid architecture was
implemented by Didier Oriol during his final-year project at INSA Lyon. This
module allows Sygn to be used for access control to files in µgrid, with the
functions described above.
To integrate Sygn into a standardized OGSA grid, the question is whether Sygn
needs to be given a web service interface. Since Sygn is designed to be
co-located with the resources it controls, it seems possible to have the
resources interact locally with the integration module instead of
implementing a web service interface. Should such an interface prove
necessary, however, implementing it would pose no problem. The Sygn decision
engine is stateless and could therefore be implemented as a plain web
service, without having to take the WSRF extensions into account. Since all
communication with Sygn is already encoded in XML, it would only be necessary
to define a WSDL description of the interfaces and to generate the code for
communication over SOAP. Numerous tools exist for generating this code from a
WSDL description, for example the gSOAP web service generation tool (see
footnote 4). We plan to carry out such an integration in future work.
1.6.4 Integration of CryptStore in a Grid
To make CryptStore available on a standardized OGSA grid, the key servers
must be given a grid service interface. Since the key servers are stateless,
they can be implemented as plain web services. Requests to and responses from
the key server are already encoded in XML, so it would only be necessary to
write a WSDL description of the interfaces and to generate the code for
communication over SOAP. This can be done as described in the previous
section.
In addition, the generic interface between the key servers and the access
control service must be instantiated. We created such an instantiation to use
Sygn in conjunction with CryptStore. In this approach, a Sygn server can be
co-located with the key server. The Sygn server stores the sources of
authority for the encrypted files for which the key server stores key shares.
Using this information, the Sygn server can make access control decisions for
the key shares without having to consult an external service.
The administration and file access tools encapsulate a Sygn request in each
CryptStore request, according to the actions the user wants to initiate. The
interface of CryptStore to the access control service here performs the
functions of the integration module discussed in section 1.6.3.
A working version of CryptStore has been implemented with an interface to
Sygn access control. The software can be downloaded from
http://liris.cnrs.fr/~lseitz.
1.7 Conclusion
In this thesis we examined the use of computing grids for medical
applications from the point of view of data security. We showed that
classical solutions are not all directly usable, because of the specific
characteristics of computing grids. Since the central problem of medical
applications is data confidentiality, we chose to examine access control.
4 Available at http://www.cs.fsu.edu/~engelen/soap.html
Based on a set of use cases related to deploying medical applications on
computing grids, we presented a list of requirements and constraints founded
on sound security principles, on the nature of grid architectures, and on the
specifics of medical applications. The most important points are
decentralized administration, traceability and encrypted data storage.
The need for encrypted storage to complement access control stems from the
fact that, without data encryption, access control can be bypassed by users
with physical access to the storage hardware. Conversely, encrypted storage
forces any access to the file to go through access control.
In light of these considerations, we examined the state of the art in
distributed access control and in access control for grids. We found that
none of the systems meets all our requirements, even if the need for
encrypted storage is left aside.
We then examined the state of the art in encrypted storage. Our main interest
was to analyze how the sharing of keys is handled when those keys encrypt
data that is updated often and shared by user groups whose membership changes
frequently. We found that the encrypted storage systems that support key
sharing do not provide satisfactory support for dynamic groups. Moreover,
most of these systems implement a specific access control mechanism for keys,
thereby creating a redundant layer of access control to files, which can lead
to inconsistencies.
Our first contribution, the access control system Sygn, is
designed for decentralized permission management. To this end, Sygn
implements a decentralized concept of roles and file collections,
based solely on authorization certificates. Decentralized
permission management is also supported by delegation mechanisms based on
chains of authorization certificates. This decentralization also
led us to minimize the access control information that must
be present at the decision points. Most of the information needed
for access control decisions is supplied by the users requesting access, who present certificate chains that authorize the
access. The decision points only need to know the sources of authority for each resource they control. This decentralization helps
the system to scale, but above all it reduces the impact of a successful attack on an access control server, since only the local
resources are exposed.
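The following minimal sketch illustrates the idea of a decision point that knows only the sources of authority for its resources and checks a certificate chain presented by the requester. It is not Sygn's implementation: the `AuthzCert` fields, the function name and the example identities are hypothetical, and signature verification is assumed to have been done beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthzCert:
    """Hypothetical authorization certificate (signature check omitted)."""
    issuer: str         # who grants the permission
    subject: str        # who receives it
    resource: str       # resource the permission applies to
    action: str         # e.g. "read"
    may_delegate: bool  # whether the subject may pass the permission on


def chain_authorizes(chain, requester, resource, action, sources_of_authority):
    """Return True if the chain links a known source of authority to the requester."""
    if not chain:
        return False
    # The decision point only knows who is the source of authority per resource.
    if chain[0].issuer not in sources_of_authority.get(resource, set()):
        return False
    for i, cert in enumerate(chain):
        if cert.resource != resource or cert.action != action:
            return False
        if i == len(chain) - 1:
            # The last certificate must name the requester.
            if cert.subject != requester:
                return False
        else:
            # Intermediate links must permit delegation and connect to the next issuer.
            if not cert.may_delegate or cert.subject != chain[i + 1].issuer:
                return False
    return True


sources = {"scan42": {"patient"}}
chain = [
    AuthzCert("patient", "clinic", "scan42", "read", True),
    AuthzCert("clinic", "dr_a", "scan42", "read", False),
]
print(chain_authorizes(chain, "dr_a", "scan42", "read", sources))
```

In this sketch the decision point stores nothing about `dr_a` or `clinic`; everything except the source of authority arrives with the request.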
Sygn also offers integrated tracing functions and can be
configured to guarantee the non-repudiability of requests, which can then be
used as evidence in an audit. By integrating traceability into
access control, Sygn makes it easy to deploy both functions. Access control is an ideal place to collect tracing
information, since all requests in a system have to pass through it.
Our second contribution, CryptStore, complements
access control by protecting data against circumvention of the
access control system. CryptStore allows users to store
their data in encrypted form and to share the decryption keys
with authorized users. Since the decryption key is needed
to access encrypted files, a user who accesses
the storage medium directly cannot learn the
content of the files.
To keep permissions consistent between files and the keys used
to decrypt them, CryptStore uses the access control mechanisms of the
Grid to decide which user gets access to a key. For this purpose CryptStore
has a generic interface that can be adapted to the access control system
present on the Grid.
Since the keys themselves are valuable data, no key server
stores a complete copy of a key. Keys are divided into parts,
generated by a secret sharing algorithm and distributed over several
key servers. Because redundant key parts can be created,
CryptStore is robust against the failure of one or more key
servers.
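As an illustration of splitting a key into redundant parts, the sketch below uses Shamir's threshold scheme over a prime field. This is an assumption made for the example: the text above does not state which secret sharing algorithm CryptStore uses, and the function names and parameters are hypothetical.

```python
import secrets

# 2**521 - 1 is a Mersenne prime, large enough to hold a 256-bit key.
PRIME = 2**521 - 1


def split_key(key, threshold, shares):
    """Split `key` into `shares` parts; any `threshold` of them recover it."""
    if not 0 <= key < PRIME:
        raise ValueError("key out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the key.
    coeffs = [key] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    # One point per key server, at x = 1, 2, ..., shares.
    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_key(points):
    """Recover the key by Lagrange interpolation at x = 0."""
    key = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        key = (key + yi * num * pow(den, -1, PRIME)) % PRIME
    return key


key = secrets.randbits(256)
parts = split_key(key, threshold=3, shares=5)   # five key servers, any three suffice
print(recover_key([parts[0], parts[2], parts[4]]) == key)
```

With a threshold of three out of five parts, two key servers can fail without making the key unrecoverable, and no single server holds enough information to reconstruct it.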
As future work on Sygn, we plan to integrate access control mechanisms for databases. This would make it possible to expose a database on the Grid while controlling access to the data
it contains, independently of the architecture of the
database used. Following the same general direction, we also plan
an extension of Sygn for access control to the elements of an
XML document.
Another interesting question that we will examine is the legal
implication of Grids for the processing of personal data (cf.
chapter 3.5). For this question, we will cooperate with legal
experts to determine the legal conditions of use and to validate
that our technical solutions meet these conditions.
Chapter 2
Introduction
Resource sharing has always been a central issue in computer science. At
first crude time-sharing protocols were used, where a user had to reserve
computation time on a central machine, and submitted his program written
on punch cards to the operators of the computer. Then the Internet was
created as an architecture to share data resources, which is still its main use
today.
However the use of the Internet is not limited to sharing data. Other
resources such as computing power and storage space can also be shared,
using Internet technologies.
A problem that is often encountered when trying to share resources over
the Internet is that the heterogeneous systems used to exploit the resources
are not able to interoperate. Even if the same operating system and application software are installed, minor differences in the configuration can cause
major problems when one tries to use distant resources. Often painstaking
manual configuration and detailed knowledge of the specifics of the distant resources are required to make them work.
Grid computing [48] offers a framework that facilitates the sharing of
resources and that aims to overcome interoperability problems. Grids provide
a common resource sharing platform that handles the discovery, allocation
and use of resources for the user in a transparent way. Grids support the
sharing of resources such as sensors, computing power, storage capacity and
data.
The first Grid applications were compute-intensive applications
such as particle physics and terrestrial observation. In these applications computing power and storage capacity are of paramount importance. Security aspects, especially those related to data security, are of lesser concern. As Grids move
on to biomedical research projects, such as comparative genetics, data security aspects have gained more importance. Recently Grids have been identified
as a possible architecture to support health-care networks [70]. In health-care
applications, in addition to its role as a provider of raw processing power, the
Grid allows data resources of various formats to be shared across organizational
boundaries.
As more and more resources are exposed to the Internet, a major problem is to protect these resources against unauthorized use, while allowing
authorized users to access them even when connecting from a distant site.
The most important resources that need to be protected are clearly data.
Data are the main asset of most applications, and their misuse has the most dire
consequences. This is especially true for personal medical data. Uncontrolled
disclosure of medical data can make it impossible for the person concerned to
find employment or even to obtain medical insurance. The medical community has
seen both the advantages and the risks of information technology for health
care [82]. The use of Grids in health-care could improve treatments in several
ways, for example through a significant speed-up of complex
image analysis, or by making all medical data available even
when they are stored at geographically distant sites on heterogeneous systems. However, we also have to guarantee to users that adequate privacy
protection is ensured, or this new technology will never be accepted.
This thesis provides an architecture that controls access to resources in
a Grid environment with a special focus on the protection of data resources.
The framework for this research is provided by the application of Grids for the
creation of health-care networks and the specific data protection requirements
that arise from these applications.
2.1
Security aspects of resource sharing on a
Grid
The full spectrum of security-related issues is applicable to Grids in general
and to the use of Grids as a framework for health-care networks. The issues
that need to be addressed include:
• Authentication of entities using the Grid.
• Authorization of actions performed on the Grid, especially access to
resources on the Grid.
• Confidentiality of communications within the Grid.
• Confidentiality of data stored on the Grid.
• Integrity of data stored on the Grid.
• Auditing, Accounting and Non-repudiability.
• Intrusion detection and other passive defense methods against attackers.
• Robustness against errors, break-downs and malicious actions, such as
denial-of-service attacks.
Authentication deals with the ways of proving, possibly mutually, the
identity of communicating entities. Authentication raises interesting technological problems in Grids related to the requirement of single-sign-on that
states that a user should only have to authenticate once when using a Grid.
Most implementations of Grid architectures favor certificate based authentication using a public key infrastructure (PKI).
Authorization deals with deciding who is allowed to use which resource and in what way. The term access control refers to all methods that enforce
authorization decisions.
Confidentiality is the protection of data on an insecure medium against
accidental or malicious disclosure to third parties. Basically we have to differentiate between confidentiality of communications (i.e. confidentiality of data
being transmitted over a network), and confidentiality of storage (i.e. confidentiality of data on a storage medium). The main difference is the lifetime
of the data, which is very short in communication compared to storage.
Integrity of data refers to the protection of data against unauthorized
modification. As a full integrity protection of data on re-writable media is not
possible, most algorithms deal with making violations of integrity detectable
to users. Integrity checking is a secondary topic of this work and is therefore
only given limited consideration.
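A minimal sketch of making integrity violations detectable is a keyed digest stored alongside the data and rechecked on every read. The use of HMAC-SHA256 and the function names are assumptions chosen for illustration, not the mechanism used in this work.

```python
import hashlib
import hmac


def make_tag(data: bytes, key: bytes) -> bytes:
    """Compute a keyed digest over the data (HMAC-SHA256, an illustrative choice)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def is_unmodified(data: bytes, key: bytes, tag: bytes) -> bool:
    """Return True if the data still matches the tag computed at write time."""
    return hmac.compare_digest(make_tag(data, key), tag)


key = b"hypothetical integrity key"
stored = b"radiological image bytes"
tag = make_tag(stored, key)

print(is_unmodified(stored, key, tag))          # unchanged data
print(is_unmodified(stored + b"x", key, tag))   # modified data
```

The check does not prevent modification of the stored bytes; it only lets a reader who holds the key notice that a modification has taken place.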
Auditing systems collect data that make it possible to review the actions of all entities
concerning the Grid's resources. Auditing can be valuable for post-mortem
analysis in a case of suspected or actual misuse of resources. Auditing is
closely related to accounting, which uses auditing systems for measuring and
possibly invoicing the use of resources. The third element of this group, non-repudiation, is a
more restricted form of auditing, in which the audit data prevent
an entity from denying actions it has undertaken. Non-repudiation uses techniques
such as digital signatures in order to bind actions to the users who initiate
them.
The term intrusion detection refers to all activities aimed at an early discovery of unauthorized access to resources. The goal of intrusion detection is
to limit the damage done by a successful attack by reducing the time an attacker remains undetected. Among others, intrusion detection uses integrity
protection techniques in order to achieve its goals.
Robustness of systems, especially against malicious actions, has become a
general requirement of networked services. Measures for ensuring robustness
include redundancy of critical systems and data and cross-checking of critical
information. Robustness is therefore an aspect that has to be considered in all the other
topics above.
In this thesis we mainly address authorization and access control to resources, as we believe that these are the scientifically most challenging parts
of Grid security. We also show how confidentiality of storage is closely linked
to data access control, as a way to protect data against the circumvention of
the access control mechanisms. As it is practical and very easy to integrate
together with confidentiality of storage, data integrity protection is also given
limited consideration in this context. Auditing and non-repudiation are also
well suited to integration with authorization services, because
all actions must pass through the authorization system; it is
therefore useful to co-locate authorization and auditing.
Authentication is not an issue studied in this thesis. We assume that a working
PKI is available that allows easy and secure authentication of
the Grid users. These authentication mechanisms are a necessary prerequisite
for authorization mechanisms.
The confidentiality of communications and intrusion detection are not
issues for which Grids pose novel security challenges. As Grids use normal
network communication mechanisms, existing protocols such as TLS/SSL or
IPsec can be used to ensure the confidentiality of communications. Intrusion
detection is a task related to a closed system and is therefore not applicable to
Grids as a whole. In the closed components of a Grid, conventional intrusion
detection measures can be used.
Robustness is not specifically addressed in this work, but rather considered as a requirement for every aspect of the other approaches.
2.2
Why Grids pose novel security challenges
A large amount of previous work exists in the field of computer security. Some
robust and widely tested standards have been created for which numerous
support tools exist. We therefore have to consider how many of
these tools we can reuse, which ones we can adapt to make them usable on
Grids, and where we have to develop new ones. In order to do this, we have
to examine the specifics of Grid computing architectures and applications.
Grids are generally used by large, dynamic cross-organizational communities. In contrast to classical user communities, these are not centrally administered, and therefore the use of centralized authentication mechanisms
is not possible. Currently, public key infrastructures (PKI) are being investigated
as a method to provide cross-organizational authentication. However, many
organizational and technical problems still remain [55].
Resources offered on a Grid are subject to dynamic changes that cannot
be centrally predicted. For example, clusters assigned to a Grid may be taken
offline for maintenance or simply in order to use them for non-Grid activities. These resources are geographically and organizationally distributed and
consist of heterogeneous hardware, software and data formats. Therefore it is
infeasible to handle authorizations that allow access to these resources centrally. Furthermore, a common format for communicating authorization and
authentication information is needed that works for all of these resources
and that can be adapted to follow the dynamic changes in availability.
Several important aspects of Grid security come from the deployment of
applications over a decentralized architecture. First, there is no central point
of access to a Grid, which requires a decentralized mechanism for authentication and authorization. The entry point of a user rarely corresponds
to the point where access to the used resources is controlled. Therefore
mechanisms have to be established that allow users to transfer authorizations
from their Grid entry point to the resource that is to be used.
Grid resources are subject to different, sometimes overlapping security
policies, that need to be combined to make them work together. Therefore
the security architecture must provide mechanisms to resolve conflicts between security policies and must allow local administrators to apply their
policy to their resources. A special problem is posed by the data, as with the
transparent nature of storage, files may be stored outside the owner’s home
domain. Nevertheless data owners want to be able to control access to their
data. This requires a fine-grained access control, allowing users that are not
previously known on a system to specify permissions on files that have been
stored on this system.
Using Grids for biomedical applications poses further application-specific
security problems. These problems will be discussed in chapter 3.
2.3
Outline
The remaining part of this thesis is organized as follows. Chapter 3 presents
use-cases and motivates the design goals of our approaches with them. In
chapter 4 we present related work in the domain of access control. Chapter
5 explores the related work in the domain of encrypted storage. We present
our access control architecture Sygn in chapter 6. In chapter 7 we present our
encrypted storage architecture CryptStore. Chapter 8 analyzes how both
architectures can be integrated in Grids, and finally chapter 9 draws conclusions
and presents future work.
Chapter 3
Motivation
In this chapter we examine the resource sharing scenario presented in the
introduction in relation to security aspects. We present some realistic use-cases and threats within this scenario and use them to derive constraints
and requirements for an effective and secure approach to access control for
confidential data in a Grid environment. These requirements are divided into
three thematic groups: general principles of good security, specifics of the
Grid environment and specifics of applications dealing with confidential data.
3.1
Use-Cases
1. A medical doctor treats several patients and therefore has access to
their medical files, including radiological images stored at a distant
clinical database. The doctor wants to use Grid resources to perform a
computation intensive image analysis on a specific radiological picture
of one of those patients. To do this, the doctor uses a Grid interface
to launch an application that performs the necessary operations on his
behalf using his access rights to download the picture from the clinical
database.
In order to speed up the processing, the picture is replicated on Grid
storage resources, near the processing units. The computation process
accesses this replica using the doctor’s permissions.
As the doctor is legally liable for the confidentiality of the data he
uses, he wants to have to trust as few entities as possible in the data
processing.
During the processing of the picture, a Grid storage resource on which
a replica of the picture is stored breaks down. It is disconnected from
the Grid for maintenance and brought online again several weeks later.
At this point in time, the patient whose picture is still on this storage
resource has changed his doctor and the former doctor should not have
access to the picture replica anymore.
Since the patient suspects a misuse of his medical files by this former
doctor, he wants detailed information on how this doctor used his
files.
2. A clinic employs several medical doctors, who are each responsible for
several patients. These doctors also work at the clinic’s research laboratory, which cooperates with several other research centers for studies
on genetically induced diseases. The clinic wants to enforce different
permissions for the same doctor depending on the task he or she is currently executing in order to prevent accidental disclosure of confidential
patient data within the research department.
In order to effectively share resources within cooperative research
projects, the research centers have installed a Grid architecture and
formed virtual organizations (VO) to manage the Grid resources allocated to each project. Each VO distributes permissions related to its
resources using a server, hosted by one of the participating centers.
The clinic employs a significant number of trainees and assigns some of
them to a workpackage of one of the common projects. These trainees
change frequently and require a standard set of access rights to the
resources of the VO. The tasks assigned to the trainees are not subject to major changes and therefore the permissions related to those
tasks remain relatively similar. The clinic wants to administrate these
permissions in an effective and simple way.
The centers providing computing and storage resources to the VO have
agreed to provide those resources via the Grid, on condition that they
have the final administrative power to decide who may access them and
especially that they are able to deny access to troublemakers.
As the project evolves during the lifetime of the VO, more personnel get
involved and the resources administered by the VO are considerably
expanded.
3. A tourist falls ill during his holidays. He goes to a local health-care
center and wants to give some local doctors access to a part of his
medical files so they can gather the necessary background information
for effective treatment.
The local health-care center has a medical information system that is
incompatible with the systems on which the tourist’s data are stored.
However it is connected to an international health-care network based
on a Grid architecture and shares medical data using this network. This
architecture should allow the tourist to give the doctors at the center
authorization to access his files through the Grid. This authorization
should be effective immediately.
The center employs a small software company as subcontractor to maintain its medical application software. The employees of this company
have administrator access to the center’s hard disks used as Grid storage resources. However, the health-care center does not want them to
be able to read the patient data on those disks.
3.2
General principles of good security
This section presents the requirements we found for our application with
respect to some general principles of good security. We motivate each requirement in the context of the use-cases we have presented before.
• S1 Least privilege: Since a malicious or faulty application that executes
actions on behalf of a user could misuse his permissions, it should be
possible to control the extent of permissions that are used for a specific
action. In the first use-case, if the process that performs the analysis
on the radiological picture had the medical doctor’s full permissions, it
could access other patients’ data and breach their confidentiality. Grid
users will most likely not have the technical knowledge or the time
to check if software applications they use have no hidden malicious
functionality. Therefore this requirement can help to reduce the damage
done by such an application.
• S2 Permission consistency: A permission on a data resource should be
applicable to any replica of these data. While identical
copies of the same data may carry different permissions, depending on
the context in which they are used, replicas are created automatically
by the Grid middleware in order to speed up the access for applications
using this data. For such replicas the access control system should not
produce inconsistent access control decisions, where the same permissions give access to one replica of a file and do not give access to another
replica of the same file. In the first use-case (second paragraph), the
new replicas of the picture should be accessible for the process that is
to perform the computation on them. The process should not need any
new permissions to access this replica, since those would require external intervention. Furthermore the replicas of the picture should not be
accessible to any user that does not have the permission to access the
original picture.
• S3 Minimize the use of (trusted) third parties: In any distributed computing scenario, there are typically two actors: the one who requests
an operation and the one who executes it. For every additional actor
introduced between those two, the risk of something going wrong
increases. Creating a trustworthy third party in a security-critical protocol requires a lot of effort and may often not be worthwhile. It is
therefore best to avoid the need for trusted third parties where possible in the design of a security protocol. In our first use-case (third
paragraph), the doctor will be more reluctant to use the Grid if he or
she has to trust several third parties (for example a centralized permission authority).
• S4 Separation of duties: Users tend to have multiple permissions that
are not necessarily all related to the same task. Unexpected combinations of permissions may enable users to commit fraudulent or erroneous actions that they should not be able to perform. However, each of
those permissions by itself may be necessary for some task assigned
to the user. Therefore an access control system should make it possible to
restrict which permissions may be used simultaneously. In our
second use-case (first paragraph) a doctor may accidentally copy confidential patient data to a publicly accessible server at the research
laboratory. Separation of duties could help to avoid this, for example
by preventing doctors from accessing confidential patient data, while
using publicly available resources of the research laboratory.
• S5 Secure permission storage: Permissions need to be highly accessible.
The more permissions are stored together, the higher the value of such
storage sites as targets for attacks. If an access control system has no way
to verify the integrity of the permissions it uses, it becomes vulnerable
to undetected modifications attackers may have made. In our second
use-case (second paragraph) the cooperating centers forming the VO
for the common project need a way to protect the permissions related
to resources allocated to the VO otherwise the permission distribution
server becomes a considerable security risk.
• S6 Avoid centralized security services: Centralized services scale badly,
because they often become a bottleneck and a single point of failure,
when the workload increases. Moreover they represent a security vulnerability, since attackers can disrupt the system by targeting those
centralized services with denial of service attacks. Also centralized security services often hinder decentralized resource sharing. In our third
use-case (first paragraph), the tourist might have trouble granting the
foreign doctor access to his medical files if this would involve a centralized access control service, since the service may be down or have
a long response time.
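Requirement S4 can be sketched as a table of permissions that must never be active at the same time, checked whenever a user tries to activate a further permission. The permission names and the function below are hypothetical and only illustrate the constraint from the second use-case.

```python
# Hypothetical constraint table: each entry is a set of permissions that
# must never be active simultaneously for the same user.
MUTUALLY_EXCLUSIVE = [
    frozenset({"read_patient_record", "write_public_research_server"}),
]


def may_activate(active, requested, constraints=MUTUALLY_EXCLUSIVE):
    """Return True if `requested` may be used alongside the `active` permissions."""
    for group in constraints:
        # Conflict: the request belongs to a group and another member is active.
        if requested in group and (group - {requested}) & set(active):
            return False
    return True


session = {"read_patient_record"}
print(may_activate(session, "write_public_research_server"))  # conflicting request
print(may_activate(session, "run_image_analysis"))            # unrelated request
```

Each permission remains usable on its own; only the combination that would allow patient data to reach a public server is refused.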
3.3
Constraints of the Grid environment
This section presents the requirements for our system with regard to the
specifics of the Grid environment. As before we motivate each requirement
in the context of the use-cases we have presented in section 3.1.
• G1 Handle ad-hoc permissions within dynamic user communities: In
Grid environments resources are shared by cross-organizational communities that form virtual organizations (VO). These VOs can have a
high fluctuation in members, and short-term cooperations can require
spontaneous access to some resources. The Grid access control system
must be able to handle these situations for example by supporting flexible delegation mechanisms. In our second use-case (third paragraph)
the clinic’s trainees work on short term projects related to the global
goal of the VO. Frequently these trainees require an access to the Grid
resources on short notice, in order to perform tests or calculations.
This will require permissions to be created or modified. Such permission management should not be hindered by complex processes for the
creation of new users or the need for intensive administrator intervention. The access control service should be able to handle single users
and resources as well as groups of users and sets of resources.
• G2 Manage dynamic resource availability: In a Grid environment, resources are subject to dynamic availability. Services may break down at
any time or the connection to those services may become interrupted.
The Grid access control system must handle such outages gracefully,
even if a component of the access control system itself becomes unavailable. In our first use-case (paragraph four) this means that the data
access control should not solely rely on locally stored permissions, since
those may become invalid during an offline period.
• G3 Integrate heterogeneous environments: A Grid is intended to bring
together resources originating from computers using different operating
systems and application software configurations. These resources are
transparently available through the Grid regardless of the underlying
system. Therefore the Grid access control should require a minimum of
specific application software to be deployed on the Grid elements and
should provide a maximum of openness. In our third use-case (second
paragraph) the health-care center should still be able to gain access
to the patients’ files, even though it operates a medical information
system that is not compatible with the one where the patients’ data
are stored.
• G4 Enable local control of hardware resources: Most system administrators would not agree to provide local hardware resources on a Grid
if that meant giving up their administrative power over those
devices. Therefore a Grid access control system must enable resource
providers to control access to their resources (i.e. when the resource
may be used, in which way and by whom). This need is illustrated in
use-case two (paragraph four).
• G5 Transparency of the data resource location: Since the storage location of files is transparent for the Grid user, data access control permissions should not depend on the storage location of the data either.
This requirement is somewhat parallel to requirement S2 and therefore
concerns the same use-case, however whereas in S2 the emphasis is on
replicas, here it is on the storage location of the data.
• G6 Scalability of the access control system: A Grid is designed to provide a huge amount of resources to a vast community of users. Solutions that work well on a small scale may fail, when used at large
scale. Therefore the Grid access control system needs to be applicable
even in intensive usage scenarios involving lots of resources and users.
Generally this implies that it should not rely on centralized services
(see requirement S6) and that the system can be expanded easily by
adding decentralized components. Use-case two (last paragraph) illustrates that the level of required scalability can not always be estimated
correctly in advance and that the system must provide enough flexibility to deal with such situations.
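Requirements S2 and G5 both suggest attaching permissions to a logical file identifier rather than to a storage location. The sketch below assumes a simple replica catalog that maps logical names to physical locations; the catalog layout, host names and function names are hypothetical.

```python
# Hypothetical replica catalog: logical file name -> physical replica locations.
REPLICA_CATALOG = {
    "lfn:/clinic/scan42": [
        "se1.example.org:/data/a1",
        "se2.example.org:/cache/77",
    ],
}

# Permissions are keyed by (logical file name, user), never by location.
PERMISSIONS = {("lfn:/clinic/scan42", "dr_a"): {"read"}}


def logical_name(physical_location, catalog=REPLICA_CATALOG):
    """Map a physical replica location back to its logical file name."""
    for lfn, replicas in catalog.items():
        if physical_location in replicas:
            return lfn
    return None


def may_access(user, physical_location, action,
               catalog=REPLICA_CATALOG, permissions=PERMISSIONS):
    """Decide access on the logical name, so all replicas give the same answer."""
    lfn = logical_name(physical_location, catalog)
    if lfn is None:
        return False
    return action in permissions.get((lfn, user), set())


print(may_access("dr_a", "se1.example.org:/data/a1", "read"))
print(may_access("dr_a", "se2.example.org:/cache/77", "read"))
print(may_access("dr_b", "se2.example.org:/cache/77", "read"))
```

Because the decision never looks at the storage site, a replica created by the middleware inherits the original file's permissions without any additional administrative step.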
3.4
Constraints of the application
Since the predominant application examples in this thesis are Grid-based
health-care networks and Grid-based medical research, we examine the specific requirements of these applications in the present section. Several of our
conclusions apply more generally to applications that require sharing of confidential data on a Grid or on another distributed architecture. Again we take
up the use-cases from section 3.1 to illustrate our points.
• A1 Role based access control: A user working on a medical application
has structured tasks and requires permissions related specifically to
those tasks. Furthermore, permissions may have a hierarchical structure,
where users higher up in the hierarchy inherit all permissions of those
that are lower (e.g. clinic’s leading medical doctor > station’s medical
doctor > nurse). In addition, we have shown the necessity of separation of duties in requirement S4. Role based access control is well suited to
meet these requirements. It is presented in detail in section 4.2.
In our second use-case (paragraph one) the permissions for the clinic’s
staff could be very effectively managed using roles.
• A2 Traceability: Medical applications are legally required to provide
logs that allow all use of confidential patient data to be traced. Therefore
mechanisms to ensure non-repudiable tracing of all access attempts
have to be provided. Since all access passes through the access control
systems, they are ideally suited to host such a tracing service. In our
first use-case (last paragraph), such logs could be important evidence
in case of a trial.
• A3 Data access control by owners and delegation: When Grids are used
to share large amounts of data, access permissions concerning specific
data need to be updated frequently (see requirement G1). An access
control system that requires administrator intervention for each permission change would be a major hindrance for effective data sharing.
Furthermore owners of sensitive data would be reluctant to have the
data access rights administrated by somebody else. Therefore the access control system needs to support fine-grained access decisions and
the owners of the data need to be the source of authority for access control decisions. Furthermore decentralized delegation mechanisms must
be available to enable the owners of data to administrate the access
permissions. In our third use-case (first paragraph) the tourist should
be able to grant access to his medical data, without external administrative intervention.
• A4 Protection against circumventing access control: Data stored on
physical devices are at risk of being disclosed by persons having administrator access to these devices. Furthermore, an attacker can easily
gain administrator access to a device if he has physical access to it.
Therefore measures must be taken to protect confidential data stored
on the Grid. In our third use-case (paragraph three) the employees
of the subcontractor should be able to perform their maintenance, and
therefore need administrator access to the storage media. However they
should not have access to the confidential patient data stored on that
media.
3.5 Legal issues dealing with medical data
In this section we give a short overview of the laws governing private data
in general and more specifically private health data. Readers should be aware
that this overview is necessarily incomplete from a legal point of view.
We have concentrated on laws of the European Union using as principal
source of information the web-pages of the European Union itself [41].
3.5.1 European laws concerning privacy protection
Within Europe, the root of all legal actions for the protection of personal
privacy comes from Article 8 of the European Convention for the Protection
of Human Rights and Fundamental Freedoms, signed in Rome on November
4th 1950 [29].
This article lays down the principles of personal privacy protection and
the need for a lawful basis for any interference with this personal privacy.
The Treaty on the European Union declares that these rights shall be
respected within Community law in Article F of the Common Provisions
[44].
Furthermore the rights of privacy protection and of the protection of
personal data have been laid down in the Charter of Fundamental Rights of
the European Union on December 7th 2000 [43].
In order to formalize these goals and to achieve legal harmonization in all
EU member states, the European Union has issued the directive 95/46 EC
(directive on the protection of individuals with regard to the processing of
personal data and on the free movement of such data) on the 24th of October
1995 [42].
The directive begins with various definitions in Article 2, of which the most important
are personal data, controller and processor:
(a) ’personal data’ shall mean any information relating to an
identified or identifiable natural person . . .
(d) ’controller’ shall mean the natural or legal person, . . . which
alone or jointly with others determines the purposes and means
of the processing of personal data; . . .
(e) ’processor’ shall mean a natural or legal person, . . . which
processes personal data on behalf of the controller.
The definition of personal data implies that data that have been superficially
anonymized are to be considered as personal data, if it is possible to infer
the identity of the person they concern using secondary sources. In Grid
environments controllers of some piece of data should be well defined as the
owners of the data (see also requirement A3 of the previous section). However
it is the definition of the processors that is difficult to handle, since in a Grid,
these processors would be entities providing Grid resources such as storage
space and/or processing power. Transparent resource sharing as used within
Grids makes the identification of processors for a specific Grid job difficult,
adding requirements to auditing and accounting systems.
The directive explicitly mentions health data in Article 8:
1. Member States shall prohibit the processing of personal data
. . . concerning health . . .
2. Paragraph 1 shall not apply where: (a) the data subject has
given his explicit consent to the processing of those data, . . .
3. Paragraph 1 shall not apply where processing of the data
is required for the purposes of preventive medicine, medical
diagnosis, the provision of care or treatment or the management of health-care services, . . .
Readers should note that this does not explicitly include medical research,
therefore the patient’s explicit consent has to be obtained in order to use
medical data for research purposes.
The directive gives the data subject several rights in articles 10–15, including the right of access to his data and the right of rectification, blocking or
erasure of incomplete or inaccurate data.
The most important part for Grids when dealing with personal information is Article 17, dealing with the Security of processing:
1. Member States shall provide that the controller must implement appropriate technical and organizational measures to
protect personal data against loss, alteration, unauthorized
disclosure or access . . .
Having regard to the state of the art and the cost of their
implementation, such measures shall ensure a level of security appropriate to the risks represented by the processing
and the nature of the data to be protected.
2. The Member States shall provide that the controller, where
processing is carried out on its behalf, chooses a processor
providing sufficient guarantees in respect of the technical
security measures ... and must ensure compliance with those
measures.
3. The carrying out of a processing by way of a processor must
be governed by a contract or legal act binding the processor
to the controller . . .
These regulations have several implications for the processing of medical data
on a Grid. The most constraining is surely the obligation to have a contract between processor and controller. To make a Grid workable under such
constraints, we see two possibilities:
• Resource providers wanting to make their resources accessible to medical applications make prior legal contracts, which regulate their role
as processor. To allow more flexibility these contracts could be made
with a medical resource broker, who can then make contracts with the
home organizations of users treating medical data on a Grid.
• The other possibility would be ad-hoc contracts concluded over the
Internet similar to many e-commerce applications.
The legal implications as well as the security requirements for such a contracting system are outside the scope of this thesis.
The other obligations stated in this article also deserve some more
thought. What are the risks represented by processing medical data and
what is the nature of the data to be protected? We believe that the risks
when dealing with medical data in general are very high. Disclosure or falsification of such data could make it difficult for the person concerned to
find employment or even to obtain medical insurance. These problems are intensified by the use of a Grid. The transparent resource sharing and the fact
that large communities of users have access to Grid infrastructures make the
task of privacy protection very difficult. Therefore the best possible technical
measures have to be taken to protect the data.
For more detailed considerations on the need for confidentiality in healthcare when using information technology, see [82].
For an in-depth analysis of the requirements of directive 95/46 EC with
respect to a medical Grid, see [58].
3.5.2 French Law concerning privacy protection
All directives of the European Union have to be implemented in national law
within a certain period of time. For directive 95/46 EC this period was set to
three years from the date of its adoption (Article 32). Therefore the member
states have implemented new laws or modified existing ones to respond to
the requirements of the directive.
As an example we briefly examine the French implementation of directive
95/46 EC. It is codified in the law n° 2004-801 of the 6th of August 2004
that modifies the law n° 78-17 of the 6th of January 1978 [53]. We refer to
the articles within the modified version of law n° 78-17 in the rest of this
section.
The law starts by taking over the definitions of Article 2 of the EU directive. However it specifically states that it is not applicable to personal
data that are temporarily stored to speed up access, for example in a cache
(Art. 4).
The law also specifies that personal data may be re-used beyond its original purpose for scientific research, if certain provisions concerning the rights
of the data subject are respected (Art. 6 n° 2). This clearly goes further than
the EU directive, which does not specifically mention the use of personal data
for scientific research.
Chapter III (Articles 11-21) of this law establishes the National Commission for Liberties and Informatics (CNIL). The goals of the CNIL are to
inform the public about the rights and obligations with respect to the processing of private data and to ensure that entities processing private data
respect the provisions of this law.
The law also considers the implications of anonymization of personal data
in Article 8, paragraph III. It requires anonymization procedures to be authorized as conforming with this law by the CNIL. The CNIL decides on
a case-by-case basis if the anonymized data can be used for applications
such as medical research.
Specific attention is also given to the transfer of personal data to States
outside the European Union (Chapter XII). The bottom line of these regulations is that the controller of the data has to make sure that a sufficient level
of privacy protection exists in the target state. This would allow international
health-Grid applications, provided that all states in which Grid resources are
located have sufficient privacy protection laws.
In conclusion one can say that the legal implications of using Grids for
the processing of medical data are not yet fully explored. Many pitfalls exist,
and even though laws are harmonized throughout the European Union, differences in legal details can greatly influence the feasibility of medical Grids
from state to state. However most states have displayed a high level of interest in the use of Grids for medical applications. Therefore it can be expected
that the governments will not let medical Grids fail on the basis of legal
details. However the strict regulations concerning privacy protection require
that all security aspects of medical Grids are treated with the greatest care
from a technical point of view. This thesis contributes to the effort of making
Grids secure for medical use and aims to satisfy legal requirements about
technical measures to secure private data. Since the legal requirements are
quite broad and do not address specific technical details of the security measures for the protection of private data, no direct impact on the contributions
of this thesis could be derived. We do however note that our proposal does not
contradict the limitations and requirements of the law dealing with
privacy protection.
Chapter 4
Related Work in Access Control
In this chapter we address the connections between our work and the state of
the art. We discuss general access control models, frameworks and standards,
and actual implementations of access control systems.
4.1 Terminology
In this section we introduce some vocabulary that we will use to describe the
different concepts in this thesis.
Definition 1 The natural persons acting on a Grid architecture are referred
to as users. To specify a user, a user group or a process acting on behalf of
a user we use the term entity.
Definition 2 Data, storage space and computing power shared within a Grid
architecture are referred to as resources. Storage space and computing power
are also called hardware resources. Resources are provided on a Grid architecture by resource servers.
Definition 3 To refer to an entity that is given an authorization, we use the
term authorization subject or subject for short. To specify in authorizations
what subjects can do on resources (e.g. read and write for data resources) we
use the term actions. Resources within an authorization are referred to as
authorization objects or objects for short.
Definition 4 For each resource, an entity is identified as the resource’s
source of authority (SOA). The SOA has the authority to issue and delegate authorizations that allow specific actions on the resource.
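As an illustration, the vocabulary of Definitions 1 to 4 can be written down as a small data model. This is only a sketch under simplifying assumptions: the class and field names are hypothetical, and delegated authorizations (issued by someone other than the SOA) are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A user, a user group, or a process acting on behalf of a user."""
    name: str


@dataclass(frozen=True)
class Resource:
    """Data, storage space or computing power shared on the Grid."""
    identifier: str
    soa: Entity  # source of authority (SOA) for this resource


@dataclass(frozen=True)
class Authorization:
    """States that `subject` may perform `action` on the object `obj`."""
    issuer: Entity
    subject: Entity
    obj: Resource
    action: str


def issued_by_soa(authorization: Authorization) -> bool:
    """True if the authorization was issued directly by the object's SOA."""
    return authorization.issuer == authorization.obj.soa


# Hypothetical example values.
patient = Entity("patient_42")
dr_martin = Entity("dr_martin")
record = Resource("record-42", soa=patient)
grant = Authorization(issuer=patient, subject=dr_martin, obj=record, action="read")
```

Here `issued_by_soa(grant)` holds because the patient is both the issuer and the SOA of the record.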
4.2 Access Control Models
In access control three general models are recognized:
• Discretionary Access Control (DAC)
• Mandatory Access Control (MAC)
• Role Based Access Control (RBAC)
In the following we give a short description of these access control models
and evaluate their advantages and disadvantages with respect to
our objectives. See [86] or [84] for a more detailed presentation of access
control models. We then briefly present current directions in access control
and examine their relevance for our objectives.
4.2.1 Discretionary Access Control
In DAC all permissions can be represented by an access matrix, where each
row of the matrix corresponds to a user and each column to a resource. The
contents of the cells of the matrix are the actions the user specified by the row
is allowed to perform on the resource specified by the column. This concept
was first proposed in 1974 by Lampson [64], then refined by Graham and
Denning in [54] and formalized by Harrison, Ruzzo and Ullman in [57].
Since in systems handling large numbers of users and resources the complete representation of the matrix is not feasible, several ways of representing
the non-empty cells of the matrix have been proposed:
• Access Control Lists (ACL) correspond to storing the matrix by column. Each resource is associated with a list, containing the actions the
various users may exercise on the resource.
• Capabilities correspond to storing the matrix by row. Each user is associated with a list, containing the actions he or she may perform on
the various resources.
• Finally Access Control Relations store the non-empty cells of the matrix
as three-tuples (user, resource, action) in a table.
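The three representations listed above hold the same information and can be derived from one another. The following minimal sketch starts from a set of access control relations and builds the ACL (by column) and capability (by row) views; the user and resource names are hypothetical.

```python
from collections import defaultdict

# Access control relations: the non-empty cells as (user, resource, action).
relations = {
    ("alice", "scan-17", "read"),
    ("alice", "scan-17", "write"),
    ("bob", "scan-17", "read"),
    ("bob", "report-3", "read"),
}


def to_acl(relations):
    """Store the matrix by column: resource -> user -> set of actions."""
    acl = defaultdict(lambda: defaultdict(set))
    for user, resource, action in relations:
        acl[resource][user].add(action)
    return acl


def to_capabilities(relations):
    """Store the matrix by row: user -> resource -> set of actions."""
    capabilities = defaultdict(lambda: defaultdict(set))
    for user, resource, action in relations:
        capabilities[user][resource].add(action)
    return capabilities


def is_allowed(relations, user, resource, action):
    """An access is allowed if the corresponding matrix cell contains the action."""
    return (user, resource, action) in relations
```

An ACL is convenient when the check happens at the resource, while capabilities are convenient when the user carries his own permissions.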
The advantage of DAC with respect to our objectives is that it permits
fine-grained access control and easy ad-hoc permission granting, when
combined with Authorization Certificates encoding tuples of Access Control
Relations. An Authorization Certificate allowing ad-hoc access to a resource
can be created and issued on demand by the resource’s SOA and transferred
directly to the user specified in the relation. The user can present this certificate as proof of his permission to a resource site and thus gain access to the
resource. Fine-grained access control permissions can be specified this way,
provided that fine-grained resource identifiers exist on the Grid architecture.
The disadvantage of DAC is that it can be cumbersome to manage when
permissions are assigned to users based on their tasks. This problem becomes
even more obvious when users are reassigned to new tasks, since in a DAC
model this would mean revoking every single permission related to the old
tasks and reassigning every single permission related to the new task.
4.2.2 Mandatory Access Control
MAC typically deals with data resources. All resources are assigned a label
specifying a classification (typically security levels like: top secret, secret,
confidential, unclassified) which is stored as meta-data for the resource.
Users are assigned clearances within this classification. Based on their
clearance users are allowed to:
• read all the data resources that are of the same or lower level.
• write to all the data resources that are of the same or higher level.
The first of those two rules is quite clear. The second one has the goal of
preventing information from leaking to a lower security level (e.g. by preventing a
doctor who has access to confidential patient information from writing this data
to a publicly accessible medical database). If the user wants to write to data
resources on a lower level he can log on to the system using a lower clearance
than his maximal allowed one. These two principles were first formulated
by Bell and LaPadula in 1973 [5] and then revised in [4] for the protection
of the confidentiality of information.
Based on the principles of Bell and LaPadula, Biba [11] proposed a MAC
model for the protection of the integrity of information.
The concept of MAC can be augmented by adding categories, where data
and users are additionally assigned to one or more categories (e.g. radiological, psychological, pharmacological). With this addition, access can also be
restricted on a need-to-know basis. Users will only have access to data that
belong to one of their categories.
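The two rules and the category extension can be summarized in a few lines. This is a minimal sketch, assuming the four security levels named above are totally ordered and that every data resource carries at least one category; the function names are hypothetical.

```python
# Security levels from the text, ordered from lowest to highest.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}


def can_read(clearance, user_categories, label, data_categories):
    """Read rule: same or lower level, and at least one shared category."""
    level_ok = LEVELS[clearance] >= LEVELS[label]
    need_to_know = bool(set(user_categories) & set(data_categories))
    return level_ok and need_to_know


def can_write(session_level, label):
    """Write rule: same or higher level, so information cannot leak downwards.

    `session_level` is the clearance the user logged on with, which may be
    lower than his maximal clearance.
    """
    return LEVELS[session_level] <= LEVELS[label]
```

A doctor logged on at the confidential level can therefore read confidential radiological data if he holds the radiological category, but cannot write to an unclassified public database in that session.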
Clearances and categories can either be stored in secure permission repositories that are queried by the access control systems to retrieve information
on a specific user or be distributed to the users in the form of certificates.
The problem with MAC is that even when using categories, it lacks granularity and flexibility of access permissions, since single objects cannot be
specifically addressed and the set of actions on the objects is restricted.
4.2.3 Role Based Access Control
Compared to DAC and MAC, Role Based Access Control (RBAC) is a relatively new paradigm. It was first introduced in 1992 by D.F. Ferraiolo and
D.R. Kuhn at the 15th National Computer Security Conference [46]. A framework for RBAC was proposed by R. Sandhu et al. in [85]. The American National Standards Institute (ANSI) has adopted a consensus model for RBAC
based upon [47] in 2004.
RBAC is an effort to overcome the cumbersome administration of permissions inherent to DAC. The basic concept of RBAC is the role. A role is
a named collection of permissions and possibly other roles, that are needed
to perform a specific task. Users are assigned roles according to the tasks
they have to perform. Therefore the management of permissions, especially
when a user is (re-)assigned to new tasks, becomes much easier, since only
the relations user-to-roles have to be changed. Furthermore the permissions
related to a task can be changed globally without having to modify them for
every user who is assigned to that task.
The RBAC community differentiates the concept of a role from the concept
of a group in order to avoid confusion with the well-established meaning of
groups in operating systems. A group is a named collection of users and
possibly other groups and is therefore relatively similar to the concept of a
role. Both can be used in access control to assign the same permissions to
a group of users. However roles require some additional functionality that is
not necessarily provided by groups.
The core concept of the RBAC standard requires that each user and each
permission can be assigned to multiple roles and each role can be assigned
to multiple users and permissions. Furthermore review functions must be
available that make it possible to list the roles assigned to a user and the users assigned
to a role, as well as the permissions assigned to a role and the roles assigned
to a permission. Core RBAC also defines the concept of user sessions, in which
roles can be selectively activated and deactivated, in order to use the least
privilege necessary. Finally users must be able to simultaneously exercise the
permissions of multiple roles.
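The elements of core RBAC described above (many-to-many assignments, review functions and sessions) can be sketched as follows. This is an illustrative sketch and not the interface defined by the ANSI standard; all class and method names are hypothetical.

```python
from collections import defaultdict


class CoreRBAC:
    def __init__(self):
        self.user_roles = defaultdict(set)  # user -> assigned roles
        self.role_perms = defaultdict(set)  # role -> (resource, action) pairs

    def assign_user(self, user, role):
        self.user_roles[user].add(role)

    def deassign_user(self, user, role):
        self.user_roles[user].discard(role)

    def grant_permission(self, role, resource, action):
        self.role_perms[role].add((resource, action))

    # Review functions
    def roles_of_user(self, user):
        return set(self.user_roles.get(user, ()))

    def users_of_role(self, role):
        return {u for u, roles in self.user_roles.items() if role in roles}

    def permissions_of_role(self, role):
        return set(self.role_perms.get(role, ()))

    def roles_of_permission(self, resource, action):
        return {r for r, perms in self.role_perms.items() if (resource, action) in perms}


class Session:
    """A user session in which only explicitly activated roles count."""

    def __init__(self, rbac, user):
        self.rbac = rbac
        self.user = user
        self.active_roles = set()

    def activate(self, role):
        if role not in self.rbac.roles_of_user(self.user):
            raise PermissionError(f"{self.user} is not assigned to role {role}")
        self.active_roles.add(role)

    def deactivate(self, role):
        self.active_roles.discard(role)

    def check_access(self, resource, action):
        return any((resource, action) in self.rbac.permissions_of_role(role)
                   for role in self.active_roles)


# Hypothetical example: one user holding two roles, activating only one.
rbac = CoreRBAC()
rbac.grant_permission("nurse", "care-plan", "read")
rbac.grant_permission("ward_doctor", "prescription", "write")
rbac.assign_user("alice", "nurse")
rbac.assign_user("alice", "ward_doctor")
session = Session(rbac, "alice")
session.activate("nurse")
```

Reassigning a user to a new task then only requires `deassign_user` and `assign_user` calls; the permissions attached to the roles stay untouched.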
The core concepts of RBAC are extended by hierarchical RBAC. In hierarchical RBAC, a partial order is defined between some roles, defining
hierarchically superior and inferior roles. Superior roles inherit the permissions of inferior ones and users assigned to superior roles are automatically
assigned to inferior roles too.
Another extension of the RBAC concepts, constrained RBAC, introduces
the static and dynamic separation of duties (SSD and DSD). Here the combinations of roles that can be assigned to (SSD) or simultaneously activated
by (DSD) a user are subject to restrictions.
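Both extensions can be illustrated with a short sketch. The role hierarchy below follows the clinic example of chapter 3; the mutually exclusive role pair and all identifiers are hypothetical, and the interaction between inherited roles and separation-of-duty constraints is deliberately left out.

```python
# Hierarchical RBAC: each senior role lists its direct junior roles.
SENIOR_TO_JUNIOR = {
    "chief_doctor": {"ward_doctor"},
    "ward_doctor": {"nurse"},
}


def effective_roles(role):
    """Return the role together with all roles it inherits from."""
    seen, stack = set(), [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SENIOR_TO_JUNIOR.get(current, ()))
    return seen


# Constrained RBAC: sets of roles of which a user may hold at most one.
EXCLUSIVE_SETS = [{"prescriber", "pharmacist"}]


def violates_separation(roles, exclusive_sets=EXCLUSIVE_SETS):
    """True if `roles` contains more than one role of an exclusive set.

    Applied to the roles assigned to a user this checks SSD; applied to the
    roles active in one session it checks DSD.
    """
    return any(len(set(roles) & exclusive) > 1 for exclusive in exclusive_sets)
```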
Core RBAC achieves the objective of flexible and easy administration of
permissions related to tasks. RBAC also makes it possible to follow the paradigm of
least privilege and to enforce an effective separation of duties.
The limitations of RBAC lie in the necessity to assign all permissions
through roles. Therefore ad-hoc permission granting would require the creation of a new role assigned to only one user, which is somewhat impractical.
4.2.4 Current directions in access control
We have examined the following current directions in access control:
• Policy composition frameworks
• Attribute based access control
• Trust negotiation
Research on policy composition frameworks [16, 101] investigates how to
integrate different, independent access control policies from multiple entities
on a distributed system. A very important issue in policy composition frameworks is the possibility of inconsistencies arising when trying to combine
heterogeneous policies.
Another interesting issue for policy composition frameworks, that applies
directly to Grid computing, is mobile policies [95]. A mobile policy is associated with an access control object and follows the object if it is replicated
or moved. Such mobile policies could be attached to data resources in order
to regulate the fine-grained access control to data on a Grid.
Attribute based access control (ABAC) is an approach, where authorization decisions are not based on the identity of the request issuers, but on a set
of attributes that requesting users have to provide (usually proved through
the possession of attribute certificates, see section 4.5).
Attribute based access control can be seen as a generalization and extension of RBAC, where attributes (instead of roles) are assigned to users,
and permissions depend on the attributes a user has. One difference between
RBAC and ABAC is that in attribute based access control a user may need
multiple attributes in order to use a single permission. Such a construction
is not supported by all RBAC systems. Furthermore in ABAC permissions
are not assigned to attributes; instead the access control objects specify the
attributes that are required in order to access them.
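The difference to RBAC can be made concrete with a minimal sketch in which the object side lists the attributes required for an action and a user needs all of them. The attribute names, the policy table and the deny-by-default behaviour are assumptions made for the example.

```python
# Hypothetical object policy: attributes required per (resource, action).
REQUIRED_ATTRIBUTES = {
    ("record-42", "read"): {"profession": "physician", "department": "radiology"},
}


def abac_allows(user_attributes, resource, action, required=REQUIRED_ATTRIBUTES):
    """Allow the access only if the user presents every required attribute."""
    needed = required.get((resource, action))
    if needed is None:
        return False  # assumption: no rule for this object and action means deny
    return all(user_attributes.get(name) == value for name, value in needed.items())
```

A user with the attributes `{"profession": "physician", "department": "radiology"}` is allowed to read `record-42`, while a physician from another department is not, although no permission was ever assigned to either attribute individually.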
A framework for attribute-based access control specification and enforcement is presented in [15].
In [95] a good overview of current issues concerning policy composition
frameworks and attribute based access control is given.
Trust negotiation is another recent direction of research in access control. Trust negotiation systems handle the task of establishing mutual trust
between two entities that have no previous relationship. This is achieved
by providing credentials from a trusted third party, known to both entities
(possibly through intermediaries). Based on the level of mutual trust that
is established through the trust negotiation system, one entity may give the
other certain access rights to its resources. An example of a trust negotiation
system is presented in [83].
Trust negotiation is not limited to mapping one access control policy to
another, as the negotiation process can lead to different results depending on
the negotiation strategies adopted by the participants.
In [8] a set of requirements for trust negotiation systems is proposed.
Furthermore a good overview of existing trust negotiation systems is given.
The article underlines the importance of credential chains for delegation,
which supports our similar argument presented in chapter 3.
Summarizing the current directions of access control one can say that
the central goals of all new approaches are decentralization and cooperation
between cross-organizational security systems. These general goals are reflected in the requirements of our application, and therefore the approaches
presented in this thesis are oriented in the same direction.
4.3 Authorization Frameworks
The Request For Comments 2904 (RFC 2904, August 2000) [96] and the
International Organization for Standardization recommendation ISO/IEC
10181-3 [61] both define frameworks for Authorization systems. They are
conceptually similar, but use distinctly different terminology. As this may
lead (and often has led) to confusion, we use the terms defined at the beginning
of this chapter exclusively.
The RFC framework proposes three message sequences that define how
users, resources and servers handling authentication, authorization and accounting (AAA servers) interact: Agent, Pull, and Push. Figure 4.1 illustrates
these message sequences.
[Figure 4.1 shows three diagrams, labelled Agent sequence, Pull sequence and Push sequence, each with a User, an AAA Server and a Resource exchanging messages numbered 1 to 4.]
Figure 4.1: Authorization Message sequences for an agent, pull and push
authorization structure
In the Agent message sequence, the user interacts only with the AAA server for authorization. The AAA server relays the user’s requests to the
resource server and notifies him once the service is ready. In a first step the
user contacts the AAA server which then retrieves the user’s permissions and
checks if they allow the requested action. If the AAA server reaches a positive
decision, it transmits the user’s request to the resource server in a second
step. The resource server makes the requested service available for execution and
returns an acknowledgement to the AAA server in a third step. The AAA
server informs the user that the request is ready at the resource server in a
final step.
In the Pull message sequence, the user only interacts with the resource
server. The resource server handles all interactions with the AAA server. In
a first step the user submits a request to the resource server. The server contacts the corresponding AAA server in a second step and asks for a decision
whether the requested action should be allowed. The AAA server retrieves
the user’s permissions to make its decision and communicates the result to
the resource server in a third step. If the decision is positive, the resource
server executes the user’s request and returns the result to the user in a final
step.
The Push message sequence puts all the burden of interaction on the user
and thus separates the AAA server from the resource server. In a first step
the user contacts the AAA server to retrieve assertions on his permissions.
The AAA server retrieves the requested permissions and returns them to the
user in a second step. Then, in a third step the user submits his request
and the required permissions to the resource server. The server checks the
submitted permissions and if they allow the requested action it proceeds to
execute the request. In a final step, the resource server then returns the results
of the request to the user.
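The four steps of the push sequence can be sketched as follows. This is a simplified illustration and not the mechanism defined by RFC 2904: the assertion format is invented for the example, and an HMAC with a key shared between AAA server and resource server stands in for the digital signature that a real deployment would use. All names are hypothetical.

```python
import hashlib
import hmac
import json

# Assumption: AAA server and resource server share this key. A real system
# would use a public-key signature by the AAA server instead.
AAA_KEY = b"demo-key-shared-by-aaa-and-resource"

# Hypothetical permission store of the AAA server.
PERMISSIONS = {"alice": [["scan-17", "read"]]}


def aaa_issue_assertion(user):
    """Steps 1 and 2: the user asks the AAA server for an assertion."""
    claims = {"user": user, "permissions": PERMISSIONS.get(user, [])}
    payload = json.dumps(claims, sort_keys=True)
    tag = hmac.new(AAA_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload, tag


def resource_handle(user, resource, action, assertion):
    """Steps 3 and 4: the resource server checks the assertion and answers."""
    payload, tag = assertion
    expected = hmac.new(AAA_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return "denied: invalid assertion"
    claims = json.loads(payload)
    # The assertion must belong to the request issuer and cover the request.
    if claims["user"] != user or [resource, action] not in claims["permissions"]:
        return "denied: not authorized"
    return f"result of {action} on {resource}"


assertion = aaa_issue_assertion("alice")
```

Note that the resource server never contacts the AAA server; it only verifies what the user presents.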
Following the classification of RFC 2904 [96], the ISO framework is either
agent or pull model, depending on where the function that enforces access
control decisions is implemented (AAA server or resource).
RFC 2904 [96] also defines a set of architecture components that include
the Policy Decision Point (PDP), where access control decisions are made
based on the access control information provided in the message sequences
presented before. The job of the Policy Enforcement Point (PEP) is to enforce the access decisions of the PDP with regard to the resources.
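The division of labour between the two components can be illustrated with a minimal sketch in the style of the pull sequence, where the enforcement point sits at the resource and asks the decision point before serving a request. The class names, the in-memory permission set and the stored data are hypothetical.

```python
class PolicyDecisionPoint:
    """Decides whether a request is allowed; it does not touch the resource."""

    def __init__(self, permissions):
        self.permissions = set(permissions)  # (user, resource, action) tuples

    def decide(self, user, resource, action):
        return (user, resource, action) in self.permissions


class PolicyEnforcementPoint:
    """Guards the resource and enforces the decisions of the PDP."""

    def __init__(self, pdp, storage):
        self.pdp = pdp
        self.storage = storage  # resource identifier -> content

    def read(self, user, resource):
        if not self.pdp.decide(user, resource, "read"):
            raise PermissionError(f"{user} may not read {resource}")
        return self.storage[resource]


pdp = PolicyDecisionPoint({("alice", "scan-17", "read")})
pep = PolicyEnforcementPoint(pdp, {"scan-17": b"image data"})
```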
Let us now consider a scenario with distributed resources, administrated
by multiple different authorization authorities. In the case of data resources
on a Grid, the SOA for the data may not be directly related to the storage
resource on which the data are located.
Using the pull sequence, a storage resource has the duty of contacting the
different AAA servers for the data resources it stores. A malicious resource
could violate the paradigm of using the least privilege by requesting more
privileges than actually required from the AAA servers.
The agent sequence imposes the additional duty of communicating with
the resources on the AAA servers. If access to a resource involves authorizations between multiple AAA servers, a coordination mechanism between
them is required.
The push sequence allows a temporal decoupling of the authorization
assertion from the actual request. The duty of querying the AAA servers
and the resource is put on the user, reducing the load on AAA servers and
resources. Furthermore the paradigm of using the least privilege can be securely enforced, provided that the AAA server allows the user to request
assertions of a subset of his authorizations. Finally if the user’s request requires authorizations from multiple AAA servers the user can easily query
them sequentially and combine their assertions to support his request. The
disadvantage of the push sequence is that it requires an authorization revocation mechanism, since AAA servers no longer have access to authorizations
once they are issued to their holders.
Considering the drawbacks and advantages cited above, we believe the
push sequence to be the best choice for this scenario.
The RFC framework also discusses the use of attribute certificates (AC) to
store authorization data. This proposal is based upon the work on X.509 Attribute Certificates by the Public Key Infrastructure (PKIX) Working Group
of the IETF and is discussed in section 4.5.2. The RFC framework explicitly
states the necessity to ensure that the AC owner is also the request issuer in
this context.
4.4 Authorization Expression Languages
To express authorizations given to a user and general policies governing access
control, a well defined language is needed. Several approaches have been
proposed for such a language. In the following, we briefly examine the impact
of KeyNote [14], XACML [52], and XrML [28] on our work.
4.4.1 KeyNote
KeyNote [14] is a Trust-Management System and as such it combines authentication and access control in a unified framework for evaluating authorization
requests. KeyNote defines an assertion language that allows authorizations to be bound to entities. These entities may be represented by public keys, as
in the SPKI approach (see section 4.5.3). If this is the case, those entities
can delegate their authorizations by issuing digitally signed assertions.
Currently KeyNote does not support the revocation of assertions. Furthermore the KeyNote language has no support for RBAC since it is oriented
towards DAC. As our application requires support for RBAC, KeyNote does
not suit our requirements.
4.4.2 XACML
The eXtensible Access Control Markup Language (XACML) [52] is a standard proposal by the OASIS consortium¹. It defines a general purpose language for specifying access control policies. XACML is highly expressive and
offers a large variety of datatypes and functions to combine or compare them.
XACML manages policy sets, that each consist of one or more rules. Each
rule defines the actions a subject may perform on a resource.
The XACML policy language is entirely written in XML [20] and is therefore more human readable than binary encodings such as ASN.1 [94]. This
advantage is somewhat negated by the fact that XACML is very verbose
and requires an enormous overhead for even the simplest policies. This
makes policies difficult to write, understand and manage. Therefore special
policy creation tools are needed to help untrained users understand and
create XACML policies. The PRIMA architecture presented in section 4.6.7
proposes such a tool. Figure 4.2 shows an example of a simple policy with a
rule giving read access to a file.
¹ OASIS is a non-profit, global consortium that drives the development, convergence
and adoption of e-business standards. Its foundational sponsors are Innodata Isogen, SAP
and Sun Microsystems, Inc.
<Policy PolicyId="FileAccessPolicy"
    RuleCombiningAlgId="urn:oasis:names:tc:xacml:1.0:rule-combining-algorithm:permit-overrides">
  <Target>
    <Subjects> <AnySubject/> </Subjects>
    <Resources> <AnyResource/> </Resources>
    <Actions> <AnyAction/> </Actions>
  </Target>
  <Rule RuleId="FileAccessRule" Effect="Permit">
    <Target>
      <Subjects> <Subject> <SubjectMatch
          MatchId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
        <AttributeValue
            DataType="http://www.w3.org/2001/XMLSchema#string">
          /O=Grid/O=SomeVO/OU=liris.cnrs.fr/CN=LudwigSeitz
        </AttributeValue>
      </SubjectMatch> </Subject> </Subjects>
      <Resources> <Resource> <ResourceMatch
          MatchId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
        <AttributeValue
            DataType="http://www.w3.org/2001/XMLSchema#string">
          SomeGridfileId
        </AttributeValue>
      </ResourceMatch> </Resource> </Resources>
      <Actions> <Action> <ActionMatch
          MatchId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
        <AttributeValue
            DataType="http://www.w3.org/2001/XMLSchema#string">
          read
        </AttributeValue>
      </ActionMatch> </Action> </Actions>
    </Target>
  </Rule>
</Policy>
Figure 4.2: An example of an XACML policy granting read access to a file.
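To make the semantics of the policy in Figure 4.2 more tangible, the following Python sketch evaluates a request against a rule target of the same shape. It is a simplified illustration and not a conformant XACML PDP: the names `Rule`, `evaluate` and `FILE_ACCESS_POLICY` are hypothetical, only string equality is supported, and the permit-overrides combination is reduced to its simplest form.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """A rule target in the style of Figure 4.2; None plays the role of <Any.../>."""
    effect: str                      # "Permit" or "Deny"
    subject: Optional[str] = None
    resource: Optional[str] = None
    action: Optional[str] = None

    def applies_to(self, subject: str, resource: str, action: str) -> bool:
        # String-equal matching on every constrained component of the target.
        return all(
            expected is None or expected == actual
            for expected, actual in (
                (self.subject, subject),
                (self.resource, resource),
                (self.action, action),
            )
        )


def evaluate(rules: List[Rule], subject: str, resource: str, action: str) -> str:
    """Simplified permit-overrides: one applicable Permit rule is enough."""
    effects = [r.effect for r in rules if r.applies_to(subject, resource, action)]
    if "Permit" in effects:
        return "Permit"
    if "Deny" in effects:
        return "Deny"
    return "NotApplicable"


# The single rule of Figure 4.2, expressed with the hypothetical Rule type.
FILE_ACCESS_POLICY = [
    Rule(
        effect="Permit",
        subject="/O=Grid/O=SomeVO/OU=liris.cnrs.fr/CN=LudwigSeitz",
        resource="SomeGridfileId",
        action="read",
    )
]
```

The six fields of the `Rule` instance carry the same information as the XML policy, which gives an impression of the overhead introduced by the XACML syntax.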
Another drawback of XACML is that it has no explicit support for delegation. The PRIMA architecture provides a workaround by adding new
actions that give granting rights on existing actions. However, this solution
requires modifying the standard XACML PDP so that it takes existing delegations into account when evaluating a subject's access rights. Furthermore, this
solution mixes actions and delegations, which are conceptually different parts
of access control.
4.4.3
XrML
The eXtensible rights Markup Language (XrML) [28] is a general purpose
language in XML used to describe the rights and conditions for using digital
resources.
It has the same underlying goal as XACML, since it was designed to answer the same question: “Is such-and-such a Principal authorized to exercise
such-and-such a Right against such-and-such a Resource?” (see XrML Core
Schema, p. 43, available from [28]). However, XrML is focused on digital rights
management (DRM).
The core specification of XrML has no explicit support for RBAC, and the
XrML language is less expressive than XACML. Unlike XACML, however,
XrML provides mechanisms for the delegation of rights.
XrML also supports binding authorizations to public keys, similar to the
SPKI approach (see section 4.5.3).
The lack of support for RBAC and the strong focus on DRM make XrML
poorly suited for expressing complex access control scenarios.
4.4.4
General remarks
We do believe that XML encoding is a reasonable approach to formulating access control decisions. First, it is a human readable format, and second,
many XML tools are available to process XML encoded
data. A standardized access control language is also desirable to achieve interoperability. Such a language obviously needs to be very
generic, which partly explains the verbosity of XACML and other related standards. However, in research, implementing such standards puts a
huge workload on the researcher and does not necessarily lead to scientifically significant results. We have therefore chosen not to implement standards
in our approach, since we only target a proof of concept, and standard conformance would have made the system very cumbersome to use and the access
control data very difficult to understand. However, our system can easily be
adapted to support any XML based permission specification standard.
4.5
Standards for authorization assertion
Several access control architectures store permissions unprotected (see section
4.6). We deem such an approach inappropriate for our application,
since such permissions would be prime targets for hackers and a successful
attack would give access to a wide range of resources. We believe that as
much security critical information as possible should be stored securely, encoded
in digitally signed certificates.
Ordered sequences of authorization certificates can be used to form certificate paths that allow delegation of authorizations. This allows for flexible
and secure management of dynamic access rights.
4.5.1
SAML
The Security Assertion Markup Language (SAML) [68] from the OASIS consortium defines an XML based syntax and protocols for requesting and providing authentication, attribute and authorization assertions. Authentication assertions describe how subjects have been authenticated,
attribute assertions bind certain attributes (e.g. roles) to subjects, and authorization assertions convey an authorization decision for a specific request.
If assertions need to be secured, SAML uses the XML digital signature
recommendation of the W3C [33]. These signed assertions are equivalent to
certificates.
Due to their high level of expressiveness, SAML assertions are quite verbose
and not easy for an average user to read and understand.
The core specification of SAML does not address delegation of authorizations in any way. Recent proposals ([77, 97]) address this drawback by
proposing extensions to the SAML specification.
Both approaches extend some element of a SAML assertion in order to
allow the expression of a multi-step delegation in a single SAML assertion. In
each delegation step, the access rights can be restricted by adding conditions
or constraints. The approaches only differ in the choice of the SAML element to be extended for adding delegation and in the exact syntax of their
delegation statement.
SAML and XACML overlap to some extent. However, while the focus of SAML
is on conveying information such as user attributes, authorization decisions
and authentication methods, XACML is centered on policies governing the
requested object. One could therefore say that SAML is subject centered while
XACML is object centered. This means that SAML can be used to provide
PDPs using XACML with authorization information.
4.5.2
X.509 Attribute Certificates
Request For Comments 3281 [45] defines a profile for the use of X.509
Attribute Certificates (ACs) for authorization. ACs bind attributes to a user
identity, which is authenticated using an X.509 public key certificate.
Since issuers of ACs can define their own attribute types, any kind of authorization information can be encoded within an AC.
The current profile is very limited, since it recommends against supporting
delegation, because the administration and processing of AC paths is deemed
to be too complex. Furthermore, for each particular set of attributes only one
source of authority may exist that functions as AC issuer. This means, for
example, that role memberships can only be issued by a single authority. Such
a limitation would make scalable, decentralized authorization impossible, and
the profile is therefore not suitable for a Grid architecture.
The PERMIS and the PRIMA access control architectures make use of
X.509 ACs (see section 4.6). The authors of PRIMA have extended the X.509 AC
specification in order to support certificate paths.
4.5.3
SPKI
The Requests For Comments 2692 and 2693 [38, 39] define a Simple Public
Key Infrastructure for trust management. SPKI introduces a simple format
for authorization certificates. An access control list (ACL) that is co-located
with each resource specifies the public keys of the resource administrators.
These administrators may issue permissions on the resource and authorize
other users to delegate them by issuing digitally signed certificates. SPKI uses
public keys to identify entities and to create unique namespace identifiers.
Furthermore it specifies a delegation mechanism through chains of certificates
and details tuple reduction rules to produce an authorization decision out of
such a certificate chain. Work on SPKI standardization ceased in 2001,
and thus important questions such as the implementation of standard RBAC
using SPKI have not been addressed.
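The tuple reduction mentioned above can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions, not an implementation of the RFC: authorizations are modelled as plain sets of action names, validity as a pair of integers, signature verification is omitted entirely, and the names `FiveTuple`, `reduce_pair` and `reduce_chain` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence, Tuple


@dataclass(frozen=True)
class FiveTuple:
    issuer: str                  # key granting the right ("Self" for an ACL entry)
    subject: str                 # key receiving the right
    may_delegate: bool           # whether the subject may pass the right on
    rights: FrozenSet[str]       # simplified authorization: a set of action names
    validity: Tuple[int, int]    # (not_before, not_after)


def reduce_pair(first: FiveTuple, second: FiveTuple) -> Optional[FiveTuple]:
    """Combine two adjacent tuples of a chain, or return None if the chain breaks."""
    # The second certificate must be issued by the subject of the first one,
    # and the first one must have allowed further delegation.
    if first.subject != second.issuer or not first.may_delegate:
        return None
    # Rights and validity periods can only shrink along the chain.
    rights = first.rights & second.rights
    start = max(first.validity[0], second.validity[0])
    end = min(first.validity[1], second.validity[1])
    if not rights or start > end:
        return None
    return FiveTuple(first.issuer, second.subject, second.may_delegate,
                     rights, (start, end))


def reduce_chain(chain: Sequence[FiveTuple]) -> Optional[FiveTuple]:
    """Reduce a whole chain, starting from the ACL entry, to a single tuple."""
    if not chain:
        return None
    result: Optional[FiveTuple] = chain[0]
    for nxt in chain[1:]:
        result = reduce_pair(result, nxt)
        if result is None:
            return None
    return result
```

If the reduced tuple has the ACL owner as issuer and the requesting key as subject, the remaining rights and validity period determine the authorization decision.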
Binding permissions to public keys has several advantages over
binding permissions to user identities as in X.509 ACs. First, it solves the
problem of finding globally unique names for users, and second, it simplifies
the integrity checking of SPKI certificates, since the creator's public key is
included in the certificate.
The drawback of binding permissions to public keys is related to the
revocation of a user’s private key. In such a case all authorizations related
to this key have to be revoked too. When permissions are bound to user
identities this is not necessary, since the user identity does not change if the
underlying authentication key is replaced.
4.6
Access Control Systems
We now present existing access control architectures that are specifically
designed for Grids or distributed resources and discuss how they relate to
our requirements.
4.6.1
Shibboleth
Shibboleth [40] is an access control architecture developed since 1999 by the
Middleware Architecture Committee for Education (MACE) of the Internet2
consortium and supported by IBM. The distinctive features of Shibboleth are its
mechanisms for user privacy and information release control.
Shibboleth is specifically designed to control access to web based services. With respect to RFC 2904 (see section 4.3), Shibboleth uses the pull
sequence: the user contacts the service provider, which then pulls
the user's attributes from the Attribute Authority (AA) of the user's home
organization. The resource provider then decides which rights to grant to
the user, based on these attributes and on local access control lists.
Storage of the attributes is left to the discretion of the AAs. Attribute assertions are passed between the AA and the service providers using SAML (see
section 4.5).
The problem in the current design of Shibboleth is that the AA is considered to be located at the user’s home organization. This is not necessarily
the case in Grid environments with multiple distributed sources of authority.
This problem, the limitations of the current SAML specification concerning
delegation, and the drawbacks of the pull sequence make Shibboleth unsuited
for our requirements.
In a recent spinoff project from Shibboleth, GridShib, the developers plan
to integrate Shibboleth into the Globus Toolkit [99]. As the project is relatively recent and still in an early stage of development, nothing more precise
can be said about its outcome.
4.6.2
Akenti
Akenti [91, 92] is an access control system developed at the Distributed Systems Department of the Lawrence Berkeley Laboratory in the USA since
1998. Akenti uses signed certificates to store access control policies, resource
use-conditions and attribute assignments. This protects them against unauthorized modification and makes it possible to store them on less secured
sites. However the policy certificates are self-signed and must therefore be
considered as trusted information. They are co-located with the resources
to which they apply and specify the sources of authority (SOA) for these
resources.
Akenti can use both the authorization push and pull sequences. In both
cases the server is contacted when an access control decision is needed. The
server then uses the relevant certificates (either submitted by the user or
gathered by the server from the locations specified in the policy certificates)
and makes its decision based on those.
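The difference between the two sequences can be sketched as follows. This is a minimal Python illustration and not Akenti's actual interface: the `CertificateStore` type and the function names are hypothetical, and certificates are reduced to plain attribute strings without signatures.

```python
from typing import Dict, Set

# Hypothetical stand-in for the locations named in the policy certificates:
# it maps a user to the attribute certificates stored for that user.
CertificateStore = Dict[str, Set[str]]


def decide(attributes: Set[str], required: str) -> bool:
    """The server grants access if the required attribute has been certified."""
    return required in attributes


def pull_decision(user: str, store: CertificateStore, required: str) -> bool:
    # Pull: the server gathers every certificate it can find for the user.
    return decide(store.get(user, set()), required)


def push_decision(submitted: Set[str], required: str) -> bool:
    # Push: the server only sees the certificates the user chose to submit.
    return decide(submitted, required)


store: CertificateStore = {"alice": {"physician", "researcher"}}

# Pull: both of alice's attributes are visible to the server.
pulled = pull_decision("alice", store, "researcher")

# Push: alice submits only the attribute needed for this request.
pushed = push_decision({"researcher"}, "researcher")
```

In the push case the user controls which attributes are disclosed for a given request, a property that matters when only a subset of a user's roles should be active.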
Currently Akenti uses a proprietary XML-based policy and assertion language. However, the Akenti development team is considering the use of SAML
as assertion language and XACML as policy language.
Akenti does not support delegation of rights through paths of certificates.
Instead resource owners who want to give administrative power to other users
need to specify those in the policy certificates. Akenti therefore fails to meet
some of our requirements.
4.6.3
PERMIS
PERMIS [23] is also a certificate based access control system. It has been
developed by the Information Systems Security Research Group of the University of Salford in the United Kingdom since 2001.
It uses the authorization pull sequence and stores all relevant certificates
in LDAP directories. Only the SOAs that may issue valid policies need to be
stored locally with the PDP.
PERMIS relies on the X.509 attribute certificates (AC) [45] to securely
store role assignments and policies. This implies that PERMIS inherits some
of the limitations of the X.509 AC specification as described in section 4.5,
namely the limitation to one source of authority per set of attributes.
Currently PERMIS uses a proprietary policy language but PERMIS developers are considering the use of the XACML policy language.
PERMIS allows static delegation of roles from the SOA to a subordinate
AA. This means that a central authority has to be contacted, and has to
register subordinate AAs in its policy, before they are entitled to assign
privileges. This delegation can be restricted by specifying a delegation depth
limit in the role assignment. Once authorized to do so, AAs can assign roles
by creating new role attribution ACs.
PERMIS has no explicit support for ad-hoc permission granting: all permission assignments have to be done through the assignment of a user to a
role and permissions to the role. This, the limitations of the X.509 ACs and
the reliance on the pull sequence make PERMIS unsuited for our application.
4.6.4
CAS
The Community Authorization Service (CAS) [79, 78], developed since 2002
by the Globus Alliance, is an access control service for Grids. It builds on the
concept of Virtual Communities (also called Virtual Organizations in other
projects) that are defined as cross-organizational communities of users that
share resources and cooperate for a common project.
Each virtual community is granted bulk rights and runs a CAS that stores
information on how these bulk rights are restricted for the individual members of its community. CAS uses an authorization push model, where users
retrieve permissions from the CAS server, which acts as AAA server.
The fact that CAS centralizes access control information makes it a potential bottleneck and a trusted third party. A CAS can therefore become a
single point of failure: for example, an attacker who compromises a CAS server has
access to all resources granted to the community that this CAS manages.
Furthermore, the fact that each CAS is centrally managed and that resources
grant bulk rights to the communities makes fine grained data access control
and ad-hoc granting of rights extremely difficult to manage.
4.6.5
VOMS
The Virtual Organization Membership Service (VOMS) [2] was developed
from 2001 to 2004 within the DataGrid project (IST-2000-25182). Its development continues within the EU project Enabling Grids for E-sciencE
(EGEE, IST-2003-508833).
VOMS is an access control service that is conceptually similar to CAS.
A VOMS server stores group memberships for the members of a Virtual
Organization (VO). The resource sites store the rights assigned to the various
user groups. VOMS can be used in both authorization push and pull mode
with the VOMS server acting as attribute authority. However, VOMS does not
provide a service on the resource side to interpret the attribute statements it issues. It
is therefore incomplete as an access control service. The same concerns as for CAS
apply due to the centralization of access control information.
4.6.6
Cardea
Cardea [65] is an access control solution for distributed systems. It has been
developed since 2003 at the NASA Advanced Supercomputing (NAS) Division
of the NASA Ames Research Center in the USA.
Cardea uses XACML as policy language and SAML to certify authorization information. Since XACML is based on the pull model, the same
concerns as for Akenti and PERMIS apply. This and the current limitations
of XACML and SAML with regard to delegation mechanisms as described
in sections 4.4 and 4.5 make this approach unsuited for our application.
4.6.7
PRIMA
PRIMA [67, 66] is a Grid access control system, that has been developed since
2003 at the Department of Computer Science, Virginia Polytechnic Institute
and State University, USA.
PRIMA is a hybrid push/pull architecture, where user attributes are
pushed to the PDP and global policies are pulled by the PDP.
PRIMA specifically supports ad-hoc permission granting. It uses XACML
as policy language and X.509 ACs for authorization. Since the X.509 AC profile
does not support delegation (see section 4.5), the designers of PRIMA have
extended it to support delegation through paths of certificates.
PRIMA maps the data access permissions of a user to local POSIX.1e file
system access control lists [27] or Grid Access Control Lists (GACL) [72]. This
approach makes it more difficult to realize the Grid paradigm of integrating
heterogeneous systems, since it requires one of those specific systems to be
deployed on all machines participating in the Grid architecture.
4.6.8
Summary
In this section we discuss the access control architectures presented until
now with regard to the constraints and requirements we have established in
chapter 3. The results are presented in three tables. Table 4.1 summarizes
the aspects related to the constraints of the medical application, table 4.2
shows the results relating to general principles of good security and table 4.3
shows the results with regard to the constraints of the Grid environment.
Question marks in the tables indicate that the available documentation
does not make it clear if the architecture fulfills the specific requirement. We
now outline the reasons for the negative entries in the tables.
• S1: PERMIS and Cardea are designed to use the authorization pull
message sequence. Moreover PERMIS does not seem to support the
RBAC concept of activating and deactivating roles. We must therefore assume that all roles are active at any time. Akenti can use either the
authorization pull or the push message sequence.
• S2 and G5: In Shibboleth and VOMS, the local sites determine the
access control policies related to all local resources (based on externally
provided attributes). Therefore there may be different access rights to
replicas of data stored at different sites and if a local storage site goes
offline it may miss an update in the permissions concerning the stored
data.
• S3, S5 and S6: CAS and VOMS use a central server for the Community/VO. It stores authorization information for all members of the
Community/VO in unprotected form and is therefore a trusted third
party. While Shibboleth does not have such a centralized service, it assumes that the attribute authority for each user is his home organization. This hinders decentralized authorization systems, where attribute
assertion may originate from distributed sources of authority.
• G1: All the negatively rated systems require a source of authority to
contact a permission storage in order to submit new permissions. Only
then can the PDP or the user retrieve this authorization information
in the process of an authorization decision. This procedure hampers
ad-hoc granting.
• G2: CAS and VOMS require a centralized system in order to get access
to authorization assertions. If the CAS or VOMS server breaks down,
no user of the community/VO will be able to access any authorization
assertions.
• G3: The PRIMA system requires some specific software (POSIX.1e or
GACL) to be deployed at system level on all machines providing Grid
resources.
• A1: The CAS documentation² indicates that CAS supports user and
object groups. However none of the requirements of RBAC (see section
4.2) are specifically addressed.
• A3: In VOMS local sites have complete control over the permissions
related to the data resources they store. Therefore the owner of a file
who stores it on a Grid can not directly control its access permissions.
• A4: It is impossible for an access control system alone to prevent circumvention of data access control by persons having access to the hardware.
This requires additional measures that are discussed in chapter 5.

² Available from
http://www-unix.globus.org/toolkit/docs/development/4.0-drafts/security/cas
            Constraints of the application
System      A1     A2            A3                    A4
            RBAC   Traceability  Owner managed         Circumvention
                                 data access control   protection
----------  -----  ------------  --------------------  -------------
Shibboleth  ?      ?             ?                     no
Akenti      yes    yes           yes                   no
PERMIS      yes    ?             yes                   no
CAS         no     ?             yes                   no
VOMS        yes    ?             no                    no
Cardea      yes    yes           ?                     no
PRIMA       yes    yes           yes                   no

Table 4.1: Summary of how different architectures respond to requirements
of a medical application.
System
Shibboleth
Akenti
PERMIS
CAS
VOMS
Cardea
PRIMA
S1
Least
privilege
yes
yes/no
no
yes
yes
no
yes
General principles of good security
S2
S3
S4
S5
Permission Minimal use Separation
Secure
consistency
of trusted
of
permission
third parties
duties
storage
yes
?
?
yes
?
yes
yes
yes
yes
no
?
no
no
?
no
yes
?
?
yes
yes
?
no
yes
yes
yes
no
yes
yes
S6
No
centralized
services
no
yes
yes
no
no
yes
yes
Table 4.2: Summary of how different architectures follow principles of good security.
            Constraints of the Grid environment
System      G1          G2            G3             G4        G5               G6
            Ad hoc      Dynamic       Integration    Local     Transparency     Scalability
            permission  availability  of             hardware  of data storage
            granting    of resources  heterogeneous  control   locations
                                      systems
----------  ----------  ------------  -------------  --------  ---------------  -----------
Shibboleth  no          yes           yes            yes       no               yes
Akenti      yes/no      yes           yes            yes       yes              yes
PERMIS      no          yes           yes            yes       yes              yes
CAS         no          no            yes            yes       yes              yes
VOMS        no          no            yes            yes       no               yes
Cardea      ?           ?             yes            ?         ?                yes
PRIMA       yes         yes           no             yes       yes              yes

Table 4.3: Summary of how different architectures respond to requirements of a Grid environment.
Chapter 5
Related Work in Storage Security
When dealing with confidential data, the transparent and distributed nature
of grid storage can become a problem. As described in chapter 3, an attacker
who has physical or administrator access to the device providing the storage
space is able to access the data using the local operating system. Such an
access bypasses the Grid access control mechanism. Integrating the Grid access control mechanism into the local file system would contradict the Grid
paradigm of interoperating autonomous and heterogeneous resources without requiring fundamental changes in their operating systems. Furthermore,
such a mechanism would not prevent data disclosure to attackers having
physical access to the device, since the access control could be deactivated
by mounting the disk under another operating system.
Therefore additional protection is definitely needed for confidential data
that are to be shared across a Grid. Some users believe that for our example application of medical data, anonymization and pseudonymization
are sufficient measures of protection. To assess these arguments, one has
to differentiate between privacy protection and confidentiality. While true
anonymization would solve the problem of privacy protection, there may
be cases where confidentiality is nevertheless required, for example to protect some business information. Even if privacy protection is sufficient, one
has to consider that true anonymization is hard to obtain, since confidential
data can often be derived through secondary sources that look innocuous
at first. Total anonymization of medical data is often impossible without
losing its usefulness. Good anonymization requires painstaking case-by-case
examination of the files and is therefore not feasible at the moment (efforts
to automate this process are described in [25]).
Encryption is therefore the best solution for storage security. However
only very limited algorithms exist to perform computations based on encrypted data (see [1], [32]). Therefore data will actually have to be decrypted
before being used.
Having decided to use encryption for secure storage, we have to deal with
the following side-conditions:
• The scientific issues related to the actual process of encryption and
decryption of files for storage are out of the scope of this thesis. It is
however important, that encryption is carried out before copying a file
containing confidential information to the Grid and that the decryption
happens after retrieving an encrypted file from the Grid. The files’
meta-data should contain all the necessary information about which
encryption algorithm was used and all its parameters except the secret
key. For ease of handling, it is preferable that these meta-data are
contained in the header of the file.
• Owners of encrypted files should be able to share them with user groups
that are dynamically changing. This means that the users authorized
to access the file are not known at the moment of encryption and may
change during the lifetime of the encrypted file. This requires a mechanism that allows authorized users to access decryption keys when they
need them.
• Access to the decryption keys should be controlled via the normal fine-grained file access control mechanisms of the Grid. This avoids inconsistent situations where a user is given access to the encrypted file by
the Grid access control but is denied access to the decryption key.
• As the loss of a decryption key also means losing the encrypted data,
the storage of such keys needs to be fault tolerant and thus redundant.
• Measures are to be taken to avoid collusion between the authorities
that manage key storage and an attacker who has access to an encrypted
file.
• When an access permission to an encrypted file is revoked, one
has to decide how to deal with the encryption keys affected by the
revoked permission. Three options with increasing levels of security
are available. The first option is to do nothing and rely on the access
control mechanism to prevent access to the encrypted file. The second
option is lazy re-encryption: the file is re-encrypted with a new key as soon as its content changes. The third
option is immediate re-encryption with a new key. When choosing
between these options, one should consider that in a Grid environment
no measure can protect against malicious disclosure by an authorized
user. Such a user can create unprotected copies of the file on the Grid.
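The three ways of handling a revoked permission can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, and the actual decryption and re-encryption of the file content is replaced by the generation of a new random key.

```python
import secrets
from dataclasses import dataclass, field
from enum import Enum


class RevocationPolicy(Enum):
    RELY_ON_ACCESS_CONTROL = 1   # first option: keep the current key
    LAZY_REENCRYPTION = 2        # second option: new key at the next content change
    IMMEDIATE_REENCRYPTION = 3   # third option: new key right away


@dataclass
class EncryptedFile:
    policy: RevocationPolicy
    key: bytes = field(default_factory=lambda: secrets.token_bytes(16))
    rekey_pending: bool = False

    def _rekey(self) -> None:
        # Placeholder for decrypting with the old key and encrypting with a new one.
        self.key = secrets.token_bytes(16)
        self.rekey_pending = False

    def on_permission_revoked(self) -> None:
        if self.policy is RevocationPolicy.IMMEDIATE_REENCRYPTION:
            self._rekey()
        elif self.policy is RevocationPolicy.LAZY_REENCRYPTION:
            # Remember that the key is known to a revoked user.
            self.rekey_pending = True

    def on_content_changed(self) -> None:
        # With lazy re-encryption, the new content is written under a new key.
        if self.rekey_pending:
            self._rekey()
```

With lazy re-encryption, a revoked user who kept the old key can still decrypt the old content, but not any content written after the revocation.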
5.1
Overview of encryption algorithms for
storage
In order to encrypt files for storage, the choice of the encryption algorithm
has to be made. In this section we discuss some features of the available
encryption algorithms that are relevant to our application.
Given that asymmetric encryption algorithms are orders of magnitude
slower than symmetric encryption algorithms of the same strength, we concentrate on
symmetric algorithms for bulk file encryption.
There are two types of symmetric encryption algorithms: block cipher
algorithms and stream cipher algorithms. A block cipher applies a fixed, key
dependent function on blocks of data (the size of these blocks is typically
64 or 128 bits although algorithms with variable block sizes exist). A stream
cipher on the other hand uses the key to generate a pseudo-random stream
of bits, that is XORed with the plaintext bits.
The advantage of a block cipher is that it allows random access to blocks of
the encrypted data, whereas with a stream cipher the entire preceding
cipher stream has to be calculated in order to access some specific piece of
data. Furthermore, with a block cipher one can securely re-encrypt
modified data with the same key, whereas this would be a major security
risk with a stream cipher: an adversary could XOR the ciphertexts of the
original and the modified data together, thereby eliminating the key stream
and obtaining the two plaintexts XORed with each other. Such a combination of two
cleartexts is cryptographically easy to decipher.
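The risk of re-using a key stream can be illustrated with a few lines of Python. The key stream generator used here (SHA-256 applied to the key and a counter) is a toy construction chosen only because it needs no external library; it is an assumption of this sketch and not one of the ciphers discussed in this section.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy key stream: SHA-256 of the key and a counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def stream_encrypt(key: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same operation for an XOR stream cipher.
    return xor(data, keystream(key, len(data)))


key = b"demo key"
original = b"patient: A. Smith, result: negative"
modified = b"patient: A. Smith, result: positive"

c1 = stream_encrypt(key, original)
c2 = stream_encrypt(key, modified)   # same key re-used after a modification

# The key stream cancels out: the adversary obtains the XOR of the two
# plaintexts without knowing the key.
leak = xor(c1, c2)
```

The value `leak` is zero wherever the two versions agree, so the adversary also learns which parts of the file were modified.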
The advantage of stream ciphers is that they are generally faster
than equivalent block ciphers. In a test we ran with the Crypto++ library
(version 5.1) on a 1.9 GHz Pentium 4, the stream cipher ARC4 encrypted
at a rate of 24 MB/s, the stream cipher SEAL at 60 MB/s while the block
cipher AES using a 128-bit key encrypted at a rate of 10 MB/s.
Block ciphers can be operated in different modes that have several interesting characteristics for our application. We have examined the electronic
codebook mode (ECB), the cipher block chaining mode (CBC) and the cipher feedback mode (CFB).
The ECB mode just encrypts the data block by block with no further
modifications.
The advantages of ECB are that both encryption and decryption are
parallelizable, that random access to blocks of an encrypted file is possible
and that re-encryption of modified blocks with the same key is possible.
The drawbacks of ECB are that it is relatively easy to make undetected
manipulations of encrypted blocks of data and that this encryption mode
does not conceal identical patterns between blocks of cleartext. This would
allow an attacker to gain information about the content of the encrypted file
without having to decrypt it. Another drawback is that the data must have a
size that is a multiple of the cipher's block size. Therefore the last block of
data may be too short. The most common solution to this problem is to pad
it with meaningless bits in order to make it fit. This means that the
encrypted data will be bigger than the cleartext. Although
the increase in size is very small (smaller than the block size of the cipher
algorithm), it may still lead to problems, for example when the
ciphertext is to replace plaintext stored in a database table cell having a
fixed size. A method known as ciphertext stealing makes it possible to keep ciphertext
and plaintext the same size. It is presented for CBC mode in figure 5.2. For
a description of ciphertext stealing in ECB mode, please refer to chapter 9
of [87].
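The size increase caused by padding can be made explicit with a small calculation. The function name `padded_size` is hypothetical, and the sketch assumes a padding scheme that only fills up an incomplete last block, so that the number of padding bits has to be recorded elsewhere, for example in the meta-data of the file.

```python
def padded_size(plaintext_size: int, block_size: int = 16) -> int:
    """Size in bytes after padding the last block up to a whole number of blocks."""
    full_blocks, remainder = divmod(plaintext_size, block_size)
    return (full_blocks + (1 if remainder else 0)) * block_size


# A 100-byte value encrypted with a 128-bit (16-byte) block cipher occupies
# 112 bytes and no longer fits into a database cell of 100 bytes.
cell_size = 100
fits = padded_size(100) <= cell_size
```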
The Cipher Block Chaining (CBC) mode, illustrated in figure 5.1 makes
all ciphertext blocks dependent on the previous ciphertext blocks. The goal
is to make manipulations of the plaintext detectable and to conceal identical
patterns between blocks of plaintext in the ciphertext. The cost of this is
that encryption is no longer parallelizable, however decryption still is, and
random access to blocks of an encrypted file is still possible. Re-encryption
under the same key is possible, however this requires the re-encryption of all
following blocks.
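The difference between ECB and CBC with regard to plaintext patterns and random access can be sketched in Python. Since the standard library offers no block cipher, the sketch uses a toy 4-round Feistel network as a stand-in for a real algorithm such as AES; this cipher, all function names and the requirement that the data length is a multiple of the block size are assumptions of the example.

```python
import hashlib
import os
from typing import List

BLOCK = 16  # block size in bytes


def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[: BLOCK // 2]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy Feistel network; a stand-in for a real block cipher, not secure."""
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def blocks(data: bytes) -> List[bytes]:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, data: bytes) -> bytes:
    # Every block is encrypted independently of all others.
    return b"".join(encrypt_block(key, b) for b in blocks(data))


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    out, prev = [], iv
    for b in blocks(data):
        # Each plaintext block is XORed with the previous ciphertext block.
        prev = encrypt_block(key, _xor(b, prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt_block(key: bytes, iv: bytes, ciphertext: bytes, index: int) -> bytes:
    """Random access: block i only needs the ciphertext blocks i-1 and i."""
    cs = blocks(ciphertext)
    prev = iv if index == 0 else cs[index - 1]
    return _xor(decrypt_block(key, cs[index]), prev)


key = b"demo key"
iv = os.urandom(BLOCK)
plaintext = b"A" * BLOCK * 2 + b"B" * BLOCK   # the first two blocks are identical

ecb = blocks(ecb_encrypt(key, plaintext))     # ecb[0] equals ecb[1]
cbc = blocks(cbc_encrypt(key, iv, plaintext)) # cbc[0] differs from cbc[1]
third = cbc_decrypt_block(key, iv, b"".join(cbc), 2)
```

Modifying one plaintext block changes the corresponding CBC ciphertext block and therefore the input of every later encryption step, which is why all following blocks have to be re-encrypted.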
The Cipher Feedback (CFB) mode turns a block cipher into a stream
cipher by using the block cipher's output as key stream. Block ciphers in
CFB mode can operate on pieces of data smaller than the block size. This
could be used for bit-by-bit encryption, but such a mode of operation
would be very inefficient. The CFB mode is illustrated in figure 5.3. As with
CBC, it has the effect of concealing patterns in blocks of plaintext and making
manipulations of the ciphertext detectable. As with CBC, encryption is
no longer parallelizable, but decryption still is. Random access to blocks of
an encrypted file is possible. Re-encryption requires the use of a completely new
initialization vector, since otherwise the same security issues as with normal
stream ciphers would apply.
The advantages and drawbacks of each encryption mode are summarized
in table 5.1. For more details please refer to chapter 9 of [87].
[Diagram: CBC Encryption and CBC Decryption of the blocks P1, P2, P3 and C1, C2, C3 with the key and the IV]
Figure 5.1: The cipher block chaining mode. The Pi are the plaintext blocks,
the Ci the ciphertext blocks and IV is a randomly generated initialization
vector.
[Diagram: Encryption and Decryption of the last two blocks Pn−1 and Pn with ciphertext stealing]
Figure 5.2: Ciphertext stealing in CBC mode. The Pi are the plaintext blocks,
the Ci the ciphertext blocks. C’ is a temporary value that is not stored with
the ciphertext.
[Diagram: CFB Encryption and CFB Decryption using a shift register (with n-bit cells); the register is encrypted with the key and the n leftmost bits of the result are combined with Pi or Ci]
Figure 5.3: An n-bit cipher-feedback mode. At the start of this operation, the
shift register is filled with an initialization vector IV. The Pi and Ci are n
bits long. The total size of the shift register is the block size of the encryption
algorithm.
ECB:
- Plaintext patterns are not concealed.
- Easy to manipulate encrypted blocks of data.
+ Parallelizable de/encryption.
+ Random access possible.
+ Re-encryption of modified blocks possible.
CBC:
+ Plaintext patterns are concealed.
+/- Plaintext somewhat difficult to manipulate.
+/- Encryption is not parallelizable, decryption is.
+ Random access possible.
+/- Re-encryption of all subsequent blocks after modification.
CFB:
+ Plaintext patterns are concealed.
+/- Plaintext somewhat difficult to manipulate.
+/- Encryption is not parallelizable, decryption is.
+ Random access possible.
- Re-encryption of the whole file after modification.
Table 5.1: Summary of the advantages and drawbacks of block cipher modes
with regard to encrypted storage.
5.2 Standardization
The IEEE Computer Society has taken an interest in storage security and has
sponsored the Security in Storage Working Group (SISWG, see footnote 1) as a body to
work on the definition of standards for cryptographic algorithms and methods
for encrypting data for storage. The group started its work in early
2004 and has produced three draft documents so far: a proposal for a key
backup format [35], and two proposals of block-cipher modes for the AES
algorithm, specifically suited for storage security [36], [34]. The publications
presenting these new block-cipher modes suggest that they have the same
positive properties as the CBC encryption mode and additionally keep
encryption and decryption parallelizable. However, they double the number of
encryption operations, so it remains to be seen whether the advantages of
these proposals outweigh the loss of performance. We therefore prefer to let the
cryptographic community study these proposals for some time before deciding
to use them.
The overall goal of SISWG’s efforts is to facilitate the interoperability of
encrypted data storage mechanisms and to reduce the risk of data loss
through incompatibilities that may occur if encrypted data are accessed after
a long period of encrypted storage.
Since the work of the group is still relatively recent and currently deals mainly
with cryptographic algorithms, it has no major impact on the proposals
of this thesis. However, one can expect that future standards issued by this
working group will be more relevant to this approach.
5.3 Encrypted storage systems
5.3.1 CFS
The first system supporting encrypted storage was the Cryptographic File
System (CFS) [12]. It was developed by Matt Blaze from AT&T Bell Laboratories in 1993. CFS provides encryption and decryption functionality at
local file system level.
It uses DES in a combination of the two encryption modes ECB and OFB
(output feedback mode). The idea behind this encryption scheme is to allow
modifications in encrypted file blocks without having to re-encrypt the rest
of the file as would be necessary with the CBC encryption mode.
The granularity of protection in CFS is the directory; fine-grained
file encryption is therefore not possible. No functionality for sharing encrypted data
and related decryption keys in distributed environments is provided. Later,
a key escrow system was introduced in [13]. It uses smartcards to build a
bilaterally auditable escrow system, in which the key holder can verify
that the escrow agent has not used the key, and the escrow agent can verify
that he holds a valid key without using it. Its purpose is to recover the key
should its main copy become unavailable.
1 http://www.siswg.org
5.3.2 TCFS
The Transparent Cryptographic File System (TCFS) developed at the University of Salerno in Italy during 1997 [21, 22] works fundamentally the same
way as CFS.
TCFS was originally designed to use the DES cipher in CBC mode. However the newest version of TCFS [22] provides dynamic encryption modules
that allow the user to choose the encryption algorithm.
The latest version of TCFS also proposes a group sharing protocol. It
allows a group of users to access a file, if a certain threshold number of group
members participate in the access attempt. This mechanism does not work
for distributed group access since all group members must log into the same
workstation.
5.3.3 CryptFS
CryptFS [103] is a stackable Vnode Level Encryption File System. It was
developed around 1998 at the Computer Science Department of Columbia
University, USA. CryptFS extends the functionality of CFS to make it more
efficient and resilient against insider attacks by integrating it in the kernel of
the operating system.
CryptFS uses the Blowfish [87] encryption algorithm in CBC mode on file
data blocks of 4 or 8 KB. This means that each block is independent of
the others and can be modified and re-encrypted separately.
Like CFS, CryptFS does not provide any file- and key-sharing mechanisms.
5.3.4 P. Gutmann’s SFS
Peter Gutmann’s Secure FileSystem (SFS) [56] implements a cryptographic
storage file system for MS-DOS. It was developed until 1995 while Gutmann
was a graduate student at the University of Auckland, New Zealand.
SFS uses the MDC/SHS encryption algorithm designed by Gutmann himself, which
turns a one-way hash function into a block cipher that runs in CFB
mode. Schneier raises some concerns about this construction on p. 353 of
[87], as hash functions are generally not designed to be used in that way and
their security for encryption use is therefore not well researched.
Although SFS has no support for file sharing, it has an interesting feature
that makes it worth mentioning: an emergency key access mechanism using
Shamir’s secret sharing scheme [90], where the key is split into n shares which
are distributed to trusted key escrow agents. The key can be recovered from
any subset of m key shares. A smaller subset reveals no information about
the key. Therefore at least m escrow agents must collude to gain
access to the key. We have adapted this idea in our approach as described in
section 7.1.
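The (m, n) threshold idea can be sketched in a few lines of Python. This is a minimal illustration of Shamir's scheme over a prime field, not the code used by SFS or by our approach; the field modulus PRIME and the function names are assumptions made for this example, and the secret is assumed to be an integer smaller than PRIME.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the field must be larger than the secret


def split_secret(secret: int, m: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares such that any m of them recover it."""
    if not (1 <= m <= n) or not (0 <= secret < PRIME):
        raise ValueError("need 1 <= m <= n and 0 <= secret < PRIME")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for k, (xk, _) in enumerate(shares):
            if k != j:
                num = (num * -xk) % PRIME
                den = (den * (xj - xk)) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Each escrow agent would hold one (x, y) pair. Any m pairs determine the polynomial and hence its constant term, while fewer than m pairs are consistent with every possible secret.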
5.3.5 WinEFS
The Windows Encrypting File System (WinEFS) [73] is delivered with Microsoft Windows NT 5.0/2000/XP and Windows Server 2003.
WinEFS uses the DESX encryption algorithm with a 128-bit key or TDES
with a 168-bit key. In export versions only 40 bits of the actual key are used.
The documentation of WinEFS does not state which block-cipher
mode is used.
WinEFS uses the lockbox concept to make decryption keys available to
authorized users. The idea of a lockbox is to store the decryption key for
a file, encrypted with the public key(s) of the user(s) authorized to access
the file. Figure 5.4 illustrates this concept. WinEFS stores these lockboxes
in the file’s header. File sharing is done by adding a lockbox encrypted with
the newly authorized user’s public key to the file header. This concept is
impractical for managing groups with dynamically changing membership, since it
would require constant updates to the headers of the encrypted files.
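The lockbox concept can be illustrated with a small Python sketch. It is deliberately a toy: the RSA key pair is built from two small, publicly known Mersenne primes and used without padding, and the symmetric cipher is a hash-based XOR keystream. Neither is secure, and neither reflects the algorithms used by WinEFS; all names and parameters are assumptions made for this example.

```python
import hashlib

# Toy RSA key pair (assumption: illustration only, far too small and unpadded).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 counter keystream (NOT secure)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out += bytes(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def make_lockbox(file_key: bytes, public_key: tuple[int, int]) -> int:
    """Step (b): encrypt the symmetric file key with a user's public key."""
    n, e = public_key
    return pow(int.from_bytes(file_key, "big"), e, n)


def open_lockbox(lockbox: int, private_key: tuple[int, int], key_len: int = 16) -> bytes:
    """Step (c): recover the symmetric file key with the user's private key."""
    n, d = private_key
    return pow(lockbox, d, n).to_bytes(key_len, "big")
```

In this sketch a file header would be a mapping from user to lockbox, for example `{"alice": make_lockbox(file_key, (N, E))}`. Sharing one file with k users requires k lockboxes in that header, and a membership change touches the header of every file the group may access, which is the management cost noted above.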
5.3.6 SNAD
The Secure Network Attached Disks (SNAD) [74] system was developed
around 2002 at the University of California, USA.
SNAD uses the RC5 encryption algorithm with a key length of 128 bits
in CBC mode.
It handles file sharing in a similar way to WinEFS. The decryption keys
are stored in lockboxes and associated with the files as meta-data. In contrast
to WinEFS, a lockbox can be associated with multiple files (which means all
those files have been encrypted with the same key). This design has the same
limitations as WinEFS with regard to our requirements.
[Figure 5.4: four panels. (a) Symmetric encryption of the confidential file with the symmetric key k, giving the encrypted file. (b) Asymmetric encryption of the symmetric key k with the public key pu, giving the lockbox. (c) Asymmetric decryption of the lockbox with the private key pr, giving the symmetric key k. (d) Symmetric decryption of the encrypted file with the symmetric key k, giving the confidential file.]
Figure 5.4: The lockbox concept. In step (a) a confidential file is encrypted
with a symmetric key k. In step (b) this symmetric key k is encrypted with
the public key of a user authorized to access the encrypted file. In step (c)
this user retrieves the lockbox and opens it with his private key. Finally,
in step (d) this user decrypts the file with the symmetric key found in the
lockbox.
5.3.7 Cepheus
Cepheus [50] is a cryptographic storage system supporting group sharing and
random access. It was developed from 1998 to 1999 at the Massachusetts
Institute of Technology (MIT) by Kevin E. Fu. Cepheus is based on parts of
D. Mazières’ Self-certifying File System (also developed at MIT) [71].
Cepheus uses the RC5 encryption algorithm in CBC mode. Similarly to
TCFS, each 8 KB block of file data is encrypted separately in this mode,
each with a different initialization vector, thus allowing for random read
and write access.
In Cepheus, a file shared between a group of users is encrypted with a
symmetric group key. A group database server maintains up-to-date group
membership information for users. It stores the group key in lockboxes encrypted with the public keys of the group members. The group database
server responds to requests from user agents and delivers group key lockboxes
to group members. The file server communicates with the group database
server to determine if a user has access to a specific file based on a group
membership.
This configuration requires the owner of the encrypted file to know the
public keys of all group members with whom he wants to share the decryption key. If new users are added to the group, the owner of the file is
required to update the group database by adding new key lockboxes. Such an
approach is clearly too costly in distributed environments with dynamically
changing user groups.
5.3.8 J.P. Hughes’ SFS
The Secure File System (SFS, see footnote 2) [60, 59] is a joint project between the University of Minnesota, USA and StorageTek Corp. which started in 1999. It
aims at providing an easy-to-use cryptographic file system.
The publications concerning SFS do not indicate which cryptographic
algorithms are used within SFS.
SFS proposes a group sharing mechanism that is also based on group
servers. Each group server can manage key access for several subgroups.
The header of an encrypted file contains an access control list (ACL)
signed by the source of authority for that file. The ACL specifies the groups
and individual users that are allowed to access the file. For each individual
entity that is allowed to access an encrypted file, a lockbox encrypted with
that entity’s public key is provided in the ACL. The ACL may also contain
shared authorizations that require different entities to cooperate in order to
gain access to the key. In this setting, the lockboxes contain shares of the
file decryption key. For group access to encrypted files, the header contains a
lockbox encrypted with the public key of the group server.
A user who wants to access an encrypted file through a group membership must recover the ACL and send it to the corresponding group server.
The group server uses the ACL to determine whether the user is allowed to access
the file. If access is granted, the server decrypts the lockbox and returns
the file decryption key to the user.
SFS also provides a smartcard interface for the secure storage of a user’s
private key. All private key operations are performed on the smartcard.
Therefore the private key never leaves the smartcard and is better protected
than in classical password based protection schemes.
The drawback of SFS is that the group server is a single trusted entity. It
can decrypt any lockbox related to groups it manages and therefore is itself
a valuable target for attacks.
2 SFS by P. Gutmann, SFS by D. Mazières and SFS by J.P. Hughes et al. are completely
unrelated.
5.3.9 C-SDA
Chip-Secured Data Access (C-SDA) [18, 17] is a recent encrypted storage
approach developed since 2002 at the PRISM laboratory of the University of
Versailles in France.
It proposes sharing of encrypted files through the use of smartcards as
tamper-resistant devices for storing access rights and decryption keys. In the
C-SDA approach, every user must have a smartcard that stores the access
rights and the decryption keys to the data he may access. If the user requests
access to some encrypted data, the smartcard verifies whether that user has the right
to access that data and performs the decryption of the data for the user. The
decryption keys never leave the smartcard, and therefore there is no need
to update the encryption keys when access rights are revoked.
The access rights and keys on the smartcard are updated from external
servers at connection time (i.e. when the user inserts the smartcard into a reader).
The protection granularity of C-SDA is the database view. As views are
generated dynamically, no bijection between encryption and access rights
exists. Therefore encryption must remain orthogonal to access rights in this
approach.
Our concern with this approach is that the smartcards are considered as
tamper-proof devices. Since the discovery of side-channel attacks [62, 63], new
methods of attacking smartcards and similar devices based on side-channel
information are found at an alarming rate (e.g. [81], [69]) and therefore the
protection mechanisms have to be updated frequently.
5.3.10 Summary
In this section we have presented a summary of the main secure storage systems. We have concentrated on three characteristics: the granularity of encryption, the key sharing mechanisms and the special features (when present)
that make the system noteworthy. The results of this summary are presented
in table 5.2. Our conclusion is that the protection granularity of an encrypted
storage system should allow individual files to be protected, and that current key
sharing mechanisms are not suited for dynamically changing permissions.
Encrypted        Granularity           Key sharing             Special features
storage system   of encryption
--------------------------------------------------------------------------------------------
CFS              Directory             No                      Smartcard based escrow
TCFS             User account          No                      Threshold sharing
CryptFS          User account          No                      -
Gutmann’s SFS    Partition             No                      Key sharing escrow
WinEFS           File                  Lockbox                 -
SNAD             File                  Lockbox                 -
Cepheus          File                  Group server            -
Hughes’ SFS      File                  Group server & lockbox  Smartcard support
C-SDA            Database table views  Based on access rights  Smartcard based management
                                                               of keys and permissions
Table 5.2: Summary of encrypted storage systems.
Chapter 6
Sygn access control
In this chapter we describe the design and implementation of our access
control architecture Sygn (see footnote 1).
We first give an overview of the components and their deployment in
section 6.1; then the syntax and semantics of the Sygn access control language
are presented in section 6.2. Section 6.3 describes the meta-data used by the
Sygn access control. A detailed description of the Sygn decision algorithm is
given in section 6.4. The chapter closes with a discussion of Sygn in
section 6.6.
6.1 Sygn overview
As our main goal is to support ad-hoc access control decisions and decentralized delegation, we have decided to use authorization certificate paths for
permission granting. We have opted for a permission push model for the reasons outlined in section 4.3, and users store all their permissions themselves.
The permissions are protected against tampering by a digital signature.
We distinguish between two kinds of users in the Sygn architecture:
• Users owning a resource and acting as its source of authority: These
users grant permissions that allow the use of their resource. For this
purpose they use a Sygn owner client (SOC).
• Users acting as resource consumers: These users want to access resources with their permissions. A Sygn user client (SUC) allows them
to store owned permissions and retrieve resources according to these
permissions.
As a particular user may act both as owner and as consumer, we have created
a dual-use client that functions either as SOC or SUC, depending on the
user’s actions.
1 Sygn is a name from Nordic mythology. It designates a goddess of truthfulness, but
also of doors and locks. She guards the entrance of the Wingolf palace and admits only
the honest.
In figure 6.1, a resource owner installs a hardware resource or stores a
data resource on a (possibly remote) resource server. The SOC contacts the
resource server and registers the user as source of authority (SOA) for this
resource in the local Sygn server’s meta-data-base (step 1).
To grant access to this resource to a resource consumer (step 2) the resource owner issues authorization certificates that allow access to the resource. Note that this process involves only the owner and the consumer(s),
and can be done offline.
The SUC allows a user to store and retrieve authorization certificates as
needed. To access the resource, the user contacts the corresponding resource
server (we assume that the localization of the correct server is realized by the
Grid middleware) and submits the request along with the needed certificates
as shown in step 3.
The resource server needs two different Sygn modules: a Sygn PDP,
which produces an access control decision based on the Sygn request provided
by the user, and a Sygn PEP. The PEP has two primary functions. First,
it has to make sure that the Sygn request corresponds to the Grid request
the user submitted (otherwise a user could submit valid permissions for one
resource together with a Grid request concerning another one). Second, it
uses the Grid authentication mechanism to verify whether the request issuer really
owns the permissions he submits.
The Sygn PDP decides whether the request is correct and authorized. For traceability, the Sygn server can be configured to log all requests. Non-repudiation
of those logs can be achieved by another server configuration option that
makes the timestamping and digital signature of requests by the issuer
mandatory.
As the Sygn PDP is completely separated from the Grid middleware,
only the PEP has to be re-implemented for different Grid middlewares, depending on the requests the middleware allows and the authentication
procedures it uses. We have thus created a Sygn integration module for the µ-Grid
middleware (see footnote 2) that supports basic file access requests such as get-file, delete-file
and put-file.
2 Available from http://www.i3s.unice.fr/~johan/ugrid/ugrid.html
[Figure 6.1: the resource owner (Sygn owner client with SOC meta-data), the resource server (Sygn PEP and Sygn PDP with PDP meta-data) and the resource consumer (Sygn user client with SUC meta-data). Step labels: 1) Stores resource and Sygn SOA metadata; 2) Issues AC(s) for resource; 3) Uses AC(s) to access resource.]
Figure 6.1: Deployment of, and interaction between, Sygn modules on a Grid.
6.2 Syntax and semantics of the Sygn Language
As explained in the previous section, the Sygn access control language is
designed to support decentralized management of permissions through the
use of certificates. The Sygn access control model supports both role-based
access control for flexible management of permissions and discretionary access
control for fine-grained, ad-hoc granting of permissions in inter-institutional
resource sharing scenarios.
For the reasons specified in section 4.4.4 we have chosen XML [20] to
represent the Sygn access control language. Basically, the role of every access
control language is to express which actions an access control subject may
perform on an access control object. The Sygn access control language is
used to express both requests and permissions. An XML schema
definition of the elements presented in this chapter can be found in appendix A.
6.2.1 Subjects
Sygn recognizes two types of access control subjects: individual entities and
roles. Subjects can be granted permissions and can be sources of authority
(SOA) for some access control object.
Following the concept of SPKI (see section 4.5.3), Sygn identifies individual entities using their public key. Such an identifier is encoded as a user
identifier (UID) as shown in figure 6.2. The corresponding private key is
used for authentication of the entity and for signing authorization certificates if the entity acts as SOA for an access control object. Note that the
public key is considered sufficient as a unique identifier. Sygn does not associate a distinguished name with public keys for user identification, as
would be the case in X.509 [80] certificates. Given the size of the namespace
(valid public/private-key pairs), it is extremely improbable that two users will
accidentally be assigned the same UID.
<USER_ID>
  MIGdMA0GCSqGSIb3DQEBAQUAA4GLADCBhwKBgQDqmTTMboHuJ7
  LuhajR/tdhu/WhdKLPca4b4LYFiOzkkB0aCa1KUBhoZAz0VU+R
  xTvSx9cORUl3+t8rHwPPusq39RK+Sr3pPho+KL+IfzlhqRRx9O
  TSiPgSvTEGXllVd2VYnjV8ssoguzsCsMZRKcQXXmreDHbWF9sK
  KYT76aUraQIBEQ==
</USER_ID>
Figure 6.2: An example of a user identifier (UID) in Sygn using a 1024-bit
RSA public key.
A role is identified by its name and the UID of its SOA. The UID of the
SOA hereby forms a namespace prefix under which the role’s name must be
unique (i.e. two roles are equal if they have the same name and the same
SOA).
The SOA can grant the role to other subjects, including other roles.
Granting activation of a role A to a role B makes role A hierarchically inferior
to role B thus B inherits all permissions of role A. The following example
illustrates these facts.
Consider the following permissions:
• P1 permits read access on file_I to role A
• P2 permits write access on file_I to role B
• P3 permits activation of role A to role B
Then role B is said to be hierarchically superior to role A, since users who
can only activate role A can only read file_I, while users who can activate
role B can also activate role A and thus read and write file_I.
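The inheritance in this example can be reproduced with a short Python sketch. It is only an illustration of the semantics described here, not the Sygn decision algorithm: roles are modelled as (SOA UID, name) pairs, and the dictionaries `activations` and `permissions` are hypothetical in-memory stand-ins for what would really be expressed by signed certificates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    soa_uid: str  # UID of the role's source of authority (namespace prefix)
    name: str     # role name, unique only within the SOA's namespace


def reachable_roles(start: Role, activations: dict) -> set:
    """Roles that become usable once `start` is activated (transitive closure)."""
    seen, stack = {start}, [start]
    while stack:
        role = stack.pop()
        for inferior in activations.get(role, ()):
            if inferior not in seen:  # also guards against cyclic grants
                seen.add(inferior)
                stack.append(inferior)
    return seen


def effective_permissions(role: Role, activations: dict, permissions: dict) -> set:
    """Union of the permissions of the role and of all roles it may activate."""
    result = set()
    for r in reachable_roles(role, activations):
        result |= permissions.get(r, set())
    return result
```

With P1 to P3 encoded as `permissions = {A: {("file_I", "read")}, B: {("file_I", "write")}}` and `activations = {B: {A}}`, role B obtains both read and write on file_I, whereas role A obtains read only. Because `Role` compares both fields, two roles with the same name but different SOAs are distinct.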
If the SOA of a role could be another role, two paradoxical situations would become possible. First, a role A could be SOA of role B and vice versa, which would
make the hierarchy graph cyclic. Second, an infinite recursion might be created, where for every i ∈ N role A_i has role A_{i+1} as SOA. While the first paradox
simply makes no sense, the second one could be used to create unnaturally
big certificates in order to crash the system. The benefit of allowing a first
role to be SOA of a second role would be that every member of the first role
would be SOA of the second. This can also be achieved more cleanly by giving the first role the right to delegate activation of the second role. We have
therefore decided that only user identifiers (UID) may be used as SOAs for
a role.
Sygn encodes a role in a role identifier (RID) as shown in figure 6.3.
The field containing the review repository indicates a location where copies
of all permissions concerning this role should be stored for review. Such
review functions are required in standard RBAC (see section 4.2.3). Note that
within the architecture of Sygn it is impossible to enforce that every SOA who
grants a permission to a role also sends a copy to the role’s review repository.
However, such functionality could be integrated in a standard permission-granting interface that is provided to the Grid users.
<ROLE_ID>
  <ROLE_SOA>
    <USER_ID>
      MIGdMA0GCS...0/OfMwIBEQ==
    </USER_ID>
  </ROLE_SOA>
  <ROLE_NAME>
    nurse/station1B/SomeHospital
  </ROLE_NAME>
  <REVIEW_REPOSITORY>
    http://repository.SomeHospital.fr:4711
  </REVIEW_REPOSITORY>
</ROLE_ID>
Figure 6.3: An example of a role identifier (RID) in Sygn.
Sygn supports a special subject identifier, the <ANY_SID/> tag. This
identifier matches any other subject identifier. It can therefore be
used to grant permissions to everyone.
6.2.2 Objects
Sygn currently identifies four types of access control objects: files, file-sets,
role-objects (roles used as objects) and hardware resources. The Sygn
architecture allows more object types to be added easily.
Every object is identified by a name and a SOA. The
SOA can be any Sygn subject, with the exception of role-objects, where the
SOA is necessarily a UID. As with role subjects, the identifier of the SOA
forms a namespace prefix under which an object’s name must be unique.
Semantically, an object’s SOA is considered to have all permissions on that
object, and the SOA is therefore never required to grant himself
any such permissions explicitly.
File identifiers (FID) are used to identify single files for fine-grained access control decisions. The logical filenames that are part of a FID can be
generated by Sygn as the result of a cryptographic hash applied to the file’s
content. However, this is not mandatory and can be replaced by any other
mechanism for naming files. Sygn access control assumes that when files are
replicated on Grid storage, the replicas get the same file identifier as the original. This allows Sygn permissions to be consistently applied to any replica
of a file.
File-set identifiers (FSID) address a set of files. This makes it
possible to group files together into a set and issue global permissions on
that set. Single files identified by FIDs can be added to a set, thus making
all permissions concerning the set valid for that specific file. Furthermore, a
file-set can be added to another file-set, thus making the first a subset of the
second (i.e. all permissions concerning the second file-set will also apply to
the first).
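How permissions propagate from an enclosing file-set down to its members can be sketched as follows. The sketch is an illustration under simplifying assumptions: set membership and permissions are held in plain dictionaries (`member_of`, `set_permissions`), whereas Sygn expresses both through certificates, and all names are hypothetical.

```python
def enclosing_sets(obj: str, member_of: dict) -> set:
    """Every file-set that directly or indirectly contains `obj`."""
    found, stack = set(), [obj]
    while stack:
        current = stack.pop()
        for parent in member_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_permitted(obj: str, action: str, member_of: dict, set_permissions: dict) -> bool:
    """An action is allowed if it is granted on the object or on any enclosing set."""
    targets = {obj} | enclosing_sets(obj, member_of)
    return any(action in set_permissions.get(t, ()) for t in targets)
```

For instance, with `member_of = {"file1": {"setA"}, "setA": {"setB"}}`, a read permission issued on setB also applies to setA and to file1.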
Resource identifiers (RESID) are used to identify hardware resources such
as storage space on a disk or computing power provided by CPUs.
Role object identifiers (ROID) are identical to roles; the different name
is only used to distinguish between roles used as subjects and roles used as
objects of Sygn permissions.
Figure 6.4 shows a file-set identifier that has a role as source of authority.
<FILE_SET_ID>
  <SET_SOA>
    <ROLE_ID>
      ...
    </ROLE_ID>
  </SET_SOA>
  <SET_NAME>
    research_project_42B3701_files
  </SET_NAME>
</FILE_SET_ID>
Figure 6.4: An example of a file-set identifier (FSID) in Sygn.
6.2.3 Actions
Sygn supports a set of basic actions which can easily be extended if the need
arises.
The different actions are specific to the object types, and newly introduced
actions should be assigned to one or more object types too.
Currently file-based objects (i.e. files and file-sets) support the actions read
and write, which allow reading or writing that file-based object. For a file-set, this
means the action is applicable to any file in the set. The action add_to_set
allows adding a file-based object to a file-set or granting the permission to
do so. Correspondingly, the action remove_from_set gives the right to revoke
certificates that add file-based objects to a file-set.
Roles currently only have the activate action, which allows activating the
role and using its permissions.
Hardware resources have the grant and the use actions. The grant action
is used to grant a certain amount of hardware use (measured by an external
metric) to a user. It therefore has an additional attribute specifying the
numerical value that is granted to the user.
The use action serves for the actual requests to use the granted resource.
The use action therefore only appears in requests and not in permissions. It
also has a numerical value attribute specifying the requested amount of the granted
hardware resource.
Figure 6.5 shows the encoding of a grant action with its additional attribute.
<ACTION SIZE="1000">
  grant
</ACTION>
Figure 6.5: An example of a grant action of size 1000 in Sygn.
6.2.4 Capabilities
Sygn defines a capability as a legal combination of an object and an action.
Capabilities are used to grant or request specific actions on specific objects.
To allow future versions of Sygn to ask users if they can produce certain capabilities without disclosing their content, each capability has a unique identifier generated by calculating a cryptographic hash over the XML-encoded
data of the capability. Figure 6.6 shows the encoding of a capability giving
read access to a file.
<CAPABILITY>
  <CAPABILITY_ID>
    hEqrpFH6tN1w0FRNjSI0EWIPRi4=
  </CAPABILITY_ID>
  <OBJECT> <UNIQUE_FILE_ID>
    <FILE_SOA> <USER_ID>MIG...BEQ==</USER_ID></FILE_SOA>
    <LOGICAL_FILENAME>+/AbBuY...xe88=</LOGICAL_FILENAME>
  </UNIQUE_FILE_ID></OBJECT>
  <ACTION> read </ACTION>
</CAPABILITY>
Figure 6.6: An example of a capability allowing to read a file.
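A capability identifier of this shape can be derived as sketched below. The sketch rests on assumptions: the text does not name the hash function or the exact byte serialization that is hashed. SHA-1 is used here only because the 28-character Base64 identifiers in the figures are consistent with a 20-byte digest, and the serialization produced by Python's xml.etree is a stand-in for whatever canonical form Sygn actually uses. The function name capability_id is hypothetical.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def capability_id(object_xml: str, action: str) -> str:
    """Base64-encoded hash over the XML-encoded object and action (assumed layout)."""
    cap = ET.Element("CAPABILITY")
    obj = ET.SubElement(cap, "OBJECT")
    obj.append(ET.fromstring(object_xml))  # e.g. a <UNIQUE_FILE_ID> element
    ET.SubElement(cap, "ACTION").text = action
    # Serialize without an XML declaration, then hash the UTF-8 bytes.
    encoded = ET.tostring(cap, encoding="unicode").encode("utf-8")
    return base64.b64encode(hashlib.sha1(encoded).digest()).decode("ascii")
```

Since the identifier depends only on the object and the action, two parties can compare identifiers to establish that they refer to the same capability without exchanging its content. Any change to the object or the action yields a different identifier.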
A special type of capability expresses that an object is added
to a specific file-set. This capability has a second object, which is the target
file-set. Figure 6.7 shows the encoding of such a capability.
<CAPABILITY>
  <CAPABILITY_ID>
    hEqrpFH6tN1w0FRNjSI0EWIPRi4=
  </CAPABILITY_ID>
  <OBJECT> <UNIQUE_FILE_ID>
    ...
  </UNIQUE_FILE_ID></OBJECT>
  <ACTION> add_to_set </ACTION>
  <SECOND_OBJECT> <FILE_SET_ID>
    ...
  </FILE_SET_ID></SECOND_OBJECT>
</CAPABILITY>
Figure 6.7: An example of a capability that adds a file to a file-set.
6.2.5 Authorization Certificates
The basic building block of a Sygn permission is the Authorization Certificate
(AC). It binds a capability to a subject and specifies various
conditions related to the use of the capability.
Basically a Sygn AC consists of:
• An identifier generated by calculating a cryptographic hash of the
other AC data.
• A creator, who is identified by a UID, and whose private key is used to
sign the AC.
• An owner, also called the subject of the permission.
• A capability that is given to the owner by the creator; it contains the
object and the action of the permission.
• Validity limits (not before and not after), represented by two timestamps.
• Restrictions on the use of the permission (not with).
• A delegation depth limit.
• A digital signature that makes unauthorized modifications of the permission detectable (see footnote 3).
Figure 6.8 shows an example Sygn permission certificate encoded in XML.
Please note that the file’s SOA in this example (line 07) is not identical to
the AC creator (line 02). Therefore this permission may only be validated by
other ACs that give the creator of this AC the right to delegate read access
on the file.
The AC’s identifier (line 01) is used for revocation of authorization certificates. An AC may be revoked by its creator or by the SOA of its object.
The AC’s owner (line 03) is the subject of the permission. This may either
be an individual user represented by a UID or a role. If the AC’s owner is a
role, any user who can activate that role may use the capability of this AC.
If the certificate’s capability adds an object to a file-set, the owner of the AC
must be the source of authority of this file-set (or the <ANY_SID/> tag).
Sygn ACs support restrictions (line 16) in the form of a sequence of roles,
identified by their RIDs, that may not be used in requests together with this
AC. This makes it possible to enforce dynamic separation of duties (DSD) as described
in section 4.2.3. These restricted roles are enclosed by <NOT_WITH> tags
in the XML representation of an AC.
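The validity window and the not-with restriction can be illustrated by a small Python sketch. It models only the fields needed for these two checks and omits signature verification, revocation and delegation. The class and function names are assumptions made for this example and do not correspond to the Sygn implementation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuthorizationCertificate:
    creator: str                    # UID of the issuer
    owner: str                      # UID or role identifier of the subject
    capability: tuple[str, str]     # (object identifier, action)
    not_before: datetime
    not_after: datetime
    not_with: frozenset[str] = frozenset()  # roles that must not be active together
    delegations: int = 0            # delegation depth limit


def usable(ac: AuthorizationCertificate, now: datetime, active_roles: set) -> bool:
    """Check the time window and the dynamic separation of duties restriction."""
    in_window = ac.not_before <= now <= ac.not_after
    no_conflict = ac.not_with.isdisjoint(active_roles)
    return in_window and no_conflict
```

A request that presents this AC together with one of the roles listed in `not_with` is rejected even when the AC is within its validity period, which is how the restriction enforces separation of duties at request time.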
3 We currently use the RSA signature algorithm with SHA-1 hashing and PKCS padding. However, due to recent advances in cryptographic attacks on SHA-1, this may change in future versions of Sygn.

00 <AUTHORIZATION_CERTIFICATE>
01   <ID>bA1lxGTYDd3eHT/gr/6B1N4dCWU=</ID>
02   <CREATOR><USER_ID>MIG...UraQIBEQ==</USER_ID></CREATOR>
03   <OWNER><USER_ID>MIG...wdUQIBEQ==</USER_ID></OWNER>
04   <CAPABILITY>
05     <CAPABILITIY_ID>8k9AiT...cMGbr8U=</CAPABILITIY_ID>
06     <OBJECT><UNIQUE_FILE_ID>
07       <FILE_SOA>
08         <USER_ID>MIG...j4jQIBEQ==</USER_ID>
09       </FILE_SOA>
10       <LOGICAL_FILENAME>+/Ab...e88=</LOGICAL_FILENAME>
11     </UNIQUE_FILE_ID></OBJECT>
12     <ACTION>read</ACTION>
13   </CAPABILITY>
14   <NOT_BEFORE>2003-10-01T10:23:02Z</NOT_BEFORE>
15   <NOT_AFTER>2004-10-01T12:22:03Z</NOT_AFTER>
16   <NOT_WITH>
17     <ROLE_ID> ... </ROLE_ID> ...
18   </NOT_WITH>
19   <DELEGATIONS>1</DELEGATIONS>
20   <SIGNATURE>qcIiRa...mbCZHCH6zBeOtc1OV5Byw=</SIGNATURE>
21 </AUTHORIZATION_CERTIFICATE>
Figure 6.8: An example of an AC where the creator (line 02) grants read
access (line 12) on a file (lines 06-11) to the owner of the AC (line 03), with
the right to delegate this capability one step (line 19).

The delegation depth limit (line 19) is an integer that specifies how many
steps the AC’s capability may be delegated. A limit of zero means the capability may not be delegated at all. Any limit greater than zero allows the
AC owner to delegate the AC’s capability in a certificate path, by creating a
new AC with a delegation depth limit reduced by at least one. It is enclosed
by a <DELEGATIONS> tag in the XML representation of an AC.
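The delegation depth rule can be illustrated with a short check over the DELEGATIONS values of a chain of ACs. This is a minimal sketch under the assumption that the values are listed in delegation order; the function name delegation_depths_valid is hypothetical and not part of Sygn.

```python
def delegation_depths_valid(depths):
    """Check the DELEGATIONS values of a chain of ACs, in delegation order.

    An AC may only be delegated further if its limit is greater than zero,
    and the next AC must carry a limit reduced by at least one.
    """
    if any(d < 0 for d in depths):
        return False
    for parent, child in zip(depths, depths[1:]):
        if parent <= 0 or child > parent - 1:
            return False
    return True
```

For the AC of Figure 6.8 (limit 1), a further AC with limit 0 is acceptable, but that second AC cannot be delegated again.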
6.2.6 Certificate Paths
In order to permit multi-step delegation and the use of permissions assigned
to roles (which require previous activation of the role), Sygn defines certificate
paths that target a certain capability for a certain user. If a path is valid for
a certain user, he may use the path’s target capability. Each
path has a target capability and an ordered set of ACs. To prevent
denial-of-service attacks that use abnormally long certificate chains
(and thus fill the available memory), a maximum length for a path can be
configured.
Figure 6.9 shows the XML encoding of such a certificate path with only
two certificates. The target of this path is a capability that allows reading a
file. For the path to be valid, the two certificates must together grant this capability
to a certain user.
00 <PATH>
01   <TARGET><CAPABILITY>
02     <CAPABILITIY_ID>8k9AiT...cMGbr8U=</CAPABILITIY_ID>
03     <OBJECT><UNIQUE_FILE_ID>...</UNIQUE_FILE_ID></OBJECT>
04     <ACTION>read</ACTION>
05   </CAPABILITY></TARGET>
06   <AUTHORIZATION_CERTIFICATE>
07     ...
08   </AUTHORIZATION_CERTIFICATE>
09   <AUTHORIZATION_CERTIFICATE>
10     ...
11   </AUTHORIZATION_CERTIFICATE>
12 </PATH>
Figure 6.9: An example of a certificate path containing two certificates, that
grants read access (line 04) to a file (line 03).
The certificate path verification algorithm is presented in section 6.4.
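As an illustration of the path structure and the configurable length limit, the following sketch parses a path shaped like Figure 6.9 with Python's standard xml.etree.ElementTree module. The constant MAX_PATH_LENGTH, its value and the function load_path are assumptions made for this example. Note that the sketch only checks the length after parsing; a real defence against memory exhaustion would also have to bound the size of the input before parsing it.

```python
import xml.etree.ElementTree as ET

MAX_PATH_LENGTH = 8  # assumed configuration value, not taken from Sygn

PATH_XML = """
<PATH>
  <TARGET><CAPABILITY>
    <OBJECT><UNIQUE_FILE_ID>...</UNIQUE_FILE_ID></OBJECT>
    <ACTION>read</ACTION>
  </CAPABILITY></TARGET>
  <AUTHORIZATION_CERTIFICATE>...</AUTHORIZATION_CERTIFICATE>
  <AUTHORIZATION_CERTIFICATE>...</AUTHORIZATION_CERTIFICATE>
</PATH>
"""


def load_path(xml_text, max_length=MAX_PATH_LENGTH):
    """Return (target action, list of AC elements) of a PATH document."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "PATH":
        raise ValueError("not a PATH element")
    certs = root.findall("AUTHORIZATION_CERTIFICATE")
    if len(certs) > max_length:
        raise ValueError("certificate path too long")
    action = root.findtext("TARGET/CAPABILITY/ACTION")
    return action, certs


action, certs = load_path(PATH_XML)
```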
6.2.7 User requests
In order to gain access to a resource the user must submit a request to the
Sygn Policy Decision Point (PDP). This request follows the Sygn standard
user request format (SURF). A request contains the UID of the request issuer and one or more request paths. As each path targets one capability it is
possible to gain access to resources that need multiple capabilities simultaneously (e.g. two roles activated in parallel). Optionally, the Sygn-PDP can be
configured to require the request issuer to sign his request. To avoid replays
of old requests, the request is assigned a timestamp. Figure 6.10 shows such
a signed user request.
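The text above only states that a timestamp is used to avoid replays. One plausible way to use it, shown here purely as an assumption, is to reject requests whose issue time lies outside a freshness window. The window length MAX_AGE and the function is_fresh are hypothetical; the timestamp format is the one visible in the figures of this chapter.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(minutes=5)  # assumed freshness window


def is_fresh(issue_time_text, now, max_age=MAX_AGE):
    """Return True if an ISSUE_TIME value is not older than max_age.

    Timestamps in the future are rejected as well in this sketch.
    """
    issued = datetime.strptime(issue_time_text.strip(), "%Y-%m-%dT%H:%M:%SZ")
    issued = issued.replace(tzinfo=timezone.utc)
    age = now - issued
    return timedelta(0) <= age <= max_age
```

A complete replay protection would additionally have to remember the requests already seen within the window; that part is not shown.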
00 <SURF>
01   <REQ_ISSUER>
02     <USER_ID> ... </USER_ID>
03   </REQ_ISSUER>
04   <ISSUE_TIME> 2005-01-18T17:23:00Z </ISSUE_TIME>
05   <ISSUERS_SIGNATURE>
06     HXAI6CPU....UebjcLBGIdIFjNGs/ikB2pOOK2w=
07   </ISSUERS_SIGNATURE>
08   <REQ_PATH>
09     <PATH NUMBER="0"> ... </PATH>
10     <PATH NUMBER="1"> ... </PATH>
11     <PATH NUMBER="2"> ... </PATH>
12   </REQ_PATH>
13 </SURF>
Figure 6.10: An example of a signed request (SURF) containing three certificate paths.
6.2.8 Sygn-PDP responses
The Sygn-PDP responses are also structured XML documents. They consist
of three blocks:
• The request status, which is either granted, denied or failed. A status
of failed indicates that a component of the PDP has malfunctioned
causing the verification of the request to fail (e.g. the database storing
the meta-data is unavailable).
• A global description of errors (if any), that caused the request to fail
or to be denied.
• A detailed description of errors for every path (if any), that caused this
path to fail or to be denied.
Figure 6.11 shows such a response.
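A client could read the three blocks of such a response as sketched below. The XML is the response of Figure 6.11 without its line numbers; the function summarize is a hypothetical helper written for this example, using only Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

RESPONSE_XML = """<SYGN_RESPONSE>
  <REQUEST_STATUS> denied </REQUEST_STATUS>
  <GLOBAL_ERROR> Path error </GLOBAL_ERROR>
  <PATH NR="0">
    <STATUS> denied </STATUS>
    <ERROR> Certificate 3 is invalid: certificate has expired </ERROR>
  </PATH>
  <PATH NR="1">
    <STATUS> granted </STATUS>
    <ERROR> none </ERROR>
  </PATH>
</SYGN_RESPONSE>"""


def summarize(xml_text):
    """Return the request status and a {path number: status} mapping."""
    root = ET.fromstring(xml_text)
    status = root.findtext("REQUEST_STATUS").strip()
    per_path = {
        int(path.get("NR")): path.findtext("STATUS").strip()
        for path in root.findall("PATH")
    }
    return status, per_path


status, per_path = summarize(RESPONSE_XML)
```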
6.2.9 Extensibility
Sygn is designed to be easily extensible. The extension points are the subjects,
objects and actions that are supported by the Sygn language.
New Sygn subjects can be added by deriving a new class from the abstract class subject identifier. New subjects need to provide methods for exporting and importing the subject to and from an XML document. Those
subjects can subsequently be used as SOA of any Sygn object, with the
exception of ROIDs, where the SOA must be a UID.
00 <SYGN_RESPONSE>
01   <REQUEST_STATUS> denied </REQUEST_STATUS>
02   <GLOBAL_ERROR> Path error </GLOBAL_ERROR>
03   <PATH NR="0">
04     <STATUS> denied </STATUS>
05     <ERROR>
06       Certificate 3 is invalid: certificate has expired
07     </ERROR>
08   </PATH>
09   <PATH NR="1">
10     <STATUS> granted </STATUS>
11     <ERROR> none </ERROR>
12   </PATH>
13 </SYGN_RESPONSE>
Figure 6.11: An example of a Sygn-PDP response to a specific request.
New Sygn objects can also be added by deriving new classes from the
abstract class object identifier. New objects can have parameters that differ
from existing ones, provided they have an object-name and a SOA. As with
Sygn subjects, export and import methods to and from XML must be
implemented for new object types. Furthermore the object must implement
a function that returns its SOA.
Sygn actions can also be extended by adding new action names to the list
of Sygn action names. The semantics of these actions must be interpreted by
the Sygn-PEP in order to enforce access decisions.
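The subject extension point can be pictured as an abstract base class with XML export and import methods. The sketch below is written in Python for brevity, whereas the Sygn code itself is C++; the class names SubjectIdentifier and GroupId, the method names and the GROUP_ID tag are all hypothetical.

```python
from abc import ABC, abstractmethod
import xml.etree.ElementTree as ET


class SubjectIdentifier(ABC):
    """Stand-in for the abstract subject identifier class."""

    @abstractmethod
    def to_xml(self):
        """Export the subject as an XML element."""

    @classmethod
    @abstractmethod
    def from_xml(cls, element):
        """Import the subject from an XML element."""


class GroupId(SubjectIdentifier):
    """Hypothetical new subject type, used only for illustration."""

    def __init__(self, name):
        self.name = name

    def to_xml(self):
        element = ET.Element("GROUP_ID")
        element.text = self.name
        return element

    @classmethod
    def from_xml(cls, element):
        if element.tag != "GROUP_ID":
            raise ValueError("not a GROUP_ID element")
        return cls(element.text)


roundtrip = GroupId.from_xml(GroupId("radiology").to_xml())
```

A new object type would follow the same pattern and additionally expose a method returning its SOA.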
6.2.10 Formal representation
The Sygn language can be represented by the following set of productions
as a grammar. Terminal symbols are: any_sid, public_key, string, url, integer_value, timestamp, signature.
UID -> public_key
ROLE_SOA -> UID
ROLE_NAME -> string
REVIEW_REPOSITORY -> url
RID -> ROLE_SOA, ROLE_NAME, REVIEW_REPOSITORY
SID -> any_sid | UID | RID
FILE_SOA -> SID
LOGICAL_FILENAME -> string
FID -> FILE_SOA, LOGICAL_FILENAME
SET_SOA -> SID
SET_NAME -> string
FSID -> SET_SOA, SET_NAME
RESOURCE_SOA -> SID
RESOURCE_NAME -> string
RESID -> RESOURCE_SOA, RESOURCE_NAME
ROID -> RID
OID -> FID | FSID | RESID | ROID
ACTION -> string, integer_value | string
OBJECT -> OID
SECOND_OBJECT -> FSID
CAPABILITY -> OBJECT, ACTION | OBJECT, ACTION, SECOND_OBJECT
CREATOR -> UID
OWNER -> SID
NOT_BEFORE -> timestamp
NOT_AFTER -> timestamp
DELEGATIONS -> integer_value
NOT_WITH -> RID | NOT_WITH, RID
AC -> CREATOR, OWNER, CAPABILITY, NOT_BEFORE, NOT_AFTER,
NOT_WITH, DELEGATIONS, signature | CREATOR, OWNER,
CAPABILITY, NOT_BEFORE, NOT_AFTER, DELEGATIONS,
signature
COMMAND -> name | name, UID | name, FID | name, AC |
name, RESID | name, UID, RESID, integer_value
AC_CHAIN -> AC | AC_CHAIN, AC
PATH -> CAPABILITY | COMMAND | CAPABILITY, AC_CHAIN |
CAPABILITY, COMMAND | CAPABILITY, AC_CHAIN, COMMAND
ISSUER -> UID
SURFSIG -> timestamp, signature
PATHES -> PATH | PATHES, PATH
SURF -> ISSUER, PATHES | ISSUER, SURFSIG, PATHES
For the sake of brevity and easy comprehension, this grammar does not
address the size limitation of paths and the maximum number of paths in a
request.
The grammar is context-free and could be made regular by transforming some of the productions. However, this would make it longer and less
understandable, since some of the semantic information would be destroyed.
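To make the productions more concrete, a few of them can be mirrored as record types. The sketch below is a simplification and not the Sygn data model: all subject identifiers are shown as opaque strings, ACTION is reduced to its plain string form, and the optional NOT_WITH list of the AC production becomes a tuple that defaults to empty. All class and field names are chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FID:                     # FID -> FILE_SOA, LOGICAL_FILENAME
    file_soa: str
    logical_filename: str


@dataclass(frozen=True)
class FSID:                    # FSID -> SET_SOA, SET_NAME
    set_soa: str
    set_name: str


@dataclass(frozen=True)
class Capability:              # CAPABILITY -> OBJECT, ACTION [, SECOND_OBJECT]
    obj: object                # OID: FID | FSID | RESID | ROID
    action: str
    second_object: Optional[FSID] = None


@dataclass(frozen=True)
class AC:                      # AC -> CREATOR, OWNER, CAPABILITY, ...
    creator: str
    owner: str
    capability: Capability
    not_before: str
    not_after: str
    delegations: int
    signature: str
    not_with: Tuple[str, ...] = ()   # optional sequence of RIDs


read_cap = Capability(FID("uid:alice", "document.txt"), "read")
example_ac = AC("uid:alice", "uid:bob", read_cap,
                "2003-10-01T10:23:02Z", "2004-10-01T12:22:03Z", 1, "...")
```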
6.3 PDP meta-data
Every Sygn-PDP needs a certain amount of meta-data to operate. This meta-data is stored locally in a relational database. Since the security of the local
resources depends on this meta-data, it is important to protect it adequately.
Therefore no user but the local administrator should be able to access this
database directly. Updates of the meta-data by remote users are submitted
to the Sygn-PDP using an extension of the Sygn access control language that
makes it possible to create certificate paths that support an administrative command.
After validation of such a path, the Sygn-PDP executes the command. Figure
6.12 shows the XML encoding of a certificate path that contains an administrative command. The following paragraphs explain the different kinds of meta-data
and the conditions under which they can be updated remotely.
The most important meta-data are the sources of authority for all hardware resources and files controlled by the local Sygn-PDP. These form the
root of trust for access decisions and are therefore essential to the functionality of the access control system. Sources of authority for hardware resources
need to be entered manually into the database, whereas for files they are
stored automatically once a new file is copied onto a local storage resource.
Note that it is possible for any user having access to a file, to make a copy
of this file on the Grid with a new file identifier and thus to declare himself
source of authority for this copy. There is no reasonable way of preventing this in a distributed resource sharing environment. Therefore traceability
must act as deterrent against such fraudulent behavior. The Sygn-PDP verifies that the logical filename of a new file does not duplicate one of a FID that
is already registered in the local meta-data base. Only the source of authority for a file or hardware resource may remotely delete his registration. This
is usually done after the file has been deleted or the hardware resource has
been taken offline, as subsequently no Grid access to the resource is possible
anymore.
00 <PATH>
01   <SYGN_COMMAND>
02     <COMMAND_NAME>
03       add_file_soa
04     </COMMAND_NAME>
05     <PARAMETER NR="1">
06       <UNIQUE_FILE_ID> ... </UNIQUE_FILE_ID>
07     </PARAMETER>
08   </SYGN_COMMAND>
09 </PATH>
Figure 6.12: An example of a path containing an administrative command
adding a source of authority for a file to the local meta-data base. Such a
path could be included in a request.
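The registration rules for file sources of authority described above can be sketched with a small relational table. The sketch uses Python's built-in sqlite3 module as a stand-in for the relational database; the table layout, the function names and the reading that a logical filename must be unique in the local meta-data base are assumptions made for this example.

```python
import sqlite3


def open_registry():
    """Create an in-memory stand-in for the SOA part of the meta-data base."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE file_soa ("
        " logical_filename TEXT PRIMARY KEY,"
        " soa TEXT NOT NULL)"
    )
    return db


def register_file(db, logical_filename, soa):
    """Register a new file; refuse a duplicate logical filename."""
    try:
        with db:
            db.execute("INSERT INTO file_soa VALUES (?, ?)",
                       (logical_filename, soa))
        return True
    except sqlite3.IntegrityError:
        return False


def unregister_file(db, logical_filename, requester):
    """Delete a registration; only the registered SOA may do so."""
    with db:
        cursor = db.execute(
            "DELETE FROM file_soa WHERE logical_filename = ? AND soa = ?",
            (logical_filename, requester),
        )
    return cursor.rowcount == 1
```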
Accounting for the use of granted hardware resources is also registered
in the Sygn meta-data base. This is done by the Sygn-PEP itself, so remote
administration of this meta-data is not possible. Sygn relies on some external
measurement to provide the numerical value for the amount of use. This
measurement could possibly be integrated in the Sygn-PEP. This means that
resource administrators have to define a metric that measures the use of their
hardware resource. For storage this can simply be the amount of storage space
used, or a formula based on the time a certain amount of storage space is
used. For CPU power the calculation of a usage metric becomes more difficult.
One could imagine a formula based on processor cycles, priority and memory
usage. These mechanisms need to be flexible since the exact execution time
of a program can often not be known prior to its execution.
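As an example of a time-based storage metric of the kind mentioned above, the following sketch integrates the used storage space over time. The unit (byte-seconds), the sampling model and the function name storage_usage are assumptions; Sygn leaves the choice of metric to the resource administrators.

```python
def storage_usage(samples):
    """Integrate used storage over time, in byte-seconds.

    samples is a list of (timestamp_in_seconds, bytes_used) pairs sorted by
    time; usage is assumed to stay constant between consecutive samples.
    """
    total = 0
    for (t0, used), (t1, _) in zip(samples, samples[1:]):
        total += used * (t1 - t0)
    return total
```

With samples of 100 bytes at time 0, 200 bytes at time 10 and 0 bytes at time 15, the usage is 100 * 10 + 200 * 5 = 2000 byte-seconds.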
An important piece of meta-data maintained by Sygn is a blacklist. Requests and permissions issued by any UID on the blacklist are rejected by the
local Sygn-PDP. This mechanism allows the local administrators to deny access to specific users and locally invalidate the permissions they have issued.
It also makes it possible to rapidly invalidate all permissions issued by a
user whose private key has been stolen. The blacklist is maintained by the
local Sygn administrators and therefore cannot be administered remotely.
In order to allow more fine-grained revocation of permissions, the Sygn
meta-data also includes a certificate revocation table. For remote revocation
of a certificate, a user must submit the valid certificate, of which he must
be the issuer. Alternatively the source of authority for the capability of a
valid certificate is also allowed to revoke it remotely. After verifying that the
certificate is valid, the Sygn-PDP enters the certificate’s identifier into the
revocation table thereby invalidating it locally. This table should be updated
on a regular basis using Grid wide Certificate Revocation Lists (CRL) in
order to maintain a coherent status of revoked permissions. Following the
proposal within SPKI, this CRL should be issued at fixed points in time,
previously known to the local resource providers. Therefore an attempt to
delay a revocation by making a denial-of-service attack against the CRL
distribution site can be noticed and appropriate measures can be taken. The
revocation table also stores the expiration date of revoked certificates, and
Sygn provides mechanisms for the administrator to erase revoked certificates
that have expired.
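The interplay of the blacklist and the certificate revocation table can be sketched in memory as follows. This is an illustration only: the class RevocationTable and its methods are hypothetical, the real meta-data lives in a relational database, and the checks that precede a revocation (validity of the certificate, identity of the revoker) are omitted.

```python
from datetime import datetime


class RevocationTable:
    """In-memory stand-in for the blacklist and the revocation table."""

    def __init__(self):
        self.blacklist = set()   # UIDs whose requests and ACs are rejected
        self._revoked = {}       # AC identifier -> expiration date of the AC

    def revoke(self, ac_id, not_after):
        """Record a revoked AC together with its expiration date."""
        self._revoked[ac_id] = not_after

    def is_rejected(self, ac_id, creator_uid):
        """True if the AC is revoked or was issued by a blacklisted UID."""
        return creator_uid in self.blacklist or ac_id in self._revoked

    def purge_expired(self, now):
        """Erase revoked ACs that have expired; return how many were erased."""
        expired = [ac_id for ac_id, end in self._revoked.items() if end < now]
        for ac_id in expired:
            del self._revoked[ac_id]
        return len(expired)
```

Purging is safe because an expired certificate is rejected by the validity check anyway, so its entry in the revocation table is no longer needed.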
If the tracing mechanisms are turned on, the Sygn meta-data base also
stores a log-table, that records all requests that pass through the Sygn-PDP.
For each request, a locally unique request number and the exact time when
the request occurred are stored. Furthermore the contents of the request and
the PDP’s response are also recorded. Remote updates of this log-table are
not supported. Only the local administrators can read the log-table.
6.4 PDP algorithm
The policy decision algorithm decides if a certificate path grants the permission to use a target capability to a specific user. It is the core of the Sygn
architecture and it is based loosely on the principle of complete induction,
but includes a global memory. We start by giving an example path which we
use to illustrate how the algorithm works, then we describe a simplified presentation of the algorithm as an automaton, and finally we give the full, formal
definition of the algorithm.
The example path which is illustrated in figure 6.13 is composed of four
certificates. It gives Edgar the right to read the file document.txt.
In the first certificate AC 1 Alice, who is the SOA for document.txt, grants
Bob the capability to add document.txt to any file-set.
AC 2 which is issued by Bob now adds document.txt to the file-set Set A.
The SOA for Set A is Carol.
In AC 3 Carol grants read rights on Set A to a role Role B. The SOA of
Role B is Dave.
Finally in AC 4, the last certificate of this path, Dave grants activation
of Role B to Edgar.
Path target = read on ’document.txt’
Path target-object = ’document.txt’
Source of authority for ’document.txt’: Alice

AC 1: Alice -> Bob: can add target-object to a Set
AC 2: Bob -> Carol: add target-object to Set A (SOA: Carol)
AC 3: Carol -> Role B (SOA: Dave): read Set A
AC 4: Dave -> Edgar: activate Role B

Figure 6.13: A complete, correctly linked path of certificates.
Edgar can present the chain of these four certificates, AC 1 to AC 4, to a Sygn-PDP in order to gain read access to the file document.txt.
The input parameters of the algorithm are the target capability target, consisting of a target_action and a target_object, and the request_issuer, who is the
specific user to whom the path should grant the permission to use the target
capability. The algorithm uses the variable current_target, which
initially has the value target but may come to differ from it if the target object
is assigned to a set. current_target consists of a current_action, which is always equal to target_action, and a current_object, which is updated when the
current object is added to a set.
The automaton representation of the algorithm is illustrated in figure
6.14. The automaton has three basic states and four intermediate states
which deal with the delegation of roles and the declaration of role hierarchies.
The transitions between states are triggered by the current certificate of the
path. The nature of the certificate that triggers a change to a specific state is
indicated within the state. Transitions can be subject to further conditions,
not directly related to the current certificate. These are indicated next to the
transition arrows. The basic states are:
• Granting the permission to use the current target capability and the
associated state that can be entered if this permission is granted to a
role.
• Addition of the current object to a set and the associated state that
can be entered if the set’s SOA is a role. The set to which current object
is added becomes the new current object. In this state, set hierarchies
are declared implicitly if current object itself is a set.
• Granting the permission to add current object to a set and the associated state that can be entered if this permission is granted to a role.
In our example, the value of target and the initial value of current_target
would be (document.txt, read). The request_issuer would be Edgar.
The first certificate AC 1 leads the automaton to State I, since it grants
the permission to add current_object to a set.
AC 2 triggers a transition from State I to State II, since it adds current_object to a set. current_object changes from document.txt to Set A.
With AC 3 the automaton passes to State III, since the permission to use
the capability current_target is granted. However, since AC 3 is granted to a
role, the automaton immediately passes on to State IIIa.
Finally AC 4 grants the activation of Role B to Edgar. Since Edgar is the
request_issuer, we pass to the terminal state of the automaton.
The formal implementation of the algorithm has three condition sets: the
start conditions, the induction conditions and the end conditions. In order to
be valid, the first certificate of the path must fulfill the start conditions, every
pair of consecutive certificates must fulfill the induction conditions and the
last certificate must fulfill the end conditions.
The algorithm’s global memory stores the current target capability of
the path, which varies during the verification procedure if the object of the
target capability is assigned to a set. If a permission in the certificate path
is granted to a role, the global memory stores this role as current_role for
checking subsequent delegations of this role.
The memory also includes a has_target flag that indicates if the owner
of the current certificate has been granted the permission to use the target
capability, a can_add_to_set flag that indicates if the owner of the current
certificate is empowered to add the current object to a set, and a role_valid
flag that indicates if current_role has been set and may be delegated in the
next certificate of the path.
INPUT:
    The user identifier of the request issuer, request_issuer.
    A target capability with target := (target_object, target_action).
    An ordered set path_cert_chain := {ac_1, ..., ac_n} of n
    authorization certificates: ac_i := (creator_i, owner_i, cap_i).
OUTPUT:
    true if the path grants the right to use the target capability
    to the request issuer, false otherwise.
[Figure 6.14 (automaton diagram, not reproduced here). States: State 0; State I (grant add_to_set on current_object); State II (add current_object to a set; current_object := set); State III (grant current_target); States Ia, IIa and IIIa (grant role activation: delegation or role hierarchy declaration). Transition conditions shown in the diagram include: grant to role, set SOA is a role, SOA of current_object is a role, set hierarchy declaration, delegation, and grant to request_issuer.]
Figure 6.14: The informal representation of the Sygn algorithm as an automaton. A state change is initiated by a
certificate. The text of each state indicates which type of certificate will initiate a transition to this state. Further
conditions on state changes, not related to the current certificate, are indicated next to the transition.
VARIABLES:
    current_target := (current_object, current_action) : The current target
        capability.
    current_role : The current role that may be delegated (if any).
    creator_i : The creator of ac_i.
    owner_i : The owner of ac_i.
    cap_i : The capability of ac_i.
    set_i : If ac_i adds an object to a set then set_i is this set, undefined otherwise.
    can_add_to_set : Boolean variable, indicates that the owner of this
        certificate can add current_object to a set.
    has_target : Boolean variable, indicates that the owner of this certificate
        was granted the right to use the target capability.
    role_valid : Boolean variable, indicates if current_role can be
        delegated.
FUNCTIONS:
    soa : {object identifiers ∪ subject identifiers} → subject identifier.
        If the input is an object identifier, returns the SOA for that object,
        if the input is a role returns that role’s SOA, if the input is a user
        identifier returns that identifier unchanged.
    grants_add_to_set : capabilities → {true, false}. Returns true if the
        capability grants the permission to add the current object to a set,
        and false otherwise.
    adds_to_set : capabilities → {true, false}. Returns true if the capability
        adds the current object to a set and false otherwise.
    grants_role : capabilities × roles → {true, false}. Returns true if the
        capability grants the permission to activate the role and false
        otherwise.
    is_role : subject identifiers → {true, false}. Returns true if the subject
        is a role and false otherwise.
// Start conditions verification
current_target := target // The initial current_target is the path’s target
if creator_1 ≠ soa(soa(current_object)) then
    // The first certificate must be created by the SOA of current_object.
    // The double use of the soa() function is due to the fact that
    // this SOA may be a role. In that case we want creator_1 to be
    // the role’s SOA.
    return false
end if
if cap_1 = current_target and grants_add_to_set(cap_1) = false then
    // This certificate grants current_target to someone, and not
    // the permission to add current_object to a set.
    has_target := true
    can_add_to_set := false
    if is_role(owner_1) then
        role_valid := true
        current_role := owner_1
    else
        role_valid := false
    end if
else if is_role(soa(current_object)) and
        grants_role(cap_1, soa(current_object)) then
    // Here the SOA of current_object is a role and the permission
    // to activate this role is granted in this certificate.
    has_target := true
    can_add_to_set := true
    current_role := soa(current_object)
    role_valid := true
else if adds_to_set(cap_1) = true then
    if owner_1 ≠ soa(soa(set_1)) then
        // The algorithm requires the owner of a certificate
        // that adds an object to a set to be the set’s SOA.
        return false
    end if
    // If cap_1 adds current_object to a set, this set becomes the
    // new current_object for the path. The creator of the next
    // certificate is the SOA of this set.
    current_object := set_1
    has_target := true
    can_add_to_set := true
    if is_role(owner_1) then
        role_valid := true
        current_role := owner_1
    else
        role_valid := false
    end if
else if grants_add_to_set(cap_1) = true then
    // If cap_1 grants the permission to add current_object to a set,
    // this means that the owner of ac_1 has not been granted
    // the permission to use the current_target capability. Therefore
    // the path should not end here.
    has_target := false
    can_add_to_set := true
    if is_role(owner_1) then
        role_valid := true
        current_role := owner_1
    else
        role_valid := false
    end if
else
    // Neither has target been granted to owner_1, nor has a role
    // that is SOA of target been granted, nor has the
    // current_object been added to a set, nor has owner_1 been
    // granted the right to add current_object to a set. Therefore
    // this path starts incorrectly.
    return false
end if
// Induction conditions verification, go through the certificate path
for i := 1 to n − 1 do
    if creator_{i+1} ≠ soa(owner_i) then
        // creator_{i+1} must be owner_i or, if owner_i is a role,
        // that role’s SOA must be creator_{i+1}.
        return false
    end if
    if has_target = true and cap_{i+1} = current_target then
        // Here the permission to use current_target is granted and not
        // the permission to add the current_object to a set.
        can_add_to_set := false
        if is_role(owner_{i+1}) then
            role_valid := true
            current_role := owner_{i+1}
        else
            role_valid := false
        end if
    else if role_valid = true and
            grants_role(cap_{i+1}, current_role) = true then
        // Here the current role is delegated. The internal memory
        // does not change.
        NOP
    else if is_role(owner_i) = true and
            grants_role(cap_{i+1}, owner_i) = true then
        // The owner of certificate ac_i was a role. Therefore this
        // role can be delegated in certificate i + 1.
        current_role := owner_i
        role_valid := true
    else if can_add_to_set = true then
        if adds_to_set(cap_{i+1}) = true then
            if owner_{i+1} ≠ soa(soa(set_{i+1})) then
                // The algorithm requires the owner of a certificate
                // that adds an object to a set to be the set’s SOA.
                return false
            end if
            // Here the current_object is added to a set. Therefore
            // the set becomes the new current_object and as
            // owner_{i+1} is the set’s SOA, he also has the
            // permission to use the target capability.
            current_object := set_{i+1}
            has_target := true
            if is_role(owner_{i+1}) then
                role_valid := true
                current_role := owner_{i+1}
            else
                role_valid := false
            end if
        else if grants_add_to_set(cap_{i+1}) = true then
            // Here the permission to add current_object to a set is
            // granted. This means that owner_{i+1} has not been granted
            // the permission to use the target capability. Therefore the path
            // should not end here.
            has_target := false
            if is_role(owner_{i+1}) then
                role_valid := true
                current_role := owner_{i+1}
            else
                role_valid := false
            end if
        else
            // The capability should either have added current_object to
            // a set or should have granted the right to add it to a set.
            return false
        end if
    else
        // Neither has target been granted, nor has the current_object
        // been added to a set, nor has owner_{i+1} been granted the
        // right to add current_object to a set. This path is invalid.
        return false
    end if
end for
// End conditions verification
if has_target = false then
    // The path may not end if the target capability is not
    // granted to owner_n.
    return false
else if soa(owner_n) ≠ request_issuer then
    // The permission to use the target capability must be granted
    // to the request issuer or to a role for which
    // the request issuer is SOA.
    return false
else
    return true
end if
If we apply this algorithm to the example, the initial value of
current_target is equal to target = (document.txt, read).
The path complies with the start conditions since the creator of AC 1 is
Alice, who is the SOA of current_object. Since the permission to add
current_object to a set is granted, the condition grants_add_to_set(cap_1)
evaluates to true. Therefore the has_target flag is set to false and the
can_add_to_set flag is set to true. Since the owner of AC 1 is not a role,
the role_valid flag is set to false.
We now enter the verification of the induction conditions, where AC 2,
which is issued by Bob, adds document.txt to the set Set A. AC 2 complies with the induction conditions, since Bob, the owner of AC 1, is the creator of AC 2. The
flag can_add_to_set is equal to true and the condition adds_to_set(cap_{i+1})
evaluates to true. Furthermore Carol, the SOA of Set A, is the owner of AC 2. Therefore
current_object is set to Set A. The can_add_to_set and has_target flags
are set to true, since Carol as SOA of Set A has the permission to use the
target capability and can grant the permission to add Set A to other sets.
In AC 3 Carol grants read rights on Set A to a role Role B. The
SOA of Role B is Dave. AC 3 also complies with the induction conditions, since Carol is its creator. The conditions has_target = true
and cap_{i+1} = current_target both evaluate to true. Therefore the
has_target flag stays true and the can_add_to_set flag is set to false. Since
the owner of AC 3 is a role, the role_valid flag is set to true and current_role
is set to the value Role B.
Finally in AC 4, the last certificate of this path, Dave grants
activation of Role B to Edgar. This certificate complies with the
path’s induction conditions, since Dave is the SOA of Role B, the owner of AC 3, and the conditions role_valid = true and
grants_role(cap_{i+1}, current_role) evaluate to true. The internal memory does
not change and the end conditions are checked. Since has_target is true and
Edgar, the request issuer, is the owner of AC 4, the end conditions are also fulfilled and the
algorithm returns true.
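The walk-through above can be reproduced with a compact Python sketch of the decision algorithm. It is not the Sygn implementation (which is written in C++) and it deviates from the pseudocode in one labeled respect: instead of separate start conditions, the loop starts from a virtual previous owner, namely the SOA of the target object, with has_target and can_add_to_set set to true. This is intended to yield the same decisions as the start conditions. The certificate model (the AC tuple and its kind field), the dictionary soa_of and the set roles are assumptions of this example. Signature, validity, revocation, NOT_WITH and delegation depth checks are left out.

```python
from collections import namedtuple

# kind: "use" (grants action on obj), "grant_add" (grants the right to add
# obj to a set), "add" (adds obj to target_set), "activate" (grants role obj).
AC = namedtuple("AC", "creator owner kind obj action target_set",
                defaults=(None, None))


def verify_path(path, target_obj, target_action, issuer, soa_of, roles):
    def soa(x):                      # users are their own SOA
        return soa_of.get(x, x)

    def is_role(x):
        return x in roles

    cur_obj = target_obj
    prev_owner = soa(cur_obj)        # virtual owner preceding the first AC
    has_target, can_add = True, True
    role_valid = is_role(prev_owner)
    cur_role = prev_owner if role_valid else None

    for ac in path:
        if ac.creator != soa(prev_owner):
            return False
        new_owner = None             # set when the role memory must be updated
        if (has_target and ac.kind == "use"
                and (ac.obj, ac.action) == (cur_obj, target_action)):
            can_add = False
            new_owner = ac.owner
        elif role_valid and ac.kind == "activate" and ac.obj == cur_role:
            pass                     # role delegated, memory unchanged
        elif (is_role(prev_owner) and ac.kind == "activate"
                and ac.obj == prev_owner):
            cur_role, role_valid = prev_owner, True
        elif can_add and ac.kind == "add" and ac.obj == cur_obj:
            if ac.owner != soa(soa(ac.target_set)):
                return False
            cur_obj, has_target = ac.target_set, True
            new_owner = ac.owner
        elif can_add and ac.kind == "grant_add" and ac.obj == cur_obj:
            has_target = False
            new_owner = ac.owner
        else:
            return False
        if new_owner is not None:
            role_valid = is_role(new_owner)
            if role_valid:
                cur_role = new_owner
        prev_owner = ac.owner
    return has_target and soa(prev_owner) == issuer


# The example path of Figure 6.13.
SOA_OF = {"document.txt": "Alice", "Set A": "Carol", "Role B": "Dave"}
ROLES = {"Role B"}
EXAMPLE_PATH = [
    AC("Alice", "Bob", "grant_add", "document.txt"),
    AC("Bob", "Carol", "add", "document.txt", target_set="Set A"),
    AC("Carol", "Role B", "use", "Set A", "read"),
    AC("Dave", "Edgar", "activate", "Role B"),
]
granted = verify_path(EXAMPLE_PATH, "document.txt", "read", "Edgar",
                      SOA_OF, ROLES)
```

For the example path the sketch grants the request to Edgar; the same path presented by a different request issuer, or with AC 2 removed, is rejected.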
6.5 Sygn performance
In this section we give the results of some performance tests that we made
with the Sygn-PDP. We measured the execution time of the PDP’s decision
algorithm for the processing of a request, without counting the time for network connection setup, authentication and request transmission. The tests
were run on a 1.9 GHz Pentium 4 under SuSE Linux 9.0. For cryptographic
primitives we used the Crypto++ library (version 5.1) by Wei Dai (see footnote 4). All the
program code was written in C++. The algorithm includes the verification
of the validity of all certificates by checking the expiration date, the signature and the local certificate revocation list (the latter requires a MySQL
database query).
Each parameter combination was run through the PDP 5000 times and
the total execution time was averaged. Except for the path complexity checks,
all paths were direct delegations that did not involve roles or file sets. Except
for the multi-path checks, all requests included only one certificate path of
variable length. The signatures of the certificates were created and verified
with the RSA digital signature algorithm, using PKCS padding and the SHA-1 hash function.
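The measurement procedure described above (many runs, averaged total time) can be sketched as a small harness. This is only an illustration of the averaging method in Python; the original measurements were made on the C++ implementation, and the function name average_ms and the measured stand-in workload are assumptions.

```python
import time


def average_ms(fn, runs=5000):
    """Run fn repeatedly and return the average wall-clock time per run in ms."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) * 1000.0 / runs


# Stand-in workload; a real test would call the PDP decision procedure here.
avg = average_ms(lambda: sum(range(100)), runs=1000)
```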
We have examined the following factors:
• Impact of the length of the path on the execution time (i.e. number of
certificates in the path). We measured paths ranging from one to seven
certificates.
• Impact of the length of the signature keys. We used 1024, 2048 and
4096-bit RSA keys.
• Impact of the database queries on execution time.
• Complexity of the path. We used paths that delegated permissions
through file sets and roles.
• Number of paths in request (while keeping the number of certificates
constant) in order to verify the overhead of setting up a path verification.
4 Available from http://cryptopp.com
The complexity of the path had no significant impact on the duration of
the PDP’s decision procedure. The number of paths in the request has a very
low impact on the execution time (about 1 ms).
The two main factors that influence the execution time are the length of
the signature keys and the number of certificates in a path. Signature
verification is therefore a key factor in the execution time of the algorithm.
The measurements suggest that the decision time increases linearly with the
number of certificates in the request. A third factor is
the database queries; however, their impact is comparatively small (about
2 ms for a path containing seven certificates). Figure 6.15 shows the results
of measurements with different key sizes and path lengths.
[Figure 6.15: plot of the decision time in ms (y-axis, 0 to 40) against the number of certificates in the request (x-axis, 0 to 8), with one curve per certificate signature key size: 1024, 2048 and 4096 bits.]
Figure 6.15: The performance of the Sygn-PDP for different certificate signature key lengths and different numbers of certificates in the path.
As these results indicate, the time needed for the decision process
in the Sygn-PDP is extremely short compared to the time taken by the
general overhead of setting up a connection, transferring data or executing
jobs.
6.6 Discussion
Sygn proposes to use a permission push model as described in section 4.3.
Since Sygn ACs are not solely intended as short-lived permission certificates,
but also for long-term permission storage, it is necessary to provide a revocation mechanism to be able to invalidate a permission before the AC that
granted it expires. This drawback, which is inherent to the push model, has to
be weighed against the advantages of the push model. The user can choose
and submit exactly the ACs needed for the requested actions, which makes it possible to follow the least privilege principle. Furthermore the user can choose
exactly which permissions are disclosed to the different Grid services.
The approach to bind permissions to public keys has a drawback compared
to binding permissions to distinguished names: When the corresponding private key is stolen, all permissions bound to the public key must be revoked.
It is not sufficient to rely on the revocation of the authentication certificate
containing this public key, since the Sygn PEP only verifies that the request
issuer is correctly authenticated. Therefore a correctly authenticated request
issuer may use delegated permissions created with a stolen private key. On
the other hand the direct binding of permissions to public keys makes the
verification of permission integrity easier and requires a much smaller authorization data transfer volume. If a permission is bound to a distinguished
name in a certificate, the entire certificate chain for the public key of the
creator of this certificate is necessary, in order to verify his digital signature.
Such a certificate chain would have to be submitted for every authorization
certificate in a path.
A central feature of Sygn is the support for decentralized permission granting. Different SOAs can administer access control to fine-grained resources
without intervention of a third party. However, this feature makes it impossible to know for sure the entire set of permissions given to a specific user
or role. Therefore the results of any permission review functions (which are
required in standard RBAC, see section 4.2.3) are not necessarily complete.
Enforcing completeness would require a centralized validation of all
permissions and would negate the advantage of decentralized, ad-hoc permission granting. Therefore review functions have to rely on the goodwill of
the respective SOAs. It is the SOAs' responsibility to store duplicates of all
permissions they issue in corresponding review repositories. Another effect
of this situation with regard to RBAC is that it becomes impossible to enforce a static separation of duties (i.e. tuples of permissions that cannot be
granted to the same entity). However, since a dynamic separation of duties
can be enforced through the use of restrictions, this second drawback is
of lesser concern.
Another central feature of Sygn is its delegation mechanism. Following
the approach of SPKI [39] (see also section 4.5.3) we have examined three
choices for delegation control:
1. No control. All users can delegate any of their permissions.
2. Boolean control. A flag specifies if delegation is allowed or not.
3. Delegation depth control. A non-negative integer specifies how many
levels of delegation are allowed.
The argument for the first option is that there is no way to prevent users
from sharing the private key they use for authentication with others. Alternatively, users could set up a service that signs any challenge, allowing others
to impersonate them without having actual access to the private key. Therefore attempts to restrict delegation would be ineffective and would possibly weaken
the protection of private key material.
We have not chosen this option since we believe that security education of
users should prevent such situations from happening. If users have such bad
security practices as giving away their private keys, no system will be able to
really protect any resources on a Grid from unauthorized access. One way to
enforce secure handling of private key material by untrained users could
be to deploy hardware tokens that perform private key operations
but keep the key material locked on the device.
The argument in favor of the two other options is that it can be necessary to
specify whether a permission can be delegated or not. If entities were entitled to
delegate any permissions they hold, this would increase the risk of misuse.
Since the SOA of a resource can be held partially responsible for a misuse,
even if a delegated permission was used, it is important to differentiate
between entities that are given a permission and entities that
are allowed to delegate the same permission to others.
The creators of the SPKI Certificate Theory argue that depth control does
not give real control over the proliferation of a delegated permission, since
only the depth and not the width of the delegation tree can be controlled.
Even though this argument is valid, we still believe that depth control is
to be preferred over boolean control, since it allows a flat delegation tree to be enforced and therefore makes it easier to track down the responsible secondary
SOA if a misuse of permissions is detected. Another point in favor of depth
control is that Grid architectures commonly use proxying mechanisms to
create temporary credentials out of long-term credentials (see [100, 93] for
details). These proxy credentials are then delegated the permissions required
to execute a given task. In a boolean delegation control architecture that
would mean that every user needs delegation power over all of his long-term
permissions, thus making the delegation control almost ineffective. With the
depth control, this situation can be resolved without negating the benefit of
delegation control, by allowing one level of delegation for permissions that
need to be given to a proxy. We have therefore chosen to implement a delegation depth control in Sygn.
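The effect of delegation depth control on a chain of certificates can be sketched as follows. This is a minimal illustration and not the Sygn implementation: the `Certificate` class, its field names and the rule that each delegation consumes one level of the remaining depth are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    """Hypothetical, simplified authorization certificate."""
    issuer: str            # key of the entity that signed the certificate
    subject: str           # key of the entity that receives the permission
    delegation_depth: int  # further delegation levels the subject may create


def chain_respects_depth(chain):
    """Return True if no certificate delegates further than it was allowed to.

    chain[0] is issued by the SOA; every later certificate must be issued by
    the subject of the certificate before it.
    """
    remaining = None  # delegation levels granted by the previous certificate
    for position, cert in enumerate(chain):
        if cert.delegation_depth < 0:
            return False
        if position > 0:
            if cert.issuer != chain[position - 1].subject:
                return False  # the chain is not connected
            if remaining == 0:
                return False  # the previous holder may not delegate at all
            if cert.delegation_depth > remaining - 1:
                return False  # a delegate cannot pass on more depth than it has
        remaining = cert.delegation_depth
    return True
```

With one level of delegation a user can hand a permission to a proxy, but the proxy cannot pass it on: the chain SOA to user (depth 1), user to proxy (depth 0) is accepted, while any further certificate issued by the proxy is rejected.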
Sygn offers support for RBAC, and it can also be used in parallel to create
and handle discretionary access control (DAC) permissions. This makes it possible to
adapt the type of permissions to the situation in which they are used. If
a complex (possibly hierarchical) permission structure with authorizations
based on tasks is present, RBAC can be used. For ad-hoc permission granting
or in similar situations where RBAC is too cumbersome to use, Sygn can
handle DAC permissions, which are easier to create and use.
In [9] Bertino et al. suggest that access control for hardware resources
should already be considered in the resource allocation process, to avoid allocating resources to which the request issuer does not have access. These
considerations have to be taken into account when using Sygn for hardware
resource access control. Clearly the resource broker needs to be able to verify whether an allocated resource is really accessible for a certain user request. A
possible approach would be to have the resource broker submit authorization
requests to the local Sygn-PDPs of the resources, on behalf of the request
issuer.
Finally, the structure of Sygn requests supports scenarios where
multiple permissions are needed simultaneously. A simple example of this
would be the replication of a Grid file. Such an operation requires read permission on the file in question and access to a certain amount of disk space
at the replication site. By using a Sygn request with multiple paths, the
authorizations for such operations can be grouped together in a convenient
way.
Chapter 7
CryptStore encrypted storage
This chapter describes the design and implementation of our encrypted storage architecture CryptStore. Section 7.1 presents CryptStore and motivates
the key-servers as central idea for the CryptStore system. We then present
the architecture of CryptStore and how CryptStore can be used on a Grid in
section 7.2. Section 7.3 presents the syntax and the semantics of the CryptStore meta-data. In section 7.4 we present the algorithms of the CryptStore
architecture. We analyze the risk of attacks on CryptStore and how to mitigate these in section 7.5 and conclude with a discussion of the different
CryptStore design choices in section 7.6.
7.1 Basic concepts of CryptStore
The CryptStore architecture gives an entity that is the SOA of a file the option
to encrypt it before storing it on the Grid. CryptStore provides a file administration client that can be integrated in a Grid interface or used separately.
This tool performs the encryption of the file and generates the necessary
meta-data.
In order to give authorized users access to decryption keys, key-servers are
deployed on various Grid sites as part of the CryptStore architecture. They
function as repositories for decryption keys and can be queried by users who
wish to decrypt data they are authorized to access. To avoid making key-servers valuable targets for attacks we have chosen to distribute shares of the
various encryption keys on multiple key-servers using Shamir’s secret sharing
algorithm [90]. The characteristics of this algorithm are described in section
7.4.
The file administration client handles the tasks related to the encryption
of a file, the generation of key-shares and the connection to the key-servers
in order to store the key-shares and related meta-data.
To access an encrypted file, a file user client is provided by CryptStore. It
handles the recovery of key-shares from the key-servers, the reconstruction
of the key from the shares and the decryption of the file. The key-shares are
subject to access control based on the requesting user’s file access permissions
(i.e. if the access control grants a user access to a file, this user also has access
to the decryption key of that file). The key-servers therefore have a generic
access control interface that can be instantiated to make them interact with
any access control system present on the Grid. If the access control system
works in a decentralized way, an instance of it can be co-located with the key-server. If the access control mechanism uses a permission push architecture,
the user has to provide the necessary credentials to the file user client to
enable it to recover the key on his behalf.
If the file clients are not part of a Grid interface they can access key-servers
using a simple client-server model, where the key-servers are queried by the
clients. The connection is authenticated and secured by SSL and requires the
key-server to open a port on the host machine.
CryptStore deals with four distinct scenarios concerning access control
permissions of encrypted files:
• The users that are subject of the permission and the files that are object of the permission are individually known when the permission is
created and do not change. Such an authorization structure is illustrated by figure 7.1. This situation would make it possible to transmit
the decryption key directly to all authorized users via a secure channel (e.g.
direct connection secured with TLS/SSL or IPsec, encrypted mail).
• The users that are subject of the permission are individually known as
above. The files that are object of the permission are identified by a
file-set which may change dynamically as files are added or removed
from the set. Figure 7.2 shows such an authorization structure. In such
a scenario the lockbox concept presented in chapter 5 could be used to
provide authorized users with decryption keys.
• The files that are object of the permission are known and do not change.
The users that are subject of the permission are identified by a group
or role, and membership may change dynamically. This authorization
structure is represented in figure 7.3. In such a scenario a group key
could be used that gives all members of the group access to lockboxes
containing the decryption keys. However, every time a member is removed from the group, the group key and all the file keys have to be
updated, all the files have to be re-encrypted and the lockboxes have
to be re-created. Since the files are fixed and known this may still be
feasible.
• The users that are subject of the permission and the files that are
object of the permission are both identified by a group (or role, or set)
and membership of both changes dynamically. This case is shown in
figure 7.4. For this setting we need key-servers that store key-material
and decide dynamically who may access a key. Authorized users can
contact the key-servers and recover the necessary key-shares in order
to decrypt a file.
[Figure 7.1 diagram. Elements: Data SOA, Permission, User or static user group, File or static set of files. Relations: "Gives permission to", "Uses to gain access", "Allows access to".]
Figure 7.1: A simple scenario of authorizations.
[Figure 7.2 diagram. Elements: Data SOA, Permission, User or static user group, Dynamic set of files. Relations: "Gives permission to", "Uses to gain access", "Allows access to", "Adds or removes objects to/from the set".]
Figure 7.2: An authorization scenario where the permission object is a set
of files.
7.2 Architecture and use of CryptStore
The CryptStore system consists of three components that are deployed on
the Grid: The administrator-client, the user-client and the key-server system.
[Figure 7.3 diagram. Elements: Data SOA, Group SOA, Permission, Dynamically changing user group, File or static set of files. Relations: "Gives permission to", "Uses to gain access", "Allows access to", "Adds or removes users to/from group".]
Figure 7.3: An authorization scenario where the permission subject is a group
(or a role) consisting of multiple users.
[Figure 7.4 diagram. Elements: Data SOAs, Set SOA, Group SOA, Permission, Dynamically changing user group, Dynamic set of files. Relations: "Gives permission to", "Uses to gain access", "Allows access to", "Adds or removes users to/from group", "Add or remove files to/from the set".]
Figure 7.4: An authorization scenario where the permission subject is a group
(or a role) consisting of multiple users and the permission object is a set of
files.
The administrator-client allows file owners to perform the following actions:
• Encryption of a file.
• Creation of a message authentication code (MAC) with the same key
used to encrypt the file in order to ensure file integrity.
• Creation of key-shares.
• Storage of key-shares on a key-server.
• Storage of encryption parameters and key-server location information
in the file meta-data.
• Update of key-server data.
The administrator-client can be deployed as part of the user Grid interface
or as a stand-alone Grid tool.
The user-client allows Grid users that want to access an encrypted file to
perform the following actions:
• Extraction of key-server locations from the meta-data of an encrypted file.
• Access to a key-server in order to retrieve a key-share (the user must provide authentication and possibly authorization tokens).
• Reconstruction of a key from retrieved key-shares.
• Decryption of an encrypted file, including extraction of the encryption
parameters from the meta-data of the encrypted file.
Like the administrator-client, the user-client can be deployed as part of the
user Grid interface or as a stand-alone Grid tool.
The key-server system is set up at different Grid resource sites and provides the following services:
• Storage of key-shares and associated file-id.
• User-client interface that gives access to key-shares.
• Generic interface to access control services to determine who may access
a key-share based on file access permissions.
• Deletion of key-shares by the owner of an encrypted file.
[Figure 7.5 diagram. Actors: data owner using the administrator-client, storage server, key servers, data consumer using the user-client. Steps: 1.) Encrypts data and creates key shares. 2.) Stores key shares on different key servers. 3.) Stores data and addresses of the key servers on the storage server. 4.) Gives permissions to user or user group. 5.) Retrieves data and addresses of the key servers. 6.) Gets key shares from the key servers. 7.) Reconstructs key from key shares and decrypts data. The legend distinguishes transfers for which secure transfer is optional from those for which it is mandatory.]
Figure 7.5: The use of CryptStore for encrypted file storage and access.
The update of a key-share is transparent to a key-server, since the administrator client handles this as the deletion of the old share and the storage of
the new share separately.
The key-server uses some standard C++ libraries and a C++ interface
library for the MySQL database system. These libraries have to be deployed
on the Grid resource site in order to run the key-server.
Figure 7.5 illustrates how these components work together.
In a first step, the owner of a file encrypts the data and creates the
key-shares using his administrator-client. In a second step he stores the key-shares on different randomly selected key-servers. He can then generate the
meta-data header of the encrypted file that contains encryption parameters
and the locations of the key-servers. In a third step he stores the encrypted
data (including the meta-data header) on a Grid storage server. The fourth
step, which can be temporally disconnected from the first three, is to use the
Grid access control mechanism to give some user or user group access to the
encrypted file. This step is performed outside the CryptStore architecture,
using available Grid access control tools, for example Sygn.
The new actor now is the user that wants to access the encrypted file.
With his access permissions he retrieves the encrypted file from the storage
server using normal Grid file access mechanisms in a fifth step. The user can
now read the associated meta-data from the file header using the CryptStore
user-client. With this information the user-client can query the corresponding
key-servers in a sixth step in order to retrieve the key-shares. In the final step
the user-client reconstructs the key from the key-shares and uses it to decrypt
the data.
7.3 CryptStore meta-data
CryptStore requires a certain amount of meta-data to function. The current
design of CryptStore adds these meta-data in unencrypted form to the
headers of the encrypted files, as they are required to locate key-servers and to
configure the decryption algorithm. The rationale behind this is that the size
of the meta-data will generally be small compared to the size of
the file and therefore the small increase in file size will not be relevant. We are
however aware of situations where this assumption would not hold. For
example, if we want to encrypt database entries where the table columns are
of fixed size, an increase of the size of the column data may not be possible.
In such a case the design of CryptStore would have to be changed slightly
in order to allow the storage of the meta-data externally to the encrypted
file. We discuss the reasons why we have chosen the first option in section
7.6.
The meta-data required by CryptStore are the following:
• The encryption algorithm and the encryption mode if several modes
are possible (e.g. ECB, CBC, CFB for block ciphers).
• The initialization vector (sometimes also called nonce or tweak, depending on the encryption mode) used, if any.
• The encryption key size in bytes.
• Optionally: The algorithm used for the generation of the message authentication code.
• Optionally: The message authentication code.
• The threshold value of the secret sharing algorithm (i.e. the number of
key-shares required to recover the key).
• Information about the key-servers that store shares of the file decryption key. If a simple client/server model is used these are the URLs and
port numbers under which the key-servers are accessible. Chapter 8 discusses other possible forms of deploying CryptStore, which may make
changes in this format necessary.
CryptStore itself requires no further meta-data. However, if a decentralized
access control is deployed on the Grid, an access control server may be co-located with the key-server. This server checks which users are authorized
to obtain which keys based on their normal file authorizations. This access
control server needs a meta-database containing the SOAs of the files for
which key-shares are stored, in order to have a root of trust for its access
decisions. These meta-data are collected at the moment a file SOA stores a
new key-share in a key-server's database.
Furthermore the SOA of a file can also choose to store a message authentication code in the file header. This makes it possible to check the integrity of the
encrypted file.
The meta-data stored in the file header are encoded in XML, as this
is a widespread structured format that is still human readable. Figure 7.6
shows an example of such a file header. It specifies that the following file was
encrypted with the AES algorithm, using the cipher-feedback mode, with a
key size of 16 bytes, using the given initialization vector. It features a message
authentication code generated with HMAC using the SHA-1 hash and gives
the value for message authentication. Furthermore it specifies that at least
two key-shares are needed to recover the key. Then it gives the addresses,
including the port numbers, of three key-servers that each hold one key-share. This
last information implies that three key-shares were initially created.
Appendix B gives an XML Schema definition of CryptStore meta-data
headers.
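As an illustration of how a client might read such a header, the following sketch parses the example of figure 7.6 with Python's standard library. It is not the CryptStore user-client: the wrapping `HEADER` root element, the function name `parse_header` and the assumption that the initialization vector and digest value are Base64-encoded are all made up for this example.

```python
import base64
import xml.etree.ElementTree as ET

# Example header taken from figure 7.6.
EXAMPLE_HEADER = """
<ENCRYPTION_PARAMETERS>
  <ALGORITHM> AES_CFB </ALGORITHM>
  <KEYSIZE> 16 </KEYSIZE>
  <IV> 7msW8+augercHfk0oE4zjA== </IV>
</ENCRYPTION_PARAMETERS>
<MAC>
  <ALGORITHM> HMAC_SHA1 </ALGORITHM>
  <DIGEST_VALUE> qVqCQDDlwODNtcGqFKZ47olQ524= </DIGEST_VALUE>
</MAC>
<KEYSHARING_INFORMATION>
  <THRESHOLD> 2 </THRESHOLD>
  <KEYSHARE_SERVER> if.insa-lyon.fr:4711 </KEYSHARE_SERVER>
  <KEYSHARE_SERVER> nn.cern.ch:1234 </KEYSHARE_SERVER>
  <KEYSHARE_SERVER> liris.cnrs.fr:1764 </KEYSHARE_SERVER>
</KEYSHARING_INFORMATION>
"""


def parse_header(header_text):
    """Extract encryption, MAC and key-sharing parameters from a header."""
    # The header has several top-level elements, so it is wrapped in a
    # hypothetical root element to obtain a well-formed XML document.
    root = ET.fromstring("<HEADER>" + header_text + "</HEADER>")
    enc = root.find("ENCRYPTION_PARAMETERS")
    mac = root.find("MAC")  # optional according to the meta-data list
    sharing = root.find("KEYSHARING_INFORMATION")

    servers = []
    for node in sharing.findall("KEYSHARE_SERVER"):
        host, port = node.text.strip().rsplit(":", 1)
        servers.append((host, int(port)))

    parsed = {
        "algorithm": enc.findtext("ALGORITHM").strip(),
        "key_size": int(enc.findtext("KEYSIZE")),
        "iv": base64.b64decode(enc.findtext("IV").strip()),  # assumed Base64
        "threshold": int(sharing.findtext("THRESHOLD")),
        "servers": servers,
        "mac_algorithm": None,
        "mac": None,
    }
    if mac is not None:
        parsed["mac_algorithm"] = mac.findtext("ALGORITHM").strip()
        parsed["mac"] = base64.b64decode(mac.findtext("DIGEST_VALUE").strip())
    return parsed
```

A client would use the `threshold` and `servers` entries to decide how many key-servers it has to contact before it can reconstruct the key.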
7.4 CryptStore algorithms
CryptStore uses different cryptographic algorithms and protocols to perform
its functions. In this section we present those algorithms and motivate their
choices.
7.4.1 Cryptographic algorithms
CryptStore uses three different cryptographic algorithms: A file encryption algorithm, a message authentication code (MAC) for file integrity and Shamir’s
secret sharing scheme.
For file encryption, we have chosen the AES algorithm, since it is standardized, widespread and very efficient. The AES is a block cipher algorithm
with variable key length and a block size of 128 bits. We have chosen a block
cipher rather than a stream cipher because stream ciphers do not allow random access to parts of encrypted files and re-encryption of a modified file
with the same key is not secure when using a stream cipher. Since medical
data files may become very large, and rapid access to some part of the file
may be necessary at some time, it is important to support random access to
parts of an encrypted file.
<ENCRYPTION_PARAMETERS>
  <ALGORITHM> AES_CFB </ALGORITHM>
  <KEYSIZE> 16 </KEYSIZE>
  <IV> 7msW8+augercHfk0oE4zjA== </IV>
</ENCRYPTION_PARAMETERS>
<MAC>
  <ALGORITHM> HMAC_SHA1 </ALGORITHM>
  <DIGEST_VALUE>
    qVqCQDDlwODNtcGqFKZ47olQ524=
  </DIGEST_VALUE>
</MAC>
<KEYSHARING_INFORMATION>
  <THRESHOLD> 2 </THRESHOLD>
  <KEYSHARE_SERVER> if.insa-lyon.fr:4711 </KEYSHARE_SERVER>
  <KEYSHARE_SERVER> nn.cern.ch:1234 </KEYSHARE_SERVER>
  <KEYSHARE_SERVER> liris.cnrs.fr:1764 </KEYSHARE_SERVER>
</KEYSHARING_INFORMATION>
Figure 7.6: An example of a CryptStore meta-data header specifying how
the file was encrypted, a message authentication code for verifying the file’s
integrity and information about where key-shares can be retrieved.
As mode of operation for the AES we have chosen the cipher block chaining mode (CBC) with ciphertext stealing (CTS) as presented in section 5.1.
This mode of operation ensures that a manipulation of the encrypted
data in order to change the plaintext becomes difficult and that patterns in
the plaintext blocks are hidden in the encrypted blocks. Furthermore CTS
enables us to keep the size of the ciphertext equal to the size of the plaintext.
[Figure 7.7 diagram: "Share secret" takes the secret $k$ and the parameters $m, n$ and produces the shares $s_0, \dots, s_{n-1}$; "Reconstruct secret" takes the shares $s_{t_0}, \dots, s_{t_{m-1}}$ and returns $k$.]
Figure 7.7: The concept of Shamir’s secret sharing algorithm. The number
of shares is $n$, the threshold to reconstruct the secret $k$ is $m$. The algorithm
produces a set of $n$ shares $s_i$; $\{t_i\}$ denotes a set of different indices $0 \le t_i < n$.
In order to ensure the integrity of the data, we have chosen to use message authentication codes rather than a public key digital signature scheme.
The reason for this is that multiple users may be updating the same file.
Therefore, if a signature scheme were used, these users would either have to
share a key-pair for signing this file or would have to provide their public
key for signature verification to every potential reader of the file. This would
make cumbersome public key distribution mechanisms necessary and could
easily lead to confusion about which public key was used for the actual signature
of the file. A message authentication code uses a secret key to create a code
that allows the verification of the file integrity by other users having access
to this secret key. In CryptStore we use the same key for the file encryption
and the generation of the code. This means that a user who has only read
rights can update the code for an encrypted file. We assume such manipulations are prevented by Grid access control mechanisms, which should not allow
such a user to write back a modified version of the file. We have chosen the
HMAC algorithm [6] with the SHA-1 hash function to generate the message
authentication codes. The reason for this is that HMAC is standardized and
widely used (e.g. in the IPsec protocol).
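The generation and verification of such a code can be sketched with Python's standard `hmac` module. This is only an illustration, not the CryptStore code: the function names are hypothetical, and computing the code over the encrypted file contents is an assumption made for the example.

```python
import hashlib
import hmac


def compute_mac(key: bytes, encrypted_file: bytes) -> bytes:
    """Return the HMAC-SHA-1 code of the encrypted file contents."""
    # Assumption: the code covers the ciphertext; the same key is used as
    # for the file encryption, as described in the text.
    return hmac.new(key, encrypted_file, hashlib.sha1).digest()


def verify_mac(key: bytes, encrypted_file: bytes, expected: bytes) -> bool:
    """Check a stored code against a freshly computed one."""
    # compare_digest avoids leaking the position of the first mismatch
    # through timing differences.
    return hmac.compare_digest(compute_mac(key, encrypted_file), expected)
```

Anyone who holds the key can both verify and recompute the code, which is why a reader could update it and why write protection has to come from the Grid access control mechanisms.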
The algorithm used to split the file encryption key into shares is Shamir’s
secret sharing scheme [90]. The basic functionality of this algorithm is the
following: For a given secret that a user wants to share between n participants
the user chooses a threshold m with n ≥ m that indicates how many shares
are needed to reconstruct the secret. The algorithm takes the secret, n and m
as input and produces a set S of n shares so that any subset of S containing
at least m different shares can be used to reconstruct the secret. Furthermore
the algorithm has the property that no subset of S containing fewer than m
different shares makes it possible to deduce the secret or to reduce the complexity of
an exhaustive search for the secret. Figure 7.7 illustrates this concept for a
secret k and parameters n and m. It shows the generation of a set of shares
$s_i$ with $i = 0, \dots, n-1$.
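A minimal sketch of the scheme over a prime field is given below. It is meant to illustrate the threshold property, not to reproduce the CryptStore implementation: the choice of the prime $2^{521} - 1$, the function names and the representation of the key as an integer are assumptions of this example. Shares are evaluated at $x = 1, \dots, n$, since evaluating the polynomial at $x = 0$ would return the secret itself.

```python
import secrets

# A Mersenne prime, comfortably larger than a 128- or 256-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, n: int, m: int):
    """Split secret into n shares, any m of which reconstruct it."""
    if not 1 <= m <= n < PRIME:
        raise ValueError("need 1 <= m <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret does not fit into the field")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for coefficient in reversed(coefficients):  # Horner evaluation
            y = (y * x + coefficient) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over at least m distinct shares."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        numerator, denominator = 1, 1
        for k, (xk, _) in enumerate(shares):
            if k != j:
                numerator = (numerator * -xk) % PRIME
                denominator = (denominator * (xj - xk)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        inverse = pow(denominator, PRIME - 2, PRIME)
        secret = (secret + yj * numerator * inverse) % PRIME
    return secret
```

For the header of figure 7.6 this would correspond to `split_secret(key, 3, 2)`: three shares are created and any two of them passed to `recover_secret` return the key.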
7.4.2 Request handling
CryptStore handles two types of requests: One between the owner of a file
and a key-server and a second between a user wishing to retrieve a key-share
and the key-server. Communication between users and file owners is not part
of the CryptStore design.
CryptStore assumes that any user who has access to the Grid can store
key-shares at a key-server, provided that he does not overwrite existing key-shares belonging to other users. Therefore only authentication is necessary
to submit a pair of key-share and file identifier to a key-server.
The possible requests are:
• Store a key-share belonging to a file, identified by its logical filename
(lfn). Such requests come from file owners or users who have write
permissions on a file.
• Delete a key-share belonging to a file, identified by its lfn. Such requests
come from file owners or users who have write permissions on a file.
• Retrieve a key-share belonging to a file, identified by its lfn. Such requests come from file owners or users who have read or write permissions on a file.
As already mentioned, updates of key-shares are treated as separate delete
and store requests by the key-servers.
Figure 7.8 shows examples of each possible type of request from a file
owner to a key-server: the first stores a key-share, and the second
deletes the key-share related to a specific file.
The CryptStore key-share retrieval requires the requesting user to be
authenticated. If the authentication is successful the user’s request is considered. The form of the request depends on the Grid access control system
to which CryptStore has been linked. If it uses a pull message sequence system, then the request consists only of the file identifier for which the user
wishes to retrieve a key-share. If the access control uses a push message sequence, then the request must contain authorization assertions in addition
to the file identifier. The key-server contacts a Grid access control server and
requests an authorization decision whether the user is allowed to access the
encrypted file. If the response is positive the key-server returns the corresponding key-share. Figure 7.9 shows an example request from a user to a
key-server, requesting the release of a key-share.
Appendix B gives an XML Schema definition for CryptStore requests.
<cryptstore_request>
  <request_type> store_keyshare </request_type>
  <lfn> +/AbBuY...xe88= </lfn>
  <keyshare>AAAAABWlEpxrg...j7x3yk= </keyshare>
</cryptstore_request>

<cryptstore_request>
  <request_type> delete_keyshare </request_type>
  <lfn> +/AbBuY...xe88= </lfn>
</cryptstore_request>
Figure 7.8: Examples of file owner requests to a CryptStore key-server. The
first request is to store a key-share belonging to a file identified by a lfn, and
the second is to delete the same key-share.
<cryptstore_request>
  <request_type> retrieve_keyshare </request_type>
  <lfn> +/AbBuY...xe88= </lfn>
</cryptstore_request>
Figure 7.9: Example of a file user request to a CryptStore key-server. The
request asks for a key-share belonging to a file identified by a logical file name
(lfn).
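The handling of the three request types on the key-server side can be sketched as follows. This is a simplified in-memory model, not the C++ key-server: the class name, the status strings and the `is_authorized` callback, which stands in for the generic access control interface, are hypothetical. Users are assumed to be authenticated before `handle` is called.

```python
class KeyServer:
    """Hypothetical in-memory model of a CryptStore key-server."""

    def __init__(self, is_authorized):
        # is_authorized(user, lfn, right) stands in for the generic access
        # control interface; right is "read" or "write".
        self._is_authorized = is_authorized
        self._shares = {}  # lfn -> (owner, keyshare)

    def handle(self, user, request_type, lfn, keyshare=None):
        """Process one request of an already authenticated user."""
        entry = self._shares.get(lfn)
        if request_type == "store_keyshare":
            # Authentication is sufficient, but an existing share is never
            # overwritten; an update is a delete followed by a store.
            if entry is not None or keyshare is None:
                return ("denied", None)
            self._shares[lfn] = (user, keyshare)
            return ("ok", None)
        if entry is None:
            return ("not_found", None)
        owner, stored = entry
        if request_type == "delete_keyshare":
            # Owners and users with write permission may delete.
            if user == owner or self._is_authorized(user, lfn, "write"):
                del self._shares[lfn]
                return ("ok", None)
            return ("denied", None)
        if request_type == "retrieve_keyshare":
            # Owners and users with read or write permission may retrieve.
            if (user == owner
                    or self._is_authorized(user, lfn, "read")
                    or self._is_authorized(user, lfn, "write")):
                return ("ok", stored)
            return ("denied", None)
        return ("unknown_request", None)
```

In a push architecture the callback would additionally receive the authorization assertions submitted with the request; in a pull architecture it would query the Grid access control server itself.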
7.5 Security Analysis
In this section we analyze possible attacks on CryptStore and discuss how to
mitigate these threats. We do not consider attacks based on social engineering
and on malicious hardware, since those are out of the scope of this thesis and
can be applied against any cryptographic storage system. We also do not
consider attacks against the Grid’s file storage mechanisms, since CryptStore
does not interact directly with those.
The following attacks could be tried against CryptStore:
1. Attacks on the encryption algorithm, with the goal to disclose the content of a file stored encrypted with CryptStore.
2. Attacks on the message authentication scheme with the goal to hide
unauthorized modifications of a file stored with CryptStore and protected by a message authentication code.
3. Attacks on the key-share transfer with the goal of disclosing or falsifying
the transferred key-share.
4. Impersonation of authorized users in order to gain access to key-shares.
5. Byzantine attacks by malicious key-servers.
6. Attacks on the key-servers, especially on the MySQL database used for
the storage of key-shares, with the goal of either disrupting availability
of the services or with the goal of disclosing key-shares.
7. Malicious modifications of the CryptStore software.
8. Attacks on the CryptStore user client, especially on the client database.
As CryptStore uses the AES encryption algorithm and currently no feasible attacks on the AES are known, the first attack poses no real threat at the
time this thesis is written. Cryptographical breakthroughs, leading to new
attacks on the AES may make it necessary to change the algorithm used by
CryptStore and to re-encrypt all files. However users should be aware that in
such a case data may still be compromised, since attackers may have made
personal copies of the old encrypted data, which are not under the control
of the original data owner.
Concerning the second type of attack, CryptStore uses the HMAC message authentication algorithm with the SHA-1 hashing algorithm. Recent
cryptographic attacks on SHA-1 [98] may make it necessary to replace it
by a more secure hashing algorithm. This also means that all existing message
authentication codes will have to be recalculated.
Since all communications between user clients and key-servers are protected with SSL (using the OpenSSL library), the third type of attack requires an attack on either the SSL protocol, an algorithm used within this
protocol or the implementation of the OpenSSL library.
Attacks of the fourth type are not really attacks on CryptStore. They
are directed at the access control system (or the authentication system used
by the access control system) that governs access to key-shares. If Sygn is
used for CryptStore key-share access control, the information that has to be
protected in order to prevent impersonation is the users' private keys.
In the fifth type of attacks, attackers set up key-servers themselves and
try to trick users into storing key-shares on these servers. If an attacker is able to
set up enough Byzantine key-servers, he may get access to a sufficient number of key-shares to reconstruct the decryption key for an encrypted
file. Furthermore the Byzantine key-server could provide false key-shares to
requesting users, in order to deny access to decryption keys. In order to prevent such attacks, measures must be taken to ensure that entities providing
key-servers are trustworthy. Such measures can include the registration of
the key-server and of the entity that runs it in a list of trustworthy services.
The sixth type of attacks on CryptStore is the most dangerous, since
the key-servers store a great amount of security critical information. In the
current version of CryptStore, the key-server communicates with a MySQL
database using an unprotected channel. Therefore it is important to keep
the database on the same machine as the key-server. We plan to modify
the database interface in a future version in order to allow the use of the SSL
protocol to protect communications between the key-server and the database.
As the key-server needs to know the database password in order to access
its records, the account on which the key-server is installed needs to be protected against unauthorized access. If an attack is successful, all key-shares
stored on this server need to be updated. Furthermore, all other key-servers
need to update the key-shares belonging to the same keys as the compromised
key-shares.
The secret sharing mechanism used by CryptStore provides some protection if a CryptStore key-server is hacked, since the attackers need to break
into several key-servers in order to gain access to enough key-shares to allow
them to reconstruct decryption keys.
Finally the key-server handles key-shares in cleartext during its operations. This means that a key may be swapped onto disk due to the internal
memory management or that it may be written in a core dump, if CryptStore
crashes. These problems can be prevented by turning off swap or using an
encrypted swap area and setting the maximal core dump size to zero.
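The core dump countermeasure can also be applied by a process to itself at start-up. The following fragment is only a sketch, assuming a Unix-like system where the standard Python resource module is available. The function name disable_core_dumps is hypothetical and not part of CryptStore. Swap handling is an operating system setting and is not covered here.

```python
import resource


def disable_core_dumps() -> None:
    """Set the soft and hard core dump size limits of this process to zero."""
    # Lowering the hard limit is irreversible for an unprivileged process,
    # which is the desired behaviour for a process that handles key material.
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))
```

A process that calls this function before handling any key material will not write a core file if it crashes later.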
The seventh type of attacks consists of providing users or key-servers with
maliciously modified versions of the CryptStore software, that will leave a
backdoor or leak information to attackers. Such attacks can be prevented,
if the CryptStore code is signed and no unverified versions of the code are
installed.
The last type of attacks are those on the CryptStore user client. They are
similar to the attacks on the CryptStore key-server, since the user client also
uses a MySQL database to store copies of all encryption keys for the user’s
encrypted files and handles keys in memory. Therefore the same protection
measures have to be taken to reduce the risks of attacks on the user client.
7.6 Discussion
In this section we discuss the algorithmic and architectural choices within
CryptStore.
Since we are interested in medical data, which can include radiological,
sonographical or computer tomograph pictures in addition to simple text, we
had to take into account the possibility that the files in question can become
quite large. We therefore had to weigh the encryption speed of stream ciphers
against the random access capabilities and the possibility to re-encrypt under
the same key offered by block ciphers. Another important point was the
possibility of an encryption mode that does not change the size of the data.
This final property is inherent to stream ciphers since they process streams
of bits (or bytes in the case of the RC4 algorithm) one by one. For block
ciphers the property of keeping the plaintext size can be obtained by using the
ciphertext stealing scheme (and omitting base 64 encoding of the plaintext
which is quite common otherwise). Furthermore we decided against using the
CFB block cipher mode, which effectively turns the block cipher into a stream
cipher, for the obvious reason that if we wanted to use a stream cipher we
might as well take an efficient one (which is not the case for block ciphers
used in CFB mode for byte-by-byte encryption).
The final point that made us choose a block cipher in CBC mode rather
than a stream cipher was the capability to re-encrypt an updated file using
the same key. This way a user having write access to the file can update its
contents, re-encrypt it and write it back to the Grid storage without having
to change the key shares on the key servers.
The decision to store the meta-data in the file header was made in order
to make it possible to handle the access to encrypted files in the same way
as normal files from the point of view of a Grid storage resource. We are
aware that this means that we do change the size of the data, which may
be a problem if the original data was stored in a database table with a fixed
size of table cells and the encrypted data needs to be written back to the
same database table. Extending the current CryptStore design to allow an
external management of the encryption meta-data would not be a major
problem, since most Grid architectures can keep meta-data about the Grid
files anyway, which could be used to store our file encryption meta-data
additionally.
The decision to use Shamir’s secret sharing scheme for the key storage
on the key servers was made in the general spirit of this work to avoid single
points of attack and trusted third parties. As we have pointed out in the
presentation of the basic concepts of CryptStore, a partially trusted third
party is inevitable, if we want to efficiently manage key accesses for dynamic
groups of users and sets of data resources. In order to reduce the impact of
a successful attack on one key server we have therefore chosen not to entrust
them with the entire key information. Due to the nature of the secret sharing
scheme we gain additional benefits: CryptStore becomes robust against breakdowns
of single key servers, if the user created a redundant number of shares.
Furthermore the storage of the key shares on the key servers provides a backup that can be used for emergency access if the user loses his decryption
key.
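The threshold property and the robustness gained from redundant shares can be illustrated with a minimal sketch of Shamir's scheme. It is not the CryptStore implementation: the prime modulus, the function names split_secret and recover_secret, and the representation of a key as an integer are assumptions made for illustration.

```python
import secrets

# Field modulus: the Mersenne prime 2**521 - 1, large enough for 256-bit keys.
# The modulus actually used by CryptStore is not stated here (assumption).
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, num_shares: int):
    """Split secret into num_shares points; any threshold of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 with the secret as constant term
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the given (x, y) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse through Fermat's little theorem (PRIME is prime)
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

With a threshold of 3 and 5 shares, for example, any two key servers may be unavailable without preventing reconstruction, while an attacker holding fewer than 3 shares learns nothing about the key.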
A topic that has to be considered was raised in chapter 5: the handling of
re-encryption when the permissions of users who had access to decryption keys
are revoked. Since we cannot control the environment on the user machines,
we can never prevent a user who had access to a file from making copies of
it and spreading them to unauthorized users. Therefore we advocate the use
of the lazy re-encryption scheme, in which a file is only re-encrypted with a
different key after a permission change, when the file content has changed.
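The key management side of lazy re-encryption can be modelled in a few lines. The following sketch is purely illustrative and performs no actual encryption: the class EncryptedFile and its methods are hypothetical names, and a key is simply a random 16-byte string.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class EncryptedFile:
    """Tracks which key protects a file; the encryption itself is not modelled."""
    key: bytes = field(default_factory=lambda: secrets.token_bytes(16))
    rekey_pending: bool = False
    key_changes: int = 0

    def revoke_reader(self) -> None:
        """A permission was revoked: record it, but do not re-encrypt yet."""
        self.rekey_pending = True

    def write(self) -> bytes:
        """Content changes: use a fresh key only if a revocation is pending."""
        if self.rekey_pending:
            self.key = secrets.token_bytes(16)
            self.rekey_pending = False
            self.key_changes += 1
        return self.key
```

In this model an update without a preceding revocation keeps the same key, so the key shares on the key servers stay unchanged. Only the first update after a revocation produces a new key, whose shares then have to be distributed.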
Another choice that had to be made in the design of CryptStore was
whom to put in charge of the decryption of an encrypted file. Setting up a
decryption service that performs decryption for the user would have solved
the revocation problem, since users would not have access to the decryption
keys. The drawback of such a solution would have been to introduce a single
point of attack and a trusted third party into the system. Therefore we
decided to leave the responsibility of decryption to the machine of the end
user of the data, where it can be handled by the CryptStore user-client.
The most important concept of CryptStore is the interface to an access
control mechanism. The rationale behind this is to keep access permissions
to files consistent with access permissions to keys that decrypt these files.
We have therefore chosen to avoid a duplicate access control layer and to
make it possible to use the full power of the access control service that is
available on the Grid architecture. This approach only works if the file owners
are the actual sources of authority for access control decisions concerning
their files and not the local storage sites (as is the case with the VOMS
access control architecture for example), but since empowering the owner
with access control over his files is one of the requirements that we defend in
this thesis, we believe that this fits together smoothly. As the granularity of
protection for CryptStore is files (and not dynamically generated views on
data), the concerns raised in [18] about the absence of a bijection between
encryption and access rights do not apply to our approach.
Chapter 8
Sygn and CryptStore in a Grid
In this chapter we discuss the aspects concerning the integration of our proposals in a Grid architecture. We consider two Grid architectures for our specific examples: µgrid, a minimal Grid architecture¹ [88], and the OGSA/WSRF
standardized Grid architecture provided by the Globus Toolkit version 4.
8.1 µgrid
The µgrid architecture was designed and implemented as a minimal Grid architecture in order to test and implement scientific Grid based applications
without having to install, configure and administrate a production Grid architecture. It is therefore small, easy to install and to run and it depends on
very few software libraries.
µgrid consists of three software components. The client software
allows users to access the Grid. The farm manager, currently the Grid entry
point of µgrid, groups resources together and manages the scheduling of
jobs, the resource assignment and the data management. Finally, computers
providing resources run the third component, the host manager, which manages the computing jobs and the storage of data. All communication is
done through simple sockets, using a client/server architecture.
With this architecture, a transparent sharing of resources is possible. Due
to its simple design and interfaces µgrid is easy to install, configure and
administrate. Possible file operations are to copy a file from a local disk to
the Grid, to replicate a file on the Grid, to copy a file from the Grid to a
local disk and to delete a file on the Grid. A C++ API makes these
file manipulation commands available within jobs processed on the Grid.
¹ µgrid was developed by Johan Montagnat from the CREATIS laboratory at INSA Lyon (now I3S at the University of Nice) and Diane Lingrant from the I3S laboratory at the University of Nice. It was created in the context of the French ministry of research project MEDIGRID.
Authentication is implemented using OpenSSL and a PKI. Each user,
farm and host has its own certificate allowing mutual authentication. µgrid
assumes a single root certificate authority in its current version.
The current design of µgrid has limited scalability since the farm manager quickly becomes a bottleneck when it is assigned too many resources.
Therefore the extension of µgrid is planned by adding a layer of servers above
the farm manager that will also be the new Grid entry points.
8.2 OGSA/WSRF standardized Grids
The Open Grid Services Architecture (OGSA) [49] is a standard developed by
the Global Grid Forum (GGF)². OGSA aims at defining a common, standard
open architecture for grid-based applications. OGSA is service oriented, and
requires a distributed computing middleware, that supports stateful services
(i.e. in which services can store information of previous sessions, from one
invocation to another).
Web services³ have been chosen as an architecture for implementing these
Grid services following the requirements of OGSA. The use of Web services
requires three major components: a discovery service that is used to locate
existing services; the Web Service Description Language (WSDL) [24], which
is an XML based language used to describe the interfaces of Web services
in a standardized way; and a protocol to exchange Web service requests and
responses. The most frequently used protocol for Web service communication
is SOAP [19], which makes it possible to exchange XML encoded
messages using the HTTP communication protocol.
Figure 8.1 shows a typical Web service invocation, using the Web Service Description Language (WSDL) to define and publish the Web service
interfaces and the SOAP protocol to exchange messages.
Web services as defined by the W3C are stateless, and thus pure Web
services are not sufficient for the requirements of the OGSA specification.
Therefore the Web Services Resource Framework (WSRF) was developed
by the OASIS consortium. WSRF specifies how Web services can be made
stateful. Figure 8.2⁴ illustrates the relationships between OGSA, WSRF, and
Web services.
² www.ggf.org
³ http://www.w3.org/2002/ws
⁴ Figure inspired by http://gdp.globus.org/gt4-tutorial
[Figure 8.1: message sequence between a Client, a Discovery Service and a Web Service (CryptStore key server).
1. Client to Discovery Service: Where can I find a CryptStore key server?
2. Discovery Service to Client: At this address: URL
3. Client to Web Service: How can I invoke the key server?
4. Web Service to Client: I have the following interfaces: WSDL
5. Client to Web Service: SOAP Service request: store key share
6. Web Service to Client: SOAP Service response: storage successful]
Figure 8.1: A typical Web service invocation.
[Figure 8.2: diagram. OGSA requires stateful Web services; WSRF specifies stateful Web services; stateful Web services extend Web services.]
Figure 8.2: The relationship between OGSA, WSRF and Web services.
In [76] a strategy for addressing security within the Open Grid Services
Architecture (OGSA) is proposed. It describes a set of security components
that need to be realized in the OGSA security architecture and presents a
set of use cases that show the interactions of these components in a secure
Grid environment.
This strategy defines three challenges that have to be addressed in the
realization of a Grid security architecture:
• The integration of heterogeneous, local security solutions. As it will be
impossible to enforce the use of a single security solution, the Grid
security architecture needs to be generic and extensible, so that it can
be instantiated with any existing security mechanisms.
As Sygn is designed to be deployed locally with the resources it controls, it allows the use of different local security solutions at other Grid
resource sites. CryptStore is designed in the same spirit and allows
different access control mechanisms to be used to control access to decryption
keys.
• The interactions between those local security solutions. As services may
span across multiple domains using different security technologies, the
Grid security architecture needs to provide solutions, that allow these
security technologies to interact. Therefore a common message exchange protocol is needed (SOAP over HTTP is proposed as an example). Furthermore, a common method for communicating
and negotiating security policies and a common way of mapping
a user identity from one domain to another have to be specified.
To allow interaction of Sygn with other security solutions, the Sygn
permission language would have to be adapted to conform to a standard
such as SAML. This would require an extension of SAML in order
to support Sygn’s delegation mechanisms. Furthermore the Sygn and
CryptStore interfaces would have to be adapted to enable them to
interpret SAML assertions.
• Trust relationship management. The main problem in trust relationship management is that end users will use the Grid to perform request-specific tasks, possibly executing their own code on some distant Grid
machines. Classical security questions such as authentication and authorization need to be answered in the new context of processes executing such user-created code. The necessity of delegation of rights to
allow such processes to execute tasks on a user’s behalf is also specifically mentioned.
Sygn’s delegation mechanisms make it possible to provide user-created code with
the permissions it needs for its execution. To achieve a smooth
integration it would be useful to integrate Sygn access control in a Grid
resource access API, as it was done for file access in µgrid.
Access control is addressed very briefly in the OGSA security architecture
(using the term Authorization Enforcement). The authors conjecture that every domain will typically have its own authorization service, using different
access control models such as DAC and RBAC. Therefore the Grid authorization model needs to be based on upcoming standards such as XACML, SAML
and WS-Authorization to allow interaction and mapping between different
access control services (see section 4.4.2 for more information on XACML
and section 4.5.1 for more information on SAML).
WS-Authorization is currently not even available as a draft. However, the
Web Services security roadmap [31] specifies that this standard will define
how access policies for a Web service are specified and managed, especially
how permissions may be expressed in certificates and how they are to be
interpreted at the service end-points. Proposals dealing with access control
models for Web services such as [10] suggest that SAML and XACML will
be the basis for implementing WS-Authorization.
The OGSA security architecture is mainly focused on standardization and
interoperation of heterogeneous security services. As this is out of the scope
of the work presented within this thesis, the impact on our work is relatively
low. However the few points of the OGSA security architecture regarding
the requirements for authorization are worth keeping in mind: These are
the requirement to be able to map different access control policies to each
other and the requirement to support delegation of rights in order to give
permissions to a process acting on behalf of a user.
First of all, in Sygn, a mapping of different access control policies can
be realized, by using the concept of hierarchical roles. A role A from one
domain can be mapped onto a role B of another domain by making role B
hierarchically inferior to role A (reminder: this means that any entity which
can activate role A can also activate role B). An equivalence between both
roles can also be defined, by making them mutually hierarchically inferior
to one another (i.e. every entity that can activate role B can also activate
role A and vice-versa).
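The role mapping described above can be sketched as a reachability computation in a directed graph. This is an illustration only, not Sygn code: the class RoleHierarchy, its method names and the role names used with it are hypothetical, and the sketch assumes that the "hierarchically inferior" relation is transitive.

```python
from collections import defaultdict


class RoleHierarchy:
    """Directed graph with an edge from each role to its inferior roles."""

    def __init__(self):
        self._inferiors = defaultdict(set)

    def make_inferior(self, inferior: str, superior: str) -> None:
        """Any entity that can activate superior can also activate inferior."""
        self._inferiors[superior].add(inferior)

    def make_equivalent(self, role_a: str, role_b: str) -> None:
        """Make two roles mutually inferior to one another."""
        self.make_inferior(role_a, role_b)
        self.make_inferior(role_b, role_a)

    def activatable(self, role: str) -> set:
        """All roles that an entity able to activate role can activate."""
        seen = {role}
        stack = [role]
        while stack:
            current = stack.pop()
            for lower in self._inferiors.get(current, ()):
                if lower not in seen:
                    seen.add(lower)
                    stack.append(lower)
        return seen
```

Mapping role A of one domain onto role B of another then amounts to a single call such as make_inferior("domain2:B", "domain1:A"), and an equivalence to one call of make_equivalent.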
Second, Sygn supports delegation mechanisms that make it possible to give authorizations to a process acting on a user’s behalf. As Sygn permissions are
bound to public keys, the problem of authenticating such a process can be
solved, using classical public key authentication mechanisms.
8.3 Integrating Sygn in a Grid
In order to keep the architecture of Sygn independent of the underlying Grid
architecture, thus making Sygn more portable, we have chosen the following
approach: The policy enforcement point (PEP) acts as an agent between the
user client and the resource. The Grid user client has to be modified in order
to attach the Sygn request to the Grid request the user issues. When a request
arrives at the PEP, it strips the Sygn request off and passes it to the PDP for
checking. If the PDP’s response is positive, the PEP has to make sure that
the Sygn request was issued by the same entity that issued the Grid request.
For this, the PEP has to interact with the Grid authentication mechanism
in order to get the authenticated user’s public key. If this check is positive
the PEP has to verify that the Grid request matches the Sygn request. This
requires that the resources that are the objects of both requests be the same, as
well as the actions requested on these objects. If this check succeeds, the
PEP passes the Grid request to the Grid infrastructure controlling the local
resource, which then returns its answer to the user.
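The sequence of checks performed by the PEP can be summarized in a short sketch. The data structures and names below (SygnRequest, GridRequest, enforce, and the two callbacks standing in for the PDP and the Grid infrastructure) are hypothetical and only illustrate the order of the steps. They do not reproduce the actual Sygn interfaces.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SygnRequest:
    issuer_key: str  # public key of the entity that issued the Sygn request
    resource: str
    action: str


@dataclass(frozen=True)
class GridRequest:
    authenticated_key: str  # public key obtained from Grid authentication
    resource: str
    action: str
    sygn: SygnRequest  # Sygn request attached by the modified user client


def enforce(request: GridRequest,
            pdp_permits: Callable[[SygnRequest], bool],
            forward: Callable[[GridRequest], str]) -> str:
    """Apply the PEP checks in order; forward the Grid request if all pass."""
    sygn = request.sygn  # step 1: strip the Sygn request off the Grid request
    if not pdp_permits(sygn):  # step 2: ask the PDP for a decision
        return "denied: PDP rejected the Sygn request"
    if sygn.issuer_key != request.authenticated_key:  # step 3: same issuer
        return "denied: issuer mismatch"
    if (sygn.resource, sygn.action) != (request.resource, request.action):
        return "denied: request mismatch"  # step 4: same object and action
    return forward(request)  # step 5: pass on to the Grid infrastructure
```

Because the PDP and the Grid infrastructure only appear as callbacks, the Grid middleware that handles the resources does not have to be modified in this sketch.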
This scheme makes it possible to keep the Grid middleware that handles
the resources unchanged. Only the request handling protocol needs to be
changed, to integrate the PEP acting as an agent between users and the
resources, when requests arrive.
A PEP that integrates Sygn in the µgrid architecture has been implemented by Didier Oriol, in the course of his end-of-studies project for a
Master's degree at INSA-Lyon. It allows file access control using Sygn within
µgrid. In order to realize the matching between the Grid request issuer and
the Sygn request issuer, this PEP interacts with the OpenSSL authentication
mechanism used in µgrid and extracts the public key from the certificate that
was used for authentication. This has proven to be somewhat difficult, since
the function that returns the public key contained in the OpenSSL X.509
certificate is not documented in the official OpenSSL manuals⁵.
⁵ The reference to this undocumented function was found on a Spanish discussion forum dealing with OpenSSL.
The matching between Grid request object and Sygn request object is
realized as simple string equality verification. To this end, Sygn permissions
use µgrid’s logical filenames as part of the Sygn file identifiers. The actions
required to execute a specific Grid request are the following:
• To copy a Grid file to the user’s local disk we require him to have the
read action.
• In order to write a file to the Grid or to delete it, the user needs the write
action. If a new file is copied to the Grid in this way, Sygn automatically
registers the user who submitted this request as the SOA for this file.
Therefore this user automatically has the write permission that allows
him to proceed with his request. Sygn prevents users from overwriting
Grid files with new files having the same logical filename, unless the
user has the write action for the overwritten Grid file.
• µgrid allows users to manually replicate files between different Grid
storage elements with the possibility of changing their logical file names.
For this operation Sygn requires the user to have the read action for
the source file and the write action on the target file, if that file already
exists (i.e. if it is overwritten by this operation).
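The action requirements listed above can be written down as a small lookup function. The operation names ("get", "put", "delete", "replicate"), the labels for the files involved and the function name required_actions are assumptions chosen for this sketch and are not µgrid or Sygn identifiers.

```python
def required_actions(operation: str, target_exists: bool = False) -> list:
    """Return the (file, action) pairs a user must hold before the operation.

    target_exists states whether the file written by "put" or "replicate"
    already exists on the Grid (i.e. whether it would be overwritten).
    """
    if operation == "get":  # copy a Grid file to the local disk
        return [("grid_file", "read")]
    if operation == "delete":
        return [("grid_file", "write")]
    if operation == "put":  # copy a local file to the Grid
        # For a new file the submitting user is registered as SOA and thereby
        # obtains the write action, so nothing is required beforehand.
        return [("grid_file", "write")] if target_exists else []
    if operation == "replicate":
        needed = [("source", "read")]
        if target_exists:
            needed.append(("target", "write"))
        return needed
    raise ValueError(f"unknown operation: {operation}")
```

The PEP would compare the returned pairs with the actions granted by the Sygn permissions that accompany the request.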
The Sygn-PEP is integrated within the µgrid file manipulation API,
therefore file manipulation by user created code is handled the same way
as manual file access through the terminal interface of µgrid. Users are responsible for providing their own code with the necessary permissions for any
file access that it will need to make.
In order to implement Sygn on an OGSA standardized Grid, one has to
consider if Sygn needs to be deployed as a Grid service. As Sygn is designed
to run co-located with the Grid resources, it remains doubtful whether an
implementation as a Grid service is necessary or if the resource can communicate with the Sygn-PEP locally. However we foresee no major problems
in adding a Web service interface to Sygn. The Sygn-PDP is stateless and
could therefore be implemented as a simple Web service without the need for
WSRF’s extensions for stateful Web services. As all communications are already encoded in XML, we would only have to define the service description
using WSDL and generate the SOAP communication protocol code. Various
tools for generating the latter from a WSDL description exist, for
example the gSOAP Web services development toolkit⁶.
The use of Sygn requires some public key based authentication service.
Therefore a password based authentication such as Kerberos service tickets
would not be usable with Sygn. We do not think that this is a major limitation, as the Globus toolkit’s security infrastructure GSI [37] provides public
key authentication based on OpenSSL, and this seems to be the approach
other Grid infrastructures are taking too. Therefore similar key extraction
mechanisms as the one used in the integration of µgrid could be used. Problems could arise, if the user’s authentication certificate is not directly available to the Sygn-PEP. In such cases it would be necessary to provide proof
of authentication in another way (e.g. through SAML authentication assertions) that allows the Sygn-PEP to acquire the authenticated user’s public
key.
⁶ Available from http://www.cs.fsu.edu/~engelen/soap.html
Depending on the nature of the Grid identification mechanisms for files
and hardware resources, a mapping between Sygn’s object identifiers and
these Grid resource identifiers may be required in order to realize the matching between a Grid request object and the Sygn permission object.
Finally the Grid requests have to be mapped onto the available Sygn actions. As we have mentioned in section 6.2.9, the Sygn actions are easily extensible,
and therefore action identifiers to fit the categories of the Grid request actions
can be easily added to the Sygn language.
8.4 Setting up CryptStore as a Grid service
In order to use CryptStore in an OGSA/WSRF-standardized Grid, the key-server needs to be implemented as a Grid service. As the key-server is stateless, it can be implemented like a normal Web service. Requests and responses
are already encoded in XML, therefore only the Web service description has
to be written in WSDL and the SOAP protocol code has to be generated as
discussed in the previous section. As Grid security standards are still evolving relatively fast (for example the upcoming WS-Authorization standard),
we have not yet added such a Web service interface to CryptStore.
8.5 Using Sygn for CryptStore access control
We have implemented an interface that allows Sygn to be used as the access control
architecture for CryptStore key shares. In this design, an instance of the Sygn-PDP is co-located with the CryptStore key server. The local Sygn meta-data
base stores the sources of authority for the files for which the key server
stores key shares. This means that if a file owner stores a key share on the
key server, he is also registered as SOA of that file.
Using this information, the Sygn-PDP can make access control decisions
concerning the locally stored key shares.
The CryptStore administrator and user interfaces encapsulate Sygn requests in CryptStore requests according to actions the user wishes to initiate.
At the key server, the CryptStore access control integration module is effectively a Sygn-PEP, which performs the functions described in section 8.3.
When an administrator submits a new key share for storage, his CryptStore interface automatically generates an administrative Sygn command
(see section 6.3) that tries to register him as source of authority for the file
to which this share belongs. In this case the PEP only checks if the user is
correctly authenticated and the Sygn-PDP checks if the file is not already
registered for a different SOA.
In order to retrieve a key share, the user must submit Sygn permissions
that allow him to read or write the corresponding file (which implies that
write permissions also include read permissions).
To be able to update the key shares belonging to a file, the user must have
write permissions on that file. This allows users who modified the file’s contents to re-encrypt it, if a lazy re-encryption scheme as described in chapter
5 is used.
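The rules for retrieving and updating key shares can be stated compactly. In the following sketch the function name key_share_request_allowed and the operation names "retrieve" and "update" are hypothetical. The set file_actions stands for the actions on the corresponding file that the Sygn-PDP has found to be granted by the submitted permissions.

```python
def key_share_request_allowed(operation: str, file_actions: set) -> bool:
    """Decide on a key share request from the user's actions on the file."""
    if operation == "retrieve":
        # Either read or write on the file gives access to its key share
        return bool({"read", "write"} & file_actions)
    if operation == "update":
        # Only users allowed to modify the file may replace its key shares
        return "write" in file_actions
    raise ValueError(f"unknown operation: {operation}")
```

The storage of a new key share is not covered here, since it is handled through the registration of the submitting user as source of authority described above.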
In this setup, CryptStore is completely independent of the storage sites
where the file and its replicas are actually stored. It can rely on a local
access control service in order to control the access to key shares. This makes
the combination of CryptStore and Sygn extremely scalable, tolerant against
breakdowns and avoids adding an unnecessary layer of access control that
manages access to key shares.
8.6 Summary
In this chapter we have presented our integration of Sygn access control in a
working Grid architecture. We have made successful tests for the Sygn access
control within µgrid, covering a representative set of allowed and denied requests. We have presented OGSA/WSRF standardized Grids and discussed
how to integrate Sygn and CryptStore in such a Grid. Finally we have presented how to use Sygn for CryptStore key share access control. This has
also been implemented successfully and tested with a representative set of
allowed and denied requests.
Chapter 9
Conclusions and Future Works
In the present thesis we have studied the use of Grid computing architectures
for health-care with a focus on data security. We have shown that classical
security solutions are not all directly applicable, due to the specifics of Grid
computing. As the central problem for health-care applications in Grid computing is data security, we have chosen to examine the specifics of access
control.
Based on a set of use-cases, we have presented a list of requirements and
constraints that are related to principles of good security, the nature of a
Grid architecture and the specifics of the health-care application. The most
important points we have found are the need for a decentralized administration
of the access rights, for traceability and for encrypted storage mechanisms.
The need for encrypted storage in conjunction with access control stems
from the fact that encrypted storage can enforce the use of the access control mechanism, which could be circumvented otherwise by persons having
physical access to the storage medium.
Based on these conclusions we have examined the current state of the
art regarding distributed and Grid access control and found that none of
the systems currently proposed fulfills all of our requirements, even when
disregarding the requirement of encrypted storage.
We have then examined the state of the art in encrypted storage systems. Our special focus here was on group sharing mechanisms for encrypted
files, considering a dynamic evolution of group membership and file contents.
Our results indicate that none of the current encrypted storage systems that support group sharing of encryption keys handles dynamic groups well.
Another important point is that all of these systems have their own access
control mechanism, thus creating a duplicate, possibly inconsistent layer of
access control.
Our first contribution, the access control system Sygn, is based on a
decentralized permission administration. To this end it implements a concept
of decentralized roles and file sets, that are based solely on certificates. Sygn
permission management supports decentralized authorization management
by allowing the delegation of permissions through certificate chains. Finally
the goal of decentralization is also supported by minimizing access control
information that needs to be stored at the decision points. Most of the access
control information is provided by the users requesting an access in the form
of authorization certificate chains. The decision points only need to know the
source of authority for each of the local resources which they control. Not
only does such decentralization enhance the scalability of the access control
system, it also minimizes the impact of a successful attack on an access
control decision point, since only the local resources will be exposed.
Sygn also has integrated functions to allow traceability and can be configured to require non-repudiable requests which can be used as audit
evidence. By integrating traceability in the access control mechanism, Sygn
allows for an easy deployment of both functionalities. Access control is a convenient point to collect audit information, as all requests to a system must
pass through the access control system.
Our second contribution, CryptStore, complements the access control by
protecting the data resources against the circumvention of the access control
system. CryptStore allows users to store their data in encrypted form and
to share the decryption keys with authorized users. Due to the necessity to
acquire the decryption key in order to use the data, accessing the encrypted
files on the storage medium does not help an attacker to gain knowledge
about the data contained in those files.
In order to keep permissions consistent between the decryption keys and
the files they decrypt, CryptStore uses the Grid’s file access
control mechanisms to determine if a user has access to an encrypted file’s
decryption key. This is achieved through a generic access control interface
that can be adapted to any access control system present on the Grid infrastructure.
As keys themselves are valuable data, no key-server is given a complete
copy of a key. Instead, keys are split into shares using classical secret
sharing algorithms, and the key-shares are spread among the key-servers.
Because redundant key-shares can be created, CryptStore is also robust
against the temporary inaccessibility of key-servers and against the loss
of key-shares through the failure of a key-server.
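As an illustration of threshold secret sharing with redundant shares, the sketch below uses Shamir's scheme over a prime field. The choice of Shamir's scheme, the prime, and the function names `split_secret` and `recover_secret` are assumptions made for this example; the text above only states that classical secret sharing algorithms are used.

```python
import secrets

# Assumed field size: the Mersenne prime 2**521 - 1, large enough for 256-bit keys.
PRIME = 2**521 - 1


def split_secret(secret, threshold, num_shares):
    """Split an integer secret into num_shares points; any `threshold` recover it."""
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the field GF(PRIME)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = secrets.randbits(256)                           # stand-in for a file key
shares = split_secret(key, threshold=3, num_shares=5)  # one share per key-server
print(recover_secret(shares[:3]) == key)
```

With a threshold of 3 out of 5 shares, two key-servers may be unreachable or lost without making the key unrecoverable, while fewer than three shares are not sufficient to reconstruct it.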
As future work on Sygn, we plan to integrate mechanisms for fine-grained
database access control. This would allow a controlled exposure
of databases on a Grid infrastructure, regardless of the underlying database
management system. Sygn actions would need to be extended to include typical
database actions such as SELECT, INSERT, UPDATE and DELETE,
based on the corresponding SQL queries. The database object identifiers
needed in Sygn would include information about the database
to which the permission applies, the table, the columns and possibly specific
table cells, specified by regular expressions applied to their contents. The
matching algorithm that verifies whether a request object is covered by a
permission object becomes more complex in such a case, since a permission
object may have many different subsets (e.g. restricted selections of table
columns), each of which needs to produce a positive match against the
permission object if submitted as a request object.
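The subset matching described above can be sketched as follows, restricted to column-level permissions. The `DbObject` record and the `request_matches` function are hypothetical names introduced for this example; cell-level restrictions based on regular expressions are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbObject:
    database: str
    table: str
    columns: frozenset  # column names covered by this object


def request_matches(permission, request):
    """A request matches if it targets the same table and only permitted columns."""
    return (
        permission.database == request.database
        and permission.table == request.table
        and request.columns <= permission.columns  # subset test on column sets
    )


perm = DbObject("hospital", "patients", frozenset({"id", "age", "diagnosis"}))
req = DbObject("hospital", "patients", frozenset({"age"}))
print(request_matches(perm, req))
```

Every restricted selection of the permitted columns matches the permission object, whereas a request that names a column outside the permitted set is rejected.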
Furthermore, the implementation of database access control would open
opportunities to extend Sygn’s delegation mechanisms, allowing constrained
delegation as presented in [3]. This would allow a user who has been
granted some permissions on a database to delegate only a subset of these
permissions to another user.
Following the same track, we also plan to extend Sygn to allow fine-grained
access control on elements of XML documents, drawing on previous
proposals on that topic such as [7], [30], [51] or [75]. We could build on
the experience gained during a cooperation with the Swedish
research laboratories PDC and SICS, where we implemented a system for
update access control on XACML policies [89].
For generic XML access control the required Sygn action set would be
read, insert and update, where read means that the XML element may be
read, insert specifies that new child elements may be added and update means
that the contents of the element and of all its child elements may be written
(this includes the permission to delete all or parts of them).
In this case a new type of Sygn object would be the elements of an XML
document, which can be easily identified by an XPath [26] expression.
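As a small illustration of identifying XML elements by a path expression, the sketch below uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module. The sample document, the `is_covered` helper and the idea of comparing a requested element with the set selected by a permission expression are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<record>"
    "<patient id='p1'><name>A</name><diagnosis>x</diagnosis></patient>"
    "<patient id='p2'><name>B</name><diagnosis>y</diagnosis></patient>"
    "</record>"
)


def is_covered(root, permission_xpath, element):
    """True if `element` is among the elements selected by the permission path."""
    return any(e is element for e in root.findall(permission_xpath))


diag_p1 = doc.find("./patient[@id='p1']/diagnosis")
diag_p2 = doc.find("./patient[@id='p2']/diagnosis")

print(is_covered(doc, "./patient[@id='p1']/diagnosis", diag_p1))
print(is_covered(doc, "./patient[@id='p1']/diagnosis", diag_p2))
```

A permission object expressed as `./patient[@id='p1']/diagnosis` covers only the diagnosis of the first patient, while `.//diagnosis` would cover both.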
Updates of XML documents could be controlled at a very fine-grained
level, using XML-document change detection algorithms such as
[102] and matching their results against the Sygn permissions.
XML document access control would also provide the opportunity to
integrate constrained delegation, where a user delegates a subset of their
permissions to another user. A challenging question that has to be answered
in that context is how to determine whether one XPath expression is a
restriction of another.
Finally, we plan to investigate the legal implications of Grid computing
for the processing of personal data, as discussed in section 3.5. The
problem that would have to be addressed in this context is how to create
the contractual bindings between an entity processing personal data on
the Grid and the Grid resource providers. An interesting approach could
be the implementation of automatic, ad-hoc contract negotiation
mechanisms based on pre-defined security requirements. In such a system,
a trusted third party could certify that a resource provider complies
with certain security requirements. Software agents would match the
requirements of the user against the services provided at the available
resources, choose the appropriate resource providers and conclude the
processing contract on the user’s behalf. As we have already mentioned,
it remains to be seen whether such automatically concluded contracts can
be legally binding. On this question we could cooperate with legal
experts to identify the legal requirements and to validate that our
proposed technical solutions comply with them.
Appendix A
XML Schema for the Sygn language
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- Definition of the subject identifiers -->
<xs:element name="SID" abstract="true" />
<!-- Definition of the ANY_SID identifier -->
<xs:element name="ANY_SID" substitutionGroup="SID"/>
<!-- Definition of the user identifiers UID -->
<xs:complexType name="uid">
<xs:simpleContent>
<xs:extension base="xs:string"/>
</xs:simpleContent>
</xs:complexType>
<xs:element name="USER_ID" type="uid" substitutionGroup="SID"/>
<!-- Definition of the role SOA type -->
<xs:complexType name="rsoa">
<xs:sequence>
<xs:element ref="USER_ID"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of the role (object) identifiers RID -->
<xs:complexType name="rid">
<xs:sequence>
<xs:element name="ROLE_SOA" type="rsoa"/>
<xs:element name="ROLE_NAME" type="xs:string"/>
<xs:element name="REVIEW_REPOSITORY" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:element name="ROLE_ID" type="rid" substitutionGroup="SID"/>
<!-- Definition of preOID (help construct, OID without RID) -->
<xs:element name="preOID" abstract="true" />
<!-- Definition of Capability objects (help construct -->
<!-- to simulate multiple inheritance) -->
<xs:complexType name="OID">
<xs:choice>
<xs:element ref="preOID"/>
<xs:element ref="ROLE_ID"/>
</xs:choice>
</xs:complexType>
<!-- Definition of the object SOA type -->
<xs:complexType name="osoa">
<xs:sequence>
<xs:element ref="SID"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of the file identifiers FID -->
<xs:complexType name="fid">
<xs:sequence>
<xs:element name="FILE_SOA" type="osoa"/>
<xs:element name="LOGICAL_FILENAME" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:element name="UNIQUE_FILE_ID" type="fid"
substitutionGroup="preOID"/>
<!-- Definition of the file set identifiers FSID -->
<xs:complexType name="fsid">
<xs:sequence>
<xs:element name="SET_SOA" type="osoa"/>
<xs:element name="SET_NAME" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:element name="FILE_SET_ID" type="fsid"
substitutionGroup="preOID"/>
<!-- Definition of the Resource Identifiers RESID -->
<xs:complexType name="resid">
<xs:sequence>
<xs:element name="RESOURCE_SOA" type="osoa"/>
<xs:element name="RESOURCE_NAME" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:element name="RESOURCE_ID" type="resid"
substitutionGroup="preOID"/>
<!-- Definition of action names -->
<xs:simpleType name="actionType">
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
<xs:enumeration value="read"/>
<xs:enumeration value="write"/>
<xs:enumeration value="activate"/>
<xs:enumeration value="add_to_set"/>
<xs:enumeration value="remove_from_set"/>
<xs:enumeration value="grant"/>
<xs:enumeration value="use"/>
</xs:restriction>
</xs:simpleType>
<!-- Definition of the Actions -->
<xs:complexType name="action">
<xs:simpleContent>
<xs:extension base="actionType">
<xs:attribute name="SIZE" type="xs:positiveInteger" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="ACTION" type="action"/>
<!-- Definition of Capability set objects (help construct) -->
<xs:complexType name="capset">
<xs:sequence>
<xs:element ref="FILE_SET_ID"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of the Capabilities -->
<xs:complexType name="cap">
<xs:sequence>
<xs:element name="CAPABILITY_ID" type="xs:string"/>
<xs:element name="OBJECT" type="OID"/>
<xs:element ref="ACTION"/>
<xs:element name="SECOND_OBJECT" type="capset"
maxOccurs="1" minOccurs="0" />
</xs:sequence>
</xs:complexType>
<xs:element name="CAPABILITY" type="cap"/>
<!-- Definition of AC Creator (help construct) -->
<xs:complexType name="accreator">
<xs:sequence>
<xs:element ref="USER_ID"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of AC Owner (help construct) -->
<xs:complexType name="acowner">
<xs:sequence>
<xs:element ref="SID"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of AC restrictions (help construct) -->
<xs:complexType name="restrictions">
<xs:sequence>
<xs:element ref="ROLE_ID" maxOccurs="5" minOccurs="1"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of the Authorization Certificates AC -->
<xs:complexType name="ac">
<xs:sequence>
<xs:element name="ID" type="xs:string"/>
<xs:element name="CREATOR" type="accreator"/>
<xs:element name="OWNER" type="acowner"/>
<xs:element ref="CAPABILITY"/>
<xs:element name="NOT_BEFORE" type="xs:string"/>
<xs:element name="NOT_AFTER" type="xs:string"/>
<xs:element name="NOT_WITH" type="restrictions"
maxOccurs="1" minOccurs="0"/>
<xs:element name="DELEGATIONS"
type="xs:nonNegativeInteger"/>
<xs:element name="SIGNATURE" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:element name="AUTHORIZATION_CERTIFICATE" type="ac"/>
<!-- Definition of the Sygn command names (help construct) -->
<xs:simpleType name="commandType">
<xs:restriction base="xs:string">
<xs:enumeration value="add_file_soa"/>
<xs:enumeration value="delete_file_soa"/>
<xs:enumeration value="revoke_certificate"/>
<xs:enumeration value="clean_revoked_table"/>
<xs:enumeration value="blacklist"/>
<xs:enumeration value="unblacklist"/>
<xs:enumeration value="register_resource"/>
<xs:enumeration value="unregister_resource"/>
<xs:enumeration value="log_resource_use"/>
<xs:enumeration value="get_metadata"/>
</xs:restriction>
</xs:simpleType>
<!-- Definition of Sygn command parameters (help construct) -->
<xs:complexType name="parameter">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="NR" type="xs:positiveInteger" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<!-- Definition of the Sygn Commands -->
<xs:complexType name="command">
<xs:sequence>
<xs:element name="COMMAND_NAME" type="commandType"/>
<xs:element name="PARAMETER" type="parameter"
maxOccurs="3" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
<xs:element name="SYGN_COMMAND" type="command"/>
<!-- Definition of the path targets (help construct) -->
<xs:complexType name="pathtarget">
<xs:sequence>
<xs:element ref="CAPABILITY"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of the Certificates Paths -->
<xs:complexType name="path">
<xs:sequence>
<xs:element name="TARGET" type="pathtarget"
maxOccurs="1" minOccurs="0"/>
<xs:element ref="AUTHORIZATION_CERTIFICATE"
maxOccurs="10" minOccurs="0"/>
<xs:element ref="SYGN_COMMAND"
maxOccurs="1" minOccurs="0"/>
</xs:sequence>
<xs:attribute name="NUMBER" type="xs:nonNegativeInteger"/>
</xs:complexType>
<xs:element name="PATH" type="path"/>
<!-- Definition of the request issuer (help construct) -->
<xs:complexType name="reqIssuer">
<xs:sequence>
<xs:element ref="USER_ID"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of the request signature (help construct) -->
<xs:group name="surfsignature">
<xs:sequence>
<xs:element name="ISSUE_TIME" type="xs:string"/>
<xs:element name="ISSUERS_SIGNATURE" type="xs:string"/>
</xs:sequence>
</xs:group>
<!-- Definition of paths for request (help construct) -->
<xs:complexType name="pathes">
<xs:sequence>
<xs:element ref="PATH" maxOccurs="5" minOccurs="1"/>
</xs:sequence>
</xs:complexType>
<!-- Definition of the standard user request format SURF -->
<xs:complexType name="surf">
<xs:sequence>
<xs:element name="REQ_ISSUER" type="reqIssuer"/>
<xs:group ref="surfsignature" maxOccurs="1" minOccurs="0"/>
<xs:element name="REQ_PATH" type="pathes"/>
</xs:sequence>
</xs:complexType>
<xs:element name="SURF" type="surf"/>
<!-- Definition of response values (help construct) -->
<xs:simpleType name="gdf">
<xs:restriction base="xs:string">
<xs:enumeration value="granted"/>
<xs:enumeration value="denied"/>
<xs:enumeration value="failed"/>
</xs:restriction>
</xs:simpleType>
<!-- Definition of path responses (help construct) -->
<xs:complexType name="PathResponse">
<xs:sequence>
<xs:element name="STATUS" type="gdf" />
<xs:element name="ERROR" type="xs:string" />
</xs:sequence>
<xs:attribute name="NR" type="xs:integer" use="required"/>
</xs:complexType>
<!-- Definition of request responses -->
<xs:complexType name="AdfResponse">
<xs:sequence>
<xs:element name="REQUEST_STATUS" type="gdf"/>
<xs:element name="GLOBAL_ERROR" type="xs:string"/>
<xs:element name="PATH" type="PathResponse"
maxOccurs="5" minOccurs="1"/>
</xs:sequence>
</xs:complexType>
<xs:element name="SYGN_RESPONSE" type="AdfResponse"/>
</xs:schema>
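To make the element nesting of the schema more tangible, the following Python sketch assembles a minimal AUTHORIZATION_CERTIFICATE instance for a file permission in the child order required by the `ac` type. The helper names `sub` and `build_ac`, all values, the date format and the naming of the capability identifier are placeholders chosen for illustration; the optional NOT_WITH element is omitted, and no real signature is computed.

```python
import xml.etree.ElementTree as ET


def sub(parent, tag, text=None):
    """Append a child element, optionally with text, and return it."""
    child = ET.SubElement(parent, tag)
    if text is not None:
        child.text = text
    return child


def build_ac(cert_id, creator, owner, file_soa, lfn, action,
             not_before, not_after, delegations, signature):
    """Assemble an AUTHORIZATION_CERTIFICATE in the child order of type 'ac'."""
    ac = ET.Element("AUTHORIZATION_CERTIFICATE")
    sub(ac, "ID", cert_id)
    sub(sub(ac, "CREATOR"), "USER_ID", creator)
    sub(sub(ac, "OWNER"), "USER_ID", owner)          # USER_ID substitutes for SID
    cap = sub(ac, "CAPABILITY")
    sub(cap, "CAPABILITY_ID", cert_id + "-cap")      # hypothetical naming
    fid = sub(sub(cap, "OBJECT"), "UNIQUE_FILE_ID")  # substitutes for preOID
    sub(sub(fid, "FILE_SOA"), "USER_ID", file_soa)
    sub(fid, "LOGICAL_FILENAME", lfn)
    sub(cap, "ACTION", action)
    sub(ac, "NOT_BEFORE", not_before)
    sub(ac, "NOT_AFTER", not_after)
    sub(ac, "DELEGATIONS", str(delegations))
    sub(ac, "SIGNATURE", signature)
    return ac


ac = build_ac("ac-1", "alice-public-key", "bob-public-key", "alice-public-key",
              "lfn-example", "read", "2005-01-01", "2005-12-31", 1,
              "signature-placeholder")
print(ET.tostring(ac, encoding="unicode"))
```

The sketch only mirrors the structure of the schema; whether a given instance is actually valid should be checked with a schema validator against the definitions above.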
Appendix B
XML Schema for CryptStore
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- Definition of the requests -->
<xs:simpleType name="req_type">
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
<xs:enumeration value="store_keyshare"/>
<xs:enumeration value="delete_keyshare"/>
<xs:enumeration value="retrieve_keyshare"/>
</xs:restriction>
</xs:simpleType>
<xs:complexType name="cs_req">
<xs:sequence>
<xs:element name="request_type" type="req_type"/>
<xs:element name="lfn" type="xs:string"/>
<xs:element name="keyshare" type="xs:string"
minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
<xs:element name="cryptstore_request" type="cs_req"/>
<!-- Definition of the encryption parameters -->
<xs:complexType name="crypt_par">
<xs:sequence>
<xs:element name="ALGORITHM" type="xs:string"/>
<xs:element name="KEYSIZE" type="xs:positiveInteger"/>
<xs:element name="IV" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:element name="ENCRYPTION_PARAMETERS" type="crypt_par"/>
<!-- Definition of file digest -->
<xs:complexType name="digest_info">
<xs:sequence>
<xs:element name="ALGORITHM" type="xs:string"/>
<xs:element name="DIGEST_VALUE" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:element name="MAC" type="digest_info"/>
<!-- Definition of the key recovery information -->
<xs:complexType name="keyshare_info">
<xs:sequence>
<xs:element name="THRESHOLD" type="xs:positiveInteger"/>
<xs:element name="KEYSHARE_SERVER" type="xs:string"
minOccurs="1" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:element name="KEYSHARING_INFORMATION"
type="keyshare_info"/>
<!-- Definition of the file header -->
<xs:complexType name="CS_file_header">
<xs:sequence>
<xs:element ref="ENCRYPTION_PARAMETERS"/>
<xs:element ref="MAC" minOccurs="0" maxOccurs="1"/>
<xs:element ref="KEYSHARING_INFORMATION"/>
</xs:sequence>
</xs:complexType>
<xs:element name="CRYPTSTORE_METADATA" type="CS_file_header"/>
</xs:schema>
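A client-side message following the `cs_req` type could be assembled as in the sketch below. The function name `build_request`, the example values and the rule that a key-share must accompany a store request are assumptions made for illustration; the schema itself only declares the `keyshare` element as optional.

```python
import xml.etree.ElementTree as ET

# Request types enumerated by the req_type simple type.
REQUEST_TYPES = {"store_keyshare", "delete_keyshare", "retrieve_keyshare"}


def build_request(request_type, lfn, keyshare=None):
    """Serialize a cryptstore_request with children in schema order."""
    if request_type not in REQUEST_TYPES:
        raise ValueError("unknown request type: " + request_type)
    if request_type == "store_keyshare" and keyshare is None:
        # Assumed rule: storing a share requires the share itself.
        raise ValueError("store_keyshare needs a keyshare")
    req = ET.Element("cryptstore_request")
    ET.SubElement(req, "request_type").text = request_type
    ET.SubElement(req, "lfn").text = lfn
    if keyshare is not None:
        ET.SubElement(req, "keyshare").text = keyshare
    return ET.tostring(req, encoding="unicode")


print(build_request("retrieve_keyshare", "lfn-example"))
```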
Appendix C
Sygn permission creation GUI
In order to help users create Sygn permissions, a graphical user
interface for handling permission creation and storage was designed and
implemented under my supervision by Dan Hididis in the course of the
third-year project of his Master's studies at INSA-Lyon.
The user interface supports the creation of the following Sygn certificate
components:
• Creation of RSA key-pairs for use as user identifiers (UID).
• Creation of role identifiers (RID) and role object identifiers (ROID).
• Creation of file identifiers including the optional generation of the logical filename by applying a SHA-1 hash to the file content.
• Creation of file set identifiers (FSID).
• Creation of capabilities.
• Creation of authorization certificates (AC).
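The optional derivation of a logical filename from the file content, mentioned in the list above, can be sketched in Python as follows. The function name `logical_filename`, the chunked reading and the hexadecimal encoding of the digest are assumptions made for this example; the list only states that a SHA-1 hash is applied to the file content.

```python
import hashlib


def logical_filename(path, chunk_size=65536):
    """Return the SHA-1 digest of the file content as a hexadecimal string."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Deriving the name from the content means that two files with identical content receive the same logical filename.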
The Sygn actions supported by this interface are configurable through a
parameter file, and thus easily extensible. All certificate components can be
saved in a MySQL database which can be queried using the interface. To help
users reference user identifiers and authorization certificates, aliases can be
assigned to both.
As the interface is designed to run on lightweight computing devices, all
cryptographic primitives are executed remotely on a trusted machine that
has the necessary libraries installed. The interface connects to the remote
machine through an SSL-secured Web service interface, using the SOAP
protocol. In order to be portable, the interface itself is implemented
entirely in Java, whereas the Web service is written in C++ and its Web
service protocol code was generated by the gSOAP Web services development
toolkit (available from http://www.cs.fsu.edu/~engelen/soap.html).
Figure C.1 shows the user interface during the creation of an authorization
certificate.
Figure C.1: A graphical user interface for the creation of Sygn Authorization Certificates.
Glossary
AA See Attribute authority.
AAA Authentication, Authorization, and Accounting.
Access Control The process of verifying and enforcing authorizations.
Accounting Extension of Auditing. Gathering measurements on resource
use, possibly for billing.
AC Authorization Certificate or Attribute Certificate.
ACL Access Control List. A representation of permissions under the Discretionary Access Control model.
Activation Used in the context of role based access control, as for example
in the expression "activation of a role". Unlike groups, roles are
not constantly active; therefore, in order to use the permissions of a
role, the user has to activate it. This makes it possible to separate
duties and to work with the least privileges.
Ad-hoc On demand, spontaneously. Used in conjunction with granting of
permissions to refer to permissions granted at short notice, typically
having a short lifetime.
AES Advanced Encryption Standard. Block cipher algorithm chosen by
NIST as US-standard in 2000.
Agent message sequence Message sequence in AAA systems, where the
AAA service works as agent between the users and the resource.
Akenti Access control system developed at the Distributed Systems Department of the Lawrence Berkeley Laboratory, in the USA.
Anonymization Removal of all information from a piece of personal data
that allows the identification of the person concerned by this data.
ARC4 Alleged RC4 algorithm. An unofficially published version of the RC4
stream cipher.
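ASN.1 Abstract Syntax Notation One. Notation for describing data structures,
standardized by the International Telecommunication Union. Combined with
encoding rules such as BER or DER, it yields a binary data encoding.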
Assertion Equivalent to certification. Declaration of (security relevant)
facts about a subject issued by a specific entity.
Asymmetric cryptography Also known as public-key cryptography. In
asymmetric cryptography the en-/decryption algorithm uses different keys
for encryption and decryption. The decryption key cannot
(feasibly) be derived from the encryption key. The best-known
asymmetric cryptosystem is RSA.
Attribute A property assigned to a subject. For example a group membership or the permission to activate a role.
Attribute authority An entity that is trusted to issue certain attributes
to subjects.
Attribute certificate A certificate in which an attribute is assigned to a
certain subject.
Auditing The process of analyzing the events that occurred on a certain
system by reviewing log-data.
Authentication The procedure by which a subject can prove a claimed
identity.
Authorization All activities that deal with the question of who may access
which resource in what way.
Authorization action The operation that is requested or granted on a
resource in an authorization procedure.
Authorization object The resource targeted by an authorization.
Authorization subject The subject (i.e. person or process) to which an
authorization applies.
Block ciphers Class of en-/decryption algorithms that transform fixed
blocks of data using a secret key.
Block cipher modes Execution mode of a block cipher to sequentially
en-/decrypt multiple blocks of data with the same key. Examples are the
Electronic Code Book mode (ECB), the Cipher Block Chaining mode
(CBC) or the Cipher Feedback mode (CFB).
Blowfish A block cipher created by Bruce Schneier.
Capability Combination of an authorization object and an authorization
action. First defined by discretionary access control models (DAC).
Cardea Access control system developed at the NASA Advanced Supercomputing (NAS) Division of the NASA Ames Research Center, USA.
CAP Abbreviation for capability in Sygn.
CAS Community Authorization Server. Authorization server developed by
the Globus Alliance for use with the Grid infrastructure Globus Toolkit.
CBC Cipher Block Chaining. Block cipher mode of operation, that aims to
hide patterns in different blocks of ciphertext.
Cepheus Encrypted storage system developed at the MIT.
Certificate Digital document that specifies certain properties about a subject (the owner). Examples for such properties are: a public key to be
used for authentication, attributes assigned to the owner or authorizations given to the owner. Certificates have a creator, who signs them
using a digital signature, and usually specify a validity period.
Certificate Chain Equivalent to certificate path. An ordered set of certificates, through which an authorization or authentication is validated.
Certificate Path See Certificate Chain.
CFB Cipher Feedback mode. Block cipher mode of operation that allows
a block cipher to be used as a (not very efficient) stream cipher.
CFS Cryptographic File System. A secure storage system developed at the
AT&T Bell Laboratories.
CNIL French National Commission for Liberties and Informatics. Appointed by law to deal with privacy issues in computerized data processing.
Community Term used as a synonym of Virtual Organization (VO) or
a part of a VO in the CAS system. Designates a (possibly)
cross-organizational interest group sharing resources on a Grid.
Confidentiality Prevention of the disclosure of sensitive data. Often uses
encryption methods to achieve its goals.
Creator Signer of an authorization certificate in Sygn.
CRL Certificate revocation list. Lists certificates that have been invalidated
before their expiration date for security reasons. Such lists need to
be consulted to determine the validity of a certificate, if a revocation
scheme exists for those certificates.
CryptFS Encrypted storage system created at the Computer Science Department of Columbia University, USA.
Crypto++ Object-oriented C++ library of cryptographical functions.
Available from http://www.cryptopp.com.
Cryptographic hash A cryptographical function that maps a variable size
digital document to a fixed size hash value. Often used for integrity
protection in digital signatures and message authentication codes.
Cryptographical tokens Computer hardware implementing some cryptographical function. For example smartcards that handle public/private
key cryptosystems.
CryptStore Cryptographical storage system proposed in this thesis.
C-SDA Chip-Secured Data Access. Cryptographical storage system developed at the PRISM laboratory of the University of Versailles in France.
CTS Ciphertext stealing, a variant of ECB and CBC block cipher modes.
Normally the last block of a plaintext that is encrypted using a block
cipher is padded to match the cipher’s block size. In some cases this
may be undesirable. CTS keeps the ciphertext the same
size as the plaintext.
DAC See Discretionary Access Control.
DataGrid European Grid research project. IST-2000-25182. Started 2001
and ended 2004.
Delegation The act of transferring authorizations or the power to issue
authorizations from one entity to another.
Denial-of-service A type of attack against a computer system that aims
to disrupt the availability of the system.
DES Data Encryption Standard, a block cipher algorithm. No longer considered secure due to the short length of the key (56 bits). Has been
progressively replaced by AES since 2000.
DESX Variant of the DES block cipher algorithm that increases the length
of the key.
Digital signature Method to make changes in a digital document detectable. Uses public key cryptography.
Directive 95/46 EC European Union directive on the protection of individuals with regard to the processing of personal data.
Discretionary Access Control Access control model that uses different
representations of an access control matrix to represent access permissions. The matrix is composed of a row per user and a column per
resource; a matrix cell contains the rights of the corresponding user
on the corresponding resource.
DRM Digital Rights Management. Branch of access control dealing with
the usage of digital media (audio, video) under the aspects of copyright
protection.
DSD Dynamic Separation of Duties. RBAC-related concept. Equivalent to
a Separation of Duties enforced at runtime.
ECB Electronic Code Book. Block cipher mode that encrypts each cleartext
block independently of the others.
EGEE Enabling Grid for EsciencE in Europe. European Grid research
project. Follow-up project to DataGrid. IST-2003-508833. Started in
2004. Project homepage: http://public.eu-egee.org.
Entity A user or a process acting on a user’s behalf or an automated process
acting on the Grid.
Escrow In cryptography the act of depositing a copy of an encryption key
at a trusted third party.
FID File identifier in Sygn.
FSID File set identifier in Sygn.
GACL Grid Access Control List. ACL implementation for Grids described
in [72].
GGF Global Grid Forum. Grid standardization body. http://www.ggf.org.
Grid computing An approach to distributed computing that allows transparent sharing of heterogeneous resources. See also old wine in new
bottles.
GridShib A recent project to adapt the Shibboleth access control architecture to a Grid computing environment.
Group server In encrypted storage, a server that handles group sharing of
encrypted files.
Health-care networks A network created by the interconnection of healthcare institutions (clinics, hospitals, individual doctors) with the goal of
improving health-care by making medical data more available.
HMAC A mechanism for message authentication using cryptographic hash
functions. HMAC is specified in RFC 2104 and is a FIPS standard (FIPS PUB 198).
IEC International Electrotechnical Commission. The IEC is a standards organization for all areas of electrotechnology.
IEEE Institute of Electrical and Electronics Engineers. Non-profit, technical
professional association.
IETF Internet Engineering Task Force. Open international community of
network designers, operators, vendors and researchers concerned with
the evolution of the Internet architecture and the smooth operation of
the Internet. Publishes RFCs.
Integrity protection In cryptography, protection of a digital document
against unauthorized modifications.
Intrusion detection Detection of successful hacker attacks against a computer system.
IPSec Internet Protocol Security, RFC 2401. A set of protocols for the secure
exchange of packets at the IP layer.
ISO International Organization for Standardization. ISO is a network of
national standards institutes.
KeyNote A trust management system dealing with authentication and authorization. Also defines a policy specification language.
Key server In cryptographic storage, a server that stores decryption key
material.
LDAP Lightweight Directory Access Protocol, RFC 2251. Protocol that
allows querying and modifying data stored in a hierarchical distributed
database on the network.
Least privilege RBAC-related concept of always using the smallest set of
privileges needed to perform an action. Limits the damage that can be done
by faulty or malicious processes acting on a user’s behalf.
LFN See Logical file name.
Lockbox Concept related to cryptographic storage. Refers to the storing of
a symmetric key encrypted with the public key of some user, in order
to make the symmetric key accessible for the user.
Logical file name Unique file name assigned to a file used to address it.
MAC See Mandatory Access Control or Message Authentication Code.
Mandatory Access Control Access control model that assigns a security
level to each resource and each user. Users may read all
resources that have a level equal to or lower than their own and write to
all resources that have a level equal to or higher than their own.
MDC/SHS Encryption algorithm created by Peter Gutmann for SFS.
Turns a cryptographic hash function into a block cipher running in
CFB mode.
Message authentication code Key-dependent one-way hash function.
Method of protecting the integrity of a file for users sharing a common
secret key. Cannot be used for non-repudiation, since everyone holding the
secret key can produce a valid code.
Meta-data Data associated with some other data, describing some quality of
it.
MySQL A database management system. http://www.mysql.org.
Nonce A term sometimes used to designate the initialization vector of certain
block cipher modes.
Non-repudiation Method by which the sender of some data is unable to
deny the sending of that data. In cryptography, digital signatures can
be used to implement non-repudiation.
OASIS Organization for the Advancement of Structured Information Standards. Non-profit, international consortium that works on the development, convergence, and adoption of e-business standards.
OFB Output Feedback Mode. Block cipher mode similar to CFB. Turns the
block cipher into a stream cipher.
OGSA Open Grid Services Architecture. A standard that aims at defining
a common, open architecture for grid-based applications.
OID Access control object identifier in Sygn.
OpenSSL A library implementing the SSL protocol suite used for secure
communication over sockets.
Owner The holder of a certificate in Sygn.
Path An ordered set of certificates, with the goal of proving a delegation
originating from a source of authority going to a certain entity.
PDP Policy decision point. The part of an access control system that makes
access control decisions. Term defined in RFC 2904 [96].
PEP Policy enforcement point. The part of an access control system that
enforces access control decisions. Term defined in RFC 2904 [96].
PERMIS A distributed access control system with a focus on RBAC, developed by the Information Systems Security Research Group of the
University of Salford, UK.
PKCS padding A scheme for extending a short block of cleartext to the
blocksize of the RSA algorithm.
PKI Public Key Infrastructure. A system where every entity can authenticate itself by using a digital certificate (and a corresponding private
key), created by a certification authority.
Policy In access control, a set of rules governing the access to resources.
POSIX.1E A standards paper describing security extensions to the
Portable Operating System Interface (POSIX) standardization effort.
PRIMA An access control system with a focus on ad-hoc authorization,
created at the Department of Computer Science of the Virginia Polytechnic Institute and State University.
Proxy Certificate In Grid computing, a term referring to a short lived
proxy authentication credential created from a long term authentication
credential. Used by processes acting on behalf of the owner of the long
term credential.
Pull message sequence Message sequence in AAA systems, where the resource contacts the AAA service after receiving a request in order to
get an authorization decision.
Public key cryptography Synonym for Asymmetric cryptography.
Push message sequence Message sequence in AAA systems, where the
user contacts the AAA service in order to get an authorization decision
before submitting a request to a resource.
RBAC See Role Based Access Control.
RC4 A stream cipher algorithm created by RSA Security Inc. Used in SSL
up to version 3.
RESID Hardware resource identifier in Sygn.
Resources In Grids, any hardware facility (e.g. CPUs providing computing
power, hard disks providing storage space) and shared data.
Restriction In Sygn: A limitation on how a permission may be used. Implements the RBAC concept of DSD.
Review-repository In Sygn: A storage space where permissions assigned
to a role are duplicated.
Revocation The process of invalidating a certificate before its expiration
date.
RFC Abbreviation for Request For Comments. Format of IETF standard
proposals.
RID Role identifier in Sygn.
ROID Role object identifier in Sygn. Used for roles as access control objects.
Role RBAC -related concept, a named collection of permissions and possibly
other roles, that are needed to perform a specific task.
Role Based Access Control Access control model that groups all permissions required to perform a specific task into a role and assigns roles
to users based on the tasks they have to perform.
RSA Name of the most famous asymmetric cryptography algorithm. Named
after its inventors Ron Rivest, Adi Shamir and Leonard Adleman.
SAML Security Assertions Markup Language. XML based language for
communicating security relevant information. Created by the OASIS
consortium.
SEAL Stream cipher algorithm. Designed at IBM by Phil Rogaway and Don
Coppersmith.
Secret sharing In cryptography, algorithms that distribute a secret
among several parties, so that no single party has access to the entire secret
and several (or all) parties must collaborate to reconstruct it.
Separation of duties RBAC concept of denying the simultaneous use of
certain permissions in order to prevent a user from accumulating critical
functions in a specific process.
SFS Over-used acronym for cryptographic storage systems. There is a SFS
(Secure FileSystem) by P. Gutmann [56], a SFS (Self-certifying File
System) by D. Mazières [71] and a SFS (Secure File System) by J. P.
Hughes et al. [60, 59].
SHA-1 Secure Hash Algorithm. Cryptographic hash algorithm designed to
be used with the Digital Signature Standard (DSS).
Shibboleth Access control system with a focus on user privacy protection,
designed by the Middleware Architecture Committee for Education
(MACE) of the Internet2 consortium and supported by IBM.
SID Subject identifier in Sygn.
SISWG Security in Storage Working Group. Group sponsored by IEEE with
the goal to define standards for cryptographic algorithms and methods
for encrypting data before storage.
Smartcard A plastic card with an embedded chip that features a microprocessor and a non-volatile memory. When used for asymmetric cryptography, smartcards have the advantage of providing protection for the
private key.
SNAD Secure Network Attached Disks. Cryptographic storage system developed at the University of California, USA.
SOA See Source of authority.
SOAP Simple Object Access Protocol [19]. XML based protocol for information exchange using the HTTP protocol. Often misused for achieving
firewall traversal.
SOC Sygn Owner Client. Tool that is part of Sygn and allows a resource
owner to create authorization certificates.
Source of authority The initial person who has the authority to issue permissions on a specific resource.
SPKI Simple Public Key Infrastructure. RFC 2692, 2693 [38, 39]. An architecture proposal for managing authorization through certificates.
SSD Static Separation of Duties. RBAC -related concept. Equivalent to Separation of duties enforced at permission creation.
SSL Secure Sockets Layer (OSI level 4). Protocol suite for ensuring communications security. Provides functionality such as mutual authentication,
encryption and integrity protection.
Stream cipher Class of en-/decryption algorithms that transforms a
stream of bits (or bytes) using a secret key.
SUC Sygn User Client. Tool that is part of Sygn and allows a user to store
and retrieve his authorization certificates when needed.
SURF Standard User Request Format in Sygn. The format in which requests
are submitted to the Sygn-PDP.
Sygn Access control system presented in this thesis.
Symmetric encryption Class of cryptographic algorithms using a single
secret key for encryption and decryption.
Target Used in Sygn to designate the capability that a Path is intended to
authorize to a user.
TCFS Transparent Cryptographic File System. Cryptographic storage system developed at the University of Salerno in Italy.
TDES Triple DES. Encryption algorithm based on the DES algorithm.
Applies the DES algorithm several times with different keys in
order to increase the overall key length.
Threshold Term used in secret sharing. Determines the minimum number
of shares needed to reconstruct the secret.
Timestamp Time and date in a fixed format. Used in certificates and requests.
TLS Transport Layer Security (OSI level 4). RFC 2246. Protocol standard
for transport security created on the basis of SSL version 3.0.
Traceability The possibility of verifying user actions (such as access to a
resource) in a system through the use of some log data.
Trusted third party An actor in a security-relevant protocol that holds
security-critical information.
Tweak Name of the initialization vector in some block cipher modes.
Tweakable block cipher Block cipher mode, optimized for storage security.
UID User identifier in Sygn. A public key for which the user holds the corresponding private key.
URL Uniform Resource Locator. Global address of resources on the World
Wide Web.
VO Virtual organization. Cooperation of several entities (possibly cross-organizational), providing and using in common a set of resources
in a Grid.
VOMS Virtual Organization Membership Service. An authorization server
created in the framework of the European DataGrid project, for which
development continues in the EGEE project.
W3C World Wide Web Consortium. Develops Web standards and guidelines.
Web services Distributed computing technology based on the World Wide
Web. Developed by the W3C.
WinEFS Windows Encrypting File System. Cryptographic storage system incorporated in some editions of the Windows operating system.
WSDL Web Services Description Language [24]. XML based language designed for the specification of the interfaces exposed by a web service.
WSRF Web Services Resource Framework. Specification developed by the
OASIS consortium in order to add state information to web services.
The Globus Toolkit version 4 implements a Grid middleware that is
compliant with WSRF.
X.509 A public-key infrastructure standard defined by the ITU-T and profiled for Internet use by the IETF. A part of X.509 is
the definition of a certificate format. Used in SSL/TLS.
XACML eXtensible Access Control Markup Language. Standard proposal
by the OASIS consortium, that defines a general purpose, XML based
language for specifying access control policies.
XML eXtensible Markup Language (XML). A simple, flexible text format
for the exchange of data. Developed by the W3C.
XMLSchema A language for defining the structure, content and semantics
of XML documents. Developed by the W3C.
XrML eXtensible rights Markup Language. Policy language based on XML,
used to describe rights and conditions for using digital resources. Developed by the ContentGuard company.
µgrid A minimal Grid architecture developed by J. Montagnat and D. Lingrand at the CREATIS laboratory of INSA Lyon in France.
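
As an illustration of the Secret sharing and Threshold entries, the following minimal Python sketch implements a threshold scheme in the style of Shamir [90]. It is not part of Sygn or of the implementation described in this thesis; the function names split_secret and recover_secret, the prime modulus and the example values are assumptions chosen for illustration only.

```python
import secrets

# Assumption: a fixed prime modulus larger than any secret to be shared.
# 2**127 - 1 is a Mersenne prime; all arithmetic is done modulo this value.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` shares.

    Any `threshold` shares are sufficient to reconstruct the secret.
    """
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Reconstruct the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8 or later).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    shares = split_secret(123456789, threshold=3, num_shares=5)
    print(recover_secret(shares[:3]) == 123456789)
```

With a threshold of 3 and 5 shares, any three shares reconstruct the secret, whereas two shares alone do not determine the polynomial and therefore reveal nothing about its constant term.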
Bibliography
[1] N. AHITUV, Y. LAPID, and S. NEUMANN. Processing Encrypted
Data. Communications of the ACM, 30(9):777–780, September 1987.
[2] R. ALFIERI, R. CECCHINI, V. CIASCHINI, and al. VOMS, an Authorization System for Virtual Organizations. In Proceedings of the 1st
European Across Grids Conference, Santiago de Compostela, Spain,
February 2003.
[3] O. BANDMANN, M. DAM, and B. SADIGHI FIROZABADI. Constrained Delegation. In Proceedings of 2002 IEEE Symposium on Security and Privacy, Oakland, CA, USA, May 2002.
[4] D. E. BELL. A refinement of the mathematical model. Technical
Report ESD-TR-278 vol. 3, The Mitre Corp., Bedford, MA, 1973.
[5] D. E. BELL and L. J. LAPADULA. Secure computer systems: Mathematical foundations. Technical Report ESD-TR-278 vol. 1, The Mitre
Corp., Bedford, MA, 1973.
[6] M. BELLARE, R. CANETTI, and H. KRAWCZYK. Keying Hash
Functions for Message Authentication. In Advances in Cryptology
- Crypto 96, Proceedings of the 16th Annual International Cryptology Conference, volume LNCS 1109, pages 1–15. Springer-Verlag, August 1996.
[7] E. BERTINO and E. FERRARI. Secure and Selective Dissemination of
XML Documents. In ACM, Transactions on Information and System
Security (TISSEC), volume 5, pages 290–331. 2002.
[8] E. BERTINO, E. FERRARI, and A. SQUICCIARINI. Trust Negotiations: Concepts, Systems, and Languages. CERIAS Tech Report
2004-68, Center for Education and Research in Information Assurance and Security, Purdue University, West Lafayette, IN 47907-2086,
July/August 2004.
[9] E. BERTINO, P. MAZZOLENI, B. CRISPO, and al. Towards Supporting Fine-Grained Access Control for Grid Resources. In Proceedings of
the 10th International Workshop on Future Trends in Distributed Computing Systems (FTDCS), pages 59–65, Suzhou, China, May 2004.
[10] E. BERTINO and A. C. SQUICCIARINI. A Flexible Access Control Model for Web Services. In Proceedings of the 6th International
Conference On Flexible Query Answering Systems, pages 13–16, Lyon,
France, June 2004.
[11] K. J. BIBA. Integrity considerations for secure computer systems.
Technical Report TR-3153, The Mitre Corp., Bedford, MA, April 1976.
[12] M. BLAZE. A Cryptographic File System for UNIX. In ACM Conference on Computer and Communications Security, pages 9–16, Fairfax,
VA, November 1993. Association for Computing Machinery (ACM).
[13] M. BLAZE. Key Management in an Encrypting File System. In Proceedings of USENIX Summer 1994 Technical Conference, Boston, MA,
USA, June 1994.
[14] M. BLAZE, J. FEIGENBAUM, J. IOANNIDIS, and al. The KeyNote
Trust-Management System Version 2. Request For Comments (RFC)
2704, Internet Engineering Task Force (IETF), September 1999.
http://www.ietf.org/rfc/rfc2704.txt (Webpage visited on 12/04/05).
[15] P. BONATTI and P. SAMARATI. A Unified Framework for Regulating
Access and Information Release on the Web. Journal of Computer
Security, 10(3):241–272, September 2002.
[16] P. BONATTI, S. DE CAPITANI DI VIMERCATI, and P. SAMARATI. An Algebra for Composing Access Control Policies. ACM
Transactions on Information and System Security (TISSEC), 5(1):1–
35, February 2002.
[17] L. BOUGANIM, F. D. NGOC, P. PUCHERAL, and al. Chip-secured
data access: Reconciling access rights with data encryption. In Proceedings of the 29th conference on Very Large Data Bases (VLDB), pages
1133–1136, Berlin, Germany, September 2003.
[18] L. BOUGANIM and P. PUCHERAL. Chip-Secured Data Access: Confidential Data on Untrusted Servers. In Proceedings of the 28th conference on Very Large Data Bases (VLDB), pages 131–142, Hong Kong,
China, August 2002.
[19] D. BOX, D. EHNEBUSKE, G. KAKIVAYA, and al. Simple Object
Access Protocol (SOAP) 1.1. W3C note, World Wide Web Consortium, May 2000. http://www.w3.org/TR/soap (Webpage visited on
16/05/05).
[20] T. BRAY, J. PAOLI, C. M. SPERBERG-MCQUEEN, and al. eXtensible Markup Language (XML) 1.0. W3C recommendation, World Wide
Web Consortium, 1998. http://www.w3.org/TR/REC-xml (Webpage
visited on 12/04/05).
[21] G. CATTANEO, G. PERSIANO, A. DEL SORBO, and al. Design
and Implementation of a Transparent Cryptographic File System for
UNIX. Technical report, University of Salerno, Italy, July 1997.
[22] G. CATTANEO, G. PERSIANO, A. DEL SORBO, and al. The Design
and Implementation of a Transparent Cryptographic File System for
UNIX. In Proceedings of the UNIX Annual Technical Conference 2001,
Freenix Track, Boston MA, USA, June 2001.
[23] D. CHADWICK and A. OTENKO. The PERMIS X.509 Role Based
Privilege Management Infrastructure. In Proceedings of the 7th ACM
Symposium on Access Control Models and Technologies, pages 135–140,
Monterey, CA, USA, June 2002.
[24] E. CHRISTENSEN, F. CURBERA, G. MEREDITH, and al. Web Services Description Language (WSDL) 1.1. W3C note, World Wide Web
Consortium, March 2001. http://www.w3.org/TR/wsdl (Webpage visited on 16/05/05).
[25] B. CLAERHOUT and G. J. E. DE MOOR. Privacy protection for
healthgrid applications. In Proceedings of the second European HealthGrid conference, Clermont-Ferrand, France, January 2004.
[26] J. CLARK and S. DEROSE. XML Path Language (XPath). W3C
recommendation, World Wide Web Consortium, November 1999.
http://www.w3.org/TR/xpath (Webpage visited on 12/04/05).
[27] PORTABLE APPLICATIONS STANDARDS COMMITTEE. Portable
Operating System Interface (POSIX) - Part 1: System Application Program Interface (API) - Amendment #: Protection, Audit and Control
Interfaces [C Language]. Withdrawn draft, IEEE Computer Society,
October 1997. http://wt.xpilot.org/publications/posix.1e/ (Webpage
visited on 12/04/05).
[28] CONTENTGUARD. eXtensible rights Markup Language XrML 2.0
Specification. Whitepaper, ContentGuard Inc., November 2001.
http://www.xrml.org/ (Webpage visited on 12/04/05).
[29] COUNCIL OF EUROPE. Convention for the protection of human
rights and fundamental freedoms. http://www.echr.coe.int, 4 November 1950. (Webpage visited on 12/04/05).
[30] E. DAMIANI, S. DE CAPITANI DI VIMERCATI, S. PARABOSCHI,
and al. A Fine-Grained Access Control System. In Transactions on
Information and System Security (TISSEC), volume 5, pages 169–202.
ACM, 2002.
[31] G. DELLA-LIBERA, B. DIXON, J. FARRELL, and al. Security in a
Web Services World: A Proposed Architecture and Roadmap.
Whitepaper, IBM Corporation and Microsoft Corporation, April 2002.
Available from http://www128.ibm.com/developerworks/webservices/library/ws-secmap (Webpage visited on 16/05/05).
[32] J. DOMINGO-FERRER. A new privacy homomorphism and applications. Information Processing Letters, 60(5):277–282, December 1996.
ISSN 0020-0190.
[33] D. EASTLAKE, J. REAGLE, and D. SOLO. XML-Signature Syntax
and Processing. W3C recommendation, World Wide Web Consortium,
February 2002. http://www.w3.org/TR/xmldsig-core (Webpage visited on 12/04/05).
[34] C. KENT Ed. Draft Proposal for Tweakable Narrow-block Encryption.
Draft, IEEE Computer Society, August 2004.
http://www.siswg.org/docs/index.html (Webpage visited on 12/04/05).
[35] D. NAOR Ed. Draft Proposal for Key Backup Format for Wide-block
Encryption. Draft, IEEE Computer Society, September 2004.
http://www.siswg.org/docs/index.html (Webpage visited on 12/04/05).
[36] S. HALEVI Ed. Draft Proposal for Tweakable Wide-block Encryption.
Draft, IEEE Computer Society, March 2003.
http://www.siswg.org/docs/index.html (Webpage visited on 12/04/05).
[37] V. WELCH Ed. Globus Toolkit Version 4 Grid Security Infrastructure: A Standards Perspective. Technical report, Globus Security Team, Globus Alliance, December 2004. Available from
http://www.globus.org/toolkit/docs/4.0/security (Webpage visited on
16/05/05).
[38] C. ELLISON. SPKI Requirements. Request For Comments (RFC)
2692, Internet Engineering Task Force (IETF), September 1999.
http://www.ietf.org/rfc/rfc2692.txt (Webpage visited on 12/04/05).
[39] C. ELLISON, B. FRANTZ, B. LAMPSON, and al. SPKI Certificate
Theory. Request For Comments (RFC) 2693, Internet Engineering Task
Force (IETF), September 1999. http://www.ietf.org/rfc/rfc2693.txt
(Webpage visited on 12/04/05).
[40] M. ERDOS and S. CANTOR. Shibboleth-Architecture Draft v05.
Technical report, Internet2, 2002.
http://middleware.internet2.edu/shibboleth (Webpage visited on
12/04/05).
[41] EUROPEAN UNION. EUROPA - internal market - data protection
- legislative documents.
http://europa.eu.int/comm/internal_market/privacy/law_en.htm
(Webpage visited on 12/04/05).
[42] EUROPEAN UNION. Directive 95/46/EC of the European Parliament
and of the Council. Official Journal of the European Communities, L
281:31–50, 24 October 1995.
[43] EUROPEAN UNION. Charter of fundamental rights of the european
union. Official Journal of the European Communities, C 364:1–22, 7
December 2000.
[44] EUROPEAN UNION. Consolidated Version of the Treaty on European
Union. Official Journal of the European Communities, C 325:5–181, 24
December 2002.
[45] S. FARRELL and R. HOUSLEY. An Internet Attribute Certificate
Profile for Authorization. Request For Comments (RFC) 3281,
Internet Engineering Task Force (IETF), April 2002.
http://www.ietf.org/rfc/rfc3281.txt (Webpage visited on 12/04/05).
[46] D. FERRAIOLO and D. R. KUHN. Role Based Access Control. In
Proceedings of the 15th NIST-NCSC National Computer Security Conference, pages 554–563, October 1992.
[47] D. FERRAIOLO, R. SANDHU, S. GAVRILA, and al. A Proposed
Standard for Role Based Access Control. ACM Transactions on Information and System Security, 4(3), 2001.
[48] I. FOSTER and C. KESSELMAN, editors. The Grid: Blueprint for a
New Computing Infrastructure. Morgan Kaufmann Publishers, Inc.,
San Francisco, 1999.
[49] I. FOSTER, H. KISHIMOTO, and A. SAVVA Eds. The Open Grid
Services Architecture. Draft, Open Grid Services Architecture Working
Group, January 2005. Available from
http://forge.gridforum.org/projects/ogsa-wg (Webpage visited on
16/05/05).
[50] K. FU. Group Sharing and Random Access in Cryptographic Storage
File Systems. Master’s thesis, Massachusetts Institute of Technology,
June 1999.
[51] A. GABILLON and E. BRUNO. Regulating Access to XML documents. In Proceedings of the fifteenth annual working conference
on Database and application security, Niagara on the Lake, Ontario,
Canada, July 2001.
[52] S. GODIK and T. MOSES Eds. eXtensible Access Control Markup
Language (XACML). Standard, Organization for the Advancement of Structured Information Standards (OASIS), February 2003.
http://www.oasis-open.org/ (Webpage visited on 12/04/05).
[53] FRENCH GOVERNMENT. Loi n◦ 2004-801 du 6 août 2004 relative
à la protection des personnes physiques à l’égard des traitements de
données à caractère personnel et modifiant la loi n◦ 78-17 du 6 janvier
1978 relative à l’informatique, aux fichiers et aux libertés. Journal
Officiel de la République Française, JUSX0100026L, 6 August 2004.
[54] G. S. GRAHAM and P. J. DENNING. Protection principles and
practice. In Proceedings of the American Federation of Information
Processing Societies (AFIPS) Conference, volume 40, pages 417–429,
Montvale, N.J., USA, May 1972. AFIPS Press.
[55] P. GUTMANN. PKI: It’s Not Dead, Just Resting. IEEE Computer,
35(8):41–49, August 2002.
[56] P. GUTMANN. Secure filesystem.
http://www.cs.auckland.ac.nz/∼pgut001/sfs, September 1996.
(Webpage visited on 12/04/05).
[57] M. A. HARRISON, W. L. RUZZO, and J. D. ULLMAN. Protection in
operating systems. Communications of the ACM, 19(8):461–471, 1976.
[58] J. A. M. HERVEG, F. CRAZZOLARA, S. E. MIDDLETON, and al.
GEMSS: Privacy and security for a Medical Grid. In Proceedings of
the second HealthGRID conference, Clermont-Ferrand, France, January
2004.
[59] J. HUGHES and C. FEIST. Architecture of the Secure File System.
In Proceedings of the 18th IEEE Symposium on Mass Storage Systems,
pages 277–290, San Diego, CA, USA, April 2001.
[60] J. HUGHES, C. FEIST, S. HAWKINSON, and al. A Universal Access, Smart-Card-Based, Secure File System. In Proceedings of the 3rd
annual Atlanta Linux Showcase, Atlanta, Georgia, USA, October 1999.
[61] ISO/IEC. Information technology – Open Systems Interconnection –
Security frameworks for open systems: Access control framework. ISO
Standard ISO/IEC 10181-3, International Organization for Standardization (ISO), 1995.
[62] P. KOCHER. Timing Attacks on Implementations of Diffie-Hellman,
RSA, DSS, and Other Systems. In Advances in Cryptology: Proceedings
of the CRYPTO’96 conference, pages 104–113, Santa Barbara, California, USA, August 1996. Springer Verlag.
[63] P. KOCHER, J. JAFFE, and B. JUN. Differential Power Analysis : Leaking Secrets. In Advances in Cryptology: Proceedings of the
CRYPTO’99 conference, vol 1666, pages 388–397, Santa Barbara, California, USA, August 1999. Springer Verlag.
[64] B. LAMPSON. Protection. In Proceedings of the 5th Princeton Conference on Information Sciences and Systems, Princeton, 1971. Reprinted
in ACM Operating Systems Rev., volume 8, 1, pages 18–24, 1974.
[65] R. LEPRO. Cardea: Dynamic Access Control in Distributed Systems. Technical Report NAS-03-020, NASA Advanced Supercomputing
(NAS) Division, November 2003.
[66] M. LORCH, D. ADAMS, D. KAFURA, and al. The PRIMA System for
Privilege Management, Authorization and Enforcement. In Proceedings
of the 4th International Workshop on Grid Computing, Phoenix, AZ,
USA, November 2003.
[67] M. LORCH and D. KAFURA. Supporting Secure Ad-hoc User Collaboration in Grid Environments. In Proceedings of the 3rd International
Workshop on Grid Computing, Baltimore, MD, USA, November 2002.
[68] E. MALER, P. MISHRA, and R. PHILPOTT Eds. The OASIS Security Assertion Markup Language (SAML) v1.1. Standard, Organization for the Advancement of Structured Information Standards (OASIS), September 2003. http://www.oasis-open.org (Webpage visited on
12/04/05).
[69] S. MANGARD. A Simple Power-Analysis (SPA) Attack on Implementations of the AES Key Expansion. In Lecture Notes in Computer Science Volume 2587: Proceedings of the 5th International Conference on
Information Security and Cryptology (ICISC), pages 343–358, Seoul,
Korea, November 2002.
[70] F. MARTIN-SANCHEZ, A. BABIC, R. BAUD, and al. Synergy
between medical informatics and bioinformatics: facilitating genomic
medicine for future health care. Journal of Biomedical Informatics,
37(1):30–42, 2004.
[71] D. MAZIÈRES. Security and Decentralized Control in the SFS Global
File System. Master’s thesis, Massachusetts Institute of Technology,
August 1998.
[72] A. MCNAB and S. KAUSHAL. Gridsite: Grid access control language.
http://www.gridsite.org/1.0.x/gacl.html, December 2003. (Webpage
visited on 12/04/05).
[73] MICROSOFT. Encrypting file system for windows 2000. Whitepaper
6715, Microsoft Corporation, 1998.
[74] E. MILLER, D. LONG, W. FREEMAN, and al. Strong Security for
Network-Attached Storage. In Proceedings of the 1st Annual Conference on File and Storage Technologies (FAST), Monterey, CA, USA,
January 2002.
[75] M. MURATA, A. TOZAWA, and M. KUDO. XML Access Control
Using Static Analysis. In Proceedings of the 10th ACM conference on
computer and communication security, Washington, DC, USA, October
2003.
[76] N. NAGARATNAM, P. JANSON, J. DAYKA, A. NADALIN, F. SIEBENLIST,
V. WELCH, S. TUECKE, and I. FOSTER. Security Architecture for Open
Grid Services. Technical report, GGF OGSA Security Workgroup, July
2002. Revised 6/5/2003, available from http://www.ggf.org/ogsa-secwg (Webpage visited on 26/06/05).
[77] G. NAVARRO, B. SADIGHI FIROZABADI, E. RISSANEN, and al.
Constrained delegation in XML-based Access Control and Digital
Rights Management Standards. In Proceedings of the IASTED International Conference on Communication, Network, and Information
Security, New York, USA, December 2003.
[78] L. PEARLMAN, C. KESSELMAN, V. WELCH, and al. The Community Authorization Service: Status and Future. In Proceedings of the
2003 Conference for Computing in High Energy and Nuclear Physics
(CHEP), La Jolla, California, March 2003.
[79] L. PEARLMAN, V. WELCH, I. FOSTER, and al. A Community
Authorization Service for Group Collaboration. In Proceedings of the
2002 IEEE Workshop on Policies for Distributed Systems and Networks, Monterey, California, USA, June 2002.
[80] PKIX WORKING GROUP. Public Key Infrastructure (X.509). Technical report, Internet Engineering Task Force (IETF), 2002.
http://www.ietf.org/html.charters/pkix-charter.html (Webpage visited on 12/04/05).
[81] J. RAO, P. ROHATGI, H. SCHERZER, and al. Partitioning Attacks:
Or How to Rapidly Clone Some GSM Cards. In Proceedings of the
2002 IEEE Symposium on Security and Privacy, pages 31–44, Oakland,
California, USA, May 2002.
[82] T. RINDFLEISCH. Privacy, information technology, and health care.
Communications of the ACM, 40(8):92–100, 1997.
[83] R. SAADI, J. M. PIERSON, and L. BRUNIE. APC: Access Pass
Certificate. Distrust Certification Model for Large Access in Pervasive
Environment. To appear in the proceedings of the IEEE International
Conference on Pervasive Services, Santorini, Greece, July 2005.
[84] P. SAMARATI and S. DE CAPITANI DI VIMERCATI. Access Control: Policies, Models, and Mechanisms. In Proceedings of the first
International School On Foundations Of Security Analysis And Design
(FOSAD), volume LNCS 2171, pages 137–196. Springer, 2001.
[85] R. SANDHU, E. J. COYNE, H. L. FEINSTEIN, and al. Role-Based
Access Control Models. IEEE Computer, 29(2):38–47, 1996.
[86] R. SANDHU and P. SAMARATI. Access Control: Principles and Practice. IEEE Communications Magazine, 32(9):40–48, 1994.
[87] B. SCHNEIER. Applied Cryptography: Protocols, Algorithms, and
Source Code in C, Second Edition. John Wiley & Sons, New York,
second edition, 1995.
[88] L. SEITZ, J. MONTAGNAT, J. M. PIERSON, and al. Authentication
and Authorization Prototype on the µgrid for Medical Data Management. In From Grid to Healthgrid, Proceedings of Healthgrid 2005,
pages 222–233, Oxford, UK, April 2005. IOS Press.
[89] L. SEITZ, E. RISSANEN, T. SANDHOLM, and al. Policy Administration Control and Delegation using XACML and Delegent. Technical
Report RR-2005-010, LIRIS, INSA-Lyon, France, 2005.
[90] A. SHAMIR. How to Share a Secret. In Communications of the ACM,
volume 22, pages 612–613, 1979.
[91] M. THOMPSON, W. JOHNSTON, S. MUDUMBAI, and al.
Certificate-based Access Control for Widely Distributed Resources. In
Proceedings of the 8th USENIX Security Symposium, Washington, D.C.,
USA, August 1999.
[92] M. THOMPSON, S. MUDUMBAI, A. ESSIARI, and al. Authorization
Policy in a PKI Environment. In Proceedings of the 1st Annual NIST
workshop on PKI, Gaithersburg, Maryland, USA, April 2002.
[93] S. TUECKE, V. WELCH, D. ENGERT, and al. Internet X.509 Public
Key Infrastructure (PKI) Proxy Certificate Profile. Request For Comments (RFC) 3820, Internet Engineering Task Force (IETF), June 2004.
http://www.ietf.org/rfc/rfc3820.txt (Webpage visited on 12/04/05).
[94] INTERNATIONAL TELECOMMUNICATION UNION. Abstract Syntax Notation One (ASN.1). ITU-T Recommendation X.680 | ISO/IEC Standard 8824-1:2002, International Telecommunication Union,
July 2002.
[95] S. DE CAPITANI DI VIMERCATI and P. SAMARATI. New Directions in Access Control. In: Cyberspace Security and Defense:
Research Issues, Kluwer Academic Publisher (to appear). Available
from http://seclab.dti.unimi.it/Papers/nato.pdf (Webpage visited on
12/04/05).
[96] J. VOLLBRECHT, P. CALHOUN, S. FARRELL, and al. AAA Authorization Framework. Request For Comments (RFC) 2904, Internet
Engineering Task Force (IETF), August 2000.
http://www.ietf.org/rfc/rfc2904.txt (Webpage visited on 12/04/05).
[97] J. WANG, D. DEL VECCHIO, and M. HUMPHREY. Extending the
Security Assertion Markup Language to Support Delegation for Web
Services and Grid Services. submitted for publication, available from
http://www.cs.virginia.edu/∼humphrey/GCG.html (Webpage visited
on 12/04/05), 2005.
[98] X. WANG, Y. L. YIN, and H. YU. Collision Search Attacks on SHA1.
Available from: http://theory.csail.mit.edu/∼yiqun/shanote.pdf (Webpage visited on 12/04/05), February 2005.
[99] V. WELCH, T. BARTON, K. KEAHEY, and al. Attributes,
Anonymity, and Access: Shibboleth and Globus Integration to Facilitate Grid Collaboration. In Proceedings of the 4th Annual PKI R&D
Workshop, Gaithersburg, MD, USA, April 2005.
[100] V. WELCH, I. FOSTER, C. KESSELMAN, and al. X.509 Proxy Certificates for Dynamic Delegation. In Proceedings of the 3rd Annual PKI
R&D Workshop, Gaithersburg, MD, USA, April 2004.
[101] D. WIJESEKERA and S. JAJODIA. A Propositional Policy Algebra
for Access Control. ACM Transactions on Information and System
Security (TISSEC), 6(2):286–325, May 2003.
[102] Y. WANG, D. DEWITT, and J. CAI. X-Diff: An effective change detection algorithm for XML documents. In Proceedings of the 19th International Conference on Data Engineering, pages 519–530, Bangalore,
India, March 2003.
[103] E. ZADOK, I. BADULESCU, and A. SHENDER. Cryptfs: A Stackable
Vnode Level Encryption File System. Technical Report CUCS-021-98,
Computer Science Department, Columbia University, July 1998.