d1_common.wrap package
DataONE API Type wrappers.
Although this directory is not a package, this __init__.py file is required for pytest to be able to reach test directories below this directory.
Submodules
d1_common.wrap.access_policy module
Context manager for working with the DataONE AccessPolicy type.
Examples
Perform multiple operations on an AccessPolicy:
# Wrap a SystemMetadata PyXB object to modify its AccessPolicy section
with d1_common.wrap.access_policy.wrap_sysmeta_pyxb(sysmeta_pyxb) as ap:
    # Print the set of subjects that have the changePermission access level
    print(ap.get_subjects_with_equal_or_higher_perm('changePermission'))
    # Clear any existing rules in the access policy
    ap.clear()
    # Add a new rule
    ap.add_perm('subj1', 'read')
# Exit the context manager scope to write the changes that were made back to the
# wrapped SystemMetadata.
If only a single operation is to be performed, use one of the module level functions:
# Add public read permission to an AccessPolicy. This adds an allow rule with
# a "read" permission for the symbolic subject, "public". It is a no-op if any of
# the existing rules already provide "read" or better to "public".
add_public_read(access_pyxb)
Notes
Overview:
Each science object in DataONE has an associated SystemMetadata document in which there is an AccessPolicy element. The AccessPolicy contains rules assigning permissions to subjects. The supported permissions are read, write and changePermission.
write implicitly includes read, and changePermission implicitly includes read and write. So, only a single permission needs to be assigned to a subject in order to determine all permissions for the subject.
There can be multiple rules in a policy and each rule can contain multiple subjects and permissions. So the same subject can be specified multiple times in the same rule or in different rules, each time with a different set of permissions, while permissions also implicitly include lower permissions.
Due to this, the same permissions can be expressed in many different ways. This wrapper hides the variations, exposing a single canonical set of rules that can be read, modified and written. That is, the wrapper allows working with any set of permissions in terms of the simplest possible representation that covers the resulting effective permissions.
E.g., the following two access policies are equivalent. The latter represents the canonical representation of the former.
<accessPolicy>
  <allow>
    <subject>subj2</subject>
    <subject>subj1</subject>
    <perm>read</perm>
  </allow>
  <allow>
    <subject>subj4</subject>
    <perm>read</perm>
    <perm>changePermission</perm>
  </allow>
  <allow>
    <subject>subj2</subject>
    <subject>subj3</subject>
    <perm>read</perm>
    <perm>write</perm>
  </allow>
  <allow>
    <subject>subj5</subject>
    <perm>read</perm>
    <perm>write</perm>
  </allow>
</accessPolicy>
and
<accessPolicy>
  <allow>
    <subject>subj1</subject>
    <perm>read</perm>
  </allow>
  <allow>
    <subject>subj2</subject>
    <subject>subj3</subject>
    <subject>subj5</subject>
    <perm>write</perm>
  </allow>
  <allow>
    <subject>subj4</subject>
    <perm>changePermission</perm>
  </allow>
</accessPolicy>
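The equivalence of the two policies above can be checked with a short sketch. This is not the wrapper's implementation: highest_perm_by_subject and normalize are hypothetical helpers, and the XML is parsed with the standard library ElementTree module instead of PyXB. Each policy is reduced to the highest permission per subject, and subjects are then grouped by that permission.

```python
import xml.etree.ElementTree as ET

# Permissions ordered from lowest to highest; each implies all lower ones.
ORDERED_PERMS = ['read', 'write', 'changePermission']

POLICY_A = """<accessPolicy>
  <allow><subject>subj2</subject><subject>subj1</subject><perm>read</perm></allow>
  <allow><subject>subj4</subject><perm>read</perm><perm>changePermission</perm></allow>
  <allow><subject>subj2</subject><subject>subj3</subject><perm>read</perm><perm>write</perm></allow>
  <allow><subject>subj5</subject><perm>read</perm><perm>write</perm></allow>
</accessPolicy>"""

POLICY_B = """<accessPolicy>
  <allow><subject>subj1</subject><perm>read</perm></allow>
  <allow><subject>subj2</subject><subject>subj3</subject><subject>subj5</subject><perm>write</perm></allow>
  <allow><subject>subj4</subject><perm>changePermission</perm></allow>
</accessPolicy>"""


def highest_perm_by_subject(policy_xml):
    """Map each subject to the highest permission any rule grants it."""
    highest = {}
    for allow_el in ET.fromstring(policy_xml).findall('allow'):
        perms = [perm_el.text for perm_el in allow_el.findall('perm')]
        rule_rank = max(ORDERED_PERMS.index(perm) for perm in perms)
        for subj_el in allow_el.findall('subject'):
            prev_rank = highest.get(subj_el.text, -1)
            highest[subj_el.text] = max(prev_rank, rule_rank)
    return {subj: ORDERED_PERMS[rank] for subj, rank in highest.items()}


def normalize(policy_xml):
    """Group subjects by highest permission, in a fixed order."""
    highest = highest_perm_by_subject(policy_xml)
    return [
        [perm, sorted(s for s, p in highest.items() if p == perm)]
        for perm in ORDERED_PERMS
        if perm in highest.values()
    ]


norm_a = normalize(POLICY_A)
norm_b = normalize(POLICY_B)
print(norm_a)
print(norm_a == norm_b)
```

Both policies reduce to the same grouped list, which is why they can be treated as equivalent.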
Representations of rules, permissions and subjects:
subj_dict maps each subj to the perms that the subj has specifically been given. It holds the perms as they were read from PyXB. Duplicates caused by the same subj being given the same perm in multiple ways are filtered out.
{
'subj1': { 'read' },
'subj2': { 'read', 'write' },
'subj3': { 'read', 'write' },
'subj4': { 'changePermission', 'read' },
'subj5': { 'read', 'write' }
}
perm_dict maps each perm that a subj has specifically been given, to the subj. If the AccessPolicy contains multiple allow elements, and they each give different perms to a subj, those show up as additional mappings. Duplicates caused by the same subj being given the same perm in multiple ways are filtered out. Calls such as add_perm() also cause extra mappings to be added here, as long as they’re not exact duplicates. Whenever this dict is used for generating PyXB or making comparisons, it is first normalized to a norm_perm_list.
{
'read': { 'subj1', 'subj2' },
'write': { 'subj3' },
'changePermission': { 'subj2' },
}
subj_highest_dict maps each subj to the highest perm the subj has. The dict has the same number of keys as there are subj.
{
'subj1': 'write',
'subj2': 'changePermission',
'subj3': 'write',
}
highest_perm_dict maps the highest perm a subj has, to the subj. The dict can have at most 3 keys:
{
'changePermission': { 'subj2', 'subj3', 'subj5', 'subj6' },
'read': { 'public' },
'write': { 'subj1', 'subj4' }
}
norm_perm_list is a minimal, ordered and hashable list of lists. The top level has up to 3 lists, one for each perm that is in use. Each of the lists then has a list of subj for which that perm is the highest perm. norm_perm_list is the shortest way that the required permissions can be expressed, and is used for comparing access policies and creating uniform PyXB objects:
[
['read', ['public']],
['write', ['subj1', 'subj4']],
['changePermission', ['subj2', 'subj3', 'subj5', 'subj6']]
]
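How these representations relate to each other can be sketched with plain dicts and sets. The function names below are hypothetical and the wrapper may derive the structures differently; the input is the perm_dict example shown earlier.

```python
ORDERED_PERMS = ['read', 'write', 'changePermission']


def to_subj_highest_dict(perm_dict):
    """Map each subject to the highest permission found for it in perm_dict."""
    subj_highest = {}
    for perm in ORDERED_PERMS:  # ascending, so higher perms overwrite lower ones
        for subj in perm_dict.get(perm, set()):
            subj_highest[subj] = perm
    return subj_highest


def to_highest_perm_dict(subj_highest):
    """Invert subj_highest_dict: highest permission -> set of subjects."""
    highest_perm = {}
    for subj, perm in subj_highest.items():
        highest_perm.setdefault(perm, set()).add(subj)
    return highest_perm


def to_norm_perm_list(highest_perm):
    """Order permissions low to high and sort the subjects under each."""
    return [
        [perm, sorted(highest_perm[perm])]
        for perm in ORDERED_PERMS
        if perm in highest_perm
    ]


perm_dict = {
    'read': {'subj1', 'subj2'},
    'write': {'subj3'},
    'changePermission': {'subj2'},
}
subj_highest_dict = to_subj_highest_dict(perm_dict)
highest_perm_dict = to_highest_perm_dict(subj_highest_dict)
norm_perm_list = to_norm_perm_list(highest_perm_dict)
print(norm_perm_list)
```

Sorting the subjects and iterating the permissions in a fixed order is what makes two differently written policies compare equal after normalization.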
-
d1_common.wrap.access_policy.wrap(access_pyxb, pyxb_binding=None, read_only=False)
Work with the AccessPolicy in a SystemMetadata PyXB object.
- Parameters
access_pyxb – AccessPolicy PyXB object The AccessPolicy to modify.
read_only – bool Do not update the wrapped AccessPolicy.
When only a single AccessPolicy operation is needed, there’s no need to use this context manager. Instead, use the generated context manager wrappers.
-
d1_common.wrap.access_policy.wrap_sysmeta_pyxb(sysmeta_pyxb, pyxb_binding=None, read_only=False)
Work with the AccessPolicy in a SystemMetadata PyXB object.
- Parameters
sysmeta_pyxb – SystemMetadata PyXB object SystemMetadata containing the AccessPolicy to modify.
read_only – bool Do not update the wrapped AccessPolicy.
When only a single AccessPolicy operation is needed, there’s no need to use this context manager. Instead, use the generated context manager wrappers.
There is no clean way in Python to make a context manager that allows client code to replace the object that is passed out of the manager. The AccessPolicy schema does not allow the AccessPolicy element to be empty. However, the SystemMetadata schema specifies the AccessPolicy as optional. By wrapping the SystemMetadata instead of the AccessPolicy when working with AccessPolicy that is within SystemMetadata, the wrapper can handle the situation of empty AccessPolicy by instead dropping the AccessPolicy from the SystemMetadata.
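The design choice described above can be illustrated with a small sketch. It assumes a plain dict as a stand-in for the SystemMetadata object and a hypothetical wrap_parent() context manager; it is not the library's code. Because the manager holds the parent, it can drop the optional child on exit when the child has become empty.

```python
import contextlib


@contextlib.contextmanager
def wrap_parent(sysmeta, read_only=False):
    """Yield an editable copy of the rules held under sysmeta['accessPolicy'].

    ``sysmeta`` is a plain dict standing in for a SystemMetadata object.
    """
    rules = dict(sysmeta.get('accessPolicy') or {})
    yield rules
    if read_only:
        return
    if rules:
        sysmeta['accessPolicy'] = rules
    else:
        # An empty policy is not valid, so drop the optional element instead.
        sysmeta.pop('accessPolicy', None)


sysmeta = {'identifier': 'pid1', 'accessPolicy': {'subj1': 'read'}}
with wrap_parent(sysmeta) as rules:
    rules.clear()
print(sysmeta)
```

Had the manager wrapped only the policy itself, it would have had no way to remove the element from its container.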
-
class d1_common.wrap.access_policy.AccessPolicyWrapper(access_pyxb, pyxb_binding=None)
Bases: object
Wrap an AccessPolicy and provide convenient methods to read, write and update it.
- Parameters
access_pyxb – AccessPolicy PyXB object The AccessPolicy to modify.
-
update()
Update the wrapped AccessPolicy PyXB object with normalized and minimal rules representing current state.
-
get_normalized_pyxb()
- Returns
AccessPolicy PyXB object: Current state of the wrapper as the minimal rules required for correctly representing the perms.
-
get_normalized_perm_list()
- Returns
A minimal, ordered, hashable list of subjects and permissions that represents the current state of the wrapper.
-
get_highest_perm_str(subj_str)
- Parameters
subj_str – str Subject for which to retrieve the highest permission.
- Returns
The highest permission for subject or None if subject does not have any permissions.
-
get_effective_perm_list(subj_str)
- Parameters
subj_str – str Subject for which to retrieve the effective permissions.
- Returns
List of permissions up to and including the highest permission for subject, ordered lower to higher, or empty list if subject does not have any permissions.
E.g.: If ‘write’ is highest permission for subject, return [‘read’, ‘write’].
- Return type
list of str
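As a minimal sketch of the rule described above, assuming the three permissions are kept in an ordered list, the effective permissions are a prefix of that list. effective_perm_list is a hypothetical standalone function that takes the highest permission directly instead of a subject.

```python
ORDERED_PERMS = ['read', 'write', 'changePermission']


def effective_perm_list(highest_perm):
    """Return all permissions implied by highest_perm, lowest first."""
    if highest_perm is None:
        return []
    return ORDERED_PERMS[: ORDERED_PERMS.index(highest_perm) + 1]


print(effective_perm_list('write'))  # ['read', 'write']
```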
-
get_subjects_with_equal_or_higher_perm(perm_str)
- Parameters
perm_str – str Permission: read, write or changePermission.
- Returns
Subjects that have a permission equal to or higher than perm_str.
Since the lowest permission a subject can have is read, passing read will return all subjects.
- Return type
set of str
-
dump()
Dump the current state to debug level log.
-
is_public()
- Returns
bool: True if AccessPolicy allows public read.
-
is_private()
- Returns
bool: True if AccessPolicy does not grant access to any subjects.
-
is_empty()
- Returns
bool: True if AccessPolicy does not grant access to any subjects.
-
are_equivalent_pyxb(access_pyxb)
- Parameters
access_pyxb – AccessPolicy PyXB object with which to compare.
- Returns
True if access_pyxb grants the exact same permissions as the wrapped AccessPolicy.
Differences in how the permissions are represented in the XML docs are handled by transforming to normalized lists before comparison.
- Return type
bool
-
are_equivalent_xml(access_xml)
- Parameters
access_xml – AccessPolicy XML doc with which to compare.
- Returns
True if access_xml grants the exact same permissions as the wrapped AccessPolicy.
Differences in how the permissions are represented in the XML docs are handled by transforming to normalized lists before comparison.
- Return type
bool
-
subj_has_perm(subj_str, perm_str)
- Returns
bool: True if subj_str has perm equal to or higher than perm_str.
-
clear()
Remove AccessPolicy.
Only the rightsHolder set in the SystemMetadata will be able to access the object unless new perms are added after calling this method.
-
add_public_read()
Add public read perm.
Add an allow rule with a read permission for the symbolic subject, public. It is a no-op if any of the existing rules already provide read or higher to public.
-
add_authenticated_read()
Add read perm for all authenticated subj.
Public read is removed if present.
-
add_verified_read()
Add read perm for all verified subj.
Public read is removed if present.
-
add_perm(subj_str, perm_str)
Add a permission for a subject.
- Parameters
subj_str – str Subject for which to add permission(s)
perm_str – str Permission to add. Implicitly adds all lower permissions. E.g., write will also add read.
-
remove_perm(subj_str, perm_str)
Remove permission from a subject.
- Parameters
subj_str – str Subject for which to remove permission(s)
perm_str – str Permission to remove. Implicitly removes all higher permissions. E.g., write will also remove changePermission if previously granted.
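The implicit adding and removing of permissions can be sketched as follows. The sketch assumes that each subject is tracked only by its highest permission and that removing a permission leaves any lower permission in place; both are assumptions for illustration, and the standalone add_perm and remove_perm functions here are hypothetical, not the wrapper's methods.

```python
ORDERED_PERMS = ['read', 'write', 'changePermission']


def add_perm(subj_highest, subj, perm):
    """Raise the subject's highest permission to perm if it is currently lower."""
    current = subj_highest.get(subj)
    if current is None or ORDERED_PERMS.index(perm) > ORDERED_PERMS.index(current):
        subj_highest[subj] = perm


def remove_perm(subj_highest, subj, perm):
    """Remove perm and everything above it, keeping any lower permission."""
    current = subj_highest.get(subj)
    if current is None:
        return
    perm_idx = ORDERED_PERMS.index(perm)
    if ORDERED_PERMS.index(current) < perm_idx:
        return  # The subject never had perm, so there is nothing to remove.
    if perm_idx == 0:
        del subj_highest[subj]  # Nothing is lower than read.
    else:
        subj_highest[subj] = ORDERED_PERMS[perm_idx - 1]


policy = {}
add_perm(policy, 'subj1', 'changePermission')
remove_perm(policy, 'subj1', 'write')
print(policy)  # {'subj1': 'read'}
```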
-
remove_subj(subj_str)
Remove all permissions for subject.
- Parameters
subj_str – str Subject for which to remove all permissions. Since subjects can only be present in the AccessPolicy when they have one or more permissions, this removes the subject itself as well.
The subject may still have access to the obj. E.g.:
The obj has public access.
The subj has indirect access by being in a group which has access.
The subj has an equivalent subj that has access.
The subj is set as the rightsHolder for the object.
-
d1_common.wrap.access_policy.update(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.get_normalized_pyxb(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.get_normalized_perm_list(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.get_highest_perm_str(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.get_effective_perm_list(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.get_subjects_with_equal_or_higher_perm(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.dump(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.is_public(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.is_private(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.is_empty(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.are_equivalent_pyxb(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.are_equivalent_xml(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.subj_has_perm(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.clear(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.add_public_read(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.add_authenticated_read(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.add_verified_read(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.add_perm(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.remove_perm(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.remove_subj(access_pyxb, *args, **kwargs)
-
d1_common.wrap.access_policy.mk_func(func_name)
-
d1_common.wrap.access_policy.method_obj(self)
Update the wrapped AccessPolicy PyXB object with normalized and minimal rules representing current state.
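The uniform (access_pyxb, *args, **kwargs) signatures and the presence of mk_func suggest that the module level functions are generated from the wrapper methods. The following sketch shows one way such generation could work. It is an assumption about the approach, not the module's code, and it uses a toy ListWrapper class and a plain list in place of AccessPolicyWrapper and a PyXB object.

```python
import contextlib


class ListWrapper:
    """Toy stand-in for a wrapper class; holds a list instead of a PyXB object."""

    def __init__(self, items):
        self._items = items

    def add(self, item):
        """Append an item to the wrapped list."""
        self._items.append(item)

    def count(self):
        """Return the number of items in the wrapped list."""
        return len(self._items)


@contextlib.contextmanager
def wrap(items):
    yield ListWrapper(items)


def mk_func(func_name):
    """Build a function that wraps the object, calls one method and returns."""

    def func(items, *args, **kwargs):
        with wrap(items) as wrapper:
            return getattr(wrapper, func_name)(*args, **kwargs)

    func.__name__ = func_name
    func.__doc__ = getattr(ListWrapper, func_name).__doc__
    return func


# Expose the selected methods as module level functions.
for name in ('add', 'count'):
    globals()[name] = mk_func(name)

my_list = ['a']
add(my_list, 'b')
print(count(my_list))  # 2
```

Copying the docstring from the method to the generated function keeps the two forms documented identically.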
d1_common.wrap.simple_xml module
Context manager for simple XML processing.
Example
with d1_common.wrap.simple_xml.wrap(my_xml_str) as xml_wrapper:
    # Read, modify and write the text in an XML element
    text_str = xml_wrapper.get_element_text('my_el')
    xml_wrapper.set_element_text('my_el', '{} more text'.format(text_str))
    # Discard the wrapped XML and replace it with the modified XML. Calling get_xml()
    # is required because context managers cannot replace the object that was passed
    # to the manager, and strings are immutable. If the wrapped XML is needed later,
    # just store another reference to it.
    my_xml_str = xml_wrapper.get_xml()
Notes
Typically, the DataONE Python stack, and any apps based on the stack, process XML using the PyXB bindings for the DataONE XML types. However, in some rare cases, it is necessary to process XML without using PyXB, and this wrapper provides some basic methods for such processing.
Uses include:
Process XML that is not DataONE types, and so does not have PyXB binding.
Process XML that is invalid in such a way that PyXB cannot parse or generate it.
Process XML without causing xs:dateTime fields to be normalized to the UTC time zone (PyXB is based on the XML DOM, which requires such normalization.)
Generate intentionally invalid XML for DataONE types in order to test how MNs, CNs and other components of the DataONE architecture handle and recover from invalid input.
Speed up simple processing, when the performance overhead of converting the documents to and from PyXB objects, with the schema validation and other processing that it entails, would be considered too high.
Usage:
Methods that take el_name and el_idx operate on the element with index el_idx of elements with name el_name. If el_idx is higher than the number of elements with name el_name, SimpleXMLWrapperException is raised.
Though this wrapper does not require XML to validate against the DataONE schemas, it does require that the wrapped XML is well formed and it will only generate well formed XML.
If it’s necessary to process XML that is not well formed, a library such as BeautifulSoup may be required.
In some cases, it may be possible to read or write XML that is not well formed by manipulating the XML directly as a string before wrapping or after generating.
This wrapper is based on the ElementTree module.
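The el_name and el_idx selection described under Usage can be approximated directly with ElementTree. This is a sketch only: the standalone get_element_by_name function and the ElementNotFound exception are hypothetical stand-ins, and the search pattern is an assumption about how elements are matched.

```python
import xml.etree.ElementTree as ET


class ElementNotFound(Exception):
    """Hypothetical stand-in for SimpleXMLWrapperException."""


def get_element_by_name(root_el, el_name, el_idx=0):
    """Return the element at index el_idx among elements named el_name."""
    el_list = root_el.findall('.//{}'.format(el_name))
    try:
        return el_list[el_idx]
    except IndexError:
        raise ElementNotFound(
            'Element not found. name="{}" idx={}'.format(el_name, el_idx)
        )


root_el = ET.fromstring('<doc><item>first</item><item>second</item></doc>')
second_el = get_element_by_name(root_el, 'item', el_idx=1)
second_el.text = '{} more text'.format(second_el.text)
print(ET.tostring(root_el, encoding='unicode'))
```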
-
d1_common.wrap.simple_xml.wrap(xml_str)
Simple processing of XML.
-
class d1_common.wrap.simple_xml.SimpleXMLWrapper(xml_str)
Bases: object
Wrap an XML document and provide convenient methods for performing simple processing on it.
- Parameters
xml_str – str XML document to read, write or modify.
-
parse_xml(xml_str)
-
get_xml(encoding='unicode')
- Returns
str: Current state of the wrapper as XML
-
get_pretty_xml(encoding='unicode')
- Returns
str: Current state of the wrapper as a pretty printed XML string.
-
get_xml_below_element(el_name, el_idx=0, encoding='unicode')
- Parameters
el_name – str Name of element that is the base of the branch to retrieve.
el_idx – int Index of element to use as base in the event that there are multiple sibling elements with the same name.
- Returns
XML fragment rooted at el.
- Return type
str
-
get_element_list_by_name(el_name, namespaces=None)
- Parameters
el_name – str Name of element for which to search.
- Returns
List of elements with name el_name.
If there are no matching elements, an empty list is returned.
- Return type
list
-
get_element_list_by_attr_key(attr_key, namespaces=None)
- Parameters
attr_key – str Name of attribute for which to search
- Returns
List of elements containing an attribute key named attr_key.
If there are no matching elements, an empty list is returned.
- Return type
list
-
get_element_by_xpath(xpath_str, namespaces=None)
- Parameters
xpath_str – str XPath matching the elements for which to search.
- Returns
List of elements matching xpath_str.
If there are no matching elements, an empty list is returned.
- Return type
list
-
get_element_by_name(el_name, el_idx=0)
- Parameters
el_name – str Name of element to get.
el_idx – int Index of element to use as base in the event that there are multiple sibling elements with the same name.
- Returns
The selected element.
- Return type
element
-
get_element_by_attr_key(attr_key, el_idx=0)
- Parameters
attr_key – str Name of attribute for which to search
el_idx – int Index of element to use as base in the event that there are multiple sibling elements with the same name.
- Returns
Element containing an attribute key named attr_key.
-
get_element_text(el_name, el_idx=0)
- Parameters
el_name – str Name of element to use.
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
- Returns
Text of the selected element.
- Return type
str
-
set_element_text(el_name, el_text, el_idx=0)
- Parameters
el_name – str Name of element to update.
el_text – str Text to set for element.
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
-
get_element_text_by_attr_key(attr_key, el_idx=0)
- Parameters
attr_key – str Name of attribute for which to search
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
- Returns
Text of the selected element.
- Return type
str
-
set_element_text_by_attr_key(attr_key, el_text, el_idx=0)
- Parameters
attr_key – str Name of attribute for which to search
el_text – str Text to set for element.
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
-
get_attr_value(attr_key, el_idx=0)
Return the value of the selected attribute in the selected element.
- Parameters
attr_key – str Name of attribute for which to search
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
- Returns
Value of the selected attribute in the selected element.
- Return type
str
-
set_attr_text(attr_key, attr_val, el_idx=0)
Set the value of the selected attribute of the selected element.
- Parameters
attr_key – str Name of attribute for which to search
attr_val – str Text to set for the attribute.
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
-
get_element_dt(el_name, tz=None, el_idx=0)
Return the text of the selected element as a datetime.datetime object.
The element text must be an ISO 8601 formatted datetime.
- Parameters
el_name – str Name of element to use.
tz – datetime.tzinfo Timezone in which to return the datetime.
If dt has timezone: The tz parameter is ignored.
If dt is naive (without timezone): The timezone is set to tz.
tz=None: Prevent naive dt from being set to a timezone. Without a timezone, other contextual information is required in order to determine the exact represented time.
tz=d1_common.date_time.UTC(): Set naive dt to UTC.
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
- Returns
datetime.datetime
-
set_element_dt(el_name, dt, tz=None, el_idx=0)
Set the text of the selected element to an ISO 8601 formatted datetime.
- Parameters
el_name – str Name of element to update.
dt – datetime.datetime Date and time to set
tz – datetime.tzinfo Timezone to set
If dt has timezone: The tz parameter is ignored.
If dt is naive (without timezone): The timezone is set to tz.
tz=None: Prevent naive dt from being set to a timezone. Without a timezone, other contextual information is required in order to determine the exact represented time.
tz=d1_common.date_time.UTC(): Set naive dt to UTC.
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
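The tz rules above can be sketched with the standard library alone. apply_default_tz is a hypothetical helper, and datetime.timezone.utc is used here in place of d1_common.date_time.UTC() so that the sketch runs without the DataONE stack.

```python
import datetime


def apply_default_tz(dt, tz=None):
    """Attach tz to a naive datetime; leave aware datetimes and tz=None alone."""
    if dt.tzinfo is not None or tz is None:
        return dt
    return dt.replace(tzinfo=tz)


naive_dt = datetime.datetime(2020, 1, 2, 3, 4, 5)
aware_dt = datetime.datetime(
    2020, 1, 2, 3, 4, 5, tzinfo=datetime.timezone(datetime.timedelta(hours=-7))
)

# tz=None: the naive datetime stays naive.
print(apply_default_tz(naive_dt).isoformat())
# Naive datetime with tz given: the timezone is set to tz.
print(apply_default_tz(naive_dt, datetime.timezone.utc).isoformat())
# Aware datetime: the tz parameter is ignored.
print(apply_default_tz(aware_dt, datetime.timezone.utc).isoformat())
```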
-
remove_children(el_name, el_idx=0)
Remove any child elements from element.
- Parameters
el_name – str Name of element to update.
el_idx – int Index of element to use in the event that there are multiple sibling elements with the same name.
-
replace_by_etree(root_el, el_idx=0)
Replace element.
Select element that has the same name as root_el, then replace the selected element with root_el.
root_el can be a single element or the root of an element tree.
- Parameters
root_el – element New element that will replace the existing element.
-
replace_by_xml(xml_str, el_idx=0)
Replace element.
Select element that has the same name as the root element of xml_str, then replace the selected element with xml_str.
xml_str must have a single element in the root.
The root element in xml_str can have an arbitrary number of children.
- Parameters
xml_str – str New element that will replace the existing element.
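A rough sketch of this kind of replacement using ElementTree directly is shown below. The standalone replace_by_xml function is hypothetical and is not the wrapper's method. It assumes the element to replace is not the document root, since ElementTree elements carry no reference to their parent and the parent has to be found by scanning.

```python
import xml.etree.ElementTree as ET


def replace_by_xml(root_el, xml_str, el_idx=0):
    """Replace the element at index el_idx among those named like xml_str's root."""
    new_el = ET.fromstring(xml_str)
    # ElementTree elements have no parent pointer, so scan parent/child pairs.
    matches = [
        (parent_el, child_el)
        for parent_el in root_el.iter()
        for child_el in list(parent_el)
        if child_el.tag == new_el.tag
    ]
    parent_el, old_el = matches[el_idx]
    child_pos = list(parent_el).index(old_el)
    new_el.tail = old_el.tail  # Preserve any whitespace that followed the element.
    parent_el.remove(old_el)
    parent_el.insert(child_pos, new_el)


root_el = ET.fromstring('<doc><a>1</a><b><c>old</c></b><d>2</d></doc>')
replace_by_xml(root_el, '<b><c>new</c><e>x</e></b>')
print(ET.tostring(root_el, encoding='unicode'))
```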
-
exception d1_common.wrap.simple_xml.SimpleXMLWrapperException
Bases: Exception