d1_client package

DataONE Client Library.

The /client/index works together with the /common/index to provide functionality commonly needed by client software that connects to DataONE nodes.

The main functionality provided by this library is a complete set of wrappers for all DataONE API methods. There are many details related to interacting with the DataONE API, such as creating MIME multipart messages, encoding parameters into URLs and handling Unicode. The wrappers hide these details, allowing the developer to communicate with nodes by calling native Python methods which take and return native objects.
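As a rough illustration of one of the details mentioned above, the following sketch shows how an identifier containing reserved and non-ASCII characters might be encoded into a REST URL using only the standard library. The build_object_url() helper, the v2/object path layout and the example identifier are assumptions made for this sketch and are not part of d1_client.

```python
from urllib.parse import quote, urlencode


def build_object_url(base_url, pid, **query):
    """Hypothetical helper: build a REST URL for the object identified by pid."""
    # Percent-encode the identifier as UTF-8; safe="" also encodes "/" and ":".
    encoded_pid = quote(pid, safe="")
    url = f"{base_url.rstrip('/')}/v2/object/{encoded_pid}"
    if query:
        # urlencode escapes the query parameter names and values.
        url += "?" + urlencode(query)
    return url


url = build_object_url(
    "https://cn.dataone.org/cn", "doi:10.1234/Ærø", start=0, count=10
)
print(url)
```

The wrappers are described as handling this kind of encoding internally, so client code would normally pass the identifier as an ordinary Python string.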

The wrappers also convert any errors received from the nodes into native exceptions, enabling clients to use Python’s concise exception handling system to handle errors.
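The sketch below illustrates the general idea of handling node errors with ordinary try/except blocks. The exception classes and the fake_get() function are simplified stand-ins written for this example; they are not the classes or methods shipped with the library.

```python
class DataONEException(Exception):
    """Stand-in for a base exception carrying an error code (simplified)."""

    def __init__(self, error_code, description):
        super().__init__(f"{error_code}: {description}")
        self.error_code = error_code
        self.description = description


class NotFound(DataONEException):
    """Stand-in for the error raised when an object does not exist."""


def fake_get(pid):
    """Hypothetical wrapper: pretend the node answered 404 for unknown pids."""
    known = {"pid-1": b"science bytes"}
    if pid not in known:
        raise NotFound(404, f"No object with pid {pid!r}")
    return known[pid]


def fetch_or_none(pid):
    """Return the object bytes, or None when the node reports NotFound."""
    try:
        return fake_get(pid)
    except NotFound:
        return None
```

Catching the specific subclass keeps unrelated failures visible, since any other exception still propagates to the caller.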

Although this directory is not a package, this __init__.py file is required for pytest to be able to reach test directories below this directory.

Submodules

d1_client.baseclient module

class d1_client.baseclient.DataONEBaseClient(base_url='https://cn.dataone.org/cn', *args, **kwargs)

Bases: d1_client.session.Session

Extend Session by adding REST API wrappers for APIs that are available on both Coordinating Nodes and Member Nodes, and that have the same signature on both:

  • CNCore/MNCore.getLogRecords()

  • CNRead/MNRead.get()

  • CNRead/MNRead.getSystemMetadata()

  • CNRead/MNRead.describe()

  • CNRead/MNRead.listObjects()

  • CNAuthorization/MNAuthorization.isAuthorized()

  • CNCore/MNCore.ping()

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html
https://releases.dataone.org/online/api-documentation-v2.0/apis/CN_APIs.html

On error response, raises a DataONEException.

Methods with names that end in “Response” return the HTTPResponse object directly for manual processing by the client. The *Response methods are only needed in rare cases where the default handling is inadequate, e.g., for dealing with nodes that don’t fully comply with the spec.
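The paired-method convention described above can be sketched as follows. FakeClient, FakeResponse and the statusResponse()/status() pair are hypothetical names used only to show the pattern; they do not exist in d1_client.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Stand-in for an HTTP response object."""

    status_code: int
    text: str


class FakeClient:
    """Hypothetical client showing the *Response / plain method pairing."""

    def statusResponse(self):
        # Return the raw response for manual processing by the caller.
        return FakeResponse(200, "<status>ok</status>")

    def status(self):
        # Default handling: check the status code, then decode the body.
        response = self.statusResponse()
        if response.status_code != 200:
            raise RuntimeError(f"Unexpected status {response.status_code}")
        return ET.fromstring(response.text).text


client = FakeClient()
print(client.status())
```

A caller that needs to inspect headers or tolerate a non-compliant body would call the *Response variant and do its own parsing.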

The client classes wrap all the DataONE API methods, hiding the many details related to interacting with the DataONE API, such as creating MIME multipart messages, encoding parameters into URLs and handling Unicode.

The clients allow the developer to communicate with nodes by calling native Python methods which take and return native objects.

The clients also convert any errors received from the nodes into native exceptions, enabling clients to use Python’s concise exception handling system to handle errors.

The clients are arranged into the following class hierarchy:

digraph G {
  dpi = 72;
  edge [dir = back];

  Session -> DataONEBaseClient;

  DataONEBaseClient -> DataONEBaseClient_1_1 [weight=1000];
  DataONEBaseClient -> MemberNodeClient;
  DataONEBaseClient -> CoordinatingNodeClient;

  DataONEBaseClient_1_1 -> MemberNodeClient_1_1;
  DataONEBaseClient_1_1 -> CoordinatingNodeClient_1_1;

  MemberNodeClient -> MemberNodeClient_1_1;
  CoordinatingNodeClient -> CoordinatingNodeClient_1_1;

  MemberNodeClient -> DataONEBaseClient_1_1 [style=invis];
  CoordinatingNodeClient -> DataONEBaseClient_1_1 [style=invis];

  DataONEBaseClient_1_1 -> DataONEBaseClient_2_0 [weight=1000];
  MemberNodeClient_1_1 -> DataONEBaseClient_2_0 [style=invis];
  CoordinatingNodeClient_1_1 -> DataONEBaseClient_2_0 [style=invis];

  DataONEBaseClient_2_0 -> MemberNodeClient_2_0;
  DataONEBaseClient_2_0 -> CoordinatingNodeClient_2_0;

  MemberNodeClient_1_1 -> MemberNodeClient_2_0;
  CoordinatingNodeClient_1_1 -> CoordinatingNodeClient_2_0;

  MemberNodeClient_2_0 -> DataONEClient;
  CoordinatingNodeClient_2_0 -> DataONEClient;
}

The classes without version designators implement functionality defined in v1.0 of the DataONE service specifications. The classes with version designators implement support for the corresponding DataONE service specifications.
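To make the diagram easier to read, the sketch below declares empty stub classes with the same inheritance edges and prints the resulting method resolution order. The stubs only mirror the diagram; the base-class ordering in each class statement is an assumption, and the real classes contain the actual API wrappers.

```python
class Session: ...
class DataONEBaseClient(Session): ...
class DataONEBaseClient_1_1(DataONEBaseClient): ...
class DataONEBaseClient_2_0(DataONEBaseClient_1_1): ...

class MemberNodeClient(DataONEBaseClient): ...
class MemberNodeClient_1_1(DataONEBaseClient_1_1, MemberNodeClient): ...
class MemberNodeClient_2_0(DataONEBaseClient_2_0, MemberNodeClient_1_1): ...

class CoordinatingNodeClient(DataONEBaseClient): ...
class CoordinatingNodeClient_1_1(DataONEBaseClient_1_1, CoordinatingNodeClient): ...
class CoordinatingNodeClient_2_0(DataONEBaseClient_2_0, CoordinatingNodeClient_1_1): ...

class DataONEClient(MemberNodeClient_2_0, CoordinatingNodeClient_2_0): ...


# Python linearizes the diamond-shaped hierarchy with the C3 algorithm.
mro_names = [cls.__name__ for cls in DataONEClient.__mro__]
for name in mro_names:
    print(name)
```

In this stub hierarchy, a method looked up on DataONEClient is resolved in the newest versioned class that defines it before falling back to older versions.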

DataONEBaseClient

The DataONEBaseClient classes contain methods that allow access to APIs that are common to Coordinating Nodes and Member Nodes.

  • d1_client.d1baseclient

  • d1_client.d1baseclient_1_1

  • d1_client.d1baseclient_2_0

MemberNodeClient

The MemberNodeClient classes contain methods that allow access to APIs that are specific to Member Nodes.

  • d1_client.mnclient

  • d1_client.mnclient_1_1

  • d1_client.mnclient_2_0

CoordinatingNodeClient

The CoordinatingNodeClient classes contain methods that allow access to APIs that are specific to Coordinating Nodes.

  • d1_client.cnclient

  • d1_client.cnclient_1_1

  • d1_client.cnclient_2_0

DataONEClient

The DataONEClient uses CN- and MN clients to perform high level operations against the DataONE infrastructure.

  • d1_client.d1client

DataONEObject

Wraps a single DataONE Science Object and adds functionality such as resolve and get.

  • d1_client.d1client

SolrConnection

Provides functionality for working with DataONE’s Solr index, which powers the ONEMercury science data search engine.

  • d1_client.solr_client

__init__(base_url='https://cn.dataone.org/cn', *args, **kwargs)

Create a DataONEBaseClient. See Session for parameters.

Parameters
  • api_major (integer) – Major version of the DataONE API

  • api_minor (integer) – Minor version of the DataONE API

Returns

None

property api_version_tup
property pyxb_binding
getLogRecordsResponse(fromDate=None, toDate=None, event=None, pidFilter=None, idFilter=None, start=0, count=100, vendorSpecific=None)
getLogRecords(fromDate=None, toDate=None, event=None, pidFilter=None, idFilter=None, start=0, count=100, vendorSpecific=None)
pingResponse(vendorSpecific=None)
ping(vendorSpecific=None)
getResponse(pid, stream=False, vendorSpecific=None)
get(pid, stream=False, vendorSpecific=None)

Initiate a MNRead.get(). Return a Requests Response object from which the object bytes can be retrieved.

When stream is False, Requests buffers the entire object in memory before returning the Response. This can exhaust available memory on the local machine when retrieving large science objects. The solution is to set stream to True, which causes the returned Response object to contain a stream. However, see note below.

When stream = True, the Response object will contain a stream which can be processed without buffering the entire science object in memory. However, failure to read all data from the stream can cause connections to be blocked. Due to this, the stream parameter is False by default.

Also see:

get_and_save(pid, sciobj_stream, vendorSpecific=None)

Like MNRead.get(), but also retrieve the object bytes and write them to the provided stream. This method does not have the potential issue with excessive memory usage that get() with stream=False has.

Also see MNRead.get().
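The memory-friendly behavior described for get_and_save() can be approximated by copying the object in fixed-size chunks and reading the source to the end. The sketch below uses io.BytesIO objects as stand-ins for the streamed response body and the destination file; copy_in_chunks() is a hypothetical helper, not a library function.

```python
import io


def copy_in_chunks(source, sink, chunk_size=8192):
    """Copy all bytes from source to sink, one chunk at a time."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            # End of stream: the source has been fully consumed.
            break
        sink.write(chunk)
        total += len(chunk)
    return total


source = io.BytesIO(b"x" * 20000)  # stand-in for a streamed response body
sink = io.BytesIO()                # stand-in for a file opened in binary mode
copied = copy_in_chunks(source, sink)
print(copied)
```

With a real Requests Response obtained with stream=True, iterating over response.iter_content() would play the role of the read loop. Reading until the stream is exhausted matters because, as noted above, unread data can leave the connection blocked.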

getSystemMetadataResponse(pid, vendorSpecific=None)
getSystemMetadata(pid, vendorSpecific=None)
describeResponse(pid, vendorSpecific=None)
describe(pid, vendorSpecific=None)

Note: If the server returns a status code other than 200 OK, a ServiceFailure will be raised, as this method is based on a HEAD request, which cannot carry exception information.

listObjectsResponse(fromDate=None, toDate=None, formatId=None, identifier=None, replicaStatus=None, nodeId=None, start=0, count=100, vendorSpecific=None)
listObjects(fromDate=None, toDate=None, formatId=None, identifier=None, replicaStatus=None, nodeId=None, start=0, count=100, vendorSpecific=None)
generateIdentifierResponse(scheme, fragment=None, vendorSpecific=None)
generateIdentifier(scheme, fragment=None, vendorSpecific=None)
archiveResponse(pid, vendorSpecific=None)
archive(pid, vendorSpecific=None)
isAuthorizedResponse(pid, action, vendorSpecific=None)
isAuthorized(pid, action, vendorSpecific=None)

Return True if user is allowed to perform action on pid, else False.
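The start and count parameters accepted by listObjects() and getLogRecords() lend themselves to slice-based paging. The sketch below shows one way to walk through all pages. It assumes that start is a zero-based offset and that each page reports the total number of matching items; fake_list_objects() is a stand-in for the real call, which returns a typed object rather than a tuple.

```python
def fake_list_objects(start=0, count=100):
    """Stand-in for a paged listing call; returns (items, total)."""
    catalog = [f"pid-{i}" for i in range(250)]
    return catalog[start:start + count], len(catalog)


def iter_all(list_page, page_size=100):
    """Yield every item by requesting successive slices until the total is reached."""
    start = 0
    while True:
        items, total = list_page(start=start, count=page_size)
        yield from items
        start += len(items)
        # Stop on an empty page as well, to avoid looping forever.
        if not items or start >= total:
            break


pids = list(iter_all(fake_list_objects))
print(len(pids))
```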

d1_client.baseclient_1_1 module

class d1_client.baseclient_1_1.DataONEBaseClient_1_1(*args, **kwargs)

Bases: d1_client.baseclient.DataONEBaseClient

Extend DataONEBaseClient with functionality common between Member and Coordinating nodes that was added in v1.1 of the DataONE infrastructure.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html
https://releases.dataone.org/online/api-documentation-v2.0/apis/CN_APIs.html

__init__(*args, **kwargs)

See d1_client.baseclient.DataONEBaseClient for args.

queryResponse(queryEngine, query_str, vendorSpecific=None, do_post=False, **kwargs)

CNRead.query(session, queryEngine, query) → OctetStream https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRead.query

MNQuery.query(session, queryEngine, query) → OctetStream http://jenkins-1.dataone.org/jenkins/job/API%20Documentation%20-%20trunk/ws/api-documentation/build/html/apis/MN_APIs.html#MNQuery.query

Parameters
  • queryEngine

  • query_str

  • vendorSpecific

  • do_post

  • **kwargs

Returns:

query(queryEngine, query_str, vendorSpecific=None, do_post=False, **kwargs)

See Also: queryResponse()

Parameters
  • queryEngine

  • query_str

  • vendorSpecific

  • do_post

  • **kwargs

Returns:

getQueryEngineDescriptionResponse(queryEngine, vendorSpecific=None, **kwargs)

CNRead.getQueryEngineDescription(session, queryEngine) → QueryEngineDescription https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRead.getQueryEngineDescription

MNQuery.getQueryEngineDescription(session, queryEngine) → QueryEngineDescription http://jenkins-1.dataone.org/jenkins/job/API%20Documentation%20-%20trunk/ws/api-documentation/build/html/apis/MN_APIs.html#MNQuery.getQueryEngineDescription

Parameters
  • queryEngine

  • **kwargs

Returns:

getQueryEngineDescription(queryEngine, **kwargs)

See Also: getQueryEngineDescriptionResponse()

Parameters
  • queryEngine

  • **kwargs

Returns:

d1_client.baseclient_1_2 module

class d1_client.baseclient_1_2.DataONEBaseClient_1_2(*args, **kwargs)

Bases: d1_client.baseclient_1_1.DataONEBaseClient_1_1

Extend DataONEBaseClient_1_1 with functionality common between Member and Coordinating nodes that was added in v1.2 of the DataONE infrastructure.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html
https://releases.dataone.org/online/api-documentation-v2.0/apis/CN_APIs.html

__init__(*args, **kwargs)

See d1_client.baseclient.DataONEBaseClient for args.

d1_client.baseclient_2_0 module

class d1_client.baseclient_2_0.DataONEBaseClient_2_0(*args, **kwargs)

Bases: d1_client.baseclient_1_2.DataONEBaseClient_1_2

Extend DataONEBaseClient_1_2 with functionality common between Member and Coordinating nodes that was added in v2.0 of the DataONE infrastructure.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html
https://releases.dataone.org/online/api-documentation-v2.0/apis/CN_APIs.html

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

updateSystemMetadataResponse(pid, sysmeta_pyxb, vendorSpecific=None)

MNStorage.updateSystemMetadata(session, pid, sysmeta) → boolean http://jenkins-1.dataone.org/documentation/unstable/API-Documentation-development/apis/MN_APIs.html#MNStorage.updateSystemMetadata.

Parameters
  • pid

  • sysmeta_pyxb

  • vendorSpecific

Returns:

updateSystemMetadata(pid, sysmeta_pyxb, vendorSpecific=None)

d1_client.cnclient module

class d1_client.cnclient.CoordinatingNodeClient(*args, **kwargs)

Bases: d1_client.baseclient.DataONEBaseClient

Extend DataONEBaseClient by adding REST API wrappers for APIs that are available on Coordinating Nodes.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/CN_APIs.html

__init__(*args, **kwargs)

See d1_client.baseclient.DataONEBaseClient for args.

listFormatsResponse(vendorSpecific=None)

CNCore.ping() → null https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.ping Implemented in d1_client.baseclient.py.

CNCore.create(session, pid, object, sysmeta) → Identifier https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.create CN INTERNAL

CNCore.listFormats() → ObjectFormatList https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.listFormats

Parameters

vendorSpecific

Returns:

listFormats(vendorSpecific=None)

See Also: listFormatsResponse()

Parameters

vendorSpecific

Returns:

getFormatResponse(formatId, vendorSpecific=None)

CNCore.getFormat(formatId) → ObjectFormat https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.getFormat.

Parameters
  • formatId

  • vendorSpecific

Returns:

getFormat(formatId, vendorSpecific=None)

See Also: getFormatResponse()

Parameters
  • formatId

  • vendorSpecific

Returns:

reserveIdentifierResponse(pid, vendorSpecific=None)

CNCore.getLogRecords(session[, fromDate][, toDate][, event][, start][, count]) → Log https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.getLogRecords Implemented in d1_client.baseclient.py.

CNCore.reserveIdentifier(session, pid) → Identifier https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.reserveIdentifier

Parameters
  • pid

  • vendorSpecific

Returns:

reserveIdentifier(pid, vendorSpecific=None)

See Also: reserveIdentifierResponse()

Parameters
  • pid

  • vendorSpecific

Returns:

listChecksumAlgorithmsResponse(vendorSpecific=None)

CNCore.listChecksumAlgorithms() → ChecksumAlgorithmList https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.listChecksumAlgorithms.

Parameters

vendorSpecific

Returns:

listChecksumAlgorithms(vendorSpecific=None)

See Also: listChecksumAlgorithmsResponse()

Parameters

vendorSpecific

Returns:

setObsoletedByResponse(pid, obsoletedByPid, serialVersion, vendorSpecific=None)

CNCore.setObsoletedBy(session, pid, obsoletedByPid, serialVersion) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.setObsoletedBy.

Parameters
  • pid

  • obsoletedByPid

  • serialVersion

  • vendorSpecific

Returns:

setObsoletedBy(pid, obsoletedByPid, serialVersion, vendorSpecific=None)

See Also: setObsoletedByResponse()

Parameters
  • pid

  • obsoletedByPid

  • serialVersion

  • vendorSpecific

Returns:

listNodesResponse(vendorSpecific=None)

CNCore.listNodes() → NodeList https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.listNodes.

Parameters

vendorSpecific

Returns:

listNodes(vendorSpecific=None)

See Also: listNodesResponse()

Parameters

vendorSpecific

Returns:

hasReservationResponse(pid, subject, vendorSpecific=None)

CNCore.registerSystemMetadata(session, pid, sysmeta) → Identifier CN INTERNAL.

CNCore.hasReservation(session, pid) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNCore.hasReservation

Parameters
  • pid

  • subject

  • vendorSpecific

Returns:

hasReservation(pid, subject, vendorSpecific=None)

See Also: hasReservationResponse()

Parameters
  • pid

  • subject

  • vendorSpecific

Returns:

resolveResponse(pid, vendorSpecific=None)

CNRead.get(session, pid) → OctetStream Implemented in d1_client.baseclient.py.

CNRead.getSystemMetadata(session, pid) → SystemMetadata Implemented in d1_client.baseclient.py

CNRead.resolve(session, pid) → ObjectLocationList https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRead.resolve

Parameters
  • pid

  • vendorSpecific

Returns:

resolve(pid, vendorSpecific=None)

See Also: resolveResponse()

Parameters
  • pid

  • vendorSpecific

Returns:

getChecksumResponse(pid, vendorSpecific=None)

CNRead.getChecksum(session, pid) → Checksum https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRead.getChecksum.

Parameters
  • pid

  • vendorSpecific

Returns:

getChecksum(pid, vendorSpecific=None)

See Also: getChecksumResponse()

Parameters
  • pid

  • vendorSpecific

Returns:

searchResponse(queryType, query, vendorSpecific=None, **kwargs)

CNRead.search(session, queryType, query) → ObjectList https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRead.search.

Parameters
  • queryType

  • query

  • vendorSpecific

  • **kwargs

Returns:

search(queryType, query=None, vendorSpecific=None, **kwargs)

See Also: searchResponse()

Parameters
  • queryType

  • query

  • vendorSpecific

  • **kwargs

Returns:

queryResponse(queryEngine, query=None, vendorSpecific=None, **kwargs)

CNRead.query(session, queryEngine, query) → OctetStream https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRead.query.

Parameters
  • queryEngine

  • query

  • vendorSpecific

  • **kwargs

Returns:

query(queryEngine, query=None, vendorSpecific=None, **kwargs)

See Also: queryResponse()

Parameters
  • queryEngine

  • query

  • vendorSpecific

  • **kwargs

Returns:

getQueryEngineDescriptionResponse(queryEngine, vendorSpecific=None, **kwargs)

CNRead.getQueryEngineDescription(session, queryEngine) → QueryEngineDescription https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRead.getQueryEngineDescription.

Parameters
  • queryEngine

  • vendorSpecific

  • **kwargs

Returns:

getQueryEngineDescription(queryEngine, vendorSpecific=None, **kwargs)

See Also: getQueryEngineDescriptionResponse()

Parameters
  • queryEngine

  • vendorSpecific

  • **kwargs

Returns:

setRightsHolderResponse(pid, userId, serialVersion, vendorSpecific=None)

CNAuthorization.setRightsHolder(session, pid, userId, serialVersion) → Identifier https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNAuthorization.setRightsHolder.

Parameters
  • pid

  • userId

  • serialVersion

  • vendorSpecific

Returns:

setRightsHolder(pid, userId, serialVersion, vendorSpecific=None)

See Also: setRightsHolderResponse()

Parameters
  • pid

  • userId

  • serialVersion

  • vendorSpecific

Returns:

setAccessPolicyResponse(pid, accessPolicy, serialVersion, vendorSpecific=None)

CNAuthorization.setAccessPolicy(session, pid, accessPolicy, serialVersion) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNAuthorization.setAccessPolicy.

Parameters
  • pid

  • accessPolicy

  • serialVersion

  • vendorSpecific

Returns:

setAccessPolicy(pid, accessPolicy, serialVersion, vendorSpecific=None)

See Also: setAccessPolicyResponse()

Parameters
  • pid

  • accessPolicy

  • serialVersion

  • vendorSpecific

Returns:

registerAccountResponse(person, vendorSpecific=None)

CNIdentity.registerAccount(session, person) → Subject https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.registerAccount.

Parameters
  • person

  • vendorSpecific

Returns:

registerAccount(person, vendorSpecific=None)

See Also: registerAccountResponse()

Parameters
  • person

  • vendorSpecific

Returns:

updateAccountResponse(subject, person, vendorSpecific=None)

CNIdentity.updateAccount(session, person) → Subject https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.updateAccount.

Parameters
  • subject

  • person

  • vendorSpecific

Returns:

updateAccount(subject, person, vendorSpecific=None)

See Also: updateAccountResponse()

Parameters
  • subject

  • person

  • vendorSpecific

Returns:

verifyAccountResponse(subject, vendorSpecific=None)

CNIdentity.verifyAccount(session, subject) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.verifyAccount.

Parameters
  • subject

  • vendorSpecific

Returns:

verifyAccount(subject, vendorSpecific=None)

See Also: verifyAccountResponse()

Parameters
  • subject

  • vendorSpecific

Returns:

getSubjectInfoResponse(subject, vendorSpecific=None)

CNIdentity.getSubjectInfo(session, subject) → SubjectList https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.getSubjectInfo.

Parameters
  • subject

  • vendorSpecific

Returns:

getSubjectInfo(subject, vendorSpecific=None)

See Also: getSubjectInfoResponse()

Parameters
  • subject

  • vendorSpecific

Returns:

listSubjectsResponse(query, status=None, start=None, count=None, vendorSpecific=None)

CNIdentity.listSubjects(session, query, status, start, count) → SubjectList https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.listSubjects.

Parameters
  • query

  • status

  • start

  • count

  • vendorSpecific

Returns:

listSubjects(query, status=None, start=None, count=None, vendorSpecific=None)

See Also: listSubjectsResponse()

Parameters
  • query

  • status

  • start

  • count

  • vendorSpecific

Returns:

mapIdentityResponse(primarySubject, secondarySubject, vendorSpecific=None)

CNIdentity.mapIdentity(session, subject) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.mapIdentity.

Parameters
  • primarySubject

  • secondarySubject

  • vendorSpecific

Returns:

mapIdentity(primarySubject, secondarySubject, vendorSpecific=None)

See Also: mapIdentityResponse()

Parameters
  • primarySubject

  • secondarySubject

  • vendorSpecific

Returns:

removeMapIdentityResponse(subject, vendorSpecific=None)

CNIdentity.removeMapIdentity(session, subject) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.removeMapIdentity.

Parameters
  • subject

  • vendorSpecific

Returns:

removeMapIdentity(subject, vendorSpecific=None)

See Also: removeMapIdentityResponse()

Parameters
  • subject

  • vendorSpecific

Returns:

denyMapIdentityResponse(subject, vendorSpecific=None)

CNIdentity.denyMapIdentity(session, subject) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.denyMapIdentity.

Parameters
  • subject

  • vendorSpecific

Returns:

denyMapIdentity(subject, vendorSpecific=None)

See Also: denyMapIdentityResponse()

Parameters
  • subject

  • vendorSpecific

Returns:

requestMapIdentityResponse(subject, vendorSpecific=None)

CNIdentity.requestMapIdentity(session, subject) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.requestMapIdentity.

Parameters
  • subject

  • vendorSpecific

Returns:

requestMapIdentity(subject, vendorSpecific=None)

See Also: requestMapIdentityResponse()

Parameters
  • subject

  • vendorSpecific

Returns:

confirmMapIdentityResponse(subject, vendorSpecific=None)

CNIdentity.confirmMapIdentity(session, subject) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.confirmMapIdentity.

Parameters
  • subject

  • vendorSpecific

Returns:

confirmMapIdentity(subject, vendorSpecific=None)

See Also: confirmMapIdentityResponse()

Parameters
  • subject

  • vendorSpecific

Returns:

createGroupResponse(group, vendorSpecific=None)

CNIdentity.createGroup(session, groupName) → Subject https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.createGroup.

Parameters
  • group

  • vendorSpecific

Returns:

createGroup(group, vendorSpecific=None)

See Also: createGroupResponse()

Parameters
  • group

  • vendorSpecific

Returns:

updateGroupResponse(group, vendorSpecific=None)

CNIdentity.addGroupMembers(session, groupName, members) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNIdentity.addGroupMembers.

Parameters
  • group

  • vendorSpecific

Returns:

updateGroup(group, vendorSpecific=None)

See Also: updateGroupResponse()

Parameters
  • group

  • vendorSpecific

Returns:

setReplicationStatusResponse(pid, nodeRef, status, dataoneError=None, vendorSpecific=None)

CNReplication.setReplicationStatus(session, pid, nodeRef, status, failure) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNReplication.setReplicationStatus.

Parameters
  • pid

  • nodeRef

  • status

  • dataoneError

  • vendorSpecific

Returns:

setReplicationStatus(pid, nodeRef, status, dataoneError=None, vendorSpecific=None)

See Also: setReplicationStatusResponse()

Parameters
  • pid

  • nodeRef

  • status

  • dataoneError

  • vendorSpecific

Returns:

updateReplicationMetadataResponse(pid, replicaMetadata, serialVersion, vendorSpecific=None)

CNReplication.updateReplicationMetadata(session, pid, replicaMetadata, serialVersion) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNReplication.updateReplicationMetadata Not implemented.

Parameters
  • pid

  • replicaMetadata

  • serialVersion

  • vendorSpecific

Returns:

updateReplicationMetadata(pid, replicaMetadata, serialVersion, vendorSpecific=None)

See Also: updateReplicationMetadataResponse()

Parameters
  • pid

  • replicaMetadata

  • serialVersion

  • vendorSpecific

Returns:

setReplicationPolicyResponse(pid, policy, serialVersion, vendorSpecific=None)

CNReplication.setReplicationPolicy(session, pid, policy, serialVersion) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNReplication.setReplicationPolicy.

Parameters
  • pid

  • policy

  • serialVersion

  • vendorSpecific

Returns:

setReplicationPolicy(pid, policy, serialVersion, vendorSpecific=None)

See Also: setReplicationPolicyResponse()

Parameters
  • pid

  • policy

  • serialVersion

  • vendorSpecific

Returns:

isNodeAuthorizedResponse(targetNodeSubject, pid, vendorSpecific=None)

CNReplication.isNodeAuthorized(session, targetNodeSubject, pid, replicatePermission) → boolean() https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNReplication.isNodeAuthorized.

Parameters
  • targetNodeSubject

  • pid

  • vendorSpecific

Returns:

isNodeAuthorized(targetNodeSubject, pid, vendorSpecific=None)

See Also: isNodeAuthorizedResponse()

Parameters
  • targetNodeSubject

  • pid

  • vendorSpecific

Returns:

deleteReplicationMetadataResponse(pid, nodeId, serialVersion, vendorSpecific=None)

CNReplication.deleteReplicationMetadata(session, pid, policy, serialVersion) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNReplication.deleteReplicationMetadata.

Parameters
  • pid

  • nodeId

  • serialVersion

  • vendorSpecific

Returns:

deleteReplicationMetadata(pid, nodeId, serialVersion, vendorSpecific=None)

See Also: deleteReplicationMetadataResponse()

Parameters
  • pid

  • nodeId

  • serialVersion

  • vendorSpecific

Returns:

updateNodeCapabilitiesResponse(nodeId, node, vendorSpecific=None)

CNRegister.updateNodeCapabilities(session, nodeId, node) → boolean https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRegister.updateNodeCapabilities.

Parameters
  • nodeId

  • node

  • vendorSpecific

Returns:

updateNodeCapabilities(nodeId, node, vendorSpecific=None)

See Also: updateNodeCapabilitiesResponse()

Parameters
  • nodeId

  • node

  • vendorSpecific

Returns:

registerResponse(node, vendorSpecific=None)

CNRegister.register(session, node) → NodeReference https://releases.dataone.org/online/api-documentation-v2.0.1/apis/CN_APIs.html#CNRegister.register.

Parameters
  • node

  • vendorSpecific

Returns:

register(node, vendorSpecific=None)

See Also: registerResponse()

Parameters
  • node

  • vendorSpecific

Returns:

d1_client.cnclient_1_1 module

class d1_client.cnclient_1_1.CoordinatingNodeClient_1_1(*args, **kwargs)

Bases: d1_client.baseclient_1_1.DataONEBaseClient_1_1, d1_client.cnclient.CoordinatingNodeClient

Extend DataONEBaseClient_1_1 and CoordinatingNodeClient with functionality for Coordinating nodes that was added in v1.1 of the DataONE infrastructure.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/CN_APIs.html

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

d1_client.cnclient_1_2 module

class d1_client.cnclient_1_2.CoordinatingNodeClient_1_2(*args, **kwargs)

Bases: d1_client.baseclient_1_2.DataONEBaseClient_1_2, d1_client.cnclient.CoordinatingNodeClient

Extend DataONEBaseClient_1_2 and CoordinatingNodeClient with functionality for Coordinating nodes that was added in v1.2 of the DataONE infrastructure.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/CN_APIs.html

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

d1_client.cnclient_2_0 module

class d1_client.cnclient_2_0.CoordinatingNodeClient_2_0(*args, **kwargs)

Bases: d1_client.baseclient_2_0.DataONEBaseClient_2_0, d1_client.cnclient_1_2.CoordinatingNodeClient_1_2

Extend DataONEBaseClient_2_0 and CoordinatingNodeClient_1_2 with functionality for Coordinating nodes that was added in v2.0 of the DataONE infrastructure.

Updated in v2:

  • CNCore.listFormats() → ObjectFormatList

  • CNRead.listObjects(session[, fromDate][, toDate][, formatId]

  • MNRead.listObjects(session[, fromDate][, toDate][, formatId]

The base implementations of listFormats() and listObjects() handle v2 when called through this class.

https://releases.dataone.org/online/api-documentation-v2.0/apis/CN_APIs.html

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

deleteResponse(pid)

CNCore.delete(session, id) → Identifier DELETE /object/{id}

Parameters

pid

Returns:

delete(pid)

See Also: deleteResponse()

Parameters

pid

Returns:

synchronizeResponse(pid, vendorSpecific=None)

CNRead.synchronize(session, pid) → boolean POST /synchronize.

Parameters
  • pid

  • vendorSpecific

synchronize(pid, vendorSpecific=None)

See Also: synchronizeResponse()

Parameters
  • pid

  • vendorSpecific

Returns:

viewResponse(theme, did, vendorSpecific=None)
view(theme, did)
listViewsResponse(vendorSpecific=None)
listViews()
echoCredentialsResponse(vendorSpecific=None)
echoCredentials(vendorSpecific=None)
echoSystemMetadataResponse(sysmeta_pyxb, vendorSpecific=None)
echoSystemMetadata(sysmeta_pyxb, vendorSpecific=None)
echoIndexedObjectResponse(queryEngine, sysmeta_pyxb, obj, vendorSpecific=None)
echoIndexedObject(queryEngine, sysmeta_pyxb, obj, vendorSpecific=None)

d1_client.command_line module

Utilities for command line tools that instantiate DataONEClient(), CoordinatingNodeClient(), or MemberNodeClient() objects.

The intention is to both reduce the amount of boilerplate code in command line tools that interact with the DataONE infrastructure and to standardize the behavior of the scripts.

d1_client.command_line.get_standard_arg_parser(description_str=None, formatter_class=<class 'argparse.ArgumentDefaultsHelpFormatter'>, add_base_url=False)

Return an argparse.ArgumentParser populated with a standard set of command line arguments.

Command line tools that interact with the DataONE infrastructure typically instantiate a DataONE Client with all arguments either set to their defaults or specified as command line arguments by the user.

This module makes it convenient for scripts to add a standardized set of command line arguments that allow the user to override the default settings in the DataONE Client as needed.

The script that calls this function will typically add its own specific arguments by making additional parser.add_argument() calls before extracting the command line arguments with args = parser.parse_args().

When creating the DataONE Client, pass the command line arguments to the client via args_adapter().

Parameters
  • description_str – Description of the command. The description is included in the automatically generated help message.

  • formatter_class – Modify the help message format. See the argparse module for available Formatter classes.

  • add_base_url – Require a BaseURL to be provided as a positional command line argument.

    If the script will be creating a CN Client, leave this set to False to enable automatically connecting to the CN in the environment specified by --env, which is the Production environment by default.

    If the script will be creating a MN Client, set this to True to require a MN BaseURL to be specified on the command line.

Returns

Prepopulated with command line arguments that allow overriding DataONEClient defaults.

Return type

argparse.ArgumentParser()

Example

def main():
    parser = d1_client.command_line.get_standard_arg_parser(
        __doc__, add_base_url=True
    )
    parser.add_argument(
        "--my-additional-arg", …
    )
    args = parser.parse_args()
    …
    client = d1_client.cnclient_2_0.CoordinatingNodeClient_2_0(
        **d1_client.command_line.args_adapter(args)
    )
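
The pattern described above, a shared parser that each script extends with its own arguments, can be sketched with argparse alone. This is only an illustration: get_example_arg_parser() is a hypothetical stand-in, and apart from --env the option names are assumptions, not the set that get_standard_arg_parser() actually defines.

```python
import argparse


def get_example_arg_parser(description_str=None, add_base_url=False):
    """Hypothetical stand-in for a shared, pre-populated parser."""
    parser = argparse.ArgumentParser(
        description=description_str,
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # Assumed option names and defaults, for illustration only.
    parser.add_argument("--env", default="prod", help="DataONE environment")
    parser.add_argument("--timeout", type=float, default=60.0, help="Timeout in seconds")
    if add_base_url:
        # A positional argument is required unless nargs says otherwise.
        parser.add_argument("base_url", help="Node BaseURL")
    return parser


# The calling script adds its own arguments before parsing.
parser = get_example_arg_parser("Example tool", add_base_url=True)
parser.add_argument("--my-additional-arg", default="x")
args = parser.parse_args(["https://mn.example.org/mn", "--timeout", "30"])
print(args.base_url, args.env, args.timeout, args.my_additional_arg)
```

Passing an explicit list to parse_args() keeps the sketch independent of the real command line; a script would normally call parse_args() with no arguments.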

d1_client.command_line.args_adapter(args)

Convert a command line arguments object to a dict suitable for passing to a D1Client create call via argument unpacking.

Parameters

args – Object returned from parser.parse_args()

Returns

Arguments valid for passing to a D1Client create call.

Return type

dict

Example

args = parser.parse_args()
…
client = d1_client.cnclient_2_0.CoordinatingNodeClient_2_0(
    **d1_client.command_line.args_adapter(args)
)
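
To show why the result is passed with argument unpacking, here is a minimal sketch of the idea. example_args_adapter(), ExampleClient and the allowed key names are hypothetical; the real args_adapter() may select and rename arguments differently.

```python
import argparse


def example_args_adapter(args, allowed=("base_url", "timeout_sec", "verify_tls")):
    """Hypothetical adapter: keep only the attributes a client accepts."""
    return {
        key: value
        for key, value in vars(args).items()
        if key in allowed and value is not None
    }


class ExampleClient:
    """Hypothetical client that takes its settings as keyword arguments."""

    def __init__(self, base_url="https://cn.dataone.org/cn", **kwargs):
        self.base_url = base_url
        self.kwargs = kwargs


args = argparse.Namespace(
    base_url="https://mn.example.org/mn",
    timeout_sec=30,
    verify_tls=None,          # unset, so the client default applies
    my_additional_arg="x",    # script specific, not passed to the client
)
client = ExampleClient(**example_args_adapter(args))
print(client.base_url, client.kwargs)
```

Filtering the namespace this way keeps script-specific arguments from reaching the client constructor.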

d1_client.command_line.log_setup(is_debug, disable_existing_loggers=False)

Set up a log format that is suitable for writing to the console by command line tools.

d1_client.d1client module

class d1_client.d1client.DataONEClient(*args, **kwargs)

Bases: d1_client.mnclient_2_0.MemberNodeClient_2_0, d1_client.cnclient_2_0.CoordinatingNodeClient_2_0

Perform high level operations against the DataONE infrastructure.

The other Client classes are specific to CN or MN and to architecture version. This class provides a more abstract interface that can be used for interacting with any DataONE node regardless of type and version.

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

create_sciobj(pid, format_id, sciobj, vendor_specific_dict=None, **sysmeta_dict)

Create a Science Object on a Member Node.

Wrapper for MNStorage.create() that includes semi-automatic generation of System Metadata.

Parameters
  • pid – str Persistent Identifier.

  • format_id – str formatId of the Science Object.

  • sciobj – str, bytes or file-like stream. str: Path to file. bytes: Bytes. file-like stream: lxml.etree of XML doc to validate.

  • vendor_specific_dict – dict Pass additional, vendor specific parameters.

  • **sysmeta_dict – dict

    Parameters to customize the System Metadata.

    See also

    d1_common.system_metadata.generate_system_metadata_pyxb()
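
System Metadata records, among other things, the size and checksum of the Science Object. As a rough sketch of what "semi-automatic generation" could involve, the following computes both from a stream using only the standard library. summarize_stream() is a hypothetical helper, and which values generate_system_metadata_pyxb() derives on its own is an assumption here.

```python
import hashlib
import io


def summarize_stream(stream, algorithm="sha1", chunk_size=1024 * 1024):
    """Hypothetical helper: size and checksum of a binary stream."""
    digest = hashlib.new(algorithm)
    size = 0
    # Read in chunks so large objects are never held fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
        size += len(chunk)
    return {"size": size, "checksum": digest.hexdigest(), "algorithm": algorithm}


info = summarize_stream(io.BytesIO(b"hello"))
print(info)
```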

create_sysmeta(pid, format_id, sciobj_stream, **sysmeta_dict)
get_node_id()
d1_client.d1client.get_api_major_by_base_url(base_url='https://cn.dataone.org/cn', *client_arg_list, **client_arg_dict)

Read the Node document from a node and return an int containing the latest D1 API version supported by the node.

The Node document can always be reached through the v1 API and will list services for v1 and any later API versions supported by the node.
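
A minimal sketch of the idea, assuming a simplified, namespace-free Node document in which each service element carries a version attribute such as "v1" or "v2". The sample XML and latest_api_major() are illustrative only; real Node documents are namespaced and contain more detail.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, hypothetical Node document.
NODE_XML = """<node><services>
  <service name="MNCore" version="v1" available="true"/>
  <service name="MNRead" version="v1" available="true"/>
  <service name="MNCore" version="v2" available="true"/>
</services></node>"""


def latest_api_major(node_xml):
    """Return the highest major version found among the listed services."""
    root = ET.fromstring(node_xml)
    majors = []
    for service in root.iter("service"):
        match = re.match(r"v(\d+)", service.get("version", ""))
        if match:
            majors.append(int(match.group(1)))
    if not majors:
        raise ValueError("no versioned services found")
    return max(majors)


print(latest_api_major(NODE_XML))
```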

d1_client.d1client.get_client_type(d1_client_obj)
d1_client.d1client.get_version_tag_by_d1_client(d1_client_obj)
d1_client.d1client.get_client_class_by_version_tag(api_major)

d1_client.mnclient module

class d1_client.mnclient.MemberNodeClient(*args, **kwargs)

Bases: d1_client.baseclient.DataONEBaseClient

Extend DataONEBaseClient by adding REST API wrappers for APIs that are available on Member Nodes.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

getCapabilitiesResponse(vendorSpecific=None)
getCapabilities(vendorSpecific=None)
getChecksumResponse(pid, checksumAlgorithm=None, vendorSpecific=None)
getChecksum(pid, checksumAlgorithm=None, vendorSpecific=None)
synchronizationFailedResponse(message, vendorSpecific=None)
synchronizationFailed(message, vendorSpecific=None)
createResponse(pid, obj, sysmeta_pyxb, vendorSpecific=None)
create(pid, obj, sysmeta_pyxb, vendorSpecific=None)
updateResponse(pid, obj, newPid, sysmeta_pyxb, vendorSpecific=None)
update(pid, obj, newPid, sysmeta_pyxb, vendorSpecific=None)
deleteResponse(pid, vendorSpecific=None)
delete(pid, vendorSpecific=None)
systemMetadataChangedResponse(pid, serialVersion, dateSysMetaLastModified, vendorSpecific=None)
systemMetadataChanged(pid, serialVersion, dateSysMetaLastModified, vendorSpecific=None)
replicateResponse(sysmeta_pyxb, sourceNode, vendorSpecific=None)
replicate(sysmeta_pyxb, sourceNode, vendorSpecific=None)
getReplicaResponse(pid, vendorSpecific=None)
getReplica(pid, vendorSpecific=None)

d1_client.mnclient_1_1 module

class d1_client.mnclient_1_1.MemberNodeClient_1_1(*args, **kwargs)

Bases: d1_client.baseclient_1_1.DataONEBaseClient_1_1, d1_client.mnclient.MemberNodeClient

Extend DataONEBaseClient_1_1 and MemberNodeClient with functionality for Member nodes that was added in v1.1 of the DataONE infrastructure.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

d1_client.mnclient_1_2 module

class d1_client.mnclient_1_2.MemberNodeClient_1_2(*args, **kwargs)

Bases: d1_client.baseclient_1_2.DataONEBaseClient_1_2, d1_client.mnclient.MemberNodeClient

Extend DataONEBaseClient_1_2 and MemberNodeClient with functionality for Member nodes that was added in v1.2 of the DataONE infrastructure.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

viewResponse(theme, did, vendorSpecific=None, **kwargs)
view(theme, did, **kwargs)
listViewsResponse(vendorSpecific=None, **kwargs)
listViews(**kwargs)
getPackageResponse(did, packageType='application/bagit-097', vendorSpecific=None, **kwargs)
getPackage(did, packageType='application/bagit-097', **kwargs)

d1_client.mnclient_2_0 module

class d1_client.mnclient_2_0.MemberNodeClient_2_0(*args, **kwargs)

Bases: d1_client.baseclient_2_0.DataONEBaseClient_2_0, d1_client.mnclient_1_2.MemberNodeClient_1_2

Extend DataONEBaseClient_2_0 and MemberNodeClient_1_2 with functionality for Member nodes that was added in v2.0 of the DataONE infrastructure.

For details on how to use these methods, see:

https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html

__init__(*args, **kwargs)

See baseclient.DataONEBaseClient for args.

d1_client.session module

class d1_client.session.Session(base_url='https://cn.dataone.org/cn', cert_pem_path=None, cert_key_path=None, **kwargs_dict)

Bases: object

__init__(base_url='https://cn.dataone.org/cn', cert_pem_path=None, cert_key_path=None, **kwargs_dict)

The Session improves performance by keeping connection related state and allowing it to be reused in multiple API calls to a DataONE Coordinating Node or Member Node. This includes:

  • A connection pool

  • HTTP persistent connections (HTTP/1.1 and keep-alive)

Based on Python Requests:

  • http://docs.python-requests.org/en/master/

  • http://docs.python-requests.org/en/master/user/advanced/#session-objects

Parameters
  • base_url – DataONE Node REST service BaseURL.

  • cert_pem_path (string) – Path to a PEM formatted certificate file. If provided and accepted by the remote node, the subject for which the certificate was issued is added to the authenticated context in which API calls are made by the client. Equivalent subjects and group subjects may be implicitly included as well. If the certificate is used together with a JWT token, the two sets of subjects are combined.

  • cert_key_path (string) – Path to a PEM formatted file that contains the private key for the certificate file. Only required if the certificate file does not itself contain the private key.

  • jwt_token (string) – Base64 encoded JSON Web Token. If provided and accepted by the remote node, the subject for which the token was issued is added to the authenticated context in which API calls are made by the client. Equivalent subjects and group subjects may be implicitly included as well. If the token is used together with an X.509 certificate, the two sets of subjects are combined.

  • timeout_sec (float, int, None) – Time in seconds that requests will wait for a response. None, 0, 0.0 disables timeouts. Default is DEFAULT_HTTP_TIMEOUT, currently 60 seconds.

  • try_count (int) – Set number of times to try a request before failing. If not set, retries are still performed, using the default number of retries. To disable retries, set to 1.

  • headers (dictionary) – Headers that will be included with all connections.

  • query (dictionary) – URL query parameters that will be included with all connections.

  • use_stream (bool) – Use streaming response. When enabled, responses must be completely read to free up the connection for reuse. (default: False)

  • verify_tls (bool or path) – Verify the server side TLS/SSL certificate. (default: True). Can also hold a path that points to a trusted CA bundle.

  • suppress_verify_warnings (bool) – Suppress the warnings issued when verify_tls is set to False.

  • user_agent (str) – Override the default User-Agent string used by d1client.

  • charset (str) – Override the default Charset used by d1client. (default: utf-8)

  • mmp_boundary (str) – By default, boundary strings used in Mime Multipart (MMP) documents are automatically generated as required. If provided, this string will be used instead. This is typically required for creating reproducible test results and may be required by non-compliant MMP parsers.

Returns

None

property base_url
property auth_subj_tup

This property contains the DataONE subjects for which connections created by the client may be authenticated on the remote node.

Returns

primary subject string, equivalent identities set

  • If a certificate was passed when the client was created:

    • primary subject string: Extracted from the certificate DN

    • equivalent identities set: group memberships and inferred symbolic subjects extracted from the SubjectInfo (if present).

    • All returned subjects are DataONE compliant serializations.

    • A copy of the primary subject is always included in the set of equivalent identities.

  • If a certificate was not passed when the client was created:

    Both the primary subject string and the equivalent identities set contain the DataONE public symbolic subject.

Return type

2-tuple

GET(rest_path_list, **kwargs)

Send a GET request. See requests.sessions.request for optional parameters.

Returns

Response object

HEAD(rest_path_list, **kwargs)

Send a HEAD request. See requests.sessions.request for optional parameters.

Returns

Response object

POST(rest_path_list, **kwargs)

Send a POST request with optional streaming multipart encoding. See requests.sessions.request for optional parameters. To post regular data, pass a string, iterator or generator as the data argument. To post a multipart stream, pass a dictionary of multipart elements as the fields argument. E.g.:

fields = {
    'field0': 'value',
    'field1': 'value',
    'field2': ('filename.xml', open('file.xml', 'rb'), 'application/xml'),
}

Returns

Response object
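
To make the shape of the fields dictionary concrete, the sketch below encodes such a dictionary into a multipart/form-data body with a fixed boundary (compare the mmp_boundary parameter). encode_multipart() is a hypothetical, non-streaming illustration; the library streams its multipart bodies, and its exact output is not reproduced here.

```python
import io


def encode_multipart(fields, boundary):
    """Hypothetical encoder: build a multipart/form-data body in memory."""
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        if isinstance(value, tuple):
            # File element: (filename, binary stream, content type).
            filename, stream, content_type = value
            lines.append(
                f'Content-Disposition: form-data; name="{name}"; '
                f'filename="{filename}"'.encode("utf-8")
            )
            lines.append(f"Content-Type: {content_type}".encode("utf-8"))
            payload = stream.read()
        else:
            # Plain string element.
            lines.append(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"))
            payload = value.encode("utf-8")
        lines.append(b"")  # blank line separates part headers from content
        lines.append(payload)
    lines.append(delimiter + b"--")  # closing delimiter
    lines.append(b"")
    return b"\r\n".join(lines)


fields = {
    "field0": "value",
    "field2": ("filename.xml", io.BytesIO(b"<a/>"), "application/xml"),
}
body = encode_multipart(fields, "example-boundary")
print(body.decode("utf-8"))
```

A fixed boundary makes the body byte-for-byte reproducible, which is useful when comparing requests in tests.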

PUT(rest_path_list, **kwargs)

Send a PUT request with optional streaming multipart encoding. See requests.sessions.request for optional parameters. See POST() for parameters.

Returns

Response object

DELETE(rest_path_list, **kwargs)

Send a DELETE request. See requests.sessions.request for optional parameters.

Returns

Response object

OPTIONS(rest_path_list, **kwargs)

Send an OPTIONS request. See requests.sessions.request for optional parameters.

Returns

Response object

get_curl_command_line(method, url, **kwargs)

Get request as cURL command line for debugging.
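
As an illustration of the general technique, a request can be rendered as a shell-safe cURL command with shlex.quote(). example_curl_command() and its headers parameter are hypothetical; the options that get_curl_command_line() actually emits are not shown here.

```python
import shlex


def example_curl_command(method, url, headers=None):
    """Hypothetical helper: render a request as a shell-safe cURL command."""
    parts = ["curl", "-X", method]
    for name, value in (headers or {}).items():
        parts += ["-H", f"{name}: {value}"]
    parts.append(url)
    # Quote every part so the command can be pasted into a shell as-is.
    return " ".join(shlex.quote(part) for part in parts)


cmd = example_curl_command(
    "GET", "https://cn.dataone.org/cn/v2/node", {"Accept": "text/xml"}
)
print(cmd)
```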

dump_request_and_response(response)

Return a string containing a nicely formatted representation of the request and response objects for logging and debugging.

  • Note: Does not work if the request or response body is a MultipartEncoder object.

d1_client.solr_client module

Basic Solr client.

Based on: http://svn.apache.org/viewvc/lucene/solr/tags/release-1.2.0/client/python/solr.py

DataONE provides an index of all objects stored in the Member Nodes that form the DataONE federation. The index is stored in an Apache Solr database and can be queried with the SolrClient.

The DataONE Solr index provides information only about objects for which the caller has access. When querying the index without authenticating, only records related to public objects can be retrieved. To authenticate, provide a certificate signed by CILogon when creating the client.

Example:

import pprint

import d1_client.solr_client

# Connect to the DataONE Coordinating Nodes in the default (production) environment.
c = d1_client.solr_client.SolrClient()

search_result = c.search(
  q='id:[* TO *]', # Filter for search
  rows=10, # Number of results to return
  fl='formatId', # List of fields to return for each result
)

pprint.pprint(search_result)

class d1_client.solr_client.Param(field, param)

Bases: object

Solr Query Parameter

class d1_client.solr_client.SolrClient(base_url='https://cn.dataone.org/cn', *args, **kwargs)

Bases: d1_client.baseclient_1_2.DataONEBaseClient_1_2

Extend DataONEBaseClient_1_2 with functions for querying Solr indexes hosted on CNs and MNs.

Example:

To connect to DataONE’s production environment:

solr_client = SolrClient()

For the supported keyword args, see:

d1_client.session.Session()

  • Most methods take a **query_dict as a parameter. It allows passing any number of query parameters that will be sent to Solr.

Pass the query parameters as regular keyword arguments. E.g.:

solr_client.search(q=Param('id', 'abc*'), fq=Param('id', 'def*'))

To pass multiple query parameters of the same type, pass a list. E.g., to pass multiple filter query (fq) parameters:

solr_client.search(
    q=Param('id', 'abc*'), fq=[Param('id', 'def*'), Param('id', 'ghi')]
)

  • For more information about DataONE’s Solr index, see:

https://releases.dataone.org/online/api-documentation-v2.0/design/SearchMetadata.html
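
To illustrate how a list value becomes repeated query parameters, the sketch below encodes keyword arguments with urllib.parse.urlencode() and doseq=True. It uses plain "field:value" strings and a hypothetical encode_query() helper; how SolrClient and Param serialize their values internally is not shown here.

```python
from urllib.parse import parse_qs, urlencode


def encode_query(**query_dict):
    """Hypothetical helper: encode keyword arguments as a URL query string."""
    # doseq=True expands a list into one key=value pair per element.
    return urlencode(query_dict, doseq=True)


query_string = encode_query(q="id:abc*", fq=["id:def*", "id:ghi"], rows=10)
print(query_string)
print(parse_qs(query_string))
```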

search(**query_dict)

Search the Solr index.

Example:

result_dict = search(q=['id:abc*'], fq=['id:def*', 'id:ghi'])

get(doc_id)

Retrieve the specified document.

get_ids(start=0, rows=1000, **query_dict)

Retrieve a list of identifiers for documents matching the query.

count(**query_dict)

Return the number of entries that match query.

get_field_values(name, maxvalues=-1, sort=True, **query_dict)

Retrieve the unique values for a field, along with their usage counts.

Parameters
  • name (string) – Name of field for which to retrieve values

  • sort – Sort the result

  • maxvalues (int) – Maximum number of values to retrieve. Default is -1, which causes retrieval of all values.

Returns

dict of {fieldname: [[value, count], … ], }
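
Solr's JSON responses report facet counts by default as a flat list that alternates values and counts. A minimal sketch of reshaping such a list into the [[value, count], …] form described above follows; pair_facet_counts() and the sample data are illustrative, and whether the method performs this step internally is an assumption.

```python
def pair_facet_counts(flat):
    """Turn [value1, count1, value2, count2, ...] into [[value, count], ...]."""
    return [[flat[i], flat[i + 1]] for i in range(0, len(flat), 2)]


# Hypothetical facet data in Solr's flat layout.
facet_fields = {"formatId": ["text/csv", 12, "image/png", 3]}
field_values = {
    name: pair_facet_counts(values) for name, values in facet_fields.items()
}
print(field_values)
```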

get_field_min_max(name, **query_dict)

Returns the minimum and maximum values of the specified field. This requires two search calls to the service, each requesting a single value of a single field.

Parameters
  • name (string) – Name of the field

  • q (string) – Query identifying range of records for min and max values

  • fq (string) – Filter restricting range of query

Returns

list of [min, max]
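
The two-call approach can be sketched as one search sorted ascending and one sorted descending, each limited to a single row of a single field. Both field_min_max() and fake_search() below are hypothetical stand-ins that run against an in-memory list instead of a Solr service.

```python
RECORDS = [{"size": 5}, {"size": 42}, {"size": 17}]


def fake_search(fl, rows, sort, **_):
    """Hypothetical stand-in for a Solr search over RECORDS."""
    field, direction = sort.split()
    ordered = sorted(RECORDS, key=lambda r: r[field], reverse=(direction == "desc"))
    return [{fl: record[fl]} for record in ordered[:rows]]


def field_min_max(search, name, **query_dict):
    """Issue two searches, each returning one value of one field."""

    def first_value(direction):
        docs = search(fl=name, rows=1, sort=f"{name} {direction}", **query_dict)
        return docs[0][name] if docs else None

    return [first_value("asc"), first_value("desc")]


print(field_min_max(fake_search, "size"))
```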

field_alpha_histogram(name, n_bins=10, include_queries=True, **query_dict)

Generates a histogram of values from a string field.

Output is: [[low, high, count, query], … ]. Bin edges are determined by equal division of the fields.

delete(doc_id)
delete_by_query(query)
add(**fields)
add_docs(docs)

docs is a list of records, where each record is a dictionary of name:value fields.

commit(waitFlush=True, waitSearcher=True, optimize=False)
class d1_client.solr_client.SolrRecordTransformerBase

Bases: object

Base for Solr record transformers.

Used to transform a Solr search response document into some other form, such as a dictionary or list of values.

transform(record)
class d1_client.solr_client.SolrArrayTransformer(cols=None)

Bases: d1_client.solr_client.SolrRecordTransformerBase

A transformer that returns a list of values for the specified columns.

transform(record)
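
A minimal sketch of the transformation, assuming a record is a dictionary keyed by field name. record_to_row() is a hypothetical helper, and returning None for a missing column is an assumption, not documented behavior.

```python
def record_to_row(record, cols):
    """Return the record's values for the requested columns, in order."""
    # Missing columns yield None here; that choice is an assumption.
    return [record.get(col) for col in cols]


record = {"id": "abc", "size": 3, "formatId": "text/csv"}
row = record_to_row(record, ["id", "formatId", "checksum"])
print(row)
```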
class d1_client.solr_client.SolrSearchResponseIterator(client, page_size=100, max_records=1000, transformer=<d1_client.solr_client.SolrRecordTransformerBase object>, **query_dict)

Bases: object

Performs a search against a Solr index and acts as an iterator to retrieve all the values.

process_row(row)

Override this method in derived classes to reformat the row response.
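
The paging behavior implied by page_size and max_records can be sketched as a generator that fetches one page at a time and stops at the record limit. iter_search() and fetch_page() are hypothetical names; the real iterator issues Solr queries through the client instead of slicing a list.

```python
def iter_search(fetch_page, page_size=100, max_records=1000):
    """Yield rows page by page until the results or max_records run out."""
    yielded = 0
    start = 0
    while yielded < max_records:
        page = fetch_page(start=start, rows=page_size)
        if not page:
            return  # no more results
        for row in page:
            if yielded >= max_records:
                return
            yield row
            yielded += 1
        start += page_size


DATA = list(range(250))


def fetch_page(start, rows):
    """Hypothetical stand-in for one paged search request."""
    return DATA[start:start + rows]


all_rows = list(iter_search(fetch_page, page_size=100, max_records=1000))
limited_rows = list(iter_search(fetch_page, page_size=100, max_records=120))
print(len(all_rows), len(limited_rows))
```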

class d1_client.solr_client.SolrArrayResponseIterator(client, page_size=100, cols=None, **query_dict)

Bases: d1_client.solr_client.SolrSearchResponseIterator

Returns an iterator that operates on a Solr result set.

The output for each document is a list of values for the columns specified in the cols parameter of the constructor.

class d1_client.solr_client.SolrSubsampleResponseIterator(client, q, fq=None, fields='*', page_size=100, n_samples=10000, transformer=<d1_client.solr_client.SolrRecordTransformerBase object>)

Bases: d1_client.solr_client.SolrSearchResponseIterator

Returns a pseudo-random subsample of the result set.

Works by calculating the number of pages required for the entire data set and taking a random sample of pages until n_samples can be retrieved. So pages are random, but records within a page are not.
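
The page selection step can be sketched as follows. sample_pages() is a hypothetical helper showing one way to pick random pages; the iterator's actual selection logic may differ, and a partially filled last page can yield fewer records than requested.

```python
import math
import random


def sample_pages(num_found, page_size=100, n_samples=10000, rng=random):
    """Pick distinct random page indexes that together cover n_samples."""
    n_pages = math.ceil(num_found / page_size)
    pages_needed = min(n_pages, math.ceil(n_samples / page_size))
    # Pages are chosen at random; records inside each page keep their order.
    return rng.sample(range(n_pages), pages_needed)


pages = sample_pages(1050, page_size=100, n_samples=250, rng=random.Random(0))
print(sorted(pages))
```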

class d1_client.solr_client.SolrValuesResponseIterator(client, field, page_size=1000, **query_dict)

Bases: object

Iterates over a Solr get values response.

This returns a list of distinct values for a particular field.

__init__(client, field, page_size=1000, **query_dict)

Initialize.

Parameters
  • client (SolrClient) – An instance of a Solr client to use.

  • field (string) – Name of the field from which to retrieve values.

  • q (string) – The Solr query to restrict results.

  • fq (string) – A filter query; restricts the set of rows that q is applied to.

  • fields (string) – A comma delimited list of field names to return.

  • page_size (int) – Number of rows to retrieve in each call.

d1_client.util module

d1_client.util.normalize_request_response_dump(dump_str)