Data Connection

Connection to Google Cloud Bigtable Data API.

gcloud_bigtable.data_connection.DATA_API_HOST = 'bigtable.googleapis.com'

Data API request host.

class gcloud_bigtable.data_connection.DataConnection(credentials=None)[source]

Bases: object

Connection to Google Cloud Bigtable Data API.

Enables interaction with data in an existing table.

Parameters: credentials (oauth2client.client.OAuth2Credentials or NoneType) – The OAuth2 Credentials to use for this connection.
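
For illustration only, a connection might be built from application default credentials; the GoogleCredentials helper and the create_scoped call below come from oauth2client and are assumptions about the caller's environment, not part of this module:

    from oauth2client.client import GoogleCredentials

    from gcloud_bigtable.data_connection import DataConnection

    # Assumed setup: application default credentials, scoped to the
    # data API scope defined on this class.
    credentials = GoogleCredentials.get_application_default()
    credentials = credentials.create_scoped([DataConnection.SCOPE])
    connection = DataConnection(credentials=credentials)
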
READ_ONLY_SCOPE = 'https://www.googleapis.com/auth/cloud-bigtable.data.readonly'

Read-only scope for data API requests.

SCOPE = 'https://www.googleapis.com/auth/cloud-bigtable.data'

Scope for data API requests.

check_and_mutate_row(table_name, row_key)[source]

Checks a row and conditionally mutates it based on the result of the check.

mutate_row(table_name, row_key)[source]

Mutates a row.

read_modify_write_row(table_name, row_key)[source]

Reads, modifies and writes a row.

read_rows(table_name, row_key=None, row_range=None, filter_=None, allow_row_interleaving=None, num_rows_limit=None, timeout_seconds=10)[source]

Read rows from table.

Streams back the contents of all requested rows, optionally applying the same Reader filter to each. Depending on their size, rows may be broken up across multiple responses, but atomicity of each row will still be preserved.

Note

At most one of row_key and row_range may be set. If neither is set, the read covers all rows in the table.

Parameters:
  • table_name (string) – The name of the table we are reading from. Must be of the form “projects/../zones/../clusters/../tables/..”. Since this is a low-level class, we don’t check this; rather, we expect callers to pass correctly formatted data.
  • row_key (bytes) – (Optional) The key of a single row from which to read.
  • row_range (_generated.bigtable_data_pb2.RowRange) – (Optional) A range of rows from which to read.
  • filter_ (_generated.bigtable_data_pb2.RowFilter) – (Optional) The filter to apply to the contents of the specified row(s). If unset, reads the entire table.
  • allow_row_interleaving (boolean) – (Optional) By default, rows are read sequentially, producing results which are guaranteed to arrive in increasing row order. Setting “allow_row_interleaving” to true allows multiple rows to be interleaved in the response stream, which increases throughput but breaks this guarantee, and may force the client to use more memory to buffer partially-received rows.
  • num_rows_limit (integer) – (Optional) The read will terminate after committing to N rows’ worth of results. The default (zero) is to return all results. Note that if “allow_row_interleaving” is set to true, partial results may be returned for more than N rows. However, only N “commit_row” chunks will be sent.
  • timeout_seconds (integer) – Number of seconds for request time-out. If not passed, defaults to TIMEOUT_SECONDS.
Return type: bigtable_service_messages_pb2.ReadRowsResponse

Returns: The response returned by the backend.
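
A minimal sketch of a bounded read follows; the table path and row keys are hypothetical, the RowRange field names (start_key, end_key) are assumed from the generated v1 data protobuf, and connection is a DataConnection as constructed above:

    from gcloud_bigtable._generated import bigtable_data_pb2

    # Hypothetical table path in the documented form.
    table_name = ('projects/my-project/zones/my-zone/'
                  'clusters/my-cluster/tables/my-table')

    # Assumed RowRange fields (start_key/end_key) from bigtable_data_pb2.
    row_range = bigtable_data_pb2.RowRange(start_key=b'user0000',
                                           end_key=b'user9999')

    # Read at most 100 rows from the range; the result is the backend's
    # ReadRowsResponse protobuf message.
    response = connection.read_rows(table_name, row_range=row_range,
                                    num_rows_limit=100)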

sample_row_keys(table_name, timeout_seconds=10)[source]

Returns a sample of row keys in the table.

The returned row keys will delimit contiguous sections of the table of approximately equal size, which can be used to break up the data for distributed tasks like mapreduces.

Parameters:
  • table_name (string) – The name of the table we are taking the sample from. Must be of the form “projects/../zones/../clusters/../tables/..”. Since this is a low-level class, we don’t check this; rather, we expect callers to pass correctly formatted data.
  • timeout_seconds (integer) – Number of seconds for request time-out. If not passed, defaults to TIMEOUT_SECONDS.
Return type: messages_pb2.SampleRowKeysResponse

Returns: The sample row keys response returned by the backend.
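
As a sketch, using the connection and hypothetical table_name from the examples above, a sample could be taken and used to plan split points for parallel readers:

    # The sampled row keys delimit roughly equal-sized, contiguous
    # sections of the table and can serve as shard boundaries for
    # distributed readers.
    response = connection.sample_row_keys(table_name, timeout_seconds=30)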