Table
User friendly container for Google Cloud Bigtable Table.
class gcloud_bigtable.table.Table(table_id, cluster)
Bases: object
Representation of a Google Cloud Bigtable Table.
Note
We don’t define any properties on a table other than the name. As the proto says, in a request:
The name field of the Table and all of its ColumnFamilies must be left blank, and will be populated in the response.
This leaves only the current_operation and granularity fields. The current_operation is only used for responses while granularity is an enum with only one value.
We can use a Table to:
- create() the table
- rename() the table
- delete() the table
- list_column_families() in the table
Parameters:
- table_id (str) – The ID of the table.
- cluster (cluster.Cluster) – The cluster that owns the table.
client
Getter for table’s client.
Return type: client.Client
Returns: The client that owns this table.
cluster
Getter for table’s cluster.
Return type: cluster.Cluster
Returns: The cluster stored on the table.
column_family(column_family_id, gc_rule=None)
Factory to create a column family associated with this table.
Parameters:
- column_family_id (str) – The ID of the column family. Must be of the form [_a-zA-Z0-9][-_.a-zA-Z0-9]*.
- gc_rule (column_family.GarbageCollectionRule, column_family.GarbageCollectionRuleUnion or column_family.GarbageCollectionRuleIntersection) – (Optional) The garbage collection settings for this column family.
Returns: A column family owned by this table.
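The documented form for column_family_id can be checked locally before a column family is created. The sketch below is only an illustration: is_valid_column_family_id is a hypothetical helper, not part of gcloud_bigtable, and this page does not say whether the library performs such a check itself.

```python
import re

# Pattern copied from the column_family_id description above.
COLUMN_FAMILY_ID_RE = re.compile(r"[_a-zA-Z0-9][-_.a-zA-Z0-9]*")


def is_valid_column_family_id(column_family_id):
    """Return True if the whole string matches the documented form.

    Hypothetical helper for illustration; not a gcloud_bigtable function.
    """
    return COLUMN_FAMILY_ID_RE.fullmatch(column_family_id) is not None
```

For example, "cf1" and "_stats.v2-a" match, while "-bad" does not, because a hyphen is not allowed as the first character.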
create(initial_split_keys=None, timeout_seconds=None)
Creates this table.
Note
Though a _generated.bigtable_table_data_pb2.Table is also allowed (as the table property) in a create table request, we do not support it in this method. As mentioned in the Table docstring, the name is the only useful property in the table proto.
Note
A create request returns a _generated.bigtable_table_data_pb2.Table but we don’t use this response. The proto definition allows for the inclusion of a current_operation in the response, but in example usage so far, it seems the Bigtable API does not return any operation.
Parameters:
- initial_split_keys (list) – (Optional) List of row keys that will be used to initially split the table into several tablets (Tablets are similar to HBase regions). Given two split keys, "s1" and "s2", three tablets will be created, spanning the key ranges: [, s1), [s1, s2), [s2, ).
- timeout_seconds (int) – Number of seconds for request time-out. If not passed, defaults to value set on table.
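The mapping from split keys to tablet key ranges described for initial_split_keys can be sketched in plain Python. This is a local illustration only; it does not call the Bigtable API. tablet_key_ranges and format_range are hypothetical helpers, and the sketch assumes the keys are distinct and already in sorted order.

```python
def tablet_key_ranges(initial_split_keys):
    """Return (start, end) pairs for the tablets implied by the split keys.

    Hypothetical helper. An empty string marks an open boundary, mirroring
    the "[, s1)" and "[s2, )" notation used in the parameter description.
    Assumes the keys are distinct and already sorted.
    """
    boundaries = [""] + list(initial_split_keys) + [""]
    # Pair each boundary with the next one: n keys give n + 1 ranges.
    return list(zip(boundaries[:-1], boundaries[1:]))


def format_range(start, end):
    """Render a range as half-open: start included, end excluded."""
    return "[{}, {})".format(start, end)
```

With the two keys from the description, tablet_key_ranges(["s1", "s2"]) yields three ranges, which format_range renders as [, s1), [s1, s2) and [s2, ).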
delete(timeout_seconds=None)
Delete this table.
Parameters: timeout_seconds (int) – Number of seconds for request time-out. If not passed, defaults to value set on table.
list_column_families(timeout_seconds=None)
List the column families attached to this table.
Parameters: timeout_seconds (int) – Number of seconds for request time-out. If not passed, defaults to value set on table.
Return type: dictionary with string as keys and column_family.ColumnFamily as values
Returns: The column families attached to this table.
Raises: ValueError if the column family name from the response does not agree with the computed name from the column family ID.
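Because list_column_families() returns a dictionary, a caller can compare it against the column families it expects. The sketch below is duck-typed and works with any object exposing that method. It assumes the dictionary keys are column family IDs (the text above only says they are strings), and missing_column_families is a hypothetical helper.

```python
def missing_column_families(table, wanted_ids):
    """Return the IDs in wanted_ids that the table does not report.

    Hypothetical helper. Assumes the keys of the dictionary returned by
    table.list_column_families() are column family IDs.
    """
    existing = table.list_column_families()
    return [cf_id for cf_id in wanted_ids if cf_id not in existing]
```

Any IDs returned could then be passed to column_family() to build the missing families.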
name
Table name used in requests.
Note
This property will not change if table_id does not, but the return value is not cached.
The table name is of the form "projects/../zones/../clusters/../tables/{table_id}"
Return type: str
Returns: The table name.
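The note about caching can be illustrated with a property that rebuilds the name on every access. This is a stand-in sketch, not the library's implementation: TableNameSketch and its cluster_name argument are hypothetical, and the cluster name is assumed to already have the "projects/../zones/../clusters/.." form.

```python
class TableNameSketch(object):
    """Hypothetical stand-in; not the gcloud_bigtable Table class."""

    def __init__(self, table_id, cluster_name):
        self.table_id = table_id
        # Assumed to look like "projects/../zones/../clusters/..".
        self._cluster_name = cluster_name

    @property
    def name(self):
        # Recomputed on each access instead of being cached, so the value
        # follows table_id if it is reassigned.
        return self._cluster_name + "/tables/" + self.table_id
```

Repeated reads return equal strings as long as table_id is unchanged, which matches the behavior the note describes.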
read_row(row_key, filter=None, timeout_seconds=None)
Read a single row from this table.
Parameters:
- row_key (bytes) – The key of the row to read from.
- filter (row.RowFilter, row.RowFilterChain, row.RowFilterUnion or row.ConditionalRowFilter) – (Optional) The filter to apply to the contents of the row. If unset, returns the entire row.
- timeout_seconds (int) – Number of seconds for request time-out. If not passed, defaults to value set on table.
Returns: The contents of the row.
Raises: ValueError if a commit row chunk is never encountered.
read_rows(start_key=None, end_key=None, allow_row_interleaving=None, limit=None, filter=None, timeout_seconds=None)
Read rows from this table.
Parameters:
- start_key (bytes) – (Optional) The beginning of a range of row keys to read from. The range will include start_key. If left empty, will be interpreted as the empty string.
- end_key (bytes) – (Optional) The end of a range of row keys to read from. The range will not include end_key. If left empty, will be interpreted as an infinite string.
- filter (row.RowFilter, row.RowFilterChain, row.RowFilterUnion or row.ConditionalRowFilter) – (Optional) The filter to apply to the contents of the specified row(s). If unset, reads every column in each row.
- allow_row_interleaving (bool) – (Optional) By default, rows are read sequentially, producing results which are guaranteed to arrive in increasing row order. Setting allow_row_interleaving to True allows multiple rows to be interleaved in the response stream, which increases throughput but breaks this guarantee, and may force the client to use more memory to buffer partially-received rows.
- limit (int) – (Optional) The read will terminate after committing to N rows’ worth of results. The default (zero) is to return all results. Note that if allow_row_interleaving is set to True, partial results may be returned for more than N rows. However, only N commit_row chunks will be sent.
- timeout_seconds (int) – Number of seconds for request time-out. If not passed, defaults to value set on table.
Returns: A PartialRowsData convenience wrapper for consuming the streamed results.
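The key-range and limit semantics described for read_rows can be mimicked on an in-memory dictionary. The sketch below is a local model for reasoning about which rows a call would select. It does not contact Bigtable, it ignores filters and interleaving, and select_rows is a hypothetical helper.

```python
def select_rows(rows, start_key=None, end_key=None, limit=None):
    """Select (key, value) pairs the way the documented range rules describe.

    Hypothetical helper; rows maps row keys (bytes) to row contents.
    """
    start = start_key or b""  # an unset start is the empty string
    selected = []
    for key in sorted(rows):  # increasing row order
        if key < start:
            continue  # start_key itself is included
        if end_key and key >= end_key:
            break  # end_key itself is excluded; unset means unbounded
        selected.append((key, rows[key]))
        if limit and len(selected) >= limit:
            break  # zero or None means no limit
    return selected
```

For rows keyed b"a" through b"d", select_rows(rows, b"b", b"d") keeps b"b" and b"c" only, since the end key is excluded.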
rename(new_table_id, timeout_seconds=None)
Rename this table.
Note
This cannot be used to move tables between clusters, zones, or projects.
Note
The Bigtable Table Admin API currently returns "BigtableTableService.RenameTable is not yet implemented" when this method is used. It’s unclear when this method will actually be supported by the API.
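Parameters:
- new_table_id (str) – The new ID for the table.
- timeout_seconds (int) – Number of seconds for request time-out. If not passed, defaults to value set on table.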
row(row_key)
Factory to create a row associated with this table.
Parameters: row_key (bytes) – The key for the row being created.
Return type: row.Row
Returns: A row owned by this table.
sample_row_keys(timeout_seconds=None)
Read a sample of row keys in the table.
The returned row keys will delimit contiguous sections of the table of approximately equal size, which can be used to break up the data for distributed tasks like mapreduces.
The elements in the iterator are SampleRowKeys responses, each with the properties offset_bytes and row_key. They occur in sorted order. The table might have contents before the first row key in the list and after the last one, but a key containing the empty string indicates “end of table” and will be the last response given, if present.
Note
Row keys in this list may not have ever been written to or read from, and users should therefore not make any assumptions about the row key structure that are specific to their use case.
The offset_bytes field on a response indicates the approximate total storage space used by all rows in the table which precede row_key. Buffering the contents of all rows between two subsequent samples would require space roughly equal to the difference in their offset_bytes fields.
Parameters: timeout_seconds (int) – Number of seconds for request time-out. If not passed, defaults to value set on table.
Return type: grpc.framework.alpha._reexport._CancellableIterator
Returns: A cancel-able iterator. Can be consumed by calling next() or by casting to a list and can be cancelled by calling cancel().
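The offset_bytes arithmetic described above can be sketched with stand-in response objects. Sample is a hypothetical namedtuple that only mirrors the row_key and offset_bytes properties, and section_sizes is a hypothetical helper. Real responses would come from the iterator returned by sample_row_keys().

```python
from collections import namedtuple

# Hypothetical stand-in for a SampleRowKeys response.
Sample = namedtuple("Sample", ["row_key", "offset_bytes"])


def section_sizes(samples):
    """Approximate the bytes stored between each pair of consecutive samples.

    Returns (start_row_key, end_row_key, approximate_bytes) tuples.
    """
    samples = list(samples)  # drain the iterator, as casting to a list would
    return [
        (prev.row_key, curr.row_key, curr.offset_bytes - prev.offset_bytes)
        for prev, curr in zip(samples, samples[1:])
    ]
```

Each size is the difference between two offset_bytes values, which is roughly the buffer space needed to hold every row in that section.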