Reference
newt.db module-level functions
-
newt.db.connection(dsn, **kw)
Create a newt.db.Connection.
Keyword options can be used to provide either ZODB.DB options or RelStorage options.
-
newt.db.DB(dsn, **kw)
Create a Newt DB database object.
Keyword options can be used to provide either ZODB.DB options or RelStorage options.
A Newt DB object is a thin wrapper around ZODB.DB objects. When its open method is called, it returns newt.db.Connection objects.
-
newt.db.storage(dsn, keep_history=False, transform=None, auxiliary_tables=(), **kw)
Create a RelStorage storage using the newt PostgreSQL adapter.
Keyword options can be used to provide either ZODB.DB options or RelStorage options.
-
newt.db.pg_connection(dsn, driver_name='auto')
Create a PostgreSQL (not newt) database connection.
This function should be used rather than, for example, calling psycopg2.connect, because it can use other Postgres drivers depending on the Python environment and available modules.
-
class newt.db.Connection(connection)
Wrapper for ZODB.Connection.Connection objects.
newt.db.Connection objects provide extra helper methods for searching and for transaction management.
-
abort()
Abort the current transaction.
-
commit()
Commit the current transaction.
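As a rough illustration of how commit and abort might be combined, the sketch below runs a unit of work and commits on success or aborts on failure. The helper name run_in_transaction and the work callable are hypothetical; only commit() and abort() come from this reference.

```python
def run_in_transaction(conn, work):
    """Call work(conn); commit on success, abort if it raises.

    `conn` is assumed to be a newt.db.Connection (any object with
    commit() and abort() behaves the same way here).
    `run_in_transaction` and `work` are hypothetical names for this sketch.
    """
    try:
        result = work(conn)
    except Exception:
        conn.abort()   # discard changes made by the failed unit of work
        raise
    conn.commit()      # keep the changes
    return result
```

A caller would pass a function that modifies objects reachable from the connection, for example run_in_transaction(conn, update_tasks), where update_tasks is the caller's own code.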
-
create_text_index(fname, D=None, C=None, B=None, A=None, config=None)
Set up a newt full-text index.
The create_text_index_sql method is used to compute SQL, which is then executed to set up the index. (This can take a long time on an existing database with many records.)
The SQL is executed against the database associated with the given connection, but a separate connection is used, so its execution is independent of the current transaction.
-
static create_text_index_sql(fname, D=None, C=None, B=None, A=None)
Compute and return SQL to set up a newt text index.
The resulting SQL contains a statement to create a PL/pgSQL function and an index-creation statement that uses it.
The first argument is the name of the function to be generated. The second argument is a single expression or property name or a sequence of expressions or property names. If expressions are given, they will be evaluated against the newt JSON state column. Values consisting of alphanumeric characters (including underscores) are treated as names, and other values are treated as expressions.
Additional arguments, C, B, and A can be used to supply expressions and/or names for text to be extracted with different weights for ranking. See: https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-RANKING
The config argument may be used to specify which text search configuration to use. If not specified, the server-configured default configuration is used.
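The name-versus-expression rule can be made concrete with a small sketch. The check below only mirrors the rule as worded in this reference; it is not the library's own code. The helper names, the index name mytext, and the property names text, summary and title are all hypothetical, and the form of the JSON expression is an assumption.

```python
import re

# Alphanumeric characters and underscores only, per the rule described above.
_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_property_name(value):
    """Return True if `value` would count as a property name, not an expression."""
    return _NAME.fullmatch(value) is not None


def example_index_sql():
    """Hypothetical usage; needs newt.db installed, so it is not called here."""
    from newt.db.search import create_text_index_sql
    return create_text_index_sql(
        "mytext",                            # name of the generated function
        ["text", "state ->> 'summary'"],     # D: one name and one expression
        A="title",                           # highest-weight text
    )
```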
-
query_data(query, *args, **kw)
Query the newt Postgres database for raw data.
Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form %s for positional arguments, or %(NAME)s for keyword arguments.
A sequence of data tuples is returned.
-
search(query, *args, **kw)
Search for newt objects using an SQL query.
Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form %s for positional arguments or %(NAME)s for keyword arguments.
The query results must contain the columns zoid and ghost_pickle. It’s simplest and costs nothing to select all columns (using *) from the newt table.
A sequence of newt objects is returned.
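The two placeholder styles might be used as in the following sketch. The function names and the title key in the JSON state are hypothetical; conn is assumed to be a newt.db.Connection.

```python
def find_by_title(conn, title):
    """Positional style: each %s is filled from the positional arguments."""
    return conn.search(
        "select * from newt where state ->> 'title' = %s",
        title,
    )


def find_by_title_named(conn, title):
    """Keyword style: %(title)s is filled from the keyword argument `title`."""
    return conn.search(
        "select * from newt where state ->> 'title' = %(title)s",
        title=title,
    )
```

Both variants select all columns from the newt table, so the required zoid and ghost_pickle columns are present in the results.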
-
search_batch(query, args, batch_start, batch_size=None)
Query for a batch of newt objects.
Query parameters are provided using the args argument, which may be a tuple or a dictionary. They are inserted into the query where there are placeholders of the form %s for an arguments tuple or %(NAME)s for an arguments dict.
The batch_start and batch_size arguments are used to specify the result batch. An ORDER BY clause should be used to order results.
The total result count and sequence of batch result objects are returned.
The query parameters, args, may be omitted. (In this case, the remaining positional arguments shift left, so the batch_size parameter is left as None and the arguments are re-arranged appropriately. A batch size is still required.) You might use this feature if you pre-inserted data using a database cursor mogrify method.
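A paging helper built on search_batch could look like the sketch below. It assumes batch_start is a zero-based offset into the ordered results, which this reference does not state explicitly. The function name fetch_page and the kind key in the JSON state are hypothetical.

```python
def fetch_page(conn, kind, page, page_size=20):
    """Return (objects, page_count) for a zero-based page number."""
    batch_start = page * page_size           # assumed zero-based offset
    total, objects = conn.search_batch(
        "select * from newt where state ->> 'kind' = %s order by zoid",
        (kind,),                             # args tuple for the %s placeholder
        batch_start,
        page_size,
    )
    page_count = -(-total // page_size)      # ceiling division
    return list(objects), page_count
```

The ORDER BY clause keeps batches stable between calls; without it, consecutive pages could overlap or skip rows.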
-
where(query_tail, *args, **kw)
Query for objects satisfying criteria.
This is a convenience wrapper for the search method. The first argument is SQL text for query criteria to be included in an SQL where clause.
This method simply appends its first argument to:
select * from newt where
and so may also contain code that can be included after a where clause, such as an ORDER BY clause.
Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form %s for positional arguments, or %(NAME)s for keyword arguments.
A sequence of newt objects is returned.
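The way the query tail is combined with the fixed prefix can be sketched as follows. The full_query helper only reproduces the concatenation described above for illustration (the exact whitespace is an assumption); recent_titles is a hypothetical caller, and the title key in the JSON state is likewise hypothetical.

```python
WHERE_PREFIX = "select * from newt where "


def full_query(query_tail):
    """Show the SQL that a given query tail would produce."""
    return WHERE_PREFIX + query_tail


def recent_titles(conn, prefix, limit=10):
    """Use where() with criteria followed by ORDER BY and LIMIT clauses."""
    return conn.where(
        "state ->> 'title' like %(pattern)s order by zoid desc limit %(limit)s",
        pattern=prefix + "%",    # the % wildcard travels as data, not as SQL text
        limit=limit,
    )
```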
-
where_batch(query_tail, args, batch_start, batch_size=None)
Query for a batch of objects satisfying criteria.
Like the where method, this is a convenience wrapper for the search_batch method.
Query parameters are provided using the second, args argument, which may be a tuple or a dictionary. They are inserted into the query where there are placeholders of the form %s for an arguments tuple or %(NAME)s for an arguments dict.
The batch_start and batch_size arguments are used to specify the result batch. An ORDER BY clause should be used to order results.
The total result count and sequence of batch result objects are returned.
The query parameters, args, may be omitted. (In this case, the remaining positional arguments shift left, so the batch_size parameter is left as None and the arguments are re-arranged appropriately. A batch size is still required.) You might use this feature if you pre-inserted data using a database cursor mogrify method.
-
newt.db.search module-level functions
Search API.
It’s assumed that the API is used with an object stored in a RelStorage with a Postgres back end.
-
newt.db.search.where(conn, query_tail, *args, **kw)
Query for objects satisfying criteria.
This is a convenience wrapper for the search function. The first argument after the connection is SQL text for query criteria to be included in an SQL where clause.
This function simply appends that SQL text to:
select * from newt where
and so may also contain code that can be included after a where clause, such as an ORDER BY clause.
Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form %s for positional arguments, or %(NAME)s for keyword arguments.
A sequence of newt objects is returned.
-
newt.db.search.search(conn, query, *args, **kw)
Search for newt objects using an SQL query.
Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form %s for positional arguments or %(NAME)s for keyword arguments.
The query results must contain the columns zoid and ghost_pickle. It’s simplest and costs nothing to select all columns (using *) from the newt table.
A sequence of newt objects is returned.
-
newt.db.search.where_batch(conn, query_tail, args, batch_start, batch_size=None)
Query for a batch of objects satisfying criteria.
Like the where function, this is a convenience wrapper for the search_batch function.
Query parameters are provided using the args argument, which may be a tuple or a dictionary. They are inserted into the query where there are placeholders of the form %s for an arguments tuple or %(NAME)s for an arguments dict.
The batch_start and batch_size arguments are used to specify the result batch. An ORDER BY clause should be used to order results.
The total result count and sequence of batch result objects are returned.
The query parameters, args, may be omitted. (In this case, the remaining positional arguments shift left, so the batch_size parameter is left as None and the arguments are re-arranged appropriately. A batch size is still required.) You might use this feature if you pre-inserted data using a database cursor mogrify method.
-
newt.db.search.search_batch(conn, query, args, batch_start, batch_size=None)
Query for a batch of newt objects.
Query parameters are provided using the args argument, which may be a tuple or a dictionary. They are inserted into the query where there are placeholders of the form %s for an arguments tuple or %(NAME)s for an arguments dict.
The batch_start and batch_size arguments are used to specify the result batch. An ORDER BY clause should be used to order results.
The total result count and sequence of batch result objects are returned.
The query parameters, args, may be omitted. (In this case, the remaining positional arguments shift left, so the batch_size parameter is left as None and the arguments are re-arranged appropriately. A batch size is still required.) You might use this feature if you pre-inserted data using a database cursor mogrify method.
-
newt.db.search.query_data(conn, query, *args, **kw)
Query the newt Postgres database for raw data.
Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form %s for positional arguments, or %(NAME)s for keyword arguments.
A sequence of data tuples is returned.
-
newt.db.search.create_text_index_sql(fname, D=None, C=None, B=None, A=None, config=None)
Compute and return SQL to set up a newt text index.
The resulting SQL contains a statement to create a PL/pgSQL function and an index-creation statement that uses it.
The first argument is the name of the function to be generated. The second argument is a single expression or property name or a sequence of expressions or property names. If expressions are given, they will be evaluated against the newt JSON state column. Values consisting of alphanumeric characters (including underscores) are treated as names, and other values are treated as expressions.
Additional arguments, C, B, and A can be used to supply expressions and/or names for text to be extracted with different weights for ranking. See: https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-RANKING
The config argument may be used to specify which text search configuration to use. If not specified, the server-configured default configuration is used.
-
newt.db.search.create_text_index(conn, fname, D, C=None, B=None, A=None, config=None)
Set up a newt full-text index.
The create_text_index_sql function is used to compute SQL, which is then executed to set up the index. (This can take a long time on an existing database with many records.)
The SQL is executed against the database associated with the given connection, but a separate connection is used, so its execution is independent of the current transaction.
-
newt.db.search.read_only_cursor(conn)
Get a database cursor for reading.
The returned cursor can be used to make PostgreSQL queries and to perform safe SQL generation using the cursor’s mogrify method.
The caller must close the returned cursor after use.
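One way to honor the close-after-use requirement is contextlib.closing, as in this sketch. The helper name render_sql and its get_cursor parameter are hypothetical; get_cursor exists only so the helper can be exercised without a database. The cursor is assumed to expose mogrify(query, args) as psycopg2 cursors do.

```python
from contextlib import closing


def render_sql(conn, query, args, get_cursor=None):
    """Return `query` with `args` safely interpolated by the cursor."""
    if get_cursor is None:
        # Imported lazily so the sketch loads without newt.db installed.
        from newt.db.search import read_only_cursor as get_cursor
    # closing() guarantees cursor.close() runs even if mogrify raises.
    with closing(get_cursor(conn)) as cursor:
        return cursor.mogrify(query, args)
```

The rendered SQL is the kind of pre-inserted query that the batch functions accept when their args parameter is omitted.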
newt.db.follow module-level functions
-
newt.db.follow.updates(conn, start_tid=-1, end_tid=None, batch_limit=100000, internal_batch_size=100, poll_timeout=300)
Create a data-update iterator.
The function returns an iterator of batches, where each batch is an iterator of records. Each record is a triple consisting of an integer transaction id, integer object id and data. A sample use:
>>> import newt.db
>>> import newt.db.follow
>>> connection = newt.db.pg_connection('')
>>> for batch in newt.db.follow.updates(connection):
...     for tid, zoid, data in batch:
...         print(tid, zoid, len(data))
If no end_tid is provided, the iterator will iterate until interrupted.
Parameters:
- conn: A Postgres database connection.
- start_tid: Start tid, expressed as an integer. The iterator starts at the first transaction after this tid.
- end_tid: End tid, expressed as an integer. The iterator stops at this, or at the end of data, whichever is less. If the end tid is None, the iterator will run indefinitely, returning new data as they are committed.
- batch_limit: A soft batch size limit. When a batch reaches this limit, it will end at the next transaction boundary. The purpose of this limit is to limit read-transaction size.
- internal_batch_size: The size of the internal Postgres iterator. Data aren’t loaded from Postgres all at once. Server-side cursors are used and data are loaded from the server in internal_batch_size batches.
- poll_timeout: When no end_tid is specified, this specifies how often to poll for changes. Note that a trigger is created and used to notify the iterator of changes, so changes are detected quickly. The poll timeout is just a backstop.
-
newt.db.follow.get_progress_tid(connection, id)
Get the current progress for a follow client.
Return the last saved integer transaction id for the client, or -1, if one hasn’t been saved before.
A follow client often updates some other data based on the data returned from updates. It may stop and restart later. To do this, it will call set_progress_tid to save its progress and later call get_progress_tid to find where it left off. It can then pass the returned tid as start_tid to updates.
The connection argument must be a PostgreSQL connection string or connection.
The id parameter is used to identify which progress is wanted. This should uniquely identify the client and generally a dotted name (__name__) of the client module is used. This allows multiple clients to have their progress tracked.
-
newt.db.follow.set_progress_tid(connection, id, tid)
Set the current progress for a follow client.
See get_progress_tid.
The connection argument must be a PostgreSQL connection string or connection.
The id argument is a string identifying a client. It should generally be a dotted name (usually __name__) of the client module. It must uniquely identify the client.
The tid argument is the most recently processed transaction id as an int.
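The stop-and-restart pattern described for get_progress_tid and set_progress_tid might be wired together as in this sketch. The function name follow_updates and the handle callback are hypothetical, and the three injectable function parameters exist only so the loop can be tried without a database. Whether an explicit commit is needed after saving progress is not covered here.

```python
def follow_updates(conn, client_id, handle, end_tid=None,
                   updates=None, get_progress_tid=None, set_progress_tid=None):
    """Process updates from where this client left off, saving progress per batch."""
    if updates is None:
        # Imported lazily so the sketch loads without newt.db installed.
        from newt.db import follow
        updates = follow.updates
        get_progress_tid = follow.get_progress_tid
        set_progress_tid = follow.set_progress_tid

    start_tid = get_progress_tid(conn, client_id)   # -1 if nothing saved yet
    for batch in updates(conn, start_tid=start_tid, end_tid=end_tid):
        last_tid = None
        for tid, zoid, data in batch:
            handle(tid, zoid, data)                 # caller-supplied processing
            last_tid = tid
        if last_tid is not None:
            # Batches end at transaction boundaries, so the last tid seen
            # in a batch is a safe restart point.
            set_progress_tid(conn, client_id, last_tid)
```

With end_tid left as None, the loop runs until interrupted, matching the behavior described for updates.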
-
newt.db.follow.listen(dsn, timeout_on_start=False, poll_timeout=300)
Listen for newt database updates.
Returns an iterator that returns integer transaction ids or None values.
The purpose of this function is to determine if there are updates. If transactions are committed very quickly, then not all of them will be returned by the iterator.
None values indicate that poll_timeout seconds have passed since the last update.
Parameters:
- dsn: A Postgres connection string.
- timeout_on_start: Force None to be returned immediately after listening for notifications. This is useful in some special cases to avoid having to time out waiting for changes that happened before the iterator began listening.
- poll_timeout: A timeout after which None is returned if there are no changes. (This is a backstop to PostgreSQL’s notification system.)
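A consumer of the listen iterator has to distinguish transaction ids from None timeouts. The sketch below shows one way to do that; the helper name wait_for_change and the max_timeouts parameter are hypothetical, and the iterator is passed in so the logic can be tried with any iterable.

```python
def wait_for_change(notifications, max_timeouts=3):
    """Return the first transaction id seen, or None after too many timeouts.

    `notifications` is assumed to be the iterator returned by
    newt.db.follow.listen(dsn).
    """
    timeouts = 0
    for tid in notifications:
        if tid is None:                  # a poll timeout passed without changes
            timeouts += 1
            if timeouts >= max_timeouts:
                return None
        else:
            return tid                   # an update was committed
    return None                          # the iterator was exhausted
```

Because the iterator may skip transactions that are committed in quick succession, the returned tid signals that something changed rather than identifying every change.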
newt.db.jsonpickle module-level functions
Convert pickles to JSON
The goal of the conversion is to produce JSON that is useful for indexing, querying and reporting in external systems like Postgres and Elasticsearch.
-
class newt.db.jsonpickle.Jsonifier(skip_class=None, transform=None)
-
__call__(id, data)
Convert data from a ZODB data record to data used by newt.
The data returned is a class name, ghost pickle, and state triple. The state is a JSON-formatted string. The ghost pickle is a binary string that can be used to create a ZODB ghost object.
If there is an error converting data, if the data is empty, or if the skip_class function returns a true value, then (None, None, None) is returned.
Parameters:
- id: A data identifier (e.g. an object id) used when logging errors.
- data: Pickle data to be converted.
-
__init__(skip_class=None, transform=None)
Create a callable for converting database data to Newt JSON.
Parameters:
- skip_class: A callable that will be called with the class name extracted from the data. If the callable returns a true value, then the data won’t be converted to JSON and (None, None, None) is returned. The default, which is available as the skip_class attribute of the Jsonifier class, skips objects from the BTrees package and blobs.
- transform: A function that transforms a record’s state JSON.
  If provided, it should accept a class name and a state string in JSON format.
  The transform function should return a new state string or None. If None is returned, the original state is used.
  If the function returns an empty string, then the Jsonifier will return (None, None, None). In other words, providing a transform that returns an empty string is equivalent to providing a skip_class function that returns True.
  Returning anything other than None or a string is an error and behavior is undefined.
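The three possible return values of a transform (a new string, None, or an empty string) can be seen in a small sketch. The class names app.Secret and app.session.*, the leading-underscore convention, and the function names are all hypothetical.

```python
import json


def strip_private(class_name, state):
    """Transform sketch: drop keys starting with '_' from the state JSON."""
    if class_name == "app.Secret":       # hypothetical class to exclude entirely
        return ""                        # empty string: record is skipped
    data = json.loads(state)
    if not isinstance(data, dict):
        return None                      # None: keep the original state
    cleaned = {k: v for k, v in data.items() if not k.startswith("_")}
    if cleaned == data:
        return None                      # nothing changed, keep the original
    return json.dumps(cleaned)           # new state string


def skip_sessions(class_name):
    """skip_class sketch: skip everything under a hypothetical session package."""
    return class_name.startswith("app.session.")


def make_jsonifier():
    """Hypothetical wiring; needs newt.db installed, so it is not called here."""
    from newt.db.jsonpickle import Jsonifier
    # Passing skip_class presumably replaces the default BTrees/blob filter.
    return Jsonifier(skip_class=skip_sessions, transform=strip_private)
```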
-
class newt.db.jsonpickle.JsonUnpickler(pickle)
Unpickler that returns JSON.
Usage:
>>> import pickle
>>> from newt.db.jsonpickle import JsonUnpickler
>>> apickle = pickle.dumps([1,2])
>>> unpickler = JsonUnpickler(apickle)
>>> json_string = unpickler.load()
>>> unpickler.pos == len(apickle)
True