csvw.db
SQLite as alternative storage backend for a TableGroup’s data.
For the most part, translation of a TableGroup’s tableSchema to SQL works as expected:
each table is converted to a CREATE TABLE statement
each column specifies a column in the corresponding CREATE TABLE statement
foreignKey constraints are added according to the corresponding tableSchema property.
List-valued foreignKeys are supported as follows: For each pair of tables related through a list-valued foreign key, an association table is created. To make it possible to distinguish multiple list-valued foreign keys between the same two tables, the association table has a column context, which stores the name of the foreign key column from which a row in the assocation table was created.
Other list-valued columns work in two different ways: If the atomic datatype is string, the specified separator is used to create a concatenated string representation in the database field. Otherwise, the list of values is serialized as JSON.
SQL table and column names can be customized by passing a translator callable when instantiating
a Database
.
SQLite support has the following limitations:
regex constraints on strings (as specified via a
csvw.Datatype
’s format attribute) are not enforced by the database.
- class csvw.db.ColSpec(name, csvw_type='string', separator=None, db_type=None, convert=None, read=None, required=False, csvw=None)[source]
A ColSpec captures sufficient information about a
csvw.Column
for the DB schema.- check(translate)[source]
We try to convert as many data constraints as possible into SQLite CHECK constraints.
- Parameters:
translate (
csvw.db.ColumnTranslator
) – Callable to translate column names between CSVW metadata and DB schema.- Return type:
typing.Optional
[str
]- Returns:
A string suitable as argument of an SQL CHECK constraint.
- class csvw.db.Database(tg, fname=None, translate=None, drop_self_referential_fks=True)[source]
Represents a SQLite database associated with a
csvw.TableGroup
instance.- Parameters:
tg (
csvw.metadata.TableGroup
) – TableGroup instance defining the schema of the database.fname (
typing.Union
[str
,pathlib.Path
,None
]) – Path to which to write the database file.translate (
typing.Optional
[csvw.db.SchemaTranslator
]) – Schema object name translator.drop_self_referential_fks (
typing.Optional
[bool
]) – Flag signaling whether to drop or enforce self-referential foreign-key constraints.
Warning
We write rows of a table to the database sequentially. Since CSVW does not require ordering rows in tables such that self-referential foreign-key constraints are satisfied at each row, we don’t enforce self-referential foreign-keys by default in order to not trigger “false” integrity errors. If data in a CSVW Table is known to be ordered appropriately, False should be passed as drop_self_referential_fks keyword parameter to enforce self-referential foreign-keys.
- association_table_context(table, column, fkey)[source]
Context for association tables is created calling this method.
Note: If a custom value for the context column is created by overwriting this method, select_many_to_many must be adapted accordingly, to make sure the custom context is retrieved when reading the data from the db.
- Parameters:
table –
column –
fkey –
- Returns:
a pair (foreign key, context)
- static name_translator(table, column=None)[source]
A callable with this signature can be passed into DB creation to control the names of the schema objects.
- Parameters:
table (
str
) – CSVW name of the table before translationcolumn (
typing.Optional
[str
]) – CSVW name of a column of table before translation
- Return type:
str
- Returns:
Translated table name if column is None else translated column name
- read()[source]
- Return type:
typing.Dict
[str
,typing.List
[typing.OrderedDict
]]- Returns:
A dict where keys are SQL table names corresponding to CSVW tables and values are lists of rows, represented as dicts where keys are the SQL column names.
- class csvw.db.TableSpec(name, columns=_Nothing.NOTHING, foreign_keys=_Nothing.NOTHING, many_to_many=_Nothing.NOTHING, primary_key=None)[source]
A TableSpec captures sufficient information about a
csvw.Table
for the DB schema.Note
We support “light-weight” many-to-many relationships by allowing list-valued foreign key columns in CSVW. In the database these columns are turned into an associative table, adding the name of the column as value a context column. Thus, multiple columns in a table my be specified as targets of many-to-many relations with the same table.
- classmethod association_table(atable, apk, btable, bpk)[source]
List-valued foreignKeys are supported as follows: For each pair of tables related through a list-valued foreign key, an association table is created. To make it possible to distinguish multiple list-valued foreign keys between the same two tables, the association table has a column context, which stores the name of the foreign key column from which a row in the assocation table was created.
- Return type:
- classmethod from_table_metadata(table, drop_self_referential_fks=True)[source]
Create a TableSpec from the schema description of a csvw.metadata.Table.
- Parameters:
table (
csvw.metadata.Table
) – csvw.metadata.Table instance.drop_self_referential_fks (
typing.Optional
[bool
]) – Flag signaling whether to drop self-referential foreign keys. This may be necessary, if the order of rows in a CSVW table does not guarantee referential integrity when inserted in order (e.g. an eralier row refering to a later one).
- Return type:
- Returns:
TableSpec instance.
- sql(translate)[source]
- Parameters:
translate (
csvw.db.SchemaTranslator
) –- Return type:
str
- Returns:
The SQL statement to create the table.
- csvw.db.insert(db, translate, table, keys, *rows, single=False)[source]
Insert a sequence of rows into a table.
- Parameters:
db (
sqlite3.Connection
) – Database connection.translate (
csvw.db.SchemaTranslator
) – Callable translating table and column names to proper schema object names.table (
str
) – Untranslated table name.keys (
typing.Sequence
[str
]) – Untranslated column names.rows (
list
) – Sequence of rows to insert.single (
typing.Optional
[bool
]) – Flag signaling whether to insert all rows at once using executemany or one at a time, allowing for more focused debugging output in case of errors.
- csvw.db.schema(tg, drop_self_referential_fks=True)[source]
Convert the table and column descriptions of a TableGroup into specifications for the DB schema.
- Parameters:
tg (
csvw.metadata.TableGroup
) – CSVW TableGroup.drop_self_referential_fks (
typing.Optional
[bool
]) – Flag signaling whether to drop self-referential foreign keys. This may be necessary, if the order of rows in a CSVW table does not guarantee referential integrity when inserted in order (e.g. an eralier row refering to a later one).
- Return type:
typing.List
[csvw.db.TableSpec
]- Returns:
A pair (tables, reference_tables).