Editor Integration

Several editor syntax checkers already natively support the output format for a similar tool, sqlint, which we can piggy-back off of by using the "sqlint" reporter.

Specific editors are mentioned below, but generally, if your editor has support for sqlint and can customize the executable, try running squabble --reporter=sqlint instead!

emacs (via flycheck)

The best way to configure squabble through flycheck would be to create a new checker definition, which should Just Work when you open a SQL file with flycheck turned on.

(flycheck-define-checker sql-squabble
  "A SQL syntax checker using the squabble tool.
See URL `https://github.com/erik/squabble'."
  :command ("squabble" "--reporter=sqlint")
  :standard-input t
  :error-patterns
  ((warning line-start "stdin:" line ":" column ":WARNING "
            (message (one-or-more not-newline)
                     (zero-or-more "\n"))
            line-end)
   (error line-start "stdin:" line ":" column ":ERROR "
          (message (one-or-more not-newline)
                   (zero-or-more "\n"))
          line-end))
:modes (sql-mode))

(add-to-list 'flycheck-checkers 'sql-squabble)

Flycheck ships with support for sqlint, so if for whatever reason you don’t want to define a new checker, you should just be able to point flycheck at the squabble executable.

(setq 'flycheck-sql-sqlint-executable "squabble")

Unfortunately flycheck does not allow user customization of the command line arguments passed to the program, so you’ll need to make sure to have configuration file with {"reporter": "sqlint"}.

vim (via syntastic)

let g:syntastic_sql_sqlint_exec = "squabble"
let g:syntastic_sql_sqlint_args = "--reporter=sqlint"

Built-in Rules

Squabble ships with several rules that are focused mostly on preventing unsafe schema migrations. To enable these rules, reference them in your .squabblerc configuration file.

For example:

{
  "rules": {
    "AddColumnDisallowConstraints": {
      "disallowed": ["DEFAULT"]
    },
    "RequirePrimaryKey": {}
  }
}

AddColumnDisallowConstraints

class squabble.rules.add_column_disallow_constraints.AddColumnDisallowConstraints[source]

Bases: squabble.rules.BaseRule

Prevent adding a column with certain constraints to an existing table.

Configuration:

{
    "AddColumnDisallowConstraints": {
        "disallowed": ["DEFAULT", "FOREIGN"]
    }
}
Valid constraint types:
  • DEFAULT
  • NULL
  • NOT NULL
  • FOREIGN
  • UNIQUE
class ConstraintNotAllowed(**kwargs)[source]

Bases: squabble.message.Message

When adding a column to an existing table, certain constraints can have unintentional side effects, like locking the table or introducing performance issues.

For example, adding a DEFAULT constraint may hold a lock on the table while all existing rows are modified to fill in the default value.

A UNIQUE constraint will require scanning the table to confirm there are no duplicates.

On a particularly hot table, a FOREIGN constraint will introduce possibly dangerous overhead to confirm the referential integrity of each row.

DisallowChangeColumnType

class squabble.rules.disallow_change_column_type.DisallowChangeColumnType[source]

Bases: squabble.rules.BaseRule

Prevent changing the type of an existing column.

Configuration:

{ "DisallowChangeColumnType": {} }
class ChangeTypeNotAllowed(**kwargs)[source]

Bases: squabble.message.Message

Trying to change the type of an existing column may hold a full table lock while all of the rows are modified.

Additionally, changing the type of a column may not be backwards compatible with code that has already been deployed.

Instead, try adding a new column with the updated type, and then migrate over.

For example, to migrate a column from type_a to type_b.

ALTER TABLE foo ADD COLUMN bar_new type_b;
UPDATE foo SET bar_new = cast(bar_old, type_b);
-- Deploy server code to point to new column
ALTER TABLE foo DROP COLUMN bar_old;

DisallowForeignKey

class squabble.rules.disallow_foreign_key.DisallowForeignKey[source]

Bases: squabble.rules.BaseRule

Prevent creation of new FOREIGN KEY constraints.

Optionally, can be configured with a list of table names that ARE allowed to create foreign key references.

This rule will check CREATE TABLE and ALTER TABLE statements for foreign keys.

Configuration

{
    "DisallowForeignKey": {
        "excluded": ["table1", "table2"]
    }
}
class DisallowedForeignKeyConstraint(**kwargs)[source]

Bases: squabble.message.Message

Sometimes, foreign keys are not possible, or may cause more overhead than acceptable.

If you’re working with multiple services, each of which with their own database, it’s not possible to create a foreign key reference to a table that exists on another database. In this case, you’ll likely need to rely on your business logic being correct to guarantee referential integrity.

A foreign key constraint requires the database to query the referenced table to ensure that the value exists. On high-traffic, write heavy production instances, this may cause unacceptable overhead on writes.

DisallowFloatTypes

class squabble.rules.disallow_float_types.DisallowFloatTypes[source]

Bases: squabble.rules.BaseRule

Prevent using approximate float point number data types.

In SQL, the types FLOAT, REAL, and DOUBLE PRECISION are implemented as IEEE 754 floating point numbers, which will not be able to perfectly represent all numbers within their ranges.

Often, they’ll be “good enough”, but when doing aggregates over a large table, or trying to store very large (or very small) numbers, errors can be exaggerated.

Most of the time, you’ll probably want to used a fixed-point number, such as NUMERIC(3, 4).

Configuration

{ "DisallowFloatTypes": {} }
class LossyFloatType(**kwargs)[source]

Bases: squabble.message.Message

The types FLOAT, REAL, and DOUBLE PRECISION are implemented as IEEE 754 floating point numbers, which by definition will not be able to perfectly represent all numbers within their ranges.

This is an issue when performing aggregates over large numbers of rows, as errors can accumulate.

Instead, using the fixed-precision numeric data types (NUMERIC, DECIMAL) are likely the right choice for most cases.

DisallowNotIn

class squabble.rules.disallow_not_in.DisallowNotIn[source]

Bases: squabble.rules.BaseRule

Prevent NOT IN as part of queries, due to the unexpected behavior around NULL values.

Configuration:

{ "DisallowNotIn": {} }
class NotInNotAllowed(**kwargs)[source]

Bases: squabble.message.Message

NOT IN (along with any expression containing NOT ... IN) should generally not be used as it behaves in unexpected ways if there is a null present.

-- Always returns 0 rows
SELECT * FROM foo WHERE col NOT IN (1, null);

-- Returns 0 rows if any value of bar.x is null
SELECT * FROM foo WHERE col NOT IN (SELECT x FROM bar);

col IN (1, null) returns TRUE if col=1, and NULL otherwise (i.e. it can never return FALSE).

Since NOT (TRUE) = FALSE, but NOT (NULL) = NULL, it is not possible for this expression to return TRUE.

If you can guarantee that there will never be a null in the list of values, NOT IN is safe to use, but will not be optimized nicely.

DisallowPaddedCharType

class squabble.rules.disallow_padded_char_type.DisallowPaddedCharType[source]

Bases: squabble.rules.BaseRule

Prevent using CHAR(n) data type.

Postgres recommends never using CHAR(n), as any value stored in this type will be padded with spaces to the declared width. This padding wastes space, but doesn’t make operations on any faster; in fact the reverse, thanks to the need to strip spaces in many contexts.

In most cases, the variable length types TEXT or VARCHAR will be more appropriate.

Configuration

{ "DisallowPaddedCharType": {} }
class WastefulCharType(**kwargs)[source]

Bases: squabble.message.Message

Any value stored in this type will be padded with spaces to the declared width. This padding wastes space, but doesn’t make operations on any faster; in fact the reverse, thanks to the need to strip spaces in many contexts.

From a storage point of view, CHAR(n) is not a fixed-width type. The actual number of bytes varies since characters (e.g. unicode) may take more than one byte, and the stored values are therefore treated as variable-length anyway (even though the space padding is included in the storage).

If a maximum length must be enforced in the database, use VARCHAR(n), otherwise, consider using TEXT as a replacement.

DisallowRenameEnumValue

class squabble.rules.disallow_rename_enum_value.DisallowRenameEnumValue[source]

Bases: squabble.rules.BaseRule

Prevent renaming existing enum value.

Configuration:

{ "DisallowChangeEnumValue": {} }
class RenameNotAllowed(**kwargs)[source]

Bases: squabble.message.Message

Renaming an existing enum value may not be backwards compatible with code that is live in production, and may cause errors (either from the database or application) if the old enum value is read or written.

DisallowTimestampPrecision

class squabble.rules.disallow_timestamp_precision.DisallowTimestampPrecision[source]

Bases: squabble.rules.BaseRule

Prevent using TIMESTAMP(p) due to rounding behavior.

For both TIMESTAMP(p) and TIMESTAMP WITH TIME ZONE(p), (as well as the corresponding TIME types) the optional precision parameter p rounds the value instead of truncating.

This means that it is possible to store values that are half a second in the future for p == 0.

To only enforce this rule for certain values of p, set the configuration option allow_precision_greater_than.

Configuration

{ "DisallowTimestampPrecision": {
    "allow_precision_greater_than": 0
  }
}
class NoTimestampPrecision(**kwargs)[source]

Bases: squabble.message.Message

Specifying a fixed precision for TIMESTAMP and TIME types will cause the database to round inserted values (instead of truncating, as one would expect).

This rounding behavior means that some values that get inserted may be in the future, up to half a second with a precision of 0.

Instead, explicitly using date_trunc('granularity', time) may be a better option.

DisallowTimetzType

class squabble.rules.disallow_timetz_type.DisallowTimetzType[source]

Bases: squabble.rules.BaseRule

Prevent using time with time zone, along with CURRENT_TIME.

Postgres recommends never using this type, citing that it’s only implemented for ANSI SQL compliance, and that timestamptz / timestamp with time zone is almost always a better solution.

Configuration

{ "DisallowTimetzType": {} }
class NoTimetzType(**kwargs)[source]

Bases: squabble.message.Message

The type time with time zone is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness.

In most cases, a combination of date, time, timestamp without time zone, and timestamp with time zone should provide a complete range of date/time functionality required by any application.

class NoCurrentTime(**kwargs)[source]

Bases: squabble.message.Message

CURRENT_TIME returns a time with time zone type, which is likely not what you want.

In most cases, CURRENT_TIMESTAMP is the correct replacement.

Some other options:

  • CURRENT_TIMESTAMP, now() - timestamp with time zone
  • LOCALTIMESTAMP - timestamp without time zone
  • CURRENT_DATE - date
  • LOCALTIME - time

RequireColumns

class squabble.rules.require_columns.RequireColumns[source]

Bases: squabble.rules.BaseRule

Require that newly created tables have specified columns.

Configuration

{
    "RequireColumns": {
        "required": ["column_foo,column_type", "column_bar"]
    }
}

If a column type is specified (like column_foo in the example configuration), the linter will make sure that the types match.

Otherwise, only the presence of the column will be checked.

class MissingRequiredColumn(**kwargs)[source]

Bases: squabble.message.Message

class ColumnWrongType(**kwargs)[source]

Bases: squabble.message.Message

RequireConcurrentIndex

class squabble.rules.require_concurrent_index.RequireConcurrentIndex[source]

Bases: squabble.rules.BaseRule

Require all new indexes to be created with CONCURRENTLY so they won’t block.

By default, tables created in the same file as the index are exempted, since they are known to be empty. This can be changed with the option "include_new_tables": true.

Configuration:

{
    "RequireConcurrentIndex": {
        "include_new_tables": false
    }
}
class IndexNotConcurrent(**kwargs)[source]

Bases: squabble.message.Message

Adding a new index to an existing table may hold a full table lock while the index is being built. On large tables, this may take a long time, so the preferred approach is to create the index concurrently instead.

-- Don't do this
CREATE INDEX users_by_name ON users(name);

-- Try this instead
CREATE INDEX CONCURRENTLY users_by_name ON users(name);

RequireForeignKey

class squabble.rules.require_foreign_key.RequireForeignKey[source]

Bases: squabble.rules.BaseRule

New columns that look like references must have a foreign key constraint.

By default, “looks like” means that the name of the column matches the regex .*_id$, but this is configurable.

CREATE TABLE comments (
  post_id  INT,  -- warning here, this looks like a foreign key,
                 -- but no constraint was given

  -- No warning here
  user_id INT REFERENCES users(id)
)

ALTER TABLE books
  ADD COLUMN author_id INT;  -- warning here

Configuration

{
    "RequireForeignKey": {
        "column_regex": ".*_id$"
    }
}
class MissingForeignKeyConstraint(**kwargs)[source]

Bases: squabble.message.Message

Foreign keys are a good way to guarantee that your database retains referential integrity.

When adding a new column that points to another table, make sure to add a constraint so that the database can check that it points to a valid record.

Foreign keys can either be added when creating a table, or after the fact in the case of adding a new column.

CREATE TABLE admins (user_id INTEGER REFERENCES users(id));

CREATE TABLE admins (
  user_id INTEGER,
  FOREIGN KEY (user_id) REFERENCES users(id)
);

ALTER TABLE admins ADD COLUMN user_id INTEGER REFERENCES users(id);

ALTER TABLE admins ADD FOREIGN KEY user_id REFERENCES users(id);

RequirePrimaryKey

class squabble.rules.require_primary_key.RequirePrimaryKey[source]

Bases: squabble.rules.BaseRule

Require that all new tables specify a PRIMARY KEY constraint.

Configuration:

{ "RequirePrimaryKey": {} }
class MissingPrimaryKey(**kwargs)[source]

Bases: squabble.message.Message

When creating a new table, it’s usually a good idea to define a primary key, as it can guarantee a unique, fast lookup into the table.

If no single column will uniquely identify a row, creating a composite primary key is also possible.

CREATE TABLE users (email VARCHAR(128) PRIMARY KEY);

-- Also valid
CREATE TABLE users (
  email VARCHAR(128),
  -- ...
  PRIMARY KEY(email)
);

Writing Plugins

Squabble supports loading rule definitions from directories specified in the .squabblerc configuration file.

Every Python file in the list of directories will be loaded and any classes that inherit from squabble.rules.BaseRule will be registered and available for use.

Configuration

{
  "plugins": [
    "/path/to/plugins/",
    ...
  ]
}

Concepts

Rules

Rules are classes which inherit from squabble.rules.BaseRule and are responsible for checking the abstract syntax tree of a SQL file.

At a minimum, each rule will define def enable(self, root_context, config), which is responsible for doing any initialization when the rule is enabled.

Rules register callback functions to trigger when certain nodes of the abstract syntax tree are hit. Rules will report messages to indicate any issues discovered.

For example

class MyRule(squabble.rules.BaseRule):
    def enable(self, context, config):
        ...

Could be configured with this .squabblerc

{"rules": {"MyRule": {"foo": "bar"}}}

enable() would be passed config={"foo": "bar"}.

Messages

Messages inherit from squabble.message.Message, and are used to define specific kinds of lint exceptions a rule can uncover.

At a bare minimum, each message class needs a TEMPLATE class variable, which is used when formatting the message to be printed on the command line.

For example

class BadlyNamedColumn(squabble.message.Message):
    """
    Here are some more details about ``BadlyNamedColumn``.

    This is where you would explain why this message is relevant,
    how to resolve it, etc.
    """

    TEMPLATE = 'tried to {foo} when you should have done {bar}!'

>>> msg = MyMessage(foo='abc', bar='xyz')
>>> msg.format()
'tried to abc when you should have done xyz'
>>> msg.explain()
'Here are some more details ...

Messages may also define a CODE class variable, which is an integer which uniquely identifies the class. If not explicitly specified, one will be assigned, starting at 9000. These can be used by the --explain command line flag

$ squabble --explain 9001
BadlyNamedColumn
    Here are some more details about ``BadlyNamedColumn``.

    ...

Context

Each instance of squabble.lint.Context holds the callback functions that have been registered at or below a particular node in the abstract syntax tree, as well as being responsible for reporting any messages that get raised.

When the enable() function for a class inheriting from squabble.rules.BaseRule is called, it will be passed a context pointing to the root node of the syntax tree. Every callback function will be passed a context scoped to the node that triggered the callback.

def enable(root_context, _config):
    root_context.register('CreateStmt', create_table_callback)

def create_table_callback(child_context, node):
    # register a callback that is only scoped to this ``node``
    child_context.register('ColumnDef', column_def_callback):

def column_def_callback(child_context, node):
    ...

Details

  • Parsing is done using libpg_query, a Postgres query parser.
    • theoretically it will work with other SQL dialects
  • Rules are implemented by registering callbacks while traversing the Abstract Syntax Tree of the query.
    • e.g. entering a CREATE TABLE node registers a callback for a column definition node, which checks that the column type is correct.

As a somewhat unfortunate consequence of our reliance on libpg_query, the abstract syntax tree is very, very specific to Postgres. While developing new rules, it will be necessary to reference the Postgres AST Node source listing, or, more readably, the Python bindings.

Example Rule

import squabble.rule
from squabble.lint import Severity
from squabble.message import Message
from squabble.rules import BaseRule

class AllTablesMustBeLoud(BaseRule):
    """
    A custom rule which makes sure that all table names are
    in CAPSLOCK NOTATION.
    """

    class TableNotLoudEnough(Message):
        """Add more details about the message here"""
        CODE = 9876
        TEMPLATE = 'table "{name}" not LOUD ENOUGH'

    def enable(self, root_ctx, config):
        """
        Called before the root AST node is traversed. Here's where
        most callbacks should be registered for different AST
        nodes.

        Each linter is initialized once per file that it is being
        run against. `config` will contain the merged base
        configuration with the file-specific configuration options
        for this linter.
        """

        # Register that any time we see a `CreateStmt`
        # (`CREATE TABLE`), call self._check()
        root_ctx.register('CreateStmt', self._check_create())

        # When we exit the root `ctx`, call `self._on_finish()`
        root_ctx.register_exit(lambda ctx: self._on_finish(ctx))

    # node_visitor will pass in `ctx, node` for you so there's no
    # need to use a lambda
    @squabble.rule.node_visitor
    def _check_create(self, ctx, node):
        """
        Called when we enter a 'CreateStmt' node. Here we can
        register more callbacks if we need to, or do some checking
        based on the `node` which will be the AST representation of
        a `CREATE TABLE`.
        """

        table_name = node.relation.relname.value
        if table_name != table_name.upper():
            # Report an error if this table was not SCREAMING_CASE
            ctx.report(
                self.TableNotLoudEnough(name=table_name),
                node=node.relation,
                severity=Severity.HIGH)

    def _on_finish(self, ctx):
        pass

API

squabble package

exception squabble.SquabbleException[source]

Bases: Exception

Base exception type for all things in this package

exception squabble.RuleConfigurationException(rule, msg)[source]

Bases: squabble.SquabbleException

exception squabble.UnknownRuleException(name)[source]

Bases: squabble.SquabbleException

squabble.cli module

Usage:
squabble [options] [PATHS…] squabble (-h | –help)
Arguments:
PATHS Paths to check. If given a directory, will recursively traverse the
path and lint all files ending in .sql [default: -].
Options:

-h –help Show this screen. -V –verbose Turn on debug level logging. -v –version Show version information.

-x –expanded Show explantions for every raised message.

-c –config=PATH Path to configuration file. -p –preset=PRESETS Comma-separated list of presets to use as a base. -r –reporter=REPORTER Use REPORTER for output rather than one in config.

-e –explain=CODE Show detailed explanation of a message code. –list-presets List available preset configurations. –list-rules List available rules. –show-rule=RULE Show detailed information about RULE.

squabble.cli.main()[source]
squabble.cli.dispatch_args(args)[source]

Handle the command line arguments as parsed by docopt. Calls the subroutine implied by the combination of command line flags and returns the exit status (or None, if successful) of the program.

Note that some exceptional conditions will terminate the program directly.

squabble.cli.run_linter(base_config, paths, expanded)[source]

Run linter against all SQL files contained in paths.

paths may contain both files and directories.

If paths is empty or only contains "-", squabble will read from stdin instead.

If expanded is True, print the detailed explanation of each message after the lint has finished.

squabble.cli.collect_files(paths)[source]

Given a list of files or directories, find all named files as well as any files ending in .sql in the directories.

The return format is a list of tuples containing the file name and file contents.

The value '-' is treated specially as stdin.

squabble.cli.show_rule(name)[source]

Print information about rule named name.

squabble.cli.list_rules()[source]

Print out all registered rules and brief description of what they do.

squabble.cli.explain_message(code)[source]

Print out the more detailed explanation of the given message code.

squabble.cli.list_presets()[source]

Print out all the preset configurations.

squabble.config module

class squabble.config.Config(reporter, plugins, rules)

Bases: tuple

plugins

Alias for field number 1

reporter

Alias for field number 0

rules

Alias for field number 2

exception squabble.config.UnknownPresetException(preset)[source]

Bases: squabble.SquabbleException

Raised when user tries to apply a preset that isn’t defined.

squabble.config.discover_config_location()[source]

Try to locate a config file in some likely locations.

Used when no config path is specified explicitly. In order, this will check for a file named .squabblerc:

  • in the current directory.
  • in the root of the repository (if working in a git repo).
  • in the user’s home directory.
squabble.config.get_base_config(preset_names=None)[source]

Return a basic config value that can be overridden by user configuration files.

Parameters:preset_names – The named presets to use (applied in order), or None
squabble.config.load_config(config_file, preset_names=None, reporter_name=None)[source]

Load configuration from a file, optionally applying a predefined set of rules.

Parameters:
  • config_file (str) – Path to JSON file containing user configuration.
  • preset_name (str) – Preset to use as a base before applying user configuration.
  • reporter_name (str) – Override the reporter named in configuration.
squabble.config.apply_file_config(base, contents)[source]

Given a base configuration object and the contents of a file, return a new config that applies any file-specific rule additions/deletions.

Returns None if the file should be skipped.

squabble.lint module

linting engine

class squabble.lint.LintIssue[source]

Bases: squabble.lint._LintIssue

class squabble.lint.Severity[source]

Bases: enum.Enum

Enumeration describing the relative severity of a LintIssue.

By themselves, these values don’t mean much, but are meant to convey the likely hood that a detected issue is truly problematic. For example, a syntax error in a migration would be CRITICAL, but perhaps a naming inconsistency would be LOW.

LOW = 'LOW'
MEDIUM = 'MEDIUM'
HIGH = 'HIGH'
CRITICAL = 'CRITICAL'
squabble.lint.check_file(config, name, contents)[source]

Return a list of lint issues from using config to lint name.

class squabble.lint.Session(rules, sql_text, file_name)[source]

Bases: object

A run of the linter using a given set of rules over a single file. This class exists mainly to hold the list of issues returned by the enabled rules.

report_issue(issue)[source]
lint()[source]

Run the linter on SQL given in constructor, returning a list of LintIssue discovered.

class squabble.lint.Context(session)[source]

Bases: object

Contains the node tag callback hooks enabled at or below the parent_node passed to the call to traverse.

>>> import pglast
>>> ast = pglast.Node(pglast.parse_sql('''
...   CREATE TABLE foo (id INTEGER PRIMARY KEY);
... '''))
>>> ctx = Context(session=...)
>>>
>>> def create_stmt(child_ctx, node):
...     print('create stmt')
...     child_ctx.register('ColumnDef', lambda _c, _n: print('from child'))
...
>>> ctx.register('CreateStmt', create_stmt)
>>> ctx.register('ColumnDef', lambda _c, _n: print('from root'))
>>> ctx.traverse(ast)
create stmt
from child
from root
traverse(parent_node)[source]

Recursively walk down the AST starting at parent_node.

For every node, call any callback functions registered for that particular node tag.

register_exit(fn)[source]

Register fn to be called when the current node is finished being traversed.

register(node_tag, fn)[source]

Register fn to be called whenever node_tag node is visited.

>>> session = ...
>>> ctx = Context(session)
>>> ctx.register('CreateStmt', lambda ctx, node: ...)
report_issue(issue)[source]
report(message, node=None, severity=None)[source]

Convenience wrapper to create and report a lint issue.

squabble.message module

exception squabble.message.DuplicateMessageCodeException(dupe)[source]

Bases: squabble.SquabbleException

class squabble.message.Registry[source]

Bases: object

Singleton which maps message code values to classes.

>>> class MyMessage(Message):
...     '''My example message.'''
...     TEMPLATE = '...'
...     CODE = 5678
>>> cls = Registry.by_code(5678)
>>> cls.explain()
'My example message.'
>>> cls is MyMessage
True

Duplicate codes are not allowed, and will throw an exception.

>>> class MyDuplicateMessage(Message):
...     CODE = 5678
Traceback (most recent call last):
  ...
squabble.message.DuplicateMessageCodeException: ...
classmethod register(msg)[source]

Add msg to the registry, and assign a CODE value if not explicitly specified.

classmethod by_code(code)[source]

Return the squabble.message.Message class identified by code, raising a KeyError if it doesn’t exist.

class squabble.message.Message(**kwargs)[source]

Bases: object

Messages represent specific issues identified by a lint rule.

Each class that inherits from Message should have a docstring which explains the reasoning and context of the message, as well as a class member variable named TEMPLATE, which is used to display a brief explanation on the command line.

Messages may also have a CODE class member, which is used to identify the message. The actual value doesn’t matter much, as long as it is unique among all the loaded Message s. If no CODE is defined, one will be assigned.

>>> class TooManyColumns(Message):
...    '''
...    This may indicate poor design, consider multiple tables instead.
...    '''
...    TEMPLATE = 'table "{table}" has > {limit} columns'
...    CODE = 1234
>>> message = TooManyColumns(table='foo', limit=30)
>>> message.format()
'table "foo" has > 30 columns'
>>> message.explain()
'This may indicate poor design, consider multiple tables instead.'
TEMPLATE = None
CODE = None
format()[source]
classmethod explain()[source]

Provide more context around this message.

The purpose of this function is to explain to users _why_ the message was raised, and what they can do to resolve the issue.

The base implementation will simply return the docstring for the class, but this can be overridden if more specialized behavior is necessary.

>>> class NoDocString(Message): pass
>>> NoDocString().explain() is None
True
asdict()[source]

Return dictionary representation of message, for formatting.

>>> class SummaryMessage(Message):
...     CODE = 90291
...     TEMPLATE = 'everything is {status}'
...
>>> msg = SummaryMessage(status='wrong')
>>> msg.asdict() == {
...   'message_id': 'SummaryMessage',
...   'message_text': 'everything is wrong',
...   'message_template': SummaryMessage.TEMPLATE,
...   'message_params': {'status': 'wrong'},
...   'message_code': 90291
... }
True

squabble.reporter module

exception squabble.reporter.UnknownReporterException(name)[source]

Bases: squabble.SquabbleException

Raised when a configuration references a reporter that doesn’t exist.

squabble.reporter.reporter(name)[source]

Decorator to register function as a callback when the config sets the "reporter" config value to name.

The wrapped function will be called with each squabble.lint.LintIssue and the contents of the file being linted. Each reporter should return a list of lines of output which will be printed to stderr.

>>> from squabble.lint import LintIssue
>>> @reporter('no_info')
... def no_info(issue, file_contents):
...     return ['something happened']
...
>>> no_info(LintIssue(), file_contents='')
['something happened']
squabble.reporter.report(reporter_name, issues, files)[source]

Call the named reporter function for every issue in the list of issues. All lines of output returned will be printed to stderr.

Parameters:
  • reporter_name (str) – Issue reporter format to use.
  • issues (list) – List of generated squabble.lint.LintIssue.
  • files (dict) – Map of file name to contents of file.
>>> import sys; sys.stderr = sys.stdout  # for doctest.
>>> from squabble.lint import LintIssue
>>> @reporter('message_and_severity')
... def message_and_severity_reporter(issue, contents):
...     return ['%s:%s' % (issue.severity.name, issue.message_text)]
...
>>> issue = LintIssue(severity=Severity.CRITICAL,
...                   message_text='bad things!')
>>> report('message_and_severity', [issue], files={})
CRITICAL:bad things!
squabble.reporter.plain_text_reporter(issue, file_contents)[source]

Simple single-line output format that is easily parsed by editors.

squabble.reporter.color_reporter(issue, file_contents)[source]

Extension of squabble.reporter.plain_text_reporter(), uses ANSI color and shows error location.

squabble.reporter.json_reporter(issue, _file_contents)[source]

Dump each issue as a JSON dictionary

squabble.reporter.sqlint_reporter(issue, file_contents)[source]

Format compatible with sqlint, which is already integrated into Flycheck and other editor linting frameworks.

Main difference is really just that there are only two severity levels: ERROR and WARNING.

squabble.rule module

squabble.rule.load_rules(plugin_paths=None)[source]

Load built in rules as well as any custom rules contained in the directories in plugin_paths.

squabble.rule.node_visitor(fn)[source]

Helper decorator to make it easier to register callbacks for AST nodes. Effectively creates the partial function automatically so there’s no need for a lambda.

Wraps fn to pass in self, context, and node when the callback is called.

>>> from squabble.rules import BaseRule
>>> class SomeRule(BaseRule):
...     def enable(self, ctx, config):
...         # These are equivalent
...         ctx.register('foo', self.check_foo(x=1))
...         ctx.register('bar', lambda c, n: self.check_bar(c, n, x=1))
...
...     @node_visitor
...     def check_foo(self, context, node, x):
...         pass
...
...     def check_bar(self, context, node, x):
...         pass
class squabble.rule.Registry[source]

Bases: object

Singleton instance used to keep track of all rules.

Any class that inherits from squabble.rules.BaseRule will automatically be registered to the registry.

static register(rule)[source]
static get_meta(name)[source]

Return metadata about a given rule in the registry.

If no rule exists in the registry named name, UnknownRuleException will be thrown.

The returned dictionary will look something like this:

{
    'name': 'RuleClass',
    'help': 'Some rule...',
    # ...
}
static get_class(name)[source]

Return class for given rule name in the registry.

If no rule exists in the registry named name, UnknownRuleException will be thrown.

static all()[source]

Return an iterator over all known rule metadata. Equivalent to calling get_meta() for all registered rules.

squabble.rules package

class squabble.rules.BaseRule[source]

Bases: object

Base implementation of a linter rule.

Rules work by adding hooks into the abstract syntax tree for a SQL file, and then performing their lint actions inside the callback functions.

Rules represent lint issues using pre-defined message classes, which are defined by creating a child class inheriting from squabble.message.Message.

For example:

>>> import squabble.message
>>> class MyRule(BaseRule):
...     class BadColumnName(squabble.message.Message):
...         TEMPLATE = 'column {name} is not allowed'
...
>>> MyRule.BadColumnName(name='foo').format()
'column foo is not allowed'
classmethod meta()[source]

Return metadata about the Rule. Base implementation should be sane enough for most purposes.

>>> class MyRule(BaseRule):
...     '''
...     Brief description of rule. This can
...     wrap to the next line.
...
...     More details about this rule.
...     '''
>>> m = MyRule.meta()
>>> m['name']
'MyRule'
>>> m['description']
'Brief description of rule. This can wrap to the next line.'
>>> m['help']
'More details about this rule.'
enable(ctx, config)[source]

Called before the root AST node is traversed. Here’s where most callbacks should be registered for different AST nodes.

Each linter is initialized once per file that it is being run against. config will contain the merged base configuration with the file-specific configuration options for this linter.

squabble.util module

Odds and ends pieces that don’t fit elsewhere, but aren’t important enough to have their own modules.

squabble.util.strip_rst_directives(string)[source]

Strip reStructuredText directives out of a block of text.

Lines containing a directive will be stripped out entirely

>>> strip_rst_directives('hello\n.. code-block:: foo\nworld')
'hello\nworld'
squabble.util.format_type_name(type_name)[source]

Return a simple stringified version of a pglast.node.TypeName node.

Note that this won’t be suitable for printing, and ignores type modifiers (e.g. NUMERIC(3,4) => NUMERIC).

>>> import pglast
>>> sql = 'CREATE TABLE _ (y time with time zone);'
>>> node = pglast.Node(pglast.parse_sql(sql))
>>> col_def = node[0]['stmt']['tableElts'][0]
>>> format_type_name(col_def.typeName)
'pg_catalog.timetz'

squabble

build status Documentation Status PyPI version

Catch unsafe SQL migrations.

$ squabble sql/migration.sql
sql/migration.sql:4:46 ERROR: column "uh_oh" has a disallowed constraint [1004]
ALTER TABLE big_table ADD COLUMN uh_oh integer DEFAULT 0;
                                               ^
# Use --explain to get more information on a lint violation
$ squabble --explain 1004
ConstraintNotAllowed
     When adding a column to an existing table, certain constraints can have
     unintentional side effects, like locking the table or introducing
     performance issues.
     ...

Squabble can also be integrated with your editor to catch errors in SQL files.

$ echo 'SELECT * FROM WHERE x = y;' | squabble --reporter=plain
stdin:1:15 CRITICAL: syntax error at or near "WHERE"

Currently, most of the rules have been focused on Postgres and its quirks. However, squabble can parse any ANSI SQL and new rules that are specific to other databases are appreciated!

Installation

$ pip3 install squabble
$ squabble --help

Note

Squabble is only supported on Python 3.5+

If you’d like to install from source:

$ git clone https://github.com/erik/squabble.git && cd squabble
$ python3 -m venv ve && source ve/bin/activate
$ python setup.py install
$ squabble --help

Configuration

To see a list of rules, try

$ squabble --list-rules

Then, to show more verbose information about a rule (such as rationale and configuration options)

$ squabble --show-rule AddColumnsDisallowConstraints

Once a configuration file is in place, it can be passed explicitly on the command line, or automatically looked up.

$ squabble -c path/to/config ...

If not explicitly given on the command line, squabble will look for a file named .squabblerc in the following places (in order):

  • ./.squabblerc
  • (git_repo_root)/.squabblerc
  • ~/.squabblerc

Per-File Configuration

Configuration can also be applied at the file level by using SQL line comments in the form -- squabble-enable:RuleName or -- squabble-disable:RuleName.

For example, to disable RuleA and enable RuleB just for one file, this could be done:

-- squabble-disable:RuleA
-- squabble-enable:RuleB config=value array=1,2,3
SELECT email FROM users WHERE ...;

To prevent squabble from running on a file, use -- squabble-disable. Note that this will also disable syntax checking. Note that this flag will take precedence over any other configuration set either on the command line or in the rest of the file.

Example Configuration

{
  "reporter": "color",

  "plugins": [
    "/some/directory/with/custom/rules"
  ],

  "rules": {
    "AddColumnsDisallowConstraints": {
      "disallowed": ["DEFAULT", "FOREIGN", "NOT NULL"]
    }
  }
}

Prior Art

squabble is of course not the first tool in this space. If it doesn’t fit your needs, consider one of these tools:

  • sqlcheck - regular expression based (rather than parsing), focuses more on SELECT statements than migrations.
  • sqlint - checks that the syntax of a file is valid. Uses the same parsing library as squabble.
  • sqlfluff - focused more on style and formatting, seems to still be a work in progress.

Acknowledgments

This project would not be possible without:

  • libpg_query - Postgres query parser
  • pglast - Python bindings to libpg_query
  • Postgres - …obviously

The logo image used in the documentation is created by Gianni - Dolce Merda from the Noun Project.

Indices and tables