The command create_external_models returns incomplete results for Databricks #5781

@blecourt-private

Description

If an external table in Databricks contains a long STRUCT column, the schema of this column is truncated and the table's entry in external_models.yaml ends up incomplete.

Observed for sqlmesh v0.234.0.

Details

For the databricks dialect, sqlmesh create_external_models uses the EngineAdapter.columns() method to fetch column names and types from the engine via DESCRIBE <table>.

In Databricks, the schema of a long STRUCT is truncated to fit into the data_type column of the DESCRIBE result. SQLMesh then makes a best-effort attempt and writes only the visible STRUCT fields to external_models.yaml.
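
To make the failure mode concrete, here is a minimal sketch of how a truncated type string could be detected before it is written to external_models.yaml. It assumes the cut-off portion is replaced by a Spark-style placeholder such as `... 3 more fields`; that marker format and the helper name `is_truncated_type` are assumptions for illustration, not part of SQLMesh.

```python
import re

# Assumption: a truncated STRUCT is rendered with a Spark-style placeholder,
# e.g. "struct<a:int,b:string,... 3 more fields>".
_TRUNCATION_MARKER = re.compile(r"\.\.\.\s*\d+\s+more\s+fields?")


def is_truncated_type(data_type: str) -> bool:
    """Return True if the type string appears to have been truncated."""
    return _TRUNCATION_MARKER.search(data_type) is not None
```

A check like this would at least let the command warn or fail loudly instead of silently emitting a partial STRUCT.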

Suggested fixes

Using DESCRIBE

For Databricks, the columns() method should use DESCRIBE EXTENDED <table> AS JSON instead of DESCRIBE <table>. The returned JSON contains the full schema of the long STRUCT.
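
A rough sketch of how the JSON result could be turned into column types follows. It assumes the document has a top-level `columns` list whose entries carry a `name` and a nested `type` object (with `fields` for structs, `element_type` for arrays, and `key_type`/`value_type` for maps); these key names should be checked against the Databricks documentation. The function names and the `SAMPLE` payload are hypothetical and hand-written for illustration, not real engine output.

```python
import json
from typing import Any, Dict


def render_type(type_json: Dict[str, Any]) -> str:
    """Render one nested type description as a SQL type string."""
    name = type_json["name"].lower()
    if name == "struct":
        # Recurse into every field so nothing is dropped, however long the STRUCT.
        fields = ", ".join(
            f"{field['name']}: {render_type(field['type'])}"
            for field in type_json["fields"]
        )
        return f"STRUCT<{fields}>"
    if name == "array":
        return f"ARRAY<{render_type(type_json['element_type'])}>"
    if name == "map":
        key = render_type(type_json["key_type"])
        value = render_type(type_json["value_type"])
        return f"MAP<{key}, {value}>"
    if name == "decimal" and "precision" in type_json:
        return f"DECIMAL({type_json['precision']}, {type_json['scale']})"
    return name.upper()


def columns_from_describe_json(payload: str) -> Dict[str, str]:
    """Map column name to full type string from the assumed JSON layout."""
    doc = json.loads(payload)
    return {col["name"]: render_type(col["type"]) for col in doc["columns"]}


# Hand-written example payload (assumed shape, not captured from Databricks).
SAMPLE = json.dumps(
    {
        "columns": [
            {"name": "id", "type": {"name": "bigint"}},
            {
                "name": "payload",
                "type": {
                    "name": "struct",
                    "fields": [
                        {"name": "a", "type": {"name": "int"}},
                        {
                            "name": "tags",
                            "type": {
                                "name": "array",
                                "element_type": {"name": "string"},
                            },
                        },
                    ],
                },
            },
        ]
    }
)
```

Calling `columns_from_describe_json(SAMPLE)` yields `{"id": "BIGINT", "payload": "STRUCT<a: INT, tags: ARRAY<STRING>>"}`. The sketch does not quote field names that contain special characters, which a real implementation would need to handle.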

Using information schema

Alternatively, read the column types from the information schema, similar to what is done for the mssql dialect.
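
As a sketch of the information schema route, the helper below builds a query against a catalog's `information_schema.columns` view. It assumes a Unity Catalog table and that the view's `full_data_type` column holds the untruncated type definition, which should be verified for long STRUCTs. The function name `information_schema_columns_query` is hypothetical and not part of SQLMesh.

```python
def information_schema_columns_query(catalog: str, schema: str, table: str) -> str:
    """Build a query returning (column_name, full_data_type) for one table."""

    def literal(value: str) -> str:
        # Escape single quotes so the value is a safe SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    # Backtick-quote the catalog identifier, doubling embedded backticks.
    quoted_catalog = "`" + catalog.replace("`", "``") + "`"
    return (
        "SELECT column_name, full_data_type "
        f"FROM {quoted_catalog}.information_schema.columns "
        f"WHERE table_schema = {literal(schema)} "
        f"AND table_name = {literal(table)} "
        "ORDER BY ordinal_position"
    )
```

Ordering by `ordinal_position` keeps the columns in table order, matching what DESCRIBE would return.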
