If an external table in Databricks contains a long STRUCT column, the schema of that column is truncated and the table's entry in external_models.yaml ends up incomplete.
Observed for sqlmesh v0.234.0.
Details
For the databricks dialect, sqlmesh create_external_models uses the EngineAdapter.columns() method to get the columns and their types from the engine by running DESCRIBE <table>.
In Databricks, the schema of a long STRUCT is truncated to fit into the data_type column of the result. SQLMesh then makes a best effort and includes only the visible STRUCT elements in external_models.yaml.
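As a stopgap, a truncated type could at least be detected rather than silently written out. The sketch below assumes the abbreviated type string carries a marker of the form "... N more fields", which is how Spark abbreviates long structs. The helper name is_truncated_type is hypothetical and not part of SQLMesh.

```python
import re

# Assumption: a truncated data_type contains a marker such as
# "... 40 more fields" in place of the omitted struct fields.
_TRUNCATION_MARKER = re.compile(r"\.\.\.\s*\d+\s+more\s+fields?")


def is_truncated_type(data_type: str) -> bool:
    """Return True if a DESCRIBE data_type string looks abbreviated."""
    return bool(_TRUNCATION_MARKER.search(data_type))


if __name__ == "__main__":
    for sample in (
        "struct<a:int,b:string,... 40 more fields>",
        "struct<a:int,b:string>",
    ):
        print(sample, "->", is_truncated_type(sample))
```

A check like this would only turn a silent truncation into a visible warning or error. It does not recover the missing fields.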
Suggested fixes
Using DESCRIBE
For Databricks, the columns() method should use DESCRIBE EXTENDED <table> AS JSON instead of DESCRIBE <table>. The returned JSON contains the full schema of the long STRUCT.
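A minimal sketch of how the JSON result might be turned into column types follows. It assumes the payload has a top-level "columns" list whose entries carry a nested "type" object with keys such as "name", "fields", "element_type", "key_type", "value_type", "precision" and "scale"; that layout should be verified against the Databricks documentation. The function names are hypothetical and this is not the SQLMesh implementation.

```python
import json
from typing import Any, Dict


def render_type(type_json: Dict[str, Any]) -> str:
    """Recursively render an (assumed) type object as a SQL type string."""
    name = type_json["name"]
    if name == "struct":
        # Field names are emitted unquoted; real code would need quoting.
        fields = ", ".join(
            f"{field['name']}: {render_type(field['type'])}"
            for field in type_json["fields"]
        )
        return f"STRUCT<{fields}>"
    if name == "array":
        return f"ARRAY<{render_type(type_json['element_type'])}>"
    if name == "map":
        key = render_type(type_json["key_type"])
        value = render_type(type_json["value_type"])
        return f"MAP<{key}, {value}>"
    if name == "decimal":
        return f"DECIMAL({type_json['precision']}, {type_json['scale']})"
    return name.upper()


def columns_from_describe_json(payload: str) -> Dict[str, str]:
    """Map column name to full type string from the JSON text."""
    doc = json.loads(payload)
    return {col["name"]: render_type(col["type"]) for col in doc["columns"]}


# Hand-written sample in the assumed layout, not real engine output.
SAMPLE = json.dumps({
    "table_name": "events",
    "columns": [
        {"name": "id", "type": {"name": "bigint"}},
        {"name": "payload", "type": {"name": "struct", "fields": [
            {"name": "user", "type": {"name": "string"}},
            {"name": "tags", "type": {
                "name": "array", "element_type": {"name": "string"}}},
            {"name": "amount", "type": {
                "name": "decimal", "precision": 10, "scale": 2}},
        ]}},
    ],
})

if __name__ == "__main__":
    for column, data_type in columns_from_describe_json(SAMPLE).items():
        print(f"{column}: {data_type}")
```

Because the type arrives as a nested object rather than a single string column, every STRUCT field is available no matter how long the struct is.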
Using information schema
Alternatively, columns() could read the column types from the information schema, similar to what is done for the mssql dialect.
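As a rough illustration, such a lookup could be built as below. This assumes the table lives in a catalog that exposes information_schema.columns with a full_data_type column, and that this column is not truncated for long STRUCTs; neither assumption has been verified here. The helper name is hypothetical.

```python
def information_schema_columns_query(catalog: str, schema: str, table: str) -> str:
    """Build a query for column names and full types (assumed layout)."""

    def literal(value: str) -> str:
        # Escape single quotes for use inside a SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    def identifier(value: str) -> str:
        # Backtick-quote the catalog name, doubling embedded backticks.
        return "`" + value.replace("`", "``") + "`"

    return (
        "SELECT column_name, full_data_type "
        f"FROM {identifier(catalog)}.information_schema.columns "
        f"WHERE table_schema = {literal(schema)} "
        f"AND table_name = {literal(table)} "
        "ORDER BY ordinal_position"
    )


if __name__ == "__main__":
    print(information_schema_columns_query("main", "sales", "orders"))
```

An actual implementation would more likely go through the adapter's own query-building and quoting facilities than through string formatting.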