MsSQL#
Note
SQLServer does not need to specify protocol.
MsSQL Connection#
Hint
if the user password has special characters, they need to be sanitized. example: from urllib import parse; password = parse.quote_plus(password)
import connectorx as cx
conn = 'mssql://username:password@server:port/database?encrypt=true&trusted_connection=true' # connection token
query = 'SELECT * FROM table' # query string
cx.read_sql(conn, query) # read data from MsSQL
Connection Parameters#
By adding
trusted_connection=true
to connection uri parameter, windows authentication will be enabled.Example:
mssql://host:port/db?trusted_connection=true
By adding
encrypt=true
to connection uri parameter, SQLServer will use SSL encryption.Example:
mssql://host:port/db?encrypt=true&trusted_connection=true
By adding
trust_server_certificate=true
to connection uri parameter, the SQLServer certificate will not be validated and it is accepted as-is.Example:
mssql://host:port/db?trust_server_certificate=true&encrypt=true
By adding
trust_server_certificate_ca=/path/to/ca-cert.crt
to connection uri parameter, the SQLServer certificate will be validated against the given CA certificate in addition to the system-truststore.Example:
mssql://host:port/db?encrypt=true&trust_server_certificate_ca=/path/to/ca-cert.crt
SQLServer-Pandas Type Mapping#
SQLServer Type |
Pandas Type |
Comment |
---|---|---|
TINYINT |
int64, Int64(nullable) |
|
SMALLINT |
int64, Int64(nullable) |
|
INT |
int64, Int64(nullable) |
|
BIGINT |
int64, Int64(nullable) |
|
FLOAT |
float64 |
|
NUMERIC |
float64 |
|
DECIMAL |
float64 |
cannot support precision larger than 28 |
BIT |
bool, boolean(nullable) |
|
VARCHAR |
object |
|
CHAR |
object |
|
TEXT |
object |
|
NVARCHAR |
object |
|
NCHAR |
object |
|
NTEXT |
object |
|
VARBINARY |
object |
|
BINARY |
object |
|
IMAGE |
object |
|
DATETIME |
datetime64[ns] |
|
DATETIME2 |
datetime64[ns] |
|
SMALLDATETIME |
datetime64[ns] |
|
DATE |
datetime64[ns] |
|
DATETIMEOFFSET |
datetime64[ns] |
|
TIME |
object |
|
UNIQUEIDENTIFIER |
object |
Performance (r5.4xlarge docker in another EC2 instance)#
Modin does not support read_sql on Mssql
Time chart, lower is better.
Memory consumption chart, lower is better.
In conclusion, ConnectorX uses 3x less memory and 14x less time compared with Pandas.