Matillion ETL Data Model for Apache Hive
Version - 22.0.8326.0

Note: Data models are true for the latest version of Matillion ETL. If you are on an older version or using a component from an old job, your experience may differ.



Connection String OptionsBack To Top

  1. AuthScheme
  2. Server
  3. Port
  4. User
  5. Password
  6. ProtocolVersion
  7. TransportMode
  8. Database
  9. ImpersonationProxyUser
  10. SaslQop
  11. KerberosKDC
  12. KerberosRealm
  13. KerberosSPN
  14. KerberosKeytabFile
  15. KerberosServiceRealm
  16. KerberosServiceKDC
  17. KerberosTicketCache
  18. SSLClientCert
  19. SSLClientCertType
  20. SSLClientCertPassword
  21. SSLClientCertSubject
  22. SSLServerCert
  23. SSHAuthMode
  24. SSHClientCert
  25. SSHClientCertPassword
  26. SSHClientCertSubject
  27. SSHClientCertType
  28. SSHServer
  29. SSHPort
  30. SSHUser
  31. SSHPassword
  32. SSHServerFingerprint
  33. UseSSH
  34. FirewallType
  35. FirewallServer
  36. FirewallPort
  37. FirewallUser
  38. FirewallPassword
  39. ProxyAutoDetect
  40. ProxyServer
  41. ProxyPort
  42. ProxyAuthScheme
  43. ProxyUser
  44. ProxyPassword
  45. ProxySSLType
  46. ProxyExceptions
  47. Logfile
  48. Verbosity
  49. LogModules
  50. MaxLogFileSize
  51. MaxLogFileCount
  52. Location
  53. BrowsableSchemas
  54. Tables
  55. Views
  56. AutoCache
  57. CacheDriver
  58. CacheConnection
  59. CacheLocation
  60. CacheTolerance
  61. Offline
  62. CacheMetadata
  63. AsyncQueryTimeout
  64. BatchSize
  65. ConnectionLifeTime
  66. ConnectOnOpen
  67. EnableXSRF
  68. FailoverHosts
  69. HTTPPath
  70. MaxRows
  71. Other
  72. Pagesize
  73. PoolIdleTimeout
  74. PoolMaxSize
  75. PoolMinSize
  76. PoolWaitTime
  77. PseudoColumns
  78. QueryPassthrough
  79. Readonly
  80. RTK
  81. ServerConfigurations
  82. Timeout
  83. UseConnectionPooling
  84. UseDescTableQuery
  85. UserDefinedViews
  86. UseShowDatabasesQuery
  87. UseShowTablesQuery
  88. UseSSL
  89. UseZookeeperDiscovery
  90. ZookeeperNamespace

AuthScheme

Data Type

string

Default Value

"Plain"

Remarks

The AuthScheme used to authenticate with Hive.

NoSasl Used when the hive.server2.authentication property is set to NoSasl.
Plain (default) Used when the hive.server2.authentication property is set to None (uses Plain SASL), PAM, or CUSTOM. If User or Password are not set, 'anonymous' will be sent for these fields.
LDAP Used when the hive.server2.authentication property is set to LDAP.
Kerberos Used when the hive.server2.authentication property is set to Kerberos.



Server

Data Type

string

Default Value

""

Remarks

If multiple Hive servers are available, FailoverHosts can be set to the additional Hive instances.

If UseZookeeperDiscovery is set to True, Server and Port must be set to the ZooKeeper server.



Port

Data Type

string

Default Value

"10000"

Remarks

When using BINARY TransportMode, this property should be set to the value in the 'hive.server2.thrift.port' property of the Hive configuration file (hive-site.xml).

When using HTTP TransportMode, this property should be set to the value in the 'hive.server2.thrift.http.port' property of the Hive configuration file (hive-site.xml).



User

Data Type

string

Default Value

""

Remarks

The username used to authenticate with Hive.



Password

Data Type

string

Default Value

""

Remarks

The password used to authenticate with Hive.



ProtocolVersion

Data Type

string

Default Value

"1"

Remarks

The most efficient protocol version will be determined automatically by the CData JDBC Driver for Apache Hive upon connecting to Hive. This property allows you to explicitly specify the version to use and overrides the version determined by the CData JDBC Driver for Apache Hive.



TransportMode

Data Type

string

Default Value

"BINARY"

Remarks

The transport mode used to communicate with the Hive server.

This property should be set to the 'hive.server2.transport.mode' value specified in your Hive configuration file (hive-site.xml).



Database

Data Type

string

Default Value

""

Remarks

When specified, the CData JDBC Driver for Apache Hive will issue a 'USE [Database]' command upon connecting to Hive. This will be the database schema used when executing queries that do not have a schema explicitly specified.

To execute queries to other schemas, the schema can be explicitly specified in the statement.

When Database is not set, the 'default' database schema will be used (no 'USE' statement is issued to Hive in this case).



ImpersonationProxyUser

Data Type

string

Default Value

""

Remarks

The proxy user of the Hive user impersonation.



SaslQop

Data Type

string

Default Value

"auth"

Remarks

This property should be set to the 'hive.server2.thrift.sasl.qop' value specified in your Hive configuration file (hive-site.xml).

auth Authentication only
auth-int Authentication plus integrity protection
auth-conf Authentication plus integrity and confidentiality protection



KerberosKDC

Data Type

string

Default Value

""

Remarks

The Kerberos properties are used when using SPNEGO or Windows Authentication. The driver will request session tickets and temporary session keys from the Kerberos KDC service. The Kerberos KDC service is conventionally colocated with the domain controller.

If Kerberos KDC is not specified, the driver will attempt to detect these properties automatically from the following locations:

Note: Windows authentication is supported in JRE 1.6 and above only.



KerberosRealm

Data Type

string

Default Value

""

Remarks

The Kerberos properties are used when using SPNEGO or Windows Authentication. The Kerberos Realm is used to authenticate the user with the Kerberos Key Distribution Service (KDC). The Kerberos Realm can be configured by an administrator to be any string, but conventionally it is based on the domain name.

If Kerberos Realm is not specified, the driver will attempt to detect these properties automatically from the following locations:

Note: Kerberos-based authentication is supported in JRE 1.6 and above only.



KerberosSPN

Data Type

string

Default Value

""

Remarks

If the SPN on the Kerberos Domain Controller is not the same as the URL that you are authenticating to, use this property to set the SPN.



KerberosKeytabFile

Data Type

string

Default Value

""

Remarks

The Keytab file containing your pairs of Kerberos principals and encrypted keys.



KerberosServiceRealm

Data Type

string

Default Value

""

Remarks

The KerberosServiceRealm is the specify the service Kerberos realm when using cross-realm Kerberos authentication.

In most cases, a single realm and KDC machine are used to perform the Kerberos authentication and this property is not required.

This property is available for complex setups where a different realm and KDC machine are used to obtain an authentication ticket (AS request) and a service ticket (TGS request).



KerberosServiceKDC

Data Type

string

Default Value

""

Remarks

The KerberosServiceKDC is used to specify the service Kerberos KDC when using cross-realm Kerberos authentication.

In most cases, a single realm and KDC machine are used to perform the Kerberos authentication and this property is not required.

This property is available for complex setups where a different realm and KDC machine are used to obtain an authentication ticket (AS request) and a service ticket (TGS request).



KerberosTicketCache

Data Type

string

Default Value

""

Remarks

This property can be set if you wish to use a credential cache file that was created using the MIT Kerberos Ticket Manager or kinit command.



SSLClientCert

Data Type

string

Default Value

""

Remarks

The name of the certificate store for the client certificate.

The SSLClientCertType field specifies the type of the certificate store specified by SSLClientCert. If the store is password protected, specify the password in SSLClientCertPassword.

SSLClientCert is used in conjunction with the SSLClientCertSubject field in order to specify client certificates. If SSLClientCert has a value, and SSLClientCertSubject is set, a search for a certificate is initiated. See SSLClientCertSubject for more information.

Designations of certificate stores are platform-dependent.

The following are designations of the most common User and Machine certificate stores in Windows:

MYA certificate store holding personal certificates with their associated private keys.
CACertifying authority certificates.
ROOTRoot certificates.
SPCSoftware publisher certificates.

In Java, the certificate store normally is a file containing certificates and optional private keys.

When the certificate store type is PFXFile, this property must be set to the name of the file. When the type is PFXBlob, the property must be set to the binary contents of a PFX file (for example, PKCS12 certificate store).



SSLClientCertType

Data Type

string

Default Value

"USER"

Remarks

This property can take one of the following values:

USER - defaultFor Windows, this specifies that the certificate store is a certificate store owned by the current user. Note that this store type is not available in Java.
MACHINEFor Windows, this specifies that the certificate store is a machine store. Note that this store type is not available in Java.
PFXFILEThe certificate store is the name of a PFX (PKCS12) file containing certificates.
PFXBLOBThe certificate store is a string (base-64-encoded) representing a certificate store in PFX (PKCS12) format.
JKSFILEThe certificate store is the name of a Java key store (JKS) file containing certificates. Note that this store type is only available in Java.
JKSBLOBThe certificate store is a string (base-64-encoded) representing a certificate store in JKS format. Note that this store type is only available in Java.
PEMKEY_FILEThe certificate store is the name of a PEM-encoded file that contains a private key and an optional certificate.
PEMKEY_BLOBThe certificate store is a string (base64-encoded) that contains a private key and an optional certificate.
PUBLIC_KEY_FILEThe certificate store is the name of a file that contains a PEM- or DER-encoded public key certificate.
PUBLIC_KEY_BLOBThe certificate store is a string (base-64-encoded) that contains a PEM- or DER-encoded public key certificate.
SSHPUBLIC_KEY_FILEThe certificate store is the name of a file that contains an SSH-style public key.
SSHPUBLIC_KEY_BLOBThe certificate store is a string (base-64-encoded) that contains an SSH-style public key.
P7BFILEThe certificate store is the name of a PKCS7 file containing certificates.
PPKFILEThe certificate store is the name of a file that contains a PuTTY Private Key (PPK).
XMLFILEThe certificate store is the name of a file that contains a certificate in XML format.
XMLBLOBThe certificate store is a string that contains a certificate in XML format.



SSLClientCertPassword

Data Type

string

Default Value

""

Remarks

If the certificate store is of a type that requires a password, this property is used to specify that password to open the certificate store.



SSLClientCertSubject

Data Type

string

Default Value

"*"

Remarks

When loading a certificate the subject is used to locate the certificate in the store.

If an exact match is not found, the store is searched for subjects containing the value of the property. If a match is still not found, the property is set to an empty string, and no certificate is selected.

The special value "*" picks the first certificate in the certificate store.

The certificate subject is a comma separated list of distinguished name fields and values. For example, "CN=www.server.com, OU=test, C=US, E=support@company.com". The common fields and their meanings are shown below.

FieldMeaning
CNCommon Name. This is commonly a host name like www.server.com.
OOrganization
OUOrganizational Unit
LLocality
SState
CCountry
EEmail Address

If a field value contains a comma, it must be quoted.



SSLServerCert

Data Type

string

Default Value

""

Remarks

If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate to be accepted from the server. Any other certificate that is not trusted by the machine is rejected.

This property can take the following forms:

Description Example
A full PEM Certificate (example shortened for brevity) -----BEGIN CERTIFICATE----- MIIChTCCAe4CAQAwDQYJKoZIhv......Qw== -----END CERTIFICATE-----
A path to a local file containing the certificate C:\cert.cer
The public key (example shortened for brevity) -----BEGIN RSA PUBLIC KEY----- MIGfMA0GCSq......AQAB -----END RSA PUBLIC KEY-----
The MD5 Thumbprint (hex values can also be either space or colon separated) ecadbdda5a1529c58a1e9e09828d70e4
The SHA1 Thumbprint (hex values can also be either space or colon separated) 34a929226ae0819f2ec14b4a3d904f801cbb150d

If not specified, any certificate trusted by the machine is accepted.

Certificates are validated as trusted by the machine based on the System's trust store. The trust store used is the 'javax.net.ssl.trustStore' value specified for the system. If no value is specified for this property, Java's default trust store is used (for example, JAVA_HOME\lib\security\cacerts).

Use '*' to signify to accept all certificates. Note that this is not recommended due to security concerns.



SSHAuthMode

Data Type

string

Default Value

"Password"

Remarks



SSHClientCert

Data Type

string

Default Value

""

Remarks

SSHClientCert must contain a valid private key in order to use public key authentication. A public key is optional, if one is not included then the driver generates it from the private key. The driver sends the public key to the server and the connection is allowed if the user has authorized the public key.

The SSHClientCertType field specifies the type of the key store specified by SSHClientCert. If the store is password protected, specify the password in SSHClientCertPassword.

Some types of key stores are containers which may include multiple keys. By default the driver will select the first key in the store, but you can specify a specific key using SSHClientCertSubject.



SSHClientCertPassword

Data Type

string

Default Value

""

Remarks

This property is only used when authenticating to SFTP servers with SSHAuthMode set to PublicKey and SSHClientCert set to a private key.



SSHClientCertSubject

Data Type

string

Default Value

"*"

Remarks

When loading a certificate the subject is used to locate the certificate in the store.

If an exact match is not found, the store is searched for subjects containing the value of the property.

If a match is still not found, the property is set to an empty string, and no certificate is selected.

The special value "*" picks the first certificate in the certificate store.

The certificate subject is a comma separated list of distinguished name fields and values. For instance "CN=www.server.com, OU=test, C=US, E=support@cdata.com". Common fields and their meanings are displayed below.

FieldMeaning
CNCommon Name. This is commonly a host name like www.server.com.
OOrganization
OUOrganizational Unit
LLocality
SState
CCountry
EEmail Address

If a field value contains a comma it must be quoted.



SSHClientCertType

Data Type

string

Default Value

"PEMKEY_FILE"

Remarks

This property can take one of the following values:

TypesDescriptionAllowed Blob Values
MACHINE/USER Not available on this platform.Blob values are not supported.
JKSFILE/JKSBLOBA Java keystore file. Must contain both a certificate and a private key. Only available in Java. base64-only
PFXFILE/PFXBLOBA PKCS12-format (.pfx) file. Must contain both a certificate and a private key.base64-only
PEMKEY_FILE/PEMKEY_BLOBA PEM-format file. Must contain an RSA, DSA, or OPENSSH private key. Can optionally contain a certificate matching the private key.base64 or plain text. Newlines may be replaced with spaces when providing the blob as text.
PPKFILE/PPKBLOBA PuTTY-format private key created using the puttygen tool.base64-only
XMLFILE/XMLBLOBAn XML key in the format generated by the .NET RSA class: RSA.ToXmlString(true).base64 or plain text.



SSHServer

Data Type

string

Default Value

""

Remarks

The SSH server.



SSHPort

Data Type

string

Default Value

"22"

Remarks

The SSH port.



SSHUser

Data Type

string

Default Value

""

Remarks

The SSH user.



SSHPassword

Data Type

string

Default Value

""

Remarks

The SSH password.



SSHServerFingerprint

Data Type

string

Default Value

""

Remarks

The SSH server fingerprint.



UseSSH

Data Type

bool

Default Value

false

Remarks

By default the driver will attempt to connect directly to Apache Hive. When this option is enabled, the driver will instead establish an SSH connection with the SSHServer and tunnel the connection to Apache Hive through it.



FirewallType

Data Type

string

Default Value

"NONE"

Remarks

This property specifies the protocol that the driver will use to tunnel traffic through the FirewallServer proxy. Note that by default, the driver connects to the system proxy; to disable this behavior and connect to one of the following proxy types, set ProxyAutoDetect to false.

Type Default Port Description
TUNNEL 80 When this is set, the driver opens a connection to Apache Hive and traffic flows back and forth through the proxy.
SOCKS4 1080 When this is set, the driver sends data through the SOCKS 4 proxy specified by FirewallServer and FirewallPort and passes the FirewallUser value to the proxy, which determines if the connection request should be granted.
SOCKS5 1080 When this is set, the driver sends data through the SOCKS 5 proxy specified by FirewallServer and FirewallPort. If your proxy requires authentication, set FirewallUser and FirewallPassword to credentials the proxy recognizes.

To connect to HTTP proxies, use ProxyServer and ProxyPort. To authenticate to HTTP proxies, use ProxyAuthScheme, ProxyUser, and ProxyPassword.



FirewallServer

Data Type

string

Default Value

""

Remarks

This property specifies the IP address, DNS name, or host name of a proxy allowing traversal of a firewall. The protocol is specified by FirewallType: Use FirewallServer with this property to connect through SOCKS or do tunneling. Use ProxyServer to connect to an HTTP proxy.

Note that the driver uses the system proxy by default. To use a different proxy, set ProxyAutoDetect to false.



FirewallPort

Data Type

int

Default Value

0

Remarks

This specifies the TCP port for a proxy allowing traversal of a firewall. Use FirewallServer to specify the name or IP address. Specify the protocol with FirewallType.



FirewallUser

Data Type

string

Default Value

""

Remarks

The FirewallUser and FirewallPassword properties are used to authenticate against the proxy specified in FirewallServer and FirewallPort, following the authentication method specified in FirewallType.



FirewallPassword

Data Type

string

Default Value

""

Remarks

This property is passed to the proxy specified by FirewallServer and FirewallPort, following the authentication method specified by FirewallType.



ProxyAutoDetect

Data Type

bool

Default Value

false

Remarks

This takes precedence over other proxy settings, so you'll need to set ProxyAutoDetect to FALSE in order use custom proxy settings.

NOTE: When this property is set to True, the proxy used is determined as follows:

To connect to an HTTP proxy, see ProxyServer. For other proxies, such as SOCKS or tunneling, see FirewallType.



ProxyServer

Data Type

string

Default Value

""

Remarks

The hostname or IP address of a proxy to route HTTP traffic through. The driver can use the HTTP, Windows (NTLM), or Kerberos authentication types to authenticate to an HTTP proxy.

If you need to connect through a SOCKS proxy or tunnel the connection, see FirewallType.

By default, the driver uses the system proxy. If you need to use another proxy, set ProxyAutoDetect to false.



ProxyPort

Data Type

int

Default Value

80

Remarks

The port the HTTP proxy is running on that you want to redirect HTTP traffic through. Specify the HTTP proxy in ProxyServer. For other proxy types, see FirewallType.



ProxyAuthScheme

Data Type

string

Default Value

"BASIC"

Remarks

This value specifies the authentication type to use to authenticate to the HTTP proxy specified by ProxyServer and ProxyPort.

Note that the driver will use the system proxy settings by default, without further configuration needed; if you want to connect to another proxy, you will need to set ProxyAutoDetect to false, in addition to ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.

The authentication type can be one of the following:

If you need to use another authentication type, such as SOCKS 5 authentication, see FirewallType.



ProxyUser

Data Type

string

Default Value

""

Remarks

The ProxyUser and ProxyPassword options are used to connect and authenticate against the HTTP proxy specified in ProxyServer.

You can select one of the available authentication types in ProxyAuthScheme. If you are using HTTP authentication, set this to the user name of a user recognized by the HTTP proxy. If you are using Windows or Kerberos authentication, set this property to a user name in one of the following formats:

user@domain

domain\user



ProxyPassword

Data Type

string

Default Value

""

Remarks

This property is used to authenticate to an HTTP proxy server that supports NTLM (Windows), Kerberos, or HTTP authentication. To specify the HTTP proxy, you can set ProxyServer and ProxyPort. To specify the authentication type, set ProxyAuthScheme.

If you are using HTTP authentication, additionally set ProxyUser and ProxyPassword to HTTP proxy.

If you are using NTLM authentication, set ProxyUser and ProxyPassword to your Windows password. You may also need these to complete Kerberos authentication.

For SOCKS 5 authentication or tunneling, see FirewallType.

By default, the driver uses the system proxy. If you want to connect to another proxy, set ProxyAutoDetect to false.



ProxySSLType

Data Type

string

Default Value

"AUTO"

Remarks

This property determines when to use SSL for the connection to an HTTP proxy specified by ProxyServer. This value can be AUTO, ALWAYS, NEVER, or TUNNEL. The applicable values are the following:

AUTODefault setting. If the URL is an HTTPS URL, the driver will use the TUNNEL option. If the URL is an HTTP URL, the component will use the NEVER option.
ALWAYSThe connection is always SSL enabled.
NEVERThe connection is not SSL enabled.
TUNNELThe connection is through a tunneling proxy. The proxy server opens a connection to the remote host and traffic flows back and forth through the proxy.



ProxyExceptions

Data Type

string

Default Value

""

Remarks

The ProxyServer is used for all addresses, except for addresses defined in this property. Use semicolons to separate entries.

Note that the driver uses the system proxy settings by default, without further configuration needed; if you want to explicitly configure proxy exceptions for this connection, you need to set ProxyAutoDetect = false, and configure ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.



Logfile

Data Type

string

Default Value

""

Remarks

Once this property is set, the driver will populate the log file as it carries out various tasks, such as when authentication is performed or queries are executed. If the specified file doesn't already exist, it will be created.

Connection strings and version information are also logged, though connection properties containing sensitive information are masked automatically.

If a relative filepath is supplied, the location of the log file will be resolved based on the path found in the Location connection property.

For more control over what is written to the log file, you can adjust the Verbosity property.

Log contents are categorized into several modules. You can show/hide individual modules using the LogModules property.

To edit the maximum size of a single logfile before a new one is created, see MaxLogFileSize.

If you would like to place a cap on the number of logfiles generated, use MaxLogFileCount.

Java Logging

Java logging is also supported. To enable Java logging, set Logfile to:

Logfile=JAVALOG://myloggername

As in the above sample, JAVALOG:// is a required prefix to use Java logging, and you will substitute your own Logger.

The supplied Logger's getLogger method is then called, using the supplied value to create the Logger instance. If a logging instance already exists, it will reference the existing instance.

When Java logging is enabled, the Verbosity will now correspond to specific logging levels.



Verbosity

Data Type

string

Default Value

"1"

Remarks

The verbosity level determines the amount of detail that the driver reports to the Logfile. Verbosity levels from 1 to 5 are supported. These are detailed in the Logging page.



LogModules

Data Type

string

Default Value

""

Remarks

Only the modules specified (separated by ';') will be included in the log file. By default all modules are included.

See the Logging page for an overview.



MaxLogFileSize

Data Type

string

Default Value

"100MB"

Remarks

When the limit is hit, a new log is created in the same folder with the date and time appended to the end. The default limit is 100 MB. Values lower than 100 kB will use 100 kB as the value instead.

Adjust the maximum number of logfiles generated with MaxLogFileCount.



MaxLogFileCount

Data Type

int

Default Value

-1

Remarks

When the limit is hit, a new log is created in the same folder with the date and time appended to the end and the oldest log file will be deleted.

The minimum supported value is 2. A value of 0 or a negative value indicates no limit on the count.

Adjust the maximum size of the logfiles generated with MaxLogFileSize.



Location

Data Type

string

Default Value

"%APPDATA%\\CData\\ApacheHive Data Provider\\Schema"

Remarks

The path to a directory which contains the schema files for the driver (.rsd files for tables and views, .rsb files for stored procedures). The folder location can be a relative path from the location of the executable. The Location property is only needed if you want to customize definitions (for example, change a column name, ignore a column, and so on) or extend the data model with new tables, views, or stored procedures.

If left unspecified, the default location is "%APPDATA%\\CData\\ApacheHive Data Provider\\Schema" with %APPDATA% being set to the user's configuration directory:

Platform %APPDATA%
Windows The value of the APPDATA environment variable
Mac ~/Library/Application Support
Linux ~/.config



BrowsableSchemas

Data Type

string

Default Value

""

Remarks

Listing the schemas from databases can be expensive. Providing a list of schemas in the connection string improves the performance.



Tables

Data Type

string

Default Value

""

Remarks

Listing the tables from some databases can be expensive. Providing a list of tables in the connection string improves the performance of the driver.

This property can also be used as an alternative to automatically listing views if you already know which ones you want to work with and there would otherwise be too many to work with.

Specify the tables you want in a comma-separated list. Each table should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Tables=TableA,[TableB/WithSlash],WithCatalog.WithSchema.`TableC With Space`.

Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the table in this property, as in the last example here, to avoid ambiguity between tables that exist in multiple catalogs or schemas.



Views

Data Type

string

Default Value

""

Remarks

Listing the views from some databases can be expensive. Providing a list of views in the connection string improves the performance of the driver.

This property can also be used as an alternative to automatically listing views if you already know which ones you want to work with and there would otherwise be too many to work with.

Specify the views you want in a comma-separated list. Each view should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Views=ViewA,[ViewB/WithSlash],WithCatalog.WithSchema.`ViewC With Space`.

Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the table in this property, as in the last example here, to avoid ambiguity between tables that exist in multiple catalogs or schemas.



AutoCache

Data Type

bool

Default Value

false

Remarks

When AutoCache = true, the driver automatically maintains a cache of your table's data in the database of your choice.

Setting the Caching Database

When AutoCache = true, the driver caches to a simple, file-based cache. You can configure its location or cache to a different database with the following properties:

See Also



CacheDriver

Data Type

string

Default Value

""

Remarks

You can cache to any database for which you have a JDBC driver, including CData JDBC drivers.

The cache database is determined by the CacheDriver and CacheConnection properties. The CacheDriver is the name of the JDBC driver class that you want to use to cache data.

Note: you must add the CacheDriver JAR file to the classpath.

Examples
The following examples show how to cache to several major databases. Refer to CacheConnection for more information on the JDBC URL syntax and typical connection properties.
Derby and Java DB

The driver simplifies Derby configuration. Java DB is the Oracle distribution of Derby. The JAR file is shipped in the JDK. You can find the JAR file, derby.jar, in the db subfolder of the JDK installation. In most caching scenarios, you need to specify only the following, after adding derby.jar to the classpath:

jdbc:apachehive:CacheLocation='c:/Temp/cachedir';Server=127.0.0.1;Port=10000;TransportMode=BINARY
To customize the Derby JDBC URL, use CacheDriver and CacheConnection. For example, to cache to an in-memory database, use a JDBC URL like the following:
jdbc:apachehive:CacheDriver=org.apache.derby.jdbc.EmbeddedDriver;CacheConnection='jdbc:derby:memory';Server=127.0.0.1;Port=10000;TransportMode=BINARY
SQLite

The following is a JDBC URL for the SQLite JDBC driver:

jdbc:apachehive:CacheDriver=org.sqlite.JDBC;CacheConnection='jdbc:sqlite:C:/Temp/sqlite.db';Server=127.0.0.1;Port=10000;TransportMode=BINARY
MySQL

The following is a JDBC URL for the included CData JDBC Driver for MySQL:

  jdbc:apachehive:Cache Driver=cdata.jdbc.mysql.MySQLDriver;Cache Connection='jdbc:mysql:Server=localhost;Port=3306;Database=cache;User=root;Password=123456';Server=127.0.0.1;Port=10000;TransportMode=BINARY

  
SQL Server

The following JDBC URL uses the Microsoft JDBC Driver for SQL Server:

jdbc:apachehive:Cache Driver=com.microsoft.sqlserver.jdbc.SQLServerDriver;Cache Connection='jdbc:sqlserver://localhost\sqlexpress:7437;user=sa;password=123456;databaseName=Cache';Server=127.0.0.1;Port=10000;TransportMode=BINARY
Oracle

The following is a JDBC URL for the Oracle Thin Client:

jdbc:apachehive:Cache Driver=oracle.jdbc.OracleDriver;CacheConnection='jdbc:oracle:thin:scott/tiger@localhost:1521:orcldb';Server=127.0.0.1;Port=10000;TransportMode=BINARY
NOTE: If using a version of Oracle older than 9i, the cache driver will instead be oracle.jdbc.driver.OracleDriver .
PostgreSQL

The following JDBC URL uses the official PostgreSQL JDBC driver:

jdbc:apachehive:CacheDriver=cdata.jdbc.postgresql.PostgreSQLDriver;CacheConnection='jdbc:postgresql:User=postgres;Password=admin;Database=postgres;Server=localhost;Port=5432;';Server=127.0.0.1;Port=10000;TransportMode=BINARY



CacheConnection

Data Type

string

Default Value

""

Remarks

The cache database is determined based on the CacheDriver and CacheConnection properties. Both properties are required to use the cache database. Examples of common cache database settings can be found below. For more information on setting the caching database's driver, refer to CacheDriver.

The connection string specified in the CacheConnection property is passed directly to the underlying CacheDriver. Consult the documentation for the specific JDBC driver for more information on the available properties. Make sure to include the JDBC driver in your application's classpath.

Derby and Java DB

The driver simplifies caching to Derby, only requiring you to set the CacheLocation property to make a basic connection.

Alternatively, you can configure the connection to Derby manually using CacheDriver and CacheConnection. The following is the Derby JDBC URL syntax:

jdbc:derby:[subsubprotocol:][databaseName][;attribute=value[;attribute=value] ... ]
For example, to cache to an in-memory database, use the following:
jdbc:derby:memory

SQLite

To cache to SQLite, you can use the SQLite JDBC driver. The following is the syntax of the JDBC URL:

jdbc:sqlite:dataSource

MySQL

The installation includes the CData JDBC Driver for MySQL. The following is an example JDBC URL:

jdbc:mysql:User=root;Password=root;Server=localhost;Port=3306;Database=cache
The following are typical connection properties:

SQL Server

The JDBC URL for the Microsoft JDBC Driver for SQL Server has the following syntax:

jdbc:sqlserver://[serverName[\instance][:port]][;database=databaseName][;property=value[;property=value] ... ]
For example:
jdbc:sqlserver://localhost\sqlexpress:1433;integratedSecurity=true
The following are typical SQL Server connection properties:
Oracle

The following is the conventional JDBC URL syntax for the Oracle JDBC Thin driver:

jdbc:oracle:thin:[userId/password]@[//]host[[:port][:sid]]
For example:
jdbc:oracle:thin:scott/tiger@myhost:1521:orcl
The following are typical connection properties:
PostgreSQL

The following is the JDBC URL syntax for the official PostgreSQL JDBC driver:

jdbc:postgresql:[//[host[:port]]/]database[[?option=value][[&option=value][&option=value] ... ]]
For example, the following connection string connects to a database on the default host (localhost) and port (5432):
jdbc:postgresql:postgres
The following are typical connection properties:



CacheLocation

Data Type

string

Default Value

"%APPDATA%\\CData\\ApacheHive Data Provider"

Remarks

The CacheLocation is a simple, file-based cache. The driver uses Java DB, Oracle's distribution of the Derby database. To cache to Java DB, you will need to add the Java DB JAR file to the classpath. The JAR file, derby.jar, is shipped in the JDK and located in the db subfolder of the JDK installation.

If left unspecified, the default location is "%APPDATA%\\CData\\ApacheHive Data Provider" with %APPDATA% being set to the user's configuration directory:

Platform %APPDATA%
Windows The value of the APPDATA environment variable
Mac ~/Library/Application Support
Linux ~/.config

See Also



CacheTolerance

Data Type

int

Default Value

600

Remarks

The tolerance for stale data in the cache specified in seconds. This only applies when AutoCache is used. The driver checks with the data source for newer records after the tolerance interval has expired. Otherwise, it returns the data directly from the cache.



Offline

Data Type

bool

Default Value

false

Remarks

When Offline = true, all queries execute against the cache as opposed to the live data source. In this mode, certain queries like INSERT, UPDATE, DELETE, and CACHE are not allowed.



CacheMetadata

Data Type

bool

Default Value

false

Remarks

As you execute queries with this property set, table metadata in the Apache Hive catalog are cached to the file store specified by CacheLocation if set or the user's home directory otherwise. A table's metadata will be retrieved only once, when the table is queried for the first time.

When to Use CacheMetadata

The driver automatically persists metadata in memory for up to two hours when you first discover the metadata for a table or view and therefore, CacheMetadata is generally not required. CacheMetadata becomes useful when metadata operations are expensive such as when you are working with large amounts of metadata or when you have many short-lived connections.

When Not to Use CacheMetadata



AsyncQueryTimeout

Data Type

int

Default Value

300

Remarks

If the AsyncQueryTimeout property is set to 0, asynchronous operations will not time out; instead, they will run until they complete successfully or encounter an error condition. This property is distinct from Timeout which applies to individual operations while AsyncQueryTimeout applies to execution time of the operation as a whole.

If AsyncQueryTimeout expires and the asynchronous request has not finished being processed, the driver raises an error condition.



BatchSize

Data Type

int

Default Value

0

Remarks

When BatchSize is set to a value greater than 0, the batch operation will split the entire batch into separate batches of size BatchSize. The split batches will then be submitted to the server individually. This is useful when the server has limitations on the size of the request that can be submitted.

Setting BatchSize to 0 will submit the entire batch as specified.



ConnectionLifeTime

Data Type

int

Default Value

0

Remarks

The maximum lifetime of a connection in seconds. Once the time has elapsed, the connection object is disposed. The default is 0 which indicates there is no limit to the connection lifetime.



ConnectOnOpen

Data Type

bool

Default Value

false

Remarks

When set to true, a connection will be made to Apache Hive when the connection is opened. This property enables the Test Connection feature available in various database tools.

This feature acts as a NOOP command as it is used to verify a connection can be made to Apache Hive and nothing from this initial connection is maintained.

Setting this property to false may provide performance improvements (depending upon the number of times a connection is opened).



EnableXSRF

Data Type

bool

Default Value

false

Remarks

This option specifies whether to add the X-XSRF-Header filter to HiveServer2 HTTP mode.



FailoverHosts

Data Type

string

Default Value

""

Remarks

Specify both a server and port with multiple values separated with commas. For example 'server1:port1,server2,port2'.

The Server and Port values take precedence and will be used to make the initial connection. Upon a failure to connect to the main host, the list of FailoverHosts will be used to attempt the connection.



HTTPPath

Data Type

string

Default Value

"cliservice"

Remarks

This property is used to specify the path component of the URL endpoint when using HTTP TransportMode.

This property should be set to the value specified in the 'hive.server2.thrift.http.path' property of you Hive configuration file (hive-site.xml).



MaxRows

Data Type

int

Default Value

-1

Remarks

Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time.



Other

Data Type

string

Default Value

""

Remarks

The properties listed below are available for specific use cases. Normal driver use cases and functionality should not require these properties.

Specify multiple properties in a semicolon-separated list.

Caching Configuration

CachePartial=TrueCaches only a subset of columns, which you can specify in your query.
QueryPassthrough=TruePasses the specified query to the cache database instead of using the SQL parser of the driver.

Integration and Formatting

DefaultColumnSizeSets the default length of string fields when the data source does not provide column length in the metadata. The default value is 2000.
ConvertDateTimeToGMTDetermines whether to convert date-time values to GMT, instead of the local time of the machine.
RecordToFile=filenameRecords the underlying socket data transfer to the specified file.



Pagesize

Data Type

int

Default Value

10000

Remarks

The Pagesize property affects the maximum number of results to return per page from Apache Hive. Setting a higher value may result in better performance at the cost of additional memory allocated per page consumed.



PoolIdleTimeout

Data Type

int

Default Value

60

Remarks

The allowed idle time a connection can remain in the pool until the connection is closed. The default is 60 seconds.



PoolMaxSize

Data Type

int

Default Value

100

Remarks

The maximum connections in the pool. The default is 100. To disable this property, set the property value to 0 or less.



PoolMinSize

Data Type

int

Default Value

1

Remarks

The minimum number of connections in the pool. The default is 1.



PoolWaitTime

Data Type

int

Default Value

60

Remarks

The max seconds to wait for a connection to become available. If a new connection request is waiting for an available connection and exceeds this time, an error is thrown. By default, new requests wait forever for an available connection.



PseudoColumns

Data Type

string

Default Value

""

Remarks

This setting is particularly helpful in Entity Framework, which does not allow you to set a value for a pseudo column unless it is a table column. The value of this connection setting is of the format "Table1=Column1, Table1=Column2, Table2=Column3". You can use the "*" character to include all tables and all columns; for example, "*=*".



QueryPassthrough

Data Type

bool

Default Value

false

Remarks

When this is set, queries are passed through directly to Apache Hive.



Readonly

Data Type

bool

Default Value

false

Remarks

If this property is set to true, the driver will allow only SELECT queries. INSERT, UPDATE, DELETE, and stored procedure queries will cause an error to be thrown.



RTK

Data Type

string

Default Value

""

Remarks

The RTK property may be used to license a build. See the included licensing file to see how to set this property. The runtime key is only available if you purchased an OEM license.



ServerConfigurations

Data Type

string

Default Value

""

Remarks

This property takes a comma-separated list of Hive configuration variables. Each value in the comma-separated list will be sent as specified to the Hive server via the 'set' command to override the server default values.

Example: hive.enforce.bucketing=true,hive.enforce.sorting=true



Timeout

Data Type

int

Default Value

60

Remarks

If Timeout = 0, operations do not time out. The operations run until they complete successfully or until they encounter an error condition.

If Timeout expires and the operation is not yet complete, the driver throws an exception.



UseConnectionPooling

Data Type

bool

Default Value

false

Remarks

This property enables connection pooling. The default is false. See Connection Pooling for information on using connection pools.



UseDescTableQuery

Data Type

bool

Default Value

false

Remarks

When set to true, a DESC TABLE query will be issued to retrieve the columns for the table.



UserDefinedViews

Data Type

string

Default Value

""

Remarks

User Defined Views are defined in a JSON-formatted configuration file called UserDefinedViews.json. The driver automatically detects the views specified in this file.

You can also have multiple view definitions and control them using the UserDefinedViews connection property. When you use this property, only the specified views are seen by the driver.

This User Defined View configuration file is formatted as follows:

For example:

{

	"MyView": {

		"query": "SELECT * FROM [CData].[Default].Customers WHERE MyColumn = 'value'"

	},

	"MyView2": {

		"query": "SELECT * FROM MyTable WHERE Id IN (1,2,3)"

	}

}
Use the UserDefinedViews connection property to specify the location of your JSON configuration file. For example:
"UserDefinedViews", "C:\\Users\\yourusername\\Desktop\\tmp\\UserDefinedViews.json"



UseShowDatabasesQuery

Data Type

bool

Default Value

false

Remarks

When set to true, a SHOW DATABASES query will be issued to retrieve the schemas.



UseShowTablesQuery

Data Type

bool

Default Value

false

Remarks

When set to true, a SHOW TABLES query will be issued to retrieve the tables for the database.



UseSSL

Data Type

bool

Default Value

false

Remarks

Set this property to the value specified in the 'hive.server2.use.SSL' property of your Hive configuration file (hive-site.xml).



UseZookeeperDiscovery

Data Type

bool

Default Value

false

Remarks

When set to 'True', Hive servers will be discovered via the ZooKeeper service.

The Server and Port must be set to the configured ZooKeeper server and port. The ZookeeperNamespace property must be set to the namespace configured on ZooKeeper for the Hive Server 2 znodes.

If multiple ZooKeeper servers are available, FailoverHosts can be set to the additional ZooKeeper instances.



ZookeeperNamespace

Data Type

string

Default Value

""

Remarks

This property is only applicable when UseZookeeperDiscovery is set to 'True'.