Matillion Data Model for Elasticsearch
Version - 21.0.8137.0

Note: If you're using Matillion ETL, we advise you update to the latest version to ensure that your instance is reflective of the information displayed in the data model. This note does not apply to the Data Productivity Cloud.



Connection String OptionsBack To Top

  1. AuthScheme
  2. User
  3. Password
  4. Server
  5. Port
  6. APIKey
  7. APIKeyId
  8. DataModel
  9. AWSAccessKey
  10. AWSSecretKey
  11. AWSRoleARN
  12. AWSRegion
  13. AWSSessionToken
  14. TemporaryTokenDuration
  15. KerberosKDC
  16. KerberosRealm
  17. KerberosSPN
  18. KerberosKeytabFile
  19. KerberosServiceRealm
  20. KerberosServiceKDC
  21. KerberosTicketCache
  22. SSLClientCert
  23. SSLClientCertType
  24. SSLClientCertPassword
  25. SSLClientCertSubject
  26. SSLServerCert
  27. FirewallType
  28. FirewallServer
  29. FirewallPort
  30. FirewallUser
  31. FirewallPassword
  32. ProxyAutoDetect
  33. ProxyServer
  34. ProxyPort
  35. ProxyAuthScheme
  36. ProxyUser
  37. ProxyPassword
  38. ProxySSLType
  39. ProxyExceptions
  40. Logfile
  41. Verbosity
  42. LogModules
  43. MaxLogFileSize
  44. MaxLogFileCount
  45. Location
  46. BrowsableSchemas
  47. Tables
  48. Views
  49. FlattenObjects
  50. FlattenArrays
  51. AutoCache
  52. CacheDriver
  53. CacheConnection
  54. CacheLocation
  55. CacheTolerance
  56. Offline
  57. CacheMetadata
  58. BatchSize
  59. ClientSideEvaluation
  60. ConnectionLifeTime
  61. ConnectOnOpen
  62. GenerateSchemaFiles
  63. MaxResults
  64. MaxRows
  65. Other
  66. PageSize
  67. PoolIdleTimeout
  68. PoolMaxSize
  69. PoolMinSize
  70. PoolWaitTime
  71. PseudoColumns
  72. QueryPassthrough
  73. Readonly
  74. RowScanDepth
  75. RTK
  76. ScrollDuration
  77. SupportEnhancedSQL
  78. Timeout
  79. UseConnectionPooling

AuthScheme

Data Type

string

Default Value

"None"

Remarks

This field is used to authenticate against the server. Use the following options to select your authentication scheme:



User

Data Type

string

Default Value

""

Remarks

The user who is authenticating to Elasticsearch.



Password

Data Type

string

Default Value

""

Remarks

The password used to authenticate to Elasticsearch.



Server

Data Type

string

Default Value

""

Remarks

The host name or IP address of the Elasticsearch REST server.

To use SSL, prefix the host name or IP address with 'https://' and set SSL connection properties such as SSLServerCert.



Port

Data Type

string

Default Value

"9200"

Remarks

The port the Elasticsearch REST server is bound to.



APIKey

Data Type

string

Default Value

""

Remarks

The APIKey used to authenticate to Elasticsearch.



APIKeyId

Data Type

string

Default Value

""

Remarks

The APIKey Id to authenticate to Elasticsearch.



DataModel

Data Type

string

Default Value

"Document"

Remarks

Select a DataModel configuration to configure how the driver models nested documents into tables. See Parsing Hierarchical Data for examples of querying the data in the different configurations.

Selecting a Data Modeling Strategy

The following DataModel configurations are available. See Parsing Hierarchical Data for examples of querying the data in the different configurations.

See Also



AWSAccessKey

Data Type

string

Default Value

""

Remarks

Your AWS account access key. This value is accessible from your AWS security credentials page:

  1. Sign into the AWS Management console with the credentials for your root account.
  2. Select your account name or number and select My Security Credentials in the menu that is displayed.
  3. Click Continue to Security Credentials and expand the Access Keys section to manage or create root account access keys.



AWSSecretKey

Data Type

string

Default Value

""

Remarks

Your AWS account secret key. This value is accessible from your AWS security credentials page:

  1. Sign into the AWS Management console with the credentials for your root account.
  2. Select your account name or number and select My Security Credentials in the menu that is displayed.
  3. Click Continue to Security Credentials and expand the Access Keys section to manage or create root account access keys.



AWSRoleARN

Data Type

string

Default Value

""

Remarks

When authenticating outside of AWS, it is common to use a Role for authentication instead of your direct AWS account credentials. Entering the AWSRoleARN will cause the CData JDBC Driver for Elasticsearch to perform a role based authentication instead of using the AWSAccessKey and AWSSecretKey directly. The AWSAccessKey and AWSSecretKey must still be specified to perform this authentication. You cannot use the credentials of an AWS root user when setting RoleARN. The AWSAccessKey and AWSSecretKey must be those of an IAM user.



AWSRegion

Data Type

string

Default Value

"NORTHERNVIRGINIA"

Remarks

The hosting region for your Amazon Web Services. Available values are OHIO, NORTHERNVIRGINIA, NORTHERNCALIFORNIA, OREGON, CAPETOWN, HONGKONG, MUMBAI, OSAKA, SEOUL, SINGAPORE, SYDNEY, TOKYO, CENTRAL, BEIJING, NINGXIA, FRANKFURT, IRELAND, LONDON, MILAN, PARIS, STOCKHOLM, BAHRAIN, SAOPAULO, GOVCLOUDEAST, and GOVCLOUDWEST.



AWSSessionToken

Data Type

string

Default Value

""

Remarks

Your AWS session token. This value can be retrieved in different ways. See this link for more info.



TemporaryTokenDuration

Data Type

string

Default Value

"3600"

Remarks

Temporary tokens are used with Role based authentication. Temporary tokens will eventually time out, at which time a new temporary token must be obtained. The CData JDBC Driver for Elasticsearch will internally request a new temporary token once the temporary token has expired.

For Role based authentication, the minimum duration is 900 seconds (15 minutes) while the maximum if 3600 (1 hour).



KerberosKDC

Data Type

string

Default Value

""

Remarks

The Kerberos properties are used when using SPNEGO or Windows Authentication. The driver will request session tickets and temporary session keys from the Kerberos KDC service. The Kerberos KDC service is conventionally colocated with the domain controller.

If Kerberos KDC is not specified, the driver will attempt to detect these properties automatically from the following locations:

Note: Windows authentication is supported in JRE 1.6 and above only.



KerberosRealm

Data Type

string

Default Value

""

Remarks

The Kerberos properties are used when using SPNEGO or Windows Authentication. The Kerberos Realm is used to authenticate the user with the Kerberos Key Distribution Service (KDC). The Kerberos Realm can be configured by an administrator to be any string, but conventionally it is based on the domain name.

If Kerberos Realm is not specified the driver will attempt to detect these properties automatically from the following locations:

Note: Kerberos-based authentication is supported in JRE 1.6 and above only.



KerberosSPN

Data Type

string

Default Value

""

Remarks

If the SPN on the Kerberos Domain Controller is not the same as the URL that you are authenticating to, use this property to set the SPN.



KerberosKeytabFile

Data Type

string

Default Value

""

Remarks

The Keytab file containing your pairs of Kerberos principals and encrypted keys.



KerberosServiceRealm

Data Type

string

Default Value

""

Remarks

The KerberosServiceRealm is the specify the service Kerberos realm when using cross-realm Kerberos authentication.

In most cases, a single realm and KDC machine are used to perform the Kerberos authentication and this property is not required.

This property is available for complex setups where a different realm and KDC machine are used to obtain an authentication ticket (AS request) and a service ticket (TGS request).



KerberosServiceKDC

Data Type

string

Default Value

""

Remarks

The KerberosServiceKDC is used to specify the service Kerberos KDC when using cross-realm Kerberos authentication.

In most cases, a single realm and KDC machine are used to perform the Kerberos authentication and this property is not required.

This property is available for complex setups where a different realm and KDC machine are used to obtain an authentication ticket (AS request) and a service ticket (TGS request).



KerberosTicketCache

Data Type

string

Default Value

""

Remarks

This property can be set if you wish to use a credential cache file that was created using the MIT Kerberos Ticket Manager or kinit command.



SSLClientCert

Data Type

string

Default Value

""

Remarks

The name of the certificate store for the client certificate.

The SSLClientCertType field specifies the type of the certificate store specified by SSLClientCert. If the store is password protected, specify the password in SSLClientCertPassword.

SSLClientCert is used in conjunction with the SSLClientCertSubject field in order to specify client certificates. If SSLClientCert has a value, and SSLClientCertSubject is set, a search for a certificate is initiated. See SSLClientCertSubject for more information.

Designations of certificate stores are platform-dependent.

The following are designations of the most common User and Machine certificate stores in Windows:

MYA certificate store holding personal certificates with their associated private keys.
CACertifying authority certificates.
ROOTRoot certificates.
SPCSoftware publisher certificates.

In Java, the certificate store normally is a file containing certificates and optional private keys.

When the certificate store type is PFXFile, this property must be set to the name of the file. When the type is PFXBlob, the property must be set to the binary contents of a PFX file (for example, PKCS12 certificate store).



SSLClientCertType

Data Type

string

Default Value

"USER"

Remarks

This property can take one of the following values:

USER - defaultFor Windows, this specifies that the certificate store is a certificate store owned by the current user. Note that this store type is not available in Java.
MACHINEFor Windows, this specifies that the certificate store is a machine store. Note that this store type is not available in Java.
PFXFILEThe certificate store is the name of a PFX (PKCS12) file containing certificates.
PFXBLOBThe certificate store is a string (base-64-encoded) representing a certificate store in PFX (PKCS12) format.
JKSFILEThe certificate store is the name of a Java key store (JKS) file containing certificates. Note that this store type is only available in Java.
JKSBLOBThe certificate store is a string (base-64-encoded) representing a certificate store in JKS format. Note that this store type is only available in Java.
PEMKEY_FILEThe certificate store is the name of a PEM-encoded file that contains a private key and an optional certificate.
PEMKEY_BLOBThe certificate store is a string (base64-encoded) that contains a private key and an optional certificate.
PUBLIC_KEY_FILEThe certificate store is the name of a file that contains a PEM- or DER-encoded public key certificate.
PUBLIC_KEY_BLOBThe certificate store is a string (base-64-encoded) that contains a PEM- or DER-encoded public key certificate.
SSHPUBLIC_KEY_FILEThe certificate store is the name of a file that contains an SSH-style public key.
SSHPUBLIC_KEY_BLOBThe certificate store is a string (base-64-encoded) that contains an SSH-style public key.
P7BFILEThe certificate store is the name of a PKCS7 file containing certificates.
PPKFILEThe certificate store is the name of a file that contains a PuTTY Private Key (PPK).
XMLFILEThe certificate store is the name of a file that contains a certificate in XML format.
XMLBLOBThe certificate store is a string that contains a certificate in XML format.



SSLClientCertPassword

Data Type

string

Default Value

""

Remarks

If the certificate store is of a type that requires a password, this property is used to specify that password to open the certificate store.



SSLClientCertSubject

Data Type

string

Default Value

"*"

Remarks

When loading a certificate the subject is used to locate the certificate in the store.

If an exact match is not found, the store is searched for subjects containing the value of the property. If a match is still not found, the property is set to an empty string, and no certificate is selected.

The special value "*" picks the first certificate in the certificate store.

The certificate subject is a comma separated list of distinguished name fields and values. For example, "CN=www.server.com, OU=test, C=US, E=support@company.com". The common fields and their meanings are shown below.

FieldMeaning
CNCommon Name. This is commonly a host name like www.server.com.
OOrganization
OUOrganizational Unit
LLocality
SState
CCountry
EEmail Address

If a field value contains a comma, it must be quoted.



SSLServerCert

Data Type

string

Default Value

""

Remarks

If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate to be accepted from the server. Any other certificate that is not trusted by the machine is rejected.

This property can take the following forms:

Description Example
A full PEM Certificate (example shortened for brevity) -----BEGIN CERTIFICATE----- MIIChTCCAe4CAQAwDQYJKoZIhv......Qw== -----END CERTIFICATE-----
A path to a local file containing the certificate C:\cert.cer
The public key (example shortened for brevity) -----BEGIN RSA PUBLIC KEY----- MIGfMA0GCSq......AQAB -----END RSA PUBLIC KEY-----
The MD5 Thumbprint (hex values can also be either space or colon separated) ecadbdda5a1529c58a1e9e09828d70e4
The SHA1 Thumbprint (hex values can also be either space or colon separated) 34a929226ae0819f2ec14b4a3d904f801cbb150d

If not specified, any certificate trusted by the machine is accepted.

Certificates are validated as trusted by the machine based on the System's trust store. The trust store used is the 'javax.net.ssl.trustStore' value specified for the system. If no value is specified for this property, Java's default trust store is used (for example, JAVA_HOME\lib\security\cacerts).

Use '*' to signify to accept all certificates. Note that this is not recommended due to security concerns.



FirewallType

Data Type

string

Default Value

"NONE"

Remarks

This property specifies the protocol that the driver will use to tunnel traffic through the FirewallServer proxy. Note that by default, the driver connects to the system proxy; to disable this behavior and connect to one of the following proxy types, set ProxyAutoDetect to false.

Type Default Port Description
TUNNEL 80 When this is set, the driver opens a connection to Elasticsearch and traffic flows back and forth through the proxy.
SOCKS4 1080 When this is set, the driver sends data through the SOCKS 4 proxy specified by FirewallServer and FirewallPort and passes the FirewallUser value to the proxy, which determines if the connection request should be granted.
SOCKS5 1080 When this is set, the driver sends data through the SOCKS 5 proxy specified by FirewallServer and FirewallPort. If your proxy requires authentication, set FirewallUser and FirewallPassword to credentials the proxy recognizes.

To connect to HTTP proxies, use ProxyServer and ProxyPort. To authenticate to HTTP proxies, use ProxyAuthScheme, ProxyUser, and ProxyPassword.



FirewallServer

Data Type

string

Default Value

""

Remarks

This property specifies the IP address, DNS name, or host name of a proxy allowing traversal of a firewall. The protocol is specified by FirewallType: Use FirewallServer with this property to connect through SOCKS or do tunneling. Use ProxyServer to connect to an HTTP proxy.

Note that the driver uses the system proxy by default. To use a different proxy, set ProxyAutoDetect to false.



FirewallPort

Data Type

int

Default Value

0

Remarks

This specifies the TCP port for a proxy allowing traversal of a firewall. Use FirewallServer to specify the name or IP address. Specify the protocol with FirewallType.



FirewallUser

Data Type

string

Default Value

""

Remarks

The FirewallUser and FirewallPassword properties are used to authenticate against the proxy specified in FirewallServer and FirewallPort, following the authentication method specified in FirewallType.



FirewallPassword

Data Type

string

Default Value

""

Remarks

This property is passed to the proxy specified by FirewallServer and FirewallPort, following the authentication method specified by FirewallType.



ProxyAutoDetect

Data Type

bool

Default Value

false

Remarks

This takes precedence over other proxy settings, so you'll need to set ProxyAutoDetect to FALSE in order use custom proxy settings.

NOTE: When this property is set to True, the proxy used is determined as follows:

To connect to an HTTP proxy, see ProxyServer. For other proxies, such as SOCKS or tunneling, see FirewallType.



ProxyServer

Data Type

string

Default Value

""

Remarks

The hostname or IP address of a proxy to route HTTP traffic through. The driver can use the HTTP, Windows (NTLM), or Kerberos authentication types to authenticate to an HTTP proxy.

If you need to connect through a SOCKS proxy or tunnel the connection, see FirewallType.

By default, the driver uses the system proxy. If you need to use another proxy, set ProxyAutoDetect to false.



ProxyPort

Data Type

int

Default Value

80

Remarks

The port the HTTP proxy is running on that you want to redirect HTTP traffic through. Specify the HTTP proxy in ProxyServer. For other proxy types, see FirewallType.



ProxyAuthScheme

Data Type

string

Default Value

"BASIC"

Remarks

This value specifies the authentication type to use to authenticate to the HTTP proxy specified by ProxyServer and ProxyPort.

Note that the driver will use the system proxy settings by default, without further configuration needed; if you want to connect to another proxy, you will need to set ProxyAutoDetect to false, in addition to ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.

The authentication type can be one of the following:

If you need to use another authentication type, such as SOCKS 5 authentication, see FirewallType.



ProxyUser

Data Type

string

Default Value

""

Remarks

The ProxyUser and ProxyPassword options are used to connect and authenticate against the HTTP proxy specified in ProxyServer.

You can select one of the available authentication types in ProxyAuthScheme. If you are using HTTP authentication, set this to the user name of a user recognized by the HTTP proxy. If you are using Windows or Kerberos authentication, set this property to a user name in one of the following formats:

user@domain

domain\user



ProxyPassword

Data Type

string

Default Value

""

Remarks

This property is used to authenticate to an HTTP proxy server that supports NTLM (Windows), Kerberos, or HTTP authentication. To specify the HTTP proxy, you can set ProxyServer and ProxyPort. To specify the authentication type, set ProxyAuthScheme.

If you are using HTTP authentication, additionally set ProxyUser and ProxyPassword to HTTP proxy.

If you are using NTLM authentication, set ProxyUser and ProxyPassword to your Windows password. You may also need these to complete Kerberos authentication.

For SOCKS 5 authentication or tunneling, see FirewallType.

By default, the driver uses the system proxy. If you want to connect to another proxy, set ProxyAutoDetect to false.



ProxySSLType

Data Type

string

Default Value

"AUTO"

Remarks

This property determines when to use SSL for the connection to an HTTP proxy specified by ProxyServer. This value can be AUTO, ALWAYS, NEVER, or TUNNEL. The applicable values are the following:

AUTODefault setting. If the URL is an HTTPS URL, the driver will use the TUNNEL option. If the URL is an HTTP URL, the component will use the NEVER option.
ALWAYSThe connection is always SSL enabled.
NEVERThe connection is not SSL enabled.
TUNNELThe connection is through a tunneling proxy. The proxy server opens a connection to the remote host and traffic flows back and forth through the proxy.



ProxyExceptions

Data Type

string

Default Value

""

Remarks

The ProxyServer is used for all addresses, except for addresses defined in this property. Use semicolons to separate entries.

Note that the driver uses the system proxy settings by default, without further configuration needed; if you want to explicitly configure proxy exceptions for this connection, you need to set ProxyAutoDetect = false, and configure ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.



Logfile

Data Type

string

Default Value

""

Remarks

Once this property is set, the driver will populate the log file as it carries out various tasks, such as when authentication is performed or queries are executed. If the specified file doesn't already exist, it will be created.

Connection strings and version information are also logged, though connection properties containing sensitive information are masked automatically.

If a relative filepath is supplied, the location of the log file will be resolved based on the path found in the Location connection property.

For more control over what is written to the log file, you can adjust the Verbosity property.

Log contents are categorized into several modules. You can show/hide individual modules using the LogModules property.

To edit the maximum size of a single logfile before a new one is created, see MaxLogFileSize.

If you would like to place a cap on the number of logfiles generated, use MaxLogFileCount.

Java Logging

Java logging is also supported. To enable Java logging, set Logfile to:

Logfile=JAVALOG://myloggername

As in the above sample, JAVALOG:// is a required prefix to use Java logging, and you will substitute your own Logger.

The supplied Logger's getLogger method is then called, using the supplied value to create the Logger instance. If a logging instance already exists, it will reference the existing instance.

When Java logging is enabled, the Verbosity will now correspond to specific logging levels.



Verbosity

Data Type

string

Default Value

"1"

Remarks

The verbosity level determines the amount of detail that the driver reports to the Logfile. Verbosity levels from 1 to 5 are supported. These are detailed in the Logging page.



LogModules

Data Type

string

Default Value

""

Remarks

Only the modules specified (separated by ';') will be included in the log file. By default all modules are included.

See the Logging page for an overview.



MaxLogFileSize

Data Type

string

Default Value

"100MB"

Remarks

When the limit is hit, a new log is created in the same folder with the date and time appended to the end. The default limit is 100 MB. Values lower than 100 kB will use 100 kB as the value instead.

Adjust the maximum number of logfiles generated with MaxLogFileCount.



MaxLogFileCount

Data Type

int

Default Value

-1

Remarks

When the limit is hit, a new log is created in the same folder with the date and time appended to the end and the oldest log file will be deleted.

The minimum supported value is 2. A value of 0 or a negative value indicates no limit on the count.

Adjust the maximum size of the logfiles generated with MaxLogFileSize.



Location

Data Type

string

Default Value

"%APPDATA%\\CData\\Elasticsearch Data Provider\\Schema"

Remarks

The path to a directory which contains the schema files for the driver (.rsd files for tables and views, .rsb files for stored procedures). The folder location can be a relative path from the location of the executable. The Location property is only needed if you want to customize definitions (for example, change a column name, ignore a column, and so on) or extend the data model with new tables, views, or stored procedures.

If left unspecified, the default location is "%APPDATA%\\CData\\Elasticsearch Data Provider\\Schema" with %APPDATA% being set to the user's configuration directory:

Platform %APPDATA%
Windows The value of the APPDATA environment variable
Mac ~/Library/Application Support
Linux ~/.config



BrowsableSchemas

Data Type

string

Default Value

""

Remarks

Listing the schemas from databases can be expensive. Providing a list of schemas in the connection string improves the performance.



Tables

Data Type

string

Default Value

""

Remarks

Listing the tables from some databases can be expensive. Providing a list of tables in the connection string improves the performance of the driver.

This property can also be used as an alternative to automatically listing views if you already know which ones you want to work with and there would otherwise be too many to work with.

Specify the tables you want in a comma-separated list. Each table should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Tables=TableA,[TableB/WithSlash],WithCatalog.WithSchema.`TableC With Space`.

Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the table in this property, as in the last example here, to avoid ambiguity between tables that exist in multiple catalogs or schemas.



Views

Data Type

string

Default Value

""

Remarks

Listing the views from some databases can be expensive. Providing a list of views in the connection string improves the performance of the driver.

This property can also be used as an alternative to automatically listing views if you already know which ones you want to work with and there would otherwise be too many to work with.

Specify the views you want in a comma-separated list. Each view should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Views=ViewA,[ViewB/WithSlash],WithCatalog.WithSchema.`ViewC With Space`.

Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the table in this property, as in the last example here, to avoid ambiguity between tables that exist in multiple catalogs or schemas.



FlattenObjects

Data Type

bool

Default Value

true

Remarks

Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON. The property name is concatenated onto the object name with a period to generate the column name.

For example, you can flatten the nested objects below at connection time:

"manager": {

  "name": "Alice White",

  "age": 30

}
When FlattenObjects is set to true, the preceding object is flattened into the following table:

Column NameColumn Value
manager.nameAlice White
manager.age30



FlattenArrays

Data Type

string

Default Value

""

Remarks

By default, nested arrays are returned as strings of JSON. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. This is only recommended for arrays that are expected to be short.

Set FlattenArrays to the number of elements you want to return from nested arrays. The specified elements are returned as columns. The zero-based index is concatenated to the column name. Other elements are ignored.

For example, you can return an arbitrary number of elements from an array of strings:

"employees": [

  {

    "name": "John Smith",

    "age": 34

  },

  {

    "name": "Peter Brown",

    "age": 26

  },

  {

    "name": "Paul Jacobs",

    "age": 30

  }

]
When FlattenArrays is set to 2, the preceding array is flattened into the following table:

Column NameColumn Value
employees.0.nameJohn Smith
employees.0.age34
employees.1.namePeter Brown
employees.1.age26

See JSON Functions to use JSON paths to work with unbounded arrays.



AutoCache

Data Type

bool

Default Value

false

Remarks

When AutoCache = true, the driver automatically maintains a cache of your table's data in the database of your choice.

Setting the Caching Database

When AutoCache = true, the driver caches to a simple, file-based cache. You can configure its location or cache to a different database with the following properties:

See Also



CacheDriver

Data Type

string

Default Value

""

Remarks

You can cache to any database for which you have a JDBC driver, including CData JDBC drivers.

The cache database is determined based on the CacheDriver and CacheConnection properties. The CacheDriver is the name of the JDBC driver class that you want to use to cache data.

Note that you must also add the CacheDriver JAR file to the classpath.

The following examples show how to cache to several major databases. Refer to CacheConnection for more information on the JDBC URL syntax and typical connection properties.

Derby and Java DB

The driver simplifies Derby configuration. Java DB is the Oracle distribution of Derby. The JAR file is shipped in the JDK. You can find the JAR file, derby.jar, in the db subfolder of the JDK installation. In most caching scenarios, you need to specify only the following, after adding derby.jar to the classpath:

jdbc:elasticsearch:CacheLocation='c:/Temp/cachedir';Server=127.0.0.1;Port=9200;
To customize the Derby JDBC URL, use CacheDriver and CacheConnection. For example, to cache to an in-memory database, use a JDBC URL like the following:
jdbc:elasticsearch:CacheDriver=org.apache.derby.jdbc.EmbeddedDriver;CacheConnection='jdbc:derby:memory';Server=127.0.0.1;Port=9200;
SQLite

The following is a JDBC URL for the SQLite JDBC driver:

jdbc:elasticsearch:CacheDriver=org.sqlite.JDBC;CacheConnection='jdbc:sqlite:C:/Temp/sqlite.db';Server=127.0.0.1;Port=9200;
MySQL

The following is a JDBC URL for the included CData JDBC Driver for MySQL:

  jdbc:elasticsearch:Cache Driver=cdata.jdbc.mysql.MySQLDriver;Cache Connection='jdbc:mysql:Server=localhost;Port=3306;Database=cache;User=root;Password=123456';Server=127.0.0.1;Port=9200;

  
SQL Server

The following JDBC URL uses the Microsoft JDBC Driver for SQL Server:

jdbc:elasticsearch:Cache Driver=com.microsoft.sqlserver.jdbc.SQLServerDriver;Cache Connection='jdbc:sqlserver://localhost\sqlexpress:7437;user=sa;password=123456;databaseName=Cache';Server=127.0.0.1;Port=9200;
Oracle

The following is a JDBC URL for the Oracle Thin Client:

jdbc:elasticsearch:Cache Driver=oracle.jdbc.OracleDriver;CacheConnection='jdbc:oracle:thin:scott/tiger@localhost:1521:orcldb';Server=127.0.0.1;Port=9200;
NOTE: If using a version of Oracle older than 9i, the cache driver will instead be oracle.jdbc.driver.OracleDriver .
PostgreSQL

The following JDBC URL uses the official PostgreSQL JDBC driver:

jdbc:elasticsearch:CacheDriver=cdata.jdbc.postgresql.PostgreSQLDriver;CacheConnection='jdbc:postgresql:User=postgres;Password=admin;Database=postgres;Server=localhost;Port=5432;';Server=127.0.0.1;Port=9200;



CacheConnection

Data Type

string

Default Value

""

Remarks

The cache database is determined based on the CacheDriver and CacheConnection properties. Both properties are required to use the cache database. Examples of common cache database settings can be found below. For more information on setting the caching database's driver, refer to CacheDriver.

The connection string specified in the CacheConnection property is passed directly to the underlying CacheDriver. Consult the documentation for the specific JDBC driver for more information on the available properties. Make sure to include the JDBC driver in your application's classpath.

Derby and Java DB

The driver simplifies caching to Derby, only requiring you to set the CacheLocation property to make a basic connection.

Alternatively, you can configure the connection to Derby manually using CacheDriver and CacheConnection. The following is the Derby JDBC URL syntax:

jdbc:derby:[subsubprotocol:][databaseName][;attribute=value[;attribute=value] ... ]
For example, to cache to an in-memory database, use the following:
jdbc:derby:memory

SQLite

To cache to SQLite, you can use the SQLite JDBC driver. The following is the syntax of the JDBC URL:

jdbc:sqlite:dataSource

MySQL

The installation includes the CData JDBC Driver for MySQL. The following is an example JDBC URL:

jdbc:mysql:User=root;Password=root;Server=localhost;Port=3306;Database=cache
The following are typical connection properties:

SQL Server

The JDBC URL for the Microsoft JDBC Driver for SQL Server has the following syntax:

jdbc:sqlserver://[serverName[\instance][:port]][;database=databaseName][;property=value[;property=value] ... ]
For example:
jdbc:sqlserver://localhost\sqlexpress:1433;integratedSecurity=true
The following are typical SQL Server connection properties:
Oracle

The following is the conventional JDBC URL syntax for the Oracle JDBC Thin driver:

jdbc:oracle:thin:[userId/password]@[//]host[[:port][:sid]]
For example:
jdbc:oracle:thin:scott/tiger@myhost:1521:orcl
The following are typical connection properties:
PostgreSQL

The following is the JDBC URL syntax for the official PostgreSQL JDBC driver:

jdbc:postgresql:[//[host[:port]]/]database[[?option=value][[&option=value][&option=value] ... ]]
For example, the following connection string connects to a database on the default host (localhost) and port (5432):
jdbc:postgresql:postgres
The following are typical connection properties:



CacheLocation

Data Type

string

Default Value

"%APPDATA%\\CData\\Elasticsearch Data Provider"

Remarks

The CacheLocation is a simple, file-based cache. The driver uses Java DB, Oracle's distribution of the Derby database. To cache to Java DB, you will need to add the Java DB JAR file to the classpath. The JAR file, derby.jar, is shipped in the JDK and located in the db subfolder of the JDK installation.

If left unspecified, the default location is "%APPDATA%\\CData\\Elasticsearch Data Provider" with %APPDATA% being set to the user's configuration directory:

Platform %APPDATA%
Windows The value of the APPDATA environment variable
Mac ~/Library/Application Support
Linux ~/.config

See Also



CacheTolerance

Data Type

int

Default Value

600

Remarks

The tolerance for stale data in the cache specified in seconds. This only applies when AutoCache is used. The driver checks with the data source for newer records after the tolerance interval has expired. Otherwise, it returns the data directly from the cache.



Offline

Data Type

bool

Default Value

false

Remarks

When Offline = true, all queries execute against the cache as opposed to the live data source. In this mode, certain queries like INSERT, UPDATE, DELETE, and CACHE are not allowed.



CacheMetadata

Data Type

bool

Default Value

false

Remarks

As you execute queries with this property set, table metadata in the Elasticsearch catalog are cached to the file store specified by CacheLocation if set or the user's home directory otherwise. A table's metadata will be retrieved only once, when the table is queried for the first time.

When to Use CacheMetadata

The driver automatically persists metadata in memory for up to two hours when you first discover the metadata for a table or view and therefore, CacheMetadata is generally not required. CacheMetadata becomes useful when metadata operations are expensive such as when you are working with large amounts of metadata or when you have many short-lived connections.

When Not to Use CacheMetadata



BatchSize

Data Type

int

Default Value

0

Remarks

When BatchSize is set to a value greater than 0, the batch operation will split the entire batch into separate batches of size BatchSize. The split batches will then be submitted to the server individually. This is useful when the server has limitations on the size of the request that can be submitted.

Setting BatchSize to 0 will submit the entire batch as specified.



ClientSideEvaluation

Data Type

bool

Default Value

false

Remarks

Set ClientSideEvaluation to true to perform Evaluation (GROUP BY, filtering) client side on nested objects.

For example, with ClientSideEvaluation set to false(default value), GROUP BY on nested object 'property.0.name' would be grouped as 'property.*.name', while if set to true, results would be grouped as 'property.0.name'.

Similarly, with ClientSideEvaluation set to false(default value), filtering on nested object 'property.0.name' would be filtered as 'property.*.name', while if set to true, results would be filtered as 'property.0.name'.

This would affect performance as query is evaluated client side.



ConnectionLifeTime

Data Type

int

Default Value

0

Remarks

The maximum lifetime of a connection in seconds. Once the time has elapsed, the connection object is disposed. The default is 0 which indicates there is no limit to the connection lifetime.



ConnectOnOpen

Data Type

bool

Default Value

false

Remarks

When set to true, a connection will be made to Elasticsearch when the connection is opened. This property enables the Test Connection feature available in various database tools.

This feature acts as a NOOP command as it is used to verify a connection can be made to Elasticsearch and nothing from this initial connection is maintained.

Setting this property to false may provide performance improvements (depending upon the number of times a connection is opened).



GenerateSchemaFiles

Data Type

string

Default Value

"Never"

Remarks

This property outputs schemas to .rsd files in the path specified by Location.

Available settings are the following:

Note that if you want to regenerate a file, you will first need to delete it.
Generate Schemas with SQL

When you set GenerateSchemaFiles to OnUse, the driver generates schemas as you execute SELECT queries. Schemas are generated for each table referenced in the query.

When you set GenerateSchemaFiles to OnCreate, schemas are only generated when a CREATE TABLE query is executed.

Generate Schemas on Connection

Another way to use this property is to obtain schemas for every table in your database when you connect. To do so, set GenerateSchemaFiles to OnStart and connect.



MaxResults

Data Type

string

Default Value

"10000"

Remarks

This property corresponds to the Elasticsearch index.max_result_window index setting. Thus the default value is 10000, which is Elasticsearch's default limit.

This value is not applicable when using the Scroll API. Set ScrollDuration to use this API.

When a LIMIT is specified in a query, the LIMIT will be taken into account provided it is less than MaxResults. Otherwise the number of results returned will be limited to the MaxResults value.

If you receive an error stating that the result window is too large, this is caused by the MaxResults value being greater than the Elasticsearch index.max_result_window index setting. You can either change the MaxResults value to match the index.max_result_window index setting or use the Scroll API by setting ScrollDuration.



MaxRows

Data Type

int

Default Value

-1

Remarks

Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time.



Other

Data Type

string

Default Value

""

Remarks

The properties listed below are available for specific use cases. Normal driver use cases and functionality should not require these properties.

Specify multiple properties in a semicolon-separated list.

Caching Configuration

CachePartial=TrueCaches only a subset of columns, which you can specify in your query.
QueryPassthrough=TruePasses the specified query to the cache database instead of using the SQL parser of the driver.

Integration and Formatting

DefaultColumnSizeSets the default length of string fields when the data source does not provide column length in the metadata. The default value is 2000.
ConvertDateTimeToGMTDetermines whether to convert date-time values to GMT, instead of the local time of the machine.
RecordToFile=filenameRecords the underlying socket data transfer to the specified file.



PageSize

Data Type

string

Default Value

"10000"

Remarks

The PageSize can control the number of results received per request from Elasticsearch on a given query.

The default value is 10000, which is Elasticsearch's default limit (based on the Elasticsearch index.max_result_window index setting).



PoolIdleTimeout

Data Type

int

Default Value

60

Remarks

The allowed idle time a connection can remain in the pool until the connection is closed. The default is 60 seconds.



PoolMaxSize

Data Type

int

Default Value

100

Remarks

The maximum connections in the pool. The default is 100. To disable this property, set the property value to 0 or less.



PoolMinSize

Data Type

int

Default Value

1

Remarks

The minimum number of connections in the pool. The default is 1.



PoolWaitTime

Data Type

int

Default Value

60

Remarks

The max seconds to wait for a connection to become available. If a new connection request is waiting for an available connection and exceeds this time, an error is thrown. By default, new requests wait forever for an available connection.



PseudoColumns

Data Type

string

Default Value

""

Remarks

This setting is particularly helpful in Entity Framework, which does not allow you to set a value for a pseudo column unless it is a table column. The value of this connection setting is of the format "Table1=Column1, Table1=Column2, Table2=Column3". You can use the "*" character to include all tables and all columns; for example, "*=*".



QueryPassthrough

Data Type

bool

Default Value

false

Remarks

Setting this property to True enables the driver to pass an Elasticsearch query as-is to Elasticsearch. There are two options for submitting as-is queries to Elasticsearch: SQL and Search DSL.

SQL API

Elasticsearch version 6.3 and above supports a SQL API endpoint. When set to true, this option allows you to pass SQL queries directly to the Elasticsearch SQL API. Columns will be identified based on the metadata returned in the response.

Supported SQL syntax and commands can be found in the Elasticsearch documentation.

Note: SQL functionality is limited to what is supported by Elasticsearch.

Search DSL

Alternatively, queries can be submitted using Elasticsearch's Search DSL language, which includes Query DSL. This functionality is available in all versions of Elasticsearch.

The supported query syntax is JSON using the query passthrough syntax described below.

The JSON Passthrough Query Syntax supports the following elements:

Element Name Function
index The Elasticsearch index (or schema) to query. This is a JSON element that takes a string value.
type The Elasticsearch type (or table) to query within index. This is a JSON element that takes a string value.
docid The Id of the document to query within index.type. This is a JSON element that takes a string value.
apiendpoint The Elasticsearch API Endpoint to query. Default value is '_search'. This is a JSON element that takes a string value.
requestdata The raw Elasticsearch Search DSL that will be sent to Elasticsearch as is. The value is a JSON object that maps directly to the format required by Elasticsearch.

The index, type, docid, and apiendpoint are used to generate the URL where the requestdata will be sent. The URL is generated using the following format: [Server]:[Port]/[index]/[type]/[docid]/[apiendpoint]. If any of the JSON passthrough elements are not specified, they will not be added to the URL.

Below is an example of a passthrough query. This example will retrieve the first 10 documents from megacorp.employee that contain a last_name of 'smith'. The results will be ordered by first_name in descending order.

{ 

  "index": "megacorp", 

  "type": "employee", 

  "requestdata": 

  {

    "from": 0,

    "size": 10,

    "query": {"bool":{"must":{"term":{"last_name":"smith"}}}},

    "sort": {"first_name":{"order":"desc"}}

  }

}

When using QueryPassthrough queries, the metadata is determined by the data returned in the response. RowScanDepth identifies the depth of the records that will be scanned to determine the metadata (columns and types). Since the metadata is based on the response data, passthrough queries may display different metadata than a similar query performed using the SQL syntax (where the metadata is retrieved directly from Elasticsearch).



Readonly

Data Type

bool

Default Value

false

Remarks

If this property is set to true, the driver will allow only SELECT queries. INSERT, UPDATE, DELETE, and stored procedure queries will cause an error to be thrown.



RowScanDepth

Data Type

string

Default Value

"100"

Remarks

This property is used when generating table metadata and specifically is used to identify arrays within the data. Elasticsearch allows any field to be an array and does not identify which fields are arrays in the mapping data. Thus RowScanDepth rows will be queried and scanned to identify if any of the fields contain arrays.

When QueryPassthrough is set to True, the columns in a table must be determined by scanning the data returned in the request. This value determines the maximum number of rows that will be scanned to determine the table metadata. The default value is 100.

Setting a high value may decrease performance. Setting a low value may prevent the data type from being determined properly, especially when there is null data.



RTK

Data Type

string

Default Value

""

Remarks

The RTK property may be used to license a build. See the included licensing file to see how to set this property. The runtime key is only available if you purchased an OEM license.



ScrollDuration

Data Type

string

Default Value

"1m"

Remarks

When a nonzero value is specified, the Scroll API will be used.

The time unit specified will be sent in each request made to Elasticsearch to specify how long the server should keep the search context alive. The value specified only needs to be long enough to process the previous batch of results (not to process all the data). This is because the ScrollDuration value will be sent in each request, which will extend the context time.

Once all the results have been retrieved, the search context will be cleared.

The format for this value is: [integer][time unit]. For example: 1m = 1 minute.

Setting this property to '0' will cause the default Search API to be used. In such a case, the maximum number of results that can be returned are equal to MaxResults.

Supported Time Units:

Value Description
y Year
M Month
w Week
d Day
h Hour
m Minute
s Second
ms Milli-second



SupportEnhancedSQL

Data Type

bool

Default Value

true

Remarks

When SupportEnhancedSQL = true, the driver offloads as much of the SELECT statement processing as possible to Elasticsearch and then processes the rest of the query in memory. In this way, the driver can execute unsupported predicates, joins, and aggregation.

When SupportEnhancedSQL = false, the driver limits SQL execution to what is supported by the Elasticsearch API.

Execution of Predicates

The driver determines which of the clauses are supported by the data source and then pushes them to the source to get the smallest superset of rows that would satisfy the query. It then filters the rest of the rows locally. The filter operation is streamed, which enables the driver to filter effectively for even very large datasets.

Execution of Joins

The driver uses various techniques to join in memory. The driver trades off memory utilization against the requirement of reading the same table more than once.

Execution of Aggregates

The driver retrieves all rows necessary to process the aggregation in memory.



Timeout

Data Type

int

Default Value

60

Remarks

If Timeout = 0, operations do not time out. The operations run until they complete successfully or until they encounter an error condition.

If Timeout expires and the operation is not yet complete, the driver throws an exception.



UseConnectionPooling

Data Type

bool

Default Value

false

Remarks

This property enables connection pooling. The default is false. See Connection Pooling for information on using connection pools.