Matillion Data Model for Splunk
Version - 21.0.8137.0

Note: If you're using Matillion ETL, we advise you to update to the latest version so that your instance reflects the information displayed in the data model. This note does not apply to the Data Productivity Cloud.



Connection String Options

  1. AuthScheme
  2. URL
  3. User
  4. Password
  5. SSLServerCert
  6. FirewallType
  7. FirewallServer
  8. FirewallPort
  9. FirewallUser
  10. FirewallPassword
  11. ProxyAutoDetect
  12. ProxyServer
  13. ProxyPort
  14. ProxyAuthScheme
  15. ProxyUser
  16. ProxyPassword
  17. ProxySSLType
  18. ProxyExceptions
  19. Logfile
  20. Verbosity
  21. LogModules
  22. MaxLogFileSize
  23. MaxLogFileCount
  24. Location
  25. BrowsableSchemas
  26. Tables
  27. Views
  28. AutoCache
  29. CacheDriver
  30. CacheConnection
  31. CacheLocation
  32. CacheTolerance
  33. Offline
  34. CacheMetadata
  35. BatchSize
  36. ConnectionLifeTime
  37. ConnectOnOpen
  38. IncludeInternalFields
  39. MaxRows
  40. MaxThreads
  41. Other
  42. Pagesize
  43. PoolIdleTimeout
  44. PoolMaxSize
  45. PoolMinSize
  46. PoolWaitTime
  47. PseudoColumns
  48. Readonly
  49. RowScanDepth
  50. RTK
  51. SupportEnhancedSQL
  52. Timeout
  53. TypeDetectionScheme
  54. UseConnectionPooling
  55. UseJobs

AuthScheme

Data Type

string

Default Value

"Auto"

Remarks



URL

Data Type

string

Default Value

""

Remarks

The URL to your Splunk endpoint; for example, https://yoursitename.splunk.com:8089.

The port should be set to the Splunk management port (default 8089).



User

Data Type

string

Default Value

""

Remarks

Together with Password, this field is used to authenticate against the Splunk server.



Password

Data Type

string

Default Value

""

Remarks

The User and Password are together used to authenticate with the server.
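
As a minimal sketch, the URL, User, and Password properties can be assembled into a JDBC URL of the same key=value form used by the caching examples in this reference. The helper name build_splunk_jdbc_url and the sample values are hypothetical; the code only formats a string and does not contact Splunk.

```python
def build_splunk_jdbc_url(properties):
    """Join connection properties into a 'jdbc:splunk:' URL string.

    Illustrative only: values containing ';' are wrapped in single quotes,
    mirroring how the CacheConnection examples in this reference are quoted.
    """
    parts = []
    for name, value in properties.items():
        text = str(value)
        if ";" in text:
            text = "'" + text + "'"
        parts.append(f"{name}={text}")
    return "jdbc:splunk:" + ";".join(parts) + ";"


# Hypothetical sample values; replace with your own endpoint and credentials.
url = build_splunk_jdbc_url({
    "user": "MyUserName",
    "password": "MyPassword",
    "URL": "https://yoursitename.splunk.com:8089",
})
print(url)
```

In practice, credentials would normally come from a secrets store rather than being written into source code.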



SSLServerCert

Data Type

string

Default Value

""

Remarks

If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate to be accepted from the server. Any other certificate that is not trusted by the machine is rejected.

This property can take the following forms:

Description: Example
A full PEM certificate (example shortened for brevity): -----BEGIN CERTIFICATE----- MIIChTCCAe4CAQAwDQYJKoZIhv......Qw== -----END CERTIFICATE-----
A path to a local file containing the certificate: C:\cert.cer
The public key (example shortened for brevity): -----BEGIN RSA PUBLIC KEY----- MIGfMA0GCSq......AQAB -----END RSA PUBLIC KEY-----
The MD5 thumbprint (hex values can also be either space or colon separated): ecadbdda5a1529c58a1e9e09828d70e4
The SHA1 thumbprint (hex values can also be either space or colon separated): 34a929226ae0819f2ec14b4a3d904f801cbb150d

If not specified, any certificate trusted by the machine is accepted.

Certificates are validated as trusted by the machine based on the System's trust store. The trust store used is the 'javax.net.ssl.trustStore' value specified for the system. If no value is specified for this property, Java's default trust store is used (for example, JAVA_HOME\lib\security\cacerts).

Use '*' to accept all certificates. Note that this is not recommended due to security concerns.
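
The SHA1 thumbprint form can be computed from a PEM certificate with a short script. This is a sketch using only the Python standard library; the function names are hypothetical, and it relies on the general convention that a thumbprint is the hash of the DER-encoded certificate, which this reference does not spell out.

```python
import hashlib
import ssl


def sha1_thumbprint(pem_certificate):
    """Return the SHA-1 thumbprint (lowercase hex) of a PEM certificate."""
    # Convert the PEM text to its binary DER form, then hash those bytes.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_certificate)
    return hashlib.sha1(der_bytes).hexdigest()


def normalize_thumbprint(text):
    """Strip space or colon separators so thumbprints can be compared."""
    return text.replace(":", "").replace(" ", "").lower()
```

The resulting hex string is the kind of value the SHA1 thumbprint form above expects; normalize_thumbprint is only a convenience for comparing space- or colon-separated variants.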



FirewallType

Data Type

string

Default Value

"NONE"

Remarks

This property specifies the protocol that the driver will use to tunnel traffic through the FirewallServer proxy. Note that by default, the driver connects to the system proxy; to disable this behavior and connect to one of the following proxy types, set ProxyAutoDetect to false.

Type (default port): Description
TUNNEL (80): When this is set, the driver opens a connection to Splunk and traffic flows back and forth through the proxy.
SOCKS4 (1080): When this is set, the driver sends data through the SOCKS 4 proxy specified by FirewallServer and FirewallPort and passes the FirewallUser value to the proxy, which determines if the connection request should be granted.
SOCKS5 (1080): When this is set, the driver sends data through the SOCKS 5 proxy specified by FirewallServer and FirewallPort. If your proxy requires authentication, set FirewallUser and FirewallPassword to credentials the proxy recognizes.

To connect to HTTP proxies, use ProxyServer and ProxyPort. To authenticate to HTTP proxies, use ProxyAuthScheme, ProxyUser, and ProxyPassword.
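
The firewall properties described above can be collected as in the following sketch. The function name, the dictionary layout, and the idea of filling in the default port on the client side are assumptions for illustration; only the property names and default ports come from this reference.

```python
# Default ports per FirewallType, as listed in this reference.
DEFAULT_FIREWALL_PORTS = {"TUNNEL": 80, "SOCKS4": 1080, "SOCKS5": 1080}


def firewall_properties(firewall_type, server, port=None, user=None, password=None):
    """Return a dict of firewall-related connection properties (illustrative)."""
    kind = firewall_type.upper()
    if kind not in DEFAULT_FIREWALL_PORTS:
        raise ValueError("unsupported FirewallType: " + firewall_type)
    props = {
        # Disable the system proxy so the explicit settings are used.
        "ProxyAutoDetect": "false",
        "FirewallType": kind,
        "FirewallServer": server,
        "FirewallPort": port if port is not None else DEFAULT_FIREWALL_PORTS[kind],
    }
    if user is not None:
        props["FirewallUser"] = user
    if password is not None:
        props["FirewallPassword"] = password
    return props
```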



FirewallServer

Data Type

string

Default Value

""

Remarks

This property specifies the IP address, DNS name, or host name of a proxy allowing traversal of a firewall. The protocol is specified by FirewallType: use FirewallServer together with FirewallType to connect through SOCKS or to tunnel the connection. Use ProxyServer to connect to an HTTP proxy.

Note that the driver uses the system proxy by default. To use a different proxy, set ProxyAutoDetect to false.



FirewallPort

Data Type

int

Default Value

0

Remarks

This specifies the TCP port for a proxy allowing traversal of a firewall. Use FirewallServer to specify the name or IP address. Specify the protocol with FirewallType.



FirewallUser

Data Type

string

Default Value

""

Remarks

The FirewallUser and FirewallPassword properties are used to authenticate against the proxy specified in FirewallServer and FirewallPort, following the authentication method specified in FirewallType.



FirewallPassword

Data Type

string

Default Value

""

Remarks

This property is passed to the proxy specified by FirewallServer and FirewallPort, following the authentication method specified by FirewallType.



ProxyAutoDetect

Data Type

bool

Default Value

false

Remarks

This takes precedence over other proxy settings, so you'll need to set ProxyAutoDetect to false in order to use custom proxy settings.

NOTE: When this property is set to True, the proxy used is determined as follows:

To connect to an HTTP proxy, see ProxyServer. For other proxies, such as SOCKS or tunneling, see FirewallType.



ProxyServer

Data Type

string

Default Value

""

Remarks

The hostname or IP address of a proxy to route HTTP traffic through. The driver can use the HTTP, Windows (NTLM), or Kerberos authentication types to authenticate to an HTTP proxy.

If you need to connect through a SOCKS proxy or tunnel the connection, see FirewallType.

By default, the driver uses the system proxy. If you need to use another proxy, set ProxyAutoDetect to false.



ProxyPort

Data Type

int

Default Value

80

Remarks

The port of the HTTP proxy through which you want to route HTTP traffic. Specify the HTTP proxy in ProxyServer. For other proxy types, see FirewallType.



ProxyAuthScheme

Data Type

string

Default Value

"BASIC"

Remarks

This value specifies the authentication type to use to authenticate to the HTTP proxy specified by ProxyServer and ProxyPort.

Note that the driver will use the system proxy settings by default, without further configuration needed; if you want to connect to another proxy, you will need to set ProxyAutoDetect to false, in addition to ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.

The authentication type can be one of the following:

If you need to use another authentication type, such as SOCKS 5 authentication, see FirewallType.



ProxyUser

Data Type

string

Default Value

""

Remarks

The ProxyUser and ProxyPassword options are used to connect and authenticate against the HTTP proxy specified in ProxyServer.

You can select one of the available authentication types in ProxyAuthScheme. If you are using HTTP authentication, set this to the user name of a user recognized by the HTTP proxy. If you are using Windows or Kerberos authentication, set this property to a user name in one of the following formats:

user@domain

domain\user



ProxyPassword

Data Type

string

Default Value

""

Remarks

This property is used to authenticate to an HTTP proxy server that supports NTLM (Windows), Kerberos, or HTTP authentication. To specify the HTTP proxy, you can set ProxyServer and ProxyPort. To specify the authentication type, set ProxyAuthScheme.

If you are using HTTP authentication, additionally set ProxyUser and ProxyPassword to credentials recognized by the HTTP proxy.

If you are using NTLM authentication, set ProxyUser and ProxyPassword to your Windows user name and password. You may also need these to complete Kerberos authentication.

For SOCKS 5 authentication or tunneling, see FirewallType.

By default, the driver uses the system proxy. If you want to connect to another proxy, set ProxyAutoDetect to false.



ProxySSLType

Data Type

string

Default Value

"AUTO"

Remarks

This property determines when to use SSL for the connection to an HTTP proxy specified by ProxyServer. This value can be AUTO, ALWAYS, NEVER, or TUNNEL. The applicable values are the following:

AUTO: Default setting. If the URL is an HTTPS URL, the driver will use the TUNNEL option. If the URL is an HTTP URL, the driver will use the NEVER option.
ALWAYS: The connection is always SSL enabled.
NEVER: The connection is not SSL enabled.
TUNNEL: The connection is through a tunneling proxy. The proxy server opens a connection to the remote host and traffic flows back and forth through the proxy.
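
The AUTO rule can be sketched as a small function. This illustrates the rule as stated in this section and is not the driver's code; the function name is hypothetical, and rejecting URL schemes other than HTTP and HTTPS is an assumption.

```python
from urllib.parse import urlparse

VALID_PROXY_SSL_TYPES = {"AUTO", "ALWAYS", "NEVER", "TUNNEL"}


def effective_proxy_ssl_type(proxy_ssl_type, url):
    """Resolve AUTO to TUNNEL or NEVER based on the scheme of the Splunk URL."""
    setting = proxy_ssl_type.upper()
    if setting not in VALID_PROXY_SSL_TYPES:
        raise ValueError("unknown ProxySSLType: " + proxy_ssl_type)
    if setting != "AUTO":
        return setting
    scheme = urlparse(url).scheme.lower()
    if scheme == "https":
        return "TUNNEL"
    if scheme == "http":
        return "NEVER"
    # Assumption: other schemes are not covered by the documented rule.
    raise ValueError("expected an http or https URL: " + url)
```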



ProxyExceptions

Data Type

string

Default Value

""

Remarks

The ProxyServer is used for all addresses, except for addresses defined in this property. Use semicolons to separate entries.

Note that the driver uses the system proxy settings by default, without further configuration needed; if you want to explicitly configure proxy exceptions for this connection, you need to set ProxyAutoDetect = false, and configure ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.



Logfile

Data Type

string

Default Value

""

Remarks

Once this property is set, the driver will populate the log file as it carries out various tasks, such as when authentication is performed or queries are executed. If the specified file doesn't already exist, it will be created.

Connection strings and version information are also logged, though connection properties containing sensitive information are masked automatically.

If a relative filepath is supplied, the location of the log file will be resolved based on the path found in the Location connection property.

For more control over what is written to the log file, you can adjust the Verbosity property.

Log contents are categorized into several modules. You can show/hide individual modules using the LogModules property.

To edit the maximum size of a single logfile before a new one is created, see MaxLogFileSize.

If you would like to place a cap on the number of logfiles generated, use MaxLogFileCount.

Java Logging

Java logging is also supported. To enable Java logging, set Logfile to:

Logfile=JAVALOG://myloggername

As in the above sample, JAVALOG:// is a required prefix to use Java logging, and you will substitute your own Logger.

The supplied Logger's getLogger method is then called, using the supplied value to create the Logger instance. If a logging instance already exists, it will reference the existing instance.

When Java logging is enabled, the Verbosity will now correspond to specific logging levels.



Verbosity

Data Type

string

Default Value

"1"

Remarks

The verbosity level determines the amount of detail that the driver reports to the Logfile. Verbosity levels from 1 to 5 are supported. These are detailed in the Logging page.



LogModules

Data Type

string

Default Value

""

Remarks

Only the modules specified (separated by ';') will be included in the log file. By default all modules are included.

See the Logging page for an overview.



MaxLogFileSize

Data Type

string

Default Value

"100MB"

Remarks

When the limit is hit, a new log is created in the same folder with the date and time appended to the end. The default limit is 100 MB. Values lower than 100 kB will use 100 kB as the value instead.

Adjust the maximum number of logfiles generated with MaxLogFileCount.



MaxLogFileCount

Data Type

int

Default Value

-1

Remarks

When the limit is hit, a new log is created in the same folder with the date and time appended to the end and the oldest log file will be deleted.

The minimum supported value is 2. A value of 0 or a negative value indicates no limit on the count.

Adjust the maximum size of the logfiles generated with MaxLogFileSize.



Location

Data Type

string

Default Value

"%APPDATA%\\CData\\Splunk Data Provider\\Schema"

Remarks

The path to a directory which contains the schema files for the driver (.rsd files for tables and views, .rsb files for stored procedures). The folder location can be a relative path from the location of the executable. The Location property is only needed if you want to customize definitions (for example, change a column name, ignore a column, and so on) or extend the data model with new tables, views, or stored procedures.

If left unspecified, the default location is "%APPDATA%\\CData\\Splunk Data Provider\\Schema" with %APPDATA% being set to the user's configuration directory:

Platform: %APPDATA%
Windows: The value of the APPDATA environment variable
Mac: ~/Library/Application Support
Linux: ~/.config
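
The per-platform resolution of %APPDATA% described above can be sketched as follows. The function name and its parameters are hypothetical, and treating every platform other than Windows and macOS as Linux is an assumption of this sketch.

```python
import os
import sys


def default_config_dir(platform=None, environ=None):
    """Return the directory that %APPDATA% stands for on the given platform."""
    platform = sys.platform if platform is None else platform
    environ = os.environ if environ is None else environ
    if platform.startswith("win"):
        # Windows: the value of the APPDATA environment variable.
        return environ.get("APPDATA", "")
    home = os.path.expanduser("~")
    if platform == "darwin":
        return os.path.join(home, "Library", "Application Support")
    # Assumption: anything else is handled like Linux.
    return os.path.join(home, ".config")


print(default_config_dir())
```

Appending CData, Splunk Data Provider, and Schema to the returned directory gives the default Location value shown above.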



BrowsableSchemas

Data Type

string

Default Value

""

Remarks

Listing the schemas from databases can be expensive. Providing a list of schemas in the connection string improves the performance.



Tables

Data Type

string

Default Value

""

Remarks

Listing the tables from some databases can be expensive. Providing a list of tables in the connection string improves the performance of the driver.

This property can also be used as an alternative to automatically listing tables if you already know which ones you want to work with and there would otherwise be too many to work with.

Specify the tables you want in a comma-separated list. Each table should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Tables=TableA,[TableB/WithSlash],WithCatalog.WithSchema.`TableC With Space`.

Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the table in this property, as in the last example here, to avoid ambiguity between tables that exist in multiple catalogs or schemas.
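
A value for this property can be generated programmatically, as in the sketch below. It always escapes with square brackets (one of the three escaping styles mentioned above); the function names and the rule for what counts as a plain identifier are assumptions.

```python
import re

# Assumption: names made of letters, digits, and underscores need no escaping.
_SIMPLE_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_identifier(name):
    """Wrap a name in square brackets when it contains special characters."""
    if _SIMPLE_IDENTIFIER.match(name):
        return name
    if "]" in name:
        raise ValueError("this sketch does not handle ']' inside identifiers")
    return "[" + name + "]"


def tables_property(tables):
    """Build a Tables=... value; a tuple entry is a fully qualified name."""
    items = []
    for entry in tables:
        parts = entry if isinstance(entry, tuple) else (entry,)
        items.append(".".join(quote_identifier(part) for part in parts))
    return "Tables=" + ",".join(items)


print(tables_property([
    "TableA",
    "TableB/WithSlash",
    ("WithCatalog", "WithSchema", "TableC With Space"),
]))
```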



Views

Data Type

string

Default Value

""

Remarks

Listing the views from some databases can be expensive. Providing a list of views in the connection string improves the performance of the driver.

This property can also be used as an alternative to automatically listing views if you already know which ones you want to work with and there would otherwise be too many to work with.

Specify the views you want in a comma-separated list. Each view should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Views=ViewA,[ViewB/WithSlash],WithCatalog.WithSchema.`ViewC With Space`.

Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the view in this property, as in the last example here, to avoid ambiguity between views that exist in multiple catalogs or schemas.



AutoCache

Data Type

bool

Default Value

false

Remarks

When AutoCache = true, the driver automatically maintains a cache of your table's data in the database of your choice.

Setting the Caching Database

When AutoCache = true, the driver caches to a simple, file-based cache. You can configure its location with CacheLocation, or cache to a different database with CacheDriver and CacheConnection.



CacheDriver

Data Type

string

Default Value

""

Remarks

You can cache to any database for which you have a JDBC driver, including CData JDBC drivers.

The cache database is determined based on the CacheDriver and CacheConnection properties. The CacheDriver is the name of the JDBC driver class that you want to use to cache data.

Note that you must also add the CacheDriver JAR file to the classpath.

The following examples show how to cache to several major databases. Refer to CacheConnection for more information on the JDBC URL syntax and typical connection properties.

Derby and Java DB

The driver simplifies Derby configuration. Java DB is the Oracle distribution of Derby. The JAR file is shipped in the JDK. You can find the JAR file, derby.jar, in the db subfolder of the JDK installation. In most caching scenarios, you need to specify only the following, after adding derby.jar to the classpath:

jdbc:splunk:CacheLocation='c:/Temp/cachedir';user=MyUserName;password=MyPassword;URL=MyURL;

To customize the Derby JDBC URL, use CacheDriver and CacheConnection. For example, to cache to an in-memory database, use a JDBC URL like the following:

jdbc:splunk:CacheDriver=org.apache.derby.jdbc.EmbeddedDriver;CacheConnection='jdbc:derby:memory';user=MyUserName;password=MyPassword;URL=MyURL;

SQLite

The following is a JDBC URL for the SQLite JDBC driver:

jdbc:splunk:CacheDriver=org.sqlite.JDBC;CacheConnection='jdbc:sqlite:C:/Temp/sqlite.db';user=MyUserName;password=MyPassword;URL=MyURL;
MySQL

The following is a JDBC URL for the included CData JDBC Driver for MySQL:

jdbc:splunk:CacheDriver=cdata.jdbc.mysql.MySQLDriver;CacheConnection='jdbc:mysql:Server=localhost;Port=3306;Database=cache;User=root;Password=123456';user=MyUserName;password=MyPassword;URL=MyURL;

SQL Server

The following JDBC URL uses the Microsoft JDBC Driver for SQL Server:

jdbc:splunk:CacheDriver=com.microsoft.sqlserver.jdbc.SQLServerDriver;CacheConnection='jdbc:sqlserver://localhost\sqlexpress:7437;user=sa;password=123456;databaseName=Cache';user=MyUserName;password=MyPassword;URL=MyURL;

Oracle

The following is a JDBC URL for the Oracle Thin Client:

jdbc:splunk:CacheDriver=oracle.jdbc.OracleDriver;CacheConnection='jdbc:oracle:thin:scott/tiger@localhost:1521:orcldb';user=MyUserName;password=MyPassword;URL=MyURL;

NOTE: If using a version of Oracle older than 9i, the cache driver will instead be oracle.jdbc.driver.OracleDriver.

PostgreSQL

The following JDBC URL uses the official PostgreSQL JDBC driver:

jdbc:splunk:CacheDriver=cdata.jdbc.postgresql.PostgreSQLDriver;CacheConnection='jdbc:postgresql:User=postgres;Password=admin;Database=postgres;Server=localhost;Port=5432;';user=MyUserName;password=MyPassword;URL=MyURL;
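
The examples above wrap CacheConnection in single quotes because the nested JDBC URL contains semicolons of its own. The following sketch shows why the quoting matters by splitting a connection string on semicolons that fall outside single quotes. It is an illustration only, not the driver's parser, and the function names are hypothetical.

```python
def _store(pairs, item):
    """Record one 'key=value' item, ignoring empty fragments."""
    if item.strip():
        key, _, value = item.partition("=")
        pairs[key.strip()] = value


def split_connection_string(text):
    """Split 'key=value;...' pairs, treating ';' inside single quotes as data."""
    prefix = "jdbc:splunk:"
    if text.startswith(prefix):
        text = text[len(prefix):]
    pairs = {}
    current = []
    in_quotes = False
    for ch in text:
        if ch == "'":
            in_quotes = not in_quotes  # toggle and drop the quote character
        elif ch == ";" and not in_quotes:
            _store(pairs, "".join(current))
            current = []
        else:
            current.append(ch)
    _store(pairs, "".join(current))
    return pairs


sample = ("jdbc:splunk:CacheDriver=cdata.jdbc.mysql.MySQLDriver;"
          "CacheConnection='jdbc:mysql:Server=localhost;Port=3306;Database=cache';"
          "user=MyUserName;password=MyPassword;URL=MyURL;")
parsed = split_connection_string(sample)
print(parsed["CacheConnection"])
```

Without the quotes, Port and Database would be read as top-level properties instead of as part of the nested URL.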



CacheConnection

Data Type

string

Default Value

""

Remarks

The cache database is determined based on the CacheDriver and CacheConnection properties. Both properties are required to use the cache database. Examples of common cache database settings can be found below. For more information on setting the caching database's driver, refer to CacheDriver.

The connection string specified in the CacheConnection property is passed directly to the underlying CacheDriver. Consult the documentation for the specific JDBC driver for more information on the available properties. Make sure to include the JDBC driver in your application's classpath.

Derby and Java DB

The driver simplifies caching to Derby, only requiring you to set the CacheLocation property to make a basic connection.

Alternatively, you can configure the connection to Derby manually using CacheDriver and CacheConnection. The following is the Derby JDBC URL syntax:

jdbc:derby:[subsubprotocol:][databaseName][;attribute=value[;attribute=value] ... ]

For example, to cache to an in-memory database, use the following:

jdbc:derby:memory

SQLite

To cache to SQLite, you can use the SQLite JDBC driver. The following is the syntax of the JDBC URL:

jdbc:sqlite:dataSource

MySQL

The installation includes the CData JDBC Driver for MySQL. The following is an example JDBC URL:

jdbc:mysql:User=root;Password=root;Server=localhost;Port=3306;Database=cache

The following are typical connection properties:

SQL Server

The JDBC URL for the Microsoft JDBC Driver for SQL Server has the following syntax:

jdbc:sqlserver://[serverName[\instance][:port]][;database=databaseName][;property=value[;property=value] ... ]

For example:

jdbc:sqlserver://localhost\sqlexpress:1433;integratedSecurity=true

The following are typical SQL Server connection properties:

Oracle

The following is the conventional JDBC URL syntax for the Oracle JDBC Thin driver:

jdbc:oracle:thin:[userId/password]@[//]host[[:port][:sid]]

For example:

jdbc:oracle:thin:scott/tiger@myhost:1521:orcl

The following are typical connection properties:

PostgreSQL

The following is the JDBC URL syntax for the official PostgreSQL JDBC driver:

jdbc:postgresql:[//[host[:port]]/]database[[?option=value][[&option=value][&option=value] ... ]]

For example, the following connection string connects to a database on the default host (localhost) and port (5432):

jdbc:postgresql:postgres

The following are typical connection properties:



CacheLocation

Data Type

string

Default Value

"%APPDATA%\\CData\\Splunk Data Provider"

Remarks

The CacheLocation is a simple, file-based cache. The driver uses Java DB, Oracle's distribution of the Derby database. To cache to Java DB, you will need to add the Java DB JAR file to the classpath. The JAR file, derby.jar, is shipped in the JDK and located in the db subfolder of the JDK installation.

If left unspecified, the default location is "%APPDATA%\\CData\\Splunk Data Provider" with %APPDATA% being set to the user's configuration directory:

Platform: %APPDATA%
Windows: The value of the APPDATA environment variable
Mac: ~/Library/Application Support
Linux: ~/.config




CacheTolerance

Data Type

int

Default Value

600

Remarks

The tolerance for stale data in the cache specified in seconds. This only applies when AutoCache is used. The driver checks with the data source for newer records after the tolerance interval has expired. Otherwise, it returns the data directly from the cache.
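
The tolerance check can be pictured as follows. This sketch only illustrates the rule stated above; the function name, the use of epoch seconds, and the behavior exactly at the boundary are assumptions.

```python
import time


def should_refresh(last_refresh, cache_tolerance=600, now=None):
    """Return True when cached data is older than the tolerance (in seconds)."""
    now = time.time() if now is None else now
    # Within the tolerance interval the cache is served as-is.
    return (now - last_refresh) > cache_tolerance
```

With the default of 600, data cached ten minutes ago or less would be returned from the cache without consulting Splunk.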



Offline

Data Type

bool

Default Value

false

Remarks

When Offline = true, all queries execute against the cache as opposed to the live data source. In this mode, certain queries like INSERT, UPDATE, DELETE, and CACHE are not allowed.



CacheMetadata

Data Type

bool

Default Value

false

Remarks

As you execute queries with this property set, table metadata in the Splunk catalog are cached to the file store specified by CacheLocation if set or the user's home directory otherwise. A table's metadata will be retrieved only once, when the table is queried for the first time.

When to Use CacheMetadata

The driver automatically persists metadata in memory for up to two hours when you first discover the metadata for a table or view and therefore, CacheMetadata is generally not required. CacheMetadata becomes useful when metadata operations are expensive such as when you are working with large amounts of metadata or when you have many short-lived connections.

When Not to Use CacheMetadata



BatchSize

Data Type

int

Default Value

0

Remarks

When BatchSize is set to a value greater than 0, the batch operation will split the entire batch into separate batches of size BatchSize. The split batches will then be submitted to the server individually. This is useful when the server has limitations on the size of the request that can be submitted.

Setting BatchSize to 0 will submit the entire batch as specified.
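
The splitting behavior can be sketched as below. The function name is hypothetical and the code illustrates the described semantics, not the driver's implementation.

```python
def split_batch(rows, batch_size=0):
    """Split rows into chunks of batch_size; 0 keeps the batch whole."""
    rows = list(rows)
    if batch_size <= 0:
        return [rows] if rows else []
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


print(split_batch([1, 2, 3, 4, 5], batch_size=2))
```

Each inner list corresponds to one request submitted to the server.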



ConnectionLifeTime

Data Type

int

Default Value

0

Remarks

The maximum lifetime of a connection in seconds. Once the time has elapsed, the connection object is disposed. The default is 0 which indicates there is no limit to the connection lifetime.



ConnectOnOpen

Data Type

bool

Default Value

false

Remarks

When set to true, a connection will be made to Splunk when the connection is opened. This property enables the Test Connection feature available in various database tools.

This feature acts as a NOOP command as it is used to verify a connection can be made to Splunk and nothing from this initial connection is maintained.

Setting this property to false may provide performance improvements (depending upon the number of times a connection is opened).



IncludeInternalFields

Data Type

bool

Default Value

false

Remarks

Whether or not the CData JDBC Driver for Splunk should push the internal fields. These fields include: user, eventtype, etc.



MaxRows

Data Type

int

Default Value

-1

Remarks

Limits the number of rows returned when no aggregation or GROUP BY is used in the query. This helps avoid performance issues at design time.



MaxThreads

Data Type

string

Default Value

"5"

Remarks

This property allows you to issue multiple requests simultaneously, thereby improving performance. Default value is 5 threads. Setting a higher value can result in OutOfMemory issues.



Other

Data Type

string

Default Value

""

Remarks

The properties listed below are available for specific use cases. Normal driver use cases and functionality should not require these properties.

Specify multiple properties in a semicolon-separated list.

Caching Configuration

CachePartial=True: Caches only a subset of columns, which you can specify in your query.
QueryPassthrough=True: Passes the specified query to the cache database instead of using the SQL parser of the driver.

Integration and Formatting

DefaultColumnSize: Sets the default length of string fields when the data source does not provide column length in the metadata. The default value is 2000.
ConvertDateTimeToGMT: Determines whether to convert date-time values to GMT, instead of the local time of the machine.
RecordToFile=filename: Records the underlying socket data transfer to the specified file.



Pagesize

Data Type

int

Default Value

10000

Remarks

The Pagesize property affects the maximum number of results to return per page from Splunk. Setting a higher value may result in better performance at the cost of additional memory allocated per page consumed.



PoolIdleTimeout

Data Type

int

Default Value

60

Remarks

The allowed idle time a connection can remain in the pool until the connection is closed. The default is 60 seconds.



PoolMaxSize

Data Type

int

Default Value

100

Remarks

The maximum connections in the pool. The default is 100. To disable this property, set the property value to 0 or less.



PoolMinSize

Data Type

int

Default Value

1

Remarks

The minimum number of connections in the pool. The default is 1.



PoolWaitTime

Data Type

int

Default Value

60

Remarks

The max seconds to wait for a connection to become available. If a new connection request is waiting for an available connection and exceeds this time, an error is thrown. By default, new requests wait forever for an available connection.



PseudoColumns

Data Type

string

Default Value

""

Remarks

This setting is particularly helpful in Entity Framework, which does not allow you to set a value for a pseudo column unless it is a table column. The value of this connection setting is of the format "Table1=Column1, Table1=Column2, Table2=Column3". You can use the "*" character to include all tables and all columns; for example, "*=*".
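
The value format can be illustrated with a small parser. This is a sketch of how the "Table=Column" pairs could be read, with a hypothetical function name; it does not reflect how the driver itself interprets the value.

```python
def parse_pseudo_columns(value):
    """Parse 'Table1=Column1, Table1=Column2' into {table: [columns]}."""
    mapping = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        table, separator, column = item.partition("=")
        if not separator:
            raise ValueError("expected Table=Column, got: " + item)
        mapping.setdefault(table.strip(), []).append(column.strip())
    return mapping


print(parse_pseudo_columns("Table1=Column1, Table1=Column2, Table2=Column3"))
```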



Readonly

Data Type

bool

Default Value

false

Remarks

If this property is set to true, the driver will allow only SELECT queries. INSERT, UPDATE, DELETE, and stored procedure queries will cause an error to be thrown.



RowScanDepth

Data Type

string

Default Value

"50"

Remarks

Determines the number of rows used to determine the column data types.

Setting a high value may decrease performance. Setting a low value may prevent the data type from being determined properly, especially when there is null data.



RTK

Data Type

string

Default Value

""

Remarks

The RTK property may be used to license a build. See the included licensing file to see how to set this property. The runtime key is only available if you purchased an OEM license.



SupportEnhancedSQL

Data Type

bool

Default Value

true

Remarks

When SupportEnhancedSQL = true, the driver offloads as much of the SELECT statement processing as possible to Splunk and then processes the rest of the query in memory. In this way, the driver can execute unsupported predicates, joins, and aggregation.

When SupportEnhancedSQL = false, the driver limits SQL execution to what is supported by the Splunk API.

Execution of Predicates

The driver determines which of the clauses are supported by the data source and then pushes them to the source to get the smallest superset of rows that would satisfy the query. It then filters the rest of the rows locally. The filter operation is streamed, which enables the driver to filter effectively for even very large datasets.
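
The split between server-side and client-side filtering can be sketched as follows. Everything here is a simplified illustration: only equality predicates are modeled, and run_query, fake_source, and the sample rows are hypothetical stand-ins rather than driver or Splunk APIs.

```python
def run_query(fetch_rows, predicates, pushable_columns):
    """Push supported equality filters to the source, filter the rest locally."""
    server_side = {c: v for c, v in predicates.items() if c in pushable_columns}
    client_side = {c: v for c, v in predicates.items() if c not in pushable_columns}
    # The source returns a superset of the rows that satisfy the full query.
    for row in fetch_rows(server_side):
        # Remaining filters are applied one row at a time (streamed).
        if all(row.get(c) == v for c, v in client_side.items()):
            yield row


def fake_source(filters):
    """Hypothetical stand-in for a data source that only filters on Id."""
    rows = [
        {"Id": "A", "Owner": "admin"},
        {"Id": "B", "Owner": "nobody"},
        {"Id": "A", "Owner": "nobody"},
    ]
    return (r for r in rows if all(r[c] == v for c, v in filters.items()))


result = list(run_query(fake_source, {"Id": "A", "Owner": "admin"}, {"Id"}))
print(result)
```

Because run_query is a generator, rows are filtered as they arrive instead of being collected first, which is the property that keeps memory use low for large result sets.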

Execution of Joins

The driver uses various techniques to join in memory. The driver trades off memory utilization against the requirement of reading the same table more than once.

Execution of Aggregates

The driver retrieves all rows necessary to process the aggregation in memory.



Timeout

Data Type

int

Default Value

60

Remarks

If Timeout = 0, operations do not time out. The operations run until they complete successfully or until they encounter an error condition.

If Timeout expires and the operation is not yet complete, the driver throws an exception.



TypeDetectionScheme

Data Type

string

Default Value

"RowScan"

Remarks

None: Setting TypeDetectionScheme to None will return all columns as the string type.
RowScan: Setting TypeDetectionScheme to RowScan will scan rows to heuristically determine the data type. The RowScanDepth determines the number of rows to be scanned.
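
The driver's actual heuristic is not documented here, but the general idea of row scanning can be sketched as follows. The function name, the three type names, and the assumption that values arrive as strings are all illustrative.

```python
def detect_column_type(values, row_scan_depth=50):
    """Guess 'integer', 'float', or 'string' from the first scanned values."""
    # Only the first row_scan_depth rows are examined; nulls are skipped.
    sample = [v for v in values[:row_scan_depth] if v is not None and v != ""]
    if not sample:
        return "string"
    for type_name, caster in (("integer", int), ("float", float)):
        try:
            for v in sample:
                caster(v)
            return type_name
        except (TypeError, ValueError):
            continue
    return "string"


print(detect_column_type(["1", "2", None, "3"]))
```

A scan depth that is too small can mislabel a column: if the first non-numeric value appears after the scanned rows, the column is still reported as numeric.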



UseConnectionPooling

Data Type

bool

Default Value

false

Remarks

This property enables connection pooling. The default is false. See Connection Pooling for information on using connection pools.



UseJobs

Data Type

bool

Default Value

false

Remarks

Whether to use the jobs endpoint instead of the export endpoint. While Jobs generally provide higher performance, the initial response time may be longer. If a Timeout error occurs, set the Timeout connection property to a higher value.





Tables

  1. DataModels
  2. Datasets
  3. SearchJobs

DataModels

Create, query, update, and delete data models in Splunk.

Select

The driver will use the Splunk API to process search criteria that refer to the Id column. This column supports server-side processing for the = operator. The driver processes other filters client-side within the driver.

For example, the following query is processed server side by the Splunk APIs:

SELECT * FROM DataModels WHERE Id = 'SampleModel'

You can turn off the client-side execution of the query by setting SupportEnhancedSQL to false, in which case any search criteria that refer to other columns will cause an error or return inconsistent data.

Insert

The Id column is the minimum requirement for an insert. In an insert, the DataModels table allows only the Id and Acceleration columns.

INSERT INTO DataModels (Id, Acceleration) VALUES ('initialname', '{"enabled":false,"earliest_time":"","hunk.file_format":"","hunk.dfs_block_size":0,"hunk.compression_codec":""}' )

Update

The DataModels table allows updates for the Acceleration column when Id is specified. You can also set the Provisional pseudocolumn.

UPDATE DataModels SET Provisional = 'true', Acceleration = '{"enabled":false,"earliest_time": "-1mon", "cron_schedule": "0 */12 * * *","hunk.file_format":"","hunk.dfs_block_size":0,"hunk.compression_codec":""}' WHERE Id = 'initialname' 
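
The Acceleration value is a JSON document embedded in a SQL string literal, so it can be convenient to generate it rather than type it by hand. The sketch below builds such a statement; the function name is hypothetical, and doubling single quotes is standard SQL escaping that is assumed, not confirmed, to be accepted by the driver.

```python
import json


def acceleration_update_sql(model_id, settings):
    """Build an UPDATE statement that sets Acceleration from a dict."""
    payload = json.dumps(settings)
    # Double any single quotes so the values stay inside their SQL literals.
    return "UPDATE DataModels SET Acceleration = '{}' WHERE Id = '{}'".format(
        payload.replace("'", "''"), model_id.replace("'", "''"))


sql = acceleration_update_sql(
    "initialname",
    {"enabled": False, "earliest_time": "-1mon", "cron_schedule": "0 */12 * * *"},
)
print(sql)
```

Where the client library supports it, binding the JSON text as a statement parameter avoids manual escaping altogether.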

Delete

The DataModels table allows deleting a record when Id is specified.

DELETE FROM DataModels WHERE Id = 'initialname'

Columns

Name Type ReadOnly References Description
Id [KEY] String False

Id of the data model.

LinkId String True

Link of the data model.

Disabled Boolean True

Indicates if the data model is disabled/enabled.

UpdatedAt Datetime True

Datetime of the last update of the data model.

Description String True

Description of the data model.

Name String True

The name displayed for the data model in Splunk.

Author String True

Splunk user who created the data model.

App String True

Splunk app where the data model is shared.

Owner String True

Splunk user who owns the data model.

CanShareApp Boolean True

Boolean indicating whether the data model can be shared in an app.

CanShareGlobal Boolean True

Boolean indicating whether the data model can be shared globally.

CanShareUser Boolean True

Boolean indicating whether the data model can be shared by the user.

CanWrite Boolean True

Boolean indicating whether the data model can be extended by the user.

Modifiable Boolean True

Boolean indicating whether the data model can be modified.

Removable Boolean True

Boolean indicating whether the data model can be removed.

Acceleration String False

Acceleration settings for the data model. Supply JSON to specify any or all of the following settings: enabled (true or false), earliest_time (time modifier), or cron_schedule (cron string).

AccelerationAllowed Boolean True

Boolean indicating whether acceleration is allowed for the data model.

AccelerationHunkCompression String True

Specifies the compression codec to be used for the accelerated ORC or Parquet format files.

DatasetCommands String True

Data model commands.

DatasetDescription String True

The JSON describing the data model.

DatasetCurrentCommand Integer True

Current command of the data model.

DatasetEarliestTime Datetime True

Earliest time of data model events being processed.

DatasetLatestTime Datetime True

Latest time of data model events being processed.

DatasetDiversity String True

Diversity of events being processed.

DatasetLimiting Integer True

Limitations of events being processed.

DatasetMode String True

Search mode of the events being processed.

DatasetSampleRatio String True

Sample ratio of the data model.

DatasetFields String True

Indexed fields the data model has.

DatasetType String True

Dataset type.

Type String True

Data model type.

Digest String True

Content digest type.

TagsWhitelist String True

Whitelist of data model tags.

ReadPermitions String True

Permissions to read this data model.

WritePermitions String True

Permissions to write to this data model.

Sharing String True

Data model sharing type.

Username String True

Username of the Splunk user.

Pseudo-Columns

Pseudo-column fields are used in the WHERE clause of SELECT statements and offer more granular control over the tuples returned from the data source. As noted under Update, the Provisional pseudo-column can also be set in an UPDATE statement.

Name Type Description
Provisional Boolean

Indicates whether the data model is provisional. Provisional data models are not saved. Specify true to validate a data model before saving it.



Datasets

Create, query, update, and delete datasets in Splunk.

Select

The Datasets table requires DataModelId in the WHERE clause. The DataModelId column supports server-side processing for the = operator. All other search criteria are processed client-side within the driver.

SELECT * FROM Datasets WHERE DataModelId = 'SampleModel' 
You can turn off client-side execution of the query by setting SupportEnhancedSQL to false, in which case any search criteria that refer to other columns will cause an error or return inconsistent data.
Insert

Splunk allows inserts only when DataModelId, ParentName, and ObjectName are all specified.

INSERT INTO [Datasets] (ObjectName, ParentName, DataModelId) VALUES ('SampleSet', 'BaseEvent', 'SampleModel')
Update

The Datasets table allows updates when DataModelId is specified. The columns that can be updated in this case are the following: Description and DisplayName.

When ObjectName is also specified, you can update the following columns: ObjectDisplayName, ParentName, Comment, Fields, Calculations, Constraints, Lineage, ObjectSearchNoFields, ObjectSearch, AutoextractSearch, PreviewSearch, AccelerationSearch, BaseSearch, and TsidxNamespace.

UPDATE Datasets SET Description = 'model description' , DisplayName = 'Model Display Name' WHERE DataModelId = 'SampleModel' 



UPDATE Datasets SET ParentName = 'BaseEvent', BaseSearch = '| search (index=* OR index=_*) | fields _time, RootObject', AccelerationSearch = ' search (index=* OR index=_*) ' WHERE DataModelId = 'SampleModel' AND ObjectName = 'SampleSet' 
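The two update rules above can be checked before a statement is sent. The sketch below copies the column lists from the text; the helper unsupported_updates is hypothetical, and it assumes the two lists are mutually exclusive (model-level columns without ObjectName, object-level columns with it), which the text does not state explicitly.

```python
# Column sets copied from the Update rules above; the helper itself is hypothetical.
MODEL_LEVEL = {"Description", "DisplayName"}
OBJECT_LEVEL = {
    "ObjectDisplayName", "ParentName", "Comment", "Fields", "Calculations",
    "Constraints", "Lineage", "ObjectSearchNoFields", "ObjectSearch",
    "AutoextractSearch", "PreviewSearch", "AccelerationSearch", "BaseSearch",
    "TsidxNamespace",
}


def unsupported_updates(set_columns, where_columns):
    """Return the SET columns that the rules above do not list for this WHERE clause."""
    if "DataModelId" not in where_columns:
        raise ValueError("DataModelId is required in the WHERE clause")
    # Assumption: the presence of ObjectName switches to the object-level list.
    allowed = OBJECT_LEVEL if "ObjectName" in where_columns else MODEL_LEVEL
    return sorted(set(set_columns) - allowed)


print(unsupported_updates(["Description", "BaseSearch"], ["DataModelId"]))
print(unsupported_updates(["ParentName", "BaseSearch"], ["DataModelId", "ObjectName"]))
```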
Delete

Datasets can be deleted by providing the DataModelId and the ObjectName of the dataset.

DELETE FROM Datasets WHERE DataModelId = 'SampleModel' AND ObjectName = 'SampleSet'
Columns

Name Type ReadOnly References Description
ObjectName [KEY] String False

Name of the dataset object.

DatamodelId [KEY] String False

DataModels.Id

Id of the data model the object belongs to.

DisplayName String False

Name of the data model the object belongs to.

Description String False

Dataset description.

ObjectNameList String True

List of the objects in the data model.

ObjectDisplayName String False

Name displayed in Splunk for the object.

ParentName String False

Name of the Parent Event.

Comment String False

Dataset comments.

Fields String False

Dataset events indexed fields.

Calculations String False

Saved calculations for dataset fields.

Constraints String False

Saved constraints for dataset fields.

Lineage String False

Dataset lineage.

ObjectSearchNoFields String False

Object search query without fields.

ObjectSearch String False

Saved search query for the object.

AutoextractSearch String False

Search query for autoextraction.

PreviewSearch String False

Search preview query.

AccelerationSearch String False

Search query including acceleration.

BaseSearch String False

Basic search query.

TsidxNamespace String False

Allocated namespace.

EventBased Integer True

Number of Event-Based objects in the data model.

TransactionBased Integer True

Number of Transaction-Based objects in the data model.

SearchBased Integer True

Number of Search-Based objects in the data model.



SearchJobs

Create, query, update, and delete search jobs in Splunk.

Select

The driver uses the Splunk APIs to process the search Id (Sid) criteria specified in the WHERE clause. The Sid column supports server-side processing for the = operator. All other search criteria are processed client-side within the driver.

SELECT * FROM SearchJobs

SELECT * FROM SearchJobs WHERE Sid = '123456789.1234' 
You can turn off client-side execution of the query by setting SupportEnhancedSQL to false, in which case any search criteria that refer to other columns will cause an error or return inconsistent data.
Insert

Splunk allows inserts only when EventSearch is specified. You can insert the Custom, EarliestTime, LatestTime, Label, and StatusBuckets columns and all pseudocolumns.

INSERT INTO SearchJobs (Custom, EventSearch, LatestTime, Timeout) VALUES ('custom1=test1, custom2=test2', ' from datamodel SampleModel', 'now', '60')
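The Custom value in the INSERT above is a comma-separated list of key=value pairs. A minimal sketch of producing that string from a dictionary follows. The helper format_custom is a hypothetical name, and rejecting commas and equals signs is an assumption, since no escaping rule is documented here.

```python
def format_custom(pairs):
    """Join key/value pairs as 'key=value, key=value' for the Custom column."""
    for key, value in pairs.items():
        # Assumption: no escaping rule is documented, so separators are rejected.
        if "," in key or "=" in key or "," in str(value):
            raise ValueError("unsupported character in pair: " + key)
    return ", ".join(key + "=" + str(value) for key, value in pairs.items())


custom = format_custom({"custom1": "test1", "custom2": "test2"})
print(custom)
```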
Update

The SearchJobs table allows updates of the Custom column only when Sid is specified.

UPDATE SearchJobs SET Custom = 'custom1=test3, custom2=test4' WHERE Sid = '123456789.1234' 
Delete

SearchJobs can be deleted by providing the Sid.

DELETE FROM SearchJobs WHERE Sid = '123456789.1234'
Columns

Name Type ReadOnly References Description
Sid [KEY] String False

The search Id number.

EventSearch String False

Subset of the entire search that is before any transforming commands.

Custom String False

Custom job property. In an INSERT operation, pass the values as a comma-separated list of pairs of keys and values.

EarliestTime String False

The earliest time a search job is configured to start.

LatestTime String False

The latest time a search job is configured to start.

CursorTime String True

The earliest time from which no events are later scanned. Can be used to indicate progress.

Delegate String True

For saved searches, specifies jobs that were started by the user. Defaults to scheduler.

DiskUsage Long True

The total amount of disk space used, in bytes.

DispatchState String True

The state of the search. Can be any of QUEUED, PARSING, RUNNING, PAUSED, FINALIZING, FAILED, or DONE.

DoneProgress Double True

A number between 0 and 1.0 that indicates the approximate progress of the search. doneProgress = (latestTime-cursorTime) / (latestTime-earliestTime)

DropCount Integer True

For real-time searches only, the number of possible events that were dropped due to the rt_queue_size (defaults to 100000).

EventAvailableCount Integer True

The number of events that are available for export.

EventCount Integer True

The number of events returned by the search.

EventFieldCount Integer True

The number of fields found in the search results.

EventIsStreaming Boolean True

Indicates if the events of this search are being streamed.

EventIsTruncated Boolean True

Indicates if the events of the search are not stored, making them unavailable from the events endpoint for the search.

EventPreviewableCount Integer True

Number of in-memory events that are not yet committed to disk.

EventSorting String True

Indicates if the events of this search are sorted, and in which order.

IsDone Boolean True

Indicates if the search has completed.

IsEventsPreviewEnabled String True

Indicates if the timeline_events_preview setting is enabled in limits.conf.

IsFailed Boolean True

Indicates if there was a fatal error executing the search. For example, invalid search string syntax.

IsFinalized Boolean True

Indicates if the search was finalized (stopped before completion).

IsPaused Boolean True

Indicates if the search is paused.

IsPreviewEnabled Boolean True

Indicates if previews are enabled.

IsRealTimeSearch Boolean True

Indicates if the search is a real-time search.

IsRemoteTimeline Boolean True

Indicates if the remote timeline feature is enabled.

IsSaved Boolean True

Indicates that the search job is saved on disk. Search artifacts are saved on disk for 7 days from the last time that the job was viewed or touched.

IsSavedSearch Boolean True

Indicates if this is a saved search run using the scheduler.

IsZombie Boolean True

Indicates if the process running the search died without finishing the search.

Keywords String True

All positive keywords used by this search. A positive keyword is a keyword that is not in a NOT clause.

Label String False

Custom name created for this search.

Messages String True

Errors and debug messages.

NumPreviews Integer True

Number of previews generated so far for this search job.

Performance String True

A representation of the execution costs.

Priority Integer True

An integer between 0 and 10 that indicates the search priority.

RemoteSearch String True

The search string that is sent to every search peer.

ReportSearch String True

If reporting commands are used, the reporting search.

ResultCount Integer True

The total number of results returned by the search. In other words, this is the subset of scanned events (represented by the ScanCount) that actually matches the search terms.

ResultIsStreaming Boolean True

Indicates if the final results of the search are available using streaming (for example, no transforming operations).

ResultPreviewCount Integer True

The number of result rows in the latest preview results.

RunDuration Decimal True

Time in seconds that the search took to complete.

ScanCount Integer True

The number of events that are scanned or read off disk.

SearchEarliestTime Datetime True

Specifies the earliest time for a search, as specified in the search command rather than the EarliestTime parameter. It does not snap to the indexed data time bounds for all-time searches.

SearchLatestTime Datetime True

Specifies the latest time for a search, as specified in the search command rather than the LatestTime parameter. It does not snap to the indexed data time bounds for all-time searches.

SearchProviders String True

A list of all the search peers that were contacted.

StatusBuckets Integer False

Maximum number of timeline buckets.

TTL String True

The time to live, or the time before the search job expires after it completes.
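The DoneProgress formula given in the column list, doneProgress = (latestTime-cursorTime) / (latestTime-earliestTime), can be worked through with concrete numbers. The sketch below assumes the three times have already been converted to epoch seconds (the columns themselves are strings); the function name done_progress is hypothetical.

```python
def done_progress(earliest_time, latest_time, cursor_time):
    """Apply doneProgress = (latestTime - cursorTime) / (latestTime - earliestTime)."""
    span = latest_time - earliest_time
    if span <= 0:
        raise ValueError("latest_time must be later than earliest_time")
    return (latest_time - cursor_time) / span


# Times as epoch seconds: a one-hour window with the cursor 15 minutes before the end.
progress = done_progress(earliest_time=0, latest_time=3600, cursor_time=2700)
print(progress)  # 0.25
```

By this formula, progress is 0 when CursorTime equals LatestTime and 1.0 when it reaches EarliestTime.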

Pseudo-Columns

Pseudo-column fields are used in the WHERE clause of SELECT statements and offer more granular control over the tuples returned from the data source. In this table, pseudo-columns can also be supplied in INSERT statements, as described under Insert.

Name Type Description
SearchMode String

Searching mode, realtime or normal. If set to realtime, the search runs over the live data.

The allowed values are normal, realtime.

EnableLookups Boolean

Indicates whether lookups should be applied to events.

AutoPause Integer

If specified, the search job pauses after this many seconds of inactivity. (0 means never autopause.)

AutoCancel Integer

If specified, the job automatically cancels after this many seconds of inactivity. (0 means never autocancel.)

AdhocSearchLevel String

Specify a search mode. Use one of the following search modes: verbose, fast, or smart.

The allowed values are verbose, fast, smart.

ForceBundleReplication Boolean

Specifies whether this search should cause (and, depending on the value of SyncBundleReplication, wait for) bundle synchronization with all search peers.

IndexEarliest String

Specify a time string. Sets the earliest inclusive time bounds for the search, based on the index time bounds.

IndexLatest String

Specify a time string. Sets the latest exclusive time bounds for the search, based on the index time bounds.

IndexedRealtime Boolean

Indicates whether or not to use the indexed-realtime mode for real-time searches.

IndexedRealtimeOffset Integer

Sets disk sync delay for indexed real-time search (seconds).

MaxCount Integer

The number of events that can be accessible in any given status bucket.

MaxTime Integer

The number of seconds to run this search before finalizing. Specify 0 to never finalize.

Namespace String

The application namespace in which to restrict searches.

Now String

Specify a time string to set the absolute time used for any relative time specifier in the search. Defaults to the current system time. You can specify a relative time modifier for this parameter. For example, specify +2d to specify the current time plus two days.

ReduceFrequency Integer

Determines how frequently to run the MapReduce reduce phase on accumulated map values.

ReloadMacros Boolean

Specifies whether to reload macro definitions from the configuration file.

RemoteServerList String

Comma-separated list of (possibly wildcarded) servers from which raw events should be pulled.

ReplaySpeed Integer

Indicate a real-time search replay speed factor. For example, 1 indicates normal speed, 0.5 indicates half of normal speed, and 2 indicates twice as fast as normal.

ReplayStartTime String

Relative wall-clock start time for the replay.

ReplayEndTime String

Relative end time for the replay clock. The replay stops when the clock time reaches this time.

ReuseMaxSecondsAgo Integer

Specifies the number of seconds ago to check when an identical search is started and return the search Id of the job instead of starting a new job.

RequiredField String

Adds a required field to the search.

RealTimeBlocking Boolean

For a real-time search, indicates if the indexer blocks if the queue for this search is full.

RealTimeIndexFilter Boolean

For a real-time search, indicates if the indexer prefilters events.

RealTimeMaxBlockSecs Integer

For a real-time search with RealTimeBlocking set to true, the maximum time to block. Specify 0 to indicate no limit.

RealTimeQueueSize Integer

For a real-time search, the queue size (in events) that the indexer should use for this search.

Timeout Integer

The number of seconds to keep this search after processing has stopped.

SyncBundleReplication String

Specifies whether this search should wait for bundle replication to complete.





Views

  1. AlertsInInternalServer
  2. LookUpReport
  3. UploadedModel

AlertsInInternalServer

A dataset object in the example InternalServer data model.

Select

This is an example of a dataset view. These views are generated from dataset objects inside a data model. The driver will use the Splunk APIs to process the following query components; the driver processes other parts of the query client-side in memory.

All columns support server-side processing for the following operators and functions:

LIMIT, ORDER BY, GROUP BY, and HAVING are also processed server-side. The exception is when the selected columns include fields that are not in the GROUP BY clause; in that case GROUP BY, criteria, and limiting are handled client-side.

When an unsupported criterion or function is used, all processing is completed client-side (except selecting the specified fields). This is also the case when a SELECT statement has a column that is not in the GROUP BY clause.

For example, the driver uses the Splunk APIs to process the following queries.

SELECT Component, Timeendpos as Timeend FROM [AlertsInInternalServer] WHERE Component = 'Saved' OR EventType != '' AND Priority IS NOT NULL AND Linecount NOT IN ('1', '2') ORDER BY Priority DESC LIMIT 5 



SELECT AVG(Suppressed), Priority FROM [AlertsInInternalServer] GROUP BY Priority HAVING AVG(Suppressed) > 0 
You can turn off client-side execution of the query by setting SupportEnhancedSQL to false, in which case any search criteria that refer to other columns will cause an error or return inconsistent data.
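The GROUP BY exception described above comes down to one check: every selected column that is not an aggregate must appear in the GROUP BY clause. The sketch below illustrates that check only. The function name is hypothetical, callers mark aggregates themselves, and comparing names case-insensitively is an assumption.

```python
def group_by_runs_server_side(selected, group_by):
    """Return True when every non-aggregate selected column appears in GROUP BY.

    `selected` holds (expression, is_aggregate) pairs; column names are compared
    case-insensitively (an assumption made for this sketch).
    """
    grouped = {name.lower() for name in group_by}
    return all(
        is_aggregate or expression.lower() in grouped
        for expression, is_aggregate in selected
    )


# Matches the second example query above: Priority is grouped, AVG is an aggregate.
print(group_by_runs_server_side([("AVG(Suppressed)", True), ("Priority", False)], ["Priority"]))
# Host is selected but not grouped, so grouping would fall back to the client.
print(group_by_runs_server_side([("AVG(Suppressed)", True), ("Host", False)], ["Priority"]))
```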
Columns

Name Type Description
_time Datetime
component String
date_hour Int
date_mday Int
date_minute Int
date_month String
date_second Int
date_wday String
date_year Int
date_zone Int
digest_mode Int
dispatch_time Int
host String
linecount Int
log_level String
priority String
punct String
savedsearch_id String
scheduled_time Int
search_type String
server_alert_actions String
server_app String
server_message String
server_result_count Int
server_run_time Double
server_savedsearch_name String
server_sid String
server_status String
server_user String
source String
sourcetype String
splunk_server String
suppressed Int
thread_id String
timeendpos Int
timestartpos Int
window_time Int



LookUpReport

An example lookup report representing a view based on a saved report in Splunk.

Select

This is an example of a report view. These views are generated from saved reports in Splunk.

The driver will use the Splunk APIs to process the following query components; the driver processes other parts of the query client-side in memory.

Saved Search

Runs a saved search, or report, and returns the search results of a saved search. If the search contains replacement placeholder terms, such as $replace_me$, the search processor replaces the placeholders with the strings you specify.

For example:

Will generate the following search statement:

All replacement placeholder terms will be dynamic and saved as Pseudo-Columns.
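The placeholder replacement described above can be pictured as a simple token substitution. The sketch below is an illustration of the idea only, not how Splunk or the driver implements it. The search text, the value, and the helper fill_placeholders are all hypothetical.

```python
import re

# Matches tokens of the form $name$, such as $replace_me$.
PLACEHOLDER = re.compile(r"\$(\w+)\$")


def fill_placeholders(search, values):
    """Replace each $name$ token in `search` with values[name]."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("no value supplied for placeholder " + name)
        return values[name]

    return PLACEHOLDER.sub(substitute, search)


# Hypothetical saved search text and replacement value.
filled = fill_placeholders(
    "index=main sourcetype=$replace_me$", {"replace_me": "access_log"}
)
print(filled)
```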

All columns support server-side processing for the following operators and functions:

LIMIT, ORDER BY, GROUP BY, and HAVING are also processed server-side. The exception is when the selected columns include fields that are not in the GROUP BY clause; in that case GROUP BY, criteria, and limiting are handled client-side.

When an unsupported criterion or function is used, all processing is completed client-side (except selecting the specified fields). This is also the case when a SELECT statement has a column that is not in the GROUP BY clause.

For example, the driver processes the following queries server-side:

SELECT Country, Subregion as Sub FROM LookUpReport WHERE Iso2 != '123' OR continent = 'Europe' AND iso3 NOT IN ('example_1', 'example_2') ORDER BY Country DESC LIMIT 5 



SELECT AVG(Iso2), Subregion FROM LookUpReport GROUP BY Subregion HAVING AVG(Iso2) > 0 
You can turn off client-side execution of the query by setting SupportEnhancedSQL to false, in which case any search criteria that refer to other columns will cause an error or return inconsistent data.
Columns

Name Type Description
continent String
country String
iso2 String
iso3 String
region_un String
region_wb String
subregion String



UploadedModel

An example of a table object inside a data model.

Select

This is an example of a view generated from a table object inside a data model. The driver will use the Splunk APIs to process the following query components; the driver processes other parts of the query client-side in memory.

All columns support server-side processing for the following operators and functions.

LIMIT, ORDER BY, GROUP BY, and HAVING are also processed server-side. The exception is when the selected columns include fields that are not in the GROUP BY clause; in that case GROUP BY, criteria, and limiting are handled client-side.

When an unsupported criterion or function is used, all processing is completed client-side (except selecting the specified fields). This is also the case when a SELECT statement has a column that is not in the GROUP BY clause.

For example, the following queries are processed server-side:

SELECT Component, Timeendpos as Timeend FROM [UploadedModel] WHERE Component = 'Saved' OR DEST_CITY_MARKET_ID != '' AND DEST_AIRPORT_ID NOT IN ('1', '2') ORDER BY ORIGIN_AIRPORT_ID DESC LIMIT 5 



SELECT AVG(DEST_AIRPORT_ID), ORIGIN_AIRPORT_ID FROM [UploadedModel] GROUP BY ORIGIN_AIRPORT_ID HAVING AVG(DEST_AIRPORT_ID) > 0 
You can turn off client-side execution of the query by setting SupportEnhancedSQL to false, in which case any search criteria that refer to other columns will cause an error or return inconsistent data.
Columns

Name Type Description
_time Datetime
DEST_AIRPORT_ID Int
DEST_AIRPORT_SEQ_ID Int
DEST_CITY_MARKET_ID Int
host String
linecount Int
ORIGIN_AIRPORT_ID Int
ORIGIN_AIRPORT_SEQ_ID Int
ORIGIN_CITY_MARKET_ID Int
punct String
source String
sourcetype String
splunk_server String
timestamp String