8.1 Leveraging Time-based Partitioning
All imported data is automatically partitioned into hourly buckets, based on the ‘time’ field within each data record. By specifying the time range to query, you avoid reading unnecessary data and can thus speed up your query significantly.
8.1.1 WHERE time <=> Integer
When the ‘time’ field within the WHERE clause is specified, the query parser will automatically detect which partition(s) should be processed. Please note that this auto detection will not work if you specify the time withfloat _instead of int_.
[GOOD]: SELECT field1, field2, field3 FROM tbl WHERE time > 1349393020
[GOOD]: SELECT field1, field2, field3 FROM tbl WHERE time > 1349393020 + 3600
[GOOD]: SELECT field1, field2, field3 FROM tbl WHERE time > 1349393020 - 3600
[BAD]: SELECT field1, field2, field3 FROM tbl WHERE time > 13493930200 / 10
[BAD]: SELECT field1, field2, field3 FROM tbl WHERE time > 1349393020.00
[BAD]: SELECT field1, field2, field3 FROM tbl WHERE time BETWEEN 1349392000 AND 1349394000
8.1.2 TD_TIME_RANGE
An easier way to slice the data is to use TD_TIME_RANGE UDF.
[GOOD]: SELECT ... WHERE TD_TIME_RANGE(time, "2013-01-01 PDT")
[GOOD]: SELECT ... WHERE TD_TIME_RANGE(time, "2013-01-01", NULL, "PDT")
[GOOD]: SELECT ... WHERE TD_TIME_RANGE(time, "2013-01-01",
TD_TIME_ADD("2013-01-01", "1day", "PDT"))
However, if you use TD_TIME_FORMAT UDF or division in TD_TIME_RANGE, time partition opimization doesn’t work. For instance, the following conditions disable optimization.
[BAD]: SELECT ... WHERE TD_TIME_RANGE(time, TD_TIME_FORMAT(TD_SCHEDULED_TIME(), 'yyyy-MM-dd'))
[BAD]: SELECT ... WHERE TD_TIME_RANGE(time, TD_TIME_FORMAT(1356998401, 'yyyy-MM-dd'))
[BAD]: SELECT ... WHERE TD_TIME_RANGE(time, TD_SCHEDULED_TIME() / 86400 * 86400))
[BAD]: SELECT ... WHERE TD_TIME_RANGE(time, 1356998401 / 86400 * 86400))