Unit 4:
Database System: KDB/KSQL
• KDB is a database system implemented on top of the K language environment. Data structures
(e.g. tables) can be interchanged between the two and functions can be called in both directions.
• KDB supports an SQL dialect called KSQL. KSQL is easy to learn (for anyone fluent in SQL) and
carries over the speed and functionality of K to large data manipulation. KDB also supports most of
standard SQL.
• The basic data structure in KSQL is the arrable (array-table) which is a table whose order can be
exploited.
• Arrables are non-first-normal form objects: a field of a record can be an array. For example, an
entire time series can be stored in a field.
• Like most modern SQLs, KSQL allows the inclusion of user-defined functions inside database
statements. Unlike other SQL's KSQL allows functions to be defined over arrays as well as scalars.
• Like classical SQL, KSQL has aggregates, grouping, selections and string matching, and so on.
• KSQL adds many useful functions to SQL, permitting economical expression and often better
performance by exploiting order. For example, finding the second highest value is a linear time
operation in KSQL but requires a self-join in SQL, which is only sometimes linear time.
• KDB can function as a high performance distributed server with full recovery and distributed
consistency. (KDB guarantees consistency by using ordered atomic broadcast and a replicated state
machine design rather than two phase commit.)
• KDB is widely used in finance for time series analysis.
KSQL Basics -- an extended example
• Create tables (arrables, really, but we'll conform to standard terminology, even though it's slightly
inexact) either through K or within .t scripts. This example uses a mixture of the two.
• / Calling KSQL from K
•
• \l db / running from K, just load the library
•
•
• `"build stock"
stock.name: `Alcatel `Alstom `"Air France" `"France Telecom" `"Snecma" `Supra
• stock.industry: `telecom `engineering `"aviation" `"telecom" `"aviation" `consulting
•
• .d.r"stock: 'name'key stock"
, •
•.d.r"stock['Alcatel'].industry"
•
• `"build trade"
•
• n:1000 / number of trades
•
• t1: ,/("trade:([]"
• " stock:n rand stock.name,"
• " price:50 + .25 * n rand 200,"
• " amount:100 * 10 + n rand 20,"
• " date:date['1998-01-01'] + n rand 449)")
•
• .d.r t1
•
• t2: "trade:'date' asc trade"
• .d.r t2
•
• .d.r "show'trade'"
•
• / now it's time to calculate 6 month hi, low, average, volume
• / for a given stock.
• getstats:{[name;date]
• x: " select first stock, hi: max price, low: min price, "
• x,: " avgprice: avg price, vol: sum amount "
• x,: " from trade where stock = '"
• x,: ($name)
• x,: "', date within (date['"
• x,: ($date)
• x,: "'], date['"
• x,: ($date)
Database System: KDB/KSQL
• KDB is a database system implemented on top of the K language environment. Data structures
(e.g. tables) can be interchanged between the two and functions can be called in both directions.
• KDB supports an SQL dialect called KSQL. KSQL is easy to learn (for anyone fluent in SQL) and
carries over the speed and functionality of K to large data manipulation. KDB also supports most of
standard SQL.
• The basic data structure in KSQL is the arrable (array-table) which is a table whose order can be
exploited.
• Arrables are non-first-normal form objects: a field of a record can be an array. For example, an
entire time series can be stored in a field.
• Like most modern SQLs, KSQL allows the inclusion of user-defined functions inside database
statements. Unlike other SQL's KSQL allows functions to be defined over arrays as well as scalars.
• Like classical SQL, KSQL has aggregates, grouping, selections and string matching, and so on.
• KSQL adds many useful functions to SQL, permitting economical expression and often better
performance by exploiting order. For example, finding the second highest value is a linear time
operation in KSQL but requires a self-join in SQL, which is only sometimes linear time.
• KDB can function as a high performance distributed server with full recovery and distributed
consistency. (KDB guarantees consistency by using ordered atomic broadcast and a replicated state
machine design rather than two phase commit.)
• KDB is widely used in finance for time series analysis.
KSQL Basics -- an extended example
• Create tables (arrables, really, but we'll conform to standard terminology, even though it's slightly
inexact) either through K or within .t scripts. This example uses a mixture of the two.
• / Calling KSQL from K
•
• \l db / running from K, just load the library
•
•
• `"build stock"
stock.name: `Alcatel `Alstom `"Air France" `"France Telecom" `"Snecma" `Supra
• stock.industry: `telecom `engineering `"aviation" `"telecom" `"aviation" `consulting
•
• .d.r"stock: 'name'key stock"
, •
•.d.r"stock['Alcatel'].industry"
•
• `"build trade"
•
• n:1000 / number of trades
•
• t1: ,/("trade:([]"
• " stock:n rand stock.name,"
• " price:50 + .25 * n rand 200,"
• " amount:100 * 10 + n rand 20,"
• " date:date['1998-01-01'] + n rand 449)")
•
• .d.r t1
•
• t2: "trade:'date' asc trade"
• .d.r t2
•
• .d.r "show'trade'"
•
• / now it's time to calculate 6 month hi, low, average, volume
• / for a given stock.
• getstats:{[name;date]
• x: " select first stock, hi: max price, low: min price, "
• x,: " avgprice: avg price, vol: sum amount "
• x,: " from trade where stock = '"
• x,: ($name)
• x,: "', date within (date['"
• x,: ($date)
• x,: "'], date['"
• x,: ($date)