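ClustrixDB uses a streaming model: execution starts at containers (tables and indexes) and streams rows through a graph of operators. ClustrixDB has a rich set of operators, and the set occasionally grows as features and optimizations are added. This page documents all operators in one place, both logical and physical. Physical operators appear in EXPLAIN output while logical operators do not, so the list here is a superset of the operators you will see.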
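
The streaming model described above can be sketched with Python generators. This is only an illustration of rows flowing from a container through chained operators, not ClustrixDB's implementation: the function names mirror operators from the tables on this page, `filter_op`, `scan_log`, and the sample data are hypothetical, and treating LVAL and OVAL as limit and offset is an assumption.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def index_scan(rows: Iterable[Row], log: List[str]) -> Iterator[Row]:
    """Hypothetical container scan: yields stored rows one at a time."""
    for row in rows:
        log.append(f"scan {row['id']}")  # record how far the scan got
        yield row


def filter_op(input_row: Iterable[Row], expr: Callable[[Row], object]) -> Iterator[Row]:
    """Pass a row on only if expr is truthy; None (NULL), 0 and False are dropped."""
    for row in input_row:
        if expr(row):
            yield row


def row_limit(input_row: Iterable[Row], lval: int, oval: int = 0) -> Iterator[Row]:
    """Skip oval rows, emit at most lval rows, then stop pulling from the input.

    Assumption: LVAL is the limit and OVAL is the offset.
    """
    if lval <= 0:
        return
    emitted = 0
    for position, row in enumerate(input_row):
        if position < oval:
            continue
        yield row
        emitted += 1
        if emitted >= lval:
            return  # upstream generators are never asked for more rows


# Sample container: ten rows, every third row has qty == 0.
table = [{"id": i, "qty": i % 3} for i in range(1, 11)]
scan_log: List[str] = []

# Operator graph: index_scan -> filter -> row_limit
plan = row_limit(filter_op(index_scan(table, scan_log), lambda r: r["qty"]), lval=2)
result = list(plan)
```

Because each operator pulls rows lazily from its input, `row_limit` ends the scan after two rows have been read, which is the early-stop behavior the row_limit entry describes.
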
These are the operators commonly seen in EXPLAIN output at the MySQL prompt.
Operator | Arguments | Details | Example |
---|---|---|---|
display | (REFS [output_cast]) INPUT_ROW | Display output to the user. output_cast: optional argument; if present, display should promote values for the MySQL protocol. | |
stream_combine | () INPUT_ROW | Take an input that generates separate streams (such as an index_scan) and combine them. INPUT_ROW := bulk operator for separate input paths | |
stream_merge | () INPUT_ROW | Similar to stream_combine: take an input that generates separate streams. INPUT_ROW := bulk operator for separate input paths | |
index_scan | (NAMESPACE RELOID REPOID COLS . PROPS) EQUAL LOW HIGH NULLF? | Scan the representation. Namespaces are unique per relation. NAMESPACE := vs32 | |
table_scan | (NAMESPACE OID COLS . PROPS) . EXPR | NAMESPACE := vs32 that represents which namespace; OID := object ID of table; COLS := list of columns needed; PROPS := list of 0 or more of: update, use_index, force_index; EXPR := list of constraints to apply to the output | |
filter | () INPUT_ROW EXPR... | Apply filter to input. EXPR := item operator; null, 0, and false are filtered out | |
row_limit | CALC_FOUND_ROWS INPUT_ROW LVAL OVAL | Limit the number of rows produced. This operator can stop upstream input operators from producing rows once it has enough rows. CALC_FOUND_ROWS := boolean | |
row_count | NAME INPUT_ROW | Counts the number of rows produced. Assigns to NAME. | |
user_const | VALUE | User provided constant. VALUE := constant. | |
const | VALUE | VALUE := constant. | |
ref | (NAMESPACE . NAME) | NAMESPACE := vs32 of the namespace; NAME := string of the column name | |
param | (VALUE) | VALUE := u32 denoting which prepared parameter it is | |
range | MODE EXPR... | MODE := closed, open, equal; EXPR := expression for each column in the range | |
force_prop | PROP INPUT_ROW | Physical operator that forces a property on its input. PROP is one of: (sort KEYS) := force sort over list of KEYS; (limit LIMVAL OFFVAL) := force limit | |
func | OP EXPRS... | Function. OP := symbol of the operator | |
mtable_scan | (NAMESPACE SET? NELEMS . TYPES) MTABLE | Scan a memory table as if it were a real table. NAMESPACE := valuespace where the columns appear | |
mtable_build | (COLS) . EXPRS | Build a memory table. IN sets are sometimes converted to memory tables. COLS := list of column names to use (in the 0 namespace) | |
mtable_find | (NAMESPACE . TYPES) INPUT MTABLE EQUAL | Find a value in a memory table. NAMESPACE := valuespace where the columns appear | |
table_lock | (RELOID LOCK) EXPR | Grabs one or more table locks for the duration of the subexpression. RELOID := oid of table to lock | |
table_locks | NUMLOCKS EXPR | Placeholder for properly costing table locks. NUMLOCKS := vs32 of the number of table locks to grab | |
pk_lock | (NAMESPACE OID COLS LOCK) INPUT . PKREFS | Used to take fine-grained locks. NAMESPACE := vs32 | |
pk_lookup | (NAMESPACE RELOID REPOID COLS . PROPS) INPUT PKSARGE | Used to look up which fine-grained lock to take. NAMESPACE := vs32 | |
table_write | (OID REFS . PROPS) INPUT_ROW | OID := object ID of table; REFS := schema references containing values to use for each column; PROPS := list of 0 or more of: ignore, replace, impdflt, serialize, multirow, lock_dst_x, lock_dst_s; INPUT_ROW := bulk operator that produces rows (columns are taken from the input row schema) | |
table_delete | (OID REFS) INPUT_ROW | OID := object ID of table; REFS := schema references containing values to use for each column; INPUT_ROW := bulk operator that produces rows (columns are taken from the input row schema) | |
table_update | (OID OLD_REFS NEW_REFS . PROPS) INPUT_ROW | OID := object ID of table; OLD_REFS := schema references representing old column values; NEW_REFS := schema references representing new column values; PROPS := list of 0 or more of: ignore, serialize; INPUT_ROW := bulk operator that produces rows with old/new column values | |
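
The mtable_build and mtable_find entries say that IN sets are sometimes converted to memory tables and then probed. A minimal sketch of that idea follows, assuming the memory table behaves like a hash set of tuples; the real structure and argument handling are not described on this page, so the signatures and sample data below are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Row = Dict[str, object]
MTable = Set[Tuple[object, ...]]


def mtable_build(exprs: Iterable[Tuple[object, ...]]) -> MTable:
    """Materialize constant tuples into an in-memory set (stand-in for a memory table)."""
    return set(exprs)


def mtable_find(input_rows: Iterable[Row], mtable: MTable, equal: List[str]) -> Iterator[Row]:
    """Emit input rows whose EQUAL columns match some tuple in the memory table."""
    for row in input_rows:
        key = tuple(row[col] for col in equal)
        if key in mtable:
            yield row


# Roughly: ... WHERE color IN ('red', 'blue')
in_set = mtable_build([("red",), ("blue",)])
rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "green"},
    {"id": 3, "color": "blue"},
]
matches = list(mtable_find(rows, in_set, ["color"]))
```

Probing a set costs roughly constant time per input row, which is one plausible reason to materialize a long IN list once instead of comparing each row against every constant.
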

Aggregate operators perform GROUP BY and DISTINCT operations.

Operator | Arguments | Details | Example |
---|---|---|---|
aggregate | (KEY AGG_EXPR ONEOF) INPUT_ROW | Logical operator, implemented as one of the physical aggregates | - |
hash_aggregate | (KEY AGG_EXPR ONEOF) INPUT_ROW | use a sigma, produces unsorted output | |
sigma_aggregate | (KEY AGG_EXPR ONEOF) INPUT_ROW | use a sigma, produces output sorted on KEY | |
stream_aggregate | (KEY AGG_EXPR ONEOF) INPUT_ROW | non-blocking, maintains incoming sort order (if any) | |
dist_stream_aggregate | (KEY AGG_EXPR ONEOF) INPUT_ROW | stream_aggregate that occurs on multiple nodes (partitioned input) | |
hash_aggregate_partial | (KEY AGG_EXPR ONEOF) INPUT_ROW | hash_aggregate that occurs on multiple nodes | |
hash_aggregate_combine | (KEY AGG_EXPR ONEOF) INPUT_ROW | hash_aggregate that consumes results from _partial | |
distinct | _KEYS INPUT_ROW | Logical distinct operator, implemented as one of physical distinct operators. | |
sigma_distinct | _KEYS INPUT_ROW | produces output having only one row per distinct set of values for _KEYS. | |
sigma_distinct_combine | _KEYS INPUT_ROW | does a sigma distinct on results from sigma_distinct_partial. | |
sigma_distinct_partial | _KEYS INPUT_ROW | sigma_distinct that occurs on multiple nodes | |
sort | (KEY) INPUT_ROW | Logical sort operator. Force an ordering on the input. KEY := list of (STYPE REF) | |
sigma_sort | () INPUT_ROW [LIMIT] | Force an ordering on the input using a sigma. The context holds the sort keys, so they do not always show up in the input. | |
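
The hash_aggregate_partial and hash_aggregate_combine entries describe a two-phase aggregation: each node aggregates its own rows, and a final step consumes those partial results. The sketch below illustrates the pattern for COUNT and SUM over `(key, value)` pairs. It assumes a plain dictionary as the grouping structure, and the node data is hypothetical; it is not the engine's actual algorithm.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Partial = Dict[str, Tuple[int, int]]  # key -> (count, sum)


def hash_aggregate_partial(rows: Iterable[Tuple[str, int]]) -> Partial:
    """Per-node step: group by key and keep a partial (count, sum) per group."""
    groups: Dict[str, List[int]] = defaultdict(lambda: [0, 0])
    for key, value in rows:
        groups[key][0] += 1
        groups[key][1] += value
    return {key: (count, total) for key, (count, total) in groups.items()}


def hash_aggregate_combine(partials: Iterable[Partial]) -> Partial:
    """Final step: merge the partial results produced on each node."""
    merged: Dict[str, List[int]] = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (count, total) in partial.items():
            merged[key][0] += count
            merged[key][1] += total
    return {key: (count, total) for key, (count, total) in merged.items()}


# Hypothetical rows held by two nodes.
node_a = [("east", 10), ("west", 5), ("east", 7)]
node_b = [("west", 1), ("north", 4)]

combined = hash_aggregate_combine(
    [hash_aggregate_partial(node_a), hash_aggregate_partial(node_b)]
)
```

Only one small record per group leaves each node, so the combine step handles far fewer rows than the original input whenever groups repeat.
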

Notation | Explanation |
---|---|
schema reference | (NAMESPACE . NAME); NAMESPACE := vs32 that represents which namespace (AUTOGEN = 0); NAME := string of the column name |
KEY | List of schema references to use as keys we are grouping by |
_KEYS | keys to distinct by |
AGG_EXPR | List of (NAME FUNC DISTINCT REF) |
NAME | For aggregates, this means new schema name for this value in the AUTOGEN (= 0) namespace |
FUNC | Aggregate function as a symbol (sum, max, min, count) |
DISTINCT | Does this aggregate function operate on distinct input? |
REF | Schema reference |
ONEOF | List of schema references to pass for aggregate oneof |
INPUT_ROW | Bulk operator providing rows |
LIMIT | (optional) item operator representing max # rows |
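
To make the notation concrete, here is a rough sketch of how a stream_aggregate could consume KEY and a list of AGG_EXPR tuples `(NAME FUNC DISTINCT REF)`. It assumes the input already arrives grouped by KEY, which is what lets it emit each group as soon as the key changes. ONEOF and namespaces are omitted, references are plain column names, and the helper `_finish` and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Row = Dict[str, object]
AggExpr = Tuple[str, str, bool, str]  # (NAME, FUNC, DISTINCT, REF)

FUNCS = {"sum": sum, "max": max, "min": min}


def _finish(key: List[str], key_vals: Tuple[object, ...],
            agg_exprs: List[AggExpr], group: List[Row]) -> Row:
    """Build the output row for one finished group."""
    out: Row = dict(zip(key, key_vals))
    for name, func, distinct, ref in agg_exprs:
        values = [row[ref] for row in group if row[ref] is not None]  # skip NULLs
        if distinct:
            values = list(dict.fromkeys(values))  # de-duplicate, keep first-seen order
        if func == "count":
            out[name] = len(values)
        else:
            out[name] = FUNCS[func](values) if values else None
    return out


def stream_aggregate(key: List[str], agg_exprs: List[AggExpr],
                     input_row: Iterable[Row]) -> Iterator[Row]:
    """Emit one row per run of equal KEY values; only the current group is buffered."""
    current: Optional[Tuple[object, ...]] = None
    group: List[Row] = []
    for row in input_row:
        key_vals = tuple(row[k] for k in key)
        if group and key_vals != current:
            yield _finish(key, current, agg_exprs, group)
            group = []
        current = key_vals
        group.append(row)
    if group:
        yield _finish(key, current, agg_exprs, group)


rows = [
    {"dept": "a", "pay": 10},
    {"dept": "a", "pay": 10},
    {"dept": "b", "pay": 3},
]
aggs = [("total", "sum", False, "pay"), ("n_distinct", "count", True, "pay")]
out = list(stream_aggregate(["dept"], aggs, rows))
```

Output groups appear in the same order as the input, which matches the note that stream_aggregate maintains the incoming sort order.
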

Operator | Format | Details | Example |
---|---|---|---|
inner_join | STRAIGHT? INPUT1 INPUT2 . _EXPR | Logical operator, implemented as one of the physical joins | - |
nljoin | ONE_TO_ONE INPUT1 INPUT2 | Nested-loop join. Each row of INPUT1 is read and forwarded to the correct next place for INPUT2. | |
nljoin_par | ONE_TO_ONE INPUT1 INPUT2 | Same as nljoin, differs in certain parallelization properties, primarily exists to make mechanics of optimizer work. | |
left_join | STRAIGHT? INPUT1 INPUT2 . _EXPR | STRAIGHT is always true, so INPUT1 is read before INPUT2. If there is no matching row in INPUT2, NULL values are substituted. | |
left_semi_join | (PROBEREF) INPUT1 INPUT2 . _EXPR | Perform a left semi-join between INPUT1 and INPUT2, returning any rows from the left side for which the right side returns 1 or more rows | |
item_join | JTYPE INPUT_ROW IT_EXPR | item_join is a placeholder for subquery expressions where item operators contain bulk operators. IT_EXPR is item operator. | |
msjoin | ONE_TO_ONE INPUT1 INPUT2 | Merge sort nested-loop join. This is similar to nljoin, but is able to preserve sort order. | |
outer | () INPUT1 | Right half of a left outer join. | |
outer_fwd | () INPUT1 [INPUT2] | Right half of a left outer join. This form of outer performs the outer calculation on a single node, which makes it much more efficient, but can only be applied in some cases. The optional INPUT2 bulk operator does not affect cardinality. | |
dual | () | Generate a single row. | |

Notation | Explanation |
---|---|
STRAIGHT | force reading the left relation before the right |
INPUT1 | bulk operator for the left side |
INPUT2 | bulk operator for the right side |
EXPR | 0 or more item operators representing join constraints |
PROBEREF | optional ref indicating whether the semi-join succeeded |
ONE_TO_ONE | whether INPUT2 only has a cardinality of 1 or not |
JTYPE | one of: (inner), attach via inner_join; (semi PROBEREF), attach via left_semi_join |
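
The join semantics in the table can be illustrated with a small sketch. It models INPUT2 as a function that is run once per left row, which is an assumption made for illustration (the engine's forwarding between nodes is not modeled), and the tables `customers` and `orders` are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
RightSide = Callable[[Row], Iterable[Row]]  # INPUT2, evaluated per left row


def nljoin(input1: Iterable[Row], input2: RightSide) -> Iterator[Row]:
    """Nested-loop join: for each left row, run the right side and emit merged rows."""
    for left in input1:
        for right in input2(left):
            yield {**left, **right}


def left_join(input1: Iterable[Row], input2: RightSide, right_cols: List[str]) -> Iterator[Row]:
    """Like nljoin, but an unmatched left row is emitted with None (NULL) right columns."""
    for left in input1:
        matched = False
        for right in input2(left):
            matched = True
            yield {**left, **right}
        if not matched:
            yield {**left, **{col: None for col in right_cols}}


def left_semi_join(input1: Iterable[Row], input2: RightSide) -> Iterator[Row]:
    """Emit each left row once if the right side returns 1 or more rows for it."""
    for left in input1:
        if any(True for _ in input2(left)):
            yield left


customers = [{"cust": "ann"}, {"cust": "bob"}, {"cust": "cy"}]
orders = [
    {"order_id": 1, "cust": "ann"},
    {"order_id": 2, "cust": "bob"},
    {"order_id": 3, "cust": "ann"},
]


def orders_for(left: Row) -> List[Row]:
    """Right side: the orders that satisfy the join constraint for this customer."""
    return [{"order_id": o["order_id"]} for o in orders if o["cust"] == left["cust"]]


inner = list(nljoin(customers, orders_for))
outer = list(left_join(customers, orders_for, ["order_id"]))
semi = list(left_semi_join(customers, orders_for))
```

Here `inner` has three rows, `outer` adds a fourth row for `cy` with `order_id` set to `None`, and `semi` returns `ann` and `bob` once each regardless of how many orders they have.
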

These are the remaining operators; some are logical and some are physical.

Operator | Format | Details | Example |
---|---|---|---|
genoid | NAME INPUT_ROW | Create a unique id. NAME := name to assign a new oid | |
lock_choice | NUMLOCKS INPUT | Enforces either all row locks or all table locks for all reads. NUMLOCKS := vs32 of the number of locks to grab | |
any | () BULK_EXPR EXPR | BULK_EXPR := rows to look through; EXPR := test to apply for matches | |
check_scalar check_scalar_expr | () INPUT_ROW REF INPUT_ROW | Errors out if input contains more than 1 row. INPUT_ROW := bulk operator providing rows | |
compute | (NAMES TYPE) INPUT_ROW EXPR0 ... | NAMES := a list of new schema names for these values (in the AUTOGEN namespace); TYPE := normal, fake, serialized; INPUT_ROW := bulk operator for rows; EXPRn := item operators for the expressions named in NAMES. Expressions (and names) are ordered from least dependent to most, so expressions later in the list can reference names defined earlier, but not vice versa. | |
error | (RESULT_CODE STRING ARG1...) INPUT_ROW | | |
exists | () BULK_EXPR | BULK_EXPR := rows to look through | |
ferel_row | (NS FEREL TYPE) | NS := namespace; FEREL := ferel (for column description); TYPE := node type to create (t_input, etc) | |
multi_distinct | ((NS1 NUMKEY . COLSET1) ...) INPUT | NS1 := namespace for first sigma; NUMKEY := colindex for # of keys to distinct on; COLSET1 := list of refs for first sigma; ...; INPUT := bulk operator for row generator | |
parallel_apply | () SOURCE . INPUT | SOURCE := bulk operator to apply INPUTs against; INPUT := 1 or more bulk operators that take SOURCE as input | |
read_sigma | (NS . COLS) | NS := namespace for sigma; COLS := list of column names | |
tree_filter | () INPUT_ROW EXPR... | A special type of filter that pre-evaluates its expression(s), and then only executes the input if the expression(s) evaluate to true. This op thus requires that its expressions have no dependencies on the input. EXPR := item operator; null, 0, and false are filtered out; INPUT_ROW := bulk operator providing rows | |
tuple | () COL1... | COL1... := item operator expression for each column. Used to represent multicolumn comparisons (in, =, etc) | |
union_all | (REFS1 REFS2 REFSOUT) INPUT_ROW1 INPUT_ROW2 | REFS1 := schema references used from INPUT_ROW1; REFS2 := schema references used from INPUT_ROW2; REFSOUT := schema references representing UNION display columns; INPUT_ROW1 := bulk operator providing rows; INPUT_ROW2 := bulk operator providing rows | |
var_assign | (SCOPE NAME) VALUE | SCOPE := global/session/user; NAME := variable name; VALUE := value to assign variable to | |
var_read | (SCOPE NAME TYPE) | SCOPE := global/session/user; NAME := variable name; TYPE := oidtype of variable (only for user) | |
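
The difference between tree_filter and an ordinary filter is easiest to see in code. The sketch below assumes the input is represented as a zero-argument callable so that it can be left unexecuted; `expensive_scan` and the `calls` log are hypothetical names used only for this illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def tree_filter(make_input: Callable[[], Iterable[Row]],
                exprs: List[Callable[[], object]]) -> Iterator[Row]:
    """Evaluate input-independent expressions first; run the input only if all are truthy."""
    for expr in exprs:
        if not expr():  # None (NULL), 0 and False all skip the input entirely
            return
    yield from make_input()


calls: List[str] = []


def expensive_scan() -> Iterator[Row]:
    """Stand-in for a costly bulk operator; logs when it actually starts running."""
    calls.append("scan")
    yield {"id": 1}
    yield {"id": 2}


skipped = list(tree_filter(expensive_scan, [lambda: 0]))
calls_after_skip = list(calls)  # snapshot: the scan has not run yet
ran = list(tree_filter(expensive_scan, [lambda: 1 == 1]))
```

Since the expressions take no row argument, they cannot depend on the input, which mirrors the requirement stated in the tree_filter entry. A regular filter would instead read every input row and test the expression per row.
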