/*
 * Copyright (c) 2015, 2018, Oracle and/or its affiliates. All rights reserved.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License, version 2.0,
 * as published by the Free Software Foundation.
 *
 * This program is also distributed with certain software (including
 * but not limited to OpenSSL) that is licensed under separate terms,
 * as designated in a particular file or component or in included license
 * documentation. The authors of MySQL hereby grant you an additional
 * permission to link the program and your derivative works with the
 * separately licensed software that they have included with MySQL.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License, version 2.0, for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
 */

/** @page mysqlx_protocol_use_cases Use Cases

Topics in this section:

- @ref use_cases_Prepared_Statements_with_Single_Round_Trip
- @ref use_cases_Streaming_Inserts
- @ref use_cases_SQL_with_Multiple_Resultsets
- @ref use_cases_Inserting_CRUD_Data_in_a_Batch
- @ref use_cases_Cross_Collection_Update_and_Delete

Prepared Statements with Single Round-Trip {#use_cases_Prepared_Statements_with_Single_Round_Trip}
==========================================

In the MySQL Client/Server %Protocol, a ``PREPARE``/``EXECUTE`` cycle
requires two round-trips, as the ``COM_STMT_EXECUTE`` needs the
statement id from the ``COM_STMT_PREPARE-ok`` packet.

@startuml "Single Round-Trip"
group tcp packet
Client -> Server: COM_STMT_PREPARE(...)
end
group tcp packet
Server --> Client: COM_STMT_PREPARE-ok(**stmt_id=1**)
end
group tcp packet
Client -> Server: COM_STMT_EXECUTE(**stmt_id=1**)
end
group tcp packet
Server --> Client: Resultset
end
@enduml

The X %Protocol is designed in a way that the ``EXECUTE`` stage
doesn't depend on the response of the ``PREPARE`` stage.

@startuml "Without PREPARE Stage Dependency"
group tcp packet
Client -> Server: Sql.PrepareStmt(**stmt_id=1**, ...)
end
group tcp packet
Server --> Client: Sql.PreparedStmtOk
end
group tcp packet
Client -> Server: Sql.PreparedStmtExecute(**stmt_id=1**, ...)
end
group tcp packet
Server --> Client: Sql.PreparedStmtExecuteOk
end
@enduml

This allows the client to send both ``PREPARE`` and ``EXECUTE`` back to
back without waiting for the server's response.

@startuml "Without Server Response"
group tcp packet
Client -> Server: Sql.PrepareStmt(**stmt_id=1**, ...)
Client -> Server: Sql.PreparedStmtExecute(**stmt_id=1**, ...)
end

group tcp packet
Server --> Client: Sql.PreparedStmtOk
Server --> Client: Sql.PreparedStmtExecuteOk
end
@enduml
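
For illustration, a minimal sketch of this pipelining at the socket
level. The framing (a little-endian ``uint32`` length followed by a
one-byte message type) matches the X %Protocol, but the type ids and
payload bytes below are placeholders; a real client would serialize the
corresponding protobuf messages.

@code{py}
import socket
import struct

def frame(msg_type: int, payload: bytes) -> bytes:
    # X Protocol framing: the length covers the type byte plus payload.
    return struct.pack("<IB", len(payload) + 1, msg_type) + payload

# Placeholder type ids and payloads -- stand-ins for the serialized
# prepare/execute protobuf messages.
PREPARE, EXECUTE = 1, 2
prepare_msg = frame(PREPARE, b"stmt_id=1 ...")
execute_msg = frame(EXECUTE, b"stmt_id=1 ...")

sock = socket.create_connection(("127.0.0.1", 33060))

# Pipelining: both messages are written back to back; only afterwards
# does the client read the two replies, in order.
sock.sendall(prepare_msg + execute_msg)
@endcode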

@note
See the @ref mysqlx_protocol_implementation
"Implementation Notes" for how to implement pipelining efficiently.

Streaming Inserts {#use_cases_Streaming_Inserts}
=================

When inserting a large set of data (a data import), a trade-off has to
be made among:

- memory usage on the client and server side

- network round-trips

- error reporting

For this example it is assumed that 1 million rows, each 1024 bytes in
size, have to be transferred to the server.

**Optimizing for Network Round-Trips**

(Assuming the MySQL Client/Server %Protocol in this case.) Network
round-trips can be minimized by creating one huge SQL statement of up
to 1 GByte in chunks of 16 MByte (the protocol's maximum frame size),
sending it over the wire and letting the server execute it.

@code{sql}
INSERT INTO tbl VALUES
( 1, "foo", "bar" ),
( 2, "baz", "fuz" ),
...;
@endcode

In this case:

- the client can generate the SQL statement in chunks of 16 MByte and
  write them to the network (as sketched below)

- *(memory usage)* the server waits until the full 1 GByte is received

- *(execution delay)* before it can start parsing and executing it

- *(error reporting)* in case of an error (parse error, duplicate-key
  error, ...) the whole 1 GByte message is rejected without any good
  way of knowing where in that big message the error happened
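
A minimal sketch of that chunked generation, assuming pre-escaped
values and the table from the example above:

@code{py}
CHUNK_SIZE = 16 * 1024 * 1024  # the protocol's maximum frame size

def insert_chunks(rows, chunk_size=CHUNK_SIZE):
    # Yield pieces of one huge multi-row INSERT, each at most
    # chunk_size bytes, so the full statement never has to be held
    # in client memory at once.
    buf = "INSERT INTO tbl VALUES\n"
    sep = ""
    for id_, c1, c2 in rows:
        buf += "%s( %d, '%s', '%s' )" % (sep, id_, c1, c2)
        sep = ",\n"
        if len(buf) >= chunk_size:
            yield buf
            buf = ""
    yield buf + ";"

rows = ((i, "foo", "bar") for i in range(1_000_000))
for chunk in insert_chunks(rows):
    pass  # write the chunk to the connection
@endcode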

The *Execution Time* for inserting all rows in one batch is:

@code{unparsed}
1 * RTT +
(num_rows * Row Size / Network Bandwidth) +
num_rows * Row Parse Time +
num_rows * Row Execution Time
@endcode

**Optimizing for Memory Usage and Error-Reporting**

The other extreme is using single-row ``INSERT`` statements:

@code{sql}
INSERT INTO tbl VALUES
( 1, "foo", "bar" );

INSERT INTO tbl VALUES
( 2, "baz", "fuz" );

...
@endcode

- the client can generate statements as it receives data and stream
  them to the server

- *(execution delay)* the server starts executing statements as soon
  as it receives the first row

- *(memory usage)* the server only has to buffer a single row

- *(error reporting)* if inserting one row fails, the client knows
  about it when it happens

- as each statement results in its own round-trip, the network latency
  is paid per row instead of once

- each statement has to be parsed and executed by the server

Using Prepared Statements solves the last bullet point:

@startuml "Optimization for Memory"
Client -> Server: prepare("INSERT INTO tbl VALUES (?, ?, ?)")
Server --> Client: ok(stmt=1)
Client -> Server: execute(1, [1, "foo", "bar"])
Server --> Client: ok
Client -> Server: execute(1, [2, "baz", "fuz"])
Server --> Client: ok
@enduml
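
The same pattern with a real client, sketched with
mysql-connector-python's prepared-statement cursors over the classic
protocol (connection parameters are placeholders):

@code{py}
import mysql.connector

cnx = mysql.connector.connect(host="127.0.0.1", user="user",
                              password="...", database="test")
cur = cnx.cursor(prepared=True)

# The statement is prepared on the first execute() and reused after
# that; each execute() still costs one network round-trip.
for row in ((1, "foo", "bar"), (2, "baz", "fuz")):
    cur.execute("INSERT INTO tbl VALUES (?, ?, ?)", row)
cnx.commit()
@endcode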

The *Execution Time* for inserting all rows using prepared statements
and the MySQL Client/Server %Protocol is:

@code{unparsed}
num_rows * RTT +
(num_rows * Row Size / Network Bandwidth) +
1 * Row Parse Time +
num_rows * Row Execution Time
@endcode

**Optimizing for Execution Time and Error-Reporting**

In the X %Protocol, a pipeline can be used to stream messages to the
server while the server is executing the previous message.

@startuml "Optimization for Execution"
group tcp packet
Client -> Server: Sql.PrepareStmt(stmt_id=1, ...)
Client -> Server: Sql.PreparedStmtExecute(stmt_id=1, values= [ .. ])
Client -> Server: Sql.PreparedStmtExecute(stmt_id=1, values= [ .. ])
Client -> Server: Sql.PreparedStmtExecute(stmt_id=1, values= [ .. ])
end
note over Client, Server
data too large to be merged into one TCP packet
end note

group tcp packet
Server --> Client: Sql.PreparedStmtOk
Server --> Client: Sql.PreparedStmtExecuteOk
Server --> Client: Sql.PreparedStmtExecuteOk
end

group tcp packet
Client -> Server: Sql.PreparedStmtExecute(stmt_id=1, values= [ .. ])
Client -> Server: Sql.PreparedStmtExecute(stmt_id=1, values= [ .. ])
end

group tcp packet
Server --> Client: Sql.PreparedStmtExecuteOk
Server --> Client: Sql.PreparedStmtExecuteOk
Server --> Client: Sql.PreparedStmtExecuteOk
end
@enduml

The *Execution Time* for inserting all rows using prepared statements
and pipelining is (assuming that the network is not saturated):

@code{unparsed}
1 * RTT +
(1 * Row Size / Network Bandwidth) +
1 * Row Parse Time +
num_rows * Row Execution Time
@endcode

- ``one`` *network latency* to get the initial ``prepare``/``execute``
  across the wire

- ``one`` *network bandwidth* to get the initial ``prepare``/``execute``
  across the wire; all further commands arrive at the server before the
  executor needs them, thanks to pipelining

- ``one`` *row parse time* to parse the ``prepare``

- ``num_rows`` *row execution time* stays as before
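
To make the trade-off concrete, a back-of-the-envelope comparison of
the three cost formulas; the timing numbers are invented for
illustration, only the row count and row size come from the example
above:

@code{py}
num_rows = 1_000_000
row_size = 1024        # bytes per row, from the example
rtt = 1e-3             # round-trip time in seconds (assumed)
bandwidth = 125e6      # bytes/second, roughly 1 GBit/s (assumed)
parse = 10e-6          # per-statement parse time (assumed)
execute = 20e-6        # per-row execution time (assumed)

transfer = num_rows * row_size / bandwidth

one_batch = 1 * rtt + transfer + num_rows * parse + num_rows * execute
per_row = num_rows * rtt + transfer + 1 * parse + num_rows * execute
pipelined = 1 * rtt + row_size / bandwidth + 1 * parse + num_rows * execute

for name, t in (("one batch", one_batch), ("per row", per_row),
                ("pipelined", pipelined)):
    print("%-10s %8.2f s" % (name, t))
@endcode

With these numbers the per-row variant is dominated by the
``num_rows * RTT`` term, while the pipelined variant is bounded by the
row execution time alone.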

In case *error reporting* isn't a major concern, one can combine
multi-row ``INSERT`` statements with pipelining and reduce the per-row
network overhead further. This is important in case the network is
saturated.

SQL with Multiple Resultsets {#use_cases_SQL_with_Multiple_Resultsets}
============================

@startuml "Multiple Resultsets"
group prepare
Client -> Server: Sql.PrepareStmt(stmt_id=1, "CALL multi_resultset_sp()")
Server --> Client: Sql.PreparedStmtOk()
end
group execute
Client -> Server: Sql.PreparedStmtExecute(stmt_id=1, cursor_id=1)
Server --> Client: Sql.PreparedStmtExecuteOk()
end
group fetch rows
Client -> Server: Cursor::FetchResultset(cursor_id=1)
Server --> Client: Resultset::ColumnMetaData
Server --> Client: Resultset::Row
Server --> Client: Resultset::Row
Server --> Client: Resultset::DoneMoreResultsets
end
group fetch last resultset
Client -> Server: Cursor::FetchResultset(cursor_id=1)
Server --> Client: Resultset::ColumnMetaData
Server --> Client: Resultset::Row
Server --> Client: Resultset::Row
Server --> Client: Resultset::Done
end
group close cursor
Client -> Server: Cursor::Close(cursor_id=1)
Server --> Client: Cursor::Ok
end
@enduml
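
How a client consumes such a sequence, sketched with
mysql-connector-python over the classic protocol (the procedure name is
taken from the diagram, connection parameters are placeholders):

@code{py}
import mysql.connector

cnx = mysql.connector.connect(host="127.0.0.1", user="user",
                              password="...", database="test")
cur = cnx.cursor()
cur.callproc("multi_resultset_sp")

# Each entry returned by stored_results() corresponds to one
# ColumnMetaData/Row.../Done(MoreResultsets) sequence on the wire.
for resultset in cur.stored_results():
    for row in resultset.fetchall():
        print(row)
@endcode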

Inserting CRUD Data in a Batch {#use_cases_Inserting_CRUD_Data_in_a_Batch}
==============================

Inserting multiple documents into a collection ``col1`` is a two-step
process:

1. prepare the insert

2. pipeline the execute messages
@startuml "Batch
|
|
Client -> Server: Crud::PrepareInsert(stmt_id=1, Collection(name="col1"))
|
|
Server --> Client: PreparedStmt::PrepareOk
|
|
loop
|
|
Client -> Server: PreparedStmt::Execute(stmt_id=1, values=[ doc ])
|
|
Server --> Client: PreparedStmt::ExecuteOk
|
|
end loop
|
|
Client -> Server: PreparedStmt::Close(stmt_id=1)
|
|
Server --> Client: PreparedStmt::CloseOk
|
|
@enduml
|
|
|
|

By utilizing pipelining, the ``execute`` messages can be batched
without waiting for the corresponding ``executeOk`` messages to be
returned.
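
Connectors expose this batching through the X DevAPI; a sketch with
mysql-connector-python's ``mysqlx`` module (the collection name comes
from the example, the schema name and connection URI are placeholders):

@code{py}
import mysqlx

session = mysqlx.get_session("mysqlx://user:password@127.0.0.1:33060")
col1 = session.get_schema("test").get_collection("col1")

# A single add() with many documents lets the connector send the
# inserts in a batch instead of one round-trip per document.
col1.add([{"_id": str(i), "seq": i} for i in range(1000)]).execute()

session.close()
@endcode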

Cross-Collection Update and Delete {#use_cases_Cross_Collection_Update_and_Delete}
==================================

Deleting documents from collection ``col2`` based on data from
collection ``col1``.

Instead of fetching all rows from ``col1`` first to construct one big
``delete`` message, the operation can also be run as a nested loop:

@code{unparsed}
Crud.PrepareDelete(stmt_id=2, Collection(name="col2"), filter={ id=? })
Crud.PrepareFind(stmt_id=1, Collection(name="col1"), filter={ ... })

Sql.PreparedStmtExecute(stmt_id=1, cursor_id=2)

while ((rows = Sql.CursorFetch(cursor_id=2))):
  Sql.PreparedStmtExecute(stmt_id=2, values = [ rows.col2_id ])

Sql.PreparedStmtClose(stmt_id=2)
Sql.PreparedStmtClose(stmt_id=1)
@endcode
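
The same nested loop expressed against the X DevAPI, sketched with
mysql-connector-python's ``mysqlx`` module (collection and field names
come from the example above; the ``find`` filter is elided there, so it
is left out here too; schema name and connection URI are placeholders):

@code{py}
import mysqlx

session = mysqlx.get_session("mysqlx://user:password@127.0.0.1:33060")
schema = session.get_schema("test")
col1 = schema.get_collection("col1")
col2 = schema.get_collection("col2")

# Delete from col2 per document found in col1; the connector is free to
# pipeline the remove() executions instead of waiting on each result.
for doc in col1.find().execute().fetch_all():
    col2.remove("id = :id").bind("id", doc["col2_id"]).execute()

session.close()
@endcode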

@startuml "Update and Delete"
Client -> Server: Crud::PrepareFind(stmt_id=1, filter=...)
Client -> Server: Crud::PrepareDelete(stmt_id=2, filter={ id=? })
Server --> Client: PreparedStmt::PrepareOk
Server --> Client: PreparedStmt::PrepareOk
Client -> Server: PreparedStmt::ExecuteIntoCursor(stmt_id=1, cursor_id=2)
Server --> Client: PreparedStmt::ExecuteOk
Client -> Server: Cursor::FetchResultset(cursor_id=2, limit=batch_size)
loop batch_size
Server --> Client: Resultset::Row
Client -> Server: PreparedStmt::Execute(stmt_id=2, values=[ ? ])
break
alt
Server --> Client: Resultset::Suspended
else
Server --> Client: Resultset::Done
end alt
end break
end loop
loop batch_size
Server --> Client: PreparedStmt::ExecuteOk
end loop
@enduml

*/