PolarDBforPostgreSQL


Fault Injection Framework
=========================

A fault is defined as a point of interest in the source code with an
associated action to be taken when that point is hit during execution.
Fault points are defined using the SIMPLE_FAULT_INJECTOR() macro or by
directly invoking the FaultInjector_TriggerFaultIfSet() function.  A
fault point is identified by a name.  This module provides an interface
to inject a pre-defined fault point into a running PostgreSQL database
by associating an action with the fault point.  The action can be
error, panic, sleep, skip, infinite_loop, etc.
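
The relationship between a named fault point and its injected action
can be pictured with a small stand-alone model.  This is only a
sketch: every identifier below (ModelAction, model_inject_fault,
model_fault_point) is hypothetical and is not part of the module,
whose real entry points are the macro and function named above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical action codes; the real module supports more actions. */
typedef enum ModelAction
{
	MODEL_ACTION_NONE,			/* no fault injected for this name */
	MODEL_ACTION_ERROR,
	MODEL_ACTION_SKIP
} ModelAction;

typedef struct ModelFault
{
	const char *name;			/* caller keeps the string alive */
	ModelAction action;
} ModelFault;

#define MODEL_MAX_FAULTS 8

static ModelFault model_faults[MODEL_MAX_FAULTS];
static size_t model_num_faults = 0;

/* Associate an action with a named fault point, replacing any old one. */
static void
model_inject_fault(const char *name, ModelAction action)
{
	assert(name != NULL);

	for (size_t i = 0; i < model_num_faults; i++)
	{
		if (strcmp(model_faults[i].name, name) == 0)
		{
			model_faults[i].action = action;
			return;
		}
	}

	assert(model_num_faults < MODEL_MAX_FAULTS);
	model_faults[model_num_faults].name = name;
	model_faults[model_num_faults].action = action;
	model_num_faults++;
}

/* Called at a fault point: report the action injected for this name. */
static ModelAction
model_fault_point(const char *name)
{
	assert(name != NULL);

	for (size_t i = 0; i < model_num_faults; i++)
	{
		if (strcmp(model_faults[i].name, name) == 0)
			return model_faults[i].action;
	}
	return MODEL_ACTION_NONE;
}
```

In this model a fault point that has no injected action reports
MODEL_ACTION_NONE, which mirrors the idea that an un-injected fault
point has no effect on execution.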

SQL-based tests can use the "inject_fault()" interface to simulate
complex scenarios that are otherwise cumbersome to automate.

For example,

   select inject_fault('checkpoint', 'error');

The above command causes the next checkpoint to fail with elog(ERROR).
The 'checkpoint' fault is defined in the CreateCheckPoint() function
in xlog.c.  Note that the fault is set to trigger only once by
default, so subsequent checkpoints will not be affected by it.

   select inject_fault('checkpoint', 'status');

The above command checks the status of the fault.  It reports the
number of times the fault has been triggered during execution and
whether it has completed.  Faults that are completed will no longer
trigger.

   select wait_until_triggered_fault('checkpoint', 1);

The above command blocks until the checkpoint fault is triggered once.

   select inject_fault('checkpoint', 'reset');

The above command removes the fault, such that no action will be taken
when the fault point is reached during execution.  A fault can be set
to trigger more than once.  For example:

   select inject_fault_infinite('checkpoint', 'error');

This command causes checkpoints to fail until the fault is removed.

More detailed interface
-----------------------

A more detailed version of the fault injector interface accepts
several more parameters.  Let us assume that a fault named
"heap_insert" has been defined in the function heap_insert() in
backend code, like so:

--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1875,6 +1875,13 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
	Buffer		vmbuffer = InvalidBuffer;
	bool		all_visible_cleared = false;

+#ifdef FAULT_INJECTOR
+	FaultInjector_TriggerFaultIfSet(
+		"heap_insert",
+		"" /* database name */,
+		RelationGetRelationName(relation));
+#endif
+

A SQL test may want to inject the "heap_insert" fault such that
inserts into a table named "my_table" fail for the first 10 tuples.

   select inject_fault(
      'heap_insert',
      'error',
      '' /* database name */,
      'my_table' /* table name */,
      1 /* start occurrence */,
      10 /* end occurrence */,
      0 /* extraArg (seconds, for sleep) */);

The above command injects the heap_insert fault such that the
inserting transaction will abort with elog(ERROR) when the code
reaches the fault point, but only if the relation being inserted into
is named 'my_table'.  Moreover, the fault will stop triggering after
the fault point has been reached 10 times for my_table.  The 11th
insert into my_table will proceed as usual.
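
The table filter and the start/end occurrence window described above
can be sketched as a stand-alone model.  The struct and function names
are hypothetical, and counting only the hits that pass the table
filter is an assumption of this sketch rather than a statement about
the module's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one injected fault; not the module's struct. */
typedef struct ModelFault
{
	const char *table_name;		/* "" matches any table */
	int			start_occurrence;	/* first hit that fires (1-based) */
	int			end_occurrence;		/* last hit that fires */
	int			hits;			/* matching hits seen so far */
} ModelFault;

/*
 * Record one arrival at the fault point and report whether the
 * injected action fires for this arrival.
 */
static bool
model_fault_hit(ModelFault *fault, const char *table_name)
{
	assert(fault != NULL && table_name != NULL);

	/* A non-empty table filter must match exactly. */
	if (fault->table_name[0] != '\0' &&
		strcmp(fault->table_name, table_name) != 0)
		return false;

	fault->hits++;
	return fault->hits >= fault->start_occurrence &&
		fault->hits <= fault->end_occurrence;
}
```

With start occurrence 1 and end occurrence 10, the first ten matching
hits fire and the eleventh does not, while hits on other tables never
fire and do not consume the window.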

Fault actions
-------------

The fault action is specified as the type parameter of the
inject_fault() interface.  The following types are supported.

error
   elog(ERROR)

fatal
   elog(FATAL)

panic
   elog(PANIC)

sleep
   sleep for the specified amount of time, given in seconds through
   extraArg

infinite_loop
   block until the query is canceled or terminated

suspend
   block until the fault is removed

resume
   resume backend processes that are blocked due to a suspend fault

skip
   do nothing (used to implement custom logic that is not supported by
   predefined actions)

reset
   remove a previously injected fault

segv
   crash the backend process due to SIGSEGV

interrupt
   simulate cancel interrupt arrival, such that the next
   interrupt processing cycle will cancel the query

finish_pending
   similar to interrupt, sets the QueryFinishPending global flag

status
   return a text datum with details of how many times a fault has been
   triggered and the state it is currently in.  Fault states are as
   follows:

      "set" the fault has been injected but the fault point has not
      been reached during execution yet.

      "triggered" the fault point has been reached at least once during
      execution.

      "completed" the action associated with the fault point will no
      longer be taken because the fault point has been reached the
      maximum number of times during execution.
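
The three states can be summarized as a function of how often the
fault has triggered and how many occurrences it allows.  The sketch
below is a simplified model under stated assumptions: the enum, the
MODEL_NO_LIMIT sentinel (standing in for a fault injected with
inject_fault_infinite()) and both function names are hypothetical.

```c
#include <assert.h>

typedef enum ModelFaultState
{
	MODEL_FAULT_SET,
	MODEL_FAULT_TRIGGERED,
	MODEL_FAULT_COMPLETED
} ModelFaultState;

/* Hypothetical sentinel: the fault has no maximum occurrence count. */
#define MODEL_NO_LIMIT (-1)

/* Derive the reported state from the trigger count and the limit. */
static ModelFaultState
model_fault_state(int times_triggered, int max_occurrences)
{
	assert(times_triggered >= 0);
	assert(max_occurrences == MODEL_NO_LIMIT || max_occurrences >= 1);

	if (times_triggered == 0)
		return MODEL_FAULT_SET;
	if (max_occurrences != MODEL_NO_LIMIT &&
		times_triggered >= max_occurrences)
		return MODEL_FAULT_COMPLETED;
	return MODEL_FAULT_TRIGGERED;
}

/* Map a state to the name used in the list above. */
static const char *
model_fault_state_name(ModelFaultState state)
{
	switch (state)
	{
		case MODEL_FAULT_SET:
			return "set";
		case MODEL_FAULT_TRIGGERED:
			return "triggered";
		case MODEL_FAULT_COMPLETED:
			return "completed";
	}
	return "unknown";
}
```

Under this model a fault with the default single occurrence moves
straight from "set" to "completed" on its first trigger, and a fault
without a limit never leaves "triggered" until it is reset.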