mirror of
https://github.com/HackerN64/F3DEX3.git
synced 2026-01-21 10:37:45 -08:00
Documenting snake
10
README.md
@@ -83,11 +83,11 @@ all at the same time!
 loads. If this system incorrectly culls supposedly repeated texture loads
 which actually differ due to segment manipulation, you can locally disable it
 using the new `SPDontSkipTexLoadsAcross` command.
-- New `SPTriangleStrip` and `SPTriangleFan` commands **pack up to 5 tris** into
-one 64-bit GBI command (up from 2 tris in F3DEX2). In any given object, most
-tris can be drawn with these commands, with only a few at the end drawn with
-`SP2Triangles` or `SP1Triangle`. So, this cuts the triangle portion of display
-lists roughly in half, saving DRAM traffic and ROM space.
+- New `SPTriSnake` command provides a flexible, generalized triangle strip
+primitive, which can better leverage the vertex cache than a traditional
+triangle strip. This packs up to 8 tris per display list command, for up to
+4x less memory bandwidth for loading tris; typical meshes should see a **2-3x
+memory bandwidth reduction** for this step.
 - New `SPAlphaCompareCull` command enables culling of triangles whose computed
 shade alpha values are all below or above a settable threshold. This
 **substantially reduces the performance penalty of cel shading**--only tris
Binary file not shown.
|
Before Width: | Height: | Size: 11 KiB |
@@ -123,3 +123,12 @@ By setting this to -2 and drawing an opaque tri, the tri would appear like a
 decal, but with no Z-fighting. This has been removed and replaced with the decal
 fix, which is automatic and does not require any special setup in the display
 list.
+
+## `SPTriStrip` and `SPTriFan`
+
+These commands are still supported in the GBI, but as special cases of
+`SPTriSnake` with specific sets of directions. In addition to covering both of
+these commands, the `SPTriSnake` command can draw the mirror-imaged 4-triangle
+strip which `SPTriStrip` could not (without inefficiency), as well as
+arbitrarily long triangle strips, fans, and other snake shapes via
+`SPContinueSnake`.
63
docs/Documentation/Triangle Snake.md
Normal file
@@ -0,0 +1,63 @@
@page snake Triangle Snake

*A triangle snake, drawn with a single F3DEX3 `gsSPTriSnake` command (and
multiple `gsSPContinueSnake`s). Flat shading is used to emphasize that each
consecutive triangle in the snake takes a new index as its Vertex 1, rather than
reusing one of the indices of the previous triangle. Drawing this with a single
snake uses 3.7x less memory bandwidth for triangle display list commands
compared to drawing the same mesh with `gsSP2Triangles` commands as in F3DEX2.*

**Triangle Snake** is F3DEX3's new accelerated triangles command. It can draw
any shape which is expressible as a single, non-branching chain of
connected triangles. At each triangle, the command encodes whether the snake
turns left or right--in other words, whether this triangle is attached to one or
the other of the yet-unconnected edges of the previous triangle. A traditional
triangle strip is a special case of a triangle snake with alternating directions
(left-right-left-right-etc.), and similarly a traditional triangle fan is a
triangle snake with the same direction repeatedly (left-left-left-etc.).
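
The turn rule can be sketched as a small host-side C simulation. This is an illustrative model only, not the microcode: it assumes the three-slot A-B-C scheme described in the `gbi.h` comments for `gSPTriSnake`, where A always receives the new index, a right turn copies the old A into B, and a left turn copies it into C. The names `SnakeTri`, `snake_expand`, `SNAKE_RIGHT`, and `SNAKE_LEFT` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for G_SNAKE_RIGHT / G_SNAKE_LEFT (assumed 0 and 1). */
enum { SNAKE_RIGHT = 0, SNAKE_LEFT = 1 };

/* One drawn triangle; a is the first vertex (the one used for flat shading). */
typedef struct { int a, b, c; } SnakeTri;

/*
 * Expand a snake into individual triangles.
 * i1..i3 start the snake; idx[k] and dir[k] describe each following vertex.
 * Writes n + 1 triangles to out and returns that count.
 */
static size_t snake_expand(int i1, int i2, int i3,
                           const int *idx, const int *dir, size_t n,
                           SnakeTri *out)
{
    SnakeTri cur = { i3, i1, i2 };   /* first triangle is i3-i1-i2 */
    size_t count = 0;

    assert(out != NULL);
    out[count++] = cur;
    for (size_t k = 0; k < n; k++) {
        assert(dir[k] == SNAKE_RIGHT || dir[k] == SNAKE_LEFT);
        if (dir[k] == SNAKE_RIGHT) {
            cur.b = cur.a;           /* right: old A replaces B */
        } else {
            cur.c = cur.a;           /* left: old A replaces C */
        }
        cur.a = idx[k];              /* the new index always lands in A */
        out[count++] = cur;
    }
    return count;
}
```

Feeding it `SNAKE_LEFT` four times after vertices 1, 2, 3 reproduces the fan order listed for `gSPTriFan` in `gbi.h` (v3-v1-v2, v4-v1-v3, and so on), while alternating right and left produces a strip.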

*A snake can slither by moving in an alternating left and right pattern. This
represents a triangle strip. Original photo by Bui Van Dong, free-use licensed*

*If the snake repeatedly turns in the same direction, it coils up. This
corresponds to a triangle fan. Original photo by Gabriel Rondina, free-use
licensed*

*The snake need not be constrained to either shape; it can turn left or right in
any combination. This can be thought of as concatenating triangle strips and
fans. Original photo by Al d'Vilas, free-use licensed*

A snake can be arbitrarily long. It starts with an `SPTriSnake` command, which
may be followed by one or more `SPContinueSnake` macros that encode further
indices. The latter are not commands (there's no command byte)--they are just
additional index data stored sequentially in the display list. In other words,
the display list input buffer is the storage for the index data. The microcode
correctly handles the case when the snake runs off the end of the input buffer
and the input buffer needs to be refilled. The refilled data begins at the start
of the input buffer, as if it were regular commands; this matters for the hints
system.
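
As a rough illustration of what that raw index data looks like, the sketch below packs four index bytes into one 32-bit word, following the byte layout used by the `_gSPTriSnakeW1` helper in `gbi.h`: each byte is the vertex index times 2, OR'd with the direction bit, and `G_SNAKE_LAST` (0x40) is OR'd into the index before doubling. The function names `snake_byte` and `snake_pack_word` and the `SNAKE_*` constants are hypothetical stand-ins, not part of the GBI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirrors of G_SNAKE_RIGHT, G_SNAKE_LEFT and G_SNAKE_LAST. */
enum { SNAKE_RIGHT = 0, SNAKE_LEFT = 1, SNAKE_LAST = 0x40 };

/*
 * One snake byte: vertex index doubled, direction in bit 0.
 * If SNAKE_LAST was OR'd into the index, doubling moves it to bit 7.
 */
static uint8_t snake_byte(unsigned index, unsigned dir)
{
    assert(dir <= 1);
    return (uint8_t)((index * 2u) | dir);
}

/* Pack four (index, dir) pairs into a word, first pair in the top byte. */
static uint32_t snake_pack_word(const unsigned idx[4], const unsigned dir[4])
{
    uint32_t w = 0;
    for (int k = 0; k < 4; k++) {
        w = (w << 8) | snake_byte(idx[k], dir[k]);
    }
    return w;
}
```

An `SPContinueSnake` is then simply two such words back to back. Masking a byte with 0x7E, as the microcode does for index c, recovers the doubled vertex index without either flag.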

## Memory Bandwidth

The goal of any accelerated triangles system in a microcode is to reduce the
memory bandwidth used for loading triangle indices. The actual tris drawn are
the same regardless of how their indices are encoded in the display list, so we
do not consider the performance of actually drawing the tris, only loading their
indices.

An `SPTriSnake` command by itself contains 7 vertex indices and draws 5
triangles (because the first triangle needs two extra vertices to start
itself). An `SPContinueSnake` macro contains 8 vertex indices and draws 8 tris,
each continuing the existing snake. The F3D family microcodes before F3DEX3 only
provided `SP1Triangle` and `SP2Triangles` commands, so any snake of 3 or more
tris is more efficient than F3DEX2 and older microcodes. The efficiency gain
is up to 4x (2 tris -> 8 tris per 8-byte macro), though in typical meshes the
gain is expected to be 2-3x.
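
The arithmetic above can be checked with a short calculation. This sketch only counts 8-byte display list words using the figures stated above (5 tris in the initial `SPTriSnake`, 8 tris per `SPContinueSnake`, 2 tris per `SP2Triangles`), and it assumes the whole mesh forms one unbroken snake. The function names are hypothetical.

```c
#include <assert.h>

/* 8-byte words needed to draw n tris as one unbroken snake. */
static unsigned snake_words(unsigned n)
{
    assert(n >= 1);
    if (n <= 5) {
        return 1;                    /* a lone SPTriSnake holds up to 5 tris */
    }
    return 1 + (n - 5 + 7) / 8;      /* plus one SPContinueSnake per 8 more */
}

/* 8-byte words needed with F3DEX2-style SP2Triangles / SP1Triangle. */
static unsigned f3dex2_words(unsigned n)
{
    return (n + 1) / 2;              /* two tris per word, odd one alone */
}
```

For 13 tris this gives 2 words versus 7, a 3.5x reduction, and the ratio approaches 4x as a single snake grows longer.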

## Vertex Cache Locality

The key advantage of a triangle snake over a traditional triangle strip is that
it better exploits the vertex cache.
BIN
docs/Documentation/snake_coil.jpg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 324 KiB |
1045
docs/Documentation/snake_demo.svg
Normal file
File diff suppressed because it is too large
|
After Width: | Height: | Size: 64 KiB |
BIN
docs/Documentation/snake_demo_ingame.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 202 KiB |
BIN
docs/Documentation/snake_mixed.jpg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 307 KiB |
BIN
docs/Documentation/snake_slither.jpg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 312 KiB |
@@ -5,3 +5,4 @@
 - @subpage removed
 - @subpage performance
 - @subpage porting
+- @subpage snake
4
f3dex3.s
@@ -1285,7 +1285,7 @@ tri_snake_loop_from_input_buffer:
     li $ra, tri_snake_loop // For tri_main
     bltz $3, tri_snake_end // Upper bit of real index b set = done
     andi $11, $3, 1 // Get direction flag from index c
-    beqz inputBufferPos, tris_end // TODO tri_snake_over_input_buffer // == 0 at end of input buffer
+    beqz inputBufferPos, tri_snake_over_input_buffer // == 0 at end of input buffer
     andi $3, $3, 0x7E // Mask out flags from index c
     sb $3, rdpHalf1Val + 1 // Store index c as vertex 1
     sb $2, (rdpHalf1Val + 2)($11) // Store old v1 as 2 if dir clear or 3 if set
@@ -2604,7 +2604,7 @@ tris_end:

 tri_snake_end:
     addi inputBufferPos, inputBufferPos, 7 // Round up to whole input command
-    addi $11, $zero, 0xFFF8 // Sign-extend; andi is zero-extend!
+    addi $11, $zero, 0xFFF8 // Sign-extend; andi is zero-extend!
     j tris_end
     and inputBufferPos, inputBufferPos, $11 // inputBufferPos has to be negative

126
gbi.h
@@ -2327,9 +2327,6 @@ _DW({ \
  * gSPVertex(glistp++, v, 3, 2);
  * ```
  *
- * @note
- * Because the RSP geometry transformation engine uses a vertex list with triangle list architecture, it is quite powerful. A simple one-triangle macro retains least performance compared to @ref gSP2Triangles or the new 5 tris commands in EX3 (@ref gSPTriStrip, @ref gSPTriFan).
- *
  * @param v is the pointer to the vertex list (segment address)
  * @param n is the number of vertices
  * @param v0 is the load vertex by index vo(0~55) in vertex buffer
@@ -2681,19 +2678,19 @@ _DW({ \
 }

 /**
- * Make the triangle snake turn left before drawing this triangle. In other
+ * Make the triangle snake turn right before drawing this triangle. In other
  * words, build the new triangle off the newest and middle-age vertices of the
  * last triangle.
  * @see gSPTriSnake
  */
-#define G_SNAKE_LEFT 0
+#define G_SNAKE_RIGHT 0
 /**
- * Make the triangle snake turn right before drawing this triangle. In other
+ * Make the triangle snake turn left before drawing this triangle. In other
  * words, build the new triangle off the newest and oldest vertices of the last
  * triangle.
  * @see gSPTriSnake
  */
-#define G_SNAKE_RIGHT 1
+#define G_SNAKE_LEFT 1
 /**
  * Logical-OR this into a triangle index to mark it as the last triangle of the
  * snake. In other words, this gets OR'd into the last valid index, not the
@@ -2706,16 +2703,16 @@ _DW({ \
  */
 #define G_SNAKE_LAST 0x40

-#define _gSPTriSnakeW0(i1, i2, i3) \
-    (_SHIFTL(G_TRISNAKE, 24, 8) | \
-     _SHIFTL((i2)*2, 16, 8) | \
-     _SHIFTL((i1)*2, 8, 8) | \
-     _SHIFTL((i3)*2|G_SNAKE_RIGHT, 0, 8))
+#define _gSPTriSnakeW0(i1, i2, i3) \
+    (_SHIFTL(G_TRISNAKE, 24, 8) | \
+     _SHIFTL((i2)*2, 16, 8) | \
+     _SHIFTL((i1)*2, 8, 8) | \
+     _SHIFTL((i3)*2|G_SNAKE_LEFT, 0, 8))
 #define _gSPTriSnakeW1(i4, i4d, i5, i5d, i6, i6d, i7, i7d) \
-    (_SHIFTL((i4)*2|(i4d), 24, 8) | \
-     _SHIFTL((i5)*2|(i5d), 16, 8) | \
-     _SHIFTL((i6)*2|(i6d), 8, 8) | \
-     _SHIFTL((i7)*2|(i7d), 0, 8))
+    (_SHIFTL((i4)*2|(i4d), 24, 8) | \
+     _SHIFTL((i5)*2|(i5d), 16, 8) | \
+     _SHIFTL((i6)*2|(i6d), 8, 8) | \
+     _SHIFTL((i7)*2|(i7d), 0, 8))

 /**
  * Triangle snake is F3DEX3's accelerated triangles command. It is a generalized
@@ -2728,31 +2725,31 @@ _DW({ \
  * The drawing algorithm is:
  * - Initialize 3 bytes of stored triangle indices, A-B-C, to i3-i1-i2, and draw
  *   this triangle. (This initialization and draw is actually implemented by
- *   storing i2-i1-i3 and then running the algorithm below with G_SNAKE_RIGHT,
+ *   storing i2-i1-i3 and then running the algorithm below with G_SNAKE_LEFT,
  *   which ends up storing i2 to C and i3 to A, ultimately creating i3-i1-i2.)
  * - Loop:
  *     - If the index in A has G_SNAKE_LAST or'd into it, exit.
  *     - Increment the input pointer, and read the next index and its direction
  *       flag (currently i4 and i4d).
- *     - If the direction flag is G_SNAKE_LEFT, copy A to B; else
- *       (G_SNAKE_RIGHT), copy A to C.
+ *     - If the direction flag is G_SNAKE_RIGHT, copy A to B; else
+ *       (G_SNAKE_LEFT), copy A to C.
  *     - Store the new index (currently i4) to A.
  *     - Draw the triangle A-B-C and repeat the loop.
  *
  * For example, after drawing the first triangle i3-i1-i2, if i4 is
- * G_SNAKE_LEFT, the snake turns left and draws i4-i3-i2:
- *     4 -->-- 3
- *      \'    /'\        (winding order and
- *       \   /   \       first vertex for flat
- *        \ /     \      shading are marked)
- *         2 --<-- 1
- * Conversely, after the first triangle i3-i1-i2, if i4 is G_SNAKE_RIGHT, the
- * snake turns right and draws i4-i1-i3:
- *         3 -->-- 4
- *        /'\    '/
- *       /   \   /
- *      /     \ /
- *     2 --<-- 1
+ * G_SNAKE_RIGHT, the snake turns right and draws i4-i3-i2:
+ *         3 --<-- 4
+ *        /'\    '/      (winding order and
+ *       /   \   /       first vertex for flat
+ *      /     \ /        shading are marked)
+ *     1 -->-- 2
+ * Conversely, after the first triangle i3-i1-i2, if i4 is G_SNAKE_LEFT, the
+ * snake turns left and draws i4-i1-i3:
+ *     4 --<-- 3
+ *      \'    /'\
+ *       \   /   \
+ *        \ /     \
+ *         1 -->-- 2
  * If the snake turns in the same direction repeatedly, it will coil up, forming
  * a triangle fan. If it slithers left and right alternately, this will form a
  * triangle strip. Any combination of these is also possible. In particular, a
@@ -2762,6 +2759,11 @@ _DW({ \
  * snake, except for tris which have two unconnected edges which can only be the
  * first or last tris of the snake.
  *
+ * Logical-OR G_SNAKE_LAST into the last valid index of the snake. This index
+ * still needs a valid G_SNAKE_LEFT or G_SNAKE_RIGHT for its direction. However,
+ * for all indices after this, you can fill the index and direction parameters
+ * with 0s.
+ *
  * @see gSPContinueSnake to extend the snake to more than 5 triangles.
  */
 #define gSPTriSnake(pkt, i1, i2, i3, i4, i4d, i5, i5d, i6, i6d, i7, i7d) \
@@ -2802,7 +2804,7 @@ _DW({ \
 #define gsSPContinueSnake(i0, i0d, i1, i1d, i2, i2d, i3, i3d, \
                           i4, i4d, i5, i5d, i6, i6d, i7, i7d) \
 { \
-    _gSPTriSnakeW1(i0, i0d, i1, i1d, i2, i2d, i3, i3d) \
+    _gSPTriSnakeW1(i0, i0d, i1, i1d, i2, i2d, i3, i3d), \
     _gSPTriSnakeW1(i4, i4d, i5, i5d, i6, i6d, i7, i7d) \
 }

@@ -2823,25 +2825,25 @@ _DW({ \
  * @note One of the two handednesses of a 4 tri strip cannot be drawn directly
  * with gSPTriStrip, unless v1 and v2 are set to the same vertex to create a
  * degenerate triangle, which costs a little performance. However, now this
- * shape can be drawn with gSPTriSnake (directions right-left-right).
+ * shape can be drawn with gSPTriSnake (directions left-right-left).
  */
-#define gSPTriStrip(pkt, v1, v2, v3, v4, v5, v6, v7) \
-    gSPTriSnake(pkt, v1, v2, \
-        v3 | ((v4 & 0x80) ? G_SNAKE_LAST : 0), \
-        v4 | ((v5 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
-        v5 | ((v6 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
-        v6 | ((v7 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
-        v7, G_SNAKE_RIGHT)
+#define gSPTriStrip(pkt, v1, v2, v3, v4, v5, v6, v7) \
+    gSPTriSnake(pkt, v1, v2, \
+        (v3) | (((v4) & 0x80) ? G_SNAKE_LAST : 0), \
+        (v4) | (((v5) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
+        (v5) | (((v6) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
+        (v6) | (((v7) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
+        (v7) | G_SNAKE_LAST, G_SNAKE_LEFT)
 /**
  * @copydetails gSPTriStrip
  */
-#define gsSPTriStrip(v1, v2, v3, v4, v5, v6, v7) \
-    gsSPTriSnake(v1, v2, \
-        v3 | ((v4 & 0x80) ? G_SNAKE_LAST : 0), \
-        v4 | ((v5 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
-        v5 | ((v6 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
-        v6 | ((v7 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
-        v7, G_SNAKE_RIGHT)
+#define gsSPTriStrip(v1, v2, v3, v4, v5, v6, v7) \
+    gsSPTriSnake(v1, v2, \
+        (v3) | (((v4) & 0x80) ? G_SNAKE_LAST : 0), \
+        (v4) | (((v5) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
+        (v5) | (((v6) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
+        (v6) | (((v7) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
+        (v7) | G_SNAKE_LAST, G_SNAKE_LEFT)
 /**
  * 5 Triangles in fan arrangement. Draws the following tris:
  * v3-v1-v2, v4-v1-v3, v5-v1-v4, v6-v1-v5, v7-v1-v6
@@ -2849,23 +2851,23 @@ _DW({ \
  *
  * @deprecated Use gSPTriSnake directly.
  */
-#define gSPTriFan(pkt, v1, v2, v3, v4, v5, v6, v7) \
-    gSPTriSnake(pkt, v1, v2, \
-        v3 | ((v4 & 0x80) ? G_SNAKE_LAST : 0), \
-        v4 | ((v5 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
-        v5 | ((v6 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
-        v6 | ((v7 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
-        v7, G_SNAKE_RIGHT)
+#define gSPTriFan(pkt, v1, v2, v3, v4, v5, v6, v7) \
+    gSPTriSnake(pkt, v1, v2, \
+        (v3) | (((v4) & 0x80) ? G_SNAKE_LAST : 0), \
+        (v4) | (((v5) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
+        (v5) | (((v6) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
+        (v6) | (((v7) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
+        (v7) | G_SNAKE_LAST, G_SNAKE_LEFT)
 /**
  * @copydetails gSPTriFan
  */
-#define gsSPTriFan(v1, v2, v3, v4, v5, v6, v7) \
-    gsSPTriSnake(v1, v2, \
-        v3 | ((v4 & 0x80) ? G_SNAKE_LAST : 0), \
-        v4 | ((v5 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
-        v5 | ((v6 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
-        v6 | ((v7 & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_RIGHT, \
-        v7, G_SNAKE_RIGHT)
+#define gsSPTriFan(v1, v2, v3, v4, v5, v6, v7) \
+    gsSPTriSnake(v1, v2, \
+        (v3) | (((v4) & 0x80) ? G_SNAKE_LAST : 0), \
+        (v4) | (((v5) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
+        (v5) | (((v6) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
+        (v6) | (((v7) & 0x80) ? G_SNAKE_LAST : 0), G_SNAKE_LEFT, \
+        (v7) | G_SNAKE_LAST, G_SNAKE_LEFT)


 /*