After thinking about this more, I realized it is probably never useful, and
it was certainly incorrect in every case where the examples were still using
it.
Necessarily, this means that dma_start must now know what the size of the
response is, so that it can issue the appropriate number of ocbp instructions.
This also cleans up the inconsistent _command_buf and _recieve_buf declarations.
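A minimal sketch of why the response size is needed, assuming the SH4's 32-byte
operand cache line size. The names cache_lines_spanned and purge_range are
hypothetical and are not the actual dma_start code; the ocbp instruction is only
emitted when compiling for SH4.

```cpp
#include <cstdint>

// SH4 operand cache line size, in bytes.
constexpr uint32_t cache_line_size = 32;

// Number of cache lines touched by the byte range [addr, addr + size).
constexpr uint32_t cache_lines_spanned(uint32_t addr, uint32_t size)
{
  return size == 0
    ? 0
    : (((addr + size - 1) & ~(cache_line_size - 1))
       - (addr & ~(cache_line_size - 1))) / cache_line_size + 1;
}

// Hypothetical helper: issue one ocbp per cache line covering the buffer.
inline void purge_range(uint32_t addr, uint32_t size)
{
  uint32_t line = addr & ~(cache_line_size - 1);
  const uint32_t count = cache_lines_spanned(addr, size);
  for (uint32_t i = 0; i < count; i++) {
#if defined(__SH4__)
    asm volatile ("ocbp @%0" : : "r" (line) : "memory");
#endif
    line += cache_line_size;
  }
}
```

Note that a buffer that is not 32-byte aligned can span one more line than
size / 32 suggests, which is why the start address is part of the calculation.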
From the GCC manual:
> GCC permits a C structure to have no members:
>     struct empty {
>     };
> The structure has size zero. In C++, empty structures are part of the
> language. G++ treats empty structures as if they had a single member of type
> char.
I was not aware of the different behavior in C++.
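A small check of the C++ side of this, assuming g++ (or another compiler that
follows the same convention). Compiled as C with GCC, the same structure would
have size zero, per the manual text quoted above.

```cpp
struct empty {
};

// In C++, an empty structure still occupies storage; g++ treats it as if
// it had a single member of type char.
static_assert(sizeof (empty) == 1, "empty struct has size 1 in C++");
```

Any buffer declared as an array or member of such a type is therefore not
zero-sized when the translation unit is compiled as C++.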
This fixes every maple example--most were broken for multiple reasons, including
this one.
This also enables SH4 caching. This includes linking code/data into the P1
area (previously this was not the case).
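For reference, a sketch of how the SH4's fixed address areas relate to each
other: P1 (0x80000000) is cacheable and P2 (0xa0000000) is non-cacheable, and
both map onto the same 29-bit physical address. The helper names are
hypothetical and are not part of this repository.

```cpp
#include <cstdint>

constexpr uint32_t p1_base = 0x80000000;       // cacheable, untranslated
constexpr uint32_t p2_base = 0xa0000000;       // non-cacheable, untranslated
constexpr uint32_t area_mask = 0xe0000000;     // top three address bits
constexpr uint32_t physical_mask = 0x1fffffff; // low 29 address bits

// Same physical location, viewed through the cacheable P1 area.
constexpr uint32_t to_p1(uint32_t addr)
{
  return (addr & physical_mask) | p1_base;
}

// Same physical location, viewed through the non-cacheable P2 area.
constexpr uint32_t to_p2(uint32_t addr)
{
  return (addr & physical_mask) | p2_base;
}

constexpr bool is_p1(uint32_t addr)
{
  return (addr & area_mask) == p1_base;
}
```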
The maple examples (which make heavy use of DMA) need considerable work to
function correctly with the operand and copyback caches. The vibration example
is currently the most complete, though I should think more about how I want to
structure operand cache invalidation for maple responses in general.
Though I did spend much time thinking about this, my idea was not correct.
The "tearing" and "previous frame is being shown while it is being drawn"
artifacts occurred simply because that's exactly what the logic in
holly/core.cpp did.
This is no longer the case--by the time the newly-created core_flip function is
called, the core render is complete, and we should switch FB_R_SOF1 to the
current framebuffer, not the one that is going to be written on the next frame.
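A sketch of the flip ordering described here, not the actual core_flip code.
The display_regs struct stands in for the memory-mapped register, and the
framebuffer start addresses and the name flip_sketch are hypothetical.

```cpp
#include <cstdint>

// Hypothetical stand-in for the memory-mapped display register block.
struct display_regs {
  uint32_t FB_R_SOF1;
};

// Hypothetical framebuffer start addresses.
constexpr uint32_t framebuffer_start[2] = {0x000000, 0x200000};

// Called once the render into framebuffer `rendered` is complete: display
// the framebuffer that was just rendered, and return the index of the
// framebuffer to render into next.
inline int flip_sketch(display_regs& regs, int rendered)
{
  regs.FB_R_SOF1 = framebuffer_start[rendered];
  return rendered ^ 1;
}
```

The point is only that the displayed buffer is the one just completed, while
the other buffer becomes the next render target.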
This also modifies alt.lds so that (non-startup) code now runs in the P1 area,
with operand/instruction/copyback caches enabled. This caused a 10x speed
increase in my testing.
There are still texture sampling issues that I don't understand. Until
I properly understand this, using (bitmap) fonts that have
power-of-two dimensions seems to produce "acceptable but incorrect"
results.
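A sketch of how font dimensions could be checked or padded to a power of two.
Both helper names are hypothetical; this is not code from this repository, and
it does not address the underlying sampling issue.

```cpp
#include <cstdint>

// True if n is a non-zero power of two.
constexpr bool is_power_of_two(uint32_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

// Smallest power of two that is >= n. Assumes n <= 0x80000000.
constexpr uint32_t next_power_of_two(uint32_t n)
{
  uint32_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}
```

For example, a 12-pixel-wide glyph would be padded to 16 pixels, while an
8-pixel-wide glyph is left unchanged.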
Also adds the incomplete modifier_volume example.
This also adds vec2 for UV coordinates, and obj_to_cpp has been
modified to parse vertex texture coordinates from obj files.
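For illustration, a sketch of parsing one "vt" line from an obj file into a
two-component vector. This is not the obj_to_cpp implementation; the member
names u and v and the function parse_vt are assumptions.

```cpp
#include <cstdio>

// Hypothetical layout; the real vec2 may name its members differently.
struct vec2 {
  float u;
  float v;
};

// Parses a line of the form "vt <u> <v>". Returns true on success; `out`
// is only modified when both coordinates were read.
inline bool parse_vt(const char * line, vec2& out)
{
  float u;
  float v;
  if (std::sscanf(line, "vt %f %f", &u, &v) != 2)
    return false;
  out = {u, v};
  return true;
}
```

Lines with other prefixes, such as "v" or "vn", fail the literal match and are
rejected.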
I think this was only relevant when END_OF_RENDER_VIDEO was in use;
this doesn't seem to affect flycast's END_OF_RENDER_TSP
generation. The former is definitely a flycast bug.
The main issue with the previous code:
    constexpr uint32_t tiles = (640 / 32) * (320 / 32);
should have been:
    constexpr uint32_t tiles = (640 / 32) * (480 / 32);
As a consequence, some OPBs were being overwritten by
TA_NEXT_OPB, causing corruption (missing triangles, incomplete
drawings) in some tiles.
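One way to make this kind of mistake harder to repeat is to derive the tile
count from named dimensions. This is only a sketch; tile_size and tile_count
are hypothetical names, not identifiers from this repository.

```cpp
#include <cstdint>

constexpr uint32_t tile_size = 32; // tiles are 32x32 pixels

constexpr uint32_t tile_count(uint32_t width, uint32_t height)
{
  return (width / tile_size) * (height / tile_size);
}

// 640x480 is 20 * 15 tiles; the mistaken 640x320 is only 20 * 10.
static_assert(tile_count(640, 480) == 300, "expected 300 tiles at 640x480");
static_assert(tile_count(640, 320) == 200, "the old value covered 200 tiles");
```

The old value left 100 tiles unaccounted for, which is consistent with other
data being placed where their OPBs were.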
0x15 is 21, which is larger than the OPB size (16).
The 0x15 value directly causes CORE to hang given a sufficiently
"large" object list (more than ~2 triangles per tile).
After changing the pointer burst size to the intended value, 15, CORE no
longer hangs while drawing "large" object lists.
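The hex/decimal mix-up can be caught at compile time. In this sketch the
constant names are hypothetical, and the constraint that the pointer burst size
must not exceed the OPB size is an assumption drawn from the note above.

```cpp
#include <cstdint>

constexpr uint32_t opb_size = 16;          // OPB size, from the note above
constexpr uint32_t intended_burst = 15;    // decimal, as intended
constexpr uint32_t mistaken_burst = 0x15;  // hexadecimal, i.e. 21

// Assumed constraint: the burst size must not exceed the OPB size.
constexpr bool burst_fits(uint32_t burst)
{
  return burst <= opb_size;
}

static_assert(mistaken_burst == 21, "0x15 is 21 in decimal");
static_assert(burst_fits(intended_burst), "15 fits in an OPB of size 16");
static_assert(!burst_fits(mistaken_burst), "21 does not fit");
```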