{"id":8,"date":"2015-06-30T16:42:38","date_gmt":"2015-06-30T15:42:38","guid":{"rendered":"https:\/\/pinkieduck.wordpress.com\/?p=3"},"modified":"2019-02-22T22:54:23","modified_gmt":"2019-02-22T21:54:23","slug":"rsx-in-rpcs3","status":"publish","type":"post","link":"https:\/\/pinkieduck.net\/?p=8","title":{"rendered":"RSX in RPCS3"},"content":{"rendered":"<p>I&rsquo;m assuming the reader has some knowledge of graphics programming (with D3D or GL) and generally knows how modern CPUs work.<\/p>\n<p><b>The specs of RSX<\/b><\/p>\n<p>RSX is the graphics processor of the PS3. The acronym stands for Reality Synthesizer according to Wikipedia. It&rsquo;s actually based on Nvidia&rsquo;s own GeForce 7800 GTX, a DirectX 9 class GPU slightly modified to allow Cell&rsquo;s SPUs to do image processing. It has 256 MB of local\u00a0memory (PS3 terminology for \u00ab\u00a0video memory\u00a0\u00bb) with a 22.4 GB\/s bandwidth, according to Wikipedia again. Its processing power is 228 GFLOPS and it supports up to 4 render targets.<br \/>\nThe aforementioned customisations of RSX allow it to access the main memory (shared by PPU and SPU) at 20 GB\/s in the read direction and 15 GB\/s in the write direction. This means that even though it&rsquo;s slower to render a scene in main memory instead of local memory, the bandwidth hit is roughly 30% (15 GB\/s versus 22.4 GB\/s for writes).<\/p>\n<p>Now let&rsquo;s compare these numbers with the ones from a 2015 PC architecture:<br \/>\nA GeForce GTX 970 has 4 GB of video memory (x16) with a 200 GB\/s bandwidth (x10), can deliver 5 TFLOPS (x25) and supports 8 render targets. 
However the theoretical peak bandwidth of DDR3 is 20 GB\/s, which is barely what the PS3 offered in 2007.<br \/>\nThe implications for emulation are strong: we can&rsquo;t afford extra memory transfers between main memory and video memory.<\/p>\n<p><strong>RSX and Cell interaction<\/strong><\/p>\n<p>In all modern architectures the CPU and GPU execute independently from each other, and the PS3 is no exception.<br \/>\nRSX commands are 32-bit instructions that Cell puts sequentially into command buffers. There are 3 special commands that are used to break the sequential flow of RSX, namely JUMP (mostly used to move from one command buffer to another) and the CALL\/RETURN pair (used to implement subroutines). Of course Cell needs to be able to prevent RSX from reading commands faster than it can fill the command buffer, and thus RSX provides a\u00a0\u00ab\u00a0get\u00a0\u00bb and a \u00ab\u00a0put\u00a0\u00bb register accessible from Cell. \u00ab\u00a0Get\u00a0\u00bb contains the memory address of the command the RSX is currently reading. \u00ab\u00a0Put\u00a0\u00bb can be written to by Cell, and is used as a \u00ab\u00a0barrier\u00a0\u00bb, i.e. the RSX reads commands only while Get and Put are different. The Put register&rsquo;s purpose is similar to glFlush in OpenGL.<\/p>\n<p>RSX commands can be sorted into 3 categories:<\/p>\n<ul>\n<li>Commands that set RSX registers. Like most older GPUs, the RSX doesn&rsquo;t fetch non-buffer inputs from memory but from hardware registers. This includes the descriptions of textures, buffers, render surfaces&#8230; (their location, their format, their stride, &#8230;), vertex constants (RSX has storage for 512 vertex constant registers of 4&#215;32 bits), and some pipeline state related to blending operations or depth testing. 
I didn&rsquo;t mention fragment constants (or pixel shader constants if you prefer D3D terminology) because it looks like there is no true storage for them; Cell has to \u00ab\u00a0patch\u00a0\u00bb\u00a0the fragment program\/pixel shader in memory.<\/li>\n<li>Commands that issue actual rendering operations. So far I have only seen 3 of them: a \u00ab\u00a0clear surface\u00a0\u00bb command that clears render targets (the clear value being stored in a register), a \u00ab\u00a0draw\u00a0\u00bb command that issues a non-indexed rendering call, and an \u00ab\u00a0indexed draw\u00a0\u00bb command that issues an indexed rendering call.<\/li>\n<li>Commands managing semaphores. Semaphores\u00a0provide a more powerful sync mechanism than the get\/put registers. Semaphores are basically locations in memory that hold a 32-bit value. A wait command can be used to make RSX hold execution until the semaphore value is the same as the expected one, for instance to allow Cell to complete a buffer filling task. A release command can be used to make RSX write a specific value to the semaphore location; this way a Cell thread can be notified that RSX has finished with a given rendering command and that it is free to update buffers.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>I&rsquo;m assuming the reader has some knowledge of graphics programming (with D3D or GL) and generally knows how modern CPUs work. The specs of RSX RSX is the graphics processor of the PS3. The acronym stands for Reality Synthesizer according to Wikipedia. 
It&rsquo;s actually based on Nvidia&rsquo;s own GeForce 7800 GTX, a DirectX 9 class GPU [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/pinkieduck.net\/index.php?rest_route=\/wp\/v2\/posts\/8"}],"collection":[{"href":"https:\/\/pinkieduck.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/pinkieduck.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/pinkieduck.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/pinkieduck.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=8"}],"version-history":[{"count":1,"href":"https:\/\/pinkieduck.net\/index.php?rest_route=\/wp\/v2\/posts\/8\/revisions"}],"predecessor-version":[{"id":143,"href":"https:\/\/pinkieduck.net\/index.php?rest_route=\/wp\/v2\/posts\/8\/revisions\/143"}],"wp:attachment":[{"href":"https:\/\/pinkieduck.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=8"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/pinkieduck.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=8"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/pinkieduck.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=8"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}