This page lists my published academic research. The publications include papers as well as SIGGRAPH courses and talks. Most of this research was carried out during my PhD.
2015 |
|
![]() | Olsson, Ola; Persson, Emil; Billeter, Markus Real-time Many-light Management and Shadows with Clustered Shading Inproceedings ACM SIGGRAPH 2015 Courses, pp. 12:1–12:398, ACM, Los Angeles, California, 2015, ISBN: 978-1-4503-3634-5. @inproceedings{Olsson:2015:RMM:2776880.2792712, title = {Real-time Many-light Management and Shadows with Clustered Shading}, author = {Ola Olsson and Emil Persson and Markus Billeter}, url = {http://doi.acm.org/10.1145/2776880.2792712}, doi = {10.1145/2776880.2792712}, isbn = {978-1-4503-3634-5}, year = {2015}, date = {2015-08-10}, booktitle = {ACM SIGGRAPH 2015 Courses}, pages = {12:1--12:398}, publisher = {ACM}, address = {Los Angeles, California}, series = {SIGGRAPH '15}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
![]() | Olsson, Ola; Billeter, Markus; Sintorn, Erik; Kämpe, Viktor; Assarsson, Ulf More Efficient Virtual Shadow Maps for Many Lights Journal Article Visualization and Computer Graphics, IEEE Transactions on, 21 (6), pp. 701-713, 2015, ISSN: 1077-2626. @article{olsson:Shadows:2015:pp, title = {More Efficient Virtual Shadow Maps for Many Lights}, author = {Ola Olsson and Markus Billeter and Erik Sintorn and Viktor Kämpe and Ulf Assarsson}, url = {http://dx.doi.org/10.1109/TVCG.2015.2418772 /wp-content/uploads/clustered_shadows_tvcg.pdf}, doi = {10.1109/TVCG.2015.2418772}, issn = {1077-2626}, year = {2015}, date = {2015-06-01}, journal = {Visualization and Computer Graphics, IEEE Transactions on}, volume = {21}, number = {6}, pages = {701-713}, abstract = {Recently, several algorithms have been introduced that enable real-time performance for many lights in applications such as games. In this paper, we explore the use of hardware-supported virtual cube-map shadows to efficiently implement high-quality shadows from hundreds of light sources in real time and within a bounded memory footprint. In addition, we explore the utility of ray tracing for shadows from many lights and present a hybrid algorithm combining ray tracing with cube maps to exploit their respective strengths. Our solution supports real-time performance with hundreds of lights in fully dynamic high-detail scenes.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2014 |
|
![]() | Olsson, Ola; Billeter, Markus; Persson, Emil Efficient Real-Time Shading with Many Lights Inproceedings SIGGRAPH Asia 2014 Courses, pp. 11:1–11:310, ACM, Shenzhen, China, 2014, ISBN: 978-1-4503-3195-1. @inproceedings{Olsson:2014:ERS:2659467.2659475b, title = {Efficient Real-Time Shading with Many Lights}, author = {Ola Olsson and Markus Billeter and Emil Persson}, url = {http://doi.acm.org/10.1145/2659467.2659475}, doi = {10.1145/2659467.2659475}, isbn = {978-1-4503-3195-1}, year = {2014}, date = {2014-12-06}, booktitle = {SIGGRAPH Asia 2014 Courses}, pages = {11:1--11:310}, publisher = {ACM}, address = {Shenzhen, China}, series = {SA '14}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
![]() | Olsson, Ola; Sintorn, Erik; Kämpe, Viktor; Billeter, Markus; Assarsson, Ulf Implementing Efficient Virtual Shadow Maps for Many Lights Inproceedings ACM SIGGRAPH 2014 Talks, pp. 50:1–50:1, ACM, Vancouver, Canada, 2014, ISBN: 978-1-4503-2960-6. @inproceedings{olsson:impl:2014, title = {Implementing Efficient Virtual Shadow Maps for Many Lights}, author = {Ola Olsson and Erik Sintorn and Viktor Kämpe and Markus Billeter and Ulf Assarsson}, url = {http://doi.acm.org/10.1145/2614106.2614202 /wp-content/uploads/clustered_with_shadows_siggraph_2014.pdf /wp-content/uploads/clustered_with_shadows_siggraph_2014.pptx }, doi = {10.1145/2614106.2614202}, isbn = {978-1-4503-2960-6}, year = {2014}, date = {2014-08-10}, booktitle = {ACM SIGGRAPH 2014 Talks}, pages = {50:1--50:1}, publisher = {ACM}, address = {Vancouver, Canada}, series = {SIGGRAPH '14}, abstract = {In the past few years, several techniques have been presented that enable real-time shading using many hundreds or thousands of lights [Harada et al. 2013]. However, only recently has a comprehensive study including shadows been presented by Olsson et al. [2014], where real-time performance is achieved for several hundred light sources with high quality and controllable memory footprint. The new algorithm uses many modern features of OpenGL and contains many design choices only described very briefly in the paper. We present additional details and focus on the practical implementation aspects of the system, in order to facilitate the implementation of the algorithm for the game development community.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
![]() | Sintorn, Erik; Kämpe, Viktor; Olsson, Ola; Assarsson, Ulf Compact Precomputed Voxelized Shadows Journal Article ACM Transactions on Graphics, 33 (4), 2014, (SIGGRAPH 2014). @article{SintornCompact2014, title = {Compact Precomputed Voxelized Shadows}, author = {Erik Sintorn and Viktor Kämpe and Ola Olsson and Ulf Assarsson}, url = {http://doi.acm.org.proxy.lib.chalmers.se/10.1145/2601097.2601221 /wordpress/wp-content/uploads/precomputed_shadows_2014.pdf}, doi = {10.1145/2601097.2601221}, year = {2014}, date = {2014-07-10}, journal = {ACM Transactions on Graphics}, volume = {33}, number = {4}, abstract = {Producing high-quality shadows in large environments is an important and challenging problem for real-time applications such as games. We propose a novel data structure for precomputed shadows, which enables high-quality filtered shadows to be reconstructed for any point in the scene. We convert a high-resolution shadow map to a sparse voxel octree, where each node encodes light visibility for the corresponding voxel, and compress this tree by merging common subtrees. The resulting data structure can be many orders of magnitude smaller than the corresponding shadow map. We also show that it can be efficiently evaluated in real time with large filter kernels.}, note = {SIGGRAPH 2014}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
![]() | Olsson, Ola; Sintorn, Erik; Kämpe, Viktor; Billeter, Markus; Assarsson, Ulf Efficient Virtual Shadow Maps for Many Lights Inproceedings Proceedings of the 18th Meeting of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, pp. 87-96, ACM, New York, NY, USA, 2014, ISBN: 978-1-4503-2717-6. @inproceedings{olsson:Shadows:2014, title = {Efficient Virtual Shadow Maps for Many Lights}, author = {Ola Olsson and Erik Sintorn and Viktor Kämpe and Markus Billeter and Ulf Assarsson}, url = {/wp-content/uploads/clustered_with_shadows.pdf /wordpress/wp-content/uploads/clustered_with_shadows_slides.pdf https://www.youtube.com/watch?v=jjAE0h5VNT0 }, doi = {10.1145/2556700.2556701}, isbn = {978-1-4503-2717-6}, year = {2014}, date = {2014-03-14}, booktitle = {Proceedings of the 18th Meeting of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games}, pages = {87-96}, publisher = {ACM}, address = {New York, NY, USA}, series = {I3D '14}, abstract = {Recently, several algorithms have been introduced that enable real-time performance for many lights in applications such as games. In this paper, we explore the use of hardware-supported virtual cube-map shadows to efficiently implement high-quality shadows from hundreds of light sources in real time and within a bounded memory footprint. In addition, we explore the utility of ray tracing for shadows from many lights and present a hybrid algorithm combining ray tracing with cube maps to exploit their respective strengths. Our solution supports real-time performance with hundreds of lights in fully dynamic high-detail scenes.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
![]() | Sintorn, Erik; Kämpe, Viktor; Olsson, Ola; Assarsson, Ulf Per-Triangle Shadow Volumes Using a View-Sample Cluster Hierarchy Inproceedings Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, pp. 111-118, ACM, 2014, ISBN: 978-1-4503-2717-6. @inproceedings{sintorn:PTSVC:2014, title = {Per-Triangle Shadow Volumes Using a View-Sample Cluster Hierarchy}, author = {Erik Sintorn and Viktor Kämpe and Ola Olsson and Ulf Assarsson}, url = {/wp-content/uploads/clustered_ptsv_2014.pdf}, doi = {10.1145/2556700.2556716}, isbn = {978-1-4503-2717-6}, year = {2014}, date = {2014-01-01}, booktitle = {Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games}, pages = { 111-118}, publisher = {ACM}, series = {I3D '14}, abstract = {Rendering pixel-accurate shadows in scenes lit by a point light-source in real time is still a challenging problem. For scenes of moderate complexity, algorithms based on Shadow Volumes are by far the most efficient in most cases, but traditionally, these algorithms struggle with views where the volumes generate a very high depth complexity. Recently, a method was suggested that alleviates this problem by testing each individual triangle shadow volume against a hierarchical depth map, allowing volumes that are in front of, or behind, the rendered view samples to be efficiently culled. In this paper, we show that this algorithm can be greatly improved by building a full 3D acceleration structure over the view samples and testing per-triangle shadow volumes against that. We show that our algorithm can elegantly maintain high frame-rates even for views with very high-frequency depth-buffers where previous algorithms perform poorly. Our algorithm also performs better than previous work in general, making it, to the best of our knowledge, the fastest pixel-accurate shadow algorithm to date. It can be used with any arbitrary polygon soup as input, with no restrictions on geometry or required pre-processing, and trivially supports transparent and textured shadow-casters.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2013 |
|
![]() | Persson, Emil; Olsson, Ola Practical Clustered Deferred and Forward Shading Miscellaneous SIGGRAPH Course: Advances in Real-Time Rendering in Games, 2013. @misc{persson:olsson:ng:2013, title = {Practical Clustered Deferred and Forward Shading}, author = {Emil Persson and Ola Olsson}, url = {http://s2013.siggraph.org/attendees/courses/session/advances-real-time-rendering-games-part-i /wp-content/uploads/practical_clustered_2013.pptx }, year = {2013}, date = {2013-08-01}, abstract = {Efficient and flexible lighting remains a challenge in modern game engines. Clustered Shading [Olsson et al. 2012] is a new lighting technique that offers compelling advantages over previous methods such as Tiled Deferred and Forward+. It scales better with complex scenes, while also offering more flexibility and fewer hassles. It is a unified lighting solution that works well with transparency, MSAA, and custom material and lighting models, without requiring extra passes or even necessarily a pre-z pass. This session introduces the latest academic research on this technique, then reviews the adapted version of the technique that is currently in production at Avalanche Studios and the key differences between the implementations and their implications.}, howpublished = {SIGGRAPH Course: Advances in Real-Time Rendering in Games}, keywords = {}, pubstate = {published}, tppubtype = {misc} } |
![]() | Billeter, Markus; Olsson, Ola; Assarsson, Ulf Tiled Forward Shading Incollection Engel, Wolfgang (Ed.): GPU Pro 4: Advanced Rendering Techniques, pp. 99–114, A K Peters/CRC Press, Boca Raton, FL, USA, 2013, ISBN: 9781466567436. @incollection{billeter:gpupro4:2013, title = {Tiled Forward Shading}, author = {Markus Billeter and Ola Olsson and Ulf Assarsson}, editor = {Wolfgang Engel}, url = {http://books.google.se/books?id=TUuhiPLNmbAC}, isbn = {9781466567436}, year = {2013}, date = {2013-01-01}, booktitle = {GPU Pro 4: Advanced Rendering Techniques}, pages = {99--114}, publisher = {A K Peters/CRC Press}, address = {Boca Raton, FL, USA}, abstract = {We will explore the tiled forward shading algorithm in this chapter. Tiled forward shading is an extension or modification of tiled deferred shading [Balestra and Engstad 08, Swoboda 09, Andersson 09, Lauritzen 10, Olsson and Assarsson 11], which itself improves upon traditional deferred shading methods [Hargreaves and Harris 04, Engel 09].}, keywords = {}, pubstate = {published}, tppubtype = {incollection} } |
2012 |
|
![]() | Olsson, Ola; Billeter, Markus; Assarsson, Ulf Clustered Deferred and Forward Shading Inproceedings HPG '12: Proceedings of the Conference on High Performance Graphics 2012, pp. 87-96, Paris, France, 2012, ISBN: 978-3-905674-41-5. @inproceedings{ClusteredShading2012, title = {Clustered Deferred and Forward Shading}, author = {Ola Olsson and Markus Billeter and Ulf Assarsson}, url = {/wp-content/uploads/clustered_shading_preprint.pdf /wp-content/uploads/clustered_forward_demo.zip https://www.youtube.com/watch?v=6DyTk7917ZI}, doi = {10.2312/EGGH/HPG12/087-096}, isbn = {978-3-905674-41-5}, year = {2012}, date = {2012-01-01}, booktitle = {HPG '12: Proceedings of the Conference on High Performance Graphics 2012}, pages = {87-96}, address = {Paris, France}, abstract = {This paper presents and investigates Clustered Shading for deferred and forward rendering. In Clustered Shading, view samples with similar properties (e.g. 3D-position and/or normal) are grouped into clusters. This is comparable to tiled shading, where view samples are grouped into tiles based on 2D-position only. We show that Clustered Shading creates a better mapping of light sources to view samples than tiled shading, resulting in a significant reduction of lighting computations during shading. Additionally, Clustered Shading enables using normal information to perform per-cluster back-face culling of lights, again reducing the number of lighting computations. We also show that Clustered Shading not only outperforms tiled shading in many scenes, but also exhibits better worst case behaviour under tricky conditions (e.g. when looking at high-frequency geometry with large discontinuities in depth). Additionally, Clustered Shading enables real-time scenes with two to three orders of magnitude more lights than previously feasible (up to around one million light sources).}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
![]() | Olsson, Ola; Billeter, Markus; Assarsson, Ulf Tiled and Clustered Forward Shading Inproceedings SIGGRAPH '12: ACM SIGGRAPH 2012 Talks, ACM, Los Angeles, California, 2012. @inproceedings{OlssonBilleterAssarsson2012, title = {Tiled and Clustered Forward Shading}, author = {Ola Olsson and Markus Billeter and Ulf Assarsson}, url = {/wp-content/uploads/tiled_shading_siggraph_2012.pdf /wp-content/uploads/tiled_clustered_forward_siggraph_2012.pptx}, doi = {10.1145/2343045.2343095}, year = {2012}, date = {2012-01-01}, booktitle = {SIGGRAPH '12: ACM SIGGRAPH 2012 Talks}, publisher = {ACM}, address = {Los Angeles, California}, abstract = {Tiled and Clustered Forward Shading are new techniques that enable support for thousands of lights, while eliminating many of the drawbacks of deferred techniques. This talk shows how these techniques can be used and extended to support transparency with high efficiency.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2011 |
|
![]() | Olsson, Ola; Assarsson, Ulf Improved Ray Hierarchy Alias Free Shadows Technical Report Chalmers University of Technology (2011:09), 2011. @techreport{ola2011, title = {Improved Ray Hierarchy Alias Free Shadows}, author = {Ola Olsson and Ulf Assarsson}, url = {/wp-content/uploads/2016/04/Improved-Ray-Hierarchy-Alias-Free-Shadows.pdf}, year = {2011}, date = {2011-05-01}, number = {2011:09}, institution = {Chalmers University of Technology}, abstract = {In this article, we introduce and evaluate several new algorithmic improvements to Ray Hierarchies. The improved algorithm is used to produce ray-traced shadows for omni-directional lights in fully dynamic environments. The new algorithmic improvements increase culling rate of the traversal algorithm, in some cases dramatically cutting the number of visited nodes. To evaluate performance, we present a GPU implementation using CUDA, which uses a hybrid breadth-depth algorithm to perform the traversal using bounded memory. The results show that the use of improved cone hierarchies is able to produce high quality shadows quickly, for some scenes in real-time.}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } |
![]() | Olsson, Ola; Assarsson, Ulf Tiled Shading Journal Article Journal of Graphics, GPU, and Game Tools, 15 (4), pp. 235-251, 2011. @article{OlssonAssarsson2011, title = {Tiled Shading}, author = {Ola Olsson and Ulf Assarsson}, url = {/wp-content/uploads/tiled_shading_preprint.pdf http://www.tandfonline.com/doi/abs/10.1080/2151237X.2011.621761}, doi = {10.1080/2151237X.2011.621761}, year = {2011}, date = {2011-01-01}, journal = {Journal of Graphics, GPU, and Game Tools}, volume = {15}, number = {4}, pages = {235-251}, abstract = {In this article we describe and investigate tiled shading. The tiled techniques, though simple, enable substantial improvements to both deferred and forward shading. Tiled Shading has been previously discussed only in terms of deferred shading (tiled deferred shading). We contribute a more detailed description of the technique, introduce tiled forward shading (a generalization of tiled deferred shading to also apply to forward shading), and a thorough performance evaluation. Tiled Forward Shading has many of the advantages of deferred shading, for example, scene management and light management are decoupled. At the same time, unlike traditional deferred and tiled deferred shading, full screen antialiasing and transparency are trivially supported. We also present a thorough comparison of the performance of tiled deferred, tiled forward, and traditional deferred shading. Our evaluation shows that tiled deferred shading has the least variable worst-case performance, and scales the best with faster GPUs. Tiled deferred shading is especially suitable when there are many light sources. Tiled forward shading is shown to be competitive for scenes with fewer lights, and is much simpler than traditional forward shading techniques. Tiled shading also enables simple transitioning between deferred and forward shading. We demonstrate how this can be used to handle transparent geometry, frequently a problem when using deferred shading. Demo source code is available online at the address provided at the end of this paper.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2009 |
|
![]() | Billeter, Markus; Olsson, Ola; Assarsson, Ulf Efficient stream compaction on wide SIMD many-core architectures Inproceedings Proceedings of the Conference on High Performance Graphics 2009, pp. 159–166, ACM, New Orleans, Louisiana, 2009, ISBN: 978-1-60558-603-8. @inproceedings{Billeter:2009:ESC:1572769.1572795, title = {Efficient stream compaction on wide SIMD many-core architectures}, author = {Markus Billeter and Ola Olsson and Ulf Assarsson}, url = {http://doi.acm.org/10.1145/1572769.1572795 /wp-content/uploads/Efficient-Stream-Compaction-on-Wide-SIMD-Many-Core-Architectures.pdf}, doi = {10.1145/1572769.1572795}, isbn = {978-1-60558-603-8}, year = {2009}, date = {2009-01-01}, booktitle = {Proceedings of the Conference on High Performance Graphics 2009}, pages = {159--166}, publisher = {ACM}, address = {New Orleans, Louisiana}, series = {HPG '09}, abstract = {Stream compaction is a common parallel primitive used to remove unwanted elements in sparse data. This allows highly parallel algorithms to maintain performance over several processing steps and reduces overall memory usage. For wide SIMD many-core architectures, we present a novel stream compaction algorithm and explore several variations thereof. Our algorithm is designed to maximize concurrent execution, with minimal use of synchronization. Bandwidth and auxiliary storage requirements are reduced significantly, which allows for substantially better performance. We have tested our algorithms using CUDA on a PC with an NVIDIA GeForce GTX280 GPU. On this hardware, our reference implementation provides a 3x speedup over previously published algorithms.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
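For readers unfamiliar with the stream compaction primitive mentioned in the 2009 entry, the following is a minimal sequential sketch in Python. It is not the algorithm from the paper, which targets wide SIMD GPUs; it only illustrates the general idea of removing unwanted elements by computing output positions with an exclusive prefix sum. The function names `exclusive_scan` and `compact` are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def exclusive_scan(flags: Sequence[int]) -> List[int]:
    """Exclusive prefix sum: offsets[i] is the sum of flags[0..i-1]."""
    offsets = []
    running = 0
    for flag in flags:
        offsets.append(running)
        running += flag
    return offsets


def compact(items: Sequence[T], keep: Callable[[T], bool]) -> List[T]:
    """Remove unwanted elements while preserving order (illustrative only)."""
    # Step 1: mark each element as valid (1) or unwanted (0).
    flags = [1 if keep(item) else 0 for item in items]
    # Step 2: the exclusive scan gives every valid element its output index.
    offsets = exclusive_scan(flags)
    # Step 3: scatter the valid elements to their output positions.
    total = (offsets[-1] + flags[-1]) if flags else 0
    output: List[T] = [None] * total  # type: ignore[list-item]
    for item, flag, offset in zip(items, flags, offsets):
        if flag:
            output[offset] = item
    return output


if __name__ == "__main__":
    # Drop the zero entries of a sparse list.
    print(compact([3, 0, 7, 0, 0, 5], lambda x: x != 0))
```

On a parallel machine the three steps (flag, scan, scatter) can each be run concurrently over the elements, which is why the scan-based formulation is a common starting point.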