onsdag 21 oktober 2015

GPU programming in Haskell using GPipe - Part 4


< Previous episode: Shaders and primitive streams

In this part I will show you how to work with textures and samplers in GPipe. But before that, we should take a closer look at the FragmentStream that was created from the rasterization process that was shown all the way back in part 1, since most use cases for sampling textures are with fragment streams.

FragmentStreams

To recap from part 1: The rasterize function is what creates a FragmentStream from a PrimitiveStream:
rasterize :: forall p a s os. FragmentInput a =>
  (s -> (Side, ViewPort, DepthRange)) -> PrimitiveStream p (VPos, a) -> Shader os s (FragmentStream (FragmentFormat a))

It takes a primitive stream with vertices consisting of a tuple (position, a), and creates a fragment stream of FragmentFormat a values. FragmentFormat is an associated type of the FragmentInput type class, which is yet another one of those classes that use an arrow to convert values from one domain to another. In this case, it basically only converts S V a to S F a. S F Float values are created from interpolating the primitive's corners' vertices, while S F Int or S F Word values are simply copied from the last vertex of the primitive. There is also a Flat data constructor that can be used to suppress interpolation of S V Float values. They will turn into S F Float values in the fragments by just copying the last vertex's value in each primitive.

Remember that the vertices' positions are expressed as homogeneous 3D coordinates, i.e. V4 (S V Float), due to perspective correction. In perspective correct interpolation, each S V Float value will first be divided by the position's 4th component (w), and then interpolated. The interpolated value will then be divided by the interpolated reciprocal of w. This means that if w is the same for all three vertices of a triangle, the interpolation will be a simple linear one. To circumvent perspective correction for a specific S V Float value, you can wrap it in a NoPerspective data constructor, then it will just behave as if w was constant.

Textures

A texture is just like a buffer a storage of data on the GPU, but textures are optimized for being indexed in more than one dimension, and can also be dynamically indexed using S-values in a PrimitiveStream or FragmentStream. When sampling a value from a texture, the GPU also provides native hardware support for interpolating between multiple elements.

Just like buffers, textures are objects that we create in the ContextT monad. There are six kinds of textures in GPipe:
  • Texture1D os a
  • Texture2D os a
  • Texture3D os a
  • Texture1DArray os a
  • Texture2DArray os a
  • TextureCube os a
Just as buffers, they are parameterized on the object space, meaning they can not escape the context. They also have an element type (a), but this is not as flexible as it is for buffers, which we'll see if we study the functions that creates new textures:
newTexture2D :: forall ctx os c m. (ContextHandler ctx, TextureFormat c, MonadIO m) =>
  Format c -> Size2 -> MaxLevels -> ContextT ctx os m (Texture2D os (Format c)) Source
(Throughout this tutorial I will mostly use 2D textures as an example, but analogous functions for the other texture types exists as well. E.g. there exists a newTexture3D as well, that takes a Size3 instead of Size2 as argument.)

The element type of the new texture is denoted by a value of the type Format c, which is a GADT in the Graphics.GPipe.Format module. If you for instance use the format RGB8 as parameter to newTexture2D, the texture will have type Texture2D os (Format RGBFloat). From that type we can read that it contains three components (RGB) that will be represented as Float values when we sample it. Internally, each component will be stored as an 8-bit fixed point value in the range [0.0, 1.0]. If we instead would have used RGB32F, the resulting texture would have had the same type, but internally would have used 32-bits floating point values for each component instead which would yield better precision but cost three times as much memory (RGB8 is most likely padded with an extra byte internally).

newTexture2D also takes a size in form of a V2 Int (aka Size2) and a max level count as an Int (aka MaxLevels). A texture contains extra levels of detail (LODs), which are lower resolution versions of the texture, where each next level of detail has half the size in every dimension of the previous level. If maximum number of LODs are used, the last level will consist of a single texel (another word for texture element). LODs are used if the texture is sampled at a low frequency. E.g. if the texture is mapped on an object that is far away it would cover fewer pixels on screen than the texture contains, and without LODs some texels of the texture would be skipped over, resulting in aliasing in the rendered image. Most of the times all levels are not needed, which is why newTexture2D and friends take a MaxLevels argument. Number of levels is still always limited by the size of the texture, which you will get if you use maxBound for the MaxLevels argument. If you don't want extra levels of detail, you use 1 as the MaxLevels argument. The documentation for DirectX has some nice pictures of how the LODs look like for all the different texture types (just look at the pictures and don't bother too much with the details of the text since it is a bit DirectX specific).

You can use texture2DLevels (and similarly for other texture types) to retrieve the number of levels a texture was created with. There is also a texture2DSizes that return a list of all the LODs sizes, starting with the largest size (i.e. the size that was provided to the newTexture2D function).

Lower LODs are usually a down scaled version of the previous LOD, but this is not enforced by GPipe. Instead you can write each LOD individually, using functions like writeTexture2D or writeTexture2DFromBuffer:
writeTexture2D :: forall b c h ctx os m. (ContextHandler ctx, MonadIO m, BufferFormat b, ColorSampleable c, BufferColor (Color c (ColorElement c)) h ~ b, h ~ HostFormat b) => Texture2D os (Format c) -> Level -> StartPos2 -> Size2 -> [h] -> ContextT ctx os m ()
writeTexture2DFromBuffer :: forall b c h ctx os m. (ContextHandler ctx, MonadIO m, BufferFormat b, ColorSampleable c, BufferColor (Color c (ColorElement c)) (HostFormat b) ~ b) => Texture2D os (Format c) -> Level -> StartPos2 -> Size2 -> Buffer os b -> BufferStartPos -> ContextT ctx os m ()

The Level argument in both of these selects which level to write to, using a zero based index (i.e. 0 selects the level of highest resolution). Next two arguments are a start position and size within this level to write to.

For writeTexture2D, you then provide a list of h values, where h is constrained by this little riddle: BufferFormat b, ColorSampleable c, BufferColor (Color c (ColorElement c)) h ~ b, h ~ HostFormat b. In short, h is the second parameter in any of the equations in the BufferColor type family where the first argument matches the texture's element type representation. E.g. for Format RGBFloat we will use any of the BufferColor equations that has (V3 Float) as first parameter, so in that case a [V3 Float] or a [V3 Word8] or any of the 7 others would work.

For writeTexture2DFromBuffer, you provide a Buffer instead of a list of values. This buffer needs to have an element type that corresponds to one of the results of the BufferColor equations where the first argument matches the texture's element type representation. In the Format RGBFloat example, a buffer of type Buffer os (B3 Float) or Buffer os (Normalized (B3 Word8)), among others, would work. For writeTexture2DFromBuffer, you also need to provide a zero based start index in the buffer to start reading from (to achieve the same effect for writeTexture2D you just use drop on the list you provide).

Most of the times you only write LOD level 0 using any of the writeTextureXXX functions, and then generate the other levels. For this you simply call the generateTexture2DMipmap action and friends.

A texture has two purposes in GPipe: To be sampled from and/or to be drawn to. The next part of this tutorial will be devoted to the latter, so let's look at sampling now.

Samplers

A texture can be sampled dynamically using a S-value from a PrimitiveStream or FragmentStream. But in order to do this, we need to create a sampler from that texture, that in addition to the texture also has a filtering and edge mode setting. You create a sampler inside the Shader monad by any of the newSamplerXXX functions. Here's how the 2D version looks like:
newSampler2D :: forall os s c. ColorSampleable c =>
  (s -> (Texture2D os (Format c), SamplerFilter c, (EdgeMode2, BorderColor c))) -> Shader os s (Sampler2D (Format c))

It gets all its arguments from the shader environment: a texture, a filter and a tuple of an edge mode and a border color. Let's look at the filter first.

The SamplerFilter c type is a GADT with two constructors:
data SamplerFilter c where
  SamplerFilter :: (ColorElement c ~ Float) => 
                     MagFilter -> MinFilter -> LodFilter -> Anisotropy -> SamplerFilter c
  SamplerNearest :: SamplerFilter c

If you have a texture with a Float component format (e.g. RGB8 or RGB32F) then you may use either of the two constructors, but if your texture has Int or Word components you may only use the SamplerNearest constructor. This will effectively use no filtering at all: at each sample, the lod that closest matches the sampling frequency will be used and the nearest texel of that LOD will be picked. Sampling frequency is calculated from how far apart adjacent fragments samples the texture. If the texture is sampled in a primitive stream instead there is no notion of sampling frequency, and as we will see soon the types of GPipe will in that case require us to either pass in an explicit sampling gradient or explicitly select a LOD.

When you use SamplerFilter on the other hand you have more options. For each sample you make, there will always be one or two closest LODs based on the sampling frequency, and for each LOD there will be two closest texels per dimension to choose from (e.g. for a 2D texture there will be 4 texels to choose from and for 3D there will be 8). MagFilter, MinFilter and LodFilter are all type synonyms for the enum Filter. The MagFilter is used when the texture is sampled at a too high frequency (i.e. the texture cover more pixels on screen than lowest LOD has texels for). Using Nearest in this case will simply pick the closest texel of LOD 0 while Linear will use linear interpolation of the closest texels. When sampling at lower frequency the MinFilter will be used, in conjunction with the LodFilter. If MinFilter is Nearest a single texel will be picked from each of the two closest LOD levels, while if it is Linear a value will be linearly interpolated for each of the levels. Then LodFilter will dictate whether one of the two levels' values will simply be picked or if the result should be interpolated from both of them.

The last argument to SamplerFilter is Anisotropy, which is a type synonym for Maybe Float. Anisotropic filtering is actually not in the core OpenGl 3.3 specification but is defined in an extension, in fact the only one GPipe uses besides core functionality. It is useful when the frequency of the sampling varies for fragments in the x- and y-axises (e.g. when sampling the texture on a surface that slopes wrt the view, like the walls when looking down a corridor). In this case a too high LOD (i.e. of too low resolution) will been selected for one of the axises with unnecessary blurring as a result. When using Maybe a for Anisotropy, up to a samples will instead be taken from the line of anisotropy and the average will be returned as final sample value. Using Nothing for Anisotropy is equivalent to Maybe 1.0. I don't know why a is a Float, but that's how OpenGl likes it.

The last thing we needed to create a sampler was a tuple (EdgeMode2, BorderColor c) where EdgeMode2 is simply a type synonym for V2 EdgeMode. EdgeMode is used to instruct the sampler what to do if a coordinate outside the texture region is provided, and we can give different instructions for each dimension of the texture, hence an EdgeMode2. For a 3D texture, you would have provided an EdgeMode3 instead. EdgeMode is an enum with four possible values: Repeat, Mirror, ClampToEdge or ClampToBorder. Repeat will make the texture tile in the given dimension, while Mirror makes every other tile be mirrored. ClampToEdge will clamp all sample coordinates to the range of the texture so it simply returns the first or last value from the texture in the given dimension. ClampToBorder will make any samples outside the texture in the given dimension return a predefined constant color instead. This color is defined by the second element of the tuple and if neither of the dimensions uses the edge mode ClampToBorder, the border color can be given as undefined.

Since colors like the border color is such a central concept of GPipe it deserves a section of its own:

Colors

A texture, as well as the window surface our context owns, can have a variable number of color channels. A texture can also have different types of channels: Float, Int or Word (for the window's surface the channel type is always Float). In plain old OpenGl e.g. accidentally using a float border color with an int texture would result in an error. In GPipe on the other hand, these invariants are enforced through types, and your program will always work once it compiles.

The BorderColor c type we just saw is a type synonym that expands to Color c (ColorElement c). Several other similar type synonyms exists in GPipe:
type BorderColor f = Color f (ColorElement f)
type ColorSample x f = Color f (S x (ColorElement f))
type FragColor f = Color f (S F (ColorElement f))
type ColorMask f = Color f Bool

They all are a combination of the associated types Color f a and ColorElement f from the ColorSampleable type class, where f is one of the types used as parameter to the Format f GADT. E.g. the format RGB8 had the type Format RGBFloat, so in this case f would be RGBFloat.

Here are all the instances of the ColorSampleable type class and their associated types:

f
Color f a    
ColorElement f
RGBAFloatV4 aFloat
RGBAIntV4 aInt
RGBAWordV4 aWord
RGBFloatV3 aFloat
RGBIntV3 aInt
RGBWordV3 aWord
RGFloatV2 aFloat
RGIntV2 aInt
RGWordV2 aWord
RFloataFloat
RIntaInt
RWordaWord
DepthaFloat
DepthStencil   aFloat

For an RGB8 texture where f is RGBFloat then BorderColor RGBFloat = V3 Float, ColorSample x RGBFloat = V3 (S x Float), FragColor RGBFloat = V3 (S F Float) and ColorMask RGBFloat = V3 Bool.

Now if you look back at the type of writeTexture2D, and in particular the constraint BufferFormat b, ColorSampleable c, BufferColor (Color c (ColorElement c)) h ~ b, h ~ HostFormat b, you'll understand why we looked for equations in BufferColor where the first parameter was V3 Float for RGB8 textures.

Outside the shader you could use texture2DSizes and friends to get the sizes of a texture's all lods. Inside the shader you can instead get the size of a sampler's lod with sampler2DSize (and similar for other texture types).

Sampling

Let's put the sampler we just created to use!

{-# LANGUAGE ScopedTypeVariables, PackageImports, TypeFamilies #-}
module Main where

import Graphics.GPipe
import qualified "GPipe-GLFW" Graphics.GPipe.Context.GLFW as GLFW
import "lens" Control.Lens
import Control.Monad (unless)
import Data.Word (Word32)
import Control.Applicative (pure)

main =
  runContextT GLFW.defaultHandleConfig $ do
    win <- newWindow (WindowFormatColor RGB8) (GLFW.defaultWindowConfig "Checkers")
    vertexBuffer :: Buffer os (B2 Float) <- newBuffer 4
    writeBuffer vertexBuffer 0 [V2 0 0, V2 1 0, V2 0 1, V2 1 1]
    tex <- newTexture2D R8 (V2 8 8) 1
    let whiteBlack = cycle [minBound,maxBound] :: [Word32]
        blackWhite = tail whiteBlack
    writeTexture2D tex 0 0 (V2 8 8) (cycle (take 8 whiteBlack ++ take 8 blackWhite))
    shader <- compileShader $ do
      primitiveStream <- toPrimitiveStream id
      let primitiveStream2 = fmap (\pos2d -> (make3d pos2d, pos2d)) primitiveStream
      fragmentStream <- rasterize (const (FrontAndBack, ViewPort (V2 0 0) (V2 500 500), DepthRange 0 1)) primitiveStream2
      let filter = SamplerFilter Nearest Nearest Nearest Nothing
          edge = (pure Repeat, undefined)
      samp <- newSampler2D (const (tex, filter, edge))
      let sampleTexture = pure . sample2D samp SampleAuto Nothing Nothing
          fragmentStream2 = fmap sampleTexture fragmentStream
      drawWindowColor (const (win, ContextColorOption NoBlending (pure True))) fragmentStream2

    renderLoop win $ do
      clearWindowColor win 0.5
      vertexArray <- newVertexArray vertexBuffer
      shader (toPrimitiveArray TriangleStrip vertexArray)

make3d (V2 x y) = projMat !*! viewMat !* V4 x y 0 1
  where
    viewMat = lookAt' (V3 1 2 2) (V3 0.5 0.5 0) (V3 0 1 0)
    projMat = perspective (pi/3) 1 1 100

renderLoop win rendering = do
  render rendering
  swapWindowBuffers win
  closeRequested <- GLFW.windowShouldClose win
  unless (closeRequested == Just True) $
    renderLoop win rendering

-- Copy of lookAt from linear with normalize replaced with signorm
lookAt' eye center up =
  V4 (V4 (xa^._x) (xa^._y) (xa^._z) xd)
     (V4 (ya^._x) (ya^._y) (ya^._z) yd)
     (V4 (-za^._x) (-za^._y) (-za^._z) zd)
     (V4 0     0     0     1)
  where za = signorm $ center - eye
        xa = signorm $ cross za up
        ya = cross xa za
        xd = -dot xa eye
        yd = -dot ya eye
        zd = dot za eye

This simple example will create a small 8x8 texture and give it a checkers pattern. We create a square from 4 vertices, and use a rotation and a perspective projection matrix to transform it into our view, with this result:



I have highlighted the expression that samples the texture in the example above: sample2D samp SampleAuto Nothing Nothing. It evaluates to a function of type V2 (S F Float) -> ColorSample F RFloat, which simplifies to V2 (S F Float) -> S F Float. The input, commonly referred to as an uv-coordinate, is a 2D coordinate (since it's a 2D texture) where (0.0, 0.0) maps to the top left corner of the top-left most texel in the texture, and (1.0, 1.0) maps to the bottom right corner of the bottom-right most texel.

In addition to the parameters we gave when creating the sampler, you also give a bunch of parameters at the time of sampling to the sample2D function (and its 5 siblings for the other texture types). Specifically, you can specify a SampleLod, a SampleProj and a SampleOffset value.

SampleLod is used to control which lod should be used and is defined like this:
data SampleLod vx x where
  SampleAuto :: SampleLod vx F
  SampleBias :: FFloat -> SampleLod vx F
  SampleLod :: S x Float -> SampleLod vx x
  SampleGrad :: vx -> vx -> SampleLod vx x

SampleAuto and SampleBias are only available if you are sampling inside a fragment stream. In a vertex stream you have to resort to one of the last two. SampleAuto will simply select a lod based on the sampling distance between this and adjacent fragments. This requires you to not do the sampling inside a branch (such as ifB), or the sampling distance to adjacent fragments may become undefined and wrong lod may be used. SampleBias lets you add a bias to select a higher lod (i.e. with lower resolution) than automatically selected. This enables you to create cheap blurring effects (the higher bias, the more blur). SampleLod simply picks a lod without any automatic calculations, and with SampleGrad, you provide the gradients for the lod calculations yourself. If you really need to sample inside a branch, you can actually get the gradients for the coordinate using the functions dFdx and dFdy. These two functions also have the limitation that they need to be performed outside any branching or they'll also get undefined return values.

The SampleProj x type is a synonym for Maybe (S x Float). If you give a Just value as this parameter, then the input coordinate's components will be divided by this value. This might seem like an odd addition to the API, but the hardware is already doing divisions behind the curtains for perspective correction (which is done on hardware), so you get this extra division "for free" since it can just multiply it with the existing denominator. You can use SampleProj to zoom the texture for instance. Using Nothing for SampleProj is equivalent to Just 1.

The last parameter to sample2D besides the actual sampling coordinate is SampleOffset2 x, which is just a type synonym for Maybe (V2 Int). This lets you add a constant (note the type: Int and not S x Int) offset in texels to any sample made. The offset has to stay within certain hardware dependent limits, in practice within [-8,7].



A texture that has a Format Depth or Format DepthStencil format, sampling will return the depth value as a float in the range [0,1]. A common use case for depth textures are shadow maps, in which you rather want to compare the depth value with a reference. For this, you can create a specialized shadow sampler with newSampler2DShadow and friends. These actions also require a ComparisonFunction that will be used when sampling. Sampling with shadow samplers is not done with the usual sample2D function, instead you use sample2DShadow, and similar for other texture types. These shadow sample functions also take a reference value as input, and will compare the texels of the texture with this reference value, using the ComparisonFunction that the sampler was created with to result in a value of 0 or 1 for each texel. These will then be sampled with the samplers filter to produce a final sample float value in the range [0,1] with lower values being "more false" and higher ones "more true".



Before we end this chapter, you should play around with the different parameters I've presented. Try replacing (make3d pos2d, pos2d) with (make3d pos2d, fmap NoPerspective pos2d) to see why we need perspective correct interpolation of the uv-coordinates. How would a Linear magnifying filter look like? In vertexBuffer, you can change each element's component of value 1 with value 2 instead and then try other edge modes than pure Repeat. What if you specify different edge modes for x and y? Remember to replace undefined with an actual color if you try ClampToBorder




In next part, which will be the last for this tutorial, we will cover drawing. Stay tuned!

Inga kommentarer:

Skicka en kommentar