OpenGL & Vulkan Complete Guide
From "what is a vertex?" to a fully animated, hardware-accelerated Vulkan application — taught from first principles, with zero magic and every line explained.
How GPUs Work
Before writing a single line of OpenGL or Vulkan, you need to understand what a GPU actually is and why it exists. Every API decision makes sense once you see the hardware it controls.
CPU vs GPU — Two Different Minds
A CPU is a serial genius. It has 8–32 very powerful cores, each capable of complex logic, branching, and memory access. It runs your game loop, physics, AI — anything that needs smart sequential decisions.
A GPU is a parallel army. It has thousands of small, simple cores: an NVIDIA RTX 4090 has 16,384 CUDA cores. Each one is far less capable than a CPU core on its own, but they all run simultaneously. A CPU core works through vertices a few at a time; a GPU spreads a million-vertex mesh across all 16,384 cores and transforms the vertices in parallel, which is why it finishes that kind of work far sooner.
The Rendering Pipeline — How a Triangle Appears on Screen
Every 3D scene you see is produced by a fixed sequence of transformations called the rendering pipeline. Understanding this pipeline is the foundation of everything in OpenGL and Vulkan.
In this guide you write only two stages: the vertex shader and the fragment shader. Everything else is handled by fixed-function GPU hardware. Both OpenGL and Vulkan give you control over these two stages via GLSL programs called shaders. The difference is how much control you have over everything else around them.
Normalised Device Coordinates (NDC)
The GPU does not think in pixels. It thinks in a standardised coordinate system where the center of the screen is (0, 0), the right edge is X=+1, the left is X=−1, and the top is Y=+1 (OpenGL) or Y=−1 (Vulkan). Depth runs from −1 to +1 in OpenGL by default and from 0 to 1 in Vulkan. Your vertex shader's job is to transform your 3D world coordinates into clip space; the GPU then divides by w to produce NDC.
OpenGL NDC: Y=+1 is the top of the screen. Vulkan NDC: Y=+1 is the bottom. If you port code without fixing this, every scene appears upside down. We will handle this in the demos.
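To make the two Y conventions concrete, here is a minimal sketch that maps a pixel position to NDC under each API. The helper names (`pixelToNdcX`, `pixelToNdcYOpenGL`, `pixelToNdcYVulkan`) are hypothetical and belong to neither API, and the sketch assumes pixel (0, 0) is the top-left corner of the window.

```cpp
#include <cassert>

// Hypothetical helpers for illustration; neither OpenGL nor Vulkan provides them.
// Assumption: pixel (0, 0) is the top-left corner of a width x height window.

// Map a pixel column to NDC X in [-1, +1]. Identical in both APIs.
float pixelToNdcX(float px, float width) {
    assert(width > 0.0f);
    return (px / width) * 2.0f - 1.0f;
}

// OpenGL: Y = +1 is the top of the screen, so the top pixel row maps to +1.
float pixelToNdcYOpenGL(float py, float height) {
    assert(height > 0.0f);
    return 1.0f - (py / height) * 2.0f;
}

// Vulkan: Y = +1 is the bottom of the screen, so the sign is flipped.
float pixelToNdcYVulkan(float py, float height) {
    return -pixelToNdcYOpenGL(py, height);
}
```

The only difference between the two Y helpers is a sign, which is why negating Y in the vertex shader is enough to port a simple scene.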
OpenGL — The Smart Driver Model
OpenGL (1992) was designed when GPUs were simple. Its driver model made sense then. Understanding what OpenGL does automatically is the key to understanding why Vulkan does things differently.
OpenGL's Core Idea — State Machine
OpenGL is a global state machine. There is one active shader, one bound buffer, one active texture. You change state by calling functions. All subsequent draw calls use whatever is currently active. This makes simple programs simple — and complex programs unpredictable.
    // The essential OpenGL programme structure: every OpenGL app follows this skeleton
    #include <GL/glew.h>
    #include <GLFW/glfw3.h>
    #include <iostream>

    const char* vertSrc = R"(
    #version 330 core
    layout(location=0) in vec2 pos;   // receive XY from VBO
    void main() { gl_Position = vec4(pos, 0.0, 1.0); }
    )";

    const char* fragSrc = R"(
    #version 330 core
    out vec4 colour;
    void main() { colour = vec4(1.0, 0.5, 0.2, 1.0); }   // orange
    )";

    int main() {
        glfwInit();
        glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
        glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
        glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
        // glfwCreateWindow creates the window AND its OpenGL context
        GLFWwindow* win = glfwCreateWindow(800, 600, "OpenGL Triangle", nullptr, nullptr);
        glfwMakeContextCurrent(win);   // make that context current on this thread
        glewInit();

        // 3 vertices: {x,y} in NDC (bottom-left, bottom-right, top-center)
        float verts[] = { -0.5f,-0.5f,  0.5f,-0.5f,  0.0f,0.5f };

        // Upload vertices to GPU (VBO), describe layout (VAO)
        GLuint VAO, VBO;
        glGenVertexArrays(1, &VAO);
        glBindVertexArray(VAO);
        glGenBuffers(1, &VBO);
        glBindBuffer(GL_ARRAY_BUFFER, VBO);
        glBufferData(GL_ARRAY_BUFFER, sizeof(verts), verts, GL_STATIC_DRAW);
        glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 2*sizeof(float), (void*)0);
        glEnableVertexAttribArray(0);

        // Compile + link shaders
        GLuint vs = glCreateShader(GL_VERTEX_SHADER);
        glShaderSource(vs, 1, &vertSrc, nullptr);
        glCompileShader(vs);
        GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
        glShaderSource(fs, 1, &fragSrc, nullptr);
        glCompileShader(fs);
        GLuint prog = glCreateProgram();
        glAttachShader(prog, vs);
        glAttachShader(prog, fs);
        glLinkProgram(prog);
        glDeleteShader(vs);
        glDeleteShader(fs);

        // Render loop: runs once per frame (~60× per second)
        while (!glfwWindowShouldClose(win)) {
            glfwPollEvents();
            glClearColor(0.1f, 0.1f, 0.15f, 1.0f);
            glClear(GL_COLOR_BUFFER_BIT);
            glUseProgram(prog);                  // set active shader
            glBindVertexArray(VAO);              // set active VBO layout
            glDrawArrays(GL_TRIANGLES, 0, 3);    // DRAW: issued immediately
            glfwSwapBuffers(win);                // show the rendered frame
        }
        glfwTerminate();
    }
What OpenGL Does Automatically (That You Never See)
That short programme above triggers a long chain of hidden driver work on every glDrawArrays call. Memory placement is one example: the driver decided where your vertex buffer lives based only on the GL_STATIC_DRAW hint. It may have guessed wrong; maybe it put the data in CPU RAM when your access pattern needed GPU VRAM.
OpenGL is like a personal assistant who watches you work and tries to anticipate every request. They book your meetings before you ask (sometimes the wrong time), order your meals (sometimes the wrong food), and file your documents (sometimes in the wrong folder). Thoughtful, but the constant second-guessing overhead costs more than just doing things yourself. Vulkan fires the PA and lets you decide everything. More work, no surprises.
Why Vulkan Exists
OpenGL was designed in 1992. The GPUs of 2016 — and today — are nothing like the hardware of 1992. Vulkan is the API designed for the hardware that actually exists.
The Problem OpenGL Cannot Solve
By 2014, GPU hardware had outgrown OpenGL's design. Three problems were unsolvable within OpenGL's architecture:
1. Unpredictable compilation. OpenGL compiles GLSL to GPU machine code at runtime, inside the driver, at an unpredictable moment (often your first draw call with that shader). This causes mid-game stutters that game engines have been fighting ever since GLSL arrived in OpenGL 2.0. Program binaries can reduce the problem, but there is no complete fix within OpenGL.
2. No multi-threading. An OpenGL context can be current on only one thread at a time. In 2024, high-end systems have 24+ CPU cores. Spreading draw submission across them is impractical in OpenGL by design.
3. Hidden overhead. An OpenGL draw call can cost on the order of 10,000 nanoseconds in driver overhead, even for a trivial draw. Vulkan's equivalent is on the order of 200 nanoseconds. These are rough, illustrative figures that vary with driver and workload; the gap of roughly 50× comes from the hidden work the driver does.
The Core Bargain
Vulkan can give you 3–5× less CPU overhead across a whole frame, predictable frame times, and the ability to use all your CPU cores for rendering. In return, a triangle that takes a few dozen lines of OpenGL takes ~300 lines of Vulkan. Every one of those extra lines removes a decision the driver was making and hands it to you. This handbook explains every line.
| Topic | OpenGL | Vulkan |
|---|---|---|
| Shader compilation | Runtime, by driver, stutters | Build time via glslc → .spv, zero stutter |
| Render state | Global mutable (glEnable, glBlend…) | Immutable VkPipeline baked at startup |
| Memory | Driver chooses heap automatically | You choose every memory type |
| Draw commands | Immediate — one call, one GPU op | Record batch → submit batch |
| CPU threading | Single thread only | Record on any thread simultaneously |
| Synchronisation | Driver handles invisibly | You declare all GPU-CPU-GPU dependencies |
| Error checking | Built in, always on | Optional validation layer — zero cost in release |
| Draw call overhead | ~10,000 ns/call | ~200 ns/call (50× less) |
The Three Mental Models
Lock these three models in before touching any Vulkan code. With them, every API call is obvious. Without them, everything looks like arbitrary ceremony.
Model 1 — Vulkan Is a Kitchen, Not a Restaurant
In OpenGL you walked into a restaurant, said "draw a lit textured cube", and the kitchen (driver) handled everything. You never saw the recipe, the ingredients, or the cooking process. In Vulkan, you are the head chef. You write every recipe (pipeline), source every ingredient (memory allocation), coordinate every station (command recording), and decide when to serve (submit + present).
The practical consequence: Vulkan has no defaults. Clear colour, depth function, blend equation, shader entry point, memory type, queue family — all must be specified. This is not verbosity. This is the API saying: "I will not make assumptions. Tell me exactly what you want."
OpenGL: you order "pasta carbonara" and the waiter brings it. You never touch the kitchen. Vulkan: you have the kitchen keys, every ingredient, every utensil. You make exactly the pasta you designed. More effort, zero surprises. For a military-grade tactical display that cannot stutter at a critical moment — the kitchen is the right choice.
Model 2 — Record, Then Execute
This is the single biggest behavioural difference from OpenGL. In OpenGL, every gl*() call appeared to execute more or less immediately. The driver decided when work actually reached the GPU, and your CPU and GPU were tightly coupled: draw, draw, draw, swap.
In Vulkan, your CPU records a list of commands into a VkCommandBuffer. Later, you submit the entire list to the GPU as one batch. The GPU executes it asynchronously while your CPU has already moved on to the next frame's recording.
Why does this matter? Command buffers can be recorded on multiple CPU threads in parallel. Eight threads can each record a portion of the scene, then merge and submit all at once. This is physically impossible in OpenGL's single-threaded model.
The key rule: every Vulkan function starting with vkCmd records a command. No GPU work happens. The GPU only works after vkQueueSubmit().
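The record-then-submit idea can be modelled on the CPU in a few lines. The sketch below is a toy: `ToyCommandBuffer` and `ToyQueue` are hypothetical stand-ins, not Vulkan types, and the "GPU work" is just appending a string to a log. It only illustrates that recording stores commands and submitting runs them.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and do not touch the GPU.
struct ToyCommandBuffer {
    std::vector<std::function<void(std::vector<std::string>&)>> commands;

    // Like a vkCmd* call: stores the command, performs no work yet.
    void cmdDraw(int vertexCount) {
        assert(vertexCount > 0);
        commands.push_back([vertexCount](std::vector<std::string>& log) {
            log.push_back("draw " + std::to_string(vertexCount));
        });
    }
};

struct ToyQueue {
    std::vector<std::string> executed;   // what the "GPU" has actually done

    // Like vkQueueSubmit: only now do the recorded commands run, in order.
    void submit(const ToyCommandBuffer& cb) {
        for (const auto& cmd : cb.commands) cmd(executed);
    }
};
```

After two `cmdDraw` calls the queue's `executed` log is still empty; it fills only when `submit` is called, which mirrors the rule that vkCmd* functions record and vkQueueSubmit executes.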
Model 3 — All State Is Frozen at Pipeline Creation
In OpenGL, render state was global and mutable. You called glEnable(GL_DEPTH_TEST) and from that moment all draws used depth testing — until glDisable(). The driver tracked every state change and silently recompiled the pipeline when anything changed.
In Vulkan, everything (shaders, vertex layout, topology, viewport, rasterisation, depth test, blending) is locked into one immutable VkPipeline object when you create it, unless you explicitly mark a piece of state as dynamic. If you need different blending, you create a second pipeline. Switching state at draw time means binding a different pipeline, which is cheap because all validation and compilation already happened at creation.
OpenGL render state is a frying pan — add oil, change heat, swap ingredients on the fly. Always reactive. A Vulkan pipeline is a pre-packaged meal kit — every ingredient and instruction sealed together at manufacture (pipeline creation). To cook something different, you open a different box. Once the box is open (pipeline bound), execution is maximally efficient because every decision is already made.
Create all pipelines at application startup. A typical Vulkan application creates 5–20 pipeline objects during initialisation and binds them at draw time. Pipeline creation is expensive (often milliseconds, because it can include shader compilation). Binding is cheap by comparison.
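One way to picture "create once, bind many times" is a cache keyed by the frozen state. This is a toy sketch under stated assumptions: `ToyPipelineState` and `ToyPipelineCache` are hypothetical, a real VkPipeline captures far more state than two booleans, and the integer handle stands in for the expensive creation step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical toy state; a real pipeline also bakes in shaders, layouts, etc.
struct ToyPipelineState {
    bool depthTest;
    bool alphaBlend;

    bool operator<(const ToyPipelineState& o) const {
        return std::tie(depthTest, alphaBlend) < std::tie(o.depthTest, o.alphaBlend);
    }
};

class ToyPipelineCache {
public:
    // Returns the handle for this exact state, "creating" it only on first request.
    std::uint32_t getOrCreate(const ToyPipelineState& state) {
        auto it = pipelines_.find(state);
        if (it != pipelines_.end()) return it->second;   // already built: cheap lookup
        const std::uint32_t handle = nextHandle_++;      // stands in for the costly build
        const bool inserted = pipelines_.emplace(state, handle).second;
        assert(inserted);
        (void)inserted;
        return handle;
    }

    std::size_t createdCount() const { return pipelines_.size(); }

private:
    std::map<ToyPipelineState, std::uint32_t> pipelines_;
    std::uint32_t nextHandle_ = 1;
};
```

Requesting the same state twice returns the same handle, so the draw loop never pays the creation cost again.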
The Vulkan Object Hierarchy
Vulkan objects have a strict parent-child ownership tree. Creating an object requires its parent. Destroying an object requires its children to be destroyed first. This is not optional — the validation layer enforces it.
Every Object You Will Create
Here is the complete hierarchy for a basic rendering application, in creation order:
Always destroy in reverse order of creation. VkFence before VkDevice. VkPipeline before VkDevice. VkSwapchain before VkDevice. VkDevice before VkInstance. Every violation is caught by the validation layer with a clear error message. In this handbook, every cleanup section shows the correct order.
Why So Many Objects?
Each object is a specific hardware resource decision. The separation allows: creation on different CPU threads, independent lifetime management, sharing memory across objects, and the validation layer to catch every error. In OpenGL, all of this was hidden inside one opaque "context" with no visibility into what was allocated or where.
The Vulkan Struct Pattern
Every single Vulkan object creation follows the same pattern. Learn it once and the entire API feels consistent.
The Universal Creation Pattern
    // ── STEP 1: Declare the CreateInfo struct with {} zero-initialisation
    VkSomethingCreateInfo info{};
    // {} sets every field to zero. CRITICAL: missing this = garbage memory = undefined behaviour.

    // ── STEP 2: Set sType. NEVER skip this
    info.sType = VK_STRUCTURE_TYPE_SOMETHING_CREATE_INFO;
    // sType tells Vulkan which struct type you're passing.
    // The validation layer uses this to catch struct mismatches.
    // The pNext extension chain uses this to walk linked structs.
    // Rule: the sType value always matches the struct name.

    // ── STEP 3: Set pNext (almost always nullptr)
    info.pNext = nullptr;
    // pNext is for extension chains. Leave nullptr until you specifically need it.

    // ── STEP 4: Fill in all your fields
    info.someField = someValue;
    info.count = 1;

    // ── STEP 5: Call the creation function
    VkSomething handle;
    VkResult r = vkCreateSomething(parentHandle, &info, nullptr, &handle);
    // Parameters: parent object, &createInfo, custom allocator (nullptr=default), &output handle

    // ── STEP 6: ALWAYS check the result
    if (r != VK_SUCCESS) throw std::runtime_error("Creation failed!");
    // Unlike OpenGL's glGetError() which most code never checks,
    // Vulkan returns success/failure from every function directly.
Why {} Zero-Initialisation Matters
Vulkan structs have many fields; some have 20+. A zero-value field usually means "disabled" or "no count". If you skip {}, C++ leaves the struct with whatever garbage bytes happen to be in that memory location. This causes crashes or invisible wrong behaviour, often with no error message. For example, the loader may see enabledExtensionCount = 3847213 and try to read almost four million extension name pointers.
Every Vulkan struct declaration ends with {}. No exceptions. Make this muscle memory.
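The effect of `{}` can be checked without Vulkan at all. The struct below is a hypothetical stand-in shaped like a CreateInfo struct (it is not a real Vulkan type); the sketch only demonstrates that value-initialisation zeroes every member.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a Vulkan CreateInfo struct (not a real Vulkan type).
struct ToyCreateInfo {
    std::uint32_t      sType;
    const void*        pNext;
    std::uint32_t      enabledExtensionCount;
    const char* const* ppEnabledExtensionNames;
};

// Value-initialisation with {} sets every member to zero / nullptr.
ToyCreateInfo makeZeroedInfo() {
    ToyCreateInfo info{};
    assert(info.enabledExtensionCount == 0 && info.pNext == nullptr);
    return info;
}
```

Writing `ToyCreateInfo info;` without the braces would leave all four members indeterminate, which is exactly the failure described above.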
VkResult — The Error Signal You Must Always Check
    VK_SUCCESS                        // = 0. Command completed successfully.
    VK_ERROR_OUT_OF_DEVICE_MEMORY     // GPU VRAM exhausted
    VK_ERROR_DEVICE_LOST              // GPU crashed: recreate everything
    VK_ERROR_EXTENSION_NOT_PRESENT    // Requested extension not available
    VK_ERROR_LAYER_NOT_PRESENT        // Validation layer not installed
    VK_SUBOPTIMAL_KHR                 // Not an error (positive value): swapchain works but is not ideal (window resized)
    VK_ERROR_OUT_OF_DATE_KHR          // Swapchain is stale: must recreate (window resized)

    // Handy macro for your code. Add to every demo (needs <iostream> and <stdexcept>):
    #define VK_CHECK(call)                                                    \
        do {                                                                  \
            VkResult _r = (call);                                             \
            if (_r != VK_SUCCESS) {                                           \
                std::cerr << "VULKAN ERROR " << _r << " in "                  \
                          << __FILE__ << ":" << __LINE__ << "\n";             \
                throw std::runtime_error("vk call failed");                   \
            }                                                                 \
        } while (0)

    // Usage:
    VK_CHECK(vkCreateInstance(&ci, nullptr, &instance));
    // Swapchain calls can return VK_SUBOPTIMAL_KHR, which is not a failure:
    // check those results by hand instead of wrapping them in VK_CHECK.
Vulkan Memory Model
Vulkan gives you direct control over every GPU memory allocation. Understanding where data lives — and why — unlocks the performance advantage Vulkan offers over OpenGL.
Two Physical Memory Pools
On a system with a discrete GPU there are two separate memory pools connected by the PCIe bus: CPU RAM (host memory) and GPU VRAM (device memory).
Memory Property Flags — Your Controls
| Flag | Meaning | Typical use |
|---|---|---|
| DEVICE_LOCAL | In GPU memory (VRAM on a discrete card). Fastest GPU reads. Usually not directly accessible to the CPU. | Static meshes, textures, render targets |
| HOST_VISIBLE | CPU can map and write via vkMapMemory | Staging buffers, per-frame uniforms |
| HOST_COHERENT | CPU writes become visible to the GPU without a manual flush. | Always combine with HOST_VISIBLE for simplicity |
| HOST_CACHED | CPU reads are cached (fast readback). Needs manual flush/invalidate unless also HOST_COHERENT. | GPU→CPU readbacks: picking, simulation results |
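Choosing a memory type comes down to a bit test: the resource reports which type indices it accepts, and you pick the first accepted type that has every flag you want. The sketch below runs on plain integers so it needs no Vulkan headers. The constant names and `findMemoryType` are hypothetical; the flag values are chosen to mirror the layout of the VK_MEMORY_PROPERTY_*_BIT flags.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy flag values for illustration (hypothetical names, no vulkan.h required).
constexpr std::uint32_t kDeviceLocal  = 0x1;
constexpr std::uint32_t kHostVisible  = 0x2;
constexpr std::uint32_t kHostCoherent = 0x4;

// Returns the index of the first memory type that is allowed by typeBits
// (bit i set means type i is usable for this resource) and has every wanted flag.
// Returns -1 when no type qualifies, so the caller can fall back or fail.
int findMemoryType(const std::vector<std::uint32_t>& typeFlags,
                   std::uint32_t typeBits,
                   std::uint32_t wanted) {
    assert(typeFlags.size() <= 32);   // typeBits is a 32-bit mask
    for (std::uint32_t i = 0; i < typeFlags.size(); ++i) {
        const bool allowed = (typeBits & (1u << i)) != 0;
        const bool hasAll  = (typeFlags[i] & wanted) == wanted;
        if (allowed && hasAll) return static_cast<int>(i);
    }
    return -1;
}
```

In real code the per-type flags would come from vkGetPhysicalDeviceMemoryProperties and the mask from the memoryTypeBits field filled in by vkGetBufferMemoryRequirements.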
The Staging Buffer Pattern
For static mesh data that never changes, you want it in DEVICE_LOCAL memory (fastest GPU reads). But the CPU cannot write there directly. Solution: use a temporary HOST_VISIBLE staging buffer as the middleman.
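1. Create the staging buffer: vkCreateBuffer(VK_BUFFER_USAGE_TRANSFER_SRC_BIT). Allocate with HOST_VISIBLE | HOST_COHERENT.
2. Map the staging memory with vkMapMemory, copy your vertex data in, and unmap.
3. Create the device-local buffer: vkCreateBuffer(VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT). Allocate with DEVICE_LOCAL.
4. Record a vkCmdCopyBuffer from the staging buffer to the device-local buffer, submit it, and wait for it to finish.
5. Destroy the staging buffer: vkDestroyBuffer + vkFreeMemory. The device-local buffer is now the permanent home of your data.
In Demos 1–3 we skip staging and use HOST_VISIBLE directly for simplicity. For production work, especially the Naval display, use staging for all static geometry so it lives in DEVICE_LOCAL memory at maximum GPU bandwidth.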
VkBuffer and VkDeviceMemory Are Two Separate Objects
This surprises everyone coming from OpenGL. In OpenGL, glGenBuffers + glBufferData created AND filled a buffer with no separate memory step. In Vulkan, VkBuffer is just a handle that describes the buffer; it says "I am a 60-byte vertex buffer". VkDeviceMemory is the actual bytes. You create them separately and then bind them together with vkBindBufferMemory. This separation allows sharing one large memory allocation across many small buffers, a critical pattern for reducing allocation overhead.
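Sharing one allocation across several buffers is mostly offset arithmetic: each buffer must start at a multiple of its required alignment. The sketch below plans such a layout on plain integers. `ToyRequirement`, `alignUp` and `planSubAllocations` are hypothetical names, and the alignment is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Round offset up to the next multiple of alignment (assumed to be a power of two).
std::uint64_t alignUp(std::uint64_t offset, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical description of one buffer's needs, in bytes.
struct ToyRequirement {
    std::uint64_t size;
    std::uint64_t alignment;
};

// Lay several buffers out back to back inside one allocation.
// Fills offsets with one start offset per buffer and returns the total bytes needed.
std::uint64_t planSubAllocations(const std::vector<ToyRequirement>& reqs,
                                 std::vector<std::uint64_t>& offsets) {
    offsets.clear();
    std::uint64_t cursor = 0;
    for (const ToyRequirement& r : reqs) {
        cursor = alignUp(cursor, r.alignment);   // skip padding up to the next valid start
        offsets.push_back(cursor);
        cursor += r.size;
    }
    return cursor;
}
```

Each planned offset would then be passed as the memoryOffset argument of vkBindBufferMemory, with a single vkAllocateMemory call sized to the returned total.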
Queues & Command Buffers
Queues are the submission channels between your CPU and the GPU hardware. Command buffers are the scripts you record and hand to the queue. Understanding both is essential before writing a single draw call.
What a Queue Is
A GPU does not execute commands the moment you call a function. It has hardware work queues — a FIFO list of command batches. When you call vkQueueSubmit(), you push a batch onto the queue. The GPU processes it asynchronously while your CPU continues.
The GPU queue is like an airport check-in desk. Your CPU is the passenger filing in. You hand in your luggage (command batch). The ground crew (GPU hardware) handles it from there, and you walk to the gate. You don't stand at check-in waiting — you proceed asynchronously. The baggage claim buzzer (fence/semaphore) tells you when it is ready.
Queue Families — Not All Queues Do Everything
Every GPU exposes its queues grouped by capability, called queue families:
| Queue Family | What it can do | When you need it |
|---|---|---|
| Graphics | Draw commands, compute, transfers | All rendering. Your primary queue. |
| Compute | Compute shaders (and transfers), no draw commands | Physics, signal processing, sonar; no raster output needed |
| Transfer | Memory copies only | Background asset streaming without blocking graphics |
| Present | Display images on screen. Support is queried per family and per surface. | Always needed. Usually the same family as Graphics. |
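Selecting families from that table is a scan over capability bits. The sketch below works on a toy list so it compiles without Vulkan headers. `ToyQueueFamily` and both picker functions are hypothetical; the bit values mirror the layout of VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT and VK_QUEUE_TRANSFER_BIT.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy capability bits (hypothetical names, no vulkan.h required).
constexpr std::uint32_t kGraphics = 0x1;
constexpr std::uint32_t kCompute  = 0x2;
constexpr std::uint32_t kTransfer = 0x4;

struct ToyQueueFamily {
    std::uint32_t flags;   // capability bits of this family
    bool canPresent;       // result of the per-surface present query
};

// First family that can both draw and present; -1 if there is none.
int pickGraphicsPresentFamily(const std::vector<ToyQueueFamily>& families) {
    for (std::size_t i = 0; i < families.size(); ++i)
        if ((families[i].flags & kGraphics) && families[i].canPresent)
            return static_cast<int>(i);
    return -1;
}

// Prefer a transfer-only family for background uploads. Otherwise fall back to
// any family, since graphics and compute families can also perform transfers.
int pickTransferFamily(const std::vector<ToyQueueFamily>& families) {
    for (std::size_t i = 0; i < families.size(); ++i) {
        const std::uint32_t f = families[i].flags;
        if ((f & kTransfer) && !(f & (kGraphics | kCompute)))
            return static_cast<int>(i);   // dedicated transfer family
    }
    for (std::size_t i = 0; i < families.size(); ++i)
        if (families[i].flags & (kGraphics | kCompute | kTransfer))
            return static_cast<int>(i);
    return -1;
}
```

With real Vulkan data, `flags` would be VkQueueFamilyProperties::queueFlags and `canPresent` the result of vkGetPhysicalDeviceSurfaceSupportKHR for that family index.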
Command Pool → Command Buffer Lifecycle
Before recording commands, you need a VkCommandPool — a memory allocator for command buffers tied to one queue family. Command buffers allocated from a graphics pool can only be submitted to graphics queues.
    // ── Create the command pool
    VkCommandPoolCreateInfo poolCI{};
    poolCI.sType            = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
    poolCI.queueFamilyIndex = graphicsQueueFamily;
    poolCI.flags            = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT;
    // RESET_COMMAND_BUFFER_BIT: lets you reset individual buffers per frame.
    // Without this flag: you must reset the ENTIRE pool to re-record any buffer.
    vkCreateCommandPool(device, &poolCI, nullptr, &cmdPool);

    // ── Allocate a command buffer from the pool
    VkCommandBufferAllocateInfo allocCI{};
    allocCI.sType       = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
    allocCI.commandPool = cmdPool;
    allocCI.level       = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
    // PRIMARY:   submitted directly to the queue
    // SECONDARY: recorded into a primary buffer (multi-threaded recording)
    allocCI.commandBufferCount = 1;
    vkAllocateCommandBuffers(device, &allocCI, &cmdBuffer);
    // ── Every frame: reset, record, submit
    // (rpBeginInfo, vertexBuffer, offsets, submitInfo and fence are assumed to be set up earlier)

    // 1. Reset: erase previous frame's commands
    //    (only once the fence from the previous submit has signalled)
    vkResetCommandBuffer(cmdBuffer, 0);

    // 2. Open the recording
    VkCommandBufferBeginInfo beginInfo{};
    beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
    vkBeginCommandBuffer(cmdBuffer, &beginInfo);

    // ── From here: vkCmd* calls RECORD, not execute
    vkCmdBeginRenderPass(cmdBuffer, &rpBeginInfo, VK_SUBPASS_CONTENTS_INLINE);
    vkCmdBindPipeline(cmdBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
    vkCmdBindVertexBuffers(cmdBuffer, 0, 1, &vertexBuffer, offsets);
    vkCmdDraw(cmdBuffer, 3, 1, 0, 0);   // recorded, NOT yet executed
    vkCmdEndRenderPass(cmdBuffer);
    vkEndCommandBuffer(cmdBuffer);      // seal the recording

    // 3. Submit: GPU executes the sealed recording asynchronously
    vkQueueSubmit(graphicsQueue, 1, &submitInfo, fence);
    // CPU is FREE to do other work while the GPU executes the submission
Environment & CMake Setup
Every Vulkan project follows the same folder structure and the same CMakeLists.txt. Set this up once and every subsequent demo takes 30 seconds to create.
Verify Your Installation First
    # 1. Confirm Vulkan SDK is installed and GPU supports Vulkan
    vulkaninfo --summary
    # Expected: Your GPU name + "Vulkan version: 1.3.x"

    # 2. Confirm Vulkan runtime with a spinning textured cube
    vkcube
    # Expected: A window opens with a spinning cube. Close it.

    # 3. Confirm the GLSL → SPIR-V compiler is available
    glslc --version
    # Expected: "shaderc v202X.X, spirv-tools v202X.X"
The Standard Project Folder Structure
├── V01_Window\ ← Demo 1: just a window + GPU info
│ ├── CMakeLists.txt
│ └── src\
│ └── main.cpp
├── V02_Triangle\ ← Demo 2: full Vulkan triangle
│ ├── CMakeLists.txt
│ ├── src\
│ │ └── main.cpp
│ └── shaders\
│ ├── triangle.vert ← GLSL source (you edit this)
│ ├── triangle.frag ← GLSL source (you edit this)
│ ├── triangle.vert.spv ← compiled bytecode (Vulkan reads this)
│ └── triangle.frag.spv ← compiled bytecode (Vulkan reads this)
└── V03_Animation\ ← Demo 3: push constants + rotation
├── CMakeLists.txt
├── src\
│ └── main.cpp
└── shaders\ ← same structure as V02
The CMakeLists.txt — Copy for Every Demo
    cmake_minimum_required(VERSION 3.20)
    project(V02_Triangle)                 # ← change per demo

    set(CMAKE_CXX_STANDARD 17)
    set(CMAKE_CXX_STANDARD_REQUIRED ON)

    # ── Vulkan SDK
    # find_package searches VULKAN_SDK env variable (set by LunarG installer)
    # Sets: Vulkan::Vulkan target (headers + vulkan-1.lib), Vulkan_FOUND
    find_package(Vulkan REQUIRED)

    # ── GLFW (same as OpenGL days)
    set(GLFW_DIR $ENV{GLFW_DIR})
    include_directories(${GLFW_DIR}/include)

    add_executable(V02_Triangle src/main.cpp)   # ← change per demo

    target_link_libraries(V02_Triangle
        Vulkan::Vulkan                      # vulkan-1.lib + all headers
        ${GLFW_DIR}/lib-vc2022/glfw3.lib    # same GLFW as before
    )

    # NOTE: No GLEW. Vulkan exports all functions from vulkan-1.lib directly.
    # GLEW was only needed for OpenGL's dynamic function pointer loading.
Build Commands
    cd C:\Labs\V02_Triangle

    # Compile shaders FIRST (must exist before the exe tries to load them)
    glslc shaders/triangle.vert -o shaders/triangle.vert.spv
    glslc shaders/triangle.frag -o shaders/triangle.frag.spv

    # Configure CMake (generates VS project files)
    cmake -B build -G "Visual Studio 17 2022" -A x64

    # Build (Release mode for best performance)
    cmake --build build --config Release

    # Run. IMPORTANT: run from project root so shaders/ folder is found
    cd C:\Labs\V02_Triangle
    build\Release\V02_Triangle.exe
Common mistake: running the exe from inside build\Release\ instead of from the project root. Your programme calls readFile("shaders/triangle.vert.spv"), and that path is relative to wherever you run from. Run from C:\Labs\V02_Triangle\. Always.
Shader Compilation — GLSL to SPIR-V
Vulkan does not compile GLSL at runtime. You compile it at build time using glslc. The result is SPIR-V bytecode, a binary format that Vulkan loads directly, with no GLSL parsing at runtime.
Why SPIR-V?
In OpenGL, shader source strings are compiled by the GPU driver at runtime. Every driver has its own GLSL compiler with different behaviour, different performance, and different bugs. A shader that runs fast on NVIDIA may be slow on AMD — different compilers making different optimisation choices.
SPIR-V is an intermediate bytecode designed by Khronos. You compile GLSL → SPIR-V once with glslc. Every GPU driver receives the same SPIR-V and compiles it to GPU machine code when you create the pipeline, but now from a well-defined intermediate rather than ambiguous GLSL source. The result: zero runtime GLSL parsing, consistent behaviour across GPU vendors, and no mid-frame stutter provided pipelines are created up front.
Vulkan GLSL vs OpenGL GLSL: Five Differences
| Topic | OpenGL GLSL | Vulkan GLSL |
|---|---|---|
| Input/output locations | Optional — OpenGL infers them | Required — always write layout(location=N) |
| Y-axis direction | Y=+1 is TOP of screen (NDC) | Y=+1 is BOTTOM of screen — negate Y in shader |
| Fragment output | Write to gl_FragColor (or named out) | Must declare explicit out variable: layout(location=0) out vec4 outColor |
| Uniforms for small data | glUniform*() calls per frame | layout(push_constant) uniform block |
| Texture access | uniform sampler2D texture | layout(set=N, binding=N) uniform sampler2D |
    // ══ shaders/triangle.vert
    // Compile: glslc shaders/triangle.vert -o shaders/triangle.vert.spv
    #version 450

    // layout(location=N) REQUIRED in Vulkan GLSL, not optional like OpenGL
    layout(location = 0) in vec2 inPos;       // attribute 0: XY position
    layout(location = 1) in vec3 inColor;     // attribute 1: RGB colour
    layout(location = 0) out vec3 fragColor;  // pass colour to frag shader

    void main() {
        // NOTE: negate Y. Vulkan NDC has Y=+1 at bottom (opposite of OpenGL)
        gl_Position = vec4(inPos.x, -inPos.y, 0.0, 1.0);
        fragColor = inColor;
    }

    // ══ shaders/triangle.frag
    // Compile: glslc shaders/triangle.frag -o shaders/triangle.frag.spv
    #version 450

    layout(location = 0) in vec3 fragColor;
    layout(location = 0) out vec4 outColor;   // NO gl_FragColor in Vulkan GLSL

    void main() {
        outColor = vec4(fragColor, 1.0);
    }
The readSpv Helper — Load .spv at Runtime
    #include <fstream>
    #include <stdexcept>
    #include <string>
    #include <vector>

    std::vector<char> readSpv(const std::string& path) {
        std::ifstream f(path, std::ios::ate | std::ios::binary);
        // ate    = open at end (so tellg() gives size immediately)
        // binary = raw bytes, no newline translation
        if (!f.is_open()) throw std::runtime_error("Cannot open shader: " + path);
        const std::streamsize sz = f.tellg();
        if (sz <= 0) throw std::runtime_error("Empty or unreadable shader: " + path);
        std::vector<char> buf(static_cast<std::size_t>(sz));
        f.seekg(0);
        f.read(buf.data(), sz);
        return buf;
    }

    // Usage:
    auto vertCode = readSpv("shaders/triangle.vert.spv");
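A cheap sanity check on the loaded bytes catches the common mistake of pointing the loader at the GLSL source instead of the compiled .spv file. The sketch below relies on two facts from the SPIR-V specification: a module is a stream of 32-bit words, and its first word is the magic number 0x07230203. The helper name `looksLikeSpirv` is hypothetical, and the check assumes the file was written with the same byte order as the machine reading it.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// First word of every SPIR-V module, as defined by the SPIR-V specification.
constexpr std::uint32_t kSpirvMagic = 0x07230203u;

// Hypothetical helper: returns true when the bytes plausibly hold a SPIR-V module.
// It checks only the size and the magic number, not the rest of the module.
bool looksLikeSpirv(const std::vector<char>& bytes) {
    // SPIR-V is made of whole 32-bit words, so the size must be a multiple of 4.
    if (bytes.size() < 4 || bytes.size() % 4 != 0) return false;
    std::uint32_t firstWord = 0;
    std::memcpy(&firstWord, bytes.data(), sizeof(firstWord));   // memcpy avoids unaligned reads
    return firstWord == kSpirvMagic;
}
```

Calling this on the result of readSpv before creating the shader module turns a confusing driver failure into a clear error message.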
VkInstance — Your First Vulkan Object
VkInstance is the Vulkan library handle. It is your application's connection to the Vulkan loader. Everything in Vulkan flows from this one object.
What VkInstance Does
The instance holds global configuration: which Vulkan version your app requires, which instance-level extensions are enabled, and which validation layers are active. There is exactly one VkInstance per application. It is the first Vulkan object you create and the last you destroy.
It does not represent a GPU. It represents "Vulkan is loaded and configured with these settings."
    #define GLFW_INCLUDE_VULKAN   // tells GLFW to include Vulkan headers
    #include <GLFW/glfw3.h>
    #include <vulkan/vulkan.h>
    #include <vector>
    #include <stdexcept>

    // ── Optional but essential: validation layer
    // During development: always enable. In release: remove.
    // Catches most Vulkan API misuse before it turns into a crash.
    const char* VALIDATION_LAYER = "VK_LAYER_KHRONOS_validation";

    VkInstance createInstance() {
        // Application metadata: optional but good practice
        VkApplicationInfo appInfo{};
        appInfo.sType              = VK_STRUCTURE_TYPE_APPLICATION_INFO;
        appInfo.pApplicationName   = "RR Graphics Demo";
        appInfo.applicationVersion = VK_MAKE_VERSION(1, 0, 0);
        appInfo.apiVersion         = VK_API_VERSION_1_3;
        // apiVersion: the highest Vulkan version your app is written to use.
        // A Vulkan 1.0-only driver rejects a higher value and vkCreateInstance fails.
        // Also check the device's own apiVersion before relying on 1.3 features.
        // Driver can use the app info to apply per-app workarounds too.

        // Ask GLFW which extensions it needs for window surface creation
        uint32_t glfwExtCount = 0;
        const char** glfwExts = glfwGetRequiredInstanceExtensions(&glfwExtCount);
        // Typically: "VK_KHR_surface" + "VK_KHR_win32_surface" on Windows
        std::vector<const char*> exts(glfwExts, glfwExts + glfwExtCount);

        VkInstanceCreateInfo ci{};
        ci.sType                   = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
        ci.pApplicationInfo        = &appInfo;
        ci.enabledExtensionCount   = (uint32_t)exts.size();
        ci.ppEnabledExtensionNames = exts.data();
        ci.enabledLayerCount       = 1;                   // enable validation
        ci.ppEnabledLayerNames     = &VALIDATION_LAYER;

        VkInstance instance;
        if (vkCreateInstance(&ci, nullptr, &instance) != VK_SUCCESS)
            throw std::runtime_error("vkCreateInstance failed");
        return instance;
    }

    // Cleanup (always last):
    vkDestroyInstance(instance, nullptr);
Physical Device — Reading the GPU Spec Sheet
VkPhysicalDevice represents actual GPU hardware. You query it, not create it. It tells you everything about what the GPU supports and what memory it has.
The Two-Call Pattern — Query Count, Then Fill
Wherever Vulkan returns a variable-length list, it always uses this two-call pattern: call once with nullptr to get the count, then call again with a properly-sized vector to fill it. You will see this dozens of times.
    // Requires <iostream>, <vector> and <stdexcept> in addition to the Vulkan headers.
    VkPhysicalDevice pickPhysicalDevice(VkInstance instance, VkSurfaceKHR surface) {
        // ── The two-call pattern
        uint32_t count = 0;
        vkEnumeratePhysicalDevices(instance, &count, nullptr);       // ① get count
        if (count == 0) throw std::runtime_error("No Vulkan GPU found!");
        std::vector<VkPhysicalDevice> gpus(count);
        vkEnumeratePhysicalDevices(instance, &count, gpus.data());   // ② fill array
        std::cout << "\n[ GPUs found: " << count << " ]\n";

        VkPhysicalDevice chosen = VK_NULL_HANDLE;
        for (auto& gpu : gpus) {
            // VkPhysicalDeviceProperties: name, type, Vulkan version, limits
            VkPhysicalDeviceProperties props;
            vkGetPhysicalDeviceProperties(gpu, &props);
            std::cout << "  GPU: " << props.deviceName;
            std::cout << "  Vulkan " << VK_VERSION_MAJOR(props.apiVersion)
                      << "." << VK_VERSION_MINOR(props.apiVersion);
            if (props.deviceType == VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU) {
                std::cout << "  [DISCRETE - SELECTED]\n";
                chosen = gpu;                                  // prefer dedicated GPU over integrated
            } else if (props.deviceType == VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU) {
                std::cout << "  [INTEGRATED]\n";
                if (chosen == VK_NULL_HANDLE) chosen = gpu;    // fallback
            } else {
                std::cout << "  [other]\n";
            }
        }
        if (chosen == VK_NULL_HANDLE)
            throw std::runtime_error("No discrete or integrated GPU found!");

        // ── Print memory heaps: information OpenGL never gave you
        VkPhysicalDeviceMemoryProperties memP;
        vkGetPhysicalDeviceMemoryProperties(chosen, &memP);
        std::cout << "\n  Memory heaps:\n";
        for (uint32_t i = 0; i < memP.memoryHeapCount; i++) {
            float gb = memP.memoryHeaps[i].size / 1e9f;
            bool deviceLocal = memP.memoryHeaps[i].flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT;
            std::cout << "    Heap " << i << ": " << gb << " GB"
                      << (deviceLocal ? "  [GPU-LOCAL fastest]" : "  [CPU-visible]") << "\n";
        }

        // ── Queue family check: does this GPU support graphics + present?
        uint32_t qfCount = 0;
        vkGetPhysicalDeviceQueueFamilyProperties(chosen, &qfCount, nullptr);
        std::vector<VkQueueFamilyProperties> qfs(qfCount);
        vkGetPhysicalDeviceQueueFamilyProperties(chosen, &qfCount, qfs.data());
        for (uint32_t i = 0; i < qfCount; i++) {
            VkBool32 presentOK = VK_FALSE;
            vkGetPhysicalDeviceSurfaceSupportKHR(chosen, i, surface, &presentOK);
            if ((qfs[i].queueFlags & VK_QUEUE_GRAPHICS_BIT) && presentOK)
                std::cout << "  Queue family " << i << ": graphics + present OK\n";
        }
        return chosen;
    }
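Because the count-then-fill idiom recurs so often, it can be wrapped once in a small template. The sketch below is one possible shape for such a wrapper. `enumerateAll` and `toyEnumerateDevices` are hypothetical names, and the toy function stands in for a real Vulkan enumerate call so the example runs without a GPU.

```cpp
#include <cstdint>
#include <vector>

// Generic wrapper for the count-then-fill idiom.
// query(count, data) must behave like a Vulkan enumerate call:
//   data == nullptr : write the number of available items to *count
//   data != nullptr : write up to *count items to data and update *count
template <typename T, typename Query>
std::vector<T> enumerateAll(Query query) {
    std::uint32_t count = 0;
    query(&count, static_cast<T*>(nullptr));   // call 1: ask for the count
    std::vector<T> items(count);
    if (count > 0) {
        query(&count, items.data());           // call 2: fill the array
        items.resize(count);                   // the count may shrink between calls
    }
    return items;
}

// Stand-in for a Vulkan enumerate function (hypothetical; reports three fake IDs).
inline void toyEnumerateDevices(std::uint32_t* count, int* out) {
    static const int kDevices[3] = { 10, 20, 30 };
    if (out == nullptr) { *count = 3; return; }
    const std::uint32_t n = (*count < 3u) ? *count : 3u;
    for (std::uint32_t i = 0; i < n; ++i) out[i] = kDevices[i];
    *count = n;
}
```

With the real API, the query would be a lambda that forwards to vkEnumeratePhysicalDevices(instance, n, d) or vkGetPhysicalDeviceQueueFamilyProperties(gpu, n, d), so every list in the demos can go through the same helper.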
Logical Device & Queues
VkDevice is your software contract with the GPU. You declare what queue families you need, what features to enable, and what extensions are required. Everything below in the hierarchy is created from VkDevice.
VkPhysicalDevice = the GPU exists. You queried it. It is hardware. You cannot configure it. VkDevice = your software interface to that hardware. One physical device can theoretically have multiple logical devices (rare). Think of VkPhysicalDevice as "which laptop I own" and VkDevice as "what software and accounts I set up on it."
    float priority = 1.0f;
    VkDeviceQueueCreateInfo qci{};
    qci.sType            = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
    qci.queueFamilyIndex = graphicsQueueFamily;
    qci.queueCount       = 1;            // one queue from this family
    qci.pQueuePriorities = &priority;    // 0.0=lowest, 1.0=highest scheduling priority

    const char* devExts[] = { VK_KHR_SWAPCHAIN_EXTENSION_NAME };
    // VK_KHR_SWAPCHAIN_EXTENSION_NAME = "VK_KHR_swapchain"
    // This adds: vkCreateSwapchainKHR, vkQueuePresentKHR, etc.
    // Required if you want to display anything on screen.

    VkPhysicalDeviceFeatures features{};   // {} = all features disabled.
    // Enable only what you use, e.g.:
    // features.samplerAnisotropy = VK_TRUE;   // high-quality texture filtering

    VkDeviceCreateInfo dci{};
    dci.sType                   = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
    dci.queueCreateInfoCount    = 1;
    dci.pQueueCreateInfos       = &qci;
    dci.enabledExtensionCount   = 1;
    dci.ppEnabledExtensionNames = devExts;
    dci.pEnabledFeatures        = &features;

    VkDevice device;
    if (vkCreateDevice(physDevice, &dci, nullptr, &device) != VK_SUCCESS)
        throw std::runtime_error("vkCreateDevice failed");

    // Retrieve queue handle: queues are created with the device, you retrieve them
    VkQueue graphicsQueue;
    vkGetDeviceQueue(device, graphicsQueueFamily, 0, &graphicsQueue);
    // Args: device, queueFamilyIndex, queueIndex (0=first), &output
    // graphicsQueue is now your submission channel to the GPU
Surface & Swapchain
The surface is the bridge between Vulkan and your OS window. The swapchain is the ring of images you draw into and present to the screen — the explicit replacement for OpenGL's invisible double-buffering.
VkSurfaceKHR — Vulkan Meets the Window
Vulkan's core has no windowing system. KHR extensions add platform support. GLFW handles the platform-specific surface creation for you on Windows, Linux, and macOS — one line:
// In main(), BEFORE glfwCreateWindow() (window hints only affect windows created afterwards):
glfwWindowHint(GLFW_CLIENT_API, GLFW_NO_API); // CRITICAL: no OpenGL context
// In OpenGL: GLFW_CONTEXT_VERSION_MAJOR was set. Vulkan: NO API at all.
// This one hint changes everything: GLFW creates a bare OS window,
// no OpenGL context attached, ready for Vulkan to claim it.

// After glfwCreateWindow() and vkCreateInstance():
VkSurfaceKHR surface;
if (glfwCreateWindowSurface(instance, window, nullptr, &surface) != VK_SUCCESS)
    throw std::runtime_error("glfwCreateWindowSurface failed");
// GLFW handles: Win32 on Windows, XCB/Wayland on Linux, Metal on macOS
// Cleanup: vkDestroySurfaceKHR(instance, surface, nullptr), before vkDestroyInstance
VkSwapchainKHR — Explicit Double Buffering
In OpenGL, glfwSwapBuffers() flipped front and back buffers invisibly. You never touched the images. In Vulkan, you create a ring of 2–3 images explicitly. You draw into one while the monitor displays another.
| Presentation Mode | Behaviour | Use When |
|---|---|---|
| FIFO_KHR | Strict vsync queue. Always available. Never tears. | Default choice. Slight input lag. |
| MAILBOX_KHR | Vsync with newest frame. Renders as fast as GPU allows, discards old queued frames. | Best quality + low latency. |
| IMMEDIATE_KHR | No vsync. Frames shown immediately. May tear. | Benchmarking only. |
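The table implies a selection rule: prefer MAILBOX when the surface offers it, otherwise fall back to FIFO, which is always available. A minimal sketch of that rule, together with a common way to pick the swapchain image count, could look like this. `PresentMode`, `choosePresentMode`, and `chooseImageCount` are hypothetical names; real code would work with `VkPresentModeKHR` values and the `minImageCount` / `maxImageCount` fields of `VkSurfaceCapabilitiesKHR`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for VkPresentModeKHR.
enum class PresentMode { Immediate, Mailbox, Fifo };

// Prefer Mailbox (low latency, no tearing); otherwise use Fifo,
// the only mode every Vulkan implementation must support.
PresentMode choosePresentMode(const std::vector<PresentMode>& available) {
    bool hasMailbox = std::find(available.begin(), available.end(),
                                PresentMode::Mailbox) != available.end();
    return hasMailbox ? PresentMode::Mailbox : PresentMode::Fifo;
}

// Ask for one image more than the minimum so the CPU rarely waits on the
// driver. maxImages == 0 means "no upper limit" in VkSurfaceCapabilitiesKHR.
uint32_t chooseImageCount(uint32_t minImages, uint32_t maxImages) {
    assert(minImages >= 1);
    uint32_t count = minImages + 1;
    if (maxImages > 0 && count > maxImages)
        count = maxImages;
    return count;
}
```

A surface reporting a minimum of 2 and no maximum gives the 3-image ring described above; a surface capped at 2 stays at plain double buffering.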
Render Pass — Declaring Your Drawing Intent
A render pass is an upfront declaration of what framebuffer attachments you will draw into and what must happen to them. This lets the GPU driver optimise memory bandwidth for tile-based renderers.
Before filming starts, a director hands the crew a shot list: "We shoot Scene 3 today. Start with a clean slate (loadOp=CLEAR). Save the footage at end (storeOp=STORE). Print it for screening (finalLayout=PRESENT_SRC)." The crew sets up optimally based on this upfront plan. Vulkan's render pass gives the GPU driver the same plan — and the driver optimises memory layout accordingly.
VkAttachmentDescription colorAtt{};
colorAtt.format  = swapchainFormat;
colorAtt.samples = VK_SAMPLE_COUNT_1_BIT; // no MSAA
colorAtt.loadOp  = VK_ATTACHMENT_LOAD_OP_CLEAR;
// loadOp: what happens at START of render pass
//   CLEAR     = fill with clear value (our background colour)
//   LOAD      = preserve existing content (for compositing)
//   DONT_CARE = undefined: saves bandwidth on mobile, but we'd see garbage
colorAtt.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
// storeOp: what happens at END of render pass
//   STORE     = keep the result (we need it for presentation)
//   DONT_CARE = discard: correct for depth buffers not used later
colorAtt.stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE; // not using stencil
colorAtt.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
colorAtt.initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED;
// initialLayout: expected image layout when pass begins.
// UNDEFINED = "I don't care what was there", which matches CLEAR loadOp perfectly.
colorAtt.finalLayout    = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;
// finalLayout: layout GPU auto-transitions image to when pass ends.
// PRESENT_SRC_KHR = ready for vkQueuePresentKHR to show on screen.

VkAttachmentReference colorRef{};
colorRef.attachment = 0; // index into the attachments array
colorRef.layout     = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
// This is the layout DURING the subpass: optimal for colour writes.

VkSubpassDescription subpass{};
subpass.pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass.colorAttachmentCount = 1;
subpass.pColorAttachments    = &colorRef;

VkSubpassDependency dep{};
dep.srcSubpass    = VK_SUBPASS_EXTERNAL; // "before this render pass"
dep.dstSubpass    = 0;                   // our subpass
dep.srcStageMask  = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dep.dstStageMask  = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dep.srcAccessMask = 0;
dep.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
// This dependency ensures the swapchain image is ready before we write to it.

VkRenderPassCreateInfo rpCI{};
rpCI.sType           = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
rpCI.attachmentCount = 1;
rpCI.pAttachments    = &colorAtt;
rpCI.subpassCount    = 1;
rpCI.pSubpasses      = &subpass;
rpCI.dependencyCount = 1;
rpCI.pDependencies   = &dep;
vkCreateRenderPass(device, &rpCI, nullptr, &renderPass);
The Graphics Pipeline
VkPipeline is the most important and most verbose Vulkan object. It bakes every piece of render state into one immutable GPU-compiled object. Expensive to create once, cheap to bind every frame.
The Pipeline Combines Everything
In OpenGL, you called many separate functions to configure state (glEnable, glBlendFunc, glCullFace, glPolygonMode, ...), and whenever one of them changed, the driver might have to re-validate or recompile internal pipeline state behind your back, often at draw time. In Vulkan, you declare all of this state together in one VkGraphicsPipelineCreateInfo. The driver compiles once, at creation. At draw time, binding the pipeline is a single cheap command.
// ── Load SPIR-V shaders ──────────────────────────────────────────────
auto vertCode = readSpv("shaders/triangle.vert.spv");
auto fragCode = readSpv("shaders/triangle.frag.spv");

// Shader modules: thin wrappers around the SPIR-V bytecode
VkShaderModuleCreateInfo smCI{};
smCI.sType    = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
smCI.codeSize = vertCode.size();
smCI.pCode    = reinterpret_cast<const uint32_t*>(vertCode.data());
VkShaderModule vertMod, fragMod;
vkCreateShaderModule(device, &smCI, nullptr, &vertMod);
smCI.codeSize = fragCode.size();
smCI.pCode    = reinterpret_cast<const uint32_t*>(fragCode.data());
vkCreateShaderModule(device, &smCI, nullptr, &fragMod);

// ── Shader stages: which module runs at which stage ──────────────────
VkPipelineShaderStageCreateInfo stages[2]{};
stages[0].sType  = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
stages[0].stage  = VK_SHADER_STAGE_VERTEX_BIT;
stages[0].module = vertMod;
stages[0].pName  = "main"; // entry point function name
stages[1].sType  = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
stages[1].stage  = VK_SHADER_STAGE_FRAGMENT_BIT;
stages[1].module = fragMod;
stages[1].pName  = "main";

// ── Vertex input: replaces glVertexAttribPointer ─────────────────────
// Struct: { float pos[2]; float col[3]; } = 20 bytes per vertex
VkVertexInputBindingDescription bind{};
bind.binding   = 0;                 // VBO slot 0
bind.stride    = 5 * sizeof(float); // 20 bytes per vertex
bind.inputRate = VK_VERTEX_INPUT_RATE_VERTEX;
VkVertexInputAttributeDescription attrs[2]{};
attrs[0] = {0, 0, VK_FORMAT_R32G32_SFLOAT, 0};                  // pos: location=0, 2 floats, offset 0
attrs[1] = {1, 0, VK_FORMAT_R32G32B32_SFLOAT, 2*sizeof(float)}; // col: location=1, 3 floats, offset 8
VkPipelineVertexInputStateCreateInfo vertInput{};
vertInput.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO;
vertInput.vertexBindingDescriptionCount   = 1;
vertInput.pVertexBindingDescriptions      = &bind;
vertInput.vertexAttributeDescriptionCount = 2;
vertInput.pVertexAttributeDescriptions    = attrs;

// ── Input assembly: how to group vertices into primitives ────────────
VkPipelineInputAssemblyStateCreateInfo ia{};
ia.sType    = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO;
ia.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; // = GL_TRIANGLES

// ── Viewport + scissor: which screen region to draw into ─────────────
VkViewport vp{0, 0, (float)w, (float)h, 0.0f, 1.0f};
VkRect2D sc{{0,0},{w,h}};
VkPipelineViewportStateCreateInfo vpS{};
vpS.sType         = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO;
vpS.viewportCount = 1;
vpS.pViewports    = &vp;
vpS.scissorCount  = 1;
vpS.pScissors     = &sc;

// ── Rasterisation: triangles → fragments ─────────────────────────────
VkPipelineRasterizationStateCreateInfo rast{};
rast.sType       = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
rast.polygonMode = VK_POLYGON_MODE_FILL;    // solid (GL_FILL)
rast.cullMode    = VK_CULL_MODE_NONE;       // no back-face culling for learning
rast.frontFace   = VK_FRONT_FACE_CLOCKWISE; // OPPOSITE of OpenGL CCW!
rast.lineWidth   = 1.0f;

// ── Multisampling: anti-aliasing, off for now ────────────────────────
VkPipelineMultisampleStateCreateInfo ms{};
ms.sType                = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO;
ms.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT;

// ── Colour blending: how new pixel mixes with existing ───────────────
VkPipelineColorBlendAttachmentState blendAtt{};
blendAtt.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                          VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;
blendAtt.blendEnable    = VK_FALSE; // no alpha blending: opaque
VkPipelineColorBlendStateCreateInfo blend{};
blend.sType           = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO;
blend.attachmentCount = 1;
blend.pAttachments    = &blendAtt;

// ── Pipeline layout: declares push constants + descriptor sets ───────
VkPipelineLayoutCreateInfo plCI{};
plCI.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
// For simple triangle: no push constants, no descriptors yet
vkCreatePipelineLayout(device, &plCI, nullptr, &pipelineLayout);

// ── Final pipeline assembly: EVERYTHING baked together ───────────────
VkGraphicsPipelineCreateInfo gpCI{};
gpCI.sType               = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
gpCI.stageCount          = 2;
gpCI.pStages             = stages;
gpCI.pVertexInputState   = &vertInput;
gpCI.pInputAssemblyState = &ia;
gpCI.pViewportState      = &vpS;
gpCI.pRasterizationState = &rast;
gpCI.pMultisampleState   = &ms;
gpCI.pColorBlendState    = &blend;
gpCI.layout              = pipelineLayout;
gpCI.renderPass          = renderPass;
gpCI.subpass             = 0;
vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &gpCI, nullptr, &pipeline);

// Shader modules no longer needed after pipeline bakes them in
vkDestroyShaderModule(device, vertMod, nullptr);
vkDestroyShaderModule(device, fragMod, nullptr);
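The stride (20 bytes) and the colour offset (8 bytes) in the vertex input state are written as literals above, which silently breaks if the vertex struct ever changes. One possible way to keep them in sync is to derive them from the struct itself. This is a sketch under the assumption of a 4-byte `float`; `AttributeLayout`, `vertexStride`, `positionAttribute`, and `colourAttribute` are hypothetical helpers, with `AttributeLayout` standing in for the location, binding, and offset fields of `VkVertexInputAttributeDescription`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same layout as the handbook's vertex: XY position followed by RGB colour.
struct Vertex {
    float pos[2];
    float col[3];
};

// Hypothetical stand-in for the location/binding/offset fields of
// VkVertexInputAttributeDescription.
struct AttributeLayout {
    uint32_t location;
    uint32_t binding;
    uint32_t offset;
};

// Fails at compile time if padding or a changed field breaks the 20-byte assumption.
static_assert(sizeof(Vertex) == 5 * sizeof(float), "Vertex must be tightly packed");

// Value for VkVertexInputBindingDescription::stride.
inline uint32_t vertexStride() {
    return static_cast<uint32_t>(sizeof(Vertex));
}

inline AttributeLayout positionAttribute() {
    return {0, 0, static_cast<uint32_t>(offsetof(Vertex, pos))};
}

inline AttributeLayout colourAttribute() {
    assert(offsetof(Vertex, col) == 2 * sizeof(float));
    return {1, 0, static_cast<uint32_t>(offsetof(Vertex, col))};
}
```

In the real pipeline setup, `bind.stride` would take `vertexStride()` and each `attrs[i].offset` the corresponding `offsetof` value, so adding a field to `Vertex` cannot leave stale numbers behind.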
Vertex Buffers — Explicit Memory Allocation
Creating a vertex buffer in Vulkan is 6 explicit steps where OpenGL needed 1 implicit one. Every step reveals a decision the driver was silently making for you.
struct Vertex { float pos[2], col[3]; }; // XY + RGB = 5 floats = 20 bytes
const std::vector<Vertex> verts = {
    {{ 0.0f, -0.5f}, {1.0f, 0.2f, 0.2f}}, // top: red
    {{ 0.5f,  0.5f}, {0.2f, 1.0f, 0.3f}}, // right: green
    {{-0.5f,  0.5f}, {0.2f, 0.3f, 1.0f}}  // left: blue
};
VkDeviceSize size = sizeof(verts[0]) * verts.size(); // 60 bytes total

// ── Step 1: Create the buffer descriptor ─────────────────────────────
VkBufferCreateInfo bci{};
bci.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
bci.size  = size;
bci.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT;
// usage: declares what you will use the buffer for.
// VERTEX_BUFFER_BIT: can be bound as a vertex source.
// Other flags: UNIFORM_BUFFER_BIT, INDEX_BUFFER_BIT, TRANSFER_SRC/DST_BIT
bci.sharingMode = VK_SHARING_MODE_EXCLUSIVE; // only one queue family accesses it
VkBuffer vertexBuffer;
vkCreateBuffer(device, &bci, nullptr, &vertexBuffer);
// NOTE: vertexBuffer has no memory yet, it is just a descriptor

// ── Step 2: Query what memory requirements this buffer has ───────────
VkMemoryRequirements req;
vkGetBufferMemoryRequirements(device, vertexBuffer, &req);
// req.size: may be LARGER than 60 due to GPU alignment requirements
// req.alignment: memory start must align to this boundary
// req.memoryTypeBits: bitmask, bit i is set if memory type i is compatible

// ── Step 3: Find a compatible memory type ────────────────────────────
VkPhysicalDeviceMemoryProperties mps;
vkGetPhysicalDeviceMemoryProperties(physDevice, &mps);
uint32_t memIdx = UINT32_MAX;
auto needed = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT    // CPU can write via vkMapMemory
            | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;  // no manual flush needed
for (uint32_t i = 0; i < mps.memoryTypeCount; i++) {
    bool compatible = (req.memoryTypeBits & (1u << i)) != 0; // 1u: unsigned shift, safe for bit 31
    bool hasFlags   = (mps.memoryTypes[i].propertyFlags & needed) == needed;
    if (compatible && hasFlags) { memIdx = i; break; }
}
if (memIdx == UINT32_MAX) throw std::runtime_error("No suitable memory type");

// ── Step 4: Allocate the memory ──────────────────────────────────────
VkMemoryAllocateInfo mai{};
mai.sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
mai.allocationSize  = req.size; // use req.size, not our original 60
mai.memoryTypeIndex = memIdx;
VkDeviceMemory vertexMemory;
vkAllocateMemory(device, &mai, nullptr, &vertexMemory);

// ── Step 5: Bind the memory to the buffer ────────────────────────────
vkBindBufferMemory(device, vertexBuffer, vertexMemory, 0);
// Last arg: byte offset into the allocation. 0 = start at beginning.
// This is where sharing one large VkDeviceMemory across many buffers is done.

// ── Step 6: Write vertex data ────────────────────────────────────────
void* data;
vkMapMemory(device, vertexMemory, 0, size, 0, &data);
// vkMapMemory: returns a CPU pointer to the mapped memory region.
// HOST_COHERENT flag: no explicit vkFlushMappedMemoryRanges is needed;
// the writes become visible to the GPU with the next queue submission.
memcpy(data, verts.data(), (size_t)size);
vkUnmapMemory(device, vertexMemory);
// After unmap the pointer is invalid. Map again if the CPU must update the data later.
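Step 3 is the part most worth isolating, because the same search is repeated for every buffer and image you create. Below is a minimal sketch of that search as a standalone function. It runs on plain integers so it can be tried without a GPU: `findMemoryType`, `kNoMemoryType`, and the `k...` flag constants are hypothetical names, and the flag values are chosen to mirror the corresponding `VK_MEMORY_PROPERTY_*` bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins mirroring VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, HOST_VISIBLE_BIT
// and HOST_COHERENT_BIT (assumption: real code uses the Vulkan constants).
constexpr uint32_t kDeviceLocal  = 1u << 0;
constexpr uint32_t kHostVisible  = 1u << 1;
constexpr uint32_t kHostCoherent = 1u << 2;

constexpr uint32_t kNoMemoryType = UINT32_MAX;

// typeBits  : VkMemoryRequirements::memoryTypeBits (bit i set = type i allowed)
// needed    : property flags the caller requires
// typeFlags : propertyFlags of each memory type, in index order
// Returns the first index that is both allowed and has every needed flag.
uint32_t findMemoryType(uint32_t typeBits, uint32_t needed,
                        const std::vector<uint32_t>& typeFlags) {
    assert(typeFlags.size() <= 32); // Vulkan exposes at most 32 memory types
    for (uint32_t i = 0; i < typeFlags.size(); ++i) {
        bool compatible = (typeBits & (1u << i)) != 0;
        bool hasFlags   = (typeFlags[i] & needed) == needed;
        if (compatible && hasFlags)
            return i;
    }
    return kNoMemoryType;
}
```

Note that the flag test is `(flags & needed) == needed`, not `!= 0`: a memory type that is only host-visible but not coherent must be rejected when both properties were requested.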
Recording Command Buffers — Every Frame
Every frame begins with resetting the command buffer and recording a fresh set of commands. This is the Vulkan equivalent of everything inside your OpenGL render loop body.
void recordCommandBuffer(VkCommandBuffer cmd, uint32_t imageIdx) {
    // 1. Reset: erase all previously recorded commands
    vkResetCommandBuffer(cmd, 0);

    // 2. Open the recording
    VkCommandBufferBeginInfo bi{};
    bi.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
    vkBeginCommandBuffer(cmd, &bi);

    // 3. Begin render pass: what framebuffer, what clear colour
    VkClearValue clear = {{{0.05f, 0.07f, 0.12f, 1.0f}}}; // dark navy bg
    // Nested braces: VkClearValue { VkClearColorValue { float[4] } }
    VkRenderPassBeginInfo rpBI{};
    rpBI.sType             = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO;
    rpBI.renderPass        = renderPass;
    rpBI.framebuffer       = framebuffers[imageIdx]; // which swapchain image
    rpBI.renderArea.offset = {0, 0};
    rpBI.renderArea.extent = swapExtent;
    rpBI.clearValueCount   = 1;
    rpBI.pClearValues      = &clear;
    vkCmdBeginRenderPass(cmd, &rpBI, VK_SUBPASS_CONTENTS_INLINE);

    // 4. Bind pipeline: selects shaders + all baked render state
    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);

    // 5. Bind vertex buffer: which VBO to read vertices from
    VkBuffer vbs[] = { vertexBuffer };
    VkDeviceSize offs[] = { 0 };
    vkCmdBindVertexBuffers(cmd, 0, 1, vbs, offs);
    // Args: cmd, firstBinding=0, count=1, buffers, offsets
    // firstBinding=0 matches binding=0 in VkVertexInputBindingDescription

    // 6. Draw: RECORD the draw call (does NOT execute yet)
    vkCmdDraw(cmd, 3, 1, 0, 0);
    // Args: vertexCount=3, instanceCount=1, firstVertex=0, firstInstance=0
    // Compare: glDrawArrays(GL_TRIANGLES, 0, 3), same intent, different timing

    vkCmdEndRenderPass(cmd);
    vkEndCommandBuffer(cmd); // seal the recording, ready to submit
}
// GPU executes this ONLY after vkQueueSubmit in drawFrame()
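The key idea in the listing above is that every `vkCmd*` call only records; nothing runs until the buffer is submitted. The toy model below illustrates that timing with ordinary C++ closures. It is an analogy only: `ToyCommandBuffer` is a hypothetical class, not a Vulkan type, and a real command buffer stores driver-specific data rather than callables.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model of record-now, execute-on-submit. Not Vulkan.
class ToyCommandBuffer {
public:
    using Command = std::function<void(std::vector<std::string>&)>;

    // Like vkResetCommandBuffer + vkBeginCommandBuffer: drop old commands, start recording.
    void begin() { commands_.clear(); recording_ = true; }

    // Like a vkCmd* call: store the work, do not run it.
    void record(Command cmd) {
        assert(recording_ && "record() called outside begin()/end()");
        commands_.push_back(std::move(cmd));
    }

    // Like vkEndCommandBuffer: seal the recording.
    void end() { recording_ = false; }

    // Like vkQueueSubmit: only now does the recorded work execute.
    void submit(std::vector<std::string>& log) const {
        assert(!recording_ && "submit() called before end()");
        for (const Command& cmd : commands_)
            cmd(log);
    }

private:
    std::vector<Command> commands_;
    bool recording_ = false;
};
```

Recording a "draw" into this buffer leaves the log empty; the entry appears only after `submit`, which is the behaviour the `vkCmdDraw` comment describes.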
Synchronisation — Semaphores & Fences
Vulkan is asynchronous. CPU and GPU run simultaneously. Without explicit synchronisation, the CPU would reuse a command buffer the GPU is still reading, or present an image the GPU is still drawing. Semaphores and fences prevent both.
A semaphore is the traffic light between two kitchen stations — the grill signals "plate is ready" and the service station waits. No human involved. A fence is the pickup buzzer a customer holds — the kitchen (GPU) presses it when the order is done so the customer (CPU) knows it is safe to place the next order.
void drawFrame() {
    // ── 1. Wait for the PREVIOUS frame's GPU work to finish ──────────
    // Prevents CPU from re-recording into a command buffer the GPU still reads.
    vkWaitForFences(device, 1, &inFlightFence, VK_TRUE, UINT64_MAX);
    // VK_TRUE = wait for ALL fences (we only have one)
    // UINT64_MAX = wait forever (no timeout)
    vkResetFences(device, 1, &inFlightFence); // reset to unsignaled for this frame

    // ── 2. Ask: which swapchain image can I draw into? ───────────────
    uint32_t imageIdx;
    vkAcquireNextImageKHR(device, swapchain, UINT64_MAX,
                          imageAvailableSema, VK_NULL_HANDLE, &imageIdx);
    // imageAvailableSema: signalled when the image is truly free
    // imageIdx: index of the swapchain image you can draw into this frame

    // ── 3. Record draw commands for this frame ───────────────────────
    recordCommandBuffer(cmdBuffer, imageIdx);

    // ── 4. Submit to the graphics queue ──────────────────────────────
    VkPipelineStageFlags waitStage = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    // waitStage: at which GPU pipeline stage to pause and wait for the semaphore.
    // COLOR_ATTACHMENT_OUTPUT = "don't write colour pixels until image is ready"
    // GPU can still run vertex shading before this point, so no cycles are wasted.
    VkSubmitInfo si{};
    si.sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    si.waitSemaphoreCount   = 1;
    si.pWaitSemaphores      = &imageAvailableSema;
    si.pWaitDstStageMask    = &waitStage;
    si.commandBufferCount   = 1;
    si.pCommandBuffers      = &cmdBuffer;
    si.signalSemaphoreCount = 1;
    si.pSignalSemaphores    = &renderFinishedSema;
    // signalSemaphores: GPU signals renderFinishedSema when commands complete
    vkQueueSubmit(graphicsQueue, 1, &si, inFlightFence);
    // inFlightFence: GPU signals this fence when done, so the CPU can safely reset next frame

    // ── 5. Present the completed image ───────────────────────────────
    VkPresentInfoKHR pi{};
    pi.sType              = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR;
    pi.waitSemaphoreCount = 1;
    pi.pWaitSemaphores    = &renderFinishedSema;
    // Don't present until renderFinishedSema fires: never show a half-drawn frame
    pi.swapchainCount     = 1;
    pi.pSwapchains        = &swapchain;
    pi.pImageIndices      = &imageIdx;
    vkQueuePresentKHR(presentQueue, &pi);
    // Returns immediately; actual display happens at next vsync.
}
Demo 1 — Window & GPU Hardware Info
Before drawing a pixel, get Vulkan alive and reading your hardware. This demo creates a window, finds your GPU, and prints everything about it that OpenGL never told you.
#define GLFW_INCLUDE_VULKAN
#include <GLFW/glfw3.h>
#include <vulkan/vulkan.h>
#include <iostream>
#include <vector>
#include <stdexcept>

int main() {
    std::cout << "\n=== V01: First Vulkan Contact ===\n\n";

    // ── Init GLFW ────────────────────────────────────────────────────
    glfwInit();
    glfwWindowHint(GLFW_CLIENT_API, GLFW_NO_API); // KEY difference from OpenGL
    glfwWindowHint(GLFW_RESIZABLE, GLFW_FALSE);
    GLFWwindow* win = glfwCreateWindow(800, 600, "V01 Vulkan Window", nullptr, nullptr);

    // ── Create VkInstance ────────────────────────────────────────────
    uint32_t ec = 0;
    auto exts = glfwGetRequiredInstanceExtensions(&ec);
    VkApplicationInfo ai{};
    ai.sType      = VK_STRUCTURE_TYPE_APPLICATION_INFO;
    ai.apiVersion = VK_API_VERSION_1_3;
    VkInstanceCreateInfo ici{};
    ici.sType                   = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
    ici.pApplicationInfo        = &ai;
    ici.enabledExtensionCount   = ec;
    ici.ppEnabledExtensionNames = exts;
    VkInstance inst;
    if (vkCreateInstance(&ici, nullptr, &inst) != VK_SUCCESS)
        throw std::runtime_error("vkCreateInstance failed");
    std::cout << "[1] VkInstance created\n";

    // ── Create Window Surface ────────────────────────────────────────
    VkSurfaceKHR surf;
    glfwCreateWindowSurface(inst, win, nullptr, &surf);
    std::cout << "[2] VkSurfaceKHR created\n";

    // ── Enumerate GPUs (two-call pattern) ────────────────────────────
    uint32_t cnt = 0;
    vkEnumeratePhysicalDevices(inst, &cnt, nullptr);
    std::vector<VkPhysicalDevice> gpus(cnt);
    vkEnumeratePhysicalDevices(inst, &cnt, gpus.data());
    std::cout << "\n[3] GPUs found: " << cnt << "\n";
    VkPhysicalDevice chosen = VK_NULL_HANDLE;
    for (auto& g : gpus) {
        VkPhysicalDeviceProperties p;
        vkGetPhysicalDeviceProperties(g, &p);
        std::cout << "  Device: " << p.deviceName << " | Vulkan "
                  << VK_VERSION_MAJOR(p.apiVersion) << "."
                  << VK_VERSION_MINOR(p.apiVersion);
        if (p.deviceType == VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU) {
            std::cout << " [DISCRETE, selected]\n";
            chosen = g;
        } else {
            std::cout << " [integrated]\n";
            if (!chosen) chosen = g;
        }
    }
    // Guard: without a device, the memory query below would use a null handle.
    if (chosen == VK_NULL_HANDLE)
        throw std::runtime_error("No Vulkan-capable GPU found");

    // ── Print memory heaps: OpenGL never showed you this ─────────────
    VkPhysicalDeviceMemoryProperties mp;
    vkGetPhysicalDeviceMemoryProperties(chosen, &mp);
    std::cout << "\n  Memory heaps:\n";
    for (uint32_t i = 0; i < mp.memoryHeapCount; i++) {
        float gb = mp.memoryHeaps[i].size / 1e9f;
        bool loc = mp.memoryHeaps[i].flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT;
        std::cout << "    Heap " << i << ": " << gb << " GB"
                  << (loc ? " [GPU-local FAST]" : " [CPU-visible]") << "\n";
    }

    // ── Render loop ──────────────────────────────────────────────────
    std::cout << "\n[4] Window open. Press ESC to quit.\n";
    while (!glfwWindowShouldClose(win)) {
        glfwPollEvents();
        if (glfwGetKey(win, GLFW_KEY_ESCAPE) == GLFW_PRESS)
            glfwSetWindowShouldClose(win, true);
    }

    // ── Cleanup: REVERSE order of creation ───────────────────────────
    vkDestroySurfaceKHR(inst, surf, nullptr);
    vkDestroyInstance(inst, nullptr);
    glfwDestroyWindow(win);
    glfwTerminate();
    std::cout << "[5] Done.\n";
}
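Demo 1 prints the device's Vulkan version through `VK_VERSION_MAJOR` and `VK_VERSION_MINOR`. Those macros simply unpack bit fields from the 32-bit `apiVersion` value. The sketch below reproduces that packing in plain C++ so the layout is visible. `ApiVersion`, `makeVersion`, `decodeVersion`, and `toString` are hypothetical helper names, and the sketch assumes the variant bits at the top of the word are zero, as they are for standard Vulkan versions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical holder for the three decoded fields.
struct ApiVersion {
    uint32_t majorVer;
    uint32_t minorVer;
    uint32_t patchVer;
};

// Pack like VK_MAKE_VERSION: major from bit 22 up, minor in bits 12..21, patch in bits 0..11.
inline uint32_t makeVersion(uint32_t majorVer, uint32_t minorVer, uint32_t patchVer) {
    assert(minorVer < 1024 && patchVer < 4096); // must fit their bit fields
    return (majorVer << 22) | (minorVer << 12) | patchVer;
}

// Unpack like VK_VERSION_MAJOR / VK_VERSION_MINOR / VK_VERSION_PATCH.
inline ApiVersion decodeVersion(uint32_t packed) {
    return {packed >> 22, (packed >> 12) & 0x3FFu, packed & 0xFFFu};
}

inline std::string toString(const ApiVersion& v) {
    return std::to_string(v.majorVer) + "." + std::to_string(v.minorVer) + "." +
           std::to_string(v.patchVer);
}
```

Feeding `p.apiVersion` from `VkPhysicalDeviceProperties` into `decodeVersion` would give the same major and minor numbers the demo prints, plus the patch level it omits.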
Demo 2 — Hello Triangle
Every object from Chapters 11–19 assembled into one working programme. This is the foundation for all subsequent Vulkan work — the "Hello World" of the graphics pipeline.
Step 1 — Write and compile the shaders first
// shaders/triangle.vert
#version 450
layout(location = 0) in vec2 inPos;
layout(location = 1) in vec3 inColor;
layout(location = 0) out vec3 fragColor;

void main() {
    gl_Position = vec4(inPos.x, -inPos.y, 0.0, 1.0); // negate Y: Vulkan NDC flip
    fragColor = inColor;
}

// shaders/triangle.frag
#version 450
layout(location = 0) in vec3 fragColor;
layout(location = 0) out vec4 outColor;

void main() {
    outColor = vec4(fragColor, 1.0);
}

glslc shaders/triangle.vert -o shaders/triangle.vert.spv
glslc shaders/triangle.frag -o shaders/triangle.frag.spv
An 800×600 window with a solid RGB triangle (red top, green bottom-right, blue bottom-left) on a dark navy background. The triangle is static. Terminal shows [1] through [14] as each Vulkan object is created, then "Rendering. ESC to quit." Press ESC to cleanly shut down.
Demo 3 — Animated Triangle with Push Constants
Push constants are typically the fastest way to send per-draw-call data to shaders. No buffer, no descriptor set, no allocation: just a few bytes (the spec guarantees room for at least 128) written directly into the command buffer recording.
What Push Constants Are
Push constants are a small block of data that you push directly into the command buffer before a draw call. The spec guarantees at least 128 bytes (the maxPushConstantsSize device limit); many GPUs expose 256. The GPU receives the data inline: no buffer allocation, no descriptor set binding, no memory management. Perfect for per-object matrices, animation time, or any small rapidly-changing value.
#version 450
layout(location = 0) in vec2 inPos;
layout(location = 1) in vec3 inColor;
layout(location = 0) out vec3 fragColor;

// Push constant block: small data pushed from CPU each draw call
// layout(push_constant): declares this is a push constant block (not UBO)
layout(push_constant) uniform Push {
    float time; // seconds since app start, drives rotation angle
} push;

void main() {
    float c = cos(push.time), s = sin(push.time);
    // 2D rotation matrix: [c -s] [x]   [cx - sy]
    //                     [s  c] [y] = [sx + cy]
    vec2 r = vec2(c * inPos.x - s * inPos.y,
                  s * inPos.x + c * inPos.y);
    gl_Position = vec4(r.x, -r.y, 0.0, 1.0);
    fragColor = inColor;
}
// ── Addition 1: Declare push constant range in pipeline layout ───────
struct PushData { float time; }; // must match the GLSL block exactly
VkPushConstantRange pcRange{};
pcRange.stageFlags = VK_SHADER_STAGE_VERTEX_BIT; // which shader accesses it
pcRange.offset     = 0;
pcRange.size       = sizeof(PushData);           // 4 bytes: one float
VkPipelineLayoutCreateInfo plCI{};
plCI.sType                  = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
plCI.pushConstantRangeCount = 1;
plCI.pPushConstantRanges    = &pcRange;
// Now the pipeline knows: "vertex shader expects 4 bytes of push constants"
vkCreatePipelineLayout(device, &plCI, nullptr, &pipelineLayout);

// ── Addition 2: Push data before every vkCmdDraw (in recordCommandBuffer) ─
PushData pd{ (float)glfwGetTime() }; // current time in seconds
vkCmdPushConstants(
    cmd,                        // command buffer
    pipelineLayout,             // layout that declared the push constant range
    VK_SHADER_STAGE_VERTEX_BIT, // must match the range's stageFlags
    0,                          // byte offset into the push constant range
    sizeof(PushData),           // bytes to push
    &pd                         // pointer to the data
);
// Data is written INTO the command buffer recording, no separate transfer.
// When submitted, the GPU receives pd.time in push.time every frame.
vkCmdDraw(cmd, 3, 1, 0, 0);
The same RGB triangle from Demo 2 now spins continuously at ~1 radian/second. The same glfwGetTime() function you used in OpenGL for animation — the same seconds-since-start value — now drives Vulkan rotation via 4 bytes of push constant data. No uniform buffer, no descriptor set, just inline data in the command recording.
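Two things in Demo 3 are easy to get wrong without noticing: the size of the push constant struct, and the rotation math in the shader. The sketch below checks both on the CPU. The `static_assert` lines encode two rules (a push constant range size must be a multiple of 4, and 128 bytes is the guaranteed minimum budget), and `rotateAndFlip` mirrors the vertex shader so the expected output position can be computed by hand. `Vec2`, `rotateAndFlip`, and `kGuaranteedPushBytes` are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Same struct as in the pipeline layout code: must match the GLSL block.
struct PushData { float time; };

// Minimum maxPushConstantsSize every Vulkan device must support.
constexpr std::size_t kGuaranteedPushBytes = 128;

static_assert(sizeof(PushData) % 4 == 0,
              "push constant range size must be a multiple of 4");
static_assert(sizeof(PushData) <= kGuaranteedPushBytes,
              "PushData exceeds the guaranteed push constant budget");

struct Vec2 { float x, y; };

// CPU mirror of the Demo 3 vertex shader: rotate by `time` radians,
// then negate Y for Vulkan's downward-pointing Y axis.
Vec2 rotateAndFlip(Vec2 p, float time) {
    assert(std::isfinite(time));
    float c = std::cos(time), s = std::sin(time);
    Vec2 r{c * p.x - s * p.y, s * p.x + c * p.y};
    return {r.x, -r.y};
}
```

At `time = 0` a vertex at (1, 0) stays at (1, 0); a quarter turn later (`time` ≈ π/2) the rotation sends it to (0, 1) and the flip lands it at (0, -1), which is the top of the screen in Vulkan NDC.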
OpenGL → Vulkan Complete Mapping
Every OpenGL concept you learned in Days 1–3 has a direct Vulkan equivalent. This is your translation dictionary.
| Concept | OpenGL | Vulkan | Key Change |
|---|---|---|---|
| Init | glewInit() | vkCreateInstance() | Explicit version, extensions, layers |
| Window | GLFW_CONTEXT_VERSION_MAJOR → context | GLFW_CLIENT_API=GLFW_NO_API → surface | No GL context; surface bridges Vulkan to OS window |
| GPU ref | Implicit single context | VkPhysicalDevice (query) + VkDevice (create) | You choose GPU; physical=hardware, logical=interface |
| Frame present | glfwSwapBuffers() | vkAcquireNextImageKHR + vkQueuePresentKHR | Explicit image ring management |
| Shader compile | glCompileShader() at runtime | glslc at build time → .spv | No runtime compilation = zero stutter |
| Render state | glEnable, glDepthFunc, glBlendFunc (mutable) | VkPipeline (immutable) | All state baked once; zero cost to use |
| VBO creation | glGenBuffers + glBufferData | vkCreateBuffer + vkAllocateMemory + vkBindBufferMemory | Explicit memory type choice per buffer |
| Vertex layout | glVertexAttribPointer per draw | VkVertexInputAttributeDescription in pipeline | Declared once at pipeline creation |
| Clear | glClear(GL_COLOR_BUFFER_BIT) | loadOp=CLEAR in render pass + VkClearValue | Intent declared upfront, not per frame |
| Draw call | glDrawArrays() — immediate | vkCmdDraw() → recorded → vkQueueSubmit | Record now, execute when submitted |
| Uniforms (small) | glUniform*() per frame | vkCmdPushConstants per draw | Fastest per-draw data; no buffer needed |
| Uniforms (large) | glUniformBlockBinding + UBO | VkDescriptorSet + VkBuffer (uniform) | Explicit binding point and set/binding slots |
| Textures | glGenTextures + glTexImage2D | VkImage + VkImageView + VkSampler + Descriptor | Image data, view desc, and sampling separated |
| Depth test | glEnable(GL_DEPTH_TEST) | VkPipelineDepthStencilStateCreateInfo | Declared in pipeline; not global state |
| Sync | Implicit (driver) | VkSemaphore (GPU-GPU) + VkFence (GPU-CPU) | You own all sync decisions |
The Y-Axis Flip — Three Solutions
gl_Position = vec4(pos.x, -pos.y, pos.z, 1.0): simplest fix for learning. Zero extra setup.
viewport.y = (float)height; viewport.height = -(float)height; Requires VK_KHR_maintenance1 (included in Vulkan 1.1+). Correct for GLM projection matrices.
proj[1][1] *= -1; after glm::perspective(). Most common pattern when porting OpenGL GLM code.
Common Errors & Fixes
The validation layer catches ~95% of Vulkan errors. Here are the remaining 5% and the common mistakes that trip up every beginner, with exact fixes.
| Symptom | Cause | Fix |
|---|---|---|
| Black window, no triangle | .spv files not found at runtime | Run exe from project root (where shaders/ lives), not from build\Release\ |
| VK_LAYER_KHRONOS_validation not found | Validation layer not installed | Reinstall LunarG SDK. Verify: vulkaninfo --summary in command prompt. |
| vkCreateInstance fails: EXTENSION_NOT_PRESENT | Requested extension not supported by driver | Call vkEnumerateInstanceExtensionProperties first to check availability. |
| Triangle appears upside down | Vulkan Y-axis is opposite to OpenGL | Negate Y in vertex shader: gl_Position.y *= -1.0f; |
| VK_ERROR_OUT_OF_DATE_KHR from AcquireNextImage | Window was resized — swapchain is stale | Recreate swapchain, image views, and framebuffers. Handle VK_SUBOPTIMAL_KHR the same way. |
| "sType is VK_STRUCTURE_TYPE_MAX_ENUM" | Forgot {} initialisation or sType field | Add {} to every struct declaration. Set sType explicitly every time. |
| App hangs forever (CPU at vkWaitForFences) | Fence never signaled — usually a failed vkQueueSubmit | Always check vkQueueSubmit return value. Fence only signals on successful submit. |
| Validation: "vkDestroyDevice: object not destroyed" | Destroying VkDevice before its children | Destroy children before parents. Exact reverse of creation order. |
| glslc: command not found | SDK Bin not in PATH | Add C:\VulkanSDK\1.3.xxx\Bin to system PATH. Restart terminal after. |
| find_package(Vulkan REQUIRED) fails in CMake | VULKAN_SDK env variable not set | System Properties → Environment Variables → add VULKAN_SDK = C:\VulkanSDK\1.3.xxx |
| vkCreateGraphicsPipelines: pNext chain error | Old pipeline cache handle is stale | Pass VK_NULL_HANDLE for pipeline cache unless you are explicitly managing one. |
| Triangle flickers or tears | Missing semaphore wait in vkQueuePresentKHR | Ensure renderFinishedSemaphore is in pWaitSemaphores of VkPresentInfoKHR. |
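Two rows of the table interact: handling a stale swapchain and the "hangs forever at vkWaitForFences" symptom. If a frame is skipped after `vkAcquireNextImageKHR` reports an out-of-date swapchain, but the fence was already reset, nothing will ever signal it and the next wait blocks. One common approach is to decide what to do with the frame first and reset the fence only when work will really be submitted. The sketch below captures that decision as a pure function. `AcquireResult`, `FrameAction`, `decideFrameAction`, and `shouldResetFence` are hypothetical names standing in for the relevant `VkResult` values and the application's own control flow; treating the suboptimal case as "render this frame, then recreate" is one reasonable policy, not the only one.

```cpp
#include <cassert>

// Hypothetical stand-ins for VK_SUCCESS, VK_SUBOPTIMAL_KHR,
// VK_ERROR_OUT_OF_DATE_KHR and any other error code.
enum class AcquireResult { Success, Suboptimal, OutOfDate, Error };

enum class FrameAction {
    Render,              // normal frame
    RenderThenRecreate,  // image was acquired; draw it, rebuild swapchain afterwards
    RecreateAndSkip,     // no usable image; rebuild swapchain, submit nothing
    Abort                // unrecoverable
};

FrameAction decideFrameAction(AcquireResult result) {
    switch (result) {
        case AcquireResult::Success:    return FrameAction::Render;
        case AcquireResult::Suboptimal: return FrameAction::RenderThenRecreate;
        case AcquireResult::OutOfDate:  return FrameAction::RecreateAndSkip;
        case AcquireResult::Error:      return FrameAction::Abort;
    }
    assert(false && "unhandled AcquireResult");
    return FrameAction::Abort;
}

// Reset the in-flight fence only if a submit will follow and signal it again.
bool shouldResetFence(FrameAction action) {
    return action == FrameAction::Render ||
           action == FrameAction::RenderThenRecreate;
}
```

In a frame function this would mean calling `vkResetFences` after the acquire result has been inspected rather than immediately after `vkWaitForFences`.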
The Golden Rule of Vulkan Development
Enable VK_LAYER_KHRONOS_validation in every programme during development. Every demo, every prototype, every lab exercise. The layer adds CPU overhead, which is acceptable in debug builds, and it catches the large majority of API mistakes: wrong struct type, wrong argument count, wrong destruction order, and many synchronisation errors. The message format is: [VUID-VkXxx-field-parameter]: description of exactly what went wrong. Read it fully before searching the internet. Disable the layer only in your final release build.
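Requesting a layer that is not installed makes instance creation fail, so it helps to check availability before enabling it. The following is a minimal sketch of that check. `LayerInfo` is a hypothetical stand-in for `VkLayerProperties` (only the name field is modelled), and `hasLayer` / `layersToEnable` are illustrative helpers; in a real programme the list would come from `vkEnumerateInstanceLayerProperties` using the same two-call pattern as GPU enumeration.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical stand-in for VkLayerProperties: just the layer name.
struct LayerInfo {
    char layerName[256];
};

// True if `wanted` appears in the list of installed layers.
bool hasLayer(const std::vector<LayerInfo>& available, const char* wanted) {
    assert(wanted != nullptr);
    for (const LayerInfo& layer : available)
        if (std::strcmp(layer.layerName, wanted) == 0)
            return true;
    return false;
}

// Layer names to pass at instance creation: validation in debug builds
// when it is installed, nothing otherwise.
std::vector<const char*> layersToEnable(const std::vector<LayerInfo>& available,
                                        bool debugBuild) {
    static const char* const kValidation = "VK_LAYER_KHRONOS_validation";
    std::vector<const char*> enabled;
    if (debugBuild && hasLayer(available, kValidation))
        enabled.push_back(kValidation);
    return enabled;
}
```

The returned vector's `size()` and `data()` would feed `enabledLayerCount` and `ppEnabledLayerNames` in `VkInstanceCreateInfo`; an empty result in a debug build is a hint that the SDK is missing, which is worth printing as a warning.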
Validation Layer Output — What to Look For
// ── Good message (tells you exactly what is wrong) ───────────────────
VALIDATION ERROR: vkCreateGraphicsPipelines():
  pCreateInfos[0].pVertexInputState->vertexBindingDescriptionCount (0)
  is not greater than 0. VUID-VkGraphicsPipelineCreateInfo-pStages-...
// → Fix: set vertexBindingDescriptionCount = 1 and pVertexBindingDescriptions

// ── sType error (forgot {} or forgot to set sType) ───────────────────
VALIDATION ERROR: vkCreateInstance(): pCreateInfo->sType
  (2147483647 = VK_STRUCTURE_TYPE_MAX_ENUM) is not equal to
  VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO.
// → Fix: VkInstanceCreateInfo ci{}; and ci.sType = VK_STRUCTURE_TYPE_...

// ── Destruction order error ──────────────────────────────────────────
VALIDATION ERROR: vkDestroyDevice: OBJ ERROR: For VkDevice, object 0x...,
  has a child object VkPipeline that has not been destroyed.
// → Fix: vkDestroyPipeline before vkDestroyDevice
RR Skillverse — Complete OpenGL & Vulkan Handbook
By Raushan Ranjan · MCT | Senior Corporate Trainer · RR Skillverse, Noida
"Sweat in the right direction brings Peace, Money, and Respect."
References & Study Guide
Every authoritative resource curated and ranked — with a clear week-by-week study path so you know exactly what to read next, why, and where to find it.
Study Guide
The Learning Path — 5 Phases
Graphics programming is built in layers. Skip a layer and the next one feels arbitrary. Follow this sequence:
This handbook Ch 01–02
This handbook Ch 03–24
Khronos Tutorial Pt 2
Sellers + Spec PDF
Cookbook + NVIDIA
Every beginner who skips Phase 1 and jumps straight to Vulkan spends weeks confused about what Vulkan is replacing. The mental models from Phase 1 are the vocabulary that makes Phase 2 click. Do not skip it, even if you are in a hurry.
Free Official Resources
Maintained by Khronos — the organisation that owns the Vulkan specification. These are canonical and always up to date.
Books — PDF & eBook
Essential Developer Tools
Install these once. Use them on every project.
Quick Lookup — Which Resource Right Now?
| I need to… | Best resource | Go there |
|---|---|---|
| Understand a concept with analogies | This handbook (you're here) | ← scroll up |
| See a complete running C++ triangle demo | Khronos Official Tutorial | docs.vulkan.org ↗ |
| Look up an OpenGL function signature | docs.gl | docs.gl ↗ |
| Look up a Vulkan function signature | Vulkan Man Pages | registry ↗ |
| Decode a validation layer VUID error | Vulkan Spec PDF (Ctrl+F the VUID) | PDF ↗ |
| See the same concept done 10 different ways | Sascha Willems Examples | GitHub ↗ |
| Deep-dive extensions, sync, SPIR-V | Khronos Vulkan Guide | guide ↗ |
| Add textures / 3D model loading | Khronos Tutorial Pt 2 | docs.vulkan.org ↗ |
| Learn XR / ray tracing / GPU-driven render | Modern Vulkan Cookbook | Packt ↗ |
| Debug wrong pixels visually (not via code) | RenderDoc | renderdoc.org ↗ |
| Learn OpenGL from absolute scratch | LearnOpenGL.com | learnopengl.com ↗ |
| Understand GPU/memory performance on NVIDIA | NVIDIA Vulkan Do's & Don'ts | NVIDIA blog ↗ |