How a Browser Works – A Beginner‑Friendly Guide to Browser Internals

I'm a software engineer who graduated in computer engineering in 2025. I mostly do full-stack development with Angular, Node.js, Express.js and MongoDB, and I'm currently getting hands-on with React and Next.js. Connect with me on LinkedIn: https://linkedin.com/in/amanpatel2529

You open Chrome or Firefox, type a URL like chaicode.com, hit Enter… and within a second, a full website appears.

From the outside it looks like magic.
Inside, it’s just a lot of small components following a fixed flow:

URL → Network → HTML/CSS/JS → Parsers + Content Sink → DOM/CSSOM
→ Frame constructor → Reflow → Painting → Display

In this blog we’ll build a simple, story‑style model of that flow and also see where React’s Virtual DOM fits into it.


Browser as a collection of components

A browser is a program whose job is:

Take web content (HTML, CSS, JavaScript, images, etc.)
and turn it into something you can see and interact with.

The main parts are:

  • User Interface (UI)
    The outer shell you interact with: address bar, back/forward buttons, tabs, refresh icon, etc.

  • Browser Engine
    A small co‑ordinator. It talks to the UI and tells the rendering engine what to load, what to re‑render, when to scroll, etc.
    Think of it as the manager between the “outside” (UI) and the “inside” (rendering engine, networking).

  • Rendering Engine (often called Browser Rendering Engine)
    This is what Hitesh Choudhary Sir often calls the “browser engine or rendering engine”.
    Its simple job:

    Take HTML + CSS + JS results and convert them into pixels on your screen.
    It parses, builds trees, runs layout (reflow), paints, and hands the final result to be displayed.

  • Networking Module
    Handles HTTP/HTTPS, DNS, caching and low‑level communication with servers.

  • JavaScript Engine
    Runs JavaScript code and lets pages change dynamically.

You can imagine:

  • UI = reception

  • Browser engine = manager

  • Rendering engine = design + layout + painting team

  • Networking = courier team

  • JS engine = automation/logic team

They all work together to show one web page.


What happens after you type a URL and press Enter?

You type chaicode.com in the address bar and press Enter.

The browser:

  • Checks cache first: “Do I already have this page stored?”

  • If not, networking does a DNS lookup to turn chaicode.com into an IP address.

  • Opens a TCP connection to that IP (3‑way handshake).

  • For HTTPS, performs a TLS handshake (certificate check + key exchange) so the connection is encrypted.

Then it sends an HTTP request:

GET / HTTP/1.1
Host: chaicode.com

The server replies with a status line, headers and a body (the HTML, which in turn references CSS/JS files).
Now the rendering engine starts its pipeline.
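
To make the request step concrete, here is a minimal TypeScript sketch of how the text typed in the address bar could become a plain-text HTTP/1.1 request. The helper name buildGetRequest is made up for this post, and the "add https:// when the scheme is missing" rule is a simplifying assumption. Real browsers send many more headers and often use HTTP/2 or HTTP/3, which are binary formats.

```typescript
// Hypothetical helper: turns address-bar text into the plain-text
// HTTP/1.1 request that asks the server for that page.
function buildGetRequest(typedUrl: string): string {
  // Assumption: when the user leaves out the scheme, use https.
  const withScheme = typedUrl.includes("://") ? typedUrl : `https://${typedUrl}`;
  const parsed = new URL(withScheme);
  const path = parsed.pathname + parsed.search;
  // HTTP/1.1 lines end with \r\n, and an empty line ends the header block.
  return [
    `GET ${path} HTTP/1.1`,
    `Host: ${parsed.host}`,
    "Connection: close",
    "",
    "",
  ].join("\r\n");
}

const requestText = buildGetRequest("chaicode.com");
```

Here requestText starts with the line GET / HTTP/1.1 followed by Host: chaicode.com, which is the same request shown above with one extra header.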


HTML parsing → content sink → DOM

The HTML from the server is just text.
The browser has to convert this text into a structured model.

Parsing the HTML

Parsing is like reading a math expression:

2 + 3 * 4

Your brain groups it as 3 * 4 first, then adds 2.
If you draw that grouping as a tree, you’ve basically “parsed” it.
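
If you want to see that grouping as code, here is a small sketch of a parser that turns 2 + 3 * 4 into a tree. It only understands non-negative whole numbers, + and *, and the names (ExprTree, parseExpression, evaluateExpr) are invented for this example.

```typescript
// A parsed expression is either a number or an operator with two children.
type ExprTree =
  | { kind: "num"; value: number }
  | { kind: "op"; op: "+" | "*"; left: ExprTree; right: ExprTree };

// Parses expressions made of whole numbers, "+" and "*".
// Any other characters are simply ignored by the tokenizer.
function parseExpression(source: string): ExprTree {
  const tokens: string[] = source.match(/\d+|[+*]/g) ?? [];
  let pos = 0;

  function parseNumber(): ExprTree {
    const token = tokens[pos++];
    if (token === undefined || !/^\d+$/.test(token)) {
      throw new Error(`Expected a number at token ${pos - 1}`);
    }
    return { kind: "num", value: Number(token) };
  }

  // term := number ("*" number)*   ("*" binds tighter than "+")
  function parseTerm(): ExprTree {
    let node = parseNumber();
    while (tokens[pos] === "*") {
      pos++;
      node = { kind: "op", op: "*", left: node, right: parseNumber() };
    }
    return node;
  }

  // sum := term ("+" term)*
  function parseSum(): ExprTree {
    let node = parseTerm();
    while (tokens[pos] === "+") {
      pos++;
      node = { kind: "op", op: "+", left: node, right: parseTerm() };
    }
    return node;
  }

  const tree = parseSum();
  if (pos !== tokens.length) throw new Error("Unexpected trailing input");
  return tree;
}

// Walks the tree bottom-up to compute the result.
function evaluateExpr(tree: ExprTree): number {
  if (tree.kind === "num") return tree.value;
  const left = evaluateExpr(tree.left);
  const right = evaluateExpr(tree.right);
  return tree.op === "+" ? left + right : left * right;
}

const exprTree = parseExpression("2 + 3 * 4");
const exprValue = evaluateExpr(exprTree); // 14
```

The root of exprTree is the + node, and 3 * 4 hangs under it as its right child. That is the "multiply first" grouping, written down as a tree.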

The HTML parser does something similar:

<body>
  <h1>Hello</h1>
  <p>Welcome to the web</p>
</body>

It figures out:

  • <body> contains <h1> and <p>

  • <h1> and <p> are siblings, etc.

Role of the content sink

HTML → HTML Parser → Content Sink → DOM

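The HTML parser only works out the structure: which tags open, close and nest inside each other.
The content sink (Gecko's name for this step; other engines have an equivalent tree-building stage) is an internal helper that: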

  • Takes the parser’s understanding

  • Creates real DOM nodes in memory

  • Connects them as parent/child to form the DOM tree

So the content sink is like a construction crew:
the parser provides the blueprints, and the content sink builds the actual building.

The final result is the DOM (Document Object Model) – a tree of elements:

  • document

    • html

      • body

        • h1

        • p

This is how the browser internally represents “what is on the page”.
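
The parser → content sink split can be sketched in a few lines of TypeScript. This is a toy model, not how any real engine is written: it assumes well-formed tags without attributes, and unlike a real HTML parser it does not add implied elements such as html. All names (MiniDomNode, tokenizeHtml, buildDomTree, outlineDom) are hypothetical.

```typescript
// Simplified DOM node. Text nodes use the tag "#text".
interface MiniDomNode {
  tag: string;
  text?: string;
  children: MiniDomNode[];
}

interface HtmlToken {
  type: "open" | "close" | "text";
  value: string; // tag name, or the text itself
}

// "Parser" half: split the markup into open/close/text tokens.
function tokenizeHtml(html: string): HtmlToken[] {
  const tokens: HtmlToken[] = [];
  const pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const [, slash, tag, text] = match;
    if (tag !== undefined) {
      tokens.push({ type: slash === "/" ? "close" : "open", value: tag.toLowerCase() });
    } else if (text !== undefined && text.trim() !== "") {
      tokens.push({ type: "text", value: text.trim() });
    }
  }
  return tokens;
}

// "Content sink" half: consume tokens and connect real node objects.
function buildDomTree(tokens: HtmlToken[]): MiniDomNode {
  const root: MiniDomNode = { tag: "document", children: [] };
  const openElements: MiniDomNode[] = [root]; // stack of currently open elements
  for (const token of tokens) {
    const current = openElements[openElements.length - 1] ?? root;
    if (token.type === "open") {
      const element: MiniDomNode = { tag: token.value, children: [] };
      current.children.push(element);
      openElements.push(element);
    } else if (token.type === "close") {
      if (openElements.length > 1) openElements.pop(); // never pop the document
    } else {
      current.children.push({ tag: "#text", text: token.value, children: [] });
    }
  }
  return root;
}

// Returns the tree as indented lines, one per node.
function outlineDom(node: MiniDomNode, depth = 0): string[] {
  const label = node.tag === "#text" ? `"${node.text ?? ""}"` : node.tag;
  const lines = [`${"  ".repeat(depth)}${label}`];
  for (const child of node.children) lines.push(...outlineDom(child, depth + 1));
  return lines;
}

const domTree = buildDomTree(
  tokenizeHtml("<body><h1>Hello</h1><p>Welcome to the web</p></body>")
);
const domOutline = outlineDom(domTree);
```

For the example markup, domOutline lists document, then body, then h1 and p as siblings, each with its text nested underneath. It matches the tree drawn above, minus the html element that a real parser would add for you.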


CSS parsing → content sink → CSSOM

The page also needs styles, so the browser fetches CSS files listed in the HTML:

<link rel="stylesheet" href="styles.css">

The CSS parser reads the CSS text:

h1 {
  color: red;
}
p {
  font-size: 16px;
}

It understands:

  • Selectors (h1, p, .btn, etc.)

  • Declarations (properties + values)

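The same builder role shows up again. In this simplified model we call it a content sink here too, although real engines often let the CSS parser create the rule objects directly: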

CSS → CSS Parser → Content Sink → CSSOM

Here:

  • The CSS parser figures out the rules.

  • The content sink builds the actual CSSOM (CSS Object Model) in memory from those rules.

So after processing HTML and CSS, the rendering engine has:

  • DOM – which elements exist

  • CSSOM – which styles should apply
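
As a rough illustration of "CSS text in, rule objects out", here is a tiny TypeScript sketch. It only handles flat rules of the form selector { property: value; } and skips comments, at-rules and nesting. MiniCssRule and parseCssRules are made-up names, and the real CSSOM is much richer than this.

```typescript
// One parsed rule: a selector plus its property/value pairs.
interface MiniCssRule {
  selector: string;
  declarations: Record<string, string>;
}

// Parses flat rules only. Comments, at-rules and nested blocks
// are deliberately not handled in this sketch.
function parseCssRules(css: string): MiniCssRule[] {
  const rules: MiniCssRule[] = [];
  const rulePattern = /([^{}]+)\{([^{}]*)\}/g;
  let match: RegExpExecArray | null;
  while ((match = rulePattern.exec(css)) !== null) {
    const selector = (match[1] ?? "").trim();
    const body = match[2] ?? "";
    const declarations: Record<string, string> = {};
    for (const part of body.split(";")) {
      const colon = part.indexOf(":");
      if (colon === -1) continue; // skip empty or malformed pieces
      const property = part.slice(0, colon).trim();
      const value = part.slice(colon + 1).trim();
      if (property !== "") declarations[property] = value;
    }
    rules.push({ selector, declarations });
  }
  return rules;
}

const cssRules = parseCssRules(`
  h1 { color: red; }
  p { font-size: 16px; }
`);
```

cssRules ends up with two entries: one for h1 with color set to red, and one for p with font-size set to 16px.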


From DOM and CSSOM to the Frame Constructor

Now the rendering engine must turn “elements + styles” into something that can be physically laid out and drawn.

The next phase looks like this:

DOM + CSSOM → Frame constructor → Frame (render) tree → Reflow → Painting → Display

Frame Constructor: building the render/frame tree

The frame constructor takes:

  • The DOM (structure of elements)

  • The CSSOM (rules and computed styles)

and creates a frame / render tree:

  • Filters out non‑visual nodes (like <head>, or elements styled with display: none)

  • For each visible DOM node, creates a corresponding frame with its styles attached

Think of it as:

“Create a list of all the visible boxes we need to lay out and paint.”

This frame/render tree is the input to the layout and painting steps.
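
Here is a minimal sketch of that filtering step in TypeScript. It assumes each DOM node already carries its computed style, which skips the whole job of matching CSS rules to elements. The names (StyledNode, RenderFrame, constructFrame) and the list of non-visual tags are illustrative only.

```typescript
// A DOM node with its computed style already attached (an assumption:
// matching CSSOM rules to elements is not shown here).
interface StyledNode {
  tag: string;
  style: Record<string, string>;
  children: StyledNode[];
}

// One box in the frame/render tree.
interface RenderFrame {
  tag: string;
  style: Record<string, string>;
  children: RenderFrame[];
}

// Elements that never produce a box. Illustrative, not a complete list.
const NON_VISUAL_TAGS = new Set(["head", "script", "style", "meta", "title", "link"]);

// Returns the frame for a node, or null when the node and its
// whole subtree should not be rendered.
function constructFrame(node: StyledNode): RenderFrame | null {
  if (NON_VISUAL_TAGS.has(node.tag)) return null;
  if (node.style["display"] === "none") return null;
  const children: RenderFrame[] = [];
  for (const child of node.children) {
    const frame = constructFrame(child);
    if (frame !== null) children.push(frame);
  }
  return { tag: node.tag, style: node.style, children };
}

// Flattens a frame tree into a list of tag names (parents before children).
function listFrameTags(frame: RenderFrame): string[] {
  const tags = [frame.tag];
  for (const child of frame.children) tags.push(...listFrameTags(child));
  return tags;
}

const styledDom: StyledNode = {
  tag: "html", style: {}, children: [
    { tag: "head", style: {}, children: [{ tag: "title", style: {}, children: [] }] },
    { tag: "body", style: {}, children: [
      { tag: "h1", style: { color: "red" }, children: [] },
      { tag: "p", style: { display: "none" }, children: [] },
      { tag: "p", style: { "font-size": "16px" }, children: [] },
    ] },
  ],
};

const frameTree = constructFrame(styledDom);
const visibleTags = frameTree === null ? [] : listFrameTags(frameTree);
```

visibleTags comes out as html, body, h1, p. The head subtree and the hidden paragraph exist in the DOM but get no frame.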


Reflow (layout), painting and display

With the frame tree ready, the rendering engine now decides where and how big everything is, and then draws it.

Reflow / Layout

Reflow (or layout) is the process of calculating:

  • Width and height of each frame

  • Its position on the page (x, y coordinates)

Analogy: laying out furniture in a room.

  • You know which items you have (frames)

  • Reflow decides where to place each item and how much space it occupies

If the page or window changes (e.g., resize, content changes, new DOM nodes), parts of the tree might need to reflow again.
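
A very reduced version of layout can be sketched like this. It assumes block layout only: every box takes the full available width, children stack from top to bottom, leaf heights are given up front, and there are no margins, padding or text wrapping. LayoutInput, LaidOutBox and layoutBlock are hypothetical names.

```typescript
// Input to layout: a box tree where leaf heights are assumed to be known.
interface LayoutInput {
  tag: string;
  height?: number; // intrinsic height in px, used for leaf boxes only
  children: LayoutInput[];
}

// Output of layout: every box has a position and a size.
interface LaidOutBox {
  tag: string;
  x: number;
  y: number;
  width: number;
  height: number;
  children: LaidOutBox[];
}

// Block layout only: each box fills the available width and
// children are stacked top to bottom.
function layoutBlock(node: LayoutInput, x: number, y: number, width: number): LaidOutBox {
  const children: LaidOutBox[] = [];
  let cursorY = y;
  for (const child of node.children) {
    const box = layoutBlock(child, x, cursorY, width);
    children.push(box);
    cursorY += box.height;
  }
  // A container is as tall as its stacked children; a leaf uses its own height.
  const height = node.children.length > 0 ? cursorY - y : (node.height ?? 0);
  return { tag: node.tag, x, y, width, height, children };
}

const pageBoxes: LayoutInput = {
  tag: "body",
  children: [
    { tag: "h1", height: 40, children: [] },
    { tag: "p", height: 20, children: [] },
  ],
};

const wideLayout = layoutBlock(pageBoxes, 0, 0, 800);   // first layout
const narrowLayout = layoutBlock(pageBoxes, 0, 0, 400); // "reflow" after a resize
```

In wideLayout the h1 sits at y = 0 and the p at y = 40, and body is 60px tall. Running the function again with a 400px width is the toy equivalent of a reflow: same boxes, new geometry.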

Painting

Once layout is done, the engine starts painting:

  • Drawing text

  • Filling backgrounds

  • Drawing borders, shadows

  • Rendering images and other visual effects

This is similar to painting and decorating the already arranged room.
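
One way to picture painting is as a list of drawing commands produced from the laid-out boxes. The sketch below is an assumption-heavy simplification: the command strings, PaintableBox and buildDisplayList are invented for this post, and the real CSS painting order has more steps than "background, border, text, children".

```typescript
// A box that already has its geometry from layout, plus what to draw.
interface PaintableBox {
  x: number;
  y: number;
  width: number;
  height: number;
  background?: string;
  border?: string;
  text?: string;
  children: PaintableBox[];
}

// Produces drawing commands back to front: a box's background first,
// then its border, then its text, then its children on top.
function buildDisplayList(box: PaintableBox, commands: string[] = []): string[] {
  const rect = `${box.x},${box.y} ${box.width}x${box.height}`;
  if (box.background !== undefined) commands.push(`fill ${box.background} at ${rect}`);
  if (box.border !== undefined) commands.push(`stroke ${box.border} at ${rect}`);
  if (box.text !== undefined) commands.push(`text "${box.text}" at ${box.x},${box.y}`);
  for (const child of box.children) buildDisplayList(child, commands);
  return commands;
}

const paintedPage: PaintableBox = {
  x: 0, y: 0, width: 800, height: 60, background: "white",
  children: [
    { x: 0, y: 0, width: 800, height: 40, text: "Hello", children: [] },
    { x: 0, y: 40, width: 800, height: 20, border: "gray", text: "Welcome to the web", children: [] },
  ],
};

const displayList = buildDisplayList(paintedPage);
```

displayList holds four commands here: the white page background, the heading text, the paragraph border, and the paragraph text. The order matters because later commands draw on top of earlier ones.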

Display / Render

Finally:

  • Painted content is combined (composited), often with help of the GPU

  • The final image is sent to your screen

This is the display/render step – the moment you actually see the page.

So, from the point where DOM and CSSOM exist, the full flow is:

DOM + CSSOM → Frame constructor → Frame/Render tree → Reflow → Painting → Display


Where React’s Virtual DOM fits into this flow

Real DOM updates are expensive because they can trigger:

  • New frame construction

  • More reflows (layout)

  • More painting

Libraries like React try to reduce how much work the browser’s pipeline has to do.

React’s idea:

  • It keeps a Virtual DOM in JavaScript: a lightweight tree of plain objects that describes what the real DOM should look like.

  • When something changes (state/props), React:

    • Updates the Virtual DOM first

    • Compares old Virtual DOM vs new Virtual DOM

    • Figures out a small set of changes needed to bring the real DOM in line

    • Applies only those few changes to the actual DOM

Because of this:

  • Fewer real DOM changes

  • Fewer frame tree updates, reflows and paints

  • Often faster and smoother updates, especially on complex UIs

Important: React doesn’t change how the browser works internally.
The browser still does:

DOM/CSSOM → Frame constructor → Reflow → Paint → Display

React just tries to minimise DOM changes, so the heavy rendering pipeline runs less often.
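
To get a feel for the "compare old vs new" step, here is a deliberately naive diff in TypeScript. It is not React's algorithm: React's reconciler also uses keys, component types and scheduling, none of which appear here. VNode, DomPatch, diffTrees and makeItem are hypothetical names for this sketch.

```typescript
// A virtual node: a plain object describing an element.
interface VNode {
  tag: string;
  text?: string;
  children: VNode[];
}

// A change to apply to the real DOM. `path` is the list of child
// indexes leading from the root to the affected node.
type DomPatch =
  | { op: "replace"; path: number[]; node: VNode }
  | { op: "setText"; path: number[]; text: string }
  | { op: "append"; path: number[]; node: VNode }
  | { op: "remove"; path: number[] };

// Compares two virtual trees position by position and records
// only the differences.
function diffTrees(
  oldNode: VNode,
  newNode: VNode,
  path: number[] = [],
  patches: DomPatch[] = []
): DomPatch[] {
  if (oldNode.tag !== newNode.tag) {
    // Different element type: replace the whole subtree.
    patches.push({ op: "replace", path, node: newNode });
    return patches;
  }
  if (oldNode.text !== newNode.text) {
    patches.push({ op: "setText", path, text: newNode.text ?? "" });
  }
  const shared = Math.min(oldNode.children.length, newNode.children.length);
  for (let i = 0; i < shared; i++) {
    const oldChild = oldNode.children[i];
    const newChild = newNode.children[i];
    if (oldChild && newChild) diffTrees(oldChild, newChild, [...path, i], patches);
  }
  // Extra new children are appended; extra old children are removed, last first.
  for (const extra of newNode.children.slice(shared)) {
    patches.push({ op: "append", path, node: extra });
  }
  for (let i = oldNode.children.length - 1; i >= shared; i--) {
    patches.push({ op: "remove", path: [...path, i] });
  }
  return patches;
}

const makeItem = (text: string): VNode => ({ tag: "li", text, children: [] });
const oldList: VNode = { tag: "ul", children: [makeItem("Tea"), makeItem("Coffee")] };
const newList: VNode = { tag: "ul", children: [makeItem("Tea"), makeItem("Chai"), makeItem("Milk")] };

const listPatches = diffTrees(oldList, newList);
```

For this example listPatches has just two entries: set the text of the second item to "Chai", and append a new "Milk" item. The first item is untouched, so the browser has no reason to rebuild or repaint it.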


You don’t need to memorise everything

It’s completely fine if terms like content sink, frame constructor, reflow and Virtual DOM feel new.

Treat this blog as a map:

  • Browser is a set of cooperating parts (UI, browser engine, rendering engine, networking, JS engine).

  • HTML/CSS go through parser → content sink → DOM/CSSOM.

  • Browser engine + rendering engine then run frame constructor → reflow → painting → display.

  • React and similar libraries use a Virtual DOM to cut down how often this pipeline has to run.

Over time, as you debug layouts and inspect elements in DevTools, this mental model will make everything click.