Introduce Sourcemap#8393

Open
mununki wants to merge 14 commits into master from sourcemap

@mununki
Member

@mununki mununki commented Apr 27, 2026

[2026-05-03 UPDATE] Add inline and hidden source map modes

Summary

Introduce Source Map v3 generation for ReScript JavaScript output, including linked, inline, and hidden source map modes.

This adds per-file source map generation and wires source map options through rescript.json, bsc, and rewatch. Depending on sourceMap.mode, the compiler can emit linked .js.map files, inline data-URI source maps, or hidden .js.map files without a JS sourceMappingURL comment.

Basic rescript.json usage:

{
  "sourceMap": {
    "enabled": "dev",
    "mode": "linked",
    "sourcesContent": true,
    "sourceRoot": ""
  }
}
  • enabled: "dev" generates source maps only during rescript watch.
  • enabled: "always" generates source maps during both rescript build and rescript watch.
  • mode: "linked" generates *.js.map files next to generated JavaScript and links them from the JS output with a sourceMappingURL comment.
  • mode: "inline" embeds the source map into the generated JavaScript as a data:application/json;base64,... source map comment and does not emit a sibling .js.map file.
  • mode: "hidden" generates *.js.map files next to generated JavaScript without adding a sourceMappingURL comment. This is useful for workflows that upload source maps to error monitoring services without publicly linking them from JS.
  • sourcesContent: true embeds original .res source text in sourcesContent. With mode: "inline", this also embeds the original source text directly into the generated JS.
  • sourceRoot is optional and most local/Vite workflows do not need it.
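
The option semantics listed above can be summarized as a small decision table. The sketch below uses Rust only because rewatch is written in Rust; the names `Enabled`, `Mode`, `Command`, `Outputs`, and `outputs` are hypothetical and are not taken from the PR.

```rust
// Sketch only: illustrative names, not the PR's rewatch types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Enabled { Dev, Always }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode { Linked, Inline, Hidden }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Command { Build, Watch }

#[derive(Debug, PartialEq)]
struct Outputs {
    map_file: bool,       // emit a sibling *.js.map file
    url_comment: bool,    // append a sourceMappingURL comment to the JS
    embedded_in_js: bool, // the comment carries a base64 data URI
}

/// Returns None when no source map is generated for this command.
fn outputs(enabled: Enabled, mode: Mode, command: Command) -> Option<Outputs> {
    let active = match (enabled, command) {
        (Enabled::Always, _) => true,
        (Enabled::Dev, Command::Watch) => true,
        (Enabled::Dev, Command::Build) => false,
    };
    if !active {
        return None;
    }
    Some(match mode {
        Mode::Linked => Outputs { map_file: true, url_comment: true, embedded_in_js: false },
        Mode::Inline => Outputs { map_file: false, url_comment: true, embedded_in_js: true },
        Mode::Hidden => Outputs { map_file: true, url_comment: false, embedded_in_js: false },
    })
}
```

For example, `outputs(Enabled::Dev, Mode::Linked, Command::Build)` yields `None`, while `Mode::Hidden` under `Enabled::Always` yields a map file without a URL comment.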

This PR also preserves source locations through function output, call expressions, pipe expressions, and pattern-match branch output so browser breakpoints and Node stack traces can resolve back to .res files more usefully.
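
For readers unfamiliar with the format: each preserved location ends up as a segment in the Source Map v3 `mappings` string, encoded as base64 VLQ deltas. The following is a minimal sketch of that standard encoding, written independently of the PR's OCaml implementation; `encode_vlq` and `encode_segment` are hypothetical names.

```rust
const BASE64: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode one signed value as a base64 VLQ.
fn encode_vlq(value: i64) -> String {
    // The sign lives in the lowest bit; the magnitude is shifted left by one.
    let mut vlq: u64 = if value < 0 {
        (value.unsigned_abs() << 1) | 1
    } else {
        (value as u64) << 1
    };
    let mut out = String::new();
    loop {
        let mut digit = (vlq & 0b1_1111) as u8; // low five bits
        vlq >>= 5;
        if vlq > 0 {
            digit |= 0b10_0000; // continuation bit: more digits follow
        }
        out.push(BASE64[digit as usize] as char);
        if vlq == 0 {
            break;
        }
    }
    out
}

/// Encode one segment from its four deltas: generated column, source index,
/// original line, original column.
fn encode_segment(deltas: [i64; 4]) -> String {
    deltas.iter().map(|d| encode_vlq(*d)).collect()
}
```

Segments on the same generated line are joined with `,` and generated lines with `;`.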

Validation

From the repository root:

git fetch origin
git checkout sourcemap
yarn install
opam exec -- make test
cargo test source_map --lib --manifest-path rewatch/Cargo.toml

The build test suite includes tests/build_tests/source_map, which verifies linked, inline, and hidden source map modes. It checks linked .js.map output and JS comments, hidden .js.map output without JS comments, inline data-URI map decoding, stale .map cleanup, non-empty mappings, source contents, sourceRoot, sourcesContent: false, and inline maps on stdout.
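
The inline data-URI check mentioned above amounts to finding the trailing comment and base64-decoding its payload. A rough sketch follows, assuming the comment has the conventional `//# sourceMappingURL=` prefix followed by the `data:application/json;base64,` form described in the summary; `base64_decode` and `inline_map_json` are hypothetical helpers, not the test suite's code.

```rust
// Assumed comment prefix; the exact text emitted by the compiler may differ.
const PREFIX: &str = "//# sourceMappingURL=data:application/json;base64,";

/// Decode standard base64, stopping at the first padding character.
fn base64_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buffer: u32 = 0;
    let mut bits = 0;
    for byte in input.bytes() {
        let value = match byte {
            b'A'..=b'Z' => byte - b'A',
            b'a'..=b'z' => byte - b'a' + 26,
            b'0'..=b'9' => byte - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            b'=' => break, // padding marks the end of the payload
            _ => return None,
        };
        buffer = (buffer << 6) | u32::from(value);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buffer >> bits) as u8);
            buffer &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    Some(out)
}

/// Extract the source map JSON text from generated JS, if an inline map exists.
fn inline_map_json(js: &str) -> Option<String> {
    let line = js.lines().rev().find(|l| l.starts_with(PREFIX))?;
    let bytes = base64_decode(&line[PREFIX.len()..])?;
    String::from_utf8(bytes).ok()
}
```

A real check would then parse the JSON and inspect `mappings`, `sources`, and `sourcesContent`.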

For manual browser and Node sourcemap behavior, see the external Vite test project:
https://github.com/mununki/rescript-sourcemap-test

The test project README explains how to check out the sourcemap branch of a local ReScript compiler clone, link it from package.json, and verify sourcemaps in Vite/Chrome DevTools and Node.

Next To Do

  • Investigate bundler integration so Vite/Rollup/Webpack can consume ReScript input source maps without custom user configuration.
  • Consider names mappings for better symbol-level debugging.
  • Improve JSX/helper-code mapping granularity.

@cknitt
Member

cknitt commented Apr 27, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 68c516ba9d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread compiler/core/js_source_map.ml Outdated
Comment on lines +183 to +186
(match (!current, Hashtbl.find_opt marker_locs id) with
 | Some builder, Some loc ->
   let generated_line, generated_column = Ext_pp.position fmt in
   add_mapping builder ~generated_line ~generated_column loc

P1 Badge Clear consumed source-map markers from global cache

comment_of_loc stores every marker ID in the process-global marker_locs, but mark_comment only does a find_opt and never removes the entry. In long-lived processes (notably rewatch watch mode), each rebuild/file compile adds more IDs and this table grows without bound, which steadily increases memory use and retained location data even after maps are emitted.

Member Author

👍 Good catch. The rewatch-specific leak is limited today because rewatch spawns bsc as a separate process, so the OCaml global table does not live for the whole watch session. Still, consumed markers should be removed from marker_locs; they are one-shot internal markers and keeping them unnecessarily retains location data for the rest of the compiler process. I’ll update mark_comment to remove marker entries after lookup.
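
To illustrate the difference between a lookup that leaves the entry in place and a one-shot lookup that removes it, here is a small sketch. It is not the OCaml code from `js_source_map.ml`; `MarkerTable`, `Loc`, `register`, `peek`, and `take` are hypothetical names standing in for `marker_locs` and its accessors.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the compiler's Location.t.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Loc {
    line: u32,
    column: u32,
}

#[derive(Default)]
struct MarkerTable {
    next_id: u64,
    locs: HashMap<u64, Loc>,
}

impl MarkerTable {
    /// Store a location and hand back the marker id that refers to it.
    fn register(&mut self, loc: Loc) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.locs.insert(id, loc);
        id
    }

    /// Lookup that keeps the entry: the table only ever grows.
    fn peek(&self, id: u64) -> Option<Loc> {
        self.locs.get(&id).copied()
    }

    /// One-shot lookup: the entry is dropped as soon as it is consumed.
    fn take(&mut self, id: u64) -> Option<Loc> {
        self.locs.remove(&id)
    }
}
```

With `take`, the table is empty again once every marker has been printed, so nothing is retained for the rest of the process.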

Comment thread compiler/core/js_source_map.ml Outdated
Comment on lines +85 to +89
| from_root :: _, to_root :: _ when from_root = to_root ->
  let from_rest, to_rest = drop_common from_parts to_parts in
  let parts = repeat ".." (List.length from_rest) @ to_rest in
  if parts = [] then Filename.basename to_file else String.concat "/" parts
| _ -> Filename.basename to_file

P2 Badge Preserve full relative source paths across different roots

When the first normalized path segment differs, relative_path falls back to Filename.basename, which drops directory context entirely. If generated outputs and source files resolve through different absolute prefixes (for example symlink/canonicalized roots), the map will emit only bare filenames, causing ambiguous or unresolvable sources entries when sourcesContent is disabled.

Member Author

👍 Good catch. Falling back to Filename.basename can lose useful source path context and make sources ambiguous when sourcesContent is disabled. I updated the path logic to preserve relative paths whenever possible, even when the first normalized path segment differs. It now only falls back to a normalized absolute path for cross-drive Windows paths, where a filesystem-relative path is not representable.
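
The idea of keeping a relative path even when the leading segments differ can be sketched as follows. This is a simplified illustration for `/`-separated absolute paths only; it does not handle Windows drive letters or `.`/`..` normalization, and `relative_path` here is a hypothetical Rust function, not the PR's OCaml `relative_path`.

```rust
/// Path from the directory holding the generated file to the source file.
/// Both arguments are assumed to be absolute, '/'-separated paths.
fn relative_path(from_dir: &str, to_file: &str) -> String {
    let from: Vec<&str> = from_dir.split('/').filter(|s| !s.is_empty()).collect();
    let to: Vec<&str> = to_file.split('/').filter(|s| !s.is_empty()).collect();

    // Count the leading segments that both paths share.
    let common = from
        .iter()
        .zip(to.iter())
        .take_while(|(a, b)| a == b)
        .count();

    // Climb out of the remaining "from" segments, then descend into "to".
    let mut parts: Vec<&str> = vec![".."; from.len() - common];
    parts.extend_from_slice(&to[common..]);
    parts.join("/")
}
```

When no segment is shared, the result climbs to the filesystem root and keeps the full directory context instead of collapsing to a bare filename.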

Comment thread rewatch/src/config.rs

pub namespace: Option<NamespaceConfig>,
pub jsx: Option<JsxSpecs>,
#[serde(rename = "sourceMap")]
Member

@fhammerschmidt fhammerschmidt Apr 27, 2026

Maybe these new options should be nested in a "sourceMap": {} object? You only use them together anyway, right?

Member Author

That’s a fair point. I kept the MVP shape as top-level fields to keep the implementation simple, but nesting the source map-specific options under a sourceMap object also looks like a good direction, especially if we add more options later.

{
  "sourceMap": {
    "mode": "linked",
    "sourcesContent": true,
    "sourceRoot": "",
    ...
  }
}

@pkg-pr-new

pkg-pr-new Bot commented Apr 27, 2026

rescript

npm i https://pkg.pr.new/rescript@8393

@rescript/darwin-arm64

npm i https://pkg.pr.new/@rescript/darwin-arm64@8393

@rescript/darwin-x64

npm i https://pkg.pr.new/@rescript/darwin-x64@8393

@rescript/linux-arm64

npm i https://pkg.pr.new/@rescript/linux-arm64@8393

@rescript/linux-x64

npm i https://pkg.pr.new/@rescript/linux-x64@8393

@rescript/runtime

npm i https://pkg.pr.new/@rescript/runtime@8393

@rescript/win32-x64

npm i https://pkg.pr.new/@rescript/win32-x64@8393

commit: 5238b01

@mununki mununki mentioned this pull request Apr 28, 2026
@BlueHotDog

amazing!! really excited to see this land. will help me leak my app source-code more easily :-p

@mununki
Member Author

mununki commented Apr 29, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f7ca25e7a6

Comment thread compiler/core/js_source_map.ml Outdated
match Hashtbl.find_opt marker_locs id with
| None -> None
| Some loc ->
  Hashtbl.remove marker_locs id;

P1 Badge Preserve marker locations across multi-target map generation

lambda_as_module renders the same compiled program once per package spec (e.g. both CommonJS and ESM), but mark_comment resolves locations via take_marker_loc, which deletes each marker after the first render pass. That means the first .js.map gets mappings, while subsequent outputs encounter missing marker ids and end up with empty/partial mappings, breaking sourcemaps whenever package-specs contains more than one output target.

Member Author

👍 Good catch! Resolved in e258af0 (this PR)
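
The failure mode described in this thread, and one possible way around it, can be shown with a toy model. The thread does not say how e258af0 actually resolved the issue, so the "look up per pass, clear once afterwards" approach below is only an assumption; `Markers`, `render_consuming`, `render_borrowing`, and `render_all_targets` are hypothetical names.

```rust
use std::collections::HashMap;

// marker id -> (original line, original column)
type Markers = HashMap<u32, (u32, u32)>;

/// One render pass that consumes markers on lookup.
/// Returns how many mappings the pass produced.
fn render_consuming(markers: &mut Markers, ids: &[u32]) -> usize {
    ids.iter().filter(|id| markers.remove(*id).is_some()).count()
}

/// One render pass that only reads markers.
fn render_borrowing(markers: &Markers, ids: &[u32]) -> usize {
    ids.iter().filter(|id| markers.contains_key(*id)).count()
}

/// Render every output target first, then drop all markers in one go.
fn render_all_targets(markers: &mut Markers, ids: &[u32], targets: usize) -> Vec<usize> {
    let shared: &Markers = markers;
    let counts: Vec<usize> = (0..targets)
        .map(|_| render_borrowing(shared, ids))
        .collect();
    markers.clear(); // markers are still one-shot per compiled file
    counts
}
```

Calling `render_consuming` twice on the same table gives full mappings for the first target and none for the second, which is the reported bug; `render_all_targets` gives every target the same mappings and still leaves the table empty.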

@mununki
Member Author

mununki commented Apr 29, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Can't wait for the next one!

Comment thread compiler/ext/ext_pp.ml
t.column <- 0;
loop (i + 1)
| c ->
let byte = Char.code c in
Member

Might be good to add some comment here to explain what's going on with those hex constants.

Member Author

Please refer to the comment f941252 (this PR)
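
The hex constants themselves are not shown in this thread. One plausible reading, given that the snippet inspects `Char.code c` byte by byte and that source map consumers commonly count columns in UTF-16 code units, is that the constants classify UTF-8 lead and continuation bytes. The sketch below illustrates that idea under this assumption; `Position` and `advance` are hypothetical and may not match what `ext_pp.ml` does.

```rust
#[derive(Default, Debug, PartialEq)]
struct Position {
    line: u32,   // zero-based generated line
    column: u32, // zero-based column in UTF-16 code units
}

/// Advance the position over UTF-8 text, one byte at a time.
fn advance(pos: &mut Position, text: &str) {
    for &byte in text.as_bytes() {
        if byte == b'\n' {
            pos.line += 1;
            pos.column = 0;
        } else if (byte & 0xC0) == 0x80 {
            // 10xxxxxx: continuation byte, already counted with its lead byte
        } else if byte >= 0xF0 {
            // 11110xxx: four-byte sequence, a surrogate pair in UTF-16
            pos.column += 2;
        } else {
            // ASCII or the lead byte of a two- or three-byte sequence
            pos.column += 1;
        }
    }
}
```

Under this scheme `é` advances the column by one and an emoji outside the Basic Multilingual Plane advances it by two.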

@@ -0,0 +1,293 @@
type source = {relative_path: string; content: string option}
Member

Maybe you can let your agent add some comments in this file, at least at the top, to explain the structure of source maps and any other relevant considerations.

Member Author

Please refer to the comment f941252 (this PR)

Member

Thanks a lot, these are great!

@cknitt
Member

cknitt commented Apr 30, 2026

I was wondering about the following: Aren't we sort of "abusing" comments here in order to store references to source locations? Wouldn't it be cleaner to introduce a separate field?

I chatted with Codex about this and got:


In PR #8393, source-map locations are carried through the JS printer by putting internal marker strings into `J.expression.comment`, `J.statement.comment`, and `J.case_clause.comment`.

The key flow is:

  • Js_source_map.comment_of_loc loc creates a string like \000RESCRIPT_SOURCE_MAP:<id> and stores id -> Location.t in a side table.
  • lam_compile.ml attaches that string via existing comment fields.
  • js_dump.ml calls Js_source_map.mark_comment f comment; if it recognizes the marker, it records the current generated printer position and suppresses the comment from JS output.
  • Real comments still use the same field, so pp_comment now has two meanings: printable JS comment or hidden source-map marker.

So yes, it is an overload of the comment channel. “Abuse” is a fair characterization, though it is a pragmatic one: the comment field already exists on expressions/statements/case clauses and is already printed exactly where source-map segments need to be observed.

I agree a specific field would be cleaner if this is meant to become maintained infrastructure. Something like:

type node_meta = {
  comment: string option;
  source_loc: Location.t option;
}

or even just adding source_loc: Location.t option next to comment on relevant JS IR nodes would separate concerns:

  • comment remains “emit this as JS comment”
  • source_loc means “record a source-map segment here if a builder is active”
  • no marker prefix parsing
  • no global marker_locs id table
  • no risk of silently dropping a real comment when a source marker is already present
  • no hidden behavior in pp_comment

The tradeoff is churn. The current PR reuses an existing field and existing optional ?comment plumbing, so it avoids updating a lot of JS IR constructors and helper APIs. But conceptually, source locations are metadata, not comments. If source maps are going beyond MVP, I would prefer a dedicated metadata field before merging this shape too deeply into the compiler.

@mununki
Member Author

mununki commented Apr 30, 2026

@cknitt You’re right. The current implementation does overload the existing comment field.

I reused it because comment is already present on the JS IR nodes where source map positions are useful, such as expressions, statements, and case clauses. It also flows through the printer at almost exactly the points where we need to observe generated line/column positions. For the MVP, this kept the implementation small and avoided touching a large number of JS IR constructors and helper APIs.

That said, I agree that source locations are metadata rather than comments, and a dedicated field would be cleaner conceptually. As you mentioned, introducing something like source_loc would likely create a much larger diff because it needs to be threaded through expression, statement, case clause constructors, folds/maps, and printer code.

So my intent here was to keep the first implementation pragmatic and low-churn, but I agree this is a good direction if we want to harden the source map support beyond the MVP.

@cknitt
Member

cknitt commented Apr 30, 2026

> So my intent here was to keep the first implementation pragmatic and low-churn, but I agree this is a good direction if we want to harden the source map support beyond the MVP.

Yes, MVP is great so that people can already do first testing (which I see from the feedback here that they are already doing: https://forum.rescript-lang.org/t/feedback-wanted-experimental-rescript-source-map-support/7170). I personally didn't have time to test yet, but still planning to.

But before merging, I think we should move to the cleaner implementation.

@mununki
Member Author

mununki commented Apr 30, 2026

> But before merging, I think we should move to the cleaner implementation.

I agree. This will likely create a fairly large diff across the JS IR helpers and related code, but I’ll keep working on it.

@mununki
Member Author

mununki commented May 1, 2026

@cknitt I refactored the implementation to avoid reusing comment as the source map marker channel. Source locations are now carried explicitly on the JS IR via source_loc fields on expressions, statements, and case clauses.
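
As a rough picture of what a dedicated field buys, the sketch below keeps `comment` and `source_loc` as separate fields on a toy IR node and lets a toy printer record a mapping only when a builder is active. It is written in Rust for illustration; the real JS IR and printer are OCaml, and `Expr`, `Printer`, `Mapping`, and `Loc` are hypothetical simplifications.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Loc {
    line: u32,
    column: u32,
}

struct Expr {
    text: String,            // already-rendered JS for this node (simplified)
    comment: Option<String>, // emitted as a JS comment
    source_loc: Option<Loc>, // metadata only, never printed
}

#[derive(Debug, PartialEq)]
struct Mapping {
    generated_line: u32,
    generated_column: u32,
    original: Loc,
}

#[derive(Default)]
struct Printer {
    out: String,
    line: u32,
    column: u32,
    mappings: Option<Vec<Mapping>>, // Some(_) while a source-map builder is active
}

impl Printer {
    /// Append text and keep the zero-based generated position up to date.
    fn write(&mut self, s: &str) {
        for ch in s.chars() {
            if ch == '\n' {
                self.line += 1;
                self.column = 0;
            } else {
                self.column += ch.len_utf16() as u32;
            }
        }
        self.out.push_str(s);
    }

    fn expr(&mut self, e: &Expr) {
        if let Some(c) = &e.comment {
            self.write(&format!("/* {} */ ", c));
        }
        // Record where the expression itself starts in the generated output.
        if let (Some(mappings), Some(loc)) = (self.mappings.as_mut(), e.source_loc) {
            mappings.push(Mapping {
                generated_line: self.line,
                generated_column: self.column,
                original: loc,
            });
        }
        self.write(&e.text);
    }
}
```

A node can now carry both a real comment and a source location without either one displacing the other, and no marker prefix or global id table is needed.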

@cknitt
Member

cknitt commented May 3, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1892e2f1e9

Comment thread rewatch/src/config.rs Outdated
Comment on lines +883 to +885
SourceMapConfig::Bool(true) => {
    panic!("sourceMap true is unsupported; use {{ \"mode\": \"linked\" }}")
}

P1 Badge Reject sourceMap: true instead of panicking at runtime

This branch panics for sourceMap: true, and because SourceMapConfig currently deserializes any boolean, a user config typo can crash rescript watch/rescript build with an unrecoverable panic instead of a normal validation error. That turns an input error into process termination (including watch-session interruption), so this should be handled as a parse/config error path rather than panic!.

Member Author

👍 Good catch, this is valid.

sourceMap: true is intentionally unsupported, but it should be rejected during config parsing rather than accepted and handled later with panic!. I updated SourceMapConfig so true is no longer representable after deserialization: false becomes Disabled, object config becomes Options, and true now returns a normal config error. I also replaced the panic-based test with a parse-error test.
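
The shape of that change can be sketched without serde. Below, `RawSourceMap` is a hypothetical stand-in for the JSON value found under `"sourceMap"`, and `parse_source_map` is a hypothetical function; only the variant names `Disabled` and `Options` come from the description above, and the real rewatch code presumably does this inside its deserializer.

```rust
// Hypothetical, simplified view of the raw JSON value of "sourceMap".
enum RawSourceMap {
    Bool(bool),
    Object { mode: String },
}

#[derive(Debug, PartialEq)]
enum Mode {
    Linked,
    Inline,
    Hidden,
}

// After parsing, `true` is simply not representable.
#[derive(Debug, PartialEq)]
enum SourceMapConfig {
    Disabled,
    Options { mode: Mode },
}

fn parse_source_map(raw: RawSourceMap) -> Result<SourceMapConfig, String> {
    match raw {
        RawSourceMap::Bool(false) => Ok(SourceMapConfig::Disabled),
        // A config mistake becomes an ordinary error value, not a panic.
        RawSourceMap::Bool(true) => {
            Err("sourceMap true is unsupported; use { \"mode\": \"linked\" }".to_string())
        }
        RawSourceMap::Object { mode } => {
            let mode = match mode.as_str() {
                "linked" => Mode::Linked,
                "inline" => Mode::Inline,
                "hidden" => Mode::Hidden,
                other => return Err(format!("unknown sourceMap mode: {}", other)),
            };
            Ok(SourceMapConfig::Options { mode })
        }
    }
}
```

Callers can then report the `Err` like any other configuration error and keep a watch session alive.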

@mununki
Copy link
Copy Markdown
Member Author

mununki commented May 3, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Nice work!

@cknitt
Member

cknitt commented May 3, 2026

Great work!

I wonder if test coverage should be expanded?
Demo.res looks a bit thin.

@mununki
Member Author

mununki commented May 4, 2026

> I wonder if test coverage should be expanded? Demo.res looks a bit thin.

Good point. I expanded the source map build test fixture so Demo.res now covers more realistic syntax: variant/pattern matching, a dynamic pipe call, multiple raiseError sites, and Unicode source content.

The test now checks exact generated-to-original mappings for both raiseError calls, and also verifies that pipe and pattern-match branch output maps back to the corresponding .res source lines across linked, hidden, inline, and stdout source map modes.
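
Checking exact generated-to-original mappings requires decoding the `mappings` string back into absolute positions. The sketch below shows the standard Source Map v3 decoding in Rust; it is not the repository's test code, and `decode_segment` and `decode_mappings` are hypothetical names.

```rust
const BASE64: &str = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Decode the base64 VLQ fields of one segment into signed deltas.
fn decode_segment(segment: &str) -> Option<Vec<i64>> {
    let mut fields = Vec::new();
    let (mut value, mut shift) = (0i64, 0u32);
    for ch in segment.chars() {
        let digit = BASE64.find(ch)? as i64;
        if shift > 55 {
            return None; // refuse absurdly long values instead of overflowing
        }
        value |= (digit & 31) << shift;
        if (digit & 32) != 0 {
            shift += 5; // continuation bit set: more digits follow
        } else {
            let magnitude = value >> 1;
            fields.push(if (value & 1) == 1 { -magnitude } else { magnitude });
            value = 0;
            shift = 0;
        }
    }
    if shift == 0 { Some(fields) } else { None }
}

/// Decode into zero-based tuples:
/// [generated_line, generated_column, source, original_line, original_column].
fn decode_mappings(mappings: &str) -> Option<Vec<[i64; 5]>> {
    let mut out = Vec::new();
    let (mut src, mut orig_line, mut orig_col) = (0i64, 0i64, 0i64);
    for (gen_line, line) in mappings.split(';').enumerate() {
        let mut gen_col = 0i64; // generated column deltas restart on every line
        for segment in line.split(',').filter(|s| !s.is_empty()) {
            let fields = decode_segment(segment)?;
            gen_col += *fields.first()?;
            if fields.len() >= 4 {
                src += fields[1];
                orig_line += fields[2];
                orig_col += fields[3];
                out.push([gen_line as i64, gen_col, src, orig_line, orig_col]);
            }
        }
    }
    Some(out)
}
```

A test can then assert that the generated position of a given call appears in the decoded list with the expected original line and column.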

4 participants