Category: NodeJS

  • Native TypeScript Is Here! Rewritten in Go with a 10x Performance Boost

    Native TypeScript Is Here! Rewritten in Go with a 10x Performance Boost

    Microsoft recently released a native preview of TypeScript. In this version the compiler has been rewritten in Go, and performance is up roughly 10x.

    This is not a minor optimization but a ground-up rewrite: the Microsoft team has reimplemented the entire TypeScript compiler in Go.

    What is native TypeScript?

    The native TypeScript port is a new Microsoft project: the compiler, originally written in JavaScript, has been reimplemented in Go.

    Why do this? The reasons are straightforward:

    • Faster: Go is a compiled language and runs much faster than JavaScript
    • Parallelism: Go has built-in concurrency, so multiple tasks can be processed at once
    • Shared memory: the new version makes better use of memory

    The result: compilation is roughly 10x faster.

    How fast is it, really?

    Microsoft benchmarked it against the Sentry project, which has more than 9,000 files and over a million lines of TypeScript.

    Original TypeScript 5.8

    • Compile time: 72.81 seconds
    • Memory usage: 3.3 GB

    Native TypeScript preview

    • Compile time: about 7 seconds
    • Memory usage: 3.9 GB

    You read that right: from roughly 72 seconds down to about 7. That is the 10x improvement.

    For large projects this matters enormously: a build that used to take over a minute now finishes in a few seconds.

    How to use it

    Installing the compiler

    You can try it today. In your project, run:

    npm install -D @typescript/native-preview

    Then use the tsgo command in place of tsc:

    npx tsgo --project ./src/tsconfig.json

    VS Code extension

    Microsoft has also published a VS Code extension; search the marketplace for “TypeScript (Native Preview)”.

    After installing it, you need to enable it manually:

    1. Open the Command Palette
    2. Run “TypeScript Native Preview: Enable (Experimental)”

    Or enable it in your settings:

    "typescript.experimental.useTsgo": true

    Supported features

    JSX support

    The latest build already supports JSX, so React developers can use it with confidence.

    JavaScript support

    The native port can also check JavaScript files, reading JSDoc comments to drive type checking.
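    As a quick illustration, here is a small, hypothetical JavaScript file with JSDoc annotations; with checking enabled (for example via a // @ts-check comment), the native compiler should flag the type error the same way tsc does today:

    // math.js – hypothetical example of JSDoc-driven type checking
    // @ts-check

    /**
     * @param {number} a
     * @param {number} b
     * @returns {number}
     */
    function add(a, b) {
      return a + b;
    }

    add(1, "2"); // error: argument of type 'string' is not assignable to parameter of type 'number'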

    Editor features

    Currently supported:

    • ✅ Error checking
    • ✅ Go to definition
    • ✅ Hover info
    • ✅ Code completion
    • ❌ Auto-import (in progress)
    • ❌ Find references (in progress)
    • ❌ Rename (in progress)

    Caveats

    This is still a preview, so there are some limitations:

    Module resolution

    If you are using an older module resolution mode, you may run into errors. The recommendation is to switch to:

    {
        "compilerOptions": {
            "module": "preserve",
            "moduleResolution": "bundler"
        }
    }

    Or:

    {
        "compilerOptions": {
            "module": "nodenext"
        }
    }

    Not yet supported

    • --build mode
    • --declaration output (generating declaration files)
    • Some older compilation targets

    Technical details

    Why Go?

    Microsoft chose Go for several reasons:

    1. Performance: Go is compiled and runs fast
    2. Concurrency: goroutines make parallel processing straightforward
    3. Memory safety: Go is garbage-collected, so memory bugs are less likely
    4. Ecosystem: Go has plenty of mature tools and libraries

    Architectural improvements

    The new version is more than a language swap; Microsoft also redesigned the architecture:

    • Parallel parsing: multiple files can be parsed at the same time
    • Incremental compilation: only the parts that changed are recompiled
    • Memory optimizations: better memory management

    API compatibility

    Microsoft knows that many tools depend on the TypeScript API, so they are building a compatibility layer.

    The new version exposes an IPC interface, so other tools can drive TypeScript over inter-process communication.

    They have also written a Node.js module in Rust that lets JavaScript code call the new compiler synchronously.
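    Until that compatibility layer is broadly available, a build script can still drive the new compiler through its CLI. Below is a minimal, unofficial sketch that simply shells out to the tsgo command installed by the preview package, reusing the tsconfig path from earlier in this article:

    // build.ts – minimal sketch: invoke the tsgo CLI from a Node script.
    // Assumes @typescript/native-preview is installed locally (see above).
    import { spawnSync } from "node:child_process";

    const result = spawnSync("npx", ["tsgo", "--project", "./src/tsconfig.json"], {
      stdio: "inherit",                      // stream compiler diagnostics to the console
      shell: process.platform === "win32",   // needed for npx resolution on Windows
    });

    process.exit(result.status ?? 1);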

    What this means for developers

    A boon for large projects

    If your project is large and compile times are long, this update is great news for you.

    Imagine:

    • A build that used to take 2 minutes now takes about 10 seconds
    • Errors that used to show up long after saving a file now appear almost in real time
    • CI/CD pipelines get noticeably faster

    A better developer experience

    Faster compilation means:

    • Faster hot reloads
    • Faster error feedback
    • A smoother development workflow

    Impact on the toolchain

    Many tools depend on TypeScript:

    • Webpack
    • Vite
    • ESLint
    • Prettier

    All of them stand to benefit from the performance gains.

    Roadmap

    Microsoft's plan is:

    1. Within 2025: complete all the major features
    2. TypeScript 7.0: the stable release will ship with the native compiler
    3. Gradual migration: eventually tsgo will become tsc

    Development milestones

    Short-term goals:

    • Finish --build mode
    • Support declaration file generation
    • Round out the editor features

    Long-term goals:

    • Fully replace the existing compiler
    • Maintain API compatibility
    • Provide migration tooling

    Community reaction

    Developers are excited about the news:

    “Finally, no more waiting for the compiler!”

    “A 10x speedup, that’s wild!”

    “A lifesaver for large projects!”

    There are some concerns as well:

    • API compatibility
    • Ecosystem adaptation
    • Learning curve

    Overall, though, the community response has been very positive.

    Summary

    Native TypeScript is a major breakthrough: a 10x performance gain is not an incremental tweak but a qualitative leap.

    Although it is still a preview, it can already be used in many projects. If your builds are slow, it is well worth trying.

    The change is good news for the entire frontend ecosystem. It shows that:

    • There is still plenty of headroom for performance optimization
    • Choosing the right technology stack matters
    • Large companies are willing to invest in foundational infrastructure

    Over the coming months more features will be completed, and by the time TypeScript 7.0 ships, this will be standard equipment for every TypeScript developer.

    A new era of frontend development is here.

    References:

    • Microsoft DevBlogs – Announcing TypeScript Native Previews

    Source: the “Nodejs技术栈” WeChat official account

  • Vue3 Officially Announces It Is Phasing Out Axios, Embracing a New Trend

    Vue3 Officially Announces It Is Phasing Out Axios, Embracing a New Trend

    Over the past decade, Axios became the default request library for frontend developers thanks to its clean API design and support for both the browser and Node.js. But as modern frontend frameworks and engineering requirements have evolved, Alova.js has arrived as a lighter, smarter alternative that better fits today's development paradigms.

    Four pain points of the Axios era

    1. Redundant adapter logic

    // Axios's universal configuration (even though in practice you may only target the browser)
    axios.create({ adapter: isNode ? nodeAdapter : xhrAdapter })

    2. Weak TypeScript support

    // With Axios you have to define response types by hand
    interface Response<T> { data: T }
    axios.get<Response<User>>('/api/user') // and you still have to destructure .data manually

    3. The over-wrapping anti-pattern

    // Layers of stacked interceptors make debugging difficult
    axios.interceptors.request.use(config => {
      // auth-check interceptor
    })
    axios.interceptors.response.use(response => {
      // global error-handling interceptor
    })

    4. A fragmented ecosystem

    Advanced features such as caching and request queues require extra third-party libraries, which adds maintenance cost.

    Core advantages of Alova.js

    1. Extremely lightweight and fast

    • Tree-shaking friendly: only the feature modules you actually use end up in the bundle
    • Zero dependencies: the core package is only about 4 KB (versus roughly 12 KB for Axios)

    2. Smart request management, out of the box

    // One configuration covers race cancellation, request deduplication, and error retries
    const { data } = useRequest(userInfoAPI, {
      abortOnUnmount: true,    // cancel the request automatically when the component unmounts
      dedupe: true,            // merge duplicate requests automatically
      retry: 3                 // retry up to 3 times automatically
    })

    3. A declarative programming model

    Deep integration with React/Vue through a Hooks-style API:

    // Vue 3 example: loading/error state is managed automatically
    const { 
      data, 
      loading, 
      error, 
      send: fetchUser 
    } = useRequest(() => userAPI.get({ id: 1 }))

    4. Built-in solutions for common scenarios

    • SSR friendly: data can be rendered straight from the server
    • Chunked file uploads: progress tracking and pause/resume are built in
    • Data caching: supports multi-level caching across memory and SessionStorage

    A hands-on migration guide

    1. Converting a basic request

    Axios:

    axios.get('/api/user', { params: { id: 1 } })
      .then(res => console.log(res.data))

    Alova:

    // Define an API module (and enjoy the type hints)
    const userAPI = {
      get: (id) => alova.Get('/api/user', { params: { id } })
    }
    
    // Fire the request (the user type is inferred automatically!)
    const { data: user } = useRequest(userAPI.get(1))
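    If you prefer the response type to be explicit rather than inferred, alova's request methods also accept a type parameter. Here is a small sketch under that assumption, using a hypothetical User interface alongside the alova instance and useRequest hook from the examples above:

    // Sketch: annotate the response type explicitly (User is a hypothetical type)
    interface User {
      id: number
      name: string
    }

    const typedUserAPI = {
      get: (id: number) => alova.Get<User>('/api/user', { params: { id } })
    }

    // typedUser is typed as User here
    const { data: typedUser } = useRequest(typedUserAPI.get(1))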

    2. Migrating complex interceptors

    Axios's tangle of interceptors:

    // request interceptor
    axios.interceptors.request.use(config => {
      config.headers.token = localStorage.getItem('token')
      return config
    })
    
    // response interceptor
    axios.interceptors.response.use(
      response => response.data,
      error => Promise.reject(error.response)
    )

    Alova's cleaner middleware:

    // Global, unified logic (type safe!)
    const alova = createAlova({
      beforeRequest: (method) => {
        method.config.headers.token = localStorage.getItem('token')
      },
      responded: (response) => response.json() // parse JSON automatically
    })

    Worried about migration cost? Alova has a fallback

    // Compatibility mode: use the Axios adapter inside Alova
    // (imports for axios and createAlova added here for completeness)
    import axios from 'axios'
    import { createAlova } from 'alova'
    import { axiosAdapter } from '@alova/adapter-axios'
    
    const alova = createAlova({
      requestAdapter: axiosAdapter(axios)
    })

  • Control-flow Integrity in V8

    Control-flow Integrity in V8

    Published 09 October 2023 · Tagged with security

    Control-flow integrity (CFI) is a security feature aiming to prevent exploits from hijacking control-flow. The idea is that even if an attacker manages to corrupt the memory of a process, additional integrity checks can prevent them from executing arbitrary code. In this blog post, we want to discuss our work to enable CFI in V8.

    Background

    The popularity of Chrome makes it a valuable target for 0-day attacks and most in-the-wild exploits we’ve seen target V8 to gain initial code execution. V8 exploits typically follow a similar pattern: an initial bug leads to memory corruption but often the initial corruption is limited and the attacker has to find a way to arbitrarily read/write in the whole address space. This allows them to hijack the control-flow and run shellcode that executes the next step of the exploit chain that will try to break out of the Chrome sandbox.

    To prevent the attacker from turning memory corruption into shellcode execution, we’re implementing control-flow integrity in V8. This is especially challenging in the presence of a JIT compiler. If you turn data into machine code at runtime, you now need to ensure that corrupted data can’t turn into malicious code. Fortunately, modern hardware features provide us with the building blocks to design a JIT compiler that is robust even while processing corrupted memory.

    In the following, we’ll look at the problem divided into three separate parts:

    • Forward-Edge CFI verifies the integrity of indirect control-flow transfers such as function pointer or vtable calls.
    • Backward-Edge CFI needs to ensure that return addresses read from the stack are valid.
    • JIT Memory Integrity validates all data that is written to executable memory at runtime.

    Forward-Edge CFI

    There are two hardware features that we want to use to protect indirect calls and jumps: landing pads and pointer authentication.

    Landing Pads

    Landing pads are special instructions that can be used to mark valid branch targets. If enabled, indirect branches can only jump to a landing pad instruction, anything else will raise an exception.
    On ARM64 for example, landing pads are available with the Branch Target Identification (BTI) feature introduced in Armv8.5-A. BTI support is already enabled in V8.
    On x64, landing pads were introduced with the Indirect Branch Tracking (IBT) part of the Control Flow Enforcement Technology (CET) feature.

    However, adding landing pads on all potential targets for indirect branches only provides us with coarse-grained control-flow integrity and still gives attackers lots of freedom. We can further tighten the restrictions by adding function signature checks (the argument and return types at the call site must match the called function) as well as through dynamically removing unneeded landing pad instructions at runtime.
    These features are part of the recent FineIBT proposal and we hope that it can get OS adoption.

    Pointer Authentication

    Armv8.3-A introduced pointer authentication (PAC) which can be used to embed a signature in the upper unused bits of a pointer. Since the signature is verified before the pointer is used, attackers won’t be able to provide arbitrary forged pointers to indirect branches.

    Backward-Edge CFI

    To protect return addresses, we also want to make use of two separate hardware features: shadow stacks and PAC.

    Shadow Stacks

    With Intel CET’s shadow stacks and the guarded control stack (GCS) in Armv9.4-A, we can have a separate stack just for return addresses that has hardware protections against malicious writes. These features provide some pretty strong protections against return address overwrites, but we will need to deal with cases where we legitimately modify the return stack such as during optimization / deoptimization and exception handling.

    Pointer Authentication (PAC-RET)

    Similar to indirect branches, pointer authentication can be used to sign return addresses before they get pushed to the stack. This is already enabled in V8 on ARM64 CPUs.

    A side effect of using hardware support for Forward-edge and Backward-edge CFI is that it will allow us to keep the performance impact to a minimum.

    JIT Memory Integrity

    A unique challenge to CFI in JIT compilers is that we need to write machine code to executable memory at runtime. We need to protect the memory in a way that the JIT compiler is allowed to write to it but the attacker’s memory write primitive can’t. A naive approach would be to change the page permissions temporarily to add / remove write access. But this is inherently racy since we need to assume that the attacker can trigger an arbitrary write concurrently from a second thread.

    Per-thread Memory Permissions

    On modern CPUs, we can have different views of the memory permissions that only apply to the current thread and can be changed quickly in userland.
    On x64 CPUs, this can be achieved with memory protection keys (pkeys) and ARM announced the permission overlay extensions in Armv8.9-A.
    This allows us to toggle write access to executable memory in a fine-grained way, for example by tagging it with a separate pkey.

    The JIT pages are now not attacker writable anymore but the JIT compiler still needs to write generated code into it. In V8, the generated code lives in AssemblerBuffers on the heap which can be corrupted by the attacker instead. We could protect the AssemblerBuffers too in the same fashion, but this just shifts the problem. For example, we’d then also need to protect the memory where the pointer to the AssemblerBuffer lives.
    In fact, any code that enables write access to such protected memory constitutes CFI attack surface and needs to be coded very defensively. E.g. any write to a pointer that comes from unprotected memory is game over, since the attacker can use it to corrupt executable memory. Thus, our design goal is to have as few of these critical sections as possible and keep the code inside short and self-contained.

    Control-Flow Validation

    If we don’t want to protect all compiler data, we can assume it to be untrusted from the point of view of CFI instead. Before writing anything to executable memory, we need to validate that it doesn’t lead to arbitrary control-flow. That includes for example that the written code doesn’t perform any syscall instructions or that it doesn’t jump into arbitrary code. Of course, we also need to check that it doesn’t change the pkey permissions of the current thread. Note that we don’t try to prevent the code from corrupting arbitrary memory since if the code is corrupted we can assume the attacker already has this capability.
    To perform such validation safely, we will also need to keep required metadata in protected memory as well as protect local variables on the stack.
    We ran some preliminary tests to assess the impact of such validation on performance. Fortunately, the validation is not occurring in performance-critical code paths, and we did not observe any regressions in the jetstream or speedometer benchmarks.

    Evaluation

    Offensive security research is an essential part of any mitigation design and we’re continuously trying to find new ways to bypass our protections. Here are some examples of attacks that we think will be possible and ideas to address them.

    Corrupted Syscall Arguments

    As mentioned before, we assume that an attacker can trigger a memory write primitive concurrently to other running threads. If another thread performs a syscall, some of the arguments could then be attacker-controlled if they’re read from memory. Chrome runs with a restrictive syscall filter but there’s still a few syscalls that could be used to bypass the CFI protections.

    Sigaction for example is a syscall to register signal handlers. During our research we found that a sigaction call in Chrome is reachable in a CFI-compliant way. Since the arguments are passed in memory, an attacker could trigger this code path and point the signal handler function to arbitrary code. Luckily, we can address this easily: either block the path to the sigaction call or block it with a syscall filter after initialization.

    Other interesting examples are the memory management syscalls. For example, if a thread calls munmap on a corrupted pointer, the attacker could unmap read-only pages and a consecutive mmap call can reuse this address, effectively adding write permissions to the page.
    Some OSes already provide protections against this attack with memory sealing: Apple platforms provide the VM_FLAGS_PERMANENT flag and OpenBSD has an mimmutable syscall.

    Signal Frame Corruption

    When the kernel executes a signal handler, it will save the current CPU state on the userland stack. A second thread could corrupt the saved state which will then get restored by the kernel.
    Protecting against this in user space seems difficult if the signal frame data is untrusted. At that point one would have to always exit or overwrite the signal frame with a known save state to return to.
    A more promising approach would be to protect the signal stack using per-thread memory permissions. For example, a pkey-tagged sigaltstack would protect against malicious overwrites, but it would require the kernel to temporarily allow write permissions when saving the CPU state onto it.

    v8CTF

    These were just a few examples of potential attacks that we’re working on addressing and we also want to learn more from the security community. If this interests you, try your hand at the recently launched v8CTF! Exploit V8 and gain a bounty, exploits targeting n-day vulnerabilities are explicitly in scope!

  • The V8 Sandbox

    The V8 Sandbox

    After almost three years since the initial design document and hundreds of CLs in the meantime, the V8 Sandbox — a lightweight, in-process sandbox for V8 — has now progressed to the point where it is no longer considered an experimental security feature. Starting today, the V8 Sandbox is included in Chrome’s Vulnerability Reward Program (VRP). While there are still a number of issues to resolve before it becomes a strong security boundary, the VRP inclusion is an important step in that direction. Chrome 123 could therefore be considered to be a sort of “beta” release for the sandbox. This blog post uses this opportunity to discuss the motivation behind the sandbox, show how it prevents memory corruption in V8 from spreading within the host process, and ultimately explain why it is a necessary step towards memory safety.

    Motivation

    Memory safety remains a relevant problem: all Chrome exploits caught in the wild in the last three years (2021 – 2023) started out with a memory corruption vulnerability in a Chrome renderer process that was exploited for remote code execution (RCE). Of these, 60% were vulnerabilities in V8. However, there is a catch: V8 vulnerabilities are rarely “classic” memory corruption bugs (use-after-frees, out-of-bounds accesses, etc.) but instead subtle logic issues which can in turn be exploited to corrupt memory. As such, existing memory safety solutions are, for the most part, not applicable to V8. In particular, neither switching to a memory safe language, such as Rust, nor using current or future hardware memory safety features, such as memory tagging, can help with the security challenges faced by V8 today.

    To understand why, consider a highly simplified, hypothetical JavaScript engine vulnerability: the implementation of JSArray::fizzbuzz(), which replaces values in the array that are divisible by 3 with “fizz”, divisible by 5 with “buzz”, and divisible by both 3 and 5 with “fizzbuzz”. Below is an implementation of that function in C++. JSArray::buffer_ can be thought of as a JSValue*, that is, a pointer to an array of JavaScript values, and JSArray::length_ contains the current size of that buffer.

    for (int index = 0; index < length_; index++) {
      JSValue js_value = buffer_[index];
      int value = ToNumber(js_value).int_value();
      if (value % 15 == 0)
        buffer_[index] = JSString("fizzbuzz");
      else if (value % 5 == 0)
        buffer_[index] = JSString("buzz");
      else if (value % 3 == 0)
        buffer_[index] = JSString("fizz");
    }

    Seems simple enough? However, there’s a somewhat subtle bug here: the ToNumber conversion in line 3 can have side effects as it may invoke user-defined JavaScript callbacks. Such a callback could then shrink the array, thereby causing an out-of-bounds write afterwards. The following JavaScript code would likely cause memory corruption:

    let array = new Array(100);
    let evil = { [Symbol.toPrimitive]() { array.length = 1; return 15; } };
    array.push(evil);
    // At index 100, the @@toPrimitive callback of |evil| is invoked in
    // line 3 above, shrinking the array to length 1 and reallocating its
    // backing buffer. The subsequent write (line 5) goes out-of-bounds.
    array.fizzbuzz();

    Note that this vulnerability could occur both in hand-written runtime code (as in the example above) or in machine code generated at runtime by an optimizing just-in-time (JIT) compiler (if the function was implemented in JavaScript instead). In the former case, the programmer would conclude that an explicit bounds-check for the store operations is not necessary as that index has just been accessed. In the latter case, it would be the compiler drawing the same incorrect conclusion during one of its optimization passes (for example redundancy elimination or bounds-check elimination) because it doesn’t model the side effects of ToNumber() correctly.

    While this is an artificially simple bug (this specific bug pattern has become mostly extinct by now due to improvements in fuzzers, developer awareness, and researcher attention), it is still useful to understand why vulnerabilities in modern JavaScript engines are difficult to mitigate in a generic way. Consider the approach of using a memory safe language such as Rust, where it is the compiler’s responsibility to guarantee memory safety. In the above example, a memory safe language would likely prevent this bug in the hand-written runtime code used by the interpreter. However, it would not prevent the bug in any just-in-time compiler as the bug there would be a logic issue, not a “classic” memory corruption vulnerability. Only the code generated by the compiler would actually cause any memory corruption. Fundamentally, the issue is that memory safety cannot be guaranteed by the compiler if a compiler is directly part of the attack surface.

    Similarly, disabling the JIT compilers would also only be a partial solution: historically, roughly half of the bugs discovered and exploited in V8 affected one of its compilers while the rest were in other components such as runtime functions, the interpreter, the garbage collector, or the parser. Using a memory-safe language for these components and removing JIT compilers could work, but would significantly reduce the engine’s performance (ranging, depending on the type of workload, from 1.5–10× or more for computationally intensive tasks).

    Now consider instead popular hardware security mechanisms, in particular memory tagging. There are a number of reasons why memory tagging would similarly not be an effective solution. For example, CPU side channels, which can easily be exploited from JavaScript, could be abused to leak tag values, thereby allowing an attacker to bypass the mitigation. Furthermore, due to pointer compression, there is currently no space for the tag bits in V8’s pointers. As such, the entire heap region would have to be tagged with the same tag, making it impossible to detect inter-object corruption. As such, while memory tagging can be very effective on certain attack surfaces, it is unlikely to represent much of a hurdle for attackers in the case of JavaScript engines.

    In summary, modern JavaScript engines tend to contain complex, 2nd-order logic bugs which provide powerful exploitation primitives. These cannot be effectively protected by the same techniques used for typical memory-corruption vulnerabilities. However, nearly all vulnerabilities found and exploited in V8 today have one thing in common: the eventual memory corruption necessarily happens inside the V8 heap because the compiler and runtime (almost) exclusively operate on V8 HeapObject instances. This is where the sandbox comes into play.

    The V8 (Heap) Sandbox

    The basic idea behind the sandbox is to isolate V8’s (heap) memory such that any memory corruption there cannot “spread” to other parts of the process’ memory.

    As a motivating example for the sandbox design, consider the separation of user- and kernel space in modern operating systems. Historically, all applications and the operating system’s kernel would share the same (physical) memory address space. As such, any memory error in a user application could bring down the whole system by, for example, corrupting kernel memory. On the other hand, in a modern operating system, each userland application has its own dedicated (virtual) address space. As such, any memory error is limited to the application itself, and the rest of the system is protected. In other words, a faulty application can crash itself but not affect the rest of the system. Similarly, the V8 Sandbox attempts to isolate the untrusted JavaScript/WebAssembly code executed by V8 such that a bug in V8 does not affect the rest of the hosting process.

    In principle, the sandbox could be implemented with hardware support: similar to the userland-kernel split, V8 would execute some mode-switching instruction when entering or leaving sandboxed code, which would cause the CPU to be unable to access out-of-sandbox memory. In practice, no suitable hardware feature is available today, and the current sandbox is therefore implemented purely in software.

    The basic idea behind the software-based sandbox is to replace all data types that can access out-of-sandbox memory with “sandbox-compatible” alternatives. In particular, all pointers (both to objects on the V8 heap or elsewhere in memory) and 64-bit sizes must be removed as an attacker could corrupt them to subsequently access other memory in the process. This implies that memory regions such as the stack cannot be inside the sandbox as they must contain pointers (for example return addresses) due to hardware and OS constraints. As such, with the software-based sandbox, only the V8 heap is inside the sandbox, and the overall construction is therefore not unlike the sandboxing model used by WebAssembly.

    To understand how this works in practice, it is useful to look at the steps an exploit has to perform after corrupting memory. The goal of an RCE exploit would typically be to perform a privilege escalation attack, for example by executing shellcode or performing a return-oriented programming (ROP)-style attack. For either of these, the exploit will first want the ability to read and write arbitrary memory in the process, for example to then corrupt a function pointer or place a ROP-payload somewhere in memory and pivot to it. Given a bug that corrupts memory on the V8 heap, an attacker would therefore look for an object such as the following:

    class JSArrayBuffer: public JSObject {
      private:
        byte* buffer_;
        size_t size_;
    };

    Given this, the attacker would then either corrupt the buffer pointer or the size value to construct an arbitrary read/write primitive. This is the step that the sandbox aims to prevent. In particular, with the sandbox enabled, and assuming that the referenced buffer is located inside the sandbox, the above object would now become:

    class JSArrayBuffer: public JSObject {
      private:
        sandbox_ptr_t buffer_;
        sandbox_size_t size_;
    };

    Where sandbox_ptr_t is a 40-bit offset (in the case of a 1TB sandbox) from the base of the sandbox. Similarly, sandbox_size_t is a “sandbox-compatible” size, currently limited to 32GB.
    Alternatively, if the referenced buffer was located outside of the sandbox, the object would instead become:

    class JSArrayBuffer: public JSObject {
      private:
        external_ptr_t buffer_;
    };

    Here, an external_ptr_t references the buffer (and its size) through a pointer table indirection (not unlike the file descriptor table of a unix kernel or a WebAssembly.Table) which provides memory safety guarantees.

    In both cases, an attacker would find themselves unable to “reach out” of the sandbox into other parts of the address space. Instead, they would first need an additional vulnerability: a V8 Sandbox bypass. The following image summarizes the high-level design, and the interested reader can find more technical details about the sandbox in the design documents linked from src/sandbox/README.md.

    A high-level diagram of the sandbox design

    Solely converting pointers and sizes to a different representation is not quite sufficient in an application as complex as V8 and there are a number of other issues that need to be fixed. For example, with the introduction of the sandbox, code such as the following suddenly becomes problematic:

    std::vector<std::string> JSObject::GetPropertyNames() {
        int num_properties = TotalNumberOfProperties();
        std::vector<std::string> properties(num_properties);
    
        for (int i = 0; i < NumberOfInObjectProperties(); i++) {
            properties[i] = GetNameOfInObjectProperty(i);
        }
    
        // Deal with the other types of properties
        // ...

    This code makes the (reasonable) assumption that the number of properties stored directly in a JSObject must be less than the total number of properties of that object. However, assuming these numbers are simply stored as integers somewhere in the JSObject, an attacker could corrupt one of them to break this invariant. Subsequently, the access into the (out-of-sandbox) std::vector would go out of bounds. Adding an explicit bounds check, for example with an SBXCHECK, would fix this.

    Encouragingly, nearly all “sandbox violations” discovered so far are like this: trivial (1st order) memory corruption bugs such as use-after-frees or out-of-bounds accesses due to lack of a bounds check. Contrary to the 2nd order vulnerabilities typically found in V8, these sandbox bugs could actually be prevented or mitigated by the approaches discussed earlier. In fact, the particular bug above would already be mitigated today due to Chrome’s libc++ hardening. As such, the hope is that in the long run, the sandbox becomes a more defensible security boundary than V8 itself. While the currently available data set of sandbox bugs is very limited, the VRP integration launching today will hopefully help produce a clearer picture of the type of vulnerabilities encountered on the sandbox attack surface.

    Performance

    One major advantage of this approach is that it is fundamentally cheap: the overhead caused by the sandbox comes mostly from the pointer table indirection for external objects (costing roughly one additional memory load) and to a lesser extent from the use of offsets instead of raw pointers (costing mostly just a shift+add operation, which is very cheap). The current overhead of the sandbox is therefore only around 1% or less on typical workloads (measured using the Speedometer and JetStream benchmark suites). This allows the V8 Sandbox to be enabled by default on compatible platforms.

    Testing

    A desirable feature for any security boundary is testability: the ability to manually and automatically test that the promised security guarantees actually hold in practice. This requires a clear attacker model, a way to “emulate” an attacker, and ideally a way of automatically determining when the security boundary has failed. The V8 Sandbox fulfills all of these requirements:

    1. A clear attacker model: it is assumed that an attacker can read and write arbitrarily inside the V8 Sandbox. The goal is to prevent memory corruption outside of the sandbox.
    2. A way to emulate an attacker: V8 provides a “memory corruption API” when built with the v8_enable_memory_corruption_api = true flag. This emulates the primitives obtained from typical V8 vulnerabilities and in particular provides full read- and write access inside the sandbox.
    3. A way to detect “sandbox violations”: V8 provides a “sandbox testing” mode (enabled via either --sandbox-testing or --sandbox-fuzzing) which installs a signal handler that determines if a signal such as SIGSEGV represents a violation of the sandbox’s security guarantees.

    Ultimately, this allows the sandbox to be integrated into Chrome’s VRP program and be fuzzed by specialized fuzzers.
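    As a rough illustration of point 2, a test session in the d8 shell might look like the sketch below. This is not an exact transcript of the API: the global Sandbox object and its getAddressOf/MemoryView helpers follow the names used in V8's sandbox documentation, but they are only available in builds with the memory corruption API enabled and may differ between versions.

    // Sketch only: emulating an attacker-controlled write inside the sandbox
    // (d8 built with v8_enable_memory_corruption_api = true; names may vary by V8 version).
    const victim = { x: 42 };
    const addr = Sandbox.getAddressOf(victim);               // offset of |victim| inside the sandbox
    const mem = new DataView(new Sandbox.MemoryView(addr, 0x100));
    mem.setUint32(0, 0x41414141, true);                      // corrupt the object's first field
    // Any resulting memory corruption outside the sandbox would count as a sandbox violation.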

    Usage

    The V8 Sandbox must be enabled/disabled at build time using the v8_enable_sandbox build flag. It is (for technical reasons) not possible to enable/disable the sandbox at runtime. The V8 Sandbox requires a 64-bit system as it needs to reserve a large amount of virtual address space, currently one terabyte.

    The V8 Sandbox has already been enabled by default on 64-bit (specifically x64 and arm64) versions of Chrome on Android, ChromeOS, Linux, macOS, and Windows for roughly the last two years. Even though the sandbox was (and still is) not feature complete, this was mainly done to ensure that it does not cause stability issues and to collect real-world performance statistics. Consequently, recent V8 exploits already had to work their way past the sandbox, providing helpful early feedback on its security properties.

    Conclusion

    The V8 Sandbox is a new security mechanism designed to prevent memory corruption in V8 from impacting other memory in the process. The sandbox is motivated by the fact that current memory safety technologies are largely inapplicable to optimizing JavaScript engines. While these technologies fail to prevent memory corruption in V8 itself, they can in fact protect the V8 Sandbox attack surface. The sandbox is therefore a necessary step towards memory safety.