vmbench Formal VM Benchmark

MCP • Open Source • MIT • 25.0

by kroq86 • Uncategorized

An MCP server providing a formal VM benchmark, dataset generation, and inspectable runtime for evaluating language models on synthetic execution tasks.

Example Use Cases

1. Benchmark language models on formal machine execution semantics.
2. Generate and evaluate synthetic execution datasets with inspectable reasoning.
3. Export supervised fine-tuning datasets from benchmark results.

Description

vmbench provides a toy instruction set architecture (ISA) with a reference VM, synthetic dataset generation, bounded evaluation, and an inspectable search runtime with verification and budget control. Users can run benchmarks, inspect reasoning steps, compare policies, and export data for supervised fine-tuning. The system exposes how a model executes and verifies each task instead of returning black-box outputs, which supports research into how closely models follow formal machine semantics.
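To make the idea of a reference VM with bounded evaluation and verification concrete, here is a minimal sketch. It is not vmbench's actual ISA or API: the opcodes (`LOAD`, `ADD`, `DEC`, `JNZ`, `HALT`), the register names, and the functions `step`, `run`, and `verify` are all hypothetical, chosen only to illustrate the general technique of executing a program under a step budget and checking a predicted final state against the reference result.

```python
from dataclasses import dataclass, field

# Hypothetical toy ISA for illustration only; vmbench's real instruction
# set and runtime interface may look quite different.
# Each instruction is a tuple: (opcode, *operands).


@dataclass
class VMState:
    pc: int = 0
    regs: dict = field(default_factory=lambda: {"r0": 0, "r1": 0})
    halted: bool = False


def step(state, program):
    """Execute exactly one instruction, mutating the state in place."""
    op, *args = program[state.pc]
    if op == "LOAD":        # LOAD reg, imm   -> reg = imm
        state.regs[args[0]] = args[1]
        state.pc += 1
    elif op == "ADD":       # ADD dst, src    -> dst = dst + src
        state.regs[args[0]] += state.regs[args[1]]
        state.pc += 1
    elif op == "DEC":       # DEC reg         -> reg = reg - 1
        state.regs[args[0]] -= 1
        state.pc += 1
    elif op == "JNZ":       # JNZ reg, target -> jump if reg != 0
        state.pc = args[1] if state.regs[args[0]] != 0 else state.pc + 1
    elif op == "HALT":
        state.halted = True
    else:
        raise ValueError(f"unknown opcode: {op}")


def run(program, max_steps):
    """Run under a step budget; return the final state and an inspectable trace."""
    state = VMState()
    trace = []
    for _ in range(max_steps):
        if state.halted:
            break
        step(state, program)
        trace.append((state.pc, dict(state.regs)))
    return state, trace


def verify(program, predicted_regs, max_steps):
    """Accept a prediction only if the reference run halts within budget and matches."""
    state, _ = run(program, max_steps)
    return state.halted and state.regs == predicted_regs


# Sums 3 + 2 + 1 into r0 using a countdown loop in r1.
PROGRAM = [
    ("LOAD", "r1", 3),
    ("ADD", "r0", "r1"),
    ("DEC", "r1"),
    ("JNZ", "r1", 1),
    ("HALT",),
]

if __name__ == "__main__":
    final_state, trace = run(PROGRAM, max_steps=50)
    print(final_state.regs, len(trace))
    print(verify(PROGRAM, {"r0": 6, "r1": 0}, max_steps=50))
```

In this sketch a model's answer would be the predicted register file, and the trace gives a step-by-step record that can be compared against the model's stated reasoning. The budget matters: a run that exhausts `max_steps` before reaching `HALT` is rejected even if the registers happen to match.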

Quick Stats

Service Type: MCP

Pricing Model: Free

Capabilities: 0 Tools / 0 Prompts / 0 Resources

Owner: kroq86

Category: Uncategorized
