o1 (openai/o1-2024-12-17)

The o1 alias points to the most recent snapshot of the model: o1-2024-12-17.

Release Date: 12/17/2024

Avg. Accuracy: 74.1%
Latency: 68.59s

Performance by Benchmark

Benchmark       Accuracy   Ranking
FinanceAgent    20.8%      15 / 29
CaseLaw         81.9%      21 / 63
ContractLaw     72.7%      5 / 69
TaxEval         78.6%      4 / 50
Math500         90.4%      13 / 46
AIME            71.5%      10 / 40
MGSM            88.9%      27 / 44
LegalBench      77.6%      32 / 68
MedQA           96.5%      1 / 48
GPQA            73.0%      10 / 41
MMLU Pro        83.5%      6 / 41
LiveCodeBench   50.3%      20 / 41
MMMU            77.7%      4 / 27

The list above includes both academic benchmarks and proprietary benchmarks (contact us to get access to the proprietary ones).

Cost Analysis

Input Cost: $15.00 / M Tokens
Output Cost: $60.00 / M Tokens
Input Cost (per char): N/A
Output Cost (per char): N/A

Overview

OpenAI o1 is the production-ready version of their previously released o1-preview. OpenAI claims it generally has lower latency than its predecessor. Unlike o1-preview, it also supports system prompts and temperature.

It also supports a new “reasoning effort” parameter, which lets users control how much reasoning the model performs before responding.
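As a minimal sketch of how this looks in practice with the OpenAI Python SDK (the prompt text is illustrative, and the exact role name for system-style instructions may vary by SDK version):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # reasoning_effort accepts "low", "medium" (the default), or "high";
    # higher values let the model spend more reasoning tokens before answering.
    response = client.chat.completions.create(
        model="o1-2024-12-17",
        reasoning_effort="high",
        messages=[
            # o1, unlike o1-preview, accepts system-style instructions;
            # recent SDK versions expose these via the "developer" role.
            {"role": "developer", "content": "You are a concise financial analyst."},
            {"role": "user", "content": "Summarize the key risks in this filing: ..."},
        ],
    )
    print(response.choices[0].message.content)

Lower effort settings trade some answer quality for reduced latency and output-token spend, since reasoning tokens are billed as output tokens.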

Key Specifications

  • Context Window: 200,000 tokens
  • Output Limit: 100,000 tokens
  • Training Cutoff: October 2023
  • Pricing:
    • Input: $15.00 per million tokens
    • Cached Input: $7.50 per million tokens
    • Output: $60.00 per million tokens
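To make the rate structure concrete, here is a back-of-the-envelope sketch (the request_cost helper is hypothetical, not part of any SDK) that estimates the dollar cost of a single request from its token counts, assuming the prices listed above:

    # Published o1 rates in dollars per million tokens (from the list above).
    INPUT_RATE = 15.00
    CACHED_INPUT_RATE = 7.50
    OUTPUT_RATE = 60.00

    def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
        """Estimate the dollar cost of one o1 request.

        `input_tokens` counts only uncached input; cached input is passed
        separately. Reasoning tokens are billed as output, so count them
        in `output_tokens`.
        """
        return (
            input_tokens * INPUT_RATE
            + cached_tokens * CACHED_INPUT_RATE
            + output_tokens * OUTPUT_RATE
        ) / 1_000_000

    # 10k uncached input + 5k cached input + 2k output (incl. reasoning):
    # (10_000 * 15 + 5_000 * 7.50 + 2_000 * 60) / 1e6 = $0.3075
    print(f"${request_cost(10_000, 5_000, 2_000):.4f}")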