Fast visual discovery for photos, concepts, and creative inspiration.

© 2026 Mungart. All rights reserved.

…

Gpt4 Humaneval

GPT4 reported HumanEval base significantly higher than OpenAI’s ...

⚫No, Gemini isn't better at HumanEval than GPT 4 ⚫Gemini was only ...

Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B - Bens Bites

WizardCoder-34B surpasses GPT-4, ChatGPT-3.5 and Claude-2 on HumanEval ...

HumanEval leaderboard got updated with GPT-4 Turbo : r/LocalLLaMA

Gpt4 comparison to anthropic Opus on benchmarks - Community - OpenAI ...

Grok 1.5 released! HumanEval ranking surpasses GPT-4 - Zhihu

Grok 1.5 now beats GPT-4 (2023) in HumanEval (code generation ...

Anthropic's CodeLlama-34B Surpasses GPT-4 on HumanEval

I ran humaneval (base and plus) on the new GPT-4-Turbo-2024-04-09, and ...

GPT-4 Technical Report, translated by GPT4 and Human Feedback - Zhihu

Table I from HumanEval on Latest GPT Models - 2024 | Semantic Scholar

HumanEval - Zhihu

Blog | FLAML

What Is GPT-4o Mini? How It Works, Use Cases, API & More | DataCamp

One day after release, Code Llama's coding ability leaps ahead: fine-tuned version's HumanEval score surpasses GPT-4 - Zhihu

OpenCodeInterpreter: Integrating Code Generation with Execution and ...

GPT-4 and human tests (Mar/2023) : r/GPT4

Average ranks of generated texts in terms of human evaluation and GPT-4 ...

Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B | GeekNews

GPT-4 becomes 30% more accurate when asked to critique itself

GitHub - daniel442li/gpt-human-eval: Runs gpt-4 or any other OpenAI ...

Vinija's Notes • Primers • Agents

Finding the smartest AI: a complete guide to large-model evaluation and benchmarking – 天天悦读

Llama 3 vs. GPT-4 vs. GPT-4o: Which is Best? | Neoteric

Just one day after Code Llama was released, a fine-tuned model that surpasses GPT-4 has been released. 😱 GPT-4 ...

How does L2MAC compare against AutoGPT, GPT-4 and existing methods? | L2MAC

JiaweiGuo123/Alpaca-gpt4-English-with-humaneval-structure-similarity at ...

HumanEval: A Benchmark for Evaluating LLM Code Generation Capabilities ...

(PDF) GPT-4 vs. GPT-3.5: A Concise Showdown

PPT - What is GPT-4 and What New Changes it Brings? PowerPoint ...

One day after release, Code Llama's coding ability leaps ahead: fine-tuned version's HumanEval score surpasses GPT-4 - 51CTO.COM

Yang Liu on Twitter: "In G-Eval, we proposed the idea of using GPT-4 as ...

Language Agent Tree Search achieves SOTA at 94.4% for programming on ...

[Paper] BAAI CodeGeeX + HumanEval benchmark_humanevalx - CSDN Blog

"Sparks of Artificial General Intelligence: Early Experiments with GPT-4" Sparks of Artificial General Intelligence: Early ...

GPT-4 Technical Report - Zhihu

GPT-4 introduction_gpt4 introduction - CSDN Blog

Phind AI, the leading developer-focused model surpassing ChatGPT 4 ...

AI-assisted coding: Experiments with GPT-4

LLM Evaluation, Part 1: HumanEval+ - Zhihu

GPT-4 official technical report (translation) - Zhihu

HumanEval score close to GPT-4-Turbo! Alibaba open-sources CodeQwen1.5-7B, a 7-billion-parameter coding model! | DataLearnerAI

HumanEval score close to GPT-4-Turbo! Alibaba open-sources CodeQwen1.5-7B, a 7-billion-parameter coding model! - Zhihu

This is an interesting paper in general, but this picture is worth 1000 ...

GitHub - hisirlab/GPT4o-ClinicalEval: Evaluating ChatGPT-4o for ...

Microsoft's Chat GPT-4 vs Google's Gemini Ultra performance comparison: in the AI showdown, who ...

Performance of GPT-4 and smaller models. The metric is mean log pass ...

AutoCoder: The First Large Language Model to Surpass GPT-4 Turbo (April ...

(PDF) GPT-4V exhibits human-like performance in biomedical image ...

[2303.08774] GPT-4 Technical Report

GPT-4 official technical report (translation) - Juejin

Unlocking the Power of GPT-4: A Guide to Using the API

GPT-4o Guide: How it Works, Use Cases, Pricing, Benchmarks | DataCamp

GPT-4: Complete Guide, Benchmarks & Review 2026

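Paper analysis | Igniting the sparks of artificial general intelligence: early experiments with GPT-4 (includes 154-page Chinese PDF download) - BAAI Community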

[Japanese translation] GPT-4 Technical Report [OpenAI]

(PDF) G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment

Musk suddenly releases Grok 1.5! Context length up to 128k, HumanEval score surpasses GPT-4 - Zhihu

GPT-4 shows comparable performance to human examiners in ranking open ...

GPT-4 technical documentation - Zhihu

Large-Scale Validation of the Feasibility of GPT-4 as a Proofreading ...

GPT-4 Technical Report | AI前沿分享

Self-collaboration-Code-Generation/humaneval_output_gpt-4-0613.jsonl at ...

(PDF) Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning ...

cchoi1/eval_humaneval_att_qwen7b_sol_gpt-4o-mini · Datasets at Hugging Face

What is GPT-4? Here's everything you need to know

Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks ...

AGI frontier: a quick overview of academic progress on large models after GPT-4 - Zhihu

GPT-4 Explained and Exemplified: Eleven Ways It Might Blow Your Mind ...

ConTextual

Decisively beats GPT-4, crushes closed-source models! Mysterious Code Llama version revealed - BAAI Community

OpenAI Launches GPT-4, a New AI Model with Human-Level Capabilities

ChatGPT made simple: a look into the underlying technical principles

[PDF] GPT-4 Technical Report | Semantic Scholar

Paper reading_GPT-4 - Zhihu

GPT-4's accuracy improves by 30% when asked to critique itself - AI Artificial Intelligence - cnBeta.COM

Open AI's NEW INSANE GPT-4 SHOCKS The Entire Industry! (Microsoft GPT-4 ...

(PDF) A comparison of human, GPT-3.5, and GPT-4 performance in a ...

GPT-4 | Prompt Engineering Guide

10 Best LLMs in 2025: Large Language Models Reviewed

[Survey] Deep dive into AI Agent & Multi-Agent System (MAS)

Thread by @random_walker on Thread Reader App – Thread Reader App

Retrieval-augmented generation improves precision and trust of a GPT-4 ...

GPT-4 Omni (GPT-4o) — Klu

GPT-4o System Card | OpenAI

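Microsoft's 154-page research paper goes viral: the most comprehensive testing of GPT-4 revealed, said to be the first knock on the door of AGI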

Comparing humans, GPT-4, and GPT-4V on abstraction and reasoning tasks ...

The potential of Generative Pre-trained Transformer 4 (GPT-4) to ...

GPT-4: A New Milestone in Scaling Up Deep Learning | Shaped Blog

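One day after release, Code Llama's coding ability leaps ahead: fine-tuned version's HumanEval score surpasses GPT-4 - Tencent Cloud Developer Community - Tencent Cloud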

墨滴社区

GPT-4 technical documentation

华尔街见闻

GPT-4: Everything you want to know about OpenAI’s new AI model | by ...

Revolutionizing the Future: GPT-4 Bids Farewell - Fusion Chat

GPT-4 Technical Report - 穷酸秀才大草包 - cnblogs

GPT-4

openai chat GPT-4 Technical Report technical report paper - 老马啸西风 - cnblogs

Grok1.5 has been released and overtook GPT-4 on HumanEval! | Zun-Beho

'Free AI' overtakes GPT-4... worries deepen for Big Tech that spent trillions | Seoul Economic Daily

The 5 best new GPT-4 features explained - Astuce Tech

One day after release, Code Llama's coding ability leaps ahead: fine-tuned version's HumanEval score surpasses GPT-4_Tencent News
