
Anthropic's Claude AI Overtakes ChatGPT on the Chatbot Arena Leaderboard - Decrypt


While ChatGPT from OpenAI enjoys the highest mindshare of all generative AI tools, the top-of-the-line Claude 3 Opus from perennial contender Anthropic has taken its top spot on a popular crowdsourced leaderboard used by AI researchers.

Claude's rise in the Chatbot Arena rankings marks the first time that OpenAI's GPT-4, which powers ChatGPT Plus, has been dethroned since it first appeared on the leaderboard in May of last year.

Chatbot Arena is run by the Large Model Systems Organization (LMSYS ORG), a research organization dedicated to open models that supports collaboration between students and faculty at the University of California, Berkeley, UC San Diego, and Carnegie Mellon University. The platform presents users with two unlabeled language models and asks them to judge which one performs better, based on whatever criteria they see fit.

After aggregating thousands of subjective comparisons, Chatbot Arena calculates the "best" models for the leaderboard and updates it over time.

This subjective approach, based on the variety of human tastes, is what sets Chatbot Arena apart from other AI benchmarks. Model trainers can't "cheat" by tailoring their models to beat the algorithm, as they might with quantitative benchmarks. By measuring simply what people prefer, Chatbot Arena is a valuable qualitative resource for AI researchers.

The platform collects user votes and runs them through the Bradley-Terry statistical model to predict the likelihood of a particular model outperforming another in a head-to-head matchup. This approach makes it possible to produce comprehensive statistics, including confidence intervals for the estimated Elo ratings, the same rating technique used to measure the skill of chess players.
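To make the idea concrete, here is a minimal sketch of how pairwise votes could be turned into Bradley-Terry strengths and then mapped onto an Elo-like scale. It is not LMSYS's actual pipeline: the model names, the vote counts, the function names, and the base rating of 1000 are all made up for illustration, and the fit uses a simple textbook iterative update rather than whatever solver the leaderboard relies on. The sketch also assumes there are no ties and that every model wins at least one comparison.

```python
import math
from collections import defaultdict


def fit_bradley_terry(votes, iterations=200):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Uses the classic minorization-maximization update. Assumes every
    model has at least one win, otherwise its strength collapses to 0.
    """
    models = sorted({m for pair in votes for m in pair})
    wins = defaultdict(int)   # total wins per model
    games = defaultdict(int)  # comparisons per unordered pair of models
    for winner, loser in votes:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1

    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for i in models:
            denom = sum(
                games[frozenset((i, j))] / (strength[i] + strength[j])
                for j in models
                if j != i
            )
            updated[i] = wins[i] / denom if denom > 0 else strength[i]
        # Rescale so the geometric mean of the strengths is 1.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}
    return strength


def win_probability(strength, a, b):
    """Bradley-Terry probability that model a is preferred over model b."""
    return strength[a] / (strength[a] + strength[b])


def to_elo_scale(strength, base=1000.0):
    """Map strengths to an Elo-like scale (400 points per factor of 10)."""
    return {m: base + 400.0 * math.log10(s) for m, s in strength.items()}


# Hypothetical toy votes: each tuple is (preferred model, other model).
votes = (
    [("model_a", "model_b")] * 6 + [("model_b", "model_a")] * 4
    + [("model_a", "model_c")] * 7 + [("model_c", "model_a")] * 3
    + [("model_b", "model_c")] * 6 + [("model_c", "model_b")] * 4
)

strengths = fit_bradley_terry(votes)
ratings = to_elo_scale(strengths)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.0f}")
print(f"P(model_a beats model_c) = {win_probability(strengths, 'model_a', 'model_c'):.2f}")
```

On this scale, a rating gap between two models corresponds directly to a predicted win probability, which is why the leaderboard can be read like a chess rating list. Confidence intervals such as the ones the article mentions could be approximated by refitting on resampled votes, which this sketch leaves out.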

The top 10 LLMs as ranked by Chatbot Arena. Image: Huggingface

Claude 3 Opus's rise to the top is not the only significant development on the leaderboard. Claude 3 Sonnet (the mid-sized model that is available for free) and Claude 3 Haiku (a smaller, faster model), both also developed by Anthropic, currently hold the 4th and 6th positions, respectively.

The leaderboard includes several versions of GPT-4, such as GPT-4-0314 (the "original" version of GPT-4 from March 2023), GPT-4-0613, GPT-4-1106-preview, and GPT-4-0125-preview (the latest GPT-4 Turbo model, available via API since January 2024). According to the rankings, Sonnet and Haiku both outperform the original GPT-4, and Sonnet also surpasses the tweaked version that OpenAI released in June 2023.

This also means that, unfortunately, there is currently only one open-source LLM in the top 10: Qwen, with Starling 7b and Mixtral 8x7B being the only other open models in the top 20.

One of Claude's advantages over GPT-4 is its token context capacity and its retrieval capabilities. The public version of Claude 3 Opus handles over 200K tokens, and the team claims to have a restricted version capable of handling 1 million tokens with near-perfect retrieval rates. This means that Claude can understand longer prompts and retain information more effectively than GPT-4 Turbo, which handles 128K tokens and loses retrieval accuracy with long prompts.

Recall accuracy of Claude 3 Opus vs GPT-4 Turbo. Image by Decrypt using data from Anthropic and Greg Kamradt.

Google's Gemini Advanced has also been gaining traction in the AI assistant space. The company offers a plan that includes 2TB of storage and AI capabilities across Google's suite of products for the same price as a ChatGPT Plus subscription ($20 per month).

The free Gemini Pro is currently ranked number 4, between GPT-4 Turbo and Claude 3 Sonnet. The top-of-the-line Gemini Ultra model is not available for testing and does not yet appear in the rankings.

Edited by Ryan Ozawa.
