Pervasive LLM Hallucinations Expand Code Developer Attack Surface

Software developers' use of large language models (LLMs) gives attackers a bigger opportunity than previously thought to distribute malicious packages to development environments, according to recently released research.

The study, from LLM security vendor Lasso Security, is a follow-up to a report last year on the potential for attackers to abuse LLMs' tendency to hallucinate, that is, to generate seemingly plausible but factually ungrounded results in response to user input.

AI Package Hallucination

The previous study focused on ChatGPT's tendency to fabricate the names of code libraries, among other fabrications, when software developers asked the AI-enabled chatbot for help in a development environment. In other words, the chatbot sometimes produced links to packages that do not exist on public code repositories when a developer asked it to suggest packages to use in a project.

Security researcher Bar Lanyado, author of the study and now at Lasso Security, found that attackers could easily drop an actual malicious package at the location ChatGPT points to and give it the same name as the hallucinated package. Any developer who downloads the package based on ChatGPT's recommendation could then end up introducing malware into their development environment.

Lanyado's follow-up research examined how pervasive the package hallucination problem is across four different large language models: GPT-3.5-Turbo, GPT-4, Gemini Pro (formerly Bard), and Coral (Cohere). He also tested each model's proclivity to generate hallucinated packages across different programming languages and how often the models generated the same hallucinated package.

For the tests, Lanyado compiled a list of thousands of "how to" questions that developers in different programming environments (Python, Node.js, Go, .NET, and Ruby) most commonly bring to LLMs. Lanyado then asked each model a coding-related question along with a request for a package recommendation related to the question. He also asked each model to recommend 10 more packages to solve the same problem.
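
The article does not describe how package names were pulled out of each model's answer. As a minimal sketch of one plausible approach, assuming recommendations show up as install commands on their own line or inside backticks, the snippet below collects names that follow `pip install`, `npm install`, or `gem install`. The function name `extract_packages` and the patterns are hypothetical and not taken from the study.

```python
import re

# Hypothetical patterns: the article does not say how package names were
# extracted from model responses; install commands are one plausible source.
INSTALL_PATTERNS = [
    re.compile(r"\bpip3?\s+install\s+([^\n`]+)"),
    re.compile(r"\bnpm\s+(?:install|i)\s+([^\n`]+)"),
    re.compile(r"\bgem\s+install\s+([^\n`]+)"),
]


def extract_packages(response):
    """Return the set of package names found after install commands."""
    names = set()
    for pattern in INSTALL_PATTERNS:
        for match in pattern.finditer(response):
            for token in match.group(1).split():
                if token.startswith("-"):
                    continue  # skip flags such as -U or --save
                token = token.strip(".,;:()\"'")
                # Drop version pins such as requests==2.31 or express@4.
                # Limitation: scoped npm names (@scope/name) are skipped.
                name = re.split(r"[=<>!~@]", token, maxsplit=1)[0]
                if name:
                    names.add(name.lower())
    return names


reply = "Try:\npip install -U requests==2.31 huggingface-cli\nor `npm install express@4`."
print(sorted(extract_packages(reply)))
```

A command embedded in running prose without backticks would pull in ordinary words as false positives, so a real pipeline would need stricter parsing.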

Repetitive Results

The results were troubling. A startling 64.5% of the "conversations" Lanyado had with Gemini generated hallucinated packages. With Coral, that number was 29.1%; other LLMs such as GPT-4 (24.2%) and GPT-3.5 (22.5%) did not fare much better.

When Lanyado asked each model the same set of questions 100 times to see how often the models would hallucinate the same packages, he found the repetition rates to be eyebrow-raising as well. Cohere, for instance, produced the same hallucinated packages more than 24% of the time; GPT-3.5 and Gemini did so around 14% of the time, and GPT-4 20%. In several instances, different models hallucinated the same or similar packages. The highest number of such cross-model hallucinations occurred between GPT-3.5 and Gemini.
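
The article reports these percentages without giving the exact formulas behind them. The sketch below is one reasonable reading, not the study's method: it assumes a hallucination rate is the share of conversations that recommended at least one package missing from a known-package list, and that a repeated hallucination is an unknown package that shows up in more than one run of the same question. The function names, the `min_runs` parameter, and the sample data are all hypothetical.

```python
from collections import Counter


def hallucination_rate(conversations, known_packages):
    """Share of conversations recommending at least one unknown package."""
    if not conversations:
        return 0.0
    flagged = sum(1 for pkgs in conversations if set(pkgs) - known_packages)
    return flagged / len(conversations)


def repeated_hallucinations(runs, known_packages, min_runs=2):
    """Map each unknown package to the number of runs that produced it,
    keeping only packages seen in at least `min_runs` runs."""
    counts = Counter()
    for pkgs in runs:
        counts.update(set(pkgs) - known_packages)  # count once per run
    return {name: n for name, n in counts.items() if n >= min_runs}


# Made-up data for illustration only.
known = {"requests", "flask"}
conversations = [["requests"], ["requests", "fake-a"], ["fake-b"], ["flask"]]
print(hallucination_rate(conversations, known))  # 0.5

runs = [["fake-a"], ["fake-a", "requests"], ["fake-b"]]
print(repeated_hallucinations(runs, known))  # {'fake-a': 2}
```

In practice the known-package set would come from the registry for the language in question, and the definition of "repeated" would need to match whatever threshold the analyst chooses.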

Lanyado says that even if different developers asked an LLM about the same topic but phrased their questions differently, there is a likelihood the LLM would recommend the same hallucinated package in each case. In other words, any developer using an LLM for coding assistance would likely encounter many of the same hallucinated packages.

"The question could be totally different but on a similar subject, and the hallucination would still happen, making this technique very effective," Lanyado says. "In the current research, we received 'repeating packages' for many different questions and subjects and even across different models, which increases the probability of these hallucinated packages to be used."

Easy to Exploit

An attacker armed with the names of a few hallucinated packages, for instance, could upload packages with the same names to the appropriate repositories, knowing there is a good likelihood an LLM would point developers to them. To demonstrate that the threat is not theoretical, Lanyado took one hallucinated package called "huggingface-cli" that he encountered during his tests and uploaded an empty package with the same name to the Hugging Face repository for machine learning models. Developers downloaded that package more than 32,000 times, he says.

From a threat actor's standpoint, package hallucinations offer a relatively straightforward vector for distributing malware. "As we [saw] from the research results, it's not that hard," he says. Averaged across all the models, 35% of almost 48,000 questions produced hallucinations, Lanyado adds. GPT-3.5 had the lowest percentage of hallucinations; Gemini scored the highest, with an average repetitiveness of 18% across all four models, he notes.

Lanyado suggests that developers exercise caution when acting on package recommendations from an LLM if they are not completely sure of their accuracy. He also says that when developers encounter an unfamiliar open source package, they need to visit the package repository and examine the size of its community, its maintenance record, its known vulnerabilities, and its overall engagement rate. Developers should also scan the package thoroughly before introducing it into the development environment.
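
Part of that vetting can be roughed out in code. The sketch below is an illustration rather than a procedure from the research: it assumes a Python package and reads the project's metadata from PyPI's JSON endpoint (`https://pypi.org/pypi/<name>/json`), then flags a few simple signals such as a very recent first upload, few releases, missing project links, or listed vulnerabilities. The function names, the thresholds, and the reliance on the `releases`, `info`, and `vulnerabilities` fields are assumptions, and none of this measures community size or replaces scanning the package contents.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone


def fetch_pypi_metadata(name):
    """Return PyPI's JSON metadata for `name`, or None if it does not exist."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def warning_signs(meta, now=None, min_age_days=90, min_releases=3):
    """List reasons to look more closely before installing a package.

    The thresholds are arbitrary illustrative defaults, not recommendations.
    """
    if meta is None:
        return ["package does not exist on the index"]
    now = now or datetime.now(timezone.utc)
    signs = []
    releases = meta.get("releases", {})
    uploads = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in releases.values()
        for f in files
    ]
    if not uploads:
        signs.append("no uploaded files")
    elif (now - min(uploads)).days < min_age_days:
        signs.append("first upload is recent")
    if len(releases) < min_releases:
        signs.append("few releases")
    info = meta.get("info", {})
    if not info.get("project_urls") and not info.get("home_page"):
        signs.append("no homepage or source link")
    if meta.get("vulnerabilities"):
        signs.append("known vulnerabilities listed")
    return signs
```

A typical call would be `warning_signs(fetch_pypi_metadata("some-package"))`, where `some-package` is a placeholder for the recommended name. An empty list means only that none of these crude checks fired, not that the package is safe.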
