Uplevel your data architecture with real-time streaming using Amazon Data Firehose and Snowflake | Amazon Web Services

Today's fast-paced world demands timely insights and decisions, which makes streaming data increasingly important. Streaming data is data that is generated continuously from a variety of sources. These sources, such as clickstream events, change data capture (CDC), application and service logs, and Internet of Things (IoT) data streams, are proliferating. Snowflake offers two options for bringing streaming data into its platform: Snowpipe and Snowflake Snowpipe Streaming. Snowpipe is suited to file ingestion (batching) use cases, such as loading large files from Amazon Simple Storage Service (Amazon S3) into Snowflake. Snowpipe Streaming, a newer feature released in March 2023, is suited to rowset ingestion (streaming) use cases, such as loading a continuous stream of data from Amazon Kinesis Data Streams or Amazon Managed Streaming for Apache Kafka (Amazon MSK).

Before Snowpipe Streaming, AWS customers used Snowpipe for both use cases: file ingestion and rowset ingestion. First, you ingested streaming data into Kinesis Data Streams or Amazon MSK, then used Amazon Data Firehose to aggregate and write the streams to Amazon S3, and finally used Snowpipe to load the data into Snowflake. However, this multi-step process can result in delays of up to an hour before data is available for analysis in Snowflake. It is also expensive, especially when you have many small files that Snowpipe has to upload to the Snowflake customer cluster.

To address this, Amazon Data Firehose now integrates with Snowpipe Streaming, so you can capture, transform, and deliver data streams from Kinesis Data Streams, Amazon MSK, and Firehose Direct PUT to Snowflake within seconds at low cost. With a few clicks on the Amazon Data Firehose console, you can set up a Firehose stream to deliver data to Snowflake. There are no commitments or upfront investments to use Amazon Data Firehose, and you pay only for the amount of data streamed.

Some key features of Amazon Data Firehose include:

  • Fully managed serverless service - You don't need to manage resources, and Amazon Data Firehose automatically scales to match the throughput of your data source without ongoing administration.
  • Straightforward to use with no code - You don't need to write applications.
  • Real-time data delivery - You can get data to your destinations quickly and efficiently, within seconds.
  • Integration with over 20 AWS services - Seamless integration is available for many AWS services, such as Kinesis Data Streams, Amazon MSK, Amazon VPC Flow Logs, AWS WAF logs, Amazon CloudWatch Logs, Amazon EventBridge, AWS IoT Core, and more.
  • Pay-as-you-go model - You pay only for the data volume that Amazon Data Firehose processes.
  • Connectivity - Amazon Data Firehose can connect to public or private subnets in your VPC.

This post explains how you can bring streaming data from AWS into Snowflake within seconds to perform advanced analytics. We explore a common architecture and illustrate how to set up a low-code, serverless, cost-effective solution for low-latency data streaming.

Overview of solution

The following are the steps to implement the solution to stream data from AWS to Snowflake:

  1. Create a Snowflake database, schema, and table.
  2. Create a Kinesis data stream.
  3. Create a Firehose delivery stream with Kinesis Data Streams as the source and Snowflake as its destination, using a secure private link.
  4. To test the setup, generate sample stream data from the Amazon Kinesis Data Generator (KDG) with the Kinesis data stream as the destination.
  5. Query the Snowflake table to validate the data loaded into Snowflake.

The solution is depicted in the following architecture diagram.

Prerequisites

You should have the following prerequisites:

Create a Snowflake database, schema, and table

Complete the following steps to set up your data in Snowflake:

  • Log in to your Snowflake account and create the database:
    create database adf_snf;

  • Create a schema in the new database:
    create schema adf_snf.kds_blog;

  • Create a table in the new schema:
    create or replace table adf_snf.kds_blog.iot_sensors
    (sensorId number,
    sensorType varchar,
    internetIP varchar,
    connectionTime timestamp_ntz,
    currentTemperature number
    );

Create a Kinesis data stream

Complete the following steps to create your data stream:

  • On the Kinesis Data Streams console, choose Data streams in the navigation pane.
  • Choose Create data stream.
  • For Data stream name, enter a name (for example, KDS-Demo-Stream).
  • Leave the remaining settings as default.
  • Choose Create data stream.
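
If you prefer to script this step rather than use the console, the following is a minimal sketch using the AWS SDK for Python (boto3). It is not part of the original walkthrough. The Region and the choice of on-demand capacity mode are assumptions; adjust them to match your account and the settings you would otherwise accept in the console.

```python
from typing import Any, Dict


def build_create_stream_request(stream_name: str) -> Dict[str, Any]:
    """Return keyword arguments for the Kinesis CreateStream API call."""
    return {
        "StreamName": stream_name,
        # Assumption: on-demand mode, so no shard count has to be chosen.
        "StreamModeDetails": {"StreamMode": "ON_DEMAND"},
    }


def create_stream(stream_name: str, region: str = "us-east-1") -> None:
    """Create the data stream and wait until it becomes ACTIVE."""
    import boto3  # imported lazily so the request builder has no dependencies

    kinesis = boto3.client("kinesis", region_name=region)
    kinesis.create_stream(**build_create_stream_request(stream_name))
    # The "stream_exists" waiter polls DescribeStream until the stream is ACTIVE.
    kinesis.get_waiter("stream_exists").wait(StreamName=stream_name)


if __name__ == "__main__":
    # Only prints the request; call create_stream(...) to actually create it.
    print(build_create_stream_request("KDS-Demo-Stream"))
```

Keeping the request builder separate from the API call makes it possible to review the parameters before anything is created in your account.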

Create a Firehose delivery stream

Complete the following steps to create a Firehose delivery stream with Kinesis Data Streams as the source and Snowflake as its destination:

  • On the Amazon Data Firehose console, choose Create Firehose stream.
  • For Source, choose Amazon Kinesis Data Streams.
  • For Destination, choose Snowflake.
  • For Kinesis data stream, browse to the data stream you created earlier.
  • For Firehose stream name, leave the default generated name or enter a name of your preference.
  • Under Connection settings, provide the following information to connect Amazon Data Firehose to Snowflake:
    • For Snowflake account URL, enter your Snowflake account URL.
    • For User, enter the user name generated in the prerequisites.
    • For Private key, enter the private key generated in the prerequisites. Make sure the private key is in PKCS8 format. Do not include the PEM header (the BEGIN line) or footer (the END line) as part of the private key. If the key is split across multiple lines, remove the line breaks.
    • For Role, select Use custom Snowflake role and enter the Snowflake role that has access to write to the database table.
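
The private key formatting rule above (no PEM header or footer, no line breaks) is easy to get wrong by hand. The following is a small sketch of one way to do it; the helper name `pem_to_single_line` is hypothetical, and the sample text is a placeholder, not a real key.

```python
def pem_to_single_line(pem_text: str) -> str:
    """Drop the BEGIN/END marker lines and join the base64 body into one line."""
    body_lines = []
    for line in pem_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and the "-----BEGIN ...-----" / "-----END ...-----" markers.
        if not stripped or stripped.startswith("-----"):
            continue
        body_lines.append(stripped)
    return "".join(body_lines)


if __name__ == "__main__":
    # Placeholder content only; substitute the text of your own PKCS8 key file.
    sample = "-----BEGIN PRIVATE KEY-----\nAAAA1111\nBBBB2222\n-----END PRIVATE KEY-----\n"
    print(pem_to_single_line(sample))
```

The result is the value to paste into the Private key field. Treat it as a secret and avoid writing it to logs or shell history.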

You can connect to Snowflake using public or private connectivity. If you don't provide a VPC endpoint, the default connectivity mode is public. To add the Firehose IPs to the allow list in your Snowflake network policy, refer to Choose Snowflake for Your Destination. If you're using a private link URL, obtain the VPCE ID using SYSTEM$GET_PRIVATELINK_CONFIG:

select SYSTEM$GET_PRIVATELINK_CONFIG();

This function returns a JSON representation of the Snowflake account information needed for the self-service configuration of private connectivity to the Snowflake service, as shown in the following screenshot.
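
Because the function returns JSON, the VPCE ID can be picked out programmatically instead of by eye. The sketch below assumes the output contains a `privatelink-vpce-id` key, which is what Snowflake documents for accounts on AWS; verify the key name against your own output. The sample values are hypothetical.

```python
import json


def extract_vpce_id(config_json: str) -> str:
    """Return the VPCE ID from the JSON text returned by SYSTEM$GET_PRIVATELINK_CONFIG."""
    config = json.loads(config_json)
    # Assumption: the key is named "privatelink-vpce-id" in the function output.
    vpce_id = config.get("privatelink-vpce-id")
    if not vpce_id:
        raise ValueError("privatelink-vpce-id not found; is private connectivity enabled?")
    return vpce_id


if __name__ == "__main__":
    # Hypothetical output, shortened to two keys for illustration.
    sample_output = json.dumps(
        {
            "privatelink-account-url": "example.privatelink.snowflakecomputing.com",
            "privatelink-vpce-id": "com.amazonaws.vpce.us-east-1.vpce-svc-EXAMPLE",
        }
    )
    print(extract_vpce_id(sample_output))
```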

  • For this post, we use a private link, so for VPCE ID, enter the VPCE ID.
  • Under Database configuration settings, enter your Snowflake database, schema, and table names.
  • In the Backup settings section, for S3 backup bucket, enter the bucket you created as part of the prerequisites.
  • Choose Create Firehose stream.
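
The console fields above map onto the parameters of the Firehose CreateDeliveryStream API. The following sketch shows one way that mapping might look with boto3; it is an illustration under stated assumptions, not the method used in this post. All ARNs, the single shared IAM role, and the `FailedDataOnly` backup mode are assumptions to replace with your own values.

```python
from typing import Any, Dict, Optional


def build_firehose_request(
    *,
    stream_name: str,
    kinesis_stream_arn: str,
    firehose_role_arn: str,
    account_url: str,
    user: str,
    private_key: str,
    snowflake_role: str,
    database: str,
    schema: str,
    table: str,
    backup_bucket_arn: str,
    vpce_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Return keyword arguments for the Firehose CreateDeliveryStream call."""
    destination: Dict[str, Any] = {
        "AccountUrl": account_url,
        "User": user,
        "PrivateKey": private_key,  # single-line PKCS8 body, no PEM markers
        "Database": database,
        "Schema": schema,
        "Table": table,
        "SnowflakeRoleConfiguration": {"Enabled": True, "SnowflakeRole": snowflake_role},
        # Assumption: one IAM role is reused for the source, destination, and backup.
        "RoleARN": firehose_role_arn,
        "S3BackupMode": "FailedDataOnly",
        "S3Configuration": {"RoleARN": firehose_role_arn, "BucketARN": backup_bucket_arn},
    }
    if vpce_id:
        # Only set for private connectivity; omitting it means public connectivity.
        destination["SnowflakeVpcConfiguration"] = {"PrivateLinkVpceId": vpce_id}
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": kinesis_stream_arn,
            "RoleARN": firehose_role_arn,
        },
        "SnowflakeDestinationConfiguration": destination,
    }


def create_firehose_stream(request: Dict[str, Any]) -> None:
    """Send the request; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder can be used without boto3

    boto3.client("firehose").create_delivery_stream(**request)
```

Leaving `vpce_id` unset mirrors the console behavior described earlier, where the connectivity mode defaults to public when no VPC endpoint is provided.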

Alternatively, you can use an AWS CloudFormation template to create the Firehose delivery stream with Snowflake as the destination, rather than using the Amazon Data Firehose console.

To use the CloudFormation stack, choose the launch button (BDB-4100-CFN-Launch-Stack).

Generate sample stream data

Generate sample stream data from the KDG with the Kinesis data stream you created:

{
"sensorId": {{random.number(999999999)}},
"sensorType": "{{random.arrayElement( ["Thermostat","SmartWaterHeater","HVACTemperatureSensor","WaterPurifier"] )}}",
"internetIP": "{{internet.ip}}",
"connectionTime": "{{date.now("YYYY-MM-DDTHH:mm:ss")}}",
"currentTemperature": {{random.number({"min":10,"max":150})}}
}
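
If the KDG is not available, records of the same shape can be produced with a short script. The sketch below mirrors the fields of the template above and sends them with the Kinesis PutRecord API via boto3. It is an alternative offered for illustration, not the method used in this post; the function names and the default Region are hypothetical.

```python
import json
import random
from datetime import datetime, timezone

SENSOR_TYPES = ["Thermostat", "SmartWaterHeater", "HVACTemperatureSensor", "WaterPurifier"]


def make_sensor_record(rng: random.Random) -> dict:
    """Build one record with the same fields as the KDG template."""
    return {
        "sensorId": rng.randint(0, 999999999),
        "sensorType": rng.choice(SENSOR_TYPES),
        "internetIP": ".".join(str(rng.randint(0, 255)) for _ in range(4)),
        "connectionTime": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S"),
        "currentTemperature": rng.randint(10, 150),
    }


def send_records(stream_name: str, count: int, region: str = "us-east-1") -> None:
    """Send `count` records to the data stream; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so record generation works without boto3

    kinesis = boto3.client("kinesis", region_name=region)
    rng = random.Random()
    for _ in range(count):
        record = make_sensor_record(rng)
        kinesis.put_record(
            StreamName=stream_name,
            Data=json.dumps(record).encode("utf-8"),
            PartitionKey=str(record["sensorId"]),
        )


if __name__ == "__main__":
    # Prints one generated record without sending anything.
    print(json.dumps(make_sensor_record(random.Random())))
```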

Query the Snowflake table

Query the Snowflake table:

select * from adf_snf.kds_blog.iot_sensors;

You can confirm that the data generated by the KDG and sent to Kinesis Data Streams is loaded into the Snowflake table through Amazon Data Firehose.

Troubleshooting

If data does not arrive in Kinesis Data Streams after the KDG starts sending, refresh the KDG page and make sure you are logged in to the KDG.

If you make any changes to the definition of the Snowflake destination table, recreate the Firehose delivery stream.

Clean up

To avoid incurring future charges, delete the resources you created as part of this exercise if you don't plan to use them further.
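
For the AWS side of the cleanup, the following sketch deletes the two streams through boto3-style clients passed in by the caller. The function name is hypothetical, and the sketch covers only the Firehose stream and the Kinesis data stream; the S3 backup bucket and the Snowflake database, schema, and table still need to be removed separately.

```python
from typing import List


def delete_streaming_resources(
    firehose_client,
    kinesis_client,
    firehose_stream_name: str,
    kinesis_stream_name: str,
) -> List[str]:
    """Delete the Firehose stream, then the Kinesis data stream it reads from."""
    deleted = []
    # Delete the consumer first so nothing is still reading from the data stream.
    firehose_client.delete_delivery_stream(DeliveryStreamName=firehose_stream_name)
    deleted.append(firehose_stream_name)
    kinesis_client.delete_stream(StreamName=kinesis_stream_name)
    deleted.append(kinesis_stream_name)
    return deleted
```

With boto3 installed, the clients would typically be created as `boto3.client("firehose")` and `boto3.client("kinesis")` and passed to the function along with the names you chose earlier.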

Conclusion

Amazon Data Firehose provides a straightforward way to deliver data to Snowpipe Streaming, enabling you to save costs and reduce latency to seconds. To try Amazon Data Firehose with Snowflake, refer to the Amazon Data Firehose with Snowflake as destination lab.


About the Authors

Swapna Bandla is a Senior Solutions Architect in the AWS Analytics Specialist SA Team. Swapna has a passion for understanding customers' data and analytics needs and empowering them to develop well-architected, cloud-based solutions. Outside of work, she enjoys spending time with her family.

Mostafa Mansour is a Principal Product Manager - Tech at Amazon Web Services, where he works on Amazon Kinesis Data Firehose. He specializes in developing intuitive product experiences that solve complex challenges for customers at scale. When he's not hard at work on Amazon Kinesis Data Firehose, you'll likely find Mostafa on the squash court, where he loves to take on challengers and perfect his drop shots.

Bosco Albuquerque is a Sr. Partner Solutions Architect at AWS and has over 20 years of experience working with database and analytics products from enterprise database vendors and cloud providers. He has helped technology companies design and implement data analytics solutions and products.
