Uplevel your data architecture with real-time streaming using Amazon Data Firehose and Snowflake | Amazon Web Services

Today's fast-paced world demands timely insights and decisions, which is driving the importance of streaming data. Streaming data is data that is generated continuously from a variety of sources. These sources, such as clickstream events, change data capture (CDC), application and service logs, and Internet of Things (IoT) data streams, are proliferating. Snowflake offers two options to bring streaming data into its platform: Snowpipe and Snowflake Snowpipe Streaming. Snowpipe is suited to file ingestion (batching) use cases, such as loading large files from Amazon Simple Storage Service (Amazon S3) into Snowflake. Snowpipe Streaming, a newer feature released in March 2023, is suited to rowset ingestion (streaming) use cases, such as loading a continuous stream of data from Amazon Kinesis Data Streams or Amazon Managed Streaming for Apache Kafka (Amazon MSK).

Before Snowpipe Streaming, AWS customers used Snowpipe for both use cases: file ingestion and rowset ingestion. First, you ingested streaming data into Kinesis Data Streams or Amazon MSK, then used Amazon Data Firehose to aggregate and write the streams to Amazon S3, and finally used Snowpipe to load the data into Snowflake. However, this multi-step process can result in delays of up to an hour before data is available for analysis in Snowflake. It is also expensive, especially when Snowpipe has to upload many small files to the Snowflake customer cluster.

To solve this issue, Amazon Data Firehose now integrates with Snowpipe Streaming, enabling you to capture, transform, and deliver data streams from Kinesis Data Streams, Amazon MSK, and Firehose Direct PUT to Snowflake in seconds at a low cost. With a few clicks on the Amazon Data Firehose console, you can set up a Firehose stream to deliver data to Snowflake. There are no commitments or upfront investments to use Amazon Data Firehose, and you pay only for the amount of data streamed.

Some key features of Amazon Data Firehose include:

  • Fully managed serverless service - You don't need to manage resources, and Amazon Data Firehose automatically scales to match the throughput of your data source without ongoing administration.
  • Straightforward to use with no code - You don't need to write applications.
  • Real-time data delivery - You can get data to your destinations quickly and efficiently, in seconds.
  • Integration with over 20 AWS services - Seamless integration is available for many AWS services, such as Kinesis Data Streams, Amazon MSK, Amazon VPC Flow Logs, AWS WAF logs, Amazon CloudWatch Logs, Amazon EventBridge, AWS IoT Core, and more.
  • Pay-as-you-go model - You pay only for the data volume that Amazon Data Firehose processes.
  • Connectivity - Amazon Data Firehose can connect to public or private subnets in your VPC.

This post explains how you can bring streaming data from AWS into Snowflake within seconds to perform advanced analytics. We explore common architectures and illustrate how to set up a low-code, serverless, cost-effective solution for low-latency data streaming.

Overview of solution

The following are the steps to implement the solution to stream data from AWS to Snowflake:

  1. Create a Snowflake database, schema, and table.
  2. Create a Kinesis data stream.
  3. Create a Firehose delivery stream with Kinesis Data Streams as the source and Snowflake as its destination, using a secure private link.
  4. To test the setup, generate sample stream data from the Amazon Kinesis Data Generator (KDG) with the Firehose delivery stream as the destination.
  5. Query the Snowflake table to validate the data loaded into Snowflake.

The solution is depicted in the following architecture diagram.

Prerequisites

You should have the following prerequisites:

Create a Snowflake database, schema, and table

Complete the following steps to set up your data in Snowflake:

  • Log in to your Snowflake account and create the database:
    create database adf_snf;

  • Create a schema in the new database:
    create schema adf_snf.kds_blog;

  • Create a table in the new schema:
    create or replace table iot_sensors
    (sensorId number,
    sensorType varchar,
    internetIP varchar,
    connectionTime timestamp_ntz,
    currentTemperature number
    );
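
When Firehose loads JSON records into a table like this one, the record keys need to line up with the column names. As a rough illustration (not part of the original walkthrough), the following Python sketch checks a sample record against the columns defined above. The helper name `missing_or_extra_keys` is hypothetical, and the case-insensitive comparison is an assumption based on Snowflake treating unquoted identifiers as case-insensitive.

```python
# Column names copied from the iot_sensors table definition above.
TABLE_COLUMNS = [
    "sensorId",
    "sensorType",
    "internetIP",
    "connectionTime",
    "currentTemperature",
]


def missing_or_extra_keys(record, columns=TABLE_COLUMNS):
    """Return (missing, extra) key names, compared case-insensitively.

    Assumption: unquoted Snowflake identifiers are case-insensitive, so
    both sides are normalized to upper case before comparing.
    """
    wanted = {c.upper() for c in columns}
    found = {k.upper() for k in record}
    return sorted(wanted - found), sorted(found - wanted)


# Illustrative record with made-up values.
sample = {
    "sensorId": 40,
    "sensorType": "Thermostat",
    "internetIP": "10.0.0.1",
    "connectionTime": "2024-01-01T00:00:00",
    "currentTemperature": 72,
}
missing, extra = missing_or_extra_keys(sample)
print("missing:", missing, "extra:", extra)
```

A check like this can be run against a few sample records before streaming, so that mismatched field names surface early rather than as failed deliveries.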

Create a Kinesis data stream

Complete the following steps to create your data stream:

  • On the Kinesis Data Streams console, choose Data streams in the navigation pane.
  • Choose Create data stream.
  • For Data stream name, enter a name (for example, KDS-Demo-Stream).
  • Leave the remaining settings as default.
  • Choose Create data stream.
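
The console steps above can also be approximated in code. The following is a minimal Python sketch, assuming the boto3 SDK and that the console defaults correspond to on-demand capacity mode; the function names are hypothetical. The request builder is plain Python, and the function that actually calls AWS is defined but not invoked here, because it needs credentials and a Region.

```python
def build_create_stream_request(stream_name):
    """Build keyword arguments for the Kinesis CreateStream API.

    Assumption: on-demand capacity mode, which is what leaving the
    console settings at their defaults is expected to select.
    """
    return {
        "StreamName": stream_name,
        "StreamModeDetails": {"StreamMode": "ON_DEMAND"},
    }


def create_stream(stream_name):
    """Send the request with boto3 (needs AWS credentials and a Region)."""
    import boto3  # imported lazily so the builder works without boto3

    kinesis = boto3.client("kinesis")
    kinesis.create_stream(**build_create_stream_request(stream_name))
    # Block until the stream exists and is ready to use.
    kinesis.get_waiter("stream_exists").wait(StreamName=stream_name)


request = build_create_stream_request("KDS-Demo-Stream")
print(request)
```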

Create a Firehose delivery stream

Complete the following steps to create a Firehose delivery stream with Kinesis Data Streams as the source and Snowflake as its destination:

  • On the Amazon Data Firehose console, choose Create Firehose stream.
  • For Source, choose Amazon Kinesis Data Streams.
  • For Destination, choose Snowflake.
  • For Kinesis data stream, browse to the data stream you created earlier.
  • For Firehose stream name, leave the default generated name or enter a name of your preference.
  • Under Connection settings, provide the following information to connect Amazon Data Firehose to Snowflake:
    • For Snowflake account URL, enter your Snowflake account URL.
    • For User, enter the user name generated in the prerequisites.
    • For Private key, enter the private key generated in the prerequisites. Make sure the private key is in PKCS8 format. Do not include the PEM header (the BEGIN line) or footer (the END line) as part of the private key. If the key is split across multiple lines, remove the line breaks.
    • For Role, select Use custom Snowflake role and enter the Snowflake role that has access to write to the database table.
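
The private key formatting described above (no PEM header or footer, no line breaks) can be done by hand or with a few lines of code. The following Python sketch is one way to do it; the function name `pem_to_single_line` is hypothetical, and the key body shown is placeholder text rather than a real key. It assumes the header and footer are the lines that start with five dashes.

```python
def pem_to_single_line(pem_text):
    """Drop the BEGIN/END marker lines and join the base64 body."""
    body = [
        line.strip()
        for line in pem_text.splitlines()
        # Skip blank lines and the "-----BEGIN ..." / "-----END ..." markers.
        if line.strip() and not line.strip().startswith("-----")
    ]
    return "".join(body)


# Placeholder content, not a real key.
example_pem = """-----BEGIN PRIVATE KEY-----
QUJDREVGR0hJSktM
TU5PUFFSU1RVVldY
-----END PRIVATE KEY-----
"""
print(pem_to_single_line(example_pem))
```

The resulting single-line string is what would be pasted into the Private key field.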

You can connect to Snowflake using public or private connectivity. If you don't provide a VPC endpoint, the default connectivity mode is public. To allow list Firehose IPs in your Snowflake network policy, refer to Choose Snowflake for Your Destination. If you're using a private link URL, provide the VPCE ID, which you can retrieve using SYSTEM$GET_PRIVATELINK_CONFIG:

select SYSTEM$GET_PRIVATELINK_CONFIG();

This function returns a JSON representation of the Snowflake account information necessary to facilitate the self-service configuration of private connectivity to the Snowflake service, as shown in the following screenshot.
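
The JSON returned by the function can also be read programmatically. The following Python sketch pulls the VPCE ID out of that output. It assumes the ID is stored under a key named `privatelink-vpce-id`; the sample values are made-up placeholders, and the function name `extract_vpce_id` is hypothetical.

```python
import json


def extract_vpce_id(config_json):
    """Return the VPCE ID from the function's JSON output, or None."""
    config = json.loads(config_json)
    # Assumption: the ID is stored under the "privatelink-vpce-id" key.
    return config.get("privatelink-vpce-id")


# Made-up sample output with placeholder values.
sample_output = json.dumps(
    {
        "privatelink-account-name": "example_account.us-east-1.privatelink",
        "privatelink-vpce-id": "com.amazonaws.vpce.us-east-1.vpce-svc-0123456789abcdef0",
        "privatelink-account-url": "example_account.us-east-1.privatelink.snowflakecomputing.com",
    }
)
print(extract_vpce_id(sample_output))
```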

  • For this post, we're using a private link, so for VPCE ID, enter the VPCE ID.
  • Under Database configuration settings, enter your Snowflake database, schema, and table names.
  • In the Backup settings section, for S3 backup bucket, enter the bucket you created as part of the prerequisites.
  • Choose Create Firehose stream.
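
For readers who prefer to script the setup, the console choices above map roughly onto a single Firehose `CreateDeliveryStream` request. The following Python sketch only builds that request as a dictionary. The field names reflect my understanding of the Firehose API and should be checked against the current API reference. The function name, all ARNs, the account URL, the user, and the role name are hypothetical placeholders, and reusing one IAM role for the source, destination, and backup bucket is a simplifying assumption.

```python
def build_firehose_request(stream_name, kinesis_stream_arn, role_arn,
                           account_url, user, private_key, snowflake_role,
                           vpce_id, backup_bucket_arn):
    """Build keyword arguments for a Firehose stream that reads from
    Kinesis Data Streams and delivers to Snowflake over a private link."""
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": kinesis_stream_arn,
            "RoleARN": role_arn,
        },
        "SnowflakeDestinationConfiguration": {
            "AccountUrl": account_url,
            "User": user,
            "PrivateKey": private_key,  # single line, no PEM header/footer
            "Database": "adf_snf",
            "Schema": "kds_blog",
            "Table": "iot_sensors",
            "SnowflakeRoleConfiguration": {
                "Enabled": True,
                "SnowflakeRole": snowflake_role,
            },
            "SnowflakeVpcConfiguration": {"PrivateLinkVpceId": vpce_id},
            "RoleARN": role_arn,
            # Back up only records that could not be delivered.
            "S3BackupMode": "FailedDataOnly",
            "S3Configuration": {
                "RoleARN": role_arn,
                "BucketARN": backup_bucket_arn,
            },
        },
    }


# All values below are placeholders for illustration.
request = build_firehose_request(
    stream_name="KDS-to-Snowflake-Demo",
    kinesis_stream_arn="arn:aws:kinesis:us-east-1:111122223333:stream/KDS-Demo-Stream",
    role_arn="arn:aws:iam::111122223333:role/example-firehose-role",
    account_url="https://example_account.us-east-1.privatelink.snowflakecomputing.com",
    user="EXAMPLE_USER",
    private_key="PLACEHOLDER_PRIVATE_KEY",
    snowflake_role="EXAMPLE_ROLE",
    vpce_id="com.amazonaws.vpce.us-east-1.vpce-svc-0123456789abcdef0",
    backup_bucket_arn="arn:aws:s3:::example-backup-bucket",
)
print(sorted(request["SnowflakeDestinationConfiguration"]))
# With credentials configured, the request could be sent with:
# boto3.client("firehose").create_delivery_stream(**request)
```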

Alternatively, you can use an AWS CloudFormation template to create the Firehose delivery stream with Snowflake as the destination rather than using the Amazon Data Firehose console.

To use the CloudFormation stack, choose BDB-4100-CFN-Launch-Stack.

Generate sample stream data

Generate sample stream data from the KDG with the Kinesis data stream you created:

{
"sensorId": {{random.number(999999999)}},
"sensorType": "{{random.arrayElement( ["Thermostat","SmartWaterHeater","HVACTemperatureSensor","WaterPurifier"] )}}",
"internetIP": "{{internet.ip}}",
"connectionTime": "{{date.now("YYYY-MM-DDTHH:mm:ss")}}",
"currentTemperature": {{random.number({"min":10,"max":150})}}
}
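
If the KDG is not available, records of the same shape can be produced locally. The following Python sketch mirrors the template above using only the standard library. The function name `make_record` is hypothetical, the IP address is four random octets rather than a validated address, and the timestamp uses the local clock with zero-padded fields.

```python
import json
import random
from datetime import datetime

SENSOR_TYPES = [
    "Thermostat",
    "SmartWaterHeater",
    "HVACTemperatureSensor",
    "WaterPurifier",
]


def make_record(rng=random):
    """Return one record shaped like the KDG template above."""
    return {
        "sensorId": rng.randint(0, 999999999),
        "sensorType": rng.choice(SENSOR_TYPES),
        # Four random octets; not guaranteed to be a routable address.
        "internetIP": ".".join(str(rng.randint(0, 255)) for _ in range(4)),
        "connectionTime": datetime.now().strftime("%Y-%m-%dT%H:%M:%S"),
        "currentTemperature": rng.randint(10, 150),
    }


# A seeded generator makes the random fields repeatable between runs.
record = make_record(random.Random(42))
print(json.dumps(record))
```

Each record could then be serialized and sent to the data stream, for example with the boto3 Kinesis `put_record` call, which is outside the scope of this sketch.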

Query the Snowflake table

Query the Snowflake table:

select * from adf_snf.kds_blog.iot_sensors;

You can confirm that the data generated by the KDG and sent to Kinesis Data Streams is loaded into the Snowflake table through Amazon Data Firehose.

Troubleshooting

If data is not loaded into Kinesis Data Streams after the KDG sends data to the Firehose delivery stream, refresh and make sure you are logged in to the KDG.

If you made any changes to the Snowflake destination table definition, recreate the Firehose delivery stream.

Clean up

To avoid incurring future charges, delete the resources you created as part of this exercise if you are not planning to use them further.

Conclusion

Amazon Data Firehose provides a straightforward way to deliver data to Snowpipe Streaming, enabling you to save costs and reduce latency to seconds. To try Amazon Data Firehose with Snowflake, refer to the Amazon Data Firehose with Snowflake as destination lab.


About the authors

Swapna Bandla is a Senior Solutions Architect in the AWS Analytics Specialist SA Team. Swapna has a passion for understanding customers' data and analytics needs and empowering them to develop cloud-based, well-architected solutions. Outside of work, she enjoys spending time with her family.

Mostafa Mansour is a Principal Product Manager - Tech at Amazon Web Services, where he works on Amazon Kinesis Data Firehose. He specializes in developing intuitive product experiences that solve complex challenges for customers at scale. When he's not hard at work on Amazon Kinesis Data Firehose, you'll likely find Mostafa on the squash court, where he loves to take on challengers and perfect his dropshots.

Bosco Albuquerque is a Sr. Partner Solutions Architect at AWS and has over 20 years of experience working with database and analytics products from enterprise database vendors and cloud providers. He has helped technology companies design and implement data analytics solutions and products.
