Azure Hands-on Lab - Migrating Data with Azure Data Factory

This lab uses Azure Cosmos DB, and its key points are:

1. Using the cosmicworks command-line tool to generate the sample data.

2. Understanding how a Cosmos DB account name relates to the database id and container id.

3. Creating an Azure Data Factory connection and task that migrates data from the products container to the flatproducts container in the cosmicworks database.

The lab comes from: Exercise: Migrate existing data using Azure Data Factory - Training | Microsoft Learn

Migrate existing data using Azure Data Factory

In Azure Data Factory, Azure Cosmos DB is supported as a source of data ingest and as a target (sink) of data output.

In this lab, we will populate Azure Cosmos DB using a helpful command-line utility and then use Azure Data Factory to move a subset of data from one container to another.

Create and seed your Azure Cosmos DB SQL API account

You will use a command-line utility that creates a cosmicworks database and a products container at 4,000 request units per second (RU/s). Once created, you will adjust the throughput down to 400 RU/s.

To accompany the products container, you will create a flatproducts container manually that will be the target of the ETL transformation and load operation at the end of this lab.

  1. In a new web browser window or tab, navigate to the Azure portal (portal.azure.com).

  2. Sign into the portal using the Microsoft credentials associated with your subscription.

  3. Select + Create a resource, search for Cosmos DB, and then create a new Azure Cosmos DB SQL API account resource with the following settings, leaving all remaining settings to their default values:

    Subscription: Your existing Azure subscription
    Resource group: Select an existing or create a new resource group
    Account Name: Enter a globally unique name
    Location: Choose any available region
    Capacity mode: Provisioned throughput
    Apply Free Tier Discount: Do Not Apply
    Limit the total amount of throughput that can be provisioned on this account: Unchecked

    📝 Your lab environments may have restrictions preventing you from creating a new resource group. If that is the case, use the existing pre-created resource group.
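
    💡 Equivalently, a provisioned-throughput SQL API account can be created from the Azure CLI. A minimal sketch, assuming <account-name>, <resource-group>, and <region> are replaced with your own values:

    az cosmosdb create --name <account-name> --resource-group <resource-group> --locations regionName=<region>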

  4. Wait for the deployment task to complete before continuing with this task.

  5. Go to the newly created Azure Cosmos DB account resource and navigate to the Keys pane.

  6. This pane contains the connection details and credentials necessary to connect to the account from the SDK. Specifically:

    1. Record the value of the URI field. You will use this endpoint value later in this exercise.

    2. Record the value of the PRIMARY KEY field. You will use this key value later in this exercise.
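
    💡 If you prefer the command line, both values can also be read with the Azure CLI. A minimal sketch, assuming <account-name> and <resource-group> are replaced with your own values:

    az cosmosdb show --name <account-name> --resource-group <resource-group> --query documentEndpoint --output tsv
    az cosmosdb keys list --name <account-name> --resource-group <resource-group> --type keys --query primaryMasterKey --output tsv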

  7. Close your web browser window or tab.

  8. Start Visual Studio Code.

    📝 If you are not already familiar with the Visual Studio Code interface, review the Get Started guide for Visual Studio Code.

  9. In Visual Studio Code, open the Terminal menu and then select New Terminal to open a new terminal instance.

  10. Install the cosmicworks command-line tool for global use on your machine.

    dotnet tool install --global cosmicworks

    💡 This command may take a couple of minutes to complete. It will output a warning message (Tool 'cosmicworks' is already installed) if you have already installed the latest version of this tool in the past.
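
    💡 If an older version is already installed and you want the latest one instead, the tool can be upgraded in place:

    dotnet tool update --global cosmicworks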

  11. Run cosmicworks to seed your Azure Cosmos DB account with the following command-line options:

    --endpoint: The endpoint value you copied earlier in this lab
    --key: The key value you copied earlier in this lab
    --datasets: product

    cosmicworks --endpoint <cosmos-endpoint> --key <cosmos-key> --datasets product

    📝 For example, if your endpoint is: https://dp420.documents.azure.com:443/ and your key is: fDR2ci9QgkdkvERTQ==, then the command would be: cosmicworks --endpoint https://dp420.documents.azure.com:443/ --key fDR2ci9QgkdkvERTQ== --datasets product

  12. Wait for the cosmicworks command to finish populating the account with a database, container, and items.

  13. Close the integrated terminal.

  14. Close Visual Studio Code.

  15. In a new web browser window or tab, navigate to the Azure portal (portal.azure.com).

  16. Sign into the portal using the Microsoft credentials associated with your subscription.

  17. Select Resource groups, then select the resource group you created or viewed earlier in this lab, and then select the Azure Cosmos DB account resource you created in this lab.

  18. Within the Azure Cosmos DB account resource, navigate to the Data Explorer pane.

  19. In the Data Explorer, expand the cosmicworks database node, expand the products container node, and then select Items.

  20. Observe and select the various JSON items in the products container. These are the items created by the command-line tool used in previous steps.

  21. Select the Scale & Settings node. In the Scale & Settings tab, select Manual, update the required throughput setting from 4000 RU/s to 400 RU/s, and then Save your changes.
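
    💡 The same throughput change can also be made from the Azure CLI. A minimal sketch, assuming <account-name> and <resource-group> are replaced with your own values:

    az cosmosdb sql container throughput update --account-name <account-name> --resource-group <resource-group> --database-name cosmicworks --name products --throughput 400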

  22. In the Data Explorer pane, select New Container.

  23. In the New Container popup, enter the following values for each setting, and then select OK:

    Database id: Use existing | cosmicworks
    Container id: flatproducts
    Partition key: /category
    Container throughput (autoscale): Manual
    RU/s: 400
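
    💡 Creating the flatproducts container can likewise be scripted with the Azure CLI. A minimal sketch, assuming <account-name> and <resource-group> are replaced with your own values:

    az cosmosdb sql container create --account-name <account-name> --resource-group <resource-group> --database-name cosmicworks --name flatproducts --partition-key-path /category --throughput 400
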
  24. Back in the Data Explorer pane, expand the cosmicworks database node and then observe the flatproducts container node within the hierarchy.

  25. Return to the Home of the Azure portal.

Create Azure Data Factory resource

Now that the Azure Cosmos DB SQL API resources are in place, you will create an Azure Data Factory resource and configure all of the necessary components and connections to perform a one-time data movement: extracting data from one SQL API container, transforming it, and loading it into another.
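
💡 If you script your environments, a Data Factory can also be created from the Azure CLI via the datafactory extension. A minimal sketch, assuming <factory-name>, <resource-group>, and <region> are placeholders and that the flags match your extension version:

    az extension add --name datafactory
    az datafactory create --name <factory-name> --resource-group <resource-group> --location <region>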

  1. Select + Create a resource, search for Data Factory, and then create a new Azure Data Factory resource with the following settings, leaving all remaining settings to their default values:

    Subscription: Your existing Azure subscription
    Resource group: Select an existing or create a new resource group
    Name: Enter a globally unique name
    Region: Choose any available region
    Version: V2
    Git configuration: Configure Git later

    📝 Your lab environments may have restrictions preventing you from creating a new resource group. If that is the case, use the existing pre-created resource group.

  2. Wait for the deployment task to complete before continuing with this task.

  3. Go to the newly created Azure Data Factory resource and select Open Azure Data Factory Studio.

    💡 Alternatively, you can navigate to adf.azure.com/home, select your newly created Data Factory resource, and then select the home icon.

  4. From the home screen, select the Ingest option to begin the quick wizard to perform a one-time copy data at scale operation and move to the Properties step of the wizard.

  5. Starting with the Properties step of the wizard, in the Task type section, select Built-in copy task.

  6. In the Task cadence or task schedule section, select Run once now and then select Next to move to the Source step of the wizard.

  7. In the Source step of the wizard, in the Source type list, select Azure Cosmos DB (SQL API).

  8. In the Connection section, select + New connection.

  9. In the New connection (Azure Cosmos DB (SQL API)) popup, configure the new connection with the following values, and then select Create:

    Name: CosmosSqlConn
    Connect via integration runtime: AutoResolveIntegrationRuntime
    Authentication method: Account key | Connection string
    Account selection method: From Azure subscription
    Azure subscription: Your existing Azure subscription
    Azure Cosmos DB account name: The Azure Cosmos DB account name you chose earlier in this lab
    Database name: cosmicworks

  10. Back in the Source data store section, within the Source tables section, select Use query.

  11. In the Table name list, select products.

  12. In the Query editor, delete the existing content and enter the following query, which keeps each product's name and price and renames categoryName to category to match the flatproducts container:

    SELECT 
        p.name, 
        p.categoryName as category, 
        p.price 
    FROM 
        products p
  13. Select Preview data to test the query's validity. Select Next to move to the Target step of the wizard.

  14. In the Target step of the wizard, in the Target type list, select Azure Cosmos DB (SQL API).

  15. In the Connection list, select CosmosSqlConn.

  16. In the Target list, select flatproducts and then select Next to move to the Settings step of the wizard.

  17. In the Settings step of the wizard, in the Task name field, enter FlattenAndMoveData.

  18. Leave all remaining fields to their default blank values and then select Next to move to the final step of the wizard.

  19. Review the Summary of the steps you have selected in the wizard and then select Next.

  20. Observe the various steps in the deployment. When the deployment has finished, select Finish.

  21. Close your web browser window or tab.

  22. In a new web browser window or tab, navigate to the Azure portal (portal.azure.com).

  23. Sign into the portal using the Microsoft credentials associated with your subscription.

  24. Select Resource groups, then select the resource group you created or viewed earlier in this lab, and then select the Azure Cosmos DB account resource you created in this lab.

  25. Within the Azure Cosmos DB account resource, navigate to the Data Explorer pane.

  26. In the Data Explorer, expand the cosmicworks database node, select the flatproducts container node, and then select New SQL Query.

  27. Delete the contents of the editor area.

  28. Create a new SQL query that will return all documents where the name equals HL Headset. (The name after FROM is only an alias; the query always runs against the container selected in the Data Explorer, here flatproducts.)

    SELECT 
        p.name, 
        p.category, 
        p.price 
    FROM
        products p
    WHERE
        p.name = 'HL Headset'
  29. Select Execute Query.

  30. Observe the results of the query.
