Explore and configure the Azure Machine Learning workspace
Source: My personal notes from Course DP-100 Designing and implementing a data science solution on Azure - Microsoft Learn with labs from MicrosoftLearning/mslearn-azure-ml · GitHub
About Azure Machine Learning
Section titled “About Azure Machine Learning”Similar products are Azure Foundry (machine learning and generative models) and Azure Machine Learning (training models).
It manages a workspace for working with data (assets), compute resources, and integration with developer tools (Python, Azure CLI, VS Code). Data can be uploaded and external data storage.
Access is managed by Azure role based access control.
Compute and Options
Section titled “Compute and Options”- Compute instances - single computer
- Compute cluster - multiple computers to run jobs
- Kubernetes cluster - run containers for scaling and different environments
- Serverless - good for small jobs, experiments
-
Choosing compute
Use Case Compute Small jobs, experience Serverless Training, pipelines Serverless, Instances, Clusters Regular workload Instances, Clusters Can be configured to start and stop.
Uniform Resource Identifier (URI) are locations of data. Azure ML uses URIs to connect to data directly such as https, abfs, and others.
Assets
Section titled “Assets”- Data used for training and updates. Can be reused and versioned. Examples or URI file, URI folder, and MLTable file with schema to read tabular data.
- Environments for packages and scripts
- Custom environments are supported with container images, build
context like
requirements.txtand Conda specification.
- Custom environments are supported with container images, build
context like
- Models are trained models in the workspace
- Components for sharing code
Development
Section titled “Development”Model training options include one or combinations:
- Automated ML
- Explore algorithms and hyperparameter (a parameter that can be set in order to define any configurable part of a model’s learning process) values
- Experiment by running notebooks, running scripts
- Tune hyperparameters
- Run pipelines of multiple scripts or components
- Optimization of code and inputs
Configuration and Setup, User Interface
Section titled “Configuration and Setup, User Interface”Can be done in web interface, using Azure CLI, and IDE integration like VS Code’s Azure Machine Learning extension.
Notebooks can be converted to scripts, set up in a job and then optimized and parameters set up.
Lab: Explore the Azure Machine Learning workspace
Section titled “Lab: Explore the Azure Machine Learning workspace”- Create a new Azure machine learning workspace in a region close to you, leave defaults which will create an Azure Machine Learning workspace, an Application Insights, a Key Vault, and a Storage Account.
- Open the workspace in Azure ML Studio and see Authoring, Assets, and Manage parts of ML solutions.
- Create a new Automated ML job using
Classificationand upload diabetes data for training. - Use
Diabetic (boolean)as target column andAccuracyas primary metric - Set other experiment settings and compute VM
- Submit the job and when finished, review the run and output.
- For example, in one run, the algorithm selected was
VotingEnsemble
- For example, in one run, the algorithm selected was
Lab: Explore developer tools for ML workspace
Section titled “Lab: Explore developer tools for ML workspace”Log into the Azure CLI to:
- Install ML extension. Use the Azure Cloud shell to avoid issues with extension installations.
- Create the ML workspace and compute VM
Example commands
az loginaz extension remove -n azure-cli-mlaz extension remove -n mlaz extension add -n ml -y
# Create group and workspaceaz group create --name "rg-dp100-labs" --location "eastus"az ml workspace create --name "mlw-dp100-labs60228811" -g "rg-dp100-labs"
# Create compute and compute clusteraz ml compute create --name "ci60228811" --size STANDARD_DS11_V2 --type ComputeInstance -w mlw-dp100-labs60228811 -g rg-dp100-labsaz ml compute create --name "aml-cluster" --size STANDARD_DS11_V2 --max-instances 2 --type AmlCompute -w mlw-dp100-labs60228811 -g rg-dp100-labsRun the Labs/02/Run training script.ipynb notebook. For notebook run,
example config.json to store in same directory as notebook.
{ "subscription_id": "1805fe6f-51e4-4eb0-abc3-4a01d91fa842", "resource_group": "rg-dp100-labs", "workspace_name": "mlw-dp100-labs60228811"}Script will import libraries needs for data, calculation, and trianing. It will load data, select classification, split data for testing, set features and label (Diabetic), hyperparameters, and calculate metrics. After setup, script will submit job using script using compute resources.
Lab: Find the best classification model with Automated Machine Learning
Section titled “Lab: Find the best classification model with Automated Machine Learning”- Create a workspace and associated compute using shell scripts and Azure CLI
- Run scripts to set up training job with specific settings and data
- Automated Machine Learning (AutoML) will automatically try different algorithms to find the best fit.
- Review the results of the jobs, models, and accuracy results, area under the curve (AUC) and other metrics